Experimental feature: video call support

Thanks to the XiVO web/desktop assistant with WebRTC support, you don’t need a phone on your desk anymore. All you need to receive your calls is a headset and your PC. Using the VPN, you can work from anywhere, as long as your connectivity is good enough. We have even made some calls over the French TGV onboard WiFi network!

Let’s move a step ahead and have a look at video calls. We’ve added video call support as an experimental feature, a kind of beta that allows us to get some feedback before going further. As an experimental feature, it’s disabled by default: to make video calls, your xucmgt needs to be started with the ENABLE_VIDEO environment variable set to true.

Where we are

For the first iteration we defined a minimum viable product: we allow one video call at a time, with hold/resume, the possibility to go fullscreen, and a fallback to an audio call when the remote party is not video capable. We also protected the interactions between audio and video calls, such as call transfer.

What does it look like

We added a media type indication to the incoming call popup, so you can reject a video call you would rather not take.

[Screenshot: IncomingCall]

Once a video call is established, you get the usual layout: a large remote video and a small preview of your own video. We disabled all the standard video controls and introduced a single centered button that toggles fullscreen.

[Screenshot: VideoCall]

How does it work

Making a video call is quite easy: you just need to request video when initiating the session. We played with different codecs, but currently, with Asterisk 13 and Chrome, video call establishment was reliable only with VP8. So we added VP8 support to our WebRTC Asterisk peers, together with video support, and that’s enough. Asterisk video support is passthrough only, so don’t expect any transcoding or other processing.
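To make the VP8 requirement and the audio fallback concrete, here is a minimal sketch. It is not the xc_webrtc implementation: the function names `offersVp8Video` and `negotiatedMediaType` are hypothetical, and the sketch only assumes the standard SDP layout, where each media section starts with an `m=` line and codecs are declared with `a=rtpmap`.

```typescript
type MediaType = "audio" | "video";

// Hypothetical helper (not part of xc_webrtc): true when the SDP contains
// an accepted video section that declares the VP8 codec.
function offersVp8Video(sdp: string): boolean {
  // Each media section starts with an "m=" line; split just before them.
  const sections = sdp.split(/\r?\n(?=m=)/);
  for (const section of sections) {
    // m=<media> <port> <proto> <payload types...>
    const mLine = /^m=video (\d+) /.exec(section);
    // Skip non-video sections and rejected ones (port 0).
    if (!mLine || mLine[1] === "0") continue;
    // a=rtpmap:<payload type> <encoding name>/<clock rate>
    if (/^a=rtpmap:\d+ VP8\/90000/im.test(section)) return true;
  }
  return false;
}

// Hypothetical helper: decide whether a call ends up as video or falls
// back to audio, based on the SDP received from the remote side.
function negotiatedMediaType(remoteSdp: string): MediaType {
  return offersVp8Video(remoteSdp) ? "video" : "audio";
}
```

On the calling side, requesting video usually comes down to asking the browser for `{ audio: true, video: true }` instead of `{ audio: true, video: false }` when the session is created; how that is passed along depends on the SIP library in use.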

On an incoming call things get harder, because we currently rely on the PhoneEvents sent by the XUC server, and these events carry no indication of the media type. Relying on PhoneEvents is interesting because it gives us a single source of “truth”. But as our UI is completely based on these events, we need to correlate them with the call notifications of our WebRTC library in order to associate the media type with the right call. Currently we’re using a workaround based on the SIP CallId, which is included in the PhoneEvents and is also used to identify the call in the WebRTC library. We should investigate whether the XUC can learn the media type, as it would simplify the frontend implementation.
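The correlation can be sketched as a small lookup keyed by the SIP CallId. This is only an illustration under assumptions: the `PhoneEvent` and `RtcNotification` shapes, the field names and the `MediaTypeCorrelator` class are hypothetical and much simpler than the real XUC and xc_webrtc messages. The point is that the two notifications may arrive in either order, so whichever comes first has to wait for the other.

```typescript
type CallMedia = "audio" | "video";

// Hypothetical, simplified shapes; the real messages carry more fields.
interface PhoneEvent { sipCallId: string; eventType: string; }
interface RtcNotification { sipCallId: string; mediaType: CallMedia; }

type CallHandler = (event: PhoneEvent, media: CallMedia) => void;

class MediaTypeCorrelator {
  private mediaByCallId = new Map<string, CallMedia>();
  private pendingEvents = new Map<string, PhoneEvent>();
  private onCall: CallHandler;

  constructor(onCall: CallHandler) {
    this.onCall = onCall;
  }

  // Called when the WebRTC library announces a call and its media type.
  onRtcNotification(n: RtcNotification): void {
    this.mediaByCallId.set(n.sipCallId, n.mediaType);
    const pending = this.pendingEvents.get(n.sipCallId);
    if (pending) {
      // The PhoneEvent arrived first and was waiting for the media type.
      this.pendingEvents.delete(n.sipCallId);
      this.onCall(pending, n.mediaType);
    }
  }

  // Called when the server sends a PhoneEvent for a call.
  onPhoneEvent(e: PhoneEvent): void {
    const media = this.mediaByCallId.get(e.sipCallId);
    if (media) {
      this.onCall(e, media);
    } else {
      // Media type not known yet: keep the event until it is.
      this.pendingEvents.set(e.sipCallId, e);
    }
  }

  // Forget a call once it is released, so the maps do not grow forever.
  release(sipCallId: string): void {
    this.mediaByCallId.delete(sipCallId);
    this.pendingEvents.delete(sipCallId);
  }
}
```

If the server could provide the media type directly in the PhoneEvent, this whole class would become unnecessary, which is the simplification mentioned above.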

The use of PhoneEvents introduces one more issue: when creating or accepting a call, you need to pass the HTML elements that will play the media. Audio elements are injected into the webpage dynamically by our xc_webrtc library. This injection is not used for video elements, because they need precise placement and styling. Using XUC events to dynamically create UI elements introduces a delay: the video element is created after the call is presented. Fortunately, the session configuration can be updated during the call, so we added to the xc_webrtc library an update of these elements on ringing and on connect, to be sure the video is bound to the right element.
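The late binding can be pictured roughly as follows. This is a sketch, not the xc_webrtc code: `Session`, `setRenderTargets`, `bindVideo` and the element ids are hypothetical names chosen for the example. The idea is simply to look the video elements up again at each call state change instead of only once at call creation.

```typescript
// Stand-in for an HTML video element (hypothetical, keeps the sketch DOM-free).
interface RenderTarget { id: string; }

// Stand-in for the part of a WebRTC session that renders media (hypothetical).
interface Session {
  setRenderTargets(remote: RenderTarget, local: RenderTarget): void;
}

// In a browser this would typically be: (id) => document.getElementById(id)
type Lookup = (id: string) => RenderTarget | null;

// Re-resolve the video elements and hand them to the session.
// Returns false when the UI has not created them yet.
function bindVideo(session: Session, lookup: Lookup): boolean {
  const remote = lookup("remote-video"); // assumed element id
  const local = lookup("local-video");   // assumed element id
  if (!remote || !local) return false;
  session.setRenderTargets(remote, local);
  return true;
}
```

Calling `bindVideo` both when the call starts ringing and when it is connected means that even if the elements did not exist at the first attempt, the second attempt attaches the video once the UI has caught up.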

What we learned

Our strategy of relying on PhoneEvents from the XUC server allows us to use the same code for phones and WebRTC peers, but it has its limits. We need to think a little more about the coordination between PhoneEvents and WebRTC events. Basic video support was not so hard; we spent more time on the UI to ensure good ergonomics and look and feel, and there is still plenty of room for improvement. We currently support only the Chrome browser for WebRTC features, so it would have taken even longer if wider support had been required. When mixing audio and video calls, you have to design a strategy to handle transfers and other telephony features.

Next steps

We already have some ideas on how to improve the user experience, such as integrating call control buttons into the video, which matters mainly in fullscreen, and some more UI improvements like the option to disable the video feedback. But we will listen to our users before going further.
