Building Cross-Browser Video Chat Apps: An Introduction
Every day we juggle a number of platform-specific video chat apps: Google Meet or Zoom for office meetings, FaceTime or WhatsApp for social interactions. One solution is WebRTC video chat apps, which run directly in the browser and do not need any special hardware.
WebRTC technologies can also be used to build hybrid applications. These offer a native app experience, as the Gmail, Evernote or Uber apps do, but are quicker and less expensive to develop.
In this blog, we will learn about building hybrid video chat apps running on WebRTC technologies.
Core Features of WebRTC
The core characteristic of WebRTC is peer-to-peer communication between browsers. No third-party frameworks, additional applications or plugins are needed to run such apps. It is supported by browsers such as Firefox, Chrome and Opera, including on mobile platforms like Android and iOS. WebRTC is API-driven and does not require special development tools: its JavaScript APIs cover capturing audio and video, streaming data, gathering network information, reporting errors, and opening and closing sessions.
In a hybrid app, an HTML5 app is bundled inside a native wrapper and runs within an embedded web browser (a web view). Hence, development is quick and cost-effective, as developers do not have to be specialists in a particular platform.
The functioning of WebRTC is defined by these components: MediaStream (getUserMedia), RTCPeerConnection, RTCDataChannel and signaling.
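The MediaStream API gives access to audio and video streams from the microphone and webcam. In its original callback-based form, getUserMedia takes the following parameters (the newer navigator.mediaDevices.getUserMedia takes only the constraints and returns a promise instead of using the two callbacks):
- constraints defining whether to capture audio, video or both, and optionally the video resolution
- a success callback, which is passed the resulting MediaStream
- an error callback, which is invoked if the request fails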
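The constraints parameter can be illustrated with a minimal sketch. The helper names buildConstraints and openLocalStream, and the CallOptions shape, are hypothetical and not part of WebRTC. The local interfaces are simplified stand-ins for the browser's own types, and the sketch assumes the promise-based navigator.mediaDevices.getUserMedia is available.

```typescript
// Options a caller of this sketch might pass in (hypothetical shape).
interface CallOptions {
  audio: boolean;
  video: boolean;
  width?: number;  // preferred video width in pixels
  height?: number; // preferred video height in pixels
}

// Simplified stand-ins for the browser's constraint types.
interface VideoConstraint {
  width?: { ideal: number };
  height?: { ideal: number };
}
interface Constraints {
  audio: boolean;
  video: boolean | VideoConstraint;
}

// Build the constraints object handed to getUserMedia.
function buildConstraints(opts: CallOptions): Constraints {
  if (!opts.video) return { audio: opts.audio, video: false };
  const video: VideoConstraint = {};
  if (opts.width !== undefined) video.width = { ideal: opts.width };
  if (opts.height !== undefined) video.height = { ideal: opts.height };
  // Without a resolution preference, plain `true` requests the camera defaults.
  const hasPreference = Object.keys(video).length > 0;
  return { audio: opts.audio, video: hasPreference ? video : true };
}

// Request the stream. Resolving the promise plays the role of the success
// callback; rejecting it plays the role of the error callback.
async function openLocalStream(opts: CallOptions): Promise<unknown> {
  const nav = (globalThis as any).navigator;
  if (!nav || !nav.mediaDevices) {
    throw new Error("getUserMedia is not available in this environment");
  }
  return nav.mediaDevices.getUserMedia(buildConstraints(opts));
}
```

In a page, the resolved stream would typically be assigned to the srcObject of a video element, for example after calling openLocalStream({ audio: true, video: true, width: 1280, height: 720 }).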
RTCPeerConnection provides communication of data between the peers. It also handles codecs and security.
RTCDataChannel supports transmission of arbitrary application data between peers, such as text messages or files, alongside the audio/video streams. Its API is modeled on WebSockets, and the data is secured with the built-in DTLS protocol.
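As a rough illustration of the WebSocket-like data channel API, the sketch below wires a text chat onto a channel object. The attachChat helper and the JSON message shape are assumptions made for this example. ChannelLike is a simplified stand-in that lists only the send, onmessage and readyState members the sketch uses.

```typescript
// Simplified stand-in for the parts of RTCDataChannel used here.
interface ChannelLike {
  readyState: string; // "connecting" | "open" | "closing" | "closed"
  send(data: string): void;
  onmessage: ((event: { data: unknown }) => void) | null;
}

// Hypothetical wire format for a chat message.
interface ChatMessage {
  kind: "text";
  body: string;
  sentAt: number;
}

// Attach a receive handler and return a function that sends text.
// The returned function reports false when the channel is not open yet.
function attachChat(
  channel: ChannelLike,
  onText: (body: string) => void
): (body: string) => boolean {
  channel.onmessage = (event) => {
    if (typeof event.data !== "string") return; // ignore binary payloads
    try {
      const msg = JSON.parse(event.data) as Partial<ChatMessage>;
      if (msg.kind === "text" && typeof msg.body === "string") onText(msg.body);
    } catch {
      // Ignore payloads that are not valid JSON.
    }
  };
  return (body: string) => {
    if (channel.readyState !== "open") return false;
    const msg: ChatMessage = { kind: "text", body, sentAt: Date.now() };
    channel.send(JSON.stringify(msg));
    return true;
  };
}
```

In the browser, the channel would come from something like peerConn.createDataChannel("chat") on the caller side, where the label "chat" is an arbitrary choice.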
The most important factor in WebRTC is that browsers communicate directly with each other. The mechanism that manages and coordinates setting up this communication is ‘signaling.’ A signaling server is used for exchanging session control messages, network configuration (gathered through the ICE framework) and supported codecs.
The ICE framework becomes very important when the peers exchange network information and connect, typically over the UDP protocol. The caller creates a new RTCPeerConnection object and registers an onicecandidate handler on it. The handler is called whenever a network candidate becomes available. The caller then sends the ‘serialized candidate data’ to the callee using WebSocket or a similar mechanism. In turn, on receiving the candidate message, the callee uses addIceCandidate to add the candidate to the remote peer description.
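The candidate exchange might be sketched as follows. The helper names forwardLocalCandidates and handleSignal and the { type: "candidate" } message envelope are assumptions for illustration, and the send function stands for whatever signaling transport is in use. PeerLike is a simplified stand-in for RTCPeerConnection.

```typescript
// Plain-object form of a candidate, as produced by RTCIceCandidate.toJSON().
interface CandidateInit {
  candidate: string;
  sdpMid?: string | null;
  sdpMLineIndex?: number | null;
}

// Simplified stand-in for the parts of RTCPeerConnection used here.
interface PeerLike {
  onicecandidate:
    | ((event: { candidate: { toJSON(): CandidateInit } | null }) => void)
    | null;
  addIceCandidate(candidate: CandidateInit): Promise<void>;
}

type SendFn = (message: string) => void;

// Caller side: serialize each local candidate and pass it to the signaling channel.
function forwardLocalCandidates(peer: PeerLike, send: SendFn): void {
  peer.onicecandidate = (event) => {
    if (!event.candidate) return; // a null candidate means gathering has finished
    send(JSON.stringify({ type: "candidate", candidate: event.candidate.toJSON() }));
  };
}

// Callee side: add a candidate received over the signaling channel.
function handleSignal(peer: PeerLike, raw: string): Promise<void> {
  const msg = JSON.parse(raw);
  if (msg.type === "candidate" && msg.candidate) {
    return peer.addIceCandidate(msg.candidate);
  }
  return Promise.resolve(); // other message types are out of scope for this sketch
}
```

In a real exchange both peers run both halves, since each side gathers its own candidates and receives the other's.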
WebRTC applications exchange session descriptions using the Session Description Protocol (SDP). Note that SDP itself does not deliver any media from one peer to another. Its primary purpose is to describe the streaming media so that parameters can be initialized during the ‘negotiation’ between caller and callee. The core of the process is to ensure that media types, properties and formats are compatible.
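To make the idea of a session description more concrete, the sketch below reads the media sections and codec names out of an SDP string and compares two descriptions. The helpers listMediaSections and commonCodecs are hypothetical. They rely only on the standard m= and a=rtpmap line formats, and the browser performs the actual negotiation itself.

```typescript
interface MediaSection {
  kind: string;     // "audio", "video", ...
  port: number;
  protocol: string; // e.g. "UDP/TLS/RTP/SAVPF"
  codecs: string[]; // codec names taken from a=rtpmap lines
}

// Collect the media sections (m= lines) and their codec names from an SDP blob.
function listMediaSections(sdp: string): MediaSection[] {
  const sections: MediaSection[] = [];
  for (const line of sdp.split(/\r?\n/)) {
    if (line.startsWith("m=")) {
      // m=<media> <port> <proto> <fmt> ...
      const [kind, port, protocol] = line.slice(2).split(" ");
      sections.push({ kind, port: Number(port), protocol, codecs: [] });
    } else if (line.startsWith("a=rtpmap:") && sections.length > 0) {
      // a=rtpmap:<payload type> <encoding name>/<clock rate>[/<parameters>]
      const encoding = line.slice("a=rtpmap:".length).split(" ")[1];
      if (encoding) sections[sections.length - 1].codecs.push(encoding.split("/")[0]);
    }
  }
  return sections;
}

// Codec names listed by both descriptions for one media kind,
// compared case-insensitively and returned in the offer's spelling.
function commonCodecs(offerSdp: string, answerSdp: string, kind: string): string[] {
  const codecsOf = (sdp: string): string[] => {
    const section = listMediaSections(sdp).find((m) => m.kind === kind);
    return section ? section.codecs : [];
  };
  const answered = codecsOf(answerSdp).map((c) => c.toLowerCase());
  return codecsOf(offerSdp).filter((c) => answered.indexOf(c.toLowerCase()) !== -1);
}
```

Only the first section of each kind is compared, which is enough for a one-to-one call with a single audio and a single video section.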
The above functions are used in building a hybrid video chatting app…
Creating a Video Chatting App
The first step in creating a video chat app is writing the client code. While WebRTC does not need any third-party frameworks, a signaling server is needed. WebRTC does not specify any particular server technology, so the choice usually follows the framework already in use. A common choice is Node.js.
Two video elements are added to the page at this stage. The first shows the media stream from the local webcam. The second shows the media streamed from the person at the other end of the call. Two buttons are added to initiate the call and to end it.
- The prepareCall method creates a new RTCPeerConnection instance and assigns the needed event listeners.
- Clicking the Video Call button creates a new connection: peerConn = new RTCPeerConnection(peerConnCfg). The handlers check the state of the peerConn variable to decide the next action.
- When a new ICE candidate is received from the remote peer, addIceCandidate is used to add it to the remote description of the RTCPeerConnection.
- Next, the createAndSendOffer method and the corresponding answer method are used to exchange the media information between peers.
- The final step is to define the endcall() function.
With this function, the RTCPeerConnection is closed. This shuts down the video streams and also resets the state of the ‘video’ elements. Finally, the Video Call button is re-enabled so that the user can make a new call, and the End Call button is disabled.
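The call lifecycle could be sketched roughly as below. The createCallController wrapper, the CallUi shape and the JSON message format are assumptions for this example, and endCall corresponds to the endcall() function mentioned above. PeerConn is a simplified stand-in for RTCPeerConnection. A real call would also add the local tracks to the connection before creating the offer, and handle the answer coming back from the callee.

```typescript
interface SessionDescription { type: string; sdp?: string }

// Simplified stand-in for the parts of RTCPeerConnection used here.
interface PeerConn {
  onicecandidate: ((event: { candidate: unknown }) => void) | null;
  ontrack: ((event: { streams: unknown[] }) => void) | null;
  createOffer(): Promise<SessionDescription>;
  setLocalDescription(description: SessionDescription): Promise<void>;
  close(): void;
}

// The two video elements and two buttons of the page (hypothetical shape).
interface CallUi {
  localVideo: { srcObject: unknown };
  remoteVideo: { srcObject: unknown };
  callButton: { disabled: boolean };
  endButton: { disabled: boolean };
}

function createCallController(
  makePeer: () => PeerConn,       // e.g. wraps new RTCPeerConnection(peerConnCfg)
  ui: CallUi,
  send: (message: string) => void // signaling transport, assumed to exist
) {
  let peerConn: PeerConn | null = null;

  // Create the connection and assign the needed event listeners.
  function prepareCall(): PeerConn {
    const pc = makePeer();
    pc.onicecandidate = (event) => {
      if (event.candidate) send(JSON.stringify({ candidate: event.candidate }));
    };
    pc.ontrack = (event) => {
      ui.remoteVideo.srcObject = event.streams[0];
    };
    peerConn = pc;
    ui.callButton.disabled = true;
    ui.endButton.disabled = false;
    return pc;
  }

  // Create an offer, store it as the local description, and send it to the callee.
  async function createAndSendOffer(): Promise<void> {
    const pc = peerConn ?? prepareCall();
    const offer = await pc.createOffer();
    await pc.setLocalDescription(offer);
    send(JSON.stringify({ sdp: offer }));
  }

  // Close the connection, reset the video elements, and flip the buttons back.
  function endCall(): void {
    if (peerConn) peerConn.close();
    peerConn = null;
    ui.localVideo.srcObject = null;
    ui.remoteVideo.srcObject = null;
    ui.callButton.disabled = false;
    ui.endButton.disabled = true;
  }

  return { prepareCall, createAndSendOffer, endCall, isInCall: () => peerConn !== null };
}
```

Passing the connection factory in as a parameter keeps the sketch independent of the browser, so the same logic can be exercised with a fake connection outside a page.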
Creating a Signaling Server
To establish a WebRTC connection between two users or devices, a signaling server has to be used. It helps to resolve how the connection can be made over the internet, acting as the intermediary through which the two peers find each other and establish a connection. At the same time, it minimizes the exposure of private information to outside parties.
The most common options for coordinating the exchange of signaling information between the peers are WebSocket and XMLHttpRequest. Even though the Session Description Protocol is used, the signaling server does not need to interpret the content carried by the signaling data; it only relays it.
Because WebRTC does not specify the signaling mechanism, such as its protocols and methods, we have to build it ourselves. Here, Socket.IO is used for signaling.
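The relaying role of the signaling server can be sketched independently of the transport. The createSignalingHub helper, its room model and the two-peer limit are assumptions for illustration and are not Socket.IO APIs. Each deliver callback stands for whatever the chosen transport uses to push a message to one connected client.

```typescript
// Pushes one payload to one connected client (supplied by the transport layer).
type Deliver = (payload: string) => void;

function createSignalingHub(maxPeersPerRoom: number = 2) {
  const rooms = new Map<string, Map<string, Deliver>>();

  // Register a peer in a room; returns false when the room is already full.
  function join(room: string, peerId: string, deliver: Deliver): boolean {
    const peers = rooms.get(room) ?? new Map<string, Deliver>();
    if (!peers.has(peerId) && peers.size >= maxPeersPerRoom) return false;
    peers.set(peerId, deliver);
    rooms.set(room, peers);
    return true;
  }

  // Remove a peer and drop the room once it is empty.
  function leave(room: string, peerId: string): void {
    const peers = rooms.get(room);
    if (!peers) return;
    peers.delete(peerId);
    if (peers.size === 0) rooms.delete(room);
  }

  // Forward the payload, unchanged and uninterpreted, to every other peer
  // in the room. Returns the number of peers it was delivered to.
  function relay(room: string, fromPeerId: string, payload: string): number {
    const peers = rooms.get(room);
    if (!peers || !peers.has(fromPeerId)) return 0;
    let delivered = 0;
    peers.forEach((deliver, peerId) => {
      if (peerId !== fromPeerId) {
        deliver(payload);
        delivered += 1;
      }
    });
    return delivered;
  }

  return { join, leave, relay };
}
```

With Socket.IO, each deliver callback would probably wrap an emit call on that client's socket, and join, leave and relay would be called from the connection, disconnect and message handlers.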
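An important factor is that WebRTC works only over secure connections: browsers allow camera and microphone access only on pages served over HTTPS (SSL/TLS), with localhost as the usual exception during development. Hence the server has to be set up to serve the app over HTTPS.
Only metadata about the media, not the media itself, is exchanged through the signaling server, in the form of offers and answers. Both are session descriptions expressed in the Session Description Protocol (SDP).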
Finally, the app is run with the nodejs server.js command (node server.js on systems where the Node.js binary is named node).
The above sections discuss the building of a simple WebRTC hybrid app for video chatting. It has basic functionality and supports one-to-one video chat. Since only limited functionality is implemented, there are some limitations, such as no exchange of text messages or files. There is also no styling to make the app look attractive.