Peer-to-peer video chat based on WebRTC: audio and video chat in the browser


WebRTC (Web Real-Time Communications) is a technology that allows Web applications and sites to capture and selectively transmit audio and/or video media streams, as well as exchange arbitrary data between browsers, without necessarily using intermediaries. The set of standards that WebRTC technology includes allows you to exchange data and conduct peer-to-peer teleconferences without the user having to install plugins or any other third-party software.

WebRTC consists of several interconnected application programming interfaces (APIs) and protocols that work together. The documentation you'll find here will help you understand the basics of WebRTC, how to set up and use a connection for data and media streaming, and much more.

Compatibility

Since WebRTC implementations are still evolving and every browser supports WebRTC features to a different degree, we strongly recommend using Google's Adapter.js polyfill library before starting to work on your code.

Adapter.js uses shims and polyfills to smooth over differences in WebRTC implementations among the contexts that support it. Adapter.js also handles vendor prefixes and other property-naming differences, making it easier to develop on WebRTC with the most compatible results. The library is also available as an NPM package.

To further explore the Adapter.js library, see its documentation.

WebRTC Concepts and Usage

WebRTC is multi-purpose and provides powerful multimedia capabilities for the Web, including support for audio and video conferencing, file sharing, screen capture, identity management, and interoperability with legacy telephone systems, including support for DTMF tone dialing. Connections between nodes can be created without special drivers or plugins, and often without intermediary services.

The connection between two nodes is represented by an RTCPeerConnection interface object. Once a connection is established and opened, media streams (MediaStream objects) and/or data channels (RTCDataChannel objects) can be added to the connection using the RTCPeerConnection object.

Media streams can consist of any number of tracks of media information; tracks are represented by MediaStreamTrack interface objects and can contain one of several types of media data, including audio, video, and text (such as subtitles or chapter titles). Most streams consist of at least one audio track or one video track, and tracks can be sent and received as real-time media streams or saved to a file.

You can also use a connection between two nodes to exchange arbitrary data using the RTCDataChannel interface object, which can be used to transmit service information, stock market data, game state packets, file transfers, or private data.
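Since an RTCDataChannel transports plain strings (or binary data), a common pattern for "game state packets" and similar structured data is to wrap application messages in JSON. A minimal sketch, assuming our own message format: the helper names (encodeMessage, sendGameState) are illustrative, not part of the WebRTC API, and any object with a send() method will do as the channel.

```javascript
// Sketch: sending structured messages over an RTCDataChannel by
// serializing them to JSON. The message shape { kind, payload } is
// our own convention, not something WebRTC prescribes.
function encodeMessage(kind, payload) {
  return JSON.stringify({ kind: kind, payload: payload });
}

function decodeMessage(raw) {
  return JSON.parse(raw);
}

function sendGameState(channel, state) {
  // channel is assumed to be an open RTCDataChannel (or anything send()-able)
  channel.send(encodeMessage("game-state", state));
}
```

In a real application the channel would come from pc.createDataChannel() on one side and the datachannel event on the other; the receiving side would call decodeMessage() in the channel's onmessage handler.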


WebRTC interfaces

Because WebRTC provides interfaces that work together to perform different tasks, we have divided them into categories. See the sidebar index for quick navigation.

Connection setup and management

These interfaces are used to configure, open, and manage WebRTC connections. They represent peer-to-peer media connections, data channels, and the interfaces used to exchange information about the capabilities of each node in order to select the best configuration for a two-way multimedia connection.

  • RTCPeerConnection: represents a WebRTC connection between the local computer and a remote node. Used to handle data transfer between the two nodes.
  • RTCSessionDescription: represents the parameters of a session. Each RTCSessionDescription contains a type, indicating which part (offer/answer) of the negotiation process it describes, and the SDP descriptor of the session.
  • RTCIceCandidate: represents an Interactive Connectivity Establishment (ICE) candidate for establishing an RTCPeerConnection.
  • RTCIceTransport: represents information about an ICE transport.
  • RTCPeerConnectionIceEvent: represents events that occur for ICE candidates, typically on an RTCPeerConnection. Only one type, icecandidate, is passed to this event object.
  • RTCRtpSender: controls the encoding and transmission of data through a MediaStreamTrack for an RTCPeerConnection.
  • RTCRtpReceiver: controls the reception and decoding of data through a MediaStreamTrack for an RTCPeerConnection.
  • RTCTrackEvent: indicates that a new incoming MediaStreamTrack was created and an RTCRtpReceiver was added to the RTCPeerConnection.
  • RTCCertificate: represents a certificate used by an RTCPeerConnection.
  • RTCDataChannel: represents a bidirectional data channel between two nodes of a connection.
  • RTCDataChannelEvent: represents events raised when an RTCDataChannel is attached to an RTCPeerConnection, such as datachannel.
  • RTCDTMFSender: controls the encoding and transmission of dual-tone multi-frequency (DTMF) signaling for an RTCPeerConnection.
  • RTCDTMFToneChangeEvent: indicates an incoming DTMF tone change event. This event does not bubble and is not cancelable.
  • RTCStatsReport: asynchronously reports statistics for the given MediaStreamTrack.
Identity and security

  • RTCIdentityProviderRegistrar: registers an identity provider (IdP).
  • RTCIdentityProvider: enables the browser's ability to request the creation or validation of an identity assertion.
  • RTCIdentityAssertion: represents the identity of the remote node of the current connection. If the identity has not yet been set and verified, the interface reference returns null; it does not change once set.
  • RTCIdentityEvent: represents an identity assertion event object from an identity provider (IdP), fired on an RTCPeerConnection. One type, identityresult, is passed to this event.
  • RTCIdentityErrorEvent: represents an error event object associated with an identity provider (IdP), fired on an RTCPeerConnection. Two error types are passed to this event: idpassertionerror and idpvalidationerror.

Guides

  • WebRTC Architecture Overview: beneath the API that developers use to create and use WebRTC lies a set of network protocols and connection standards. This overview surveys those standards.
  • Lifetime of a WebRTC session: WebRTC lets you organize a peer-to-peer connection to transfer arbitrary data, audio or video streams, or any combination of them, in the browser. This article follows the life of a WebRTC session, from establishing the connection all the way to its termination when it is no longer needed.
  • WebRTC API Overview: WebRTC consists of several interrelated application programming interfaces (APIs) and protocols that work together to support the exchange of data and media streams between two or more nodes. This article gives a brief overview of each of these APIs and the purpose it serves.
  • WebRTC Basics: this article walks you through creating a cross-browser RTC application. By the end of it, you should have a working point-to-point data and media channel.
  • WebRTC Protocols: this article introduces the protocols that complement the WebRTC API.
This guide describes how you can use a peer-to-peer connection and an associated RTCDataChannel to exchange arbitrary data between two nodes.

WebRTC (Web Real Time Communications) is a standard that describes the transmission of streaming audio data, video data and content from and to the browser in real time without installing plugins or other extensions. The standard allows you to turn your browser into a video conferencing terminal; you just need to open a web page to start communicating.

What is WebRTC?

In this article we will look at everything the average user needs to know about WebRTC technology: its advantages and disadvantages, a few of its secrets, how it works, and where and for what WebRTC is used.

What do you need to know about WebRTC?

The evolution of video communication standards and technologies. Sergey Yutsaitis, Cisco, Video+Conference 2016

How WebRTC works

On the client side
  • The user opens a page containing an HTML5 <video> tag.
  • The browser requests access to the user's webcam and microphone.
  • The JavaScript code on the user page controls the connection parameters (IP addresses and ports of the WebRTC server or other WebRTC clients) to bypass NAT and Firewall.
  • When receiving information about the interlocutor or about the stream from the conference mixed on the server, the browser begins to negotiate the audio and video codecs used.
  • Encoding begins, and streaming data is transferred between the WebRTC clients (in our case, between the browser and the server).
On the WebRTC server side

A video server is not required to exchange data between two participants, but if you need to combine several participants in one conference, a server is required.



The video server will receive media traffic from various sources, convert it and send it to users who use WebRTC as a terminal.

Also, the WebRTC server will receive media traffic from WebRTC peers and transmit it to conference participants who use applications for desktop computers or mobile devices, if any.

Advantages of the standard
  • No software installation required.
  • Very high quality of communication, thanks to:
    • Use of modern video (VP8, H.264) and audio codecs (Opus).
    • Automatic adjustment of stream quality to connection conditions.
    • Built-in echo and noise reduction system.
    • Automatic adjustment of the sensitivity level of participant microphones (AGC).
  • High level of security: all connections are protected and encrypted using TLS and SRTP protocols.
  • There is a built-in mechanism for capturing content, for example, the desktop.
  • Possibility of implementing any management interface based on HTML5 and JavaScript.
  • The ability to integrate the interface with any back-end systems using WebSockets.
  • An open source project - you can implement it into your product or service.
  • True cross-platform: the same WebRTC application will work equally well on any operating system, desktop or mobile, provided that the browser supports WebRTC. This significantly saves resources on software development.
Disadvantages of the standard
  • To organize group audio and video conferences, a video conferencing server is required to mix the video and audio from the participants, because the browser does not know how to synchronize multiple incoming streams with each other.
  • All WebRTC solutions are incompatible with each other, because the standard describes only methods for transmitting video and audio, leaving the implementation of subscriber addressing, presence tracking, message and file exchange, scheduling, and other things to the vendor.
  • In other words, you cannot call from one developer's WebRTC application to another developer's WebRTC application.
  • Mixing group conferences requires large computing resources, so this type of video communication requires either purchasing a paid subscription or investing in your own infrastructure, where each conference needs one physical core of a modern processor.
WebRTC Secrets: How Vendors Benefit from Breakthrough Web Technology


Tsachi Levent-Levi, Bloggeek.me, Video+Conference 2015

WebRTC for the videoconferencing market

Increasing the number of videoconferencing terminals

WebRTC technology has had a strong influence on the development of the video conferencing market. After the release of the first browsers with WebRTC support in 2013, the potential number of video conferencing terminals around the world immediately increased by 1 billion devices. In fact, each browser has become a videoconferencing terminal, not inferior to its hardware counterparts in terms of communication quality.

Use in specialized solutions

Using various JavaScript libraries and cloud service APIs with WebRTC support makes it easy to add video communication to any web project. Previously, to transmit data in real time, developers had to study how the protocols work and use other companies' components, which most often required additional licensing and increased costs. WebRTC is already actively used in services such as "Call from the site", "Online chat support", and the like.

Former users of Skype for Linux

In 2014, Microsoft announced the end of support for the Skype for Linux project, which caused great irritation among IT specialists. WebRTC technology is not tied to an operating system but is implemented at the browser level, so Linux users can treat WebRTC-based products and services as a full-fledged replacement for Skype.

Competition with Flash

WebRTC and HTML5 were a death blow to Flash technology, which was already going through its worst years. Since 2017, leading browsers have officially stopped supporting Flash, and the technology has completely disappeared from the market. But we must give Flash its due: it was Flash that created the web conferencing market and offered the first technical capabilities for live communication in browsers.

WebRTC video presentations

Dmitry Odintsov, TrueConf, Video+Conference October 2017

Codecs in WebRTC

Audio codecs

WebRTC uses Opus and G.711 codecs to compress audio traffic.

G.711 is the oldest high-bitrate (64 kbps) voice codec, which is most often used in traditional telephony systems. The main advantage is the minimal computational load due to the use of lightweight compression algorithms. The codec has a low level of compression of voice signals and does not introduce additional audio delay during communication between users.
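The 64 kbps figure follows directly from G.711's sampling parameters: 8000 samples per second at 8 bits per sample. A one-line sketch of the arithmetic (the function name is ours, for illustration only):

```javascript
// G.711 transmits 8-bit samples at an 8 kHz sampling rate,
// which yields its fixed 64 kbps bitrate.
function g711BitrateKbps(sampleRateHz, bitsPerSample) {
  return (sampleRateHz * bitsPerSample) / 1000; // bits per second -> kbps
}
```

Calling g711BitrateKbps(8000, 8) returns 64, matching the bitrate quoted above.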

G.711 is supported by a large number of devices. Systems that use this codec are easier to use than those based on other audio codecs (G.723, G.726, G.728, etc.). In terms of quality, G.711 received a score of 4.2 in MOS testing (a score between 4-5 is the highest and means good quality, similar to the quality of ISDN voice traffic and even higher).

Opus is a codec with low encoding latency (from 2.5 ms to 60 ms), variable bitrate support and high compression levels, ideal for streaming audio over variable bandwidth networks. Opus is a hybrid solution that combines the best characteristics of the SILK (voice compression, elimination of distortion of human speech) and CELT (audio data coding) codecs. The codec is freely available; developers who use it do not need to pay royalties to copyright holders. Compared to other audio codecs, Opus undoubtedly wins in many respects. It has eclipsed quite popular low bitrate codecs such as MP3, Vorbis, AAC LC. Opus restores the sound “picture” closer to the original than AMR-WB and Speex. This codec is the future, which is why the creators of WebRTC technology included it in the mandatory range of supported audio standards.

Video codecs

Choosing a video codec for WebRTC took the developers several years; in the end they decided to use H.264 and VP8. Almost all modern browsers support both codecs, and a video conferencing server only needs to support one of them to work with WebRTC.

VP8 is a free video codec with an open license, characterized by high video stream decoding speed and increased resistance to frame loss. The codec is universal and easy to implement into hardware platforms, which is why developers of video conferencing systems very often use it in their products.

The paid H.264 video codec became known much earlier than its rival. It is a codec with a high degree of video stream compression while maintaining high video quality. The wide prevalence of this codec among hardware video conferencing systems argued for its inclusion in the WebRTC standard.

Google and Mozilla actively promote the VP8 codec, while Microsoft, Apple, and Cisco actively promote H.264 (to ensure compatibility with traditional video conferencing systems). This creates a big problem for developers of cloud WebRTC solutions: if all participants in a conference use the same browser, it is enough to mix the conference once with one codec, but if the browsers differ and Safari/Edge are among them, the conference has to be encoded twice with different codecs, which doubles the system requirements for the media server and, as a result, the cost of subscriptions to WebRTC services.
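Which video codecs a browser actually offers can be read directly from its SDP: the m=video section lists payload types, and a=rtpmap lines name the codecs behind them. A small sketch (the parsing approach and the SDP fragment in the test are illustrative; real browser SDP contains many more lines):

```javascript
// Sketch: list the video codecs announced in an SDP description by
// reading the a=rtpmap lines that fall inside the m=video section.
function listVideoCodecs(sdp) {
  var lines = sdp.split("\r\n");
  var inVideo = false;
  var codecs = [];
  for (var i = 0; i < lines.length; i++) {
    var line = lines[i];
    if (line.indexOf("m=") === 0) {
      // A new media section starts; track whether it is the video one
      inVideo = line.indexOf("m=video") === 0;
    }
    var m = inVideo && line.match(/^a=rtpmap:\d+ ([^/]+)\//);
    if (m) {
      codecs.push(m[1]); // e.g. "VP8" or "H264"
    }
  }
  return codecs;
}
```

A helper like this makes it easy to log, on each side, which of VP8 and H.264 the peers actually negotiated.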

WebRTC API

WebRTC technology is based on three main APIs:

  • getUserMedia (responsible for the web browser receiving audio and video signals from cameras, microphones, or the user's desktop).
  • RTCPeerConnection (responsible for the connection between browsers for exchanging media data received from the camera, microphone, and desktop; the "responsibilities" of this API also include signal processing (cleaning it of extraneous noise, adjusting microphone volume) and control over the audio and video codecs used).
  • RTCDataChannel (provides two-way data transmission over an established connection).

Before accessing the user's microphone and camera, the browser requests permission to do so. In Google Chrome, access can be configured in advance in the Settings section; in Opera and Firefox, devices are selected from a drop-down list at the moment access is granted. The permission request will always appear when using the HTTP protocol, and only once if using HTTPS.


RTCPeerConnection. Each browser participating in a WebRTC conference must have access to this object. Thanks to the use of RTCPeerConnection, media data from one browser to another can even pass through NAT and firewalls. To successfully transmit media streams, participants must exchange the following data using a transport such as web sockets:

  • the initiating participant sends to the second participant an Offer-SDP (data structure with the characteristics of the media stream that it will transmit);
  • the second participant generates a “response” - Answer-SDP and sends it to the initiator;
  • then an exchange of ICE candidates is organized between the participants, if any are detected (if the participants are behind NAT or firewalls).

After successful completion of this exchange, the direct transfer of media streams (audio and video) is organized between the participants.
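The three-step exchange above can be sketched as a single signaling-message handler on each side. The wire format ({type, sdp, candidate}) is our own convention, since WebRTC deliberately leaves the signaling transport and format to the application; the createAnswer call is also simplified to a synchronous form here, while the real browser API is asynchronous:

```javascript
// Sketch of a signaling message handler implementing the exchange above.
// pc stands for the peer connection; sendReply ships a message back over
// the signaling transport (e.g. a web socket).
function handleSignal(pc, msg, sendReply) {
  if (msg.type === "offer") {
    pc.setRemoteDescription(msg);      // remember the caller's Offer-SDP
    var answer = pc.createAnswer();    // form our Answer-SDP (simplified: synchronous here)
    pc.setLocalDescription(answer);
    sendReply({ type: "answer", sdp: answer.sdp });
  } else if (msg.type === "answer") {
    pc.setRemoteDescription(msg);      // the callee accepted: store its Answer-SDP
  } else if (msg.type === "candidate") {
    pc.addIceCandidate(msg.candidate); // feed a discovered ICE candidate to the connection
  }
}
```

Once both descriptions are set and candidates have been exchanged, the browsers start sending media directly to each other.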

RTCDataChannel. Support for the data channel protocol appeared in browsers relatively recently, so this API can only be counted on when using WebRTC in Mozilla Firefox 22+ and Google Chrome 26+. With its help, participants can exchange text messages directly in the browser.

WebRTC connection

Supported desktop browsers
  • Google Chrome (17+) and all browsers based on the Chromium engine;
  • Mozilla FireFox (18+);
  • Opera (12+);
  • Safari (11+);
Supported mobile browsers for Android
  • Google Chrome (28+);
  • Mozilla Firefox (24+);
  • Opera Mobile (12+);
  • Safari (11+).
WebRTC, Microsoft and Internet Explorer

For a very long time, Microsoft remained silent about WebRTC support in Internet Explorer and its new Edge browser. The guys from Redmond don't really like putting technologies they don't control into users' hands; that's their policy. But things gradually got moving: it was no longer possible to ignore WebRTC, and the ORTC project, a derivative of the WebRTC standard, was announced.

According to the developers, ORTC is an extension of the WebRTC standard with an improved set of APIs based on JavaScript and HTML5, which, translated into ordinary language, means that everything will be the same, only Microsoft, not Google, will control the standard and its development. The set of codecs has been expanded with support for H.264 and some audio codecs of the G.7xx series used in telephony and hardware video conferencing systems. There may be built-in support for RDP (for content transfer) and messaging. By the way, Internet Explorer users are out of luck; ORTC support will only be available in Edge. And, of course, this set of protocols and codecs easily interfaces with Skype for Business, which opens up even more business applications for WebRTC.

Technologies for making calls from the browser have been around for many years: Java, ActiveX, Adobe Flash... In the last few years it has become clear that plugins and third-party virtual machines are neither convenient (why should I have to install anything at all?) nor, most importantly, secure. What to do? There is a way out!

Until recently, IP networks used several protocols for IP telephony or video: SIP, the most common protocol; H.323 and MGCP, which are leaving the scene; Jabber/Jingle (used in Gtalk); the semi-open Adobe RTMP*; and, of course, the closed Skype. The WebRTC project, initiated by Google, is trying to revolutionize the world of IP and web telephony by making all softphones, including Skype, unnecessary. WebRTC not only implements all communication capabilities directly inside the browser, which is now installed on almost every device, but also tries to simultaneously solve the more general problem of communication between browser users (exchange of various data, screen broadcasting, collaboration on documents, and much more).

WebRTC from the web developer's perspective

From a web developer's point of view, WebRTC consists of two main parts:

  • management of media streams from local resources (camera, microphone or local computer screen) is implemented by the navigator.getUserMedia method, which returns a MediaStream object;
  • peer-to-peer communication between devices generating media streams, including defining communication methods and directly transmitting them - RTCPeerConnection objects (for sending and receiving audio and video streams) and RTCDataChannel (for sending and receiving data from the browser).
What do we do?

We will figure out how to organize a simple multi-user video chat between browsers based on WebRTC using web sockets. We’ll start experimenting in Chrome/Chromium, as the most advanced browsers in terms of WebRTC, although Firefox 22, released on June 24, has almost caught up with them. It must be said that the standard has not yet been adopted, and the API may change from version to version. All examples were tested in Chromium 28. For simplicity, we will not monitor the cleanliness of the code and cross-browser compatibility.

MediaStream

The first and simplest WebRTC component is MediaStream. It gives the browser access to media streams from the local computer's camera and microphone. In Chrome, this requires calling the function navigator.webkitGetUserMedia() (since the standard is not yet finalized, all functions come with a vendor prefix; in Firefox the same function is called navigator.mozGetUserMedia()). When it is called, the user will be asked to allow access to the camera and microphone, and the call can only proceed after the user gives consent. The function takes as parameters the constraints of the required media stream and two callback functions: the first is called if access to the camera/microphone is successfully obtained, the second in case of an error. First, let's create an HTML file rtctest1.html with a button and a <video> element:

<!DOCTYPE html>
<html>
<head>
  <title>WebRTC - first introduction</title>
  <style>
    video { height: 240px; width: 320px; border: 1px solid gray; }
  </style>
</head>
<body>
  <button onclick="getUserMedia_click();">getUserMedia</button>
  <video id="localVideo1" autoplay></video>
  <script>
    // The JavaScript code from the following sections goes here
  </script>
</body>
</html>

Microsoft CU-RTC-Web

Microsoft wouldn't be Microsoft if it didn't immediately respond to Google's initiative by releasing its own incompatible, non-standard option called CU-RTC-Web (html5labs.interoperabilitybridges.com/cu-rtc-web/cu-rtc-web.htm). Although IE's share, already small, continues to decline, the number of Skype users gives Microsoft hope of displacing Google, and it can be assumed that this standard will be used in the browser version of Skype. The Google standard is focused primarily on communication between browsers; at the same time, the bulk of voice traffic still remains on the regular telephone network, and gateways between it and IP networks are needed not only for ease of use or faster adoption, but also as a means of monetization that will allow more players to develop them. The emergence of another standard may not only burden developers with the unpleasant need to support two incompatible technologies at once, but may in the future also give users a wider choice of functionality and available technical solutions. Time will tell.

Enabling Local Stream

Inside the <script> tags of our HTML file, let's declare a global variable for the media stream:

var localStream = null;

The first parameter to the getUserMedia method must specify the parameters of the requested media stream - for example, simply enable audio or video:

var streamConstraints = { "audio": true, "video": true }; // Request access to both audio and video

Or specify additional parameters:

var streamConstraints = {
  "audio": true,
  "video": {
    "mandatory": { "maxWidth": "320", "maxHeight": "240", "maxFrameRate": "5" },
    "optional": []
  }
};

The second parameter passed to the getUserMedia method is the callback function that will be called on success:

function getUserMedia_success(stream) {
  console.log("getUserMedia_success():", stream);
  localVideo1.src = URL.createObjectURL(stream); // Connect the media stream to the <video> element
  localStream = stream; // and save it in a global variable for further use
}

The third parameter is a callback function that acts as an error handler, called in case of an error:

function getUserMedia_error(error) {
  console.log("getUserMedia_error():", error);
}

The actual call to the getUserMedia method, requesting access to the microphone and camera, happens when the button is pressed:

function getUserMedia_click() {
  console.log("getUserMedia_click()");
  navigator.webkitGetUserMedia(streamConstraints, getUserMedia_success, getUserMedia_error);
}

It is not possible to access a media stream from a file opened locally. If we try to do this, we will get the error:

NavigatorUserMediaError { code: 1, PERMISSION_DENIED: 1 }

Let's upload the resulting file to the server, open it in the browser and, in response to the request that appears, allow access to the camera and microphone.

You can select the devices that Chrome will have access to in Settings, Show advanced settings link, Privacy section, Content button. In Firefox and Opera browsers, devices are selected from a drop-down list directly when access is allowed.

When using the HTTP protocol, permission will be requested each time the media stream is accessed after the page has loaded. Switching to HTTPS will allow you to display the request once, only the very first time you access the media stream.

Notice the pulsating circle on the tab icon and the camera icon on the right side of the address bar:

RTCPeerConnection

RTCPeerConnection is an object designed to establish and transmit media streams over the network between participants. In addition, this object is responsible for generating the media session description (SDP), obtaining information about ICE candidates for traversing NAT or firewalls (local and via STUN), and interacting with the TURN server. Each participant must have one RTCPeerConnection per connection. Media streams are transmitted using the encrypted SRTP protocol.

TURN servers

There are three types of ICE candidates: host, srflx and relay. Host contains information received locally, srflx - what the node looks like to an external server (STUN), and relay - information for proxying traffic through the TURN server. If our node is behind NAT, then host candidates will contain local addresses and will be useless, srflx candidates will only help with certain types of NAT and relay will be the last hope to pass traffic through an intermediate server.

Example of an ICE candidate of type host, with address 192.168.1.37 and port udp/34022:

a=candidate:337499441 2 udp 2113937151 192.168.1.37 34022 typ host generation 0
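The candidate type is the token after "typ" in a candidate line like the one above, so it can be pulled out with a simple pattern match. A sketch (the function name is illustrative):

```javascript
// Sketch: extract the type (host / srflx / relay) from a raw ICE
// candidate line, using the "typ <type>" token of the SDP grammar.
function candidateType(candidateLine) {
  var m = candidateLine.match(/ typ (\S+)/);
  return m ? m[1] : null;
}
```

This kind of helper is handy for logging which candidate types a peer actually produced, e.g. to check whether a TURN (relay) candidate ever appeared.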

General format for specifying STUN/TURN servers:

var servers = {
  "iceServers": [
    { "url": "stun:stun.stunprotocol.org:3478" },
    { "url": "turn:user@host:port", "credential": "password" }
  ]
};

There are many public STUN servers on the Internet, and large lists of them are easy to find. Unfortunately, they solve only part of the problem. Unlike STUN, there are practically no public TURN servers. This is because a TURN server passes the media streams through itself, which can significantly load both the network channel and the server. Therefore, the easiest way to get a TURN server is to install one yourself (obviously, you will need a public IP address). Of all the servers, in my opinion, the best is rfc5766-turn-server. There is even a ready-made image for Amazon EC2.

With TURN, not everything is as good as we would like, but development is active, and one would hope that in time WebRTC will, if not equal Skype in its ability to traverse address translation (NAT) and firewalls, at least come noticeably closer.

RTCPeerConnection requires an additional mechanism for exchanging control information to establish a connection: although it generates this data, it does not transmit it, and delivery to the other participants must be implemented separately.


The choice of transfer method is up to the developer; even a manual exchange will do. As soon as the necessary data has been exchanged, RTCPeerConnection sets up the media streams automatically (if possible, of course).
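A session description is just a {type, sdp} pair, so "implementing transmission separately" can be as simple as a JSON round trip over whatever transport you have (web sockets, HTTP, even copy-paste). A sketch with illustrative function names:

```javascript
// Sketch: RTCPeerConnection produces the session description, but shipping
// it to the other side is the application's job. Serializing the {type, sdp}
// pair to JSON makes it transportable over any channel.
function packDescription(desc) {
  return JSON.stringify({ type: desc.type, sdp: desc.sdp });
}

function unpackDescription(raw) {
  return JSON.parse(raw); // ready to hand to setRemoteDescription()
}
```

The same round trip works for ICE candidates, which are likewise plain serializable objects.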

offer-answer model

To establish and change media streams, the offer/answer model (described in RFC3264) and the SDP (Session Description Protocol) are used. They are also used by the SIP protocol. In this model, there are two agents: Offerer - the one who generates the SDP description of the session to create a new one or modify an existing one (Offer SDP), and Answerer - the one who receives the SDP description of the session from another agent and responds with its own session description (Answer SDP). At the same time, the specification requires a higher-level protocol (for example, SIP or its own over web sockets, as in our case), which is responsible for transmitting SDP between agents.

What data needs to be passed between two RTCPeerConnections so that they can successfully establish media streams:

  • The first participant, initiating the connection, forms an Offer in which it transmits an SDP data structure (the same protocol is used for the same purpose in SIP) describing the possible characteristics of the media stream it is about to begin transmitting. This data block must be transferred to the second participant. The second participant forms an Answer with its own SDP and sends it to the first.
  • Both the first and the second participant perform the procedure of determining possible ICE candidates, with whose help the other participant can transmit a media stream to them. As candidates are identified, information about them should be passed to the other participant.

Forming the Offer

To generate an Offer, we need two functions. The first will be called on successful formation of the Offer; the second parameter of the createOffer() method is a callback function called in case of an error during its execution (provided that the local stream is already available).

Additionally, two event handlers are needed: onicecandidate, called when a new ICE candidate is determined, and onaddstream, called when a media stream from the far side is connected. Let's go back to our file and add another button to the HTML after the lines with the <video> elements:

createOffer

And after the line with the element (for the future):


Also at the beginning of the JavaScript code we will declare a global variable for RTCPeerConnection:

var pc1;

When calling the RTCPeerConnection constructor, you must specify the STUN/TURN servers. For more information about them, see the sidebar; as long as all participants are on the same local network, they are not required:

var servers = null;

Parameters for preparing Offer SDP

var offerConstraints = {};

The first parameter of the createOffer() method is a callback function called upon successful formation of the Offer

function pc1_createOffer_success(desc) {
    console.log("pc1_createOffer_success():\ndesc.sdp:\n" + desc.sdp + "desc:", desc);
    // Set the Offer SDP generated by RTCPeerConnection as the local description.
    pc1.setLocalDescription(desc);
    // When the far side sends its Answer SDP, it will need to be set with setRemoteDescription.
    // Until the second side is implemented, we do nothing
    // pc2_receivedOffer(desc);
}

The second parameter is a callback function that will be called in case of an error

function pc1_createOffer_error(error) {
    console.log("pc1_createOffer_error(): error:", error);
}

And let’s declare a callback function to which ICE candidates will be passed as they are determined:

function pc1_onicecandidate(event) {
    if (event.candidate) {
        console.log("pc1_onicecandidate():\n" + event.candidate.candidate.replace("\r\n", ""), event.candidate);
        // Until the second side is implemented, we do nothing
        // pc2.addIceCandidate(new RTCIceCandidate(event.candidate));
    }
}

And also a callback function for adding a media stream from the far side (for the future, since for now we only have one RTCPeerConnection):

function pc1_onaddstream(event) {
    console.log("pc1_onaddstream()");
    remoteVideo1.src = URL.createObjectURL(event.stream);
}

When the "createOffer" button is clicked, we will create an RTCPeerConnection, set the onicecandidate and onaddstream handlers, and request the formation of an Offer SDP by calling the createOffer() method:

function createOffer_click() {
    console.log("createOffer_click()");
    pc1 = new webkitRTCPeerConnection(servers); // Create the RTCPeerConnection
    pc1.onicecandidate = pc1_onicecandidate;   // Callback for processing ICE candidates
    pc1.onaddstream = pc1_onaddstream;         // Callback called when a media stream arrives from the far side. There is none yet
    pc1.addStream(localStream);                // Pass the local media stream (assuming it has already been obtained)
    pc1.createOffer(                           // And actually request formation of the Offer
        pc1_createOffer_success,
        pc1_createOffer_error,
        offerConstraints);
}

Let's save the file as rtctest2.html, upload it to the server, open it in a browser, and watch in the console what data is generated as it runs. The second video will not appear yet, since there is only one participant. Recall that SDP is a description of the parameters of a media session (available codecs and media streams), and ICE candidates are the possible ways of connecting to a given participant.
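Since SDP is plain text, the media sections just mentioned are easy to inspect programmatically. A minimal sketch, not a full parser; the sample SDP below is abridged and purely illustrative:

```javascript
// Pull the media sections ("m=" lines) out of an SDP blob.
function mediaSections(sdp) {
  return sdp.split("\r\n")
    .filter(function (line) { return line.indexOf("m=") === 0; })
    .map(function (line) { return line.slice(2).split(" ")[0]; });
}

// Abridged, illustrative SDP fragment.
var sampleSdp = [
  "v=0",
  "o=- 20518 0 IN IP4 203.0.113.1",
  "s=-",
  "m=audio 54400 RTP/SAVPF 0 96",
  "a=rtpmap:0 PCMU/8000",
  "m=video 55400 RTP/SAVPF 97 98",
  "a=rtpmap:97 H264/90000"
].join("\r\n");

console.log(mediaSections(sampleSdp)); // [ 'audio', 'video' ]
```

Dumping desc.sdp in the console and scanning its m= lines this way is a quick check of which streams an Offer actually describes.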

Forming the Answer SDP and exchanging ICE candidates

Both the Offer SDP and each ICE candidate must be transferred to the other side, where the receiving RTCPeerConnection calls setRemoteDescription for the Offer SDP and addIceCandidate for each ICE candidate received from the far side; the same happens in the opposite direction with the Answer SDP and the remote ICE candidates. The Answer SDP itself is formed much like the Offer; the difference is that createAnswer is called instead of createOffer, and before that setRemoteDescription is called on the RTCPeerConnection with the Offer SDP received from the caller.
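The call order described above can be modeled with stub objects that simply record which methods were invoked. This is a toy model of the exchange, not the real RTCPeerConnection API:

```javascript
// Toy stand-in for RTCPeerConnection that records the order of calls.
function StubPeer(name) {
  this.name = name;
  this.calls = [];
}
StubPeer.prototype.setLocalDescription = function (d) { this.calls.push("setLocal:" + d); };
StubPeer.prototype.setRemoteDescription = function (d) { this.calls.push("setRemote:" + d); };

var caller = new StubPeer("pc1");
var callee = new StubPeer("pc2");

// The caller creates an Offer and sets it as its local description...
caller.setLocalDescription("offer");
// ...the Offer travels over signaling; the callee sets it as remote,
// then produces an Answer and sets it as local...
callee.setRemoteDescription("offer");
callee.setLocalDescription("answer");
// ...and the Answer travels back to the caller.
caller.setRemoteDescription("answer");

console.log(caller.calls); // [ 'setLocal:offer', 'setRemote:answer' ]
```

Each side thus ends up with exactly one local and one remote description, which is the precondition for media to flow.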

Let's add another video element to the HTML:

And a global variable for the second RTCPeerConnection under the declaration of the first one:

var pc2;

Processing Offer and Answer SDP

Forming the Answer SDP is very similar to the Offer. In the callback called upon successful formation of an Answer we will, as with the Offer, set the local description and pass the resulting Answer SDP to the first participant:

function pc2_createAnswer_success(desc) {
    pc2.setLocalDescription(desc);
    console.log("pc2_createAnswer_success()", desc.sdp);
    pc1.setRemoteDescription(desc);
}

The callback function, called in case of an error when generating Answer, is completely similar to Offer:

function pc2_createAnswer_error(error) {
    console.log("pc2_createAnswer_error():", error);
}

Parameters for forming Answer SDP:

var answerConstraints = {
    "mandatory": {
        "OfferToReceiveAudio": true,
        "OfferToReceiveVideo": true
    }
};

When the second participant receives the Offer, we will create an RTCPeerConnection and form an Answer in the same way as the Offer:

function pc2_receivedOffer(desc) {
    console.log("pc2_receivedOffer()", desc);
    // Create an RTCPeerConnection object for the second participant, just like for the first
    pc2 = new webkitRTCPeerConnection(servers);
    pc2.onicecandidate = pc2_onicecandidate; // Handler fired when an ICE candidate appears
    pc2.onaddstream = pc2_onaddstream;       // When a stream appears, connect it to the HTML
    pc2.addStream(localStream);              // Pass the local media stream (in our example the second participant uses the same one as the first)
    // Now that the second RTCPeerConnection is ready, pass it the received Offer SDP
    pc2.setRemoteDescription(new RTCSessionDescription(desc));
    // Ask the second connection to generate the data for the Answer message
    pc2.createAnswer(pc2_createAnswer_success, pc2_createAnswer_error, answerConstraints);
}

To transfer the Offer SDP from the first participant to the second in our example, let's uncomment the call in pc1_createOffer_success():

pc2_receivedOffer(desc);

To implement the processing of ICE candidates, let's uncomment, in the first participant's ICE-candidate readiness handler pc1_onicecandidate(), the line that passes them to the second:

pc2.addIceCandidate(new RTCIceCandidate(event.candidate));

The second participant's ICE-candidate readiness handler mirrors the first:

function pc2_onicecandidate(event) {
    if (event.candidate) {
        console.log("pc2_onicecandidate():", event.candidate.candidate);
        pc1.addIceCandidate(new RTCIceCandidate(event.candidate));
    }
}

Callback function for adding a media stream from the first participant:

function pc2_onaddstream(event) {
    console.log("pc2_onaddstream()");
    remoteVideo2.src = URL.createObjectURL(event.stream);
}

Ending the connection

Let's add another button to the HTML

Hang Up

And a function to terminate the connection

function btnHangupClick() {
    // Disconnect the local video from the HTML element, stop the local media stream, set the pointer to null
    localVideo1.src = "";
    localStream.stop();
    localStream = null;
    // For each participant: disconnect the video from the HTML element, close the connection, set the pointer to null
    remoteVideo1.src = "";
    pc1.close();
    pc1 = null;
    remoteVideo2.src = "";
    pc2.close();
    pc2 = null;
}

Let's save it as rtctest3.html, upload it to the server and open it in the browser. This example implements two-way transmission of media streams between two RTCPeerConnections within the same browser tab. To organize the exchange of Offer and Answer SDP, ICE candidates and other information between participants over the network, instead of calling the procedures directly we will need to exchange messages between the participants over some transport, in our case WebSockets.

Screen broadcast

The getUserMedia function can also capture the screen and stream it as a MediaStream by specifying the following parameters:

var mediaStreamConstraints = {
    audio: false,
    video: {
        mandatory: { chromeMediaSource: "screen" },
        optional: []
    }
};

To successfully access the screen, several conditions must be met:

  • enable the screen-capture flag for getUserMedia() in chrome://flags;
  • the source file must be downloaded via HTTPS (SSL origin);
  • the audio stream should not be requested;
  • Multiple requests should not be executed in one browser tab.
Libraries for WebRTC

Although WebRTC is not yet finalized, several libraries based on it have already appeared. JsSIP is designed for creating browser-based softphones that work with SIP switches such as Asterisk and Kamailio. PeerJS makes it easier to create P2P networks for data exchange, and Holla reduces the amount of development required for P2P communication from browsers.

Node.js and socket.io

In order to organize the exchange of SDP and ICE candidates between two RTCPeerConnections via the network, we use Node.js with the socket.io module.

Installing the latest stable version of Node.js (for Debian/Ubuntu) is done as follows:

$ sudo apt-get install python-software-properties python g++ make
$ sudo add-apt-repository ppa:chris-lea/node.js
$ sudo apt-get update
$ sudo apt-get install nodejs

Installation for other operating systems is described in the official Node.js documentation.

Let's check:

$ echo 'sys = require("util"); sys.puts("Test message");' > nodetest1.js
$ nodejs nodetest1.js

Using npm (Node Package Manager) we will install socket.io and the additional express module:

$ npm install socket.io express

Let's test it by creating a nodetest2.js file for the server side:

$ nano nodetest2.js

var app = require("express")(),
    server = require("http").createServer(app),
    io = require("socket.io").listen(server);

server.listen(80); // If port 80 is free

app.get("/", function (req, res) { // When the root page is requested
    res.sendfile(__dirname + "/nodetest2.html"); // send the HTML file
});

io.sockets.on("connection", function (socket) { // On client connection
    socket.emit("server event", { hello: "world" }); // send a message
    socket.on("client event", function (data) { // and declare a handler for messages arriving from the client
        console.log(data);
    });
});

And nodetest2.html for the client side:

$ nano nodetest2.html

var socket = io.connect("/"); // WebSocket server URL (the root of the server the page was loaded from)
socket.on("server event", function (data) {
    console.log(data);
    socket.emit("client event", { "name": "value" });
});

Let's start the server:

$ sudo nodejs nodetest2.js

and open the page http://localhost:80 (if running locally on port 80) in the browser. If everything is successful, in the browser's JavaScript console we will see the exchange of events between the browser and the server upon connection.

Exchanging information between RTCPeerConnections via WebSockets

Client part

Let's save our main example (rtctest3.html) under the new name rtctest4.html. Let's include the socket.io library in the HTML head:

And at the beginning of the JavaScript script - connecting to websockets:

var socket = io.connect("http://localhost");

Let's replace the direct call to the functions of another participant by sending him a message via web sockets:

function pc1_createOffer_success(desc) {
    ...
    // pc2_receivedOffer(desc);
    socket.emit("offer", desc);
    ...
}

function pc2_createAnswer_success(desc) {
    ...
    // pc1.setRemoteDescription(desc);
    socket.emit("answer", desc);
}

function pc1_onicecandidate(event) {
    ...
    // pc2.addIceCandidate(new RTCIceCandidate(event.candidate));
    socket.emit("ice1", event.candidate);
    ...
}

function pc2_onicecandidate(event) {
    ...
    // pc1.addIceCandidate(new RTCIceCandidate(event.candidate));
    socket.emit("ice2", event.candidate);
    ...
}

In the btnHangupClick() function, instead of directly calling the functions of the second participant, we will send a message via WebSockets:

function btnHangupClick() {
    ...
    // remoteVideo2.src = ""; pc2.close(); pc2 = null;
    socket.emit("hangup", {});
}

And add message receiving handlers:

socket.on("offer", function (data) {
    console.log("socket.on('offer'):", data);
    pc2_receivedOffer(data);
});
socket.on("answer", function (data) {
    console.log("socket.on('answer'):", data);
    pc1.setRemoteDescription(new RTCSessionDescription(data));
});
socket.on("ice1", function (data) {
    console.log("socket.on('ice1'):", data);
    pc2.addIceCandidate(new RTCIceCandidate(data));
});
socket.on("ice2", function (data) {
    console.log("socket.on('ice2'):", data);
    pc1.addIceCandidate(new RTCIceCandidate(data));
});
socket.on("hangup", function (data) {
    console.log("socket.on('hangup'):", data);
    remoteVideo2.src = "";
    pc2.close();
    pc2 = null;
});

Server part

On the server side, save the nodetest2.js file under the new name rtctest4.js, and inside the io.sockets.on("connection", function (socket) { ... }) handler add receiving and forwarding of client messages:

socket.on("offer", function (data) {
    // When we receive an "offer" message, since there is only one client
    // connection in this example, we send the message back through the same socket
    socket.emit("offer", data);
    // If we needed to forward the message over all connections except the sender's:
    // socket.broadcast.emit("offer", data);
});
socket.on("answer", function (data) { socket.emit("answer", data); });
socket.on("ice1", function (data) { socket.emit("ice1", data); });
socket.on("ice2", function (data) { socket.emit("ice2", data); });
socket.on("hangup", function (data) { socket.emit("hangup", data); });

In addition, let's change the name of the HTML file:

// res.sendfile(__dirname + "/nodetest2.html"); // Send the HTML file
res.sendfile(__dirname + "/rtctest4.html");

Starting the server:

$ sudo nodejs rtctest4.js

Although the code of both clients runs within the same browser tab, all interaction between the participants in our example is carried out entirely over the network, so "separating" the participants presents no particular difficulty. What we did was also very simple; these technologies are attractive precisely because they are easy to use, even if that ease is sometimes deceptive. In particular, let's not forget that without STUN/TURN servers our example will not work in the presence of address translation and firewalls.

Conclusion

The resulting example is very simplistic, but if you generalize the event handlers slightly so that they do not differ between the caller and the called party, replace the two objects pc1 and pc2 with an array of RTCPeerConnections, and implement dynamic creation and removal of elements, you will get a perfectly usable video chat. There are no special WebRTC-specific difficulties here, and an example of a simple video chat for several participants (as well as the sources of all the examples in this article) is on the disk that comes with the magazine. Many good examples can already be found on the Internet; in particular, the following were used in preparing the article: simpl.info getUserMedia, simpl.info RTCPeerConnection, WebRTC Reference App.

It can be assumed that very soon, thanks to WebRTC, there will be a revolution not only in our understanding of voice and video communications, but also in the way we perceive the Internet as a whole. WebRTC is positioned not only as a technology for browser-to-browser calls, but also as a real-time communication technology. The video communication that we have discussed is only a small part of the possible options for its use. There are already examples of screencasting and collaboration, and even a browser-based P2P content delivery network using RTCDataChannel.

The purpose of this article is to use a demo peer-to-peer video chat (p2p video chat) to explore its structure and operating principle. For this we will use the multi-user peer-to-peer video chat demo webrtc.io-demo, which can be downloaded from: https://github.com/webRTC/webrtc.io-demo/tree/master/site.

It should be noted that GitHub is a web service for collaborative development of web projects. Developers can publish the code of their projects there, discuss it, and communicate with each other; some large IT companies also host their official repositories on the site. The service is free for open-source projects.

So, we will place the peer-to-peer video chat demo downloaded from GitHub into a newly created "webrtc_demo" directory on the C: drive of the PC.


Fig. 1

As the structure shows (Fig. 1), the peer-to-peer video chat consists of a client script, script.js, and a server script, server.js, both implemented in JavaScript. The webrtc.io.js (CLIENT) library organizes real-time communication between browsers using a peer-to-peer "client-client" scheme, while webrtc.io.js (CLIENT) and webrtc.io.js (SERVER) together provide duplex communication between the browser and the web server over the WebSocket protocol, using a client-server architecture.

The webrtc.io.js (SERVER) script is included in the webrtc.io library and is located in the node_modules\webrtc.io\lib directory. The video chat interface index.html is implemented in HTML5 and CSS3. The contents of the webrtc_demo application files can be viewed using one of the html editors, for example "Notepad++".

We will check the operating principle of the video chat on the local file system. To run the server (server.js) on a PC, you need to install the Node.js runtime, which allows JavaScript code to run outside the browser. You can download Node.js from http://nodejs.org/ (version v0.10.13 as of 07/15/13). On the main page of the nodejs.org site, click the download button to go to http://nodejs.org/download/. Windows users should download the win installer (.msi), run it, and install nodejs and the npm package manager into the Program Files directory.




Fig. 2

Thus, node.js consists of a runtime for developing and executing JavaScript code plus a set of internal modules, with additional modules installable via the npm package manager.

To install modules, you need to run the command on the command line from the application directory (for example, "webrtc_demo"): npm install module_name. During the installation of modules, the npm manager creates a node_modules folder in the directory from which the installation was performed. During operation, nodejs automatically connects modules from the node_modules directory.

So, after installing node.js, open the command line and update the express module in the node_modules folder of the webrtc_demo directory using the npm package manager:

C:\webrtc_demo>npm install express

The express module is a web application framework for node.js. To have global access to express, you can install it with: npm install -g express.

Then update the webrtc.io module:

C:\webrtc_demo>npm install webrtc.io

Then we launch the server (server.js) from the command line:

C:\webrtc_demo>node server.js


Fig. 3

That's it, the server is running successfully (Fig. 3). Now a web browser can contact the server by IP address and load the index.html page, from which the browser will fetch the client script code, script.js, and the webrtc.io.js script code, and execute them. For the peer-to-peer video chat to work (to establish a connection between two browsers), two WebRTC-capable browsers must contact the signaling server running on node.js.

As a result, the interface of the client part of the communication application (video chat) will open with a request for permission to access the camera and microphone (Fig. 4).



Fig. 4

After clicking the "Allow" button, the camera and microphone are connected for multimedia communication. In addition, you can communicate via text data through the video chat interface (Fig. 5).



Fig. 5

It should be noted that the server is a signaling server, mainly designed to establish connections between user browsers. Node.js runs the server.js script that provides the WebRTC signaling.

WebRTC is an API provided by the browser that allows you to organize a P2P connection and transfer data directly between browsers. There are quite a few tutorials on the Internet on how to write your own video chat using WebRTC; for example, there is an article on Habr. However, they are all limited to connecting two clients. In this article I will try to explain how to organize connections and message exchange between three or more users using WebRTC.

The RTCPeerConnection interface is a peer-to-peer connection between two browsers. To connect three or more users, we will have to organize a mesh network (a network in which each node is connected to all other nodes).
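It is worth noting what "each node connected to all other nodes" costs: the number of links grows quadratically with the number of participants. A minimal sketch (the function name is my own, purely illustrative):

```javascript
// In a full mesh each of the n participants keeps a connection to the
// other n - 1 peers, so the room as a whole holds n * (n - 1) / 2 links.
function meshLinkCount(n) {
  return n * (n - 1) / 2;
}

console.log(meshLinkCount(2)); // 1
console.log(meshLinkCount(5)); // 10
```

This quadratic growth is why a mesh topology is only practical for small rooms.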
We will use the following scheme:

  • When opening the page, we check for the presence of the room ID in location.hash
  • If the room ID is not specified, generate a new one
  • We send the signaling server a message that we want to join the specified room
  • Signalling server sends a notification about a new user to other clients in this room
  • Clients already in the room send the newcomer an SDP offer
  • The newcomer responds to each offer

0. Signaling server

As you know, although WebRTC provides P2P connections between browsers, its operation still requires an additional transport for exchanging service messages. In this example that transport is a WebSocket server written in Node.JS using socket.io:

var socket_io = require("socket.io");
module.exports = function (server) {
    var users = {};
    var io = socket_io(server);
    io.on("connection", function(socket) {
        // A new user wants to join a room
        socket.on("room", function(message) {
            var json = JSON.parse(message);
            // Add the socket to the list of users
            users[json.id] = socket;
            if (socket.room !== undefined) {
                // If the socket is already in some room, leave it
                socket.leave(socket.room);
            }
            // Enter the requested room
            socket.room = json.room;
            socket.join(socket.room);
            socket.user_id = json.id;
            // Notify the other clients in this room that a new participant has joined
            socket.broadcast.to(socket.room).emit("new", json.id);
        });
        // A message related to WebRTC (SDP offer, SDP answer or ICE candidate)
        socket.on("webrtc", function(message) {
            var json = JSON.parse(message);
            if (json.to !== undefined && users[json.to] !== undefined) {
                // If the message specifies a recipient known to the server,
                // send the message only to that recipient...
                users[json.to].emit("webrtc", message);
            } else {
                // ...otherwise treat the message as a broadcast
                socket.broadcast.to(socket.room).emit("webrtc", message);
            }
        });
        // Someone has disconnected
        socket.on("disconnect", function() {
            // When a client disconnects, notify the others
            socket.broadcast.to(socket.room).emit("leave", socket.user_id);
            delete users[socket.user_id];
        });
    });
};

1. index.html

The source code of the page itself is quite simple. I deliberately paid no attention to layout and other niceties, since this article is not about that. Making it pretty won't be difficult for anyone who wants to.

(The page markup is omitted here; its visible text includes the "WebRTC Chat Demo" title, a "Connected to 0 peers" counter, and a "Send" button.)

2. main.js

2.0. Getting references to page elements and WebRTC interfaces

var chatlog = document.getElementById("chatlog");
var message = document.getElementById("message");
var connection_num = document.getElementById("connection_num");
var room_link = document.getElementById("room_link");

We still have to use browser prefixes to access the WebRTC interfaces.

var PeerConnection = window.mozRTCPeerConnection || window.webkitRTCPeerConnection;
var SessionDescription = window.mozRTCSessionDescription || window.RTCSessionDescription;
var IceCandidate = window.mozRTCIceCandidate || window.RTCIceCandidate;

2.1. Determining the room ID

Here we need a function for generating unique room and user identifiers. We will use UUIDs for this purpose.

function uuid() {
    var s4 = function() {
        return Math.floor(Math.random() * 0x10000).toString(16);
    };
    return s4() + s4() + "-" + s4() + "-" + s4() + "-" + s4() + "-" + s4() + s4() + s4();
}

Now let's try to extract the room identifier from the address. If none is specified, we will generate a new one. We will display a link to the current room on the page and, at the same time, generate the identifier of the current user.

var ROOM = location.hash.substr(1);
if (!ROOM) {
    ROOM = uuid();
}
room_link.innerHTML = "Link to the room";
var ME = uuid();

2.2. WebSocket

As soon as the page opens, we connect to our signaling server, send a request to enter the room, and set up the message handlers.

// Tell socket.io to notify the server when the page is unloaded
var socket = io.connect("", {"sync disconnect on unload": true});
socket.on("webrtc", socketReceived);
socket.on("new", socketNewPeer);
// Immediately send a request to enter the room
socket.emit("room", JSON.stringify({id: ME, room: ROOM}));
// Helper function for sending addressed messages related to WebRTC
function sendViaSocket(type, message, to) {
    socket.emit("webrtc", JSON.stringify({id: ME, to: to, type: type, data: message}));
}

2.3. PeerConnection settings

Most ISPs provide Internet access through NAT, which makes a direct connection less than trivial. When creating a connection we must specify a list of STUN and TURN servers that the browser will try to use to get around NAT. We will also indicate a couple of additional connection options.

var server = {
    iceServers: [
        {url: "stun:23.21.150.121"},
        {url: "stun:stun.l.google.com:19302"},
        {url: "turn:numb.viagenie.ca", credential: "your password goes here", username: "[email protected]"}
    ]
};
var options = {
    optional: [
        {DtlsSrtpKeyAgreement: true}, // required for a connection between Chrome and Firefox
        {RtpDataChannels: true}       // required in Firefox to use the DataChannels API
    ]
}

2.4. Connecting a new user

When a new peer joins the room, the server sends the other clients a "new" message. According to the message handlers above, the socketNewPeer function will be called.

var peers = {};

function socketNewPeer(data) {
    peers[data] = {candidateCache: []};
    // Create a new connection
    var pc = new PeerConnection(server, options);
    // Initialize it
    initConnection(pc, data, "offer");
    // Save the peer in the peers list
    peers[data].connection = pc;
    // Create a DataChannel through which messages will be exchanged
    var channel = pc.createDataChannel("mychannel", {});
    channel.owner = data;
    peers[data].channel = channel;
    // Install the channel's event handlers
    bindEvents(channel);
    // Create an SDP offer
    pc.createOffer(function(offer) {
        pc.setLocalDescription(offer);
    });
}

function initConnection(pc, id, sdpType) {
    pc.onicecandidate = function (event) {
        if (event.candidate) {
            // When a new ICE candidate is discovered, add it to the list for later sending
            peers[id].candidateCache.push(event.candidate);
        } else {
            // When candidate discovery is complete, the handler is called again, but without a candidate.
            // First send the peer the SDP offer or SDP answer (depending on the function parameter)...
            sendViaSocket(sdpType, pc.localDescription, id);
            // ...and then all the previously discovered ICE candidates
            for (var i = 0; i < peers[id].candidateCache.length; i++) {
                sendViaSocket("candidate", peers[id].candidateCache[i], id);
            }
        }
    }
    pc.oniceconnectionstatechange = function (event) {
        if (pc.iceConnectionState == "disconnected") {
            connection_num.innerText = parseInt(connection_num.innerText) - 1;
            delete peers[id];
        }
    }
}

function bindEvents(channel) {
    channel.onopen = function () {
        connection_num.innerText = parseInt(connection_num.innerText) + 1;
    };
    channel.onmessage = function (e) {
        chatlog.innerHTML += "Peer says: " + e.data + "<br />";
    };
}

2.5. SDP offer, SDP answer, ICE candidate

When we receive one of these messages, we call the handler for the corresponding message type.

function socketReceived(data) {
    var json = JSON.parse(data);
    switch (json.type) {
        case "candidate":
            remoteCandidateReceived(json.id, json.data);
            break;
        case "offer":
            remoteOfferReceived(json.id, json.data);
            break;
        case "answer":
            remoteAnswerReceived(json.id, json.data);
            break;
    }
}

2.5.0. SDP offer

function remoteOfferReceived(id, data) {
    createConnection(id);
    var pc = peers[id].connection;
    pc.setRemoteDescription(new SessionDescription(data));
    pc.createAnswer(function(answer) {
        pc.setLocalDescription(answer);
    });
}

function createConnection(id) {
    if (peers[id] === undefined) {
        peers[id] = {candidateCache: []};
        var pc = new PeerConnection(server, options);
        initConnection(pc, id, "answer");
        peers[id].connection = pc;
        pc.ondatachannel = function(e) {
            peers[id].channel = e.channel;
            peers[id].channel.owner = id;
            bindEvents(peers[id].channel);
        }
    }
}

2.5.1. SDP answer

function remoteAnswerReceived(id, data) {
    var pc = peers[id].connection;
    pc.setRemoteDescription(new SessionDescription(data));
}

2.5.2. ICE candidate

function remoteCandidateReceived(id, data) {
    createConnection(id);
    var pc = peers[id].connection;
    pc.addIceCandidate(new IceCandidate(data));
}

2.6. Sending a message

When the Send button is clicked, the sendMessage function is called. All it does is iterate over the list of peers and try to send the specified message to every one of them.
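The sendMessage function itself is not reproduced here; below is a minimal sketch of what it might look like, assuming each entry in peers holds its DataChannel under .channel (the try/catch guard and the returned delivery count are my own additions, not part of the original demo):

```javascript
// Sketch: send `text` over every peer's DataChannel; returns how many
// peers it was actually delivered to.
function sendMessage(peers, text) {
  var delivered = 0;
  for (var id in peers) {
    var channel = peers[id].channel;
    try {
      channel.send(text);
      delivered++;
    } catch (e) {
      // The channel may not be open yet; skip this peer.
    }
  }
  return delivered;
}

// Usage with stub channels standing in for real RTCDataChannels:
var stubPeers = {
  a: { channel: { send: function (t) { this.last = t; } } },
  b: { channel: { send: function () { throw new Error("not open"); } } }
};
console.log(sendMessage(stubPeers, "hello")); // 1
```

In the real page the text would be read from the message input element and the sender's own line appended to the chat log.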





    
