Standard WebRTC connection establishment process

This section describes only the connection process itself; the signaling for initiating and accepting a call is omitted. The steps are as follows:

1) WebRTC client A sends its SDP offer to WebRTC client B through the Signal Server. After processing the offer locally, B returns its SDP answer to A through the Signal Server.

2) Both A and B send a Binding Request to the STUN Server and read their own external address from the MAPPED-ADDRESS attribute of the server's response (XOR-MAPPED-ADDRESS when the server follows RFC 5389).

3) A and B gather their ICE candidates and send them to each other through the Signal Server;

4) Both parties start NAT traversal by sending STUN Binding Requests to each other's ICE candidates;

5) NAT traversal succeeds, the P2P connection between A and B is established, and the media communication stage begins.

This process has three core parts: SDP negotiation, ICE candidate exchange, and connectivity checks using STUN Binding Requests and Responses.
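To make step 2 concrete, here is a minimal Python sketch of building a STUN Binding Request and reading the external address out of the response. It assumes the RFC 5389 message layout and handles IPv4 only; the function names are illustrative and are not taken from any WebRTC library.

```python
import ipaddress
import os
import struct

MAGIC_COOKIE = 0x2112A442
BINDING_REQUEST = 0x0001
BINDING_SUCCESS = 0x0101
ATTR_MAPPED_ADDRESS = 0x0001
ATTR_XOR_MAPPED_ADDRESS = 0x0020


def build_binding_request(transaction_id=None):
    """Return (packet, transaction_id) for a Binding Request without attributes."""
    tid = transaction_id or os.urandom(12)
    # Header: message type (2 bytes), body length (2), magic cookie (4), transaction ID (12).
    return struct.pack("!HHI", BINDING_REQUEST, 0, MAGIC_COOKIE) + tid, tid


def parse_mapped_address(packet):
    """Return (ip, port) from a Binding Success Response, or None if not found."""
    if len(packet) < 20:
        return None
    msg_type, length, cookie = struct.unpack("!HHI", packet[:8])
    if msg_type != BINDING_SUCCESS or cookie != MAGIC_COOKIE:
        return None
    pos, end = 20, min(20 + length, len(packet))
    while pos + 4 <= end:
        attr_type, attr_len = struct.unpack("!HH", packet[pos:pos + 4])
        value = packet[pos + 4:pos + 4 + attr_len]
        is_address = attr_type in (ATTR_MAPPED_ADDRESS, ATTR_XOR_MAPPED_ADDRESS)
        # value layout: reserved (1), family (1, 0x01 = IPv4), port (2), address (4)
        if is_address and len(value) >= 8 and value[1] == 0x01:
            port, addr = struct.unpack("!HI", value[2:8])
            if attr_type == ATTR_XOR_MAPPED_ADDRESS:
                # The XOR variant masks the port and address with the magic cookie.
                port ^= MAGIC_COOKIE >> 16
                addr ^= MAGIC_COOKIE
            return str(ipaddress.IPv4Address(addr)), port
        # Attributes are padded to a multiple of 4 bytes.
        pos += 4 + attr_len + (-attr_len % 4)
    return None
```

In practice the request would be sent over the same UDP socket that later carries media, and the response would be matched to the request by its transaction ID before it is trusted.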

WebRTC gateway connection establishment process

Now that we’ve looked at the standard WebRTC connection process, let’s look at how the WebRTC client connects to the gateway.

First, the gateway's Media Server has a public IP address, so it does not need a STUN Server to discover its own address. The WebRTC client negotiates the SDP, including the ICE candidates, with the gateway's Signal Server, and the Media Server assigns an IP address and port that are given to the client as the gateway's ICE candidate. Because that address is public, a STUN Binding Request sent to it by the client reaches the server directly, and the server returns a Binding Response. The client and the gateway's Media Server then perform the DTLS handshake to negotiate keys and, on that basis, exchange SRTP audio and video. At this point the connection between the WebRTC client and the gateway server is established.
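Since the STUN, DTLS and SRTP packets described above all arrive at the same gateway candidate address, the server has to tell them apart before it can process them. The article does not say how its gateway does this; the sketch below assumes the first-byte ranges from RFC 7983 and the common second-byte check for RTCP (RFC 5761). The function name is illustrative.

```python
def classify_packet(data: bytes) -> str:
    """Classify a UDP payload received on a WebRTC media port by its leading bytes."""
    if len(data) < 2:
        return "unknown"
    first = data[0]
    if first <= 3:
        return "stun"   # STUN Binding Request/Response (ICE connectivity checks)
    if 20 <= first <= 63:
        return "dtls"   # DTLS handshake used to negotiate the SRTP keys
    if 128 <= first <= 191:
        # RTP and RTCP both carry version 2 in the top two bits of the first byte.
        # RTCP packet types occupy 192..223 in the second byte.
        return "rtcp" if 192 <= data[1] <= 223 else "rtp"
    return "unknown"
```

A receive loop would call this once per datagram and hand the payload to the STUN, DTLS or SRTP handler accordingly.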

The simplest server-side port scheme for a WebRTC gateway's media architecture is to allocate one port to each client and use that port on the server to tell users apart. As described in the figure, A, B, C and D correspond to UDP ports 10001~10004 respectively on the WebRTC gateway server. This solution is logically simple and is used by many open source servers, such as Janus. Another reason is that these servers use the libnice library to build the ICE connection between server and client, which fits the multi-port architecture.

What's wrong with so many ports?

1) Many network egress firewalls limit the UDP ports that can pass through;

2) Opening so many ports on the server is a security concern in itself, and operations teams in particular tend to refuse it;

3) Opening so many ports on the server also has some impact on cost and performance.

Can we use a single port? With a single port, the core problem is how to tell which WebRTC client each RTP/RTCP packet belongs to.

To solve this problem, we need a few tricks. First, let's review some background, shown in the diagram below:

1) The ice-ufrag values configured in the SDP offer and answer are used to authenticate STUN packets. The USERNAME attribute of a STUN Binding Request is therefore a concatenation of the ice-ufrag from the offer and the ice-ufrag from the answer, joined by a colon with the receiver's ice-ufrag first.

2) The client sends its STUN Binding Requests from the same local UDP socket (FD) that ICE later uses to send media once the connection is established, so the server sees the same client IP address and port for both.
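As an illustration of point 1, the sketch below pulls the USERNAME attribute out of a STUN Binding Request and splits it into its two ice-ufrag halves. It assumes the RFC 5389 attribute layout; the function names are made up for this example.

```python
import struct

MAGIC_COOKIE = 0x2112A442
BINDING_REQUEST = 0x0001
ATTR_USERNAME = 0x0006


def extract_username(packet: bytes):
    """Return the USERNAME attribute of a STUN Binding Request as str, or None."""
    if len(packet) < 20:
        return None
    msg_type, length, cookie = struct.unpack("!HHI", packet[:8])
    if msg_type != BINDING_REQUEST or cookie != MAGIC_COOKIE:
        return None
    pos, end = 20, min(20 + length, len(packet))
    while pos + 4 <= end:
        attr_type, attr_len = struct.unpack("!HH", packet[pos:pos + 4])
        if attr_type == ATTR_USERNAME:
            return packet[pos + 4:pos + 4 + attr_len].decode("utf-8", "replace")
        # Attributes are padded to a multiple of 4 bytes.
        pos += 4 + attr_len + (-attr_len % 4)
    return None


def split_ufrags(username: str):
    """Split "receiver_ufrag:sender_ufrag" into a tuple, or return None if malformed."""
    receiver, sep, sender = username.partition(":")
    return (receiver, sender) if sep else None
```

For a request sent by the client to the server, the first element is the ice-ufrag the server put in its own SDP and the second is the client's.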

With that background, you probably have a rough idea of the approach. Let's look at the implementation details:

1) In the server's SDP answer, set ice-ufrag to RoomID/userID, where RoomID and userID are assigned by the call service layer to identify each call and each participant. The USERNAME in the client's STUN Binding Request then carries both the local and the remote ice-ufrag from the SDP.

2) When the server receives a STUN Binding Request, it notes the client's source IP address and port and returns a STUN Binding Response as usual.

3) Record the mapping between client addresses and user information.

4) The server receives an RTP/RTCP media packet and can identify which user the packet belongs to by querying the mapping table based on the source IP address and port of the packet.
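The four steps above can be sketched as a small lookup table keyed by the client's source address. This is a minimal sketch under stated assumptions, not the gateway's actual code: the class and method names are hypothetical, the USERNAME is assumed to be already extracted from the STUN packet, and a real server would verify the request's MESSAGE-INTEGRITY before trusting it.

```python
from typing import Dict, NamedTuple, Optional, Tuple

Address = Tuple[str, int]  # (source IP, source port) of a UDP datagram


class User(NamedTuple):
    room_id: str
    user_id: str


class AddressTable:
    """Maps client source addresses to users on a single server port."""

    def __init__(self) -> None:
        self._by_addr: Dict[Address, User] = {}

    def on_binding_request(self, username: str, source: Address) -> Optional[User]:
        """Steps 1 to 3: parse RoomID/userID from USERNAME and record the address."""
        # USERNAME is "server_ufrag:client_ufrag"; the server ufrag is "RoomID/userID".
        server_ufrag = username.split(":", 1)[0]
        room_id, sep, user_id = server_ufrag.partition("/")
        if not sep or not room_id or not user_id:
            return None
        user = User(room_id, user_id)
        # Overwriting keeps the table current if the client's NAT mapping changes.
        self._by_addr[source] = user
        return user

    def on_media_packet(self, source: Address) -> Optional[User]:
        """Step 4: identify the sender of an RTP/RTCP packet by its source address."""
        return self._by_addr.get(source)
```

For example, after `on_binding_request("room42/alice:cli3", ("198.51.100.7", 40000))`, a media packet from `("198.51.100.7", 40000)` resolves to room `room42` and user `alice`, while packets from unknown addresses return `None` and can be dropped.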

The WebRTC client uses PeerConnection to represent different media connections, and we will show you how to select a PeerConnection scheme.

Experience live single-port and one-on-one video calls online: github.com/starrtc/and…