
Preface

The RTP/RTCP protocols are the cornerstone of streaming media communication.

  • RTP defines the packet format used to carry streaming media data over the Internet.
  • RTCP carries the control and feedback information (transmission statistics that drive flow control and congestion control) used to assure quality of service. It does not by itself make the transmission reliable.

In the WebRTC project, the RTP/RTCP module is part of the transport module:

  • On the sending end, the media data collected by the sender is encapsulated by this module and then handed to the network module for transmission.

  • On the receiving end, the RTP/RTCP module unpacks the packets delivered by the network module and passes the payload to the decoding module.

The RTP/RTCP module therefore plays a central role in WebRTC communication.

Overview of RTP/RTCP

RTP is the basic protocol for streaming media transmission on the Internet. It specifies the standard packet format for audio and video transmission over the Internet.

  • RTP only carries the real-time data and gives no delivery guarantee, while RTCP monitors the quality of streaming media transmission and supplies the feedback used for flow control and congestion control.

  • During an RTP session, participants periodically send RTCP packets to each other. These packets contain statistics about the data each participant has sent and received, which participants can use to adjust the quality of streaming media transmission dynamically.

RFC3550 defines the core of the RTP/RTCP protocols, including the packet formats and transmission rules. In addition, the IETF has defined a series of extensions, including RTP header extensions, additional RTCP packet types, and so on.
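To make the packet format concrete, here is a minimal Python sketch of the 12-byte fixed RTP header from RFC3550, followed by the payload. The function name build_rtp_packet and the example values are illustrative assumptions, not WebRTC code. Padding, header extensions, and CSRC entries are left out.

```python
import struct

RTP_VERSION = 2

def build_rtp_packet(payload_type, seq, timestamp, ssrc, payload, marker=False):
    """Serialize a minimal RTP packet: 12-byte fixed header + payload.

    Hypothetical helper: P=0, X=0, CC=0 (no padding, extension, or CSRCs).
    """
    byte0 = RTP_VERSION << 6                              # V=2, P=0, X=0, CC=0
    byte1 = (int(marker) << 7) | (payload_type & 0x7F)    # M bit + 7-bit PT
    header = struct.pack(
        "!BBHII",                                         # network byte order
        byte0,
        byte1,
        seq & 0xFFFF,                                     # 16-bit sequence number
        timestamp & 0xFFFFFFFF,                           # 32-bit media timestamp
        ssrc & 0xFFFFFFFF,                                # 32-bit source identifier
    )
    return header + payload

packet = build_rtp_packet(payload_type=96, seq=1, timestamp=90000,
                          ssrc=0x12345678, payload=b"\x01\x02\x03", marker=True)
```

The dynamic payload type 96 and the 90 kHz timestamp are typical for video but are only example values here.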

WebRTC data processing and transmission process

WebRTC provides two threads externally: Signal and Worker. The former is responsible for signaling data processing and transmission, while the latter is responsible for media data processing and transmission.

Figure: WebRTC thread relationships and data pipeline

  1. The Capture thread collects raw data from the camera to obtain a VideoFrame.

  2. The Capture thread is system specific: on Linux it may be the thread that calls the V4L2 interface, and on macOS the thread that calls the AVFoundation framework.

  3. The raw VideoFrame is then passed from the Capture thread to the Worker thread. The Worker thread acts as a porter and forwards the data to the Encoder thread without special processing.

  4. The Encoder thread calls a specific encoder (such as VP8 or H264) to encode the raw VideoFrame, and the encoded output is then packetized into RTP packets.

  5. The RTP packets are then handed to the Pacer thread for smoothed sending, and the Pacer thread pushes them to the Network thread. Finally, the Network thread calls the transport-layer system functions to send the data to the network.

  6. At the receiving end, the Network thread receives the byte stream from the network, and the Worker thread deserializes it into RTP packets and assembles them into frames in the VCM module.

  7. The Decoder thread decodes each assembled frame, and the decoded raw VideoFrame is pushed to the IncomingVideoStream thread, which passes it to the renderer for display. This completes the journey of one video frame from capture to display.
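The sender half of this pipeline can be pictured as a chain of threads connected by queues. The Python sketch below is only a toy model of that hand-off, not WebRTC's threading code. The names start_stage and run_sender_pipeline are hypothetical, and "encoding" is faked by prefixing the frame bytes.

```python
import queue
import threading

STOP = None  # sentinel that tells a stage to shut down

def start_stage(name, work, inbox, outbox):
    """Start a thread that applies `work` to each item and passes it on."""
    def run():
        while True:
            item = inbox.get()
            if item is STOP:
                outbox.put(STOP)      # propagate shutdown downstream
                return
            outbox.put(work(item))
    thread = threading.Thread(target=run, name=name)
    thread.start()
    return thread

def run_sender_pipeline(frames):
    to_worker, to_encoder, to_pacer, to_network, wire = (
        queue.Queue() for _ in range(5))
    threads = [
        # Worker: forwards frames unchanged (the "porter" role).
        start_stage("Worker", lambda frame: frame, to_worker, to_encoder),
        # Encoder: stand-in for encode + RTP packetization.
        start_stage("Encoder", lambda frame: b"RTP:" + frame, to_encoder, to_pacer),
        # Pacer: a real pacer would delay packets; here it only forwards.
        start_stage("Pacer", lambda packet: packet, to_pacer, to_network),
        # Network: stand-in for the socket send.
        start_stage("Network", lambda packet: packet, to_network, wire),
    ]
    for frame in frames:              # the caller plays the Capture thread
        to_worker.put(frame)
    to_worker.put(STOP)
    for thread in threads:
        thread.join()
    sent = []
    while (item := wire.get()) is not STOP:
        sent.append(item)
    return sent
```

Because every stage is a single thread reading from a FIFO queue, packets leave the pipeline in the order the frames entered it.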

In this process, RTP packets are created at the sender once encoding completes: the encoder output is encapsulated into RTP packets, which are then serialized and sent to the network.

  • At the receiving end, the Network thread receives the network packet, which is deserialized back into an RTP packet and then unpacked to obtain the media payload for the decoder.

  • A series of statistics are collected while RTP packets are sent and received. These statistics are the data source for constructing RTCP packets.

The following analysis focuses on RTP packet construction, send/receive statistics, and RTCP packet construction, parsing, and feedback.

Sending and receiving RTP packets

RTP packets are constructed and sent after the encoder finishes encoding and before the network layer sends the data, while receiving and unpacking happen after the network layer receives the data and before the decoder decodes it. This section analyzes both parts in detail.

RTP packet construction and sending

Constructing and sending RTP packets after encoding involves three threads on the sender: Encoder, Pacer, and Network. They are responsible for encoding and constructing RTP packets, paced sending, and transport-layer sending, respectively. The following details how these three threads work together.

Figure: RTP packet construction and sending

  • The Encoder thread calls the encoder (for example, VP8) to encode the raw VideoFrame. When encoding completes, the output EncodedImage reaches the VideoSendStream::Encoded() function via a callback and is then routed through PayloadRouter to ModuleRtpRtcpImpl::SendOutgoingData().

  • This function calls down to RtpSender::SendOutgoingData(), which in turn calls RtpSenderVideo::SendVideo().

    • This function packetizes the EncodedImage and fills in the RTP header to construct an RTP packet. If FEC is configured, FEC packets are encapsulated as well. Finally, control returns to RtpSender, where SendToNetwork() performs the next step of sending.
  • The RtpSender::SendToNetwork() function stores the packet in the RTPPacketHistory structure for caching.

  • Next, if PacedSending is enabled, a Packet is constructed and handed to PacedSender for queuing; otherwise, the packet is sent directly to the network layer.
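The last two steps (cache the packet, then either queue it for pacing or send it at once) can be sketched as follows. This is a simplified Python model under stated assumptions, not the WebRTC implementation. PacketHistory and send_to_network are hypothetical names, and the cache simply evicts the oldest packet once a fixed capacity is reached.

```python
from collections import OrderedDict, deque

class PacketHistory:
    """Keeps the most recent packets, keyed by sequence number (hypothetical)."""

    def __init__(self, capacity=600):
        self.capacity = capacity
        self._packets = OrderedDict()

    def put(self, seq, packet):
        self._packets[seq] = packet
        self._packets.move_to_end(seq)
        while len(self._packets) > self.capacity:
            self._packets.popitem(last=False)   # evict the oldest entry

    def get(self, seq):
        return self._packets.get(seq)

def send_to_network(seq, packet, history, pacer_queue, transport, paced=True):
    """Cache the packet, then queue it for the pacer or send it directly."""
    history.put(seq, packet)
    if paced:
        pacer_queue.append(seq)   # the pacer fetches the bytes from the cache later
    else:
        transport(packet)         # bypass pacing and go straight to the network
```

Only the sequence number is queued for the pacer in this sketch. That mirrors the description above, where the payload is looked up in the cache again when the pacer is ready to send it.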

The Pacer thread periodically takes a Packet from the queue and calls PacedSender::SendPacket() to send it.

The call then passes through ModuleRtpRtcpImpl to RtpSender::TimeToSendPacket().

This function first retrieves the packet payload from the RtpPacketHistory cache and then calls PrepareAndSendPacket(), which updates the relevant RtpHeader fields, records delay and packet statistics, and calls SendPacketToNetwork() to send the packet to the transport module.
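The idea behind paced sending can be illustrated with a byte budget that is refilled on every tick. The Python sketch below is a generic budget-based pacer under stated assumptions, not the algorithm WebRTC's PacedSender actually uses. SimplePacer, its methods, and the dict-like history are hypothetical.

```python
from collections import deque

class SimplePacer:
    """Sends queued packets at roughly target_bps (illustrative only)."""

    def __init__(self, target_bps, history, send):
        self.target_bps = target_bps
        self.history = history        # any object with .get(seq) -> bytes or None
        self.send = send              # callable that puts bytes on the wire
        self.queue = deque()
        self.budget_bytes = 0

    def enqueue(self, seq):
        self.queue.append(seq)

    def process(self, elapsed_ms):
        """Called periodically; sends packets while budget remains."""
        self.budget_bytes += self.target_bps * elapsed_ms // 8000
        while self.queue and self.budget_bytes > 0:
            seq = self.queue.popleft()
            packet = self.history.get(seq)
            if packet is None:        # already evicted from the cache: skip it
                continue
            self.send(packet)
            self.budget_bytes -= len(packet)
        if not self.queue:
            # Do not bank unused budget while idle, to avoid a later burst.
            self.budget_bytes = min(self.budget_bytes, 0)
```

With a target of 80 kbit/s and a 5 ms tick, each call adds 50 bytes of budget. Larger packets drive the budget negative and delay the packets that follow.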

The Network thread then invokes the transport-layer socket to transmit the data. At this point, the construction and sending of RTP packets on the sender is complete. Note that after the RTP sender sends RTP packets, it collects statistics about them. These statistics are important because they are the data source for RTCP SR/RR packets.
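As a rough illustration of what such send statistics look like, the Python sketch below keeps the two counters that an RFC3550 sender report carries (packet count and payload octet count). SendCounters and its method names are hypothetical. The real WebRTC data counters track more than this.

```python
from dataclasses import dataclass

@dataclass
class SendCounters:
    """Per-stream send statistics (hypothetical, minimal)."""
    packets: int = 0
    payload_bytes: int = 0
    last_rtp_timestamp: int = 0

    def on_packet_sent(self, payload_len, rtp_timestamp):
        self.packets += 1
        self.payload_bytes += payload_len      # payload only, no header bytes
        self.last_rtp_timestamp = rtp_timestamp

    def sender_info(self):
        # The SR counters are 32-bit fields, so they wrap around.
        return {
            "packet_count": self.packets & 0xFFFFFFFF,
            "octet_count": self.payload_bytes & 0xFFFFFFFF,
            "rtp_timestamp": self.last_rtp_timestamp & 0xFFFFFFFF,
        }
```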

Receiving and parsing RTP packets

At the receiving end, RTP packets are received and unpacked mainly in the Worker thread. Packets arrive from the Network thread and enter the Worker thread; after unpacking, the payload enters the VCM module, is decoded by the Decode thread, and is finally rendered by the Render thread.

Figure: Receiving and parsing RTP packets

RTP packets travel from the network layer to the Call object, which looks up the corresponding VideoReceiveStream by SSRC and calls its DeliverRtp() function, which in turn reaches RtpStreamReceiver::DeliverRtp().

This function first parses the packet to obtain the RTP header information and then performs three operations: 1. bit rate estimation; 2. forwarding the packet for further processing; 3. receive statistics. The bit rate estimation module uses the GCC algorithm to estimate the bit rate, constructs REMB packets, and hands them to the RtpRtcp module to be sent back to the sender.
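Parsing the header is the mirror image of building it. The Python sketch below reads the fixed header fields defined in RFC3550, skips any CSRC list and header extension, and strips padding. parse_rtp_packet is a hypothetical helper for illustration, not WebRTC's parser.

```python
import struct

def parse_rtp_packet(data):
    """Parse an RTP packet into header fields and payload (RFC3550 layout)."""
    if len(data) < 12:
        raise ValueError("shorter than the 12-byte fixed RTP header")
    byte0, byte1, seq, timestamp, ssrc = struct.unpack("!BBHII", data[:12])
    if byte0 >> 6 != 2:
        raise ValueError("unsupported RTP version")
    header_len = 12 + 4 * (byte0 & 0x0F)           # skip the CSRC list
    if byte0 & 0x10:                               # X bit: header extension
        if len(data) < header_len + 4:
            raise ValueError("truncated header extension")
        (ext_words,) = struct.unpack("!H", data[header_len + 2:header_len + 4])
        header_len += 4 + 4 * ext_words
    end = len(data)
    if byte0 & 0x20:                               # P bit: last byte = padding size
        end -= data[-1]
    if header_len > end:
        raise ValueError("truncated RTP packet")
    return {
        "marker": bool(byte1 & 0x80),
        "payload_type": byte1 & 0x7F,
        "sequence_number": seq,
        "timestamp": timestamp,
        "ssrc": ssrc,
        "payload": data[header_len:end],
    }
```

The SSRC returned here is the key that the Call object, as described above, uses to find the matching receive stream.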

Receive statistics record information about the RTP packets received and serve as the data source for RTCP RR packets. The following focuses on how the packet is passed onward for further processing.

  • RtpStreamReceiver::ReceivePacket() first checks whether the packet is an FEC packet. If it is, FecReceiver is called to unpack it; otherwise RtpReceiver::IncomingRtpPacket() is called directly.

  • This function parses the RTP packet to obtain a general structure describing the RTP header, and then calls RtpReceiverVideo::ParseRtpPacket() to extract the video-specific information and the payload, which are returned to the RtpStreamReceiver object via a callback. This object passes the RTP description and payload to the VCM module for JitterBuffer caching, decoding, and rendering.

  • Unpacking an RTP packet is the reverse of encapsulating one. Its key outputs are the description of the RTP header and the media payload, which are the basis for JitterBuffer caching and decoding. In addition, the statistics gathered on received RTP packets are the data source for RTCP RR packets.
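To show what kind of statistics feed a receiver report, the Python sketch below tracks loss and interarrival jitter using the formulas from RFC3550. It is a simplified model: ReceiveStats is a hypothetical class, sequence-number wraparound is ignored, and the loss fraction is computed over the whole session rather than per reporting interval as the RFC specifies.

```python
class ReceiveStats:
    """Minimal receive statistics for one RTP stream (illustrative only)."""

    def __init__(self):
        self.base_seq = None
        self.max_seq = None
        self.received = 0
        self.jitter = 0.0
        self._last_transit = None

    def on_packet(self, seq, rtp_timestamp, arrival_rtp_units):
        """arrival_rtp_units: arrival time expressed in RTP timestamp units."""
        if self.base_seq is None:
            self.base_seq = self.max_seq = seq
        else:
            self.max_seq = max(self.max_seq, seq)
        self.received += 1
        transit = arrival_rtp_units - rtp_timestamp
        if self._last_transit is not None:
            d = abs(transit - self._last_transit)
            self.jitter += (d - self.jitter) / 16.0   # RFC3550 jitter estimator
        self._last_transit = transit

    def cumulative_lost(self):
        expected = self.max_seq - self.base_seq + 1
        return expected - self.received

    def fraction_lost_q8(self):
        """Loss fraction as an 8-bit fixed-point value (0..255)."""
        expected = self.max_seq - self.base_seq + 1
        lost = expected - self.received
        if lost <= 0:
            return 0
        return (lost << 8) // expected
```

For example, receiving sequence numbers 1, 2, 4, and 5 gives one lost packet out of five expected, which encodes as 256 // 5 = 51 in the 8-bit fraction field.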

Sending and receiving RTCP packets

RTCP is the control protocol that accompanies RTP and is responsible for assuring the quality of service of streaming media. The most common RTCP packets are the sender report (SR) and the receiver report (RR), which carry sending statistics and receiving statistics respectively. This information is essential for quality assurance tasks such as bit rate control and load feedback. Other RTCP packet types defined in RFC3550 include SDES, BYE, and APP.
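As a concrete example of the receiving information an RR carries, the Python sketch below serializes a receiver report with a single report block in the RFC3550 layout. build_rr is a hypothetical helper and the field values are supplied by the caller. It is not how WebRTC builds its packets internally.

```python
import struct

RTCP_RR = 201  # packet type for receiver reports (RFC3550)

def build_rr(sender_ssrc, source_ssrc, fraction_lost, cumulative_lost,
             highest_seq, jitter, lsr=0, dlsr=0):
    """Serialize an RTCP receiver report containing one report block."""
    # Fraction lost (8 bits) and cumulative lost (24 bits) share one word.
    lost_word = ((fraction_lost & 0xFF) << 24) | (cumulative_lost & 0xFFFFFF)
    block = struct.pack("!IIIIII", source_ssrc, lost_word,
                        highest_seq, jitter, lsr, dlsr)
    body = struct.pack("!I", sender_ssrc) + block
    # Length field: total size in 32-bit words, minus one.
    length_words = (4 + len(body)) // 4 - 1
    header = struct.pack("!BBH", (2 << 6) | 1, RTCP_RR, length_words)  # V=2, RC=1
    return header + body

rr = build_rr(sender_ssrc=1, source_ssrc=2, fraction_lost=51,
              cumulative_lost=1, highest_seq=5, jitter=19)
```

The lsr and dlsr fields (last SR timestamp and delay since last SR) default to zero here, which is what RFC3550 prescribes when no SR has been received yet.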

This section focuses on how WebRTC constructs, sends, receives, and parses RTCP packets and acts on their feedback. Note that the data source for RTCP packets is the statistics gathered while sending and receiving RTP packets. Within WebRTC, RTCP packets are sent in three ways: periodically, opportunistically, and immediately. The ModuleProcess thread sends RTCP packets periodically. The RtpSender checks whether an RTCP packet is due before it sends an RTP packet. In addition, after the receive-side bit rate estimation module constructs a REMB packet, it can make the ModuleProcess thread send the RTCP packet immediately by setting the timeout.
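The interplay between periodic and immediate sending can be modeled with a single "next send time" deadline. The Python sketch below is an assumption about how such a scheduler could look, not WebRTC's code. RtcpScheduler and its methods are hypothetical, and the randomized interval that RFC3550 recommends is omitted for clarity.

```python
class RtcpScheduler:
    """Decides when the next RTCP report is due (illustrative only)."""

    def __init__(self, interval_ms=1000, now_ms=0):
        self.interval_ms = interval_ms
        self.next_send_ms = now_ms + interval_ms

    def time_to_send(self, now_ms):
        """Polled by the periodic thread, and before each RTP packet is sent."""
        return now_ms >= self.next_send_ms

    def on_sent(self, now_ms):
        self.next_send_ms = now_ms + self.interval_ms

    def request_immediate(self, now_ms):
        """Pull the deadline forward, e.g. after a REMB packet was built."""
        self.next_send_ms = min(self.next_send_ms, now_ms)
```

All three triggers described above end up asking the same question (time_to_send). An immediate send is just a deadline moved to the present.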

RTCP packet construction and sending

On the sending end, RTCP packets are mainly sent periodically, supplemented by opportunistic sending alongside RTP packets and immediate sending of REMB packets. The sending process consists of obtaining the feedback information, constructing the RTCP packet, serializing it, and sending it. The following figure shows the process of constructing and sending RTCP packets.

Figure: RTCP packet construction and sending

The ModuleProcess thread periodically calls the ModuleRtpRtcpImpl::Process() function, which uses RTCPSender::TimeToSendRtcpReport() to determine whether an RTCP packet needs to be sent now. If so, it first obtains the RTP sending statistics from RTPSender::GetDataCounters(), then calls RTCPSender::SendRTCP(), which in turn calls SendCompoundRTCP() to send a compound RTCP packet.

In the SendCompoundRTCP() function, PrepareReport() first determines which types of RTCP packets will be sent. Then, for each type, the corresponding build function is called (for example, BuildSR() for an SR packet). The constructed packets are stored in a PacketContainer, and SendPackets() is finally called to send them.
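The build-then-combine step can be sketched at the wire level. The Python example below serializes a sender report and an SDES packet with a CNAME item in the RFC3550 layout and concatenates them into a compound packet. build_sr and build_sdes_cname are hypothetical helpers and the field values are made up. WebRTC's own BuildSR() works on its internal packet classes.

```python
import struct

RTCP_SR, RTCP_SDES = 200, 202     # packet types from RFC3550
SDES_CNAME = 1                    # SDES item type for the canonical name

def build_sr(ssrc, ntp_seconds, ntp_fraction, rtp_timestamp,
             packet_count, octet_count):
    """Serialize a sender report without report blocks (28 bytes)."""
    body = struct.pack("!IIIIII", ssrc, ntp_seconds, ntp_fraction,
                       rtp_timestamp, packet_count, octet_count)
    length_words = (4 + len(body)) // 4 - 1       # size in words, minus one
    return struct.pack("!BBH", 2 << 6, RTCP_SR, length_words) + body

def build_sdes_cname(ssrc, cname):
    """Serialize an SDES packet with one chunk holding a CNAME item."""
    text = cname.encode("utf-8")
    if len(text) > 255:
        raise ValueError("CNAME longer than 255 bytes")
    chunk = struct.pack("!IBB", ssrc, SDES_CNAME, len(text)) + text
    chunk += b"\x00"                              # end-of-items marker
    chunk += b"\x00" * (-len(chunk) % 4)          # pad to a 32-bit boundary
    header = struct.pack("!BBH", (2 << 6) | 1, RTCP_SDES, len(chunk) // 4)
    return header + chunk

sr = build_sr(ssrc=0x11, ntp_seconds=1, ntp_fraction=0, rtp_timestamp=90000,
              packet_count=2, octet_count=150)
compound = sr + build_sdes_cname(0x11, "a@b")
```

RFC3550 requires a compound packet to start with an SR or RR and to include an SDES CNAME, which is why these two are paired here.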

Each RTCP packet then calls its own serialization function to serialize itself into a network byte stream. Finally, through the PacketContainer::OnPacketReady() callback, the byte stream is handed to the transport-layer module: it passes through TransportAdapter to BaseChannel, and the Network thread calls the transport-layer socket API to send the data to the network.

Constructing and sending RTCP packets is not very complicated. The core operations are obtaining the data source, constructing the packets, serializing them, and sending them. Of these, packet construction and serialization are the most tedious, because they must follow the RFC definitions in detail.

Receiving and parsing RTCP packets
  • On the receiving end, RTCP packets follow the same path as RTP packets. After arriving from the network, they reach the Call object, which finds the VideoReceiveStream by SSRC, and then reach the RtpStreamReceiver. Parsing of the RTCP packets and the resulting feedback are handled in the ModuleRtpRtcpImpl::IncomingRtcpPacket() function.

  • This function first calls RTCPReceiver::IncomingRtcpPacket() to parse the RTCP packet and obtain an RTCPPacketInformation object, and then calls TriggerCallbacksFromRTCPPacket() to trigger the callbacks of the observers registered here.

  • RTCPReceiver::IncomingRtcpPacket() uses RTCPParser to parse the compound packet. For each packet type it calls the corresponding handler (for example, HandleSDES for SDES) to deserialize the packet into a descriptive structure. Finally, all the packets are combined into the RTCPPacketInformation object. This object is then passed as a parameter to TriggerCallbacksFromRTCPPacket(), which triggers the callbacks, for example those for NACK, SLI, and REMB. In their respective modules, these callbacks control encoding, sending, bit rate, and other aspects of quality-of-service assurance for the media stream, which is where RTCP packets ultimately take effect.
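The parse-and-dispatch pattern described above can be sketched in a few lines of Python. The example walks a compound RTCP packet using the length field of each common header (RFC3550) and calls a handler per packet type. iter_rtcp_packets, dispatch_rtcp, and the handler signature are hypothetical. WebRTC's RTCPParser and observer callbacks are considerably richer.

```python
import struct

def iter_rtcp_packets(data):
    """Yield (packet_type, count, body) for each packet in a compound RTCP."""
    offset = 0
    while offset < len(data):
        if offset + 4 > len(data):
            raise ValueError("truncated RTCP header")
        byte0, packet_type, length_words = struct.unpack(
            "!BBH", data[offset:offset + 4])
        if byte0 >> 6 != 2:
            raise ValueError("unsupported RTCP version")
        size = 4 * (length_words + 1)             # length is in words, minus one
        if offset + size > len(data):
            raise ValueError("truncated RTCP packet")
        yield packet_type, byte0 & 0x1F, data[offset + 4:offset + size]
        offset += size

def dispatch_rtcp(data, handlers):
    """Call handlers[packet_type](count, body); unknown types are skipped."""
    handled = 0
    for packet_type, count, body in iter_rtcp_packets(data):
        handler = handlers.get(packet_type)
        if handler is not None:
            handler(count, body)
            handled += 1
    return handled
```

A caller would register one handler per type it cares about, for example {201: on_receiver_report, 203: on_bye}, where both handler names are placeholders for application code.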

This completes the analysis of the full process of sending and receiving RTCP packets.

Summary

Based on an in-depth analysis of the WebRTC source code, this article describes how the RTP/RTCP module is implemented and examines key issues in detail (such as the data source of RTCP packets). It lays a good foundation for further mastering the implementation principles and details of WebRTC.