RTP and RTCP, the core protocols for carrying real-time multimedia over IP networks, were defined in RFC 1889 in January 1996. In this article, Ron Frederick, one of the authors of the RTP protocol, tells us how this important protocol was born.

01

How it all began

In October 1992, I started experimenting with the Sun VideoPix frame capture card because I wanted to write a network-based video conferencing tool on top of IP multicast. Modeled on VAT (an audio conferencing tool developed at LBL), the tool used a similar lightweight session protocol for joining conferences: you simply sent data to a particular multicast group and watched that group for traffic from other participants.
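To make that "lightweight session" idea concrete, here is a minimal sketch in Python of joining a multicast group and watching it for traffic. This is not NV's actual code; the group address, port, and TTL are invented for illustration.

```python
import socket
import struct

GROUP, PORT = "224.2.0.1", 5004   # hypothetical session address, not NV's

# Receive side: join the multicast group and watch for traffic.
rx = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
rx.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
rx.bind(("", PORT))
mreq = struct.pack("4s4s", socket.inet_aton(GROUP), socket.inet_aton("0.0.0.0"))
rx.setsockopt(socket.IPPROTO_IP, socket.IP_ADD_MEMBERSHIP, mreq)

# Send side: every participant transmits to the same group address;
# there is no central server to register with.
tx = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
tx.setsockopt(socket.IPPROTO_IP, socket.IP_MULTICAST_TTL, 16)  # limit reach
tx.sendto(b"hello, conference", (GROUP, PORT))

data, sender = rx.recvfrom(2048)  # traffic from any group member shows up here
print(sender, data)
```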

For the program to be usable, the video data had to be compressed before it was sent. My goal was to get the data stream down to about 128 kbps, the bandwidth of a home ISDN line. I also wanted the video to remain watchable at half that bandwidth, which meant achieving roughly a 20:1 compression factor at the image size and frame rate I was working with. I eventually achieved this goal and patented the technique: US5485212A (software video compression for teleconferencing).
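For a rough sense of the arithmetic (the resolution, bit depth, and frame rate below are assumptions chosen for illustration, not NV's documented settings):

```python
# Illustrative numbers only; NV's actual capture settings aren't given here.
width, height, bits_per_pixel, fps = 320, 240, 8, 4

raw_bps = width * height * bits_per_pixel * fps   # 2,457,600 bps uncompressed
target_bps = 128_000                              # a home ISDN line
print(raw_bps / target_bps)                       # 19.2 -> roughly 20:1
```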

At the beginning of November 1992, I released the video conferencing tool NV (Network Video) on the Internet. After some preliminary testing, it was used to broadcast parts of the November IETF meeting live to the world; roughly 200 subnets in 15 countries could receive the broadcast, and about 50 to 100 people watched the video via NV at some point during that week.

Over the next few months, three more workshops and a number of smaller meetings were streamed live to the Internet via NV, including the NetWorkshop in Australia, the MCNC Packet Audio and Video Workshop, and the MultiG Workshop on Distributed Virtual Reality in Sweden.

The NV source code was subsequently released in February 1993, and in March I released a version of NV with a new wavelet-based compression scheme. In May, I added color video support to the tool.

The network protocol used by NV and the other Internet conferencing tools became the basis of RTP, first standardized through the IETF in RFCs 1889-1890 and later revised in RFCs 3550-3551, along with other RFCs that define profiles for carrying audio and video in specific formats.

Over the next few years, I continued working on NV, porting it to many other hardware platforms and video capture devices. NV became one of the main tools for broadcasting conferences live on the Internet and was even chosen by NASA to broadcast live coverage of space missions.

In 1994, NV began to support video compression algorithms developed by others, including hardware compression schemes such as the CellB format supported by the SunVideo capture card. It also gained support for the CU-SeeMe video format, which allowed NV to send video to users running CU-SeeMe on Macs and PCs.

The 3.3beta release of July 1994 turned out to be the last release of NV. I was preparing a 4.0alpha release that would have migrated NV to version 2 of the RTP protocol, but I moved on to other projects, so the work was never finished.

The framework provided by NV became the basis for video conferencing in the Jupiter multimedia MOO project at Xerox PARC, which eventually spun off as PlaceWare, a company later acquired by Microsoft. NV was also used as the basis for a number of hardware video conferencing projects that could send full broadcast-quality NTSC video over high-bandwidth Ethernet and ATM networks.

02

Why write your own video compression?

When I started working on NV, the only video conferencing systems I knew of were built on expensive specialized hardware. For example, Steve Casner was using a system from BBN called DVC at the time: compression required dedicated hardware, although decompression could be done in software. What made NV unique was that both compression and decompression were done in software; the only hardware requirement was a card to digitize the incoming analog video signal.

Many of the basic concepts behind video compression already existed at the time. The MPEG-1 standard appeared around the same time as NV, but real-time MPEG-1 encoding in software was out of reach. The change I made was to take the most basic of those concepts and implement them with much cheaper operations: I avoided cosine transforms and floating point, and even integer multiplication, because multiplies were slow on a SPARCstation. I managed to produce something that was fast and still looked like video using only addition and subtraction, bit masking, and shifts.
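To give a flavor of that constraint (this is not NV's actual codec, just an illustration of the style of arithmetic), a Haar-like transform can be computed and exactly inverted with nothing but additions, subtractions, and shifts:

```python
def haar_pair(a: int, b: int) -> tuple[int, int]:
    """Split two pixels into a coarse average and a detail term.

    Only addition, subtraction, and a shift are used: no multiplies,
    no floating point. The pair is exactly recoverable (see below).
    """
    avg = (a + b) >> 1     # low-frequency component
    diff = a - b           # high-frequency component (often near zero)
    return avg, diff

def haar_unpair(avg: int, diff: int) -> tuple[int, int]:
    """Invert haar_pair, again using only adds, subtracts, and shifts."""
    a = avg + ((diff + 1) >> 1)   # avg + ceil(diff / 2)
    b = a - diff
    return a, b

# Smooth image regions yield small diffs, which are cheap to encode.
row = [100, 102, 98, 99, 200, 201, 60, 61]
coeffs = [haar_pair(row[i], row[i + 1]) for i in range(0, len(row), 2)]
print(coeffs)  # [(101, -2), (98, -1), (200, -1), (60, -1)]
```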

In the year or two after NV launched, a number of other audio and video tools appeared on other platforms, such as CU-SeeMe, which ran on the Mac. Clearly, the age of streaming media was arriving. Eventually I made NV interoperate with some of these tools, and some of them adopted NV's codecs, so we could exchange video while still using my compression scheme.

03

The drafting of RTP

Steve Deering, creator of IP multicast and lead designer of IPv6

All of us were working on IP multicast at the time and helped create the MBone (Multicast Backbone); Steve Deering (who first developed IP multicast), Van Jacobson, and Steve Casner drove its development together. Steve Deering and I had both gone to Stanford; after graduation he went to Xerox PARC, where I spent a summer as an intern working on the IP multicast project, then continued to work in my spare time and eventually became a full-time employee. Van Jacobson, Steve Casner, Henning Schulzrinne, and I were the four original authors of the RTP protocol. The MBone we were building made all kinds of online collaboration possible, so we wanted to design a network protocol that all of the tools could share, and RTP was born!

04

The fun of RTP

One of the fun things I did was develop an IP multicast version of the classic game Spacewar. Without any central server, multiple clients could each run Spacewar independently, broadcasting the position, velocity, and heading of their ship and of any bullets it fired. Every other client picked up that information and rendered it locally, so players could see each other's ships and bullets, which exploded when they collided or were hit. I even made the debris of an explosion live on, able to destroy other ships, so sometimes you would see a whole chain of explosions!
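A serverless design like that amounts to each client periodically multicasting a small state datagram that everyone else can decode. As a hypothetical sketch (none of these field names or formats come from the original game), the ship state might be packed like this:

```python
import struct

# Hypothetical wire format: ship id, x/y position, x/y velocity, heading.
# Fixed-size big-endian fields so every client decodes it identically.
SHIP_FMT = "!I5f"

def pack_ship(ship_id: int, x: float, y: float,
              vx: float, vy: float, heading: float) -> bytes:
    return struct.pack(SHIP_FMT, ship_id, x, y, vx, vy, heading)

def unpack_ship(datagram: bytes) -> tuple:
    return struct.unpack(SHIP_FMT, datagram)

# Each client would sendto() this on the game's multicast group and
# render every ship it hears about; no server holds authoritative state.
msg = pack_ship(42, 100.0, 250.0, -1.5, 0.75, 3.14)
print(unpack_ship(msg))
```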

In a nod to the original game, I rendered everything with simulated vector graphics, so you could zoom in and out on everything in the game. The ships themselves were collections of vector line segments that I invited my colleagues at Xerox PARC to help design, so each ship had its own distinctive look.

In general, RTP can carry any real-time data stream that does not require perfectly ordered delivery. So beyond audio and video, we could build things like shared whiteboards, and even transfer files over RTP, especially in combination with IP multicast.

Think of a scenario like BitTorrent, but where the data does not have to travel peer to peer. The original sender can multicast a stream to all receivers at once, and if a packet is lost along the way, any party that received it successfully can retransmit it. You can even scope the retransmission request so that a nearby receiver answers, multicasting its copy to the other receivers in the same region, because packet loss inside the network usually means a whole cluster of downstream clients missed the same data.
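In today's terms this is essentially NACK-based repair over multicast. Below is a minimal sketch of the receiver-side logic under my own assumptions: the repair channel, message formats, and TTL-based scoping are illustrative, not a protocol described in the article.

```python
import socket
import struct

REPAIR_GROUP, REPAIR_PORT = "224.2.0.2", 5006  # hypothetical repair channel

def send_scoped(payload: bytes, ttl: int) -> None:
    """Multicast a packet with a limited TTL so only nearby hosts see it."""
    s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    s.setsockopt(socket.IPPROTO_IP, socket.IP_MULTICAST_TTL, ttl)
    s.sendto(payload, (REPAIR_GROUP, REPAIR_PORT))
    s.close()

def on_data(seq: int, next_expected: int) -> int:
    """On a sequence-number gap, NACK the missing packets with a small scope."""
    for missing in range(next_expected, seq):
        send_scoped(struct.pack("!cI", b"N", missing), ttl=4)  # ask nearby hosts
    return seq + 1  # new next_expected

def on_nack(missing_seq: int, cache: dict[int, bytes]) -> None:
    """Any receiver still holding the packet re-multicasts it locally."""
    if missing_seq in cache:
        send_scoped(struct.pack("!cI", b"D", missing_seq) + cache[missing_seq],
                    ttl=4)
```

A real scheme would widen the TTL if no repair arrives, and suppress duplicate NACKs when a neighbor has already asked; both are omitted here for brevity.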

05

Regrets about RTP

I don’t have any real regrets about RTP. But I know that one of the biggest complaints about RTP is the complexity of implementing RTCP, the control protocol that runs alongside the main RTP data traffic. I think this complexity is a large part of why RTP was not more widely adopted, especially in unicast situations where many of RTCP's features are not needed. As network bandwidth became less scarce and congestion easier to avoid, many people simply sent audio and video over TCP (and later HTTP), and since that solution was "good enough," RTP lost ground.

Unfortunately, using TCP or HTTP means that an audio or video application has to send the same data separately to every receiver, which is inefficient from a bandwidth perspective. I sometimes wonder what would have happened if we had taken IP multicast beyond the research world. If we had, the transition from cable and broadcast TV to Internet-based audio and video might have happened much sooner.

About the four authors of RTP:

Ron Frederick: American computer scientist and one of the authors of the RTP protocol. GitHub: https://github.com/ronf/.

Van Jacobson: American computer scientist and one of the major contributors to TCP/IP, known for his pioneering work on TCP/IP network performance and scaling.

Steve Casner: American computer scientist and one of the authors of the RTP protocol. He won the 2020 IEEE Internet Award for his contributions to Internet multimedia protocol standards.

Henning Schulzrinne: Professor of Computer Science at Columbia University and former CTO of the US Federal Communications Commission (FCC). He co-developed protocol standards including RTP, RTSP, and SIP.

Translation: Alex

Technical review: Lianxiang Liu

Original link:

https://webrtcforthecurious.c…
