Content editor: Alex

Technical examiner: Liu Lianxiang, Xu Yaowu

In Conversation with Justin Uberti #003

WebRTC became an official standard in January 2021, a major milestone for the open-source project, which was released by Google in 2011.

WebRTC aims to enable real-time media communication, such as voice, video and data transfer, between different browsers and various devices. WebRTC’s role in connecting the world has become increasingly important during the COVID-19 pandemic.

Justin Uberti (photo courtesy of Justin Uberti)

To help you get a better understanding of WebRTC and its backstory, we emailed Justin Uberti, one of its creators (Justin was formerly head of engineering at Google’s WebRTC team and is now head of streaming technology at Clubhouse). In this interview, Justin talks about how he joined the WebRTC project, the challenges he and his team faced, the inspiration he got from the project and his hopes for WebRTC’s future.

This year, Justin left Google after 15 years to join the audio-based social platform Clubhouse. He spoke to us about why he left, revealing that he has settled fully into his new job and is enjoying what he's doing.

In addition to WebRTC and job changes, Justin also shared his love of math and computing, and had a message for his younger self: Learn to appreciate the cumulative value of work.

The following is a conversation between LiveVideoStack and Justin Uberti.

Early life

LiveVideoStack: Hello Justin! Thanks for joining us. Let’s start at the beginning. What were your dreams when you were a child? Did you ever imagine becoming a software engineer back then? What kind of childhood did you have?

Justin Uberti: My parents were professors at a local college, and while they weren’t computer savvy, their encouragement was a big part of my interest in technology.

When I was about 6 years old, my father bought a Texas Instruments home computer for the family. That’s when my strong interest in computers began. A few years later, I started playing the first games Electronic Arts released; they were fun and deepened my interest in computers. In the early days of EA, they went to great lengths to make their developers “rock stars,” which, well, seemed like my dream. I still have a “career goals” form I filled out when I was 9 or 10. I wrote: “I want to be a software developer.”

LiveVideoStack: Did math play an important role in your life when you were a kid? What about computers?

Justin Uberti: As interested as I was in computers, they were just a hobby at the time. There were plenty of math-related organizations and activities, like math clubs and math camps, but I focused on my schoolwork and on math competitions like Mathcounts and the AIME. Along the way, I came to the conclusion that math was going to be my long-term career track.

LiveVideoStack: You studied mathematical physics as an undergraduate at the University of Virginia, but became a software engineer after graduation. What prompted you to join a technology company? Did you ever consider graduate school or a PhD?

Justin Uberti: In my senior year at the University of Virginia (1995), I went to see the then head of the physics department to learn about potential opportunities after graduation. He pulled up a web browser and showed it to me. It was the first time I had ever seen the Web, and I was hooked. I spent a lot of time studying how the web worked and started teaching myself C++ so that I could actually use this emerging technology. At the same time, I began to realize that there were more opportunities in computing than in mathematics.

Still, I was accepted into NYU’s PhD program in mathematics. But as my interest shifted to emerging technologies and my personal life changed (my father died around the time I graduated from college), I soon decided to drop out of NYU and enter the tech world I had always wanted to be part of.

Justin Uberti (photo courtesy of Justin Uberti)

LiveVideoStack: Looking back now, is there anything you’d say to yourself as a college student?

Justin Uberti: I’ve always been reflective, so I have a lot to tell my younger self. Top of the list: learn to appreciate the value of work accumulated over time. I used to watch great people in this field and wonder what I could do to achieve the same level of success. But the truth is, the people I looked up to simply set a goal for themselves and worked toward it every day. Over time, that accumulation of daily hard work can lead to extraordinary results.

Also, I would tell myself to learn how to type properly, which I learned long after I started working!

WebRTC

LiveVideoStack: How did you get started with WebRTC development? What is your role on the WebRTC team?

Justin Uberti: I’ve been interested in voice and video communications for years, starting with the launch of AIM video chat in 2004 (editor’s note: Justin worked at AOL as a chief architect from 1997 to 2006) and continuing through leading the development of video technology for Gmail video calling and Google Hangouts. I’m well aware of how complex this technology is and how difficult it can be to develop, so the prospect of creating an open platform that would let more applications use video technology was very attractive to me.

At Google, I did some preliminary research into how WebRTC would appear on browsers, based on my previous work developing browser-based RTC applications such as Google Hangouts. Eventually I became the engineering lead and manager of the WebRTC team and began working closely with Serge Lachapelle, the product manager. During the development of WebRTC, I did a lot of work, including writing standard documentation, coordinating with developers, developing sample code, and creating WebRTC stack components.

LiveVideoStack: What was the hardest part of the whole development process? How did you solve it?

Justin Uberti: There were two things that were really challenging for me:

First, the size and complexity of the overall WebRTC project. WebRTC itself has close to 1,000 APIs and over a million lines of code, and it implements more than 100 IETF RFCs. We knew from the start that this was going to be a huge task, but no one thought it would take nearly 10 years to fully implement.

Second, in an open, consensus-based work environment, all parties have their own interests and motivations, which makes the project more complex. I was happy with the results, but I was inexperienced at the beginning and had to struggle to figure out how to solve some important issues, such as which video codecs WebRTC should use. Our team’s Harald Alvestrand provided critical support in this process.

LiveVideoStack: What does WebRTC mean to you? Do you have any regrets about it?

Justin Uberti: I think it means a great deal to everyone who has been involved with WebRTC since the beginning. We truly fulfilled our goal of building a high-performance, flexible, open, and secure RTC platform! Then, during the COVID-19 pandemic, WebRTC became a vital way to connect people and really took off. Online communication has become the new way of working in many industries, and with plenty of new startups doing interesting things with RTC, I’m looking forward to the future. Everyone involved in the WebRTC project is proud of it!

As for regrets, I won’t say much. During the development of WebRTC, there was certainly a lot of work that had to be redone because of bad initial decisions, which was hard on early adopters of WebRTC. And some parts of the platform are still too complicated. But every project has these problems.

LiveVideoStack: What features would you like to see in the next release of WebRTC?

Justin Uberti: Now that I’m no longer at Google, my view of WebRTC’s future isn’t what it used to be. But I know there has been considerable development activity around Insertable Streams, the new Capture Handle proposal, and the new data channel implementation. I’m also looking forward to seeing the implementation of Cryptex, a new RTP metadata security mechanism.
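(Editor’s illustration, not from the interview.) The core idea of Insertable Streams is that the browser exposes each encoded audio/video frame as a chunk in a stream, which the application can pipe through its own `TransformStream` — for example, to layer end-to-end encryption on top of WebRTC. The sketch below demonstrates only the transform itself on raw byte arrays, with a toy XOR standing in for real cryptography; the browser-side wiring is noted in comments.

```javascript
// In a browser, an API such as RTCRtpSender.createEncodedStreams() (Chrome's
// early Insertable Streams API) exposes encoded frames as a readable stream
// that can be piped through an app-supplied TransformStream before the
// frames reach the network. Here we run the TransformStream on plain
// Uint8Array chunks so the sketch works anywhere with Web Streams support.

const KEY = 0x5a; // toy key; a real app would use proper cryptography

function makeXorTransform() {
  return new TransformStream({
    transform(chunk, controller) {
      // With real Insertable Streams, the payload would live in frame.data.
      const out = new Uint8Array(chunk.length);
      for (let i = 0; i < chunk.length; i++) out[i] = chunk[i] ^ KEY;
      controller.enqueue(out);
    },
  });
}

async function demo() {
  const { readable, writable } = makeXorTransform();
  const writer = writable.getWriter();
  writer.write(new Uint8Array([1, 2, 3]));
  writer.close();
  const { value } = await readable.getReader().read();
  return Array.from(value); // → [91, 88, 89]
}
```

Since XOR is its own inverse, applying the same transform on the receiving side restores the original bytes — the basic shape of an end-to-end encryption layer over WebRTC.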

LiveVideoStack: Some people say that QUIC is the future of WebRTC, what do you think? What challenges will WebRTC face when transferring data using QUIC?

Justin Uberti: I think QUIC will have a big impact on the future of WebRTC, because it’s still hard to build WebRTC servers. HTTP-based applications already have great infrastructure, but a WebRTC server has to be created from scratch. Being able to carry WebRTC traffic over QUIC would make it much easier to set up cloud WebRTC endpoints and transfer data to them. It would also be nice not to have to keep developing a completely separate transport protocol (and support library) just for WebRTC.

However, carrying WebRTC over QUIC is tricky in the details. How do the different congestion control algorithms in WebRTC and QUIC work together? Should we carry RTP inside QUIC, or map it onto native QUIC concepts? Should we focus on client-to-server, where QUIC is primarily used today, or also consider P2P? There’s a lot to figure out.

LiveVideoStack: WebTransport, WebCodecs, and WebAssembly already make it possible to implement an RTC engine, and Zoom has already done so. How do you see these new technologies competing with WebRTC? Will WebTransport replace WebRTC data channels in the short term?

Justin Uberti: In my opinion, these technologies are very important for people who want to deliver a more controlled experience. With WASM, you can take most of a custom RTC stack built for native applications and deploy that code to the browser, or bring in your own custom codecs. But these stacks will still be built on top of WebRTC, so I see WASM more as an extension of WebRTC than a competitor.

WebTransport is an interesting protocol: a simpler network stack for client-server applications that need unreliable transport (and don’t need WebRTC’s other mechanisms). I’ve been very supportive of this work, and I think Victor Vasiliev at Google has been doing a great job!
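(Editor’s illustration, not from the interview; the server URL below is made up.) WebTransport datagrams are unreliable and unordered, so an application typically layers its own framing on top, much as RTP does — for example, prefixing each payload with a sequence number so the receiver can detect loss and reordering. The helpers below are runnable anywhere; the WebTransport session itself is browser-only and shown only in comments.

```javascript
// Minimal application-level framing for unreliable datagrams: a 32-bit
// big-endian sequence number followed by the payload bytes.

function packDatagram(seq, payload) {
  const buf = new Uint8Array(4 + payload.length);
  new DataView(buf.buffer).setUint32(0, seq); // big-endian sequence number
  buf.set(payload, 4);
  return buf;
}

function unpackDatagram(buf) {
  const view = new DataView(buf.buffer, buf.byteOffset, buf.byteLength);
  return { seq: view.getUint32(0), payload: buf.subarray(4) };
}

// Browser-side usage sketch (WebTransport is not available in Node):
//
//   const wt = new WebTransport("https://media.example.com:4433/session");
//   await wt.ready;
//   const writer = wt.datagrams.writable.getWriter();
//   await writer.write(packDatagram(1, encodedAudioChunk));
```

This is essentially the trade-off Justin describes: WebTransport gives you the raw unreliable pipe, and the application supplies whatever framing, ordering, and recovery logic it needs.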

LiveVideoStack: This year, Google released two AI audio codecs, Lyra and SoundStream. In your view, is it possible for these AI codecs to make their way into WebRTC?

Justin Uberti: I certainly hope so, but as I noted above, I think decoupling codecs from WebRTC would be great for innovation on both sides.

Overall, I firmly believe that artificial intelligence is the future of compression technology. The key question is: when will AI codecs replace today’s core codecs? I’ve talked to a lot of codec experts about this, and most of them agree that we need a new generation of AI codecs.

From Google to Clubhouse

LiveVideoStack: In your 15 years at Google, you’ve met a lot of amazing people. Which one has inspired you the most? What’s the most important thing you’ve learned from them?

Justin Uberti: There are two big influences on me. One was Linus Upson, who was the head of Chrome at the time and who approved the WebRTC project. He helped me figure out how to move the project forward and encouraged my team members to overcome seemingly impossible technical problems.

Eric Rescorla was the second big influence on me. I worked closely with him during the WebRTC project, when we were the engineering leads for Chrome and Firefox, respectively. Eric was intensely focused on making WebRTC’s security robust and did everything he could to make that happen, including submitting several important patches to our team’s (Chrome’s) code base. That full-stack attitude of doing whatever the work required, regardless of personal gain or loss, deeply impressed me.

LiveVideoStack: Of all the work Google has done, what are you most satisfied with?

Justin Uberti: I’ve always had fun solving technical problems and delivering real value to users. WebRTC is definitely my favorite, but I also want to mention bringing Stadia gameplay to iOS via a web app. It was a bit of a risk at the time, but it worked out, so people can play Cyberpunk 2077 (an action role-playing game from CD Projekt RED) on the iPad.

LiveVideoStack: Why did you leave Google to join Clubhouse? Are you particularly interested in audio? Are you fully adapted to your new job now?

Justin Uberti: I joined Google when it was small, and I really enjoyed the casual atmosphere. So I was looking for a similarly fast-moving environment, and a new application area to explore. Clubhouse and social audio fit that perfectly, and I really like their team.

I’ve spent a lot of time working on both audio and video, but at Google we focused on video because so much was happening in that space: HD, multi-user calls, screen sharing, VP8, VP9, AV1, and so on. At the time, I felt audio was actually being neglected, and that there were a lot of interesting new directions to explore in audio, like the spatial audio experience we launched at Clubhouse. And yes, I’d say I have completely settled in at Clubhouse and am ready to devote myself to more interesting things.

LiveVideoStack: What do you see as the main difference between Clubhouse and Google?

Justin Uberti: To my surprise, almost all of the internal tools I used at Google, from search to bug tracking to recruiting tools, are available externally as SaaS products. Because of economies of scale, these third-party tools are often better than Google’s internal equivalents.

Most of these differences are typical of startups and large companies, and I won’t go into them here. But it feels good to be able to focus on a single product and task. I think it also helps people feel more fulfilled.

LiveVideoStack: As head of streaming technology at Clubhouse, what qualities do you look for when hiring? Which is more important: educational background, experience or the ability to learn quickly?

Justin Uberti: In many cases, these qualities are highly correlated: a quick learner usually has a strong educational background and prior work experience. But a strong ability to learn is what I value most; the new generation of talent will be the leadership force of the future. We hired a couple of great people during the WebRTC project who had no prior experience in the area at the time.

So a strong ability to learn is the most important.

LiveVideoStack: Clubhouse recently announced the launch of spatial audio. What have you seen in users’ feedback on the spatial audio experience? What innovations are coming next at Clubhouse?

Justin Uberti: It seems people really like the spatial audio experience. Here’s some feedback from Twitter users:

The spatial audio on @clubhouse is implemented so well that it takes a little getting used to. On my walk this evening, I turned around three times to see whose voice it was, only to realize it was coming from the app.

This is me listening to @ClubHouse over Bluetooth in my car today. This new spatial audio feature is awesome!

I can’t give too many details about what’s next, but I will say that the innovation happening at Clubhouse is full of potential and very exciting.

Present and future

LiveVideoStack: What do you think are the biggest limitations of real-time communication technology today? Is there a solution?

Justin Uberti: We’re still in the 1.0 era of WebRTC and online conferencing. That it now works reliably at all is a huge achievement in itself. But in some ways it’s still unnatural, and it still falls short of talking face to face.

One pain point that I think everyone has encountered is the mute button. The button is easy to understand, but if someone forgets to mute, background noise can disrupt the call. This problem doesn’t occur in face-to-face conversations.

I believe all of these issues will be resolved; it just takes time. The technology may also need to develop a deeper understanding of the media being shared.

LiveVideoStack: With Facebook betting big on the metaverse, what do you see as the core technologies for metaverse experiences in the next 5 to 10 years?

Justin Uberti: Maybe I’m jumping the gun a little here, but it seems to me that most interactions in the metaverse will be real-time interactions. After all, the real world is a continuous stream of real-time interactions. So a lot of what we’ve been working on over the last few years (WebRTC, QUIC, spatial audio, video and point clouds, low latency, media processing and understanding, etc.) is very relevant to the metaverse.

LiveVideoStack: What technologies are you interested in right now?

Justin Uberti: Well, you’ve already mentioned a lot of them! At Clubhouse, my work is mainly in spatial audio and multichannel audio, as well as real-time speech recognition and transcription. And I’ve been thinking for years about how WebRTC and QUIC could be combined.

I’m also very interested in AI, which helps me better understand AI-generated media, as well as in new networking and encryption technologies.

LiveVideoStack: Last big question: if you had the chance to talk to any computer scientist or mathematician, who would you choose, and what would you talk about?

Justin Uberti: Very interesting question! If I could pick anyone, it would be Claude Shannon. His work as a mathematician and engineer underpins almost the entire Internet today. I would ask him: What innovations did he envision that the technology of his time couldn’t make possible? And what second-order effects might today’s technology have?

Special thanks to Xu Yaowu for his help and support with this interview.

Read more: How WebRTC was born


Call for Speakers

LiveVideoStackCon 2022 Shanghai is now open for speaker applications. Whatever the size of your company, your job title, or your level of experience, as long as your content is helpful to technical people, everything else is secondary. You are welcome to submit your personal information and topic description to [email protected], and we will respond within 24 hours.