By Phil Cluff

Translated by Yue Mei Wang

Original link: mux.com/blog/thursd…

The article is reprinted from the LiveVideoStack public account




Having grown up in England, I'll openly admit that American football is not my first sport of choice, but it is hugely popular by anyone's standards. Amazon began streaming "Thursday Night Football" live on its Prime Video platform last year. A few months ago, they announced a two-year extension.



More recently, Amazon has begun to compete with itself by showing Thursday Night Football on both Amazon Prime Video and Twitch (note: Amazon bought Twitch in 2014 for about $970 million). To my knowledge, this is the first time a major (non-esports) sports event has been streamed live on Twitch.



My colleagues in the San Francisco office are obsessed with fantasy football, so when I found myself watching Thursday Night Football with my team, I thought, "Hey, what's the stack behind this?" and "Is the Twitch stream just the Amazon Prime stream inside the Twitch player?"



Well, let’s take a closer look at some sports event streaming architecture!



Twitch (left) vs Amazon Prime Video (right)

How to understand streaming media architecture?

Working out the technology stack behind a streaming service isn't really that hard, especially after years in the industry debugging all sorts of weird customer setups and helping customers migrate to new systems. You can try everything I do here yourself: all you need is a browser, curl, the Bento4 toolkit, and a good working knowledge of web video.

Amazon Prime — Unpacking the streaming stack

We'll focus on the desktop browser experience because it's the easiest platform to debug. So let's dive in, load the Amazon Prime Video player in Chrome, and open the web inspector.

What do we need to look for? First, let's assume that Amazon and Twitch are using well-established streaming technologies such as HLS or MPEG-DASH. Both rely on text files called "manifests" that describe the available renditions of the video and tell the player where to fetch the video segments for playback.

For HLS, we typically look for .m3u8 files in the network inspector; for DASH, we look for .mpd requests, or sometimes just .xml requests. Fortunately, in this case, Amazon Prime appears to be using MPEG-DASH with the conventional .mpd file extension for its streams.

If we filter requests on .mpd and watch the stream for a while, we'll notice that the manifest is requested again every few seconds. This is how the player learns when the latest chunk of content is available and where to get it. We can learn a lot about the Amazon Prime Video delivery environment by looking at the manifest. Let's take a look at the (slightly shortened) manifest below.

<?xml version="1.0" encoding="UTF-8"?>
<MPD xmlns="urn:mpeg:dash:schema:mpd:2011"
     xmlns:scte35="urn:scte:scte35:2013:xml"
     xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
     availabilityStartTime="2018-10-24T06:01:19.831000+00:00"
     id="201"
     minBufferTime="PT30S"
     minimumUpdatePeriod="PT5S"
     profiles="urn:mpeg:dash:profile:isoff-live:2011"
     publishTime="2018-10-26T03:17:16"
     suggestedPresentationDelay="PT2.000S"
     timeShiftBufferDepth="PT299.000S"
     type="dynamic"
     xsi:schemaLocation="urn:mpeg:dash:schema:mpd:2011 http://standards.iso.org/ittf/PubliclyAvailableStandards/MPEG-DASH_schema_files/DASH-MPD.xsd">
  <BaseURL>../../../../../../../../../../../SFO/clients/dash/enc/9tojxgpp-1/out/v1/8130a6b1cfb24c2aa27b73b90de12d82/</BaseURL>

Packaging: MPEG-DASH, H.264 in 2-second fMP4 segments

If we look at some of the representations (that is, renditions) in the manifest, we can see the codecs and media packaging that Amazon is using. We can examine the "codecs" attribute to see which codecs are in use, and "mimeType" to check the packaging. The codecs string carries a lot of information, encoded as an RFC 6381 string, including the H.264 profile in use. Carrying this information in the manifest is useful because it lets a player use browser APIs to check that this specific flavor of the codec can be decoded on the device. Amazon uses the common combination of H.264 for video and AAC for audio. Amazon uses nine video renditions for its desktop player, ranging from 288p up to 720p30 at 8 Mbit/s. They also provide demuxed audio renditions in four different languages.



Checking the SegmentTemplate in the manifest, we can see that the segments are served with the .mp4 file extension (this is common, although some people prefer .m4f for their segments). If we change our filter to look for ".mp4" while the content plays, we see segment requests every few seconds. We can also see that Amazon delivers audio and video segments separately (demuxed).



We can also use the SegmentTemplate to work out the length of the video segments. Starting with the video representation, we see that frameRate is set to "30/1". Next, we can see that the timescale of the video is "30". Combining this with the declared duration (d="60") of each segment in the SegmentTimeline, we can calculate that each segment contains 60 frames at 30 fps and therefore 2 seconds of content. When streaming live video, segment length matters because it strongly affects end-to-end latency. In practice, two seconds is about the shortest duration that has no negative impact on encoder performance or on the end user's buffering experience.
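The arithmetic here is small enough to sketch in a few lines of Python. The three numbers are the ones read from the manifest (d="60", timescale="30", frameRate="30/1"); the function name is only illustrative.

```python
from fractions import Fraction

def segment_seconds(d: int, timescale: int) -> Fraction:
    """Segment duration in seconds: ticks in the segment / ticks per second."""
    return Fraction(d, timescale)

# Values taken from the SegmentTemplate / SegmentTimeline discussed above.
duration = segment_seconds(d=60, timescale=30)
frames = duration * Fraction(30, 1)  # frameRate="30/1"

print(f"{float(duration):.3f} s per segment, {int(frames)} frames")
```

Using exact fractions avoids rounding surprises with timescales such as 30000/1001.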

Ad insertion: multi-period DASH

The top-level elements we see in the DASH manifest are Periods. Scrolling down the manifest, we see several top-level Period elements. Multi-period MPEG-DASH is one way to implement ad insertion in live video streams. In this case, we see long periods of content followed by several shorter periods containing ads.
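As a rough illustration of how you might list the Periods in a manifest yourself, here is a minimal Python sketch. The sample manifest is hypothetical and heavily trimmed: the Period ids and durations are made up for the example and are not taken from Amazon's stream.

```python
import re
import xml.etree.ElementTree as ET

MPD_NS = "{urn:mpeg:dash:schema:mpd:2011}"

# Hypothetical manifest: ids and durations are invented for illustration.
SAMPLE_MPD = """<MPD xmlns="urn:mpeg:dash:schema:mpd:2011" type="dynamic">
  <Period id="content-1" start="PT0S" duration="PT600S"/>
  <Period id="ad-1" duration="PT30S"/>
  <Period id="ad-2" duration="PT15S"/>
</MPD>"""

def iso_seconds(value):
    """Parse the simple PT#H#M#S subset of ISO 8601 durations."""
    m = re.fullmatch(r"PT(?:(\d+)H)?(?:(\d+)M)?(?:(\d+(?:\.\d+)?)S)?", value)
    if m is None:
        raise ValueError("unsupported duration: " + value)
    hours, minutes, seconds = m.groups()
    return int(hours or 0) * 3600 + int(minutes or 0) * 60 + float(seconds or 0)

def list_periods(mpd_text):
    """Return (id, duration in seconds or None) for each top-level Period."""
    root = ET.fromstring(mpd_text)
    periods = []
    for period in root.findall(MPD_NS + "Period"):
        duration = period.get("duration")
        periods.append((period.get("id"),
                        iso_seconds(duration) if duration else None))
    return periods

for period_id, seconds in list_periods(SAMPLE_MPD):
    print(period_id, seconds)
```

In a live manifest the final Period often has no duration attribute, which is why the sketch allows None.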

We can learn more from the HTTP requests we looked at earlier. In particular, let's look at the X- headers on the segment responses.

The interesting ones are the X-MediaPackage headers, a smoking gun that gives away Amazon's use of the AWS Elemental MediaPackage product. Looking at the headers on the manifest requests in the same way confirms that the manifest is served by AWS Elemental MediaTailor. MediaTailor is a manifest-based server-side ad insertion (SSAI) solution, so now we know how Amazon is doing ad replacement (and possibly ad targeting) for Thursday Night Football.

DRM: CENC encryption

We can also see how Amazon protects its content: there are two different ContentProtection blocks nested in the manifest. The ContentProtection blocks declare the different methods available to clients for decrypting the content.

<ContentProtection schemeIdUri="urn:uuid:edef8ba9-79d6-4ace-a3c8-27dcd51d21ed"/>  
<ContentProtection schemeIdUri="urn:uuid:9a04f079-9840-4286-ab92-e65be0885f95"/>  

The two UUIDs above are well known in the industry. They tell us that Amazon is using the common combination of Widevine (edef8ba9) and PlayReady (9a04f079). This provides fairly comprehensive coverage across desktop and mobile platforms and the most popular OTT devices.
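The lookup from UUID to DRM system is easy to automate. The following Python sketch maps the two schemeIdUri values shown above to their system names; the wrapping AdaptationSet element in the sample is an assumption made only so that the snippet parses on its own.

```python
import xml.etree.ElementTree as ET

MPD_NS = "{urn:mpeg:dash:schema:mpd:2011}"

# The two system IDs quoted in this article.
DRM_SYSTEMS = {
    "edef8ba9-79d6-4ace-a3c8-27dcd51d21ed": "Widevine",
    "9a04f079-9840-4286-ab92-e65be0885f95": "PlayReady",
}

# Assumed wrapper element; only the ContentProtection lines come from the manifest.
SNIPPET = """<AdaptationSet xmlns="urn:mpeg:dash:schema:mpd:2011">
  <ContentProtection schemeIdUri="urn:uuid:edef8ba9-79d6-4ace-a3c8-27dcd51d21ed"/>
  <ContentProtection schemeIdUri="urn:uuid:9a04f079-9840-4286-ab92-e65be0885f95"/>
</AdaptationSet>"""

def drm_systems(xml_text):
    """Return the DRM system name for every urn:uuid ContentProtection block."""
    root = ET.fromstring(xml_text)
    found = []
    for element in root.iter(MPD_NS + "ContentProtection"):
        uri = element.get("schemeIdUri", "").lower()
        if uri.startswith("urn:uuid:"):
            uuid = uri[len("urn:uuid:"):]
            found.append(DRM_SYSTEMS.get(uuid, "unknown (" + uuid + ")"))
    return found

print(drm_systems(SNIPPET))
```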

CDN delivery: Akamai and CloudFront

Looking at Amazon's manifest, we can see that the BaseURL for the media segments is a relative URL, with no scheme or hostname at the start. This means the video segments are served through the same CDN infrastructure as the manifest. Going back to our original filter to find the manifest request, we see that the manifest (and therefore the segments) is served from https://aivottevtad-a.akamaihd.net. akamaihd.net is an edge hostname owned by Akamai, which tells us that, for this viewing session, Amazon is using Akamai to deliver its video segments to end users.
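To see why a relative BaseURL ties the segments to the manifest's host, here is a small Python sketch using the standard library's URL resolution. The manifest path is hypothetical (only the hostname and the BaseURL come from the stream), so the resolved path is illustrative; the point is the hostname.

```python
from urllib.parse import urljoin, urlparse

# Hypothetical manifest path; the hostname is the one seen in the inspector.
MANIFEST_URL = "https://aivottevtad-a.akamaihd.net/example/path/manifest.mpd"

# BaseURL as it appears in the manifest: relative, no scheme or host.
BASE_URL = ("../" * 11
            + "SFO/clients/dash/enc/9tojxgpp-1/out/v1/"
            + "8130a6b1cfb24c2aa27b73b90de12d82/")

# A player resolves the BaseURL against the URL the manifest was fetched from.
resolved = urljoin(MANIFEST_URL, BASE_URL)

print(urlparse(resolved).hostname)
```

Whatever CDN hostname served the manifest is therefore the one that serves every segment, which is why switching CDNs mid-stream is awkward with this layout.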

This is where it starts to get interesting, because I would have expected AWS CloudFront to be the primary media CDN. AWS is known to be pushing CloudFront in the media space, especially in the United States, where its connectivity is strongest relative to its global footprint. I checked with several other people watching the same stream and did find at least one stream served by CloudFront rather than Akamai. There may be more CDNs in the mix that we haven't found yet. I also saw Amazon Prime using Limelight to deliver video segments in the UK during the US Open.

Given Amazon Prime Video's size and maturity, it makes sense that they would use multiple CDNs to provide some degree of redundancy. However, it is worth noting that with their current strategy of using relative URLs in their manifests, rather than any form of DNS indirection over their edge hostnames, mid-stream CDN switching would not be possible in the current architecture while maintaining QoS (or would at least require fairly complex player modifications). It will be worth watching their approach to see whether they opt for mid-stream CDN switching and, if so, whether they buy a solution off the shelf or build their own. It's going to be fun.

Video encoder: AWS Elemental MediaLive

So far, we have verified that Amazon is using their own AWS Elemental software solutions. We have checked their packaging and ad insertion technology, but we know nothing yet about the encoder (let's face it, I'd be shocked if it wasn't Elemental). It's a little harder to identify, but there's a simple thing we can do to get a hint.

For MPEG-DASH streams, an initialization segment is used for each video or audio rendition to set up the decoder on the client side. You can see the URLs for these MP4 segments in the DASH manifest, in the initialization attribute of the SegmentTemplate. You can download one of the initialization segments and dump its contents using Bento4's mp4dump tool. I won't go into the details of the MP4 structure (although I gave a talk on the subject at Demuxed a few years ago), but we can see the following interesting hierarchy in the moov/trak/mdia/hdlr box:

mp4dump --verbosity 3 amazon-init.mp4

// Trimmed for space saving
[moov] size=8+1693
  [trak] size=8+595
  ...
    [mdia] size=8+495
    ...
      [hdlr] size=12+48
        handler_type = vide
        handler_name = ETI ISO Video Media Handler
In general, encoders set the handler name in the hdlr box to something that identifies them. In this case, "ETI" is the identifier set by Elemental encoders. I don't know exactly what it stands for, but my guess is "Elemental Transcodes Something". From this, I think it's safe to assume that Amazon is once again using AWS Elemental MediaLive for live encoding.

Amazon — Speculative architecture

Warning: some speculation ahead.



When we put everything we've learned above together, we get a pretty comprehensive picture of how Amazon built their video delivery stack for Thursday Night Football, so let's look at an architecture diagram. We'll assume Amazon uses at least the CDNs we have identified, but there could be more.



In fact, for something this high profile, I would expect some degree of redundancy in the architecture as well, possibly multiple independent chains running in different AWS regions. We already know that some form of CDN selection at stream start-up is going on, but I'm sure Amazon is doing more than that.

To be honest, this is a very solid live streaming architecture that really reflects Amazon's commitment to its own AWS and Elemental product catalog. One of the big surprises for me was that Akamai was front and center in their delivery stack. Even though it appears to be load-balanced with CloudFront, there is certainly a lot of data flowing through Akamai, one of CloudFront's biggest competitors.

Twitch – Unpacking the streaming stack

So what about Twitch? Is it just the same content inside their player? Well, this will be more interesting, but it also involves more speculation. We know from talking to Twitch engineers that Twitch builds much of its video infrastructure in-house, which makes it harder to match what we see against platforms known to the industry, but let's see what we can figure out.

We'll use the same strategy as last time: load the game on Twitch in Chrome and look for manifest requests. Twitch is a big believer in delivering content in the Apple HLS format, so let's start by looking for requests for .m3u8 files.



Found on the first try! Straight away, it looks unlikely that Twitch is simply taking the Amazon Prime Video streams and repackaging them. Let's check out Twitch's master manifest to see what we can learn.

#EXTM3U
#EXT-X-TWITCH-INFO:SUPPRESS="true",MANIFEST-NODE="video-weaver.sjc02",BROADCAST-ID="30929838144",MANIFEST-CLUSTER="sjc02",NODE="video-edge-a242e4.sjc02",MANIFEST-NODE-TYPE="weaver_cluster",CLUSTER="sjc02",SERVER-TIME="1540523190.00",TRANSCODESTACK="2017TranscodeEvent_V2",USER-IP="98.210.167.151",SERVING-ID="54fa5e185b94450ab67e1edd3b68cec0",ABS="false",STREAM-TIME="15636.023155"
#EXT-X-MEDIA:TYPE=VIDEO,GROUP-ID="chunked",NAME="720p60 (source)",AUTOSELECT=YES,DEFAULT=YES
#EXT-X-STREAM-INF:FRAME-RATE=60.000,BANDWIDTH=6622552,RESOLUTION=1280x720,CODECS="avc1.4D4020,mp4a.40.2",VIDEO="chunked"
https://video-weaver.sjc02.hls.ttvnw.net/v1/playlist/Ct0Ds6pAbATHFHWuAFVkXteyoZK9Z2PHT-yJpD6-Y3meL9myH6K1DfiSXEGFacqiv-_hmdut19Gn5ye6XZWblmWS1zAKTy8eJONaMYv5Jxz7E0a7hEWxHFnmTUD4IWjEgk57m6IBHHxynZJp5Rp7mIigS6ycHqiTNgWcISWQ9jPpeNtOA9XKISN7GvvI0shGQS7QJZ-DlMPDF39R5o2fbAoHNUekFUcqorg7pOAkfm5SxNO5ikadvXi3g9v1-alJ-Im_LY9ZkQ1BT44uYWsxpqFj15tcgsmY5cSJkCk1AbV9KxXOapla1QQ_Xu1kUpeCdnFzjSk1pTPY0axz3DE_X7ibAMZcsZmNUFDgrN7ofYxdNEAO-fU1C7wWQ697PojkWsd7drfZA478us8lRdSTxeRSOJtnxHArqAeCYBFnxGxzM_TtzOe5k3sHlwoIsY0UmJ6e5drbh7Sm2hZQ46GNRaca4llhzRDg_dkgAZX0WQgHThyga6NxvYM4JJmXeerNjyNxqVSgqOH1LOWwuGZgX22g238GS-b0E39R8rbjTyG6reCUgqMp5A6DGtvvHWQCTliNMjpsu8PSqddYOti2x2Bj3gzI2e3H0w_1OMEmgz8FH491Ye_I5VhCjtUb8yIsEhAuBp5oVXr0Hq_cZ2g8E7B6Ggxg7Pw9W3aEEMZ8Ubk.m3u8
# 5 other renditions removed to save space.


This is a pretty standard HLS master manifest with some Twitch metadata added. According to the HLS specification, players are supposed to ignore tags they don't understand. Let's look at the same areas for Twitch that we looked at for Amazon.
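Pulling the variant information out of a master manifest like this takes very little code. The Python sketch below is a minimal reader, not a complete HLS parser; the sample reuses the tags from the manifest above, but the variant URI is a placeholder standing in for the very long real one.

```python
import re

# KEY=value or KEY="quoted, value" pairs in an HLS attribute list.
ATTRIBUTE_RE = re.compile(r'([A-Z0-9-]+)=("[^"]*"|[^,]*)')

def parse_attributes(text):
    """Turn an HLS attribute list into a dict, stripping surrounding quotes."""
    return {key: value.strip('"') for key, value in ATTRIBUTE_RE.findall(text)}

def parse_master(playlist):
    """Return a list of (attributes, uri) for each #EXT-X-STREAM-INF entry."""
    lines = [line.strip() for line in playlist.splitlines() if line.strip()]
    variants = []
    for index, line in enumerate(lines):
        if line.startswith("#EXT-X-STREAM-INF:"):
            attributes = parse_attributes(line.split(":", 1)[1])
            # The variant URI is the next line that is not a tag or comment.
            uri = next((l for l in lines[index + 1:] if not l.startswith("#")), None)
            variants.append((attributes, uri))
    return variants

# Placeholder URI; the tags are the ones from the Twitch master manifest.
SAMPLE = """#EXTM3U
#EXT-X-MEDIA:TYPE=VIDEO,GROUP-ID="chunked",NAME="720p60 (source)",AUTOSELECT=YES,DEFAULT=YES
#EXT-X-STREAM-INF:FRAME-RATE=60.000,BANDWIDTH=6622552,RESOLUTION=1280x720,CODECS="avc1.4D4020,mp4a.40.2",VIDEO="chunked"
https://example.invalid/playlist/720p60.m3u8
"""

for attributes, uri in parse_master(SAMPLE):
    print(attributes["RESOLUTION"], attributes["BANDWIDTH"], attributes["CODECS"], uri)
```

Handling quoted values separately matters because CODECS contains a comma inside the quotes.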

Packaging: HLS, H.264 in 2-second fMP4 segments

Just from the master manifest we can see which codecs Twitch is using for delivery. In the CODECS attribute of #EXT-X-STREAM-INF we see the same codec combination that we saw from Amazon Prime: H.264 and AAC. Twitch offers six renditions, ranging from 160p up to 720p at 60 fps. This is fairly close to what we saw from Amazon. In Twitch's case, however, the top bitrate is 6.6 Mbps, but at a higher frame rate, which is probably the better choice for high-motion content. It's also worth noting that Twitch's bottom rendition has a lower bitrate and resolution than Amazon's, which suggests Twitch is catering more aggressively for users on cellular or poor Internet connections.
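The RFC 6381 codec string is compact but easy to decode. As a sketch, the Python below splits the avc1.4D4020 value from the Twitch manifest into its three hex bytes: profile, constraint flags and level. The profile table only covers a few common values and is there for illustration.

```python
# A few common H.264 profile_idc values (not an exhaustive list).
H264_PROFILES = {66: "Baseline", 77: "Main", 88: "Extended", 100: "High"}

def parse_avc1(codec):
    """Decode an 'avc1.PPCCLL' string: profile, constraint flags, level (hex bytes)."""
    prefix, _, hex_part = codec.partition(".")
    if prefix != "avc1" or len(hex_part) != 6:
        raise ValueError("not an avc1 codec string: " + codec)
    profile_idc = int(hex_part[0:2], 16)
    constraint_flags = int(hex_part[2:4], 16)
    level_idc = int(hex_part[4:6], 16)
    return {
        "profile": H264_PROFILES.get(profile_idc, "profile_idc %d" % profile_idc),
        "constraint_flags": constraint_flags,
        "level": level_idc / 10,  # level_idc 32 means level 3.2
    }

print(parse_avc1("avc1.4D4020"))
```

In a browser, the equivalent capability check is typically done by passing the MIME type plus this codec string to the media APIs, which is why having it in the manifest is so convenient.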

Since Twitch is using HLS, we need an extra step to get at the exact packaging details. As we explained in our HLS blog post, HLS uses multiple manifests: a master manifest listing all the available renditions, and then a separate manifest for each rendition listing its segments. So let's look at one of the rendition manifests. We can take its URL from the master manifest and download it.

This is where it gets interesting. The rendition manifest served by Twitch declares version 6 of HLS (#EXT-X-VERSION:6), which suggests Twitch is using some of the more modern and interesting features of HLS, and indeed it is. We find that Twitch uses #EXT-X-MAP:URI to point to an fMP4 initialization segment, an approach only included in recent versions of the HLS specification. Further down the manifest we can also see that all the segment URLs point to .mp4 segments.

This is a big departure from Twitch's usual strategy, which has long been to use the more traditional transport stream (.ts) segment container format. But does this new approach signal a fundamental change in Twitch's strategy, or is there a more obvious reason for it?

As it turns out, the answer is actually pretty simple: Twitch appears to be using DRM on the Thursday Night Football stream. To my knowledge, this is the first time Twitch has had DRM-protected content on their platform. I've been following the TwitchPresents channel since I started researching this topic, and I haven't seen DRM used on any Pokemon or Bob Ross episodes. My guess is that the Thursday Night Football contract mandates DRM.

Thankfully, in HLS, we don't have to do any math to get the duration of the media segments, because this information sits neatly right above each media segment in the manifest. In this case, we can see that each segment is preceded by #EXTINF:2.002, indicating that the segment is just over 2 seconds long:

#EXT-X-PROGRAM-DATE-TIME:2018-10-26T03:06:27.559Z
#EXTINF:2.002,live
https://video-edge-a242e4.sjc02.abs.hls.ttvnw.net/v1/segment/LONGTEXT.mp4
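A rendition manifest like this one can be read with a few lines of Python. The sketch below extracts the declared HLS version, the #EXT-X-MAP initialization segment and each segment's #EXTINF duration. The sample manifest is hypothetical: its structure follows what we saw from Twitch, but the file names are placeholders.

```python
import re

def parse_media_playlist(text):
    """Collect the HLS version, init segment URI and (duration, uri) pairs."""
    info = {"version": None, "init_uri": None, "segments": []}
    pending_duration = None
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("#EXT-X-VERSION:"):
            info["version"] = int(line.split(":", 1)[1])
        elif line.startswith("#EXT-X-MAP:"):
            match = re.search(r'URI="([^"]*)"', line)
            if match:
                info["init_uri"] = match.group(1)
        elif line.startswith("#EXTINF:"):
            # "#EXTINF:<duration>,<title>"
            pending_duration = float(line.split(":", 1)[1].split(",", 1)[0])
        elif line and not line.startswith("#") and pending_duration is not None:
            info["segments"].append((pending_duration, line))
            pending_duration = None
    return info

# Hypothetical manifest with placeholder file names.
SAMPLE = """#EXTM3U
#EXT-X-VERSION:6
#EXT-X-MAP:URI="init.mp4"
#EXT-X-PROGRAM-DATE-TIME:2018-10-26T03:06:27.559Z
#EXTINF:2.002,live
segment_1.mp4
#EXTINF:2.002,live
segment_2.mp4
"""

info = parse_media_playlist(SAMPLE)
print(info["version"], info["init_uri"], [d for d, _ in info["segments"]])
```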


DRM: CENC encryption

So how do we tell whether Twitch is using DRM on its fMP4 HLS stream? We need to get Bento4's mp4dump tool out again, of course. We can take the initialization segment URL declared in the rendition manifest and download it to see what data it contains.

This time we'll dump the file and look for pssh boxes, which declare the DRM systems that can be used to decrypt the file. In HLS, data about content encryption has to be embedded in the media, because the specification only defines how to signal FairPlay DRM information in the manifest.

mp4dump --verbosity 3 twitch-init.mp4

// Trimmed for space saving
[pssh] size=12+75
  system_id = [ed ef 8b a9 79 d6 4a ce a3 c8 27 dc d5 1d 21 ed]
  data_size = 55
  data = [...]
[pssh] size=12+966
  system_id = [9a 04 f0 79 98 40 42 86 ab 92 e6 5b e0 88 5f 95]
  data_size = 946
  data = [...]
If we take a closer look at these system_id values, we'll notice that they are the same UUIDs we saw in the ContentProtection blocks in the DASH manifest for the Amazon Prime stream. This leads us to conclude that Twitch is also using PlayReady and Widevine to protect their desktop streams.
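The same check can be sketched in Python without mp4dump. The code below looks for pssh boxes at the top level and inside moov, and maps each system ID to a DRM name. It is deliberately simplified (32-bit box sizes only), and the sample bytes are synthetic, built here with empty DRM data rather than taken from Twitch's initialization segment.

```python
import struct

# The two system IDs seen in the dump above, without separators.
SYSTEM_IDS = {
    "edef8ba979d64acea3c827dcd51d21ed": "Widevine",
    "9a04f07998404286ab92e65be0885f95": "PlayReady",
}

def iter_boxes(data, start, end):
    """Yield (type, body_start, box_end); assumes plain 32-bit box sizes."""
    pos = start
    while pos + 8 <= end:
        size, box_type = struct.unpack_from(">I4s", data, pos)
        if size < 8:
            break  # 64-bit or open-ended sizes are not handled in this sketch
        yield box_type, pos + 8, pos + size
        pos += size

def pssh_systems(data):
    """Return the DRM system name (or raw hex ID) for each pssh box."""
    systems = []

    def walk(start, end):
        for box_type, body, stop in iter_boxes(data, start, end):
            if box_type == b"moov":
                walk(body, stop)
            elif box_type == b"pssh":
                # Layout: version/flags (4), then the 16-byte SystemID.
                system_id = data[body + 4:body + 20].hex()
                systems.append(SYSTEM_IDS.get(system_id, system_id))

    walk(0, len(data))
    return systems

def make_box(box_type, payload):
    """Helper used only to build the synthetic sample below."""
    return struct.pack(">I4s", 8 + len(payload), box_type) + payload

def make_pssh(system_id_hex):
    # version/flags, SystemID, DataSize = 0 (no DRM-specific data)
    return make_box(b"pssh", bytes(4) + bytes.fromhex(system_id_hex) + struct.pack(">I", 0))

sample = make_box(b"moov", b"".join(make_pssh(s) for s in SYSTEM_IDS))

print(pssh_systems(sample))
```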

Ad insertion: Twitch Weaver

The Twitch stream also carries ads, although not the ones shown in the TV broadcast. Looking at the master manifest, we can see that the rendition manifest URLs point to a host known as "Weaver": https://video-weaver.sjc02.hls.ttvnw.net.

As we learned at Demuxed a few weeks ago, Weaver is Twitch's HLS ad insertion service, which stitches ads into video streams by declaring discontinuities in the playlist and inserting segments of ad content. This approach is fairly standard in the industry and much simpler than using multi-period DASH.
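To make the discontinuity approach concrete, here is a minimal Python sketch that splits a rendition manifest into runs of segments separated by #EXT-X-DISCONTINUITY tags. The sample manifest and its file names are invented for illustration; it is not Weaver's actual output, and the sketch cannot tell which runs are ads, only where the boundaries are.

```python
def split_on_discontinuities(playlist):
    """Group segment URIs into runs separated by #EXT-X-DISCONTINUITY."""
    runs, current = [], []
    for raw in playlist.splitlines():
        line = raw.strip()
        if line == "#EXT-X-DISCONTINUITY":
            if current:
                runs.append(current)
            current = []
        elif line and not line.startswith("#"):
            current.append(line)
    if current:
        runs.append(current)
    return runs

# Hypothetical manifest: content, a stitched-in ad, then content again.
SAMPLE = """#EXTM3U
#EXT-X-VERSION:6
#EXTINF:2.002,live
content_1.mp4
#EXTINF:2.002,live
content_2.mp4
#EXT-X-DISCONTINUITY
#EXTINF:2.002,
ad_1.mp4
#EXT-X-DISCONTINUITY
#EXTINF:2.002,live
content_3.mp4
"""

for run in split_on_discontinuities(SAMPLE):
    print(run)
```

The discontinuity tag tells the player that timestamps and encoding parameters may change at that point, which is what allows separately encoded ad segments to be spliced into the live playlist.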

CDN: Twitch CDN (possibly)

Now things start to get a little murkier. If we try to repeat our earlier method to work out which CDN Twitch is using, we don't get any clues. Checking the origin of the Twitch segment URLs gives us the hostname video-edge-a242e4.sjc02.abs.hls.ttvnw.net, which doesn't help us much.

However, it is well known in the industry that Twitch runs its own CDN — I have checked other video streams from Twitch and they seem to come from similar host names in the same IP range as the ones I logged while watching Thursday Night Football. Reverse DNS lookups and IP WHOIS lookups don’t show anything particularly useful, just IP ranges owned by Amazon/Twitch.

Video encoder: Twitch's own (possibly)

Working out which encoder Twitch is using is also challenging. First, we can try dumping the contents of the hdlr box using the same method as before, but unfortunately it gives us a very generic answer:

mp4dump --verbosity 3 twitch-init.mp4

[hdlr] size=12+33
    handler_type = vide
    handler_name = VideoHandler

However, we can make an educated guess based on what Twitch employees have said publicly. At the Streaming Media East event last year, Yueshi Shen and Ivan Marcin gave an excellent presentation on Twitch's previous and next-generation transcoding architectures. In this talk, Yueshi explained that their new architecture is built around Intel's Quick Sync, chosen for its combination of cost, stability and visual quality. I think the best assumption is that Twitch is using their usual Quick Sync encoder chain for video encoding.

Twitch – Speculative architecture

Warning: some speculation ahead.

At this stage, we've learned as much as we can without any inside knowledge of how Twitch is built. Again, I've put together a theoretical architecture diagram of how I think Twitch is laid out internally.





My main observation is that, in this case, everything is Twitch's own proprietary software. That is not terribly shocking, but it's fair to say that Twitch's approach has distinct advantages in terms of latency, which I'll discuss in the next section.

The user experience

I'd be remiss if I didn't mention the end-user experience. From the end user's point of view, the experience on the two services is consistent and comparable. For me, at least on a fairly stable Internet connection, the video played smoothly, with no buffering or visual quality issues on either platform.

However, I want to emphasize that there are some significant differences in end-user experiences between the two platforms.

Latency

When it came to watching the game, the Twitch stream was clearly ahead of the Amazon Prime Video stream. Unfortunately, we didn't have access to a cable TV feed during our experiment to measure the difference against traditional broadcast, so I can't say exactly how much wall-clock latency we're talking about, but I can give some comparative data.

To test the relative latency, I refreshed each stream twice to let its timing settle, then picked a visual marker on the Twitch stream, started my stopwatch, and waited for the Amazon Prime stream to reach the same visual marker.

The difference between the streams is quite striking. On average, the Twitch stream was 12 seconds ahead of the Amazon Prime stream. On some attempts the difference was as little as 10 seconds, while on others it was as much as 16 seconds.

This is well worth pursuing. At LiveVideoStackCon 2018 in October, Twitch Principal Research Engineer Yueshi Shen introduced low-latency live streaming via HLS.

If we look at the definitions of "low-latency streaming" that Akamai's Will Law gave at Demuxed earlier this year, we can see roughly where Twitch and Amazon sit today.

Let's say Amazon's wall-clock latency is around 10-15 seconds and Twitch's is around 5 seconds. Will would describe Amazon as firmly in the "legacy latency range", while Twitch is at the leading edge of the "low latency range".

In this particular case, Twitch is a long way ahead and Amazon has some catching up to do.

Platform coverage

I also want to mention something else that I noticed while researching this article. Twitch's addition of DRM to Thursday Night Football appears to have had an impact on the platforms on which the stream is available.

As Twitch points out on its blog, the stream is "available on the web and mobile apps", meaning that a whole range of platforms Twitch has traditionally reached (including Chromecast, PS4 and Xbox One) don't currently support the Thursday Night Football stream. This is in stark contrast to Amazon's Prime Video platform, where the live stream appears to be available anywhere Amazon has a Prime Video app.

Conclusion

Wow, that was a long article; congratulations on making it this far! Given the work we've done, I've summarized the key technical details of each implementation below:



Author’s note: This data only applies to videos provided to desktop browsers. Other technologies may be used for some native devices, especially iOS applications.

Now a few comments. At the building-block level the two architectures look very different, but the same fundamental approach is used in both, even if the details of the technology stacks differ.

Both approaches use H.264 and AAC, both use 2-second fMP4 segments protected with Widevine and PlayReady, and both use an SSAI strategy based on manifest manipulation. However, Twitch's in-house encoding, CDN, and packaging architecture allows them to deliver lower-latency streams at higher frame rates. Amazon has the advantage of a significantly higher top bitrate and a more comprehensive device footprint.

While Amazon's approach relies heavily on AWS Elemental products, that also makes it a great reference architecture: they can take the AWS Elemental product suite to market and say, "Hey, this is what's used for Thursday Night Football," and that is very valuable in the high-end live streaming market.

One last thought. Having spent nearly $1 billion on Twitch, and given that Twitch's approach appears to deliver the same quality of experience with significantly lower latency (which is critical for live sports), why isn't Amazon using it for its Prime Video streams?