How to Play a Raw H264 Stream with HTML5?

I was surprised when I first heard the requirement. Why parse a raw stream on the front end instead of pushing it in real time as HLS (m3u8)? The answer is that some business scenarios have strict real-time requirements. If the raw stream is sent to a server first and then pushed out, the delay is long, usually about 5 to 10 seconds. In security monitoring, for example, if a thief slips in and the surveillance picture only shows it 10 seconds later, the consequences are easy to imagine. Major browsers will disable the Flash plugin starting in 2020, so the days of using Flash to parse raw streams are over. After some investigation, I found two options:

  1. Transmit the H264-encoded data over WebSocket, decode it with an open source library such as Broadway, and draw the frames on an HTML5 canvas. In testing, Broadway's decoding efficiency was not high; a relatively large main-stream picture was very slow. (MacBook)
  2. Use MSE (Media Source Extensions) to play the live stream in an HTML5 video tag. In testing, a single-screen main stream did not lag, four screens of sub-streams did not lag, and CPU usage stayed at about 25%. (MacBook)
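
The MSE approach in scheme 2 comes down to appending fMP4 segments to a SourceBuffer, which accepts only one append at a time. Here is a minimal sketch of that queueing step, assuming segments arrive as Uint8Array values. The names SegmentQueue and attach and the codec string are illustrative assumptions, not taken from any library:

```javascript
// Hypothetical helper: serializes appendBuffer() calls, because a SourceBuffer
// throws if appendBuffer() is called while a previous append is still running.
class SegmentQueue {
  constructor(sourceBuffer) {
    this.sourceBuffer = sourceBuffer;
    this.pending = [];
    // 'updateend' fires when an append finishes; use it to drain the queue.
    sourceBuffer.addEventListener('updateend', () => this.flush());
  }

  // Queue one fMP4 segment and try to append it right away.
  push(segment) {
    this.pending.push(segment);
    this.flush();
  }

  // Append the oldest queued segment if the SourceBuffer is idle.
  flush() {
    if (this.sourceBuffer.updating || this.pending.length === 0) {
      return;
    }
    this.sourceBuffer.appendBuffer(this.pending.shift());
  }
}

// Browser-only wiring. The codec string is an assumption (H264 Baseline,
// level 3.0); it has to match the actual stream.
function attach(video, onReady) {
  const mediaSource = new MediaSource();
  video.src = URL.createObjectURL(mediaSource);
  mediaSource.addEventListener('sourceopen', () => {
    const sb = mediaSource.addSourceBuffer('video/mp4; codecs="avc1.42E01E"');
    onReady(new SegmentQueue(sb));
  });
}
```

Queueing matters for live streams because segments can arrive faster than the browser finishes each append.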

Concrete implementation:

Using scheme 2, parse the raw H264 stream and package it into fMP4 fragments. JMuxer, a library that packages code snippets from hls.js, is recommended. A walkthrough of the source code:

Find the first frame

static extractNALu(buffer) {
    let i = 0;
    const length = buffer.byteLength;
    let value;
    let state = 0;
    const result = [];
    let lastIndex;
    // debug.log('length='+length);
    while (i < length) {
      value = buffer[i++];
      // finding 3 or 4-byte start codes (00 00 01 OR 00 00 00 01)
      // debug.log('state='+state);
      switch (state) {
        case 0:
          if (value === 0) {
            state = 1;
          }
          break;
        case 1:
          if (value === 0) {
            state = 2;
          } else {
            state = 0;
          }
          break;
        case 2:
        case 3:
          if (value === 0) {
            state = 3;
          } else if (value === 1 && i < length) {
            if (lastIndex) {
              result.push(buffer.subarray(lastIndex, i - state - 1));
            }
            lastIndex = i;
            state = 0;
          } else {
            state = 0;
          }
          break;
        default:
          break;
      }
    }
    if (lastIndex) {
      result.push(buffer.subarray(lastIndex, length));
    }
    return result;
  }

After splitting the video stream into NAL units, find the units of type 7, 8, and 5 (SPS, PPS, and IDR frame) and package them into the first initial fragment. This reduces the green-screen artifacts caused by frames that cannot be used yet:

 nalus = H264Parser.extractNALu(data.video);
      const nalarr = [];
      const nal = nalus.shift();
      const nalType = nal[0] & 0x1f;
      
      if (this.spspps === false) {
        if (nalType === 7) {
          this.spsnal = nal;
          debug.log('find SPS');
          return;
        }
        if (nalType === 8) {
          this.ppsnal = nal;
          debug.log('find PPS');
          return;
        }
        if (nalType === 5) {
          this.ipsnal = nal;
          debug.log('find ips');
          return;
        }
        if (this.spsnal != null && this.ppsnal != null) {
          nalarr.push(this.spsnal);
          nalarr.push(this.ppsnal);
          nalarr.push(this.ipsnal);
          this.spspps = true;
          debug.log('encapsulate the initial frame with frames 7,8,5');
        }
      } else if (nalType !== 7 && nalType !== 8) {
        nalarr.push(nal);
      }
      if (nalarr.length > 0) {
        chunks.video = this.getVideoFrames(nalarr, duration); // nalarr nalus
        remux = true;
      }
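
The low five bits of a NAL unit's first byte give its type: 7 is an SPS, 8 is a PPS, and 5 is an IDR slice. The sketch below shows the same gating idea in isolation, holding back output until all three have been seen. The class name InitGate is hypothetical and this is not JMuxer's code. It also emits the initial group as soon as the IDR slice arrives, which is a simplification of my own:

```javascript
const NAL_IDR = 5;
const NAL_SPS = 7;
const NAL_PPS = 8;

// The NAL unit type is the low 5 bits of the header byte.
function nalType(nal) {
  return nal[0] & 0x1f;
}

// Hypothetical gate: outputs nothing until SPS, PPS and an IDR slice are known.
class InitGate {
  constructor() {
    this.sps = null;
    this.pps = null;
    this.open = false;
  }

  // Returns the NAL units to mux for this input (possibly an empty array).
  accept(nal) {
    const type = nalType(nal);
    if (this.open) {
      // Parameter sets were already sent with the initial group; skip repeats.
      return type === NAL_SPS || type === NAL_PPS ? [] : [nal];
    }
    if (type === NAL_SPS) {
      this.sps = nal;
    } else if (type === NAL_PPS) {
      this.pps = nal;
    } else if (type === NAL_IDR && this.sps && this.pps) {
      // First decodable picture: emit SPS + PPS + IDR together.
      this.open = true;
      return [this.sps, this.pps, nal];
    }
    // Anything else before the gate opens cannot be decoded, so drop it.
    return [];
  }
}
```

Dropping P and B slices that arrive before the first IDR is what avoids the green frames, since they refer to pictures the decoder never received.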

Parse information from the binary video stream. A DataView can read from and write to a binary ArrayBuffer object:

export class ByteArray {
  constructor(ab) {
    this.buffer = ab;
    this.dataView = new DataView(ab);
    // current read/write offset
    this.offset = 0;
  }

  GetArrayBuffer() {
    return this.buffer;
  }

  get bytesAvailable() {
    let diff = this.buffer.byteLength - this.offset;
    if (diff < 0) {
      diff = 0;
    }
    return diff;
  }


  WriteString(str) {
    for (let i = 0; i < str.length; i++) {
      this.dataView.setUint8(this.offset + i, str.charCodeAt(i));
    }
    this.offset += str.length;
  }

  WriteUint32(value) {
    this.dataView.setUint32(this.offset, value);
    this.offset += 4;
  }

  ReadUint32(little = false) {
    const result = this.dataView.getUint32(this.offset, little);
    this.offset += 4;
    return result;
  }

  ReadUint16(little = false) {
    const result = this.dataView.getUint16(this.offset, little);
    this.offset += 2;
    return result;
  }

  ReadUint8() {
    const result = this.dataView.getUint8(this.offset);
    this.offset += 1;
    return result;
  }

  SliceNewAB(len, num = 0) {
    const ab = this.buffer.slice((this.offset - num), (this.offset - num) + len);
    this.offset += len - num;
    return ab;
  }

  SetOffset(num) {
    this.offset += num;
  }

  ReadStringBytes(len) {
    let str = '';
    for (let i = 0; i < len; i++) {
      str += String.fromCharCode(this.dataView.getUint8(this.offset + i));
    }
    this.offset += len;
    return str;
  }
}
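
DataView reads multi-byte integers as big-endian unless the last argument is true, which is why the ReadUint32 wrapper defaults little to false. Here is a small standalone sketch of writing and reading a header with DataView. The 8-byte layout and the function names are made up for illustration:

```javascript
// Hypothetical 8-byte header: a 4-byte ASCII tag followed by a uint32 length.
function writeHeader(tag, length) {
  const view = new DataView(new ArrayBuffer(8));
  for (let i = 0; i < 4; i++) {
    view.setUint8(i, tag.charCodeAt(i));
  }
  view.setUint32(4, length); // big-endian by default
  return view.buffer;
}

function readHeader(buffer) {
  const view = new DataView(buffer);
  let tag = '';
  for (let i = 0; i < 4; i++) {
    tag += String.fromCharCode(view.getUint8(i));
  }
  return {
    tag,
    length: view.getUint32(4),         // big-endian read
    lengthLE: view.getUint32(4, true), // same four bytes read as little-endian
  };
}

const header = readHeader(writeHeader('H264', 1));
console.log(header.tag, header.length, header.lengthLE); // H264 1 16777216
```

Reading with the wrong byte order does not fail; it just yields a wrong length, so it is worth confirming the byte order the sender uses.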

Parsing video fragments (the logic varies with how the packets are defined):

    const socketBA = new ByteArray(abdata);
    while (socketBA.bytesAvailable > 0) {
      socketBA.ReadUint32();
      socketBA.ReadUint32();
      socketBA.ReadUint32();
      // biSize is the byte length of the H264 payload; where it comes from depends on the packet definition
      const h264buf = socketBA.SliceNewAB(biSize);
      this.feed({ video: new Uint8Array(h264buf) });
    }
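
For a stream with no custom packet header, the WebSocket messages can be passed to the muxer directly. This is a rough sketch, assuming each message is a chunk of raw H264. The function name playRawH264 is hypothetical, and the option names are the ones I understand JMuxer to accept, so check them against the version you use:

```javascript
// Sketch: forward binary WebSocket messages to a JMuxer-style muxer.
// `Muxer` is expected to be the JMuxer constructor; it is passed in so this
// function does not depend on how the library is loaded.
function playRawH264(Muxer, socket, videoElementId) {
  const muxer = new Muxer({
    node: videoElementId, // id of the <video> element
    mode: 'video',        // video only, no audio track
    flushingTime: 0,      // assumed: flush immediately to keep latency low
    debug: false,
  });
  socket.binaryType = 'arraybuffer'; // receive ArrayBuffer instead of Blob
  socket.onmessage = (event) => {
    muxer.feed({ video: new Uint8Array(event.data) });
  };
  return muxer;
}

// In a page (the URL and element id are placeholders):
// playRawH264(JMuxer, new WebSocket('ws://localhost:8080'), 'player');
```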

In testing, the delay was about 100 ms. Leave a comment if you have any questions.