The NALU (Network Abstraction Layer Unit) is the basic unit of a compressed video stream. Depending on the application scenario and transport mechanism, NALUs are transmitted in one of two modes: packet stream or byte stream.

Packet stream

Packet stream mode is used with packet-based transports such as the RTP protocol, where the NALU is carried directly as the payload of the RTP packet.

Byte stream

In byte stream mode, NALUs are concatenated in decoding order into a single byte stream. A NALU carries no length information of its own, so if NALUs were simply concatenated there would be no way to tell where one ends and the next begins. To solve this, a start code is added before each NALU. The relevant rules are defined in Annex B of the H.265 standard.

NALU byte stream generation process:

  1. Before each NALU, insert the 3-byte start code start_code_prefix_one_3bytes, whose value is 0x000001.
  2. If the NALU type is VPS_NUT, SPS_NUT, or PPS_NUT, or the NALU is the first NALU of an access unit (AU), put a zero_byte with the value 0x00 in front of the start code.
  3. Before the start code (including its zero_byte, if any) of the first NALU of the video stream, leading_zero_8bits with the value 0x00 may be inserted. Note: leading_zero_8bits can only appear before the first NALU. Anywhere else, 0x00 bytes that come before the 4-byte sequence 0x00 00 00 01 (zero_byte followed by the start code) are treated as trailing_zero_8bits of the previous NALU.
  4. After each NALU, add trailing_zero_8bits with the value 0x00 as padding if needed.
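
Steps 1 and 2 above can be sketched in C as follows. This is only a minimal illustration, not code from any library: the function name annexb_write_nalu and the needs_zero_byte flag are made up for this example, and the optional leading and trailing zero padding of steps 3 and 4 is left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: append one NALU to an Annex B byte stream.
 * needs_zero_byte should be nonzero for VPS/SPS/PPS NALUs and for the
 * first NALU of an access unit (step 2).
 * Returns the number of bytes written, or 0 if dst is too small. */
static size_t annexb_write_nalu(uint8_t *dst, size_t dst_cap,
                                const uint8_t *nalu, size_t nalu_len,
                                int needs_zero_byte)
{
    assert(dst != NULL && nalu != NULL);

    size_t prefix_len = needs_zero_byte ? 4 : 3;
    if (dst_cap < prefix_len || dst_cap - prefix_len < nalu_len)
        return 0;

    size_t pos = 0;
    if (needs_zero_byte)
        dst[pos++] = 0x00;      /* zero_byte */
    dst[pos++] = 0x00;          /* start_code_prefix_one_3bytes = 0x000001 */
    dst[pos++] = 0x00;
    dst[pos++] = 0x01;

    memcpy(dst + pos, nalu, nalu_len);
    return pos + nalu_len;
}
```

A caller would typically pass needs_zero_byte = 1 for parameter sets and for the first slice of each picture, and 0 for the remaining NALUs of the same AU.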

The syntax of the byte stream is shown in the following table; it is what allows NALUs to be extracted from the byte stream. The start of a NALU is found by searching for the start code 0x000001, and the start of the video stream by finding the first 0x00000001. The first NALU of every AU is also preceded by the 4-byte form 0x00000001 (the same form is used in front of VPS, SPS, and PPS NALUs), so AU boundaries can be narrowed down by searching for 0x00000001.
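
To show how the start code search works in practice, here is a small C sketch that walks a buffer and returns the byte range of each NALU. The names annexb_find_start_code and annexb_next_nalu are hypothetical, and the sketch assumes the whole stream is already in memory. Zero bytes in front of the next start code (zero_byte or trailing_zero_8bits) are dropped from the end of the NALU, which is safe because a NALU never ends in 0x00.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: index of the next 0x000001 at or after `from`,
 * or `len` if there is none. */
static size_t annexb_find_start_code(const uint8_t *buf, size_t len, size_t from)
{
    for (size_t i = from; i + 3 <= len; i++) {
        if (buf[i] == 0x00 && buf[i + 1] == 0x00 && buf[i + 2] == 0x01)
            return i;
    }
    return len;
}

/* Hypothetical helper: locate the next NALU at or after `from`.
 * On success stores the NALU byte range [*start, *end) and returns 1.
 * Returns 0 when no further start code exists. */
static int annexb_next_nalu(const uint8_t *buf, size_t len, size_t from,
                            size_t *start, size_t *end)
{
    assert(buf != NULL && start != NULL && end != NULL);

    size_t sc = annexb_find_start_code(buf, len, from);
    if (sc == len)
        return 0;

    *start = sc + 3;                                  /* first byte after 0x000001 */
    *end = annexb_find_start_code(buf, len, *start);  /* next start code or end of data */

    /* Strip zero_byte / trailing_zero_8bits that belong to the framing. */
    while (*end > *start && buf[*end - 1] == 0x00)
        (*end)--;
    return 1;
}
```

Calling annexb_next_nalu repeatedly, each time passing the previous end as from, visits every NALU in the buffer.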

NALU

In H.265, a NALU consists of a NALU header and a NALU body.

NALU Header

The NALU header is fixed at 2 bytes and has the following structure.

The definition in FFmpeg is as follows:

typedef struct H265RawNALUnitHeader {
    uint8_t forbidden_zero_bit;
    uint8_t nal_unit_type;
    uint8_t nuh_layer_id;
    uint8_t nuh_temporal_id_plus1;
} H265RawNALUnitHeader;

  • forbidden_zero_bit: 1 bit. It must be 0, which prevents start code emulation in MPEG-2 systems environments.
  • nal_unit_type: 6 bits. The value ranges from 0 to 63 and indicates the type of the current NALU.
  • nuh_layer_id: 6 bits. In the first version of the standard this field was reserved for future use and must be 0; the later layered extensions use it to identify the layer a NALU belongs to.
  • nuh_temporal_id_plus1: 3 bits. Its value minus 1 is the temporal layer identifier of the NALU: TemporalId = nuh_temporal_id_plus1 − 1. The value 0 is not allowed.
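
The four fields are packed into the 2 header bytes in the order listed (1 + 6 + 6 + 3 bits), so nuh_layer_id straddles the byte boundary. A minimal C sketch of unpacking them is shown below; the NaluHeader type and parse_nalu_header function are illustrative names, not FFmpeg API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical struct mirroring the header fields described above. */
typedef struct {
    uint8_t forbidden_zero_bit;
    uint8_t nal_unit_type;
    uint8_t nuh_layer_id;
    uint8_t nuh_temporal_id_plus1;
} NaluHeader;

/* Unpack the 2-byte H.265 NALU header (most significant bit first). */
static NaluHeader parse_nalu_header(const uint8_t *hdr)
{
    assert(hdr != NULL);

    NaluHeader h;
    h.forbidden_zero_bit    = (uint8_t)(hdr[0] >> 7);            /* bit 7 of byte 0 */
    h.nal_unit_type         = (uint8_t)((hdr[0] >> 1) & 0x3F);   /* bits 6..1 of byte 0 */
    h.nuh_layer_id          = (uint8_t)(((hdr[0] & 0x01) << 5)   /* bit 0 of byte 0 ... */
                                        | (hdr[1] >> 3));        /* ... plus bits 7..3 of byte 1 */
    h.nuh_temporal_id_plus1 = (uint8_t)(hdr[1] & 0x07);          /* bits 2..0 of byte 1 */
    return h;
}
```

For example, the header bytes 0x40 0x01 decode to nal_unit_type 32 (VPS), nuh_layer_id 0, and nuh_temporal_id_plus1 1, that is, TemporalId 0.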

The types of NALU are defined as follows.

 

FFmpeg defines the NALU types as follows:

enum HEVCNALUnitType {
    HEVC_NAL_TRAIL_N    = 0,
    HEVC_NAL_TRAIL_R    = 1,
    HEVC_NAL_TSA_N      = 2,
    HEVC_NAL_TSA_R      = 3,
    HEVC_NAL_STSA_N     = 4,
    HEVC_NAL_STSA_R     = 5,
    HEVC_NAL_RADL_N     = 6,
    HEVC_NAL_RADL_R     = 7,
    HEVC_NAL_RASL_N     = 8,
    HEVC_NAL_RASL_R     = 9,
    HEVC_NAL_VCL_N10    = 10,
    HEVC_NAL_VCL_R11    = 11,
    HEVC_NAL_VCL_N12    = 12,
    HEVC_NAL_VCL_R13    = 13,
    HEVC_NAL_VCL_N14    = 14,
    HEVC_NAL_VCL_R15    = 15,
    HEVC_NAL_BLA_W_LP   = 16,
    HEVC_NAL_BLA_W_RADL = 17,
    HEVC_NAL_BLA_N_LP   = 18,
    HEVC_NAL_IDR_W_RADL = 19,
    HEVC_NAL_IDR_N_LP   = 20,
    HEVC_NAL_CRA_NUT    = 21,
    HEVC_NAL_IRAP_VCL22 = 22,
    HEVC_NAL_IRAP_VCL23 = 23,
    HEVC_NAL_RSV_VCL24  = 24,
    HEVC_NAL_RSV_VCL25  = 25,
    HEVC_NAL_RSV_VCL26  = 26,
    HEVC_NAL_RSV_VCL27  = 27,
    HEVC_NAL_RSV_VCL28  = 28,
    HEVC_NAL_RSV_VCL29  = 29,
    HEVC_NAL_RSV_VCL30  = 30,
    HEVC_NAL_RSV_VCL31  = 31,
    HEVC_NAL_VPS        = 32,
    HEVC_NAL_SPS        = 33,
    HEVC_NAL_PPS        = 34,
    HEVC_NAL_AUD        = 35,
    HEVC_NAL_EOS_NUT    = 36,
    HEVC_NAL_EOB_NUT    = 37,
    HEVC_NAL_FD_NUT     = 38,
    HEVC_NAL_SEI_PREFIX = 39,
    HEVC_NAL_SEI_SUFFIX = 40,
    HEVC_NAL_RSV_NVCL41 = 41,
    HEVC_NAL_RSV_NVCL42 = 42,
    HEVC_NAL_RSV_NVCL43 = 43,
    HEVC_NAL_RSV_NVCL44 = 44,
    HEVC_NAL_RSV_NVCL45 = 45,
    HEVC_NAL_RSV_NVCL46 = 46,
    HEVC_NAL_RSV_NVCL47 = 47,
    HEVC_NAL_UNSPEC48   = 48,
    HEVC_NAL_UNSPEC49   = 49,
    HEVC_NAL_UNSPEC50   = 50,
    HEVC_NAL_UNSPEC51   = 51,
    HEVC_NAL_UNSPEC52   = 52,
    HEVC_NAL_UNSPEC53   = 53,
    HEVC_NAL_UNSPEC54   = 54,
    HEVC_NAL_UNSPEC55   = 55,
    HEVC_NAL_UNSPEC56   = 56,
    HEVC_NAL_UNSPEC57   = 57,
    HEVC_NAL_UNSPEC58   = 58,
    HEVC_NAL_UNSPEC59   = 59,
    HEVC_NAL_UNSPEC60   = 60,
    HEVC_NAL_UNSPEC61   = 61,
    HEVC_NAL_UNSPEC62   = 62,
    HEVC_NAL_UNSPEC63   = 63,
};
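
The numbering is grouped by range, which makes simple classification checks possible: values 0 to 31 are VCL (slice data) types, 32 to 63 are non-VCL types, and 16 to 23 are the IRAP (random access) types. The helpers below are a hypothetical sketch built only on those ranges; they take the raw nal_unit_type value and are not FFmpeg functions.

```c
#include <assert.h>

/* Hypothetical helpers; t is a nal_unit_type value in the range 0..63. */

/* VCL NALUs carry coded slice data: types 0..31. */
static int hevc_nal_is_vcl(int t)
{
    assert(t >= 0 && t <= 63);
    return t <= 31;
}

/* IRAP pictures (BLA, IDR, CRA and reserved IRAP types): types 16..23. */
static int hevc_nal_is_irap(int t)
{
    assert(t >= 0 && t <= 63);
    return t >= 16 && t <= 23;
}

/* Parameter sets: VPS (32), SPS (33), PPS (34). */
static int hevc_nal_is_parameter_set(int t)
{
    assert(t >= 0 && t <= 63);
    return t >= 32 && t <= 34;
}
```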

NALU Body

Since the length of a NALU must be a whole number of bytes, the bitstream that goes into a NALU usually needs padding. The compressed bitstream segment produced by video encoding is called the SODB (String of Data Bits). The SODB is not necessarily a whole number of bytes long, so it is padded with extra bits to reach a byte boundary. The padded bitstream is called the RBSP (Raw Byte Sequence Payload).

The process for SODB to generate RBSP is as follows:

  1. The first byte of the RBSP takes the leftmost 8 bits of the SODB, the second byte takes the next 8 bits, and so on until fewer than 8 bits of the SODB remain.
  2. The next byte of the RBSP starts with the remaining bits of the SODB, followed by a single 1 bit; if the byte is still not full, the rest is filled with 0 bits.
  3. This may be followed by any number of 16-bit cabac_zero_word padding words, each with the value 0x00 00.
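
Steps 1 and 2 can be sketched in C as below. The function name sodb_to_rbsp is made up for this illustration, the SODB is assumed to be stored most significant bit first with its length given in bits, and the optional cabac_zero_word padding of step 3 is not produced.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: build an RBSP from an SODB of bit_len bits.
 * Returns the RBSP length in bytes, or 0 if rbsp_cap is too small. */
static size_t sodb_to_rbsp(const uint8_t *sodb, size_t bit_len,
                           uint8_t *rbsp, size_t rbsp_cap)
{
    assert(sodb != NULL && rbsp != NULL);

    size_t full_bytes = bit_len / 8;              /* step 1: whole bytes are copied as-is */
    unsigned rem_bits = (unsigned)(bit_len % 8);  /* leftover SODB bits, 0..7 */

    if (rbsp_cap < full_bytes + 1)
        return 0;
    memcpy(rbsp, sodb, full_bytes);

    /* Step 2: last SODB bits, then a 1 bit, then 0 bits up to the byte boundary. */
    uint8_t last = 0;
    if (rem_bits != 0)
        last = (uint8_t)(sodb[full_bytes] & (0xFF << (8 - rem_bits)));
    last |= (uint8_t)(0x80 >> rem_bits);
    rbsp[full_bytes] = last;

    return full_bytes + 1;
}
```

Note that when the SODB already ends on a byte boundary, a complete extra byte 0x80 is still appended, because the 1 bit is always present.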

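The RBSP cannot be used directly as the NALU body yet, because it may contain the byte sequence 0x00 00 01, which would be mistaken for a start code. To prevent this, an emulation prevention byte 0x03 is inserted whenever two consecutive 0x00 bytes are followed by a byte with a value from 0x00 to 0x03. For example, 0x00 00 01 becomes 0x00 00 03 01. The decoder removes these 0x03 bytes again before parsing the RBSP.

0x00 00 02 is a reserved code, so it is escaped in the same way.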
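
The escaping of conflicting sequences can be sketched as follows, assuming the usual H.265 emulation prevention rule: a 0x03 byte is inserted whenever two consecutive 0x00 bytes would otherwise be followed by a byte in the range 0x00 to 0x03, and a final 0x03 is appended if the RBSP ends in 0x00. The function name rbsp_to_nalu_body is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: copy an RBSP into a NALU body, inserting
 * emulation prevention bytes (0x03) where needed.
 * Returns the number of bytes written, or 0 if out_cap is too small. */
static size_t rbsp_to_nalu_body(const uint8_t *rbsp, size_t rbsp_len,
                                uint8_t *out, size_t out_cap)
{
    assert(rbsp != NULL && out != NULL);

    size_t pos = 0;
    int zeros = 0;  /* consecutive 0x00 bytes written so far */

    for (size_t i = 0; i < rbsp_len; i++) {
        if (zeros >= 2 && rbsp[i] <= 0x03) {
            /* 0x00 00 followed by 00/01/02/03: break the pattern with 0x03. */
            if (pos >= out_cap)
                return 0;
            out[pos++] = 0x03;
            zeros = 0;
        }
        if (pos >= out_cap)
            return 0;
        out[pos++] = rbsp[i];
        zeros = (rbsp[i] == 0x00) ? zeros + 1 : 0;
    }

    /* An RBSP ending in 0x00 (cabac_zero_word) gets a closing 0x03. */
    if (rbsp_len > 0 && rbsp[rbsp_len - 1] == 0x00) {
        if (pos >= out_cap)
            return 0;
        out[pos++] = 0x03;
    }
    return pos;
}
```

With this rule in place, the sequences 0x00 00 00, 0x00 00 01, and 0x00 00 02 can never appear inside a NALU, so a start code search over the byte stream only ever finds real NALU boundaries.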

More details about NALU can be found in the H.265 standard documentation.
