A video bitstream contains one or more coded video sequences (CVSs). Each CVS has its own SPS, and all of these SPSs reference the same VPS. The sequence parameter set (SPS) carries the coding parameters shared by all coded pictures in a CVS, and every PPS used in a CVS must reference the same SPS. Once an SPS is activated, it remains active until the end of the CVS.

The SPS contains the following categories of syntax elements:

  1. Picture format information. This includes the chroma sampling format, the picture resolution, the sample bit depth, and whether the decoded picture must be cropped for output, together with the cropping parameters.
  2. Coding parameter information. This includes the minimum and maximum sizes of coding blocks and transform blocks; the maximum transform block partition depth for inter and intra prediction; whether the three colour planes of the 4:4:4 format are coded separately; whether strong intra smoothing filtering is used; certain restrictions on inter prediction (such as whether asymmetric motion partitioning (AMP) and temporal MV prediction are used); whether quantization matrices are used; whether sample adaptive offset (SAO) is used; and whether PCM mode is enabled, together with the coding parameters for that mode.
  3. Reference picture information. This includes the configuration of the short-term reference pictures, the use and number of long-term reference pictures, the POC of each long-term reference picture, and whether it can be used as a reference picture for the current picture.
  4. Profile, Tier, and Level parameters.
  5. Temporal layering information. This includes the maximum number of temporal sub-layers, the parameter that controls the wrap-around (carry) of the transmitted POC, the temporal sub-layer ordering information flag, and the per-sub-layer parameters (such as the maximum DPB requirement).
  6. Video Usability Information (VUI), which carries additional information such as the video format.
  7. Other information. This includes the ID of the VPS referenced by the current SPS, the SPS ID, and the SPS extension information.

The following table is the syntax structure of SPS:

 

 

The following table is the syntax structure of the SPS extension:

 

The following table is the syntax structure of SPS in the SCC extension:

 

sps_video_parameter_set_id: specifies the ID of the currently active VPS.

sps_max_sub_layers_minus1: this value plus 1 specifies the maximum number of temporal sub-layers that may be present in a CVS referencing the current SPS. The value ranges from 0 to 6.

sps_temporal_id_nesting_flag: when sps_max_sub_layers_minus1 is greater than 0, specifies whether additional restrictions are imposed on inter prediction in the CVS. When vps_temporal_id_nesting_flag is 1, this syntax element must be 1. When sps_max_sub_layers_minus1 is 0, this syntax element must also be 1. The flag signals whether temporal sub-layer up-switching, that is, switching from a lower sub-layer to a higher sub-layer, is possible.
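As an illustration of the three elements above, here is a minimal Python sketch that splits one byte into them. It assumes the fields are coded as fixed-length u(4), u(3) and u(1) values in the first byte of the SPS payload (after the NAL unit header), which is my reading of the published syntax table; the function name `parse_sps_first_byte` is made up for this example.

```python
def parse_sps_first_byte(byte: int) -> dict:
    """Split one byte into the three leading SPS fields (assumed u(4), u(3), u(1))."""
    if not 0 <= byte <= 0xFF:
        raise ValueError("expected a single byte value")
    fields = {
        "sps_video_parameter_set_id": (byte >> 4) & 0x0F,  # top 4 bits
        "sps_max_sub_layers_minus1": (byte >> 1) & 0x07,   # next 3 bits
        "sps_temporal_id_nesting_flag": byte & 0x01,       # last bit
    }
    # Range check from the text: the value ranges from 0 to 6.
    if fields["sps_max_sub_layers_minus1"] > 6:
        raise ValueError("sps_max_sub_layers_minus1 must be in 0..6")
    # Constraint from the text: with a single sub-layer the flag must be 1.
    if (fields["sps_max_sub_layers_minus1"] == 0
            and fields["sps_temporal_id_nesting_flag"] != 1):
        raise ValueError("nesting flag must be 1 when there is one sub-layer")
    return fields


print(parse_sps_first_byte(0x01))
```

With the byte 0x01, the sketch reports VPS ID 0, a single temporal sub-layer and the nesting flag set to 1.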

sps_seq_parameter_set_id: specifies the ID of the SPS. The value ranges from 0 to 15.

chroma_format_idc: specifies the chroma sampling format. The value ranges from 0 to 3. For example, a value of 1 indicates the 4:2:0 format.

separate_colour_plane_flag: a value of 1 indicates that the three colour planes of the 4:4:4 format are coded separately; a value of 0 indicates that they are not. When the syntax element is not present, it is inferred to be 0.
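The relation between these two elements and the chroma subsampling factors can be sketched as a small lookup. The table values (SubWidthC, SubHeightC, ChromaArrayType) follow the HEVC specification as I understand it; the helper name `chroma_info` is an assumption of this example.

```python
# chroma_format_idc -> (format name, SubWidthC, SubHeightC)
CHROMA_FORMATS = {
    0: ("monochrome", 1, 1),
    1: ("4:2:0", 2, 2),
    2: ("4:2:2", 2, 1),
    3: ("4:4:4", 1, 1),
}


def chroma_info(chroma_format_idc: int, separate_colour_plane_flag: int = 0):
    """Return (name, SubWidthC, SubHeightC, ChromaArrayType)."""
    if chroma_format_idc not in CHROMA_FORMATS:
        raise ValueError("chroma_format_idc must be in 0..3")
    if separate_colour_plane_flag and chroma_format_idc != 3:
        raise ValueError("separate colour planes only apply to 4:4:4")
    name, sub_w, sub_h = CHROMA_FORMATS[chroma_format_idc]
    # Separately coded planes are each handled like a monochrome picture.
    chroma_array_type = 0 if separate_colour_plane_flag else chroma_format_idc
    return name, sub_w, sub_h, chroma_array_type


print(chroma_info(1))     # 4:2:0
print(chroma_info(3, 1))  # 4:4:4 with separately coded planes
```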

pic_width_in_luma_samples: specifies the width of the decoded picture in luma samples.

pic_height_in_luma_samples: specifies the height of the decoded picture in luma samples.

conformance_window_flag: indicates whether the decoded picture is to be cropped for output, that is, whether the cropping offsets below are present.

conf_win_left_offset, conf_win_right_offset, conf_win_top_offset and conf_win_bottom_offset: when conformance_window_flag is 1, these four parameters specify how much is cropped from the left, right, top, and bottom edges of the picture.
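A minimal sketch of how the output size follows from these offsets. It assumes, per my reading of the specification, that the offsets are expressed in units of SubWidthC (horizontally) and SubHeightC (vertically) luma samples, so they are 2 and 2 for 4:2:0. The function name `cropped_size` is hypothetical.

```python
def cropped_size(pic_width, pic_height, sub_width_c, sub_height_c,
                 left=0, right=0, top=0, bottom=0):
    """Return (width, height) of the cropped output picture in luma samples."""
    out_w = pic_width - sub_width_c * (left + right)
    out_h = pic_height - sub_height_c * (top + bottom)
    if out_w <= 0 or out_h <= 0:
        raise ValueError("cropping offsets remove the whole picture")
    return out_w, out_h


# A 1080p stream is typically coded as 1920x1088 so that the height is a
# multiple of the minimum coding block size; 8 luma rows are cropped away.
print(cropped_size(1920, 1088, 2, 2, bottom=4))
```

For 4:2:0 video, a conf_win_bottom_offset of 4 removes 8 luma rows, which turns the coded height 1088 into the displayed height 1080.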

bit_depth_luma_minus8: this value plus 8 specifies the bit depth of the luma samples.

bit_depth_chroma_minus8: this value plus 8 specifies the bit depth of the chroma samples.
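The "minus8" convention is easy to misread, so here is a short sketch of the derivation. The QP range offset (6 times the minus8 value) is taken from the specification as I recall it and is not stated in the text above; the function name is made up for this example.

```python
def bit_depth_info(bit_depth_minus8: int):
    """Return (bit depth, QP range offset, maximum sample value)."""
    if bit_depth_minus8 < 0:
        raise ValueError("bit_depth_minus8 must be non-negative")
    bit_depth = 8 + bit_depth_minus8
    qp_bd_offset = 6 * bit_depth_minus8   # assumption: QpBdOffset derivation
    max_sample = (1 << bit_depth) - 1     # largest representable sample value
    return bit_depth, qp_bd_offset, max_sample


print(bit_depth_info(0))  # 8-bit video
print(bit_depth_info(2))  # 10-bit video
```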

log2_max_pic_order_cnt_lsb_minus4: ranges from 0 to 12 and is used to derive the variable MaxPicOrderCntLsb, which controls the POC wrap-around (carry). Only the low-order bits (LSBs) of the POC are transmitted in the bitstream; the high-order bits (MSBs) are not. The POC MSBs of the current picture are derived from the POC MSBs and LSBs of the previous reference picture together with MaxPicOrderCntLsb, and the actual POC of the current picture is then obtained by adding the POC LSBs of the current picture.

MaxPicOrderCntLsb = 2^(log2_max_pic_order_cnt_lsb_minus4 + 4)
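The POC reconstruction described above can be sketched as follows. The wrap-around test (comparing the LSB difference against half of MaxPicOrderCntLsb) follows the decoding process in the specification as I understand it; the function and parameter names are my own.

```python
def derive_poc(poc_lsb: int, prev_poc_lsb: int, prev_poc_msb: int,
               log2_max_pic_order_cnt_lsb_minus4: int) -> int:
    """Rebuild the full POC from the transmitted LSBs and the previous reference picture."""
    max_lsb = 1 << (log2_max_pic_order_cnt_lsb_minus4 + 4)  # MaxPicOrderCntLsb
    half = max_lsb // 2
    if poc_lsb < prev_poc_lsb and (prev_poc_lsb - poc_lsb) >= half:
        poc_msb = prev_poc_msb + max_lsb   # LSBs wrapped forward: carry
    elif poc_lsb > prev_poc_lsb and (poc_lsb - prev_poc_lsb) > half:
        poc_msb = prev_poc_msb - max_lsb   # LSBs wrapped backward: borrow
    else:
        poc_msb = prev_poc_msb             # no wrap-around
    return poc_msb + poc_lsb


# With log2_max_pic_order_cnt_lsb_minus4 = 0, MaxPicOrderCntLsb is 16.
print(derive_poc(poc_lsb=2, prev_poc_lsb=14, prev_poc_msb=0,
                 log2_max_pic_order_cnt_lsb_minus4=0))
```

In the usage line, the LSBs drop from 14 to 2, which is interpreted as a carry, so the reconstructed POC is 16 + 2 = 18.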

sps_sub_layer_ordering_info_present_flag: the temporal sub-layer ordering information flag. A value of 1 indicates that sps_max_dec_pic_buffering_minus1[i], sps_max_num_reorder_pics[i] and sps_max_latency_increase_plus1[i] are signalled for each of the sps_max_sub_layers_minus1 + 1 sub-layers; a value of 0 indicates that a single set of these parameters applies to all sub-layers.

sps_max_dec_pic_buffering_minus1[i]: this value plus 1 specifies the maximum required DPB size when HighestTid = i.

sps_max_num_reorder_pics[i]: when HighestTid = i, specifies the maximum number of pictures that may precede a picture in decoding order and follow it in output order. The value ranges from 0 to sps_max_dec_pic_buffering_minus1[i].

sps_max_latency_increase_plus1[i]: when this value is not 0, it is used to compute the value of SpsMaxLatencyPictures[i].

SpsMaxLatencyPictures[i] = sps_max_num_reorder_pics[i] + sps_max_latency_increase_plus1[i] − 1
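A small sketch of the latency formula for one sub-layer. Returning None for the value 0 reflects my understanding that no latency limit is expressed in that case; the function name is hypothetical.

```python
def sps_max_latency_pictures(sps_max_num_reorder_pics: int,
                             sps_max_latency_increase_plus1: int):
    """Return SpsMaxLatencyPictures for one sub-layer, or None if no limit is signalled."""
    if sps_max_latency_increase_plus1 == 0:
        return None  # assumption: 0 means "no limit expressed"
    return sps_max_num_reorder_pics + sps_max_latency_increase_plus1 - 1


print(sps_max_latency_pictures(2, 3))
print(sps_max_latency_pictures(2, 0))
```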

log2_min_luma_coding_block_size_minus3: this value plus 3 specifies the base-2 logarithm of the minimum luma coding block size.

log2_diff_max_min_luma_coding_block_size: specifies the difference between the base-2 logarithms of the maximum and minimum luma coding block sizes.

log2_min_luma_transform_block_size_minus2: this value plus 2 specifies the base-2 logarithm of the minimum luma transform block size.

log2_diff_max_min_luma_transform_block_size: specifies the difference between the base-2 logarithms of the maximum and minimum luma transform block sizes.

max_transform_hierarchy_depth_inter: specifies the maximum partition depth of transform blocks for inter prediction. The value ranges from 0 to CtbLog2SizeY − Log2MinTrafoSize.

max_transform_hierarchy_depth_intra: specifies the maximum partition depth of transform blocks for intra prediction. The value ranges from 0 to CtbLog2SizeY − Log2MinTrafoSize.
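Because these elements are coded as logarithms and differences, a worked derivation helps. The sketch below assumes the usual derivation (minimum log2 size plus the signalled difference gives the maximum log2 size); `min_tb_log2` corresponds to what the text calls Log2MinTrafoSize, and the function name and dictionary keys are my own.

```python
def derive_block_sizes(log2_min_luma_coding_block_size_minus3: int,
                       log2_diff_max_min_luma_coding_block_size: int,
                       log2_min_luma_transform_block_size_minus2: int,
                       log2_diff_max_min_luma_transform_block_size: int) -> dict:
    """Derive coding/transform block sizes and the upper bound on the transform depth."""
    min_cb_log2 = log2_min_luma_coding_block_size_minus3 + 3
    ctb_log2 = min_cb_log2 + log2_diff_max_min_luma_coding_block_size  # CtbLog2SizeY
    min_tb_log2 = log2_min_luma_transform_block_size_minus2 + 2
    max_tb_log2 = min_tb_log2 + log2_diff_max_min_luma_transform_block_size
    return {
        "min_cb_size": 1 << min_cb_log2,
        "ctb_size": 1 << ctb_log2,
        "min_tb_size": 1 << min_tb_log2,
        "max_tb_size": 1 << max_tb_log2,
        # Upper bound for max_transform_hierarchy_depth_inter / _intra.
        "max_hierarchy_depth_limit": ctb_log2 - min_tb_log2,
    }


print(derive_block_sizes(0, 3, 0, 3))
```

With the values (0, 3, 0, 3), coding blocks range from 8x8 to 64x64, transform blocks from 4x4 to 32x32, and the transform hierarchy depth may be at most 4.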

scaling_list_enabled_flag: indicates whether quantization matrices (scaling lists) are used during quantization.

sps_scaling_list_data_present_flag: indicates whether quantization matrix data is present in the SPS.

amp_enabled_flag: indicates whether asymmetric motion partitioning (AMP) is used.

sample_adaptive_offset_enabled_flag: indicates whether SAO is applied after deblocking filtering.

pcm_enabled_flag: indicates whether PCM mode is enabled.

pcm_sample_bit_depth_luma_minus1: this value plus 1 specifies the bit depth of PCM samples of the luma component.

pcm_sample_bit_depth_chroma_minus1: this value plus 1 specifies the bit depth of PCM samples of the chroma components.

log2_min_pcm_luma_coding_block_size_minus3: this value plus 3 specifies the base-2 logarithm of the minimum coding block size in PCM mode.

log2_diff_max_min_pcm_luma_coding_block_size: specifies the difference between the base-2 logarithms of the maximum and minimum coding block sizes in PCM mode.

pcm_loop_filter_disabled_flag: indicates whether loop filtering is disabled for the reconstructed samples of coding units in PCM mode.
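The PCM parameters follow the same "minus N" and log2-difference pattern as the block size elements. A minimal sketch of the derivation, with function and key names chosen for this example only:

```python
def derive_pcm_params(pcm_sample_bit_depth_luma_minus1: int,
                      pcm_sample_bit_depth_chroma_minus1: int,
                      log2_min_pcm_luma_coding_block_size_minus3: int,
                      log2_diff_max_min_pcm_luma_coding_block_size: int) -> dict:
    """Derive PCM sample bit depths and the PCM coding block size range."""
    min_log2 = log2_min_pcm_luma_coding_block_size_minus3 + 3
    max_log2 = min_log2 + log2_diff_max_min_pcm_luma_coding_block_size
    return {
        "pcm_bit_depth_luma": pcm_sample_bit_depth_luma_minus1 + 1,
        "pcm_bit_depth_chroma": pcm_sample_bit_depth_chroma_minus1 + 1,
        "min_pcm_cb_size": 1 << min_log2,
        "max_pcm_cb_size": 1 << max_log2,
    }


print(derive_pcm_params(7, 7, 0, 2))
```

The usage line describes 8-bit PCM samples in coding blocks between 8x8 and 32x32.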

num_short_term_ref_pic_sets: specifies the number of short_term_ref_pic_set() syntax structures in the SPS. The value ranges from 0 to 64.

long_term_ref_pics_present_flag: indicates whether long-term reference pictures are used for inter prediction.

num_long_term_ref_pics_sps: specifies the number of long-term reference pictures signalled in the SPS. The value ranges from 0 to 32.

lt_ref_pic_poc_lsb_sps[i]: specifies the POC of the i-th long-term reference picture in the SPS, modulo MaxPicOrderCntLsb.

used_by_curr_pic_lt_sps_flag[i]: indicates whether the i-th long-term reference picture can be used as a reference picture for the current picture.
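To make the "modulo MaxPicOrderCntLsb" wording concrete, the sketch below reduces a full POC to the LSB value that would be stored for a long-term reference picture. The helper name is hypothetical, and the handling of negative POC values relies on Python's modulo always returning a non-negative result for a positive modulus.

```python
def poc_to_lsb(poc: int, log2_max_pic_order_cnt_lsb_minus4: int) -> int:
    """Reduce a full POC to its LSB part, i.e. POC modulo MaxPicOrderCntLsb."""
    max_lsb = 1 << (log2_max_pic_order_cnt_lsb_minus4 + 4)  # MaxPicOrderCntLsb
    return poc % max_lsb


# With log2_max_pic_order_cnt_lsb_minus4 = 4, MaxPicOrderCntLsb is 256.
print(poc_to_lsb(300, 4))
```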

sps_temporal_mvp_enabled_flag: specifies whether the syntax element slice_temporal_mvp_enabled_flag is present in the slice headers of non-IDR pictures. That element in turn indicates whether temporal MV prediction can be used in inter prediction.

strong_intra_smoothing_enabled_flag: indicates whether bilinear interpolation is used when filtering the intra reference samples.

vui_parameters_present_flag: indicates whether the vui_parameters() syntax structure is present.

sps_extension_present_flag: a value of 0 indicates that no SPS extension syntax is present, so the syntax element sps_extension_data_flag does not appear; a value of 1 indicates that the extension flags described next are present.

sps_range_extension_flag: a value of 1 indicates that the syntax structure sps_range_extension() exists.

sps_multilayer_extension_flag: a value of 1 indicates that the syntax structure sps_multilayer_extension() exists.

sps_3d_extension_flag: a value of 1 indicates that the syntax structure sps_3d_extension() exists.

sps_scc_extension_flag: a value of 1 indicates that the syntax structure sps_scc_extension() exists.

sps_extension_4bits: must be 0 in bitstreams conforming to the current version of the standard.

sps_extension_data_flag: can take any value and is ignored by decoders conforming to the current version of the standard.
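As a rough illustration of how the extension flags sit next to each other, the sketch below unpacks eight consecutive bits into the four one-bit flags followed by sps_extension_4bits, in the order they are listed above. Passing the bits as a single integer is a simplification for this example, since in a real bitstream they are generally not byte-aligned; the function name is hypothetical.

```python
def split_extension_bits(bits8: int) -> dict:
    """Unpack 8 consecutive bits: four 1-bit extension flags, then sps_extension_4bits."""
    if not 0 <= bits8 <= 0xFF:
        raise ValueError("expected an 8-bit value")
    return {
        "sps_range_extension_flag": (bits8 >> 7) & 1,
        "sps_multilayer_extension_flag": (bits8 >> 6) & 1,
        "sps_3d_extension_flag": (bits8 >> 5) & 1,
        "sps_scc_extension_flag": (bits8 >> 4) & 1,
        "sps_extension_4bits": bits8 & 0x0F,
    }


print(split_extension_bits(0b10010000))
```

The usage line describes a stream that carries both the range extension and the SCC extension.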
