Development background

Many developers have asked the same questions: when pushing the screen or camera over RTMP or RTSP on Windows, which functions are necessary, which are optional, and how should the technical solution be chosen? The following design and usage notes, based on our Windows RTSP/RTMP live push module, are provided for reference.

Overall scheme framework

The Windows RTMP or RTSP push module is the capture end. It captures and encodes screen or camera data and microphone or speaker data, packages them in the required format, and transmits them over RTMP or RTSP for live broadcasting.

It corresponds to the “publisher” in the design architecture diagram: the encoded audio and video data is packaged according to the protocol and pushed to a streaming media server. For RTMP, a self-built service such as SRS or Nginx can be considered; for RTSP, Apple's Darwin Streaming Server can be considered.

The scheme is generally a one-to-many model. The receiver pulls the RTMP or RTSP stream, parses the audio and video data, decodes it, synchronizes audio and video, and renders it, which completes the overall live broadcast solution.

The following is the design architecture diagram:

Module design

  • Self-developed framework that is easy to extend, with adaptive algorithms for lower latency and more efficient capture and encoding;
  • All functions are provided as interfaces, all states have event callbacks, and automatic reconnection is supported when the network is disconnected;
  • Modular design: it can be combined with the Daniu Live RTSP or RTMP playback module for scenarios such as stream forwarding, co-hosting (mic linking), and one-to-one interaction;
  • Push overlays are provided as layers, so developers can combine data sources themselves (such as multiple cameras, screen, and watermark overlays);
  • Supports external YUV/RGB/H.264/AAC/SPEEX/PCMA/PCMU data input;
  • All parameters can be set individually through the SDK interface, or left at their defaults for a zero-configuration setup;
  • The push, recording, and built-in lightweight RTSP service modules are fully separated and can be used alone or in combination.

Functional design

  • [Local preview] Supports real-time preview of camera, screen, and composited data;
  • [Camera flip/rotation] Supports horizontal flip, vertical flip, and 0°/90°/180°/270° rotation of the camera image;
  • [Camera capture] In addition to the conventional YUV format, MJPEG camera capture is also supported;
  • [RTMP push] Ultra-low-latency RTMP live push SDK (on Windows, specific models support hardware-encoded H.265 push through the RTMP H.265 extension);
  • Windows supports H.264/H.265 encoding;
  • [Audio format] Supports AAC and Speex encoding;
  • [Audio encoding] Supports Speex push and Speex encoding quality settings;
  • [Hardware and software encoding parameters] Supports GOP interval, frame rate, and bit rate settings;
  • [Software encoding parameters] Supports software encoding profile, encoding speed, and variable bit rate settings;
  • [Multi-instance push] Supports multi-instance push (for example, pushing screen/camera and external data simultaneously);
  • [RTMP extension H.265] The Windows/Android push SDK supports RTMP extension H.265 push. On Windows, camera capture uses software encoding with H.265 variable bit rate, which saves bandwidth and gives an effect close to that of a traditional H.265 encoding camera;
  • [Multi-resolution support] Supports multiple camera or screen resolution settings;
  • [Windows screen push] Supports push modes such as screen clipping, window capture, and screen/camera composition;
  • [Event callback] Supports real-time callbacks for the various states;
  • Windows supports text watermarks, PNG watermarks, and real-time masking;
  • [Complex network handling] Automatically adapts to various network conditions, including disconnection and reconnection;
  • [Dynamic bit rate] Supports automatic adjustment of the stream bit rate according to network conditions;
  • [Real-time mute] Supports real-time mute/unmute during push;
  • [Real-time snapshot] Supports real-time snapshots during push;
  • [Audio-only push] Supports capturing and pushing audio only;
  • [Video-only push] Supports video-only push for special scenarios;
  • [Noise reduction] Supports suppression of noise caused by ambient sound and mobile phone interference, automatic gain control, and VAD detection;
  • [External video data before encoding] Supports YUV data input;
  • [External audio data before encoding] Supports PCM data input;
  • [External video data after encoding] Supports external H.264 data input;
  • [External audio data after encoding] Supports external AAC/PCMA/PCMU/SPEEX data input;
  • [Extended recording function] Works fully in combination with the recording SDK;
  • [Server compatibility] Supports self-built servers (such as Nginx and SRS) or CDN.

Integration and usage instructions

Demo description

  • The Windows RTMP/RTSP live push module provides C++ and C# interfaces and 32-bit and 64-bit libraries. The C++ and C# interfaces correspond one to one; compared with the C++ interface, the C# interface adds the prefix NT_PB_.
  • Win-publishersdk-cpp-Demo: demo of the C++ interface of the push SDK;
  • Win-publishersdk-csharp-demo: demo of the C# interface of the push SDK;
  • The push module supports Windows 7 and above.
  • The demos were developed with VS2013.

C++ header files:

  • nt_type_define.h
  • [definition Log] smart_log.h
  • [definition Log] smart_log_define.h
  • nt_common_media_define.h
  • [Base code definition] nt_base_code_define.h
  • [publisher interface] nt_smart_publisher_define.h
  • [publisher interface] nt_smart_publisher_sdk.h

C# interface files:

  • [definition Log] smart_log.cs
  • [definition Log] smart_log_define.cs
  • [Base code definition] nt_base_code_define.cs
  • [publisher interface] nt_smart_publisher_define.cs
  • [Publisher parameter definition] nt_smart_publisher_sdk.cs

Related Lib:

  • SmartLog.dll
  • SmartLog.lib
  • SmartPublisherSDK.dll
  • SmartPublisherSDK.lib
  • avcodec-56.dll
  • avdevice-56.dll
  • avfilter-5.dll
  • avformat-56.dll
  • avutil-54.dll
  • postproc-53.dll
  • swresample-1.dll
  • swscale-3.dll

Integration steps

  1. Copy the debug/release libraries from the lib directory to the corresponding Debug or Release directory of the project to be integrated (make sure that the 32-bit and 64-bit debug/release directories match one to one).

The lib directory is as follows:

    1. 32-bit Debug library: Debug
    2. 32-bit release library: release
    3. 64-bit debug library: x64\debug
    4. 64-bit release library: x64\release

2. Add the related .cs files to the project to be integrated;

3. Right-click the project to be integrated, choose Properties > Application > Assembly Name, and enter SmartPulisherDemo.

Function description

Because the Windows push SDK has relatively complex functions, they are described below in Q&A format:

1 Video capture settings

1. Switch between screen and camera: for online education or paperless scenarios, the data source (screen or camera) can be switched at any time during push or recording. To switch in real time, click the “Switch to camera” button on the page;

2. Set a cover layer: sets a rectangular or square area (the size can be specified) that covers the part that should not be shown to the user;

3. Watermark: a PNG watermark can be added or removed at any time during push or recording;

4. Overlay camera on screen: intended for screen sharing, where the presenter's camera floats over the screen (the overlay coordinates can be specified) so that both are shown at once. The camera overlay can be removed at any time during push or recording;

5. Overlay screen on camera: same as 4, with the roles of screen and camera swapped;

6. Capture desktop: click “Select Screen Area” to choose the capture area. The position of the area can be changed at any time during capture. If no area is set, the full screen is captured by default;

7. Use DXGI to capture the screen, and disable Aero during capture;

8. Capture window: set the window to capture. If the window is enlarged or shrunk, the push side adapts the bit rate and resolution;

9. Capture frame rate (frames/second): the default screen capture rate is 8 frames, which can be changed to the expected frame rate according to the actual scenario;

10. Screen scaling ratio: for HD or Ultra HD screens, set a scale factor to scale the screen capture resolution;

11. Camera capture: select the camera to capture, the capture resolution and frame rate, and whether horizontal flip, vertical flip, or rotation is required.
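As an illustration of the screen scaling ratio in item 10, here is a minimal C++ sketch of how a scale factor maps a screen resolution to a capture resolution. The `Size` struct and `scaled_capture_size` function are hypothetical and not part of the SDK; rounding down to even dimensions is an assumption made here because 4:2:0 video encoders generally expect even width and height.

```cpp
#include <cassert>

struct Size {
    int width;
    int height;
};

// Hypothetical helper, not an SDK call: applies a scale factor in (0, 1]
// to a screen resolution and rounds each dimension down to an even number.
Size scaled_capture_size(Size screen, double scale) {
    assert(scale > 0.0 && scale <= 1.0);
    int w = static_cast<int>(screen.width * scale);
    int h = static_cast<int>(screen.height * scale);
    w -= w % 2;  // keep the width even (assumption, see lead-in)
    h -= h % 2;  // keep the height even
    return Size{w, h};
}
```

For example, a 3840x2160 screen with a factor of 0.5 would be captured at 1920x1080 under these assumptions.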

Additional questions:

Question [confirm data source]: Capture the desktop or the camera? If the desktop, the full screen or a partial area?

Answer:

For the camera: select a camera from the list, then the resolution and frame rate.

For the screen: the default frame rate is 5 frames, which can be adjusted according to the actual scenario. Click to select the screen area; the area to capture or record can be changed in real time.

In overlay mode, you can choose to overlay the camera on the screen or the screen on the camera.

Users with higher requirements can set a watermark or an application-layer cover.

Q: What if the source is a camera and the camera angle is not correct?

A: We support camera mirroring and flip settings. The camera image can easily be flipped horizontally or vertically through the SDK interface.

2 Video bit rate control

Question: Should I choose variable bit rate or average bit rate?

Answer: The advantage of variable bit rate is that the bit rate stays low when the screen or camera image changes little, especially with H.265 encoding. With average bit rate, the bit rate is more even, and both the average and maximum bit rates need to be set. In general, variable bit rate is advised for camera capture and average bit rate for screen capture. To use variable bit rate, uncheck the “using the average bit rate” option.

Question: H.265 or H.264 encoding?

Answer: The Windows 64-bit library supports H.265 encoding. To push an H.265 RTMP stream, the server must support the RTMP H.265 extension, and the player SDK must also support playback of the RTMP H.265 extension.

If the lightweight RTSP service SDK is used, only the player needs to support RTSP H.265.

When pushing camera data, variable bit rate + H.265 encoding is recommended.

Question: How should the bit rate parameters be set?

Answer:

Key frame interval: generally set it to 2-4 times the frame rate; for example, with a frame rate of 20, set it to 40-80;

Average bit rate: click “Get the default video bit rate”; the maximum bit rate is twice the average bit rate;

Video quality: if variable bit rate is used, the default video quality value recommended by the Daniu Live SDK is advised;

Encoding speed: for high resolutions, 1-3 is recommended. The smaller the value, the faster the encoding;

H.264 profile: the baseline profile is the default; the high profile can be set as required.

NOTE: Before clicking “Push” or “Record”, or starting the built-in RTSP service SDK, be sure to set the video bit rate. If you do not want to set it manually, click “Get the default video bit rate”.
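The rules of thumb in this section can be written down as a small C++ sketch. Everything here (`Source`, `EncoderHints`, `recommend`) is a hypothetical helper for illustration, not an SDK interface; it only encodes the guidance given above: variable bit rate for camera, average bit rate for screen, a key frame interval of 2-4 times the frame rate, and a maximum bit rate of twice the average.

```cpp
#include <cassert>

enum class Source { Camera, Screen };

struct EncoderHints {
    bool use_variable_bitrate;  // false means average bit rate
    int gop_min;                // key frame interval, lower bound (frames)
    int gop_max;                // key frame interval, upper bound (frames)
    int max_bitrate_kbps;       // relevant when average bit rate is used
};

// Hypothetical helper: turns the recommendations above into numbers.
EncoderHints recommend(Source source, int fps, int avg_bitrate_kbps) {
    assert(fps > 0 && avg_bitrate_kbps > 0);
    EncoderHints h;
    h.use_variable_bitrate = (source == Source::Camera);  // camera: VBR, screen: average
    h.gop_min = 2 * fps;                                  // 2x the frame rate
    h.gop_max = 4 * fps;                                  // 4x the frame rate
    h.max_bitrate_kbps = 2 * avg_bitrate_kbps;            // maximum = 2x average
    return h;
}
```

With a camera at 20 frames per second, this yields a key frame interval between 40 and 80, matching the example in the answer above.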

3 Audio capture settings

Question: Capture audio? If so, capture the microphone, the speaker, or a mix of both?

Answer:

To capture the computer's output audio (such as music), choose “capture speaker”;

To capture microphone audio, select “Collect Microphone” and select the related device;

To capture both, select both options and the output is mixed.

4 Audio encoding

Question: AAC or SPEEX?

A: The default is AAC. If a lower bit rate is needed, choose SPEEX, although our AAC bit rate is not high either.

5 Audio processing

Question: What if I want to filter background noise?

Answer: Select Noise Suppression, preferably in combination with Automatic Gain Control. Voice activity detection (VAD) is optional.

Question: What if I want to do one-on-one interaction?

Answer: Select Echo Cancellation, which can be used in combination with Noise Suppression and Automatic Gain Control.

Question: How do I mute the push or recording at any time?

Answer: Select or unselect the “mute” function at any time during push.

6 Multi-channel push

Question: What if I want to push to multiple URLs at the same time (for example, one intranet server and one external server)?

Answer: Fill in multiple URLs and click push.

7 Screenshot (snapshot)

Question: What if I want to capture the current image during push or recording?

Answer: Set the screenshot path first, then click “screenshot” at any time during push or recording.

8 Recording

Q: What if I also want to record?

Answer: Set the recording directory, the file prefix, the maximum size of a single file, and whether to add the date and time; recording can then be started at any time. The SDK also supports pausing and resuming during recording.

9 Real-time preview

Question: I also want to see the video, especially the composited result. How?

Answer: Click the “preview” button on the page.

Interface call sequence (C# as an example)

To download the demo source code, go to the GitHub project “Windows platform RTMP | RTSP push SDK, built-in RTSP service SDK, recording SDK”, which contains the C++ and C# demos.

1 Initialization

NT_PB_Init

To configure the log path, do the following before NT_PB_Init (the directory can be chosen freely):

// Set the log path (make sure the directory exists)
// String log_path = "D:\\pulisherlog";
// NTSmartLog.NT_SL_SetPath(log_path);

2 Open

NT_PB_Open

3 Set the callback event

  • NT_PB_SetEventCallBack: Sets the event callback. To listen for events, it is recommended to call this interface after Open succeeds
  • NT_PB_SetVideoPacketTimestampCallBack: Sets the video packet timestamp callback
  • NT_PB_SetPublisherStatusCallBack: Sets the push status callback

4 Set screen clipping

  • NT_PB_SetScreenClip: Sets screen clipping
  • NT_PB_MoveScreenClipRegion: Moves the screen clipping area. This interface can only be called during push or recording

5 Screen selection tool

  • NT_PB_OpenScreenRegionChooseTool: Opens the screen region selection tool
  • NT_PB_MoveScreenClipRegion: Moves the screen clipping area. This interface can only be called during push or recording
  • NT_PB_AllocateImage: Allocates an Image. After allocation, the SDK initializes the structure internally. Returns NULL on failure
  • NT_PB_FreeImage: Frees an Image. Be sure to call this interface to free the memory; freeing it in your own module causes problems on Windows
  • NT_PB_CloneImage: Clones an Image. Returns NULL on failure
  • NT_PB_CopyImage: Copies an Image. The resources of dst are released first, and then the Image is copied
  • NT_PB_SetImagePlane: Sets the data for one plane of the image. If the plane already has data, it is released and set again
  • NT_PB_LoadImage: Loads a PNG image

6 Set screen capture parameters

  • NT_PB_EnableDXGIScreenCapturer: Enables DXGI screen capture, which requires Windows 8 or later
  • NT_PB_DisableAeroScreenCapturer: Disables Aero during screen capture. This only has an effect on Windows 7; on Windows 8 and above, Microsoft has abandoned the Aero Glass effect
  • NT_PB_CheckCapturerWindow: Checks whether a top-level window can be captured, and returns NT_ERC_FAILED if it cannot
  • NT_PB_SetCaptureWindow: Sets the handle of the window to capture

7 Set camera capture parameters

  • NT_PB_StartGetVideoCaptureDeviceImage: Starts getting device images and returns a handle; save the handle
  • NT_PB_FlipVerticalVideoCaptureDeviceImage: Flips the device image vertically
  • NT_PB_FlipHorizontalVideoCaptureDeviceImage: Flips the device image horizontally
  • NT_PB_RotateVideoCaptureDeviceImage: Rotates the device image clockwise
  • NT_PB_GetVideoCaptureDeviceNumber: Gets the number of cameras
  • NT_PB_GetVideoCaptureDeviceInfo: Returns camera device information
  • NT_PB_GetVideoCaptureDeviceCapabilityNumber: Returns the number of camera capabilities
  • NT_PB_GetVideoCaptureDeviceCapability: Returns a camera capability
  • NT_PB_DisableVideoCaptureResolutionSetting: Disables the resolution setting capability of an instance.

When multiple instances push multiple channels from one camera, all instances share that camera, so only one instance can change the camera resolution and the other instances use a scaled image.

When using multiple instances, calling this interface disables the resolution setting capability of the calling instance. Only one instance may change the resolution; if this is not set, the behavior is undefined.

This interface must be called before SetLayersConfig and AddLayerConfig.

  • NT_PB_StartVideoCaptureDevicePreview: Starts the camera preview
  • NT_PB_FlipVerticalCameraPreview: Flips the camera preview image vertically
  • NT_PB_FlipHorizontalCameraPreview: Flips the camera preview image horizontally
  • NT_PB_RotateCameraPreview: Rotates the camera preview image clockwise
  • NT_PB_VideoCaptureDevicePreviewWindowSizeChanged: Tells the SDK that the preview window size has changed
  • NT_PB_StopVideoCaptureDevicePreview: Stops the camera preview
  • NT_PB_GetVideoCaptureDeviceImage: Gets a camera image
  • NT_PB_StopGetVideoCaptureDeviceImage: Stops getting camera images
  • NT_PB_SetVideoCaptureDeviceBaseParameter: Sets the camera information
  • NT_PB_FlipVerticalCamera: Flips the camera image vertically
  • NT_PB_FlipHorizontalCamera: Flips the camera image horizontally
  • NT_PB_RotateCamera: Rotates the camera image clockwise

8 Video compositing layer type

        public enum NT_PB_E_LAYER_TYPE : int
        {
            NT_PB_E_LAYER_TYPE_SCREEN = 1,               // Screen layer
            NT_PB_E_LAYER_TYPE_CAMERA = 2,               // Camera layer
            NT_PB_E_LAYER_TYPE_RGBA_RECTANGLE = 3,       // RGBA rectangle
            NT_PB_E_LAYER_TYPE_IMAGE = 4,                // Image layer
            NT_PB_E_LAYER_TYPE_EXTERNAL_VIDEO_FRAME = 5, // External video data layer
            NT_PB_E_LAYER_TYPE_WINDOW = 6,               // Window layer
        }

9 Audio and video source types

        /* Define the video source option */
        public enum NT_PB_E_VIDEO_OPTION : uint
        {
            NT_PB_E_VIDEO_OPTION_NO_VIDEO = 0x0,
            NT_PB_E_VIDEO_OPTION_SCREEN = 0x1,       // Capture screen
            NT_PB_E_VIDEO_OPTION_CAMERA = 0x2,       // Capture camera
            NT_PB_E_VIDEO_OPTION_LAYER = 0x3,        // Video composition, such as desktop with camera overlay
            NT_PB_E_VIDEO_OPTION_ENCODED_DATA = 0x4, // Already encoded video data, currently supports H264
            NT_PB_E_VIDEO_OPTION_WINDOW = 0x5,       // Capture window
        }

        /* Define the audio source option */
        public enum NT_PB_E_AUDIO_OPTION : uint
        {
            NT_PB_E_AUDIO_OPTION_NO_AUDIO = 0x0,
            NT_PB_E_AUDIO_OPTION_CAPTURE_MIC = 0x1,               // Capture microphone audio
            NT_PB_E_AUDIO_OPTION_CAPTURE_SPEAKER = 0x2,           // Capture speaker audio
            NT_PB_E_AUDIO_OPTION_CAPTURE_MIC_SPEAKER_MIXER = 0x3, // Mix microphone and speaker
            NT_PB_E_AUDIO_OPTION_ENCODED_DATA = 0x4,              // Already encoded audio data, currently supports AAC and Speex wideband mode
        }
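To connect the audio source options above with the microphone/speaker choice from the Q&A section, here is a minimal C++ sketch. The numeric values are copied from NT_PB_E_AUDIO_OPTION; the enum and constant names and the `choose_audio_option` helper are hypothetical local mirrors for illustration, not the SDK's C++ declarations.

```cpp
#include <cstdint>

// Local mirror of the option values listed above (names are illustrative).
enum AudioOption : std::uint32_t {
    AUDIO_OPTION_NO_AUDIO = 0x0,
    AUDIO_OPTION_CAPTURE_MIC = 0x1,
    AUDIO_OPTION_CAPTURE_SPEAKER = 0x2,
    AUDIO_OPTION_CAPTURE_MIC_SPEAKER_MIXER = 0x3,
};

// Hypothetical helper: maps two UI check boxes to an audio source option.
AudioOption choose_audio_option(bool capture_mic, bool capture_speaker) {
    if (capture_mic && capture_speaker) {
        return AUDIO_OPTION_CAPTURE_MIC_SPEAKER_MIXER;  // both selected: mix
    }
    if (capture_mic) {
        return AUDIO_OPTION_CAPTURE_MIC;
    }
    if (capture_speaker) {
        return AUDIO_OPTION_CAPTURE_SPEAKER;
    }
    return AUDIO_OPTION_NO_AUDIO;  // neither selected: video-only push
}
```

The already-encoded option (0x4) is left out because it applies to external data input rather than device capture.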

10 Video encoding interface

  • NT_PB_SetVideoEncoderType: Sets the encoding type. H264 and H265 are currently supported (note: H265 is supported only by the 64-bit SDK library; setting it on the 32-bit library fails);
  • NT_PB_SetVideoQuality: Sets the video quality. The value ranges from 0 to 20. The default value is 10
  • NT_PB_SetVideoQualityV2: Sets the video quality. The value ranges from 1 to 50. The smaller the value, the better the video quality, but the higher the bit rate. The default value is recommended;
  • NT_PB_SetFrameRate: Sets the frame rate
  • NT_PB_SetVideoMaxBitRate: Sets the maximum video bit rate, in kbps
  • NT_PB_AddVideoEncoderBitrateGroupItem:

* In some special scenarios the video resolution changes. With a fixed bit rate, the picture becomes blurred when the resolution increases, and bit rate is wasted when the resolution decreases

* This interface therefore allows a group of bit rates to be set, to cover switching between different resolutions

* Rule: for example, if two groups are set for 640*360 and 640*480, the 640*360 bit rate is used when the resolution is less than or equal to 640*360

* If the resolution is greater than 640*360 and less than or equal to 640*480, the 640*480 bit rate is used. If the resolution is greater than 640*480, the 640*480 bit rate is also used

* For more accurate settings, it is recommended to define several groups so that the intervals are smaller

* Each call to this interface sets one group; to set multiple groups, call it multiple times

* The item corresponds to NT_PB_VideoEncoderBitrateGroupItem

  • NT_PB_ClearVideoEncoderBitrateGroup: Clears the video bit rate group
  • NT_PB_SetVideoKeyFrameInterval: Sets the key frame interval. For example, 1 means every frame is a key frame, 10 means one key frame in every 10 frames, and 25 means one key frame in every 25 frames
  • NT_PB_SetVideoEncoderProfile: Sets the H264 profile. 1: H264 baseline (default value). 2: H264 main. 3: H264 high
  • NT_PB_SetVideoEncoderSpeed: Sets the H264 encoding speed. The value ranges from 1 to 6. A smaller value means higher speed and lower quality
  • NT_PB_SetVideoCompareSameImage: Sets whether to compare consecutive images for equality. This generally helps with desktop capture and may reduce the bit rate
  • NT_PB_SetVideoMaxKeyFrameInterval: Sets the maximum key frame interval of the video. This interface is generally not needed; it is used together with NT_PB_SetVideoCompareSameImage. For example, with image comparison enabled, the SDK may find that the images are identical for 20 consecutive seconds, but a player must receive a key frame before it can decode and play, so an upper limit is required
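The bit rate group rule described for NT_PB_AddVideoEncoderBitrateGroupItem can be sketched in C++ as follows. This is only an illustration of the stated rule, not the SDK's implementation: `BitrateGroupItem` is a hypothetical stand-in for NT_PB_VideoEncoderBitrateGroupItem, the bit rate values used with it are made up, and comparing resolutions by pixel count is an assumption, since the text does not say how "less than or equal" is evaluated.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical stand-in for one bit rate group entry.
struct BitrateGroupItem {
    int width;
    int height;
    int bitrate_kbps;
};

// Assumption: resolutions are ordered by their pixel count.
long long pixel_count(int width, int height) {
    return static_cast<long long>(width) * height;
}

// Picks the bit rate of the smallest group that still covers the current
// resolution; above the largest group, the largest group's bit rate is used.
int select_bitrate(std::vector<BitrateGroupItem> groups, int width, int height) {
    assert(!groups.empty());
    std::sort(groups.begin(), groups.end(),
              [](const BitrateGroupItem& a, const BitrateGroupItem& b) {
                  return pixel_count(a.width, a.height) < pixel_count(b.width, b.height);
              });
    for (const BitrateGroupItem& g : groups) {
        if (pixel_count(width, height) <= pixel_count(g.width, g.height)) {
            return g.bitrate_kbps;
        }
    }
    return groups.back().bitrate_kbps;
}
```

With groups for 640*360 and 640*480, a 640*400 frame falls into the 640*480 group, and a 1280*720 frame also uses the 640*480 bit rate, which is why finer-grained groups give more accurate results.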

11 Audio encoding interface

  • NT_PB_GetAuidoInputDeviceNumber: Gets the number of system audio input devices
  • NT_PB_GetAuidoInputDeviceName: Gets the name of an audio input device
  • NT_PB_SetPublisherAudioCodecType: Sets the audio encoding type for push. type: 1: AAC encoding. 2: Speex encoding. Any other value returns an error
  • NT_PB_SetPublisherSpeexEncoderQuality: Sets the Speex encoding quality for push
  • NT_PB_SetAuidoInputDeviceId: Sets the audio input device ID
  • NT_PB_IsCanCaptureSpeaker: Checks whether speaker audio can be captured

12 Audio processing interface

  • NT_PB_SetEchoCancellation: Sets echo cancellation
  • NT_PB_SetNoiseSuppression: Sets audio noise suppression
  • NT_PB_SetAGC: Sets automatic audio gain control
  • NT_PB_SetVAD: Sets voice activity detection (VAD)

13 Layer composition interface

  • NT_PB_SetLayersConfig: Sets the video composition layers. An array is passed in; fill in each layer correctly
  • NT_PB_ClearLayersConfig: Clears all layer configurations. Note that this interface can only be called before pushing or recording, otherwise the result is undefined
  • NT_PB_AddLayerConfig: Adds a layer configuration. Note that this interface can only be called before pushing or recording, otherwise the result is undefined
  • NT_PB_EnableLayer: Dynamically enables or disables a layer
  • NT_PB_UpdateLayerConfigV2: Updates the configuration of a layer. Note that only some fields of a layer can be updated, and some layers have no updatable fields. The SDK applies the fields that can be updated and ignores the rest
  • NT_PB_UpdateLayerRegion: Modifies the region of a layer
  • NT_PB_PostLayerImage: Delivers Image data to the layer at the given index. Currently it is mainly used to pass RGB and YUV video data to the relevant layer
  • NT_PB_SetParam: General-purpose interface for setting parameters; most issues can be addressed through it
  • NT_PB_GetParam: General-purpose interface for getting parameters; most issues can be addressed through it
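As a rough illustration of the "array of layers" idea behind NT_PB_SetLayersConfig, the following C++ sketch builds a two-layer list for the "camera overlaid on screen" case. The `Layer` struct and `screen_with_camera` function are hypothetical and do not reproduce the SDK's real layer structures; the layer type values are taken from NT_PB_E_LAYER_TYPE, and treating the array order as bottom-to-top is an assumption.

```cpp
#include <cassert>
#include <vector>

// Values copied from NT_PB_E_LAYER_TYPE (local names are illustrative).
enum LayerType { LAYER_SCREEN = 1, LAYER_CAMERA = 2 };

// Hypothetical layer description used only in this sketch.
struct Layer {
    LayerType type;
    int x;
    int y;
    int width;
    int height;
    bool enabled;
};

// Full-size screen layer first, then a camera layer anchored to the
// bottom-right corner of the screen with the given margin.
std::vector<Layer> screen_with_camera(int screen_w, int screen_h,
                                      int cam_w, int cam_h, int margin) {
    assert(cam_w + margin <= screen_w && cam_h + margin <= screen_h);
    std::vector<Layer> layers;
    layers.push_back(Layer{LAYER_SCREEN, 0, 0, screen_w, screen_h, true});
    layers.push_back(Layer{LAYER_CAMERA,
                           screen_w - cam_w - margin,
                           screen_h - cam_h - margin,
                           cam_w, cam_h, true});
    return layers;
}
```

Toggling the `enabled` flag of the camera entry in such a list mirrors what NT_PB_EnableLayer is described as doing during push or recording.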

14 RTMP push – Set the RTMP push URL

NT_PB_SetURL: Sets the RTMP push URL

15 RTMP push – Start pushing the RTMP stream

NT_PB_StartPublisher

16 RTMP push – Stop pushing the RTMP stream

NT_PB_StopPublisher: Note that this interface is used as a pair with NT_PB_StartPublisher

17 RTSP push – Set the transport mode (TCP/UDP)

NT_PB_SetPushRtspTransportProtocol: Sets the transport mode for RTSP push. Most servers support both TCP and UDP transport for RTSP, but some support only one of them. transport_protocol: 1 means RTP packets are sent over UDP; 2 means RTP packets are sent over TCP. The default is 1 (UDP).

18 RTSP push – Set the RTSP push URL

NT_PB_SetPushRtspURL: Note that for RTSP push, you must make sure the push URL is available on the server.

19 RTSP push – Start pushing the RTSP stream

NT_PB_StartPushRtsp

20 RTSP push – Stop pushing the RTSP stream

NT_PB_StopPushRtsp: Note that this interface is used as a pair with NT_PB_StartPushRtsp.

21 RTMP/RTSP push recording

  • NT_PB_SetRecorderDirectory: Sets the local recording directory. The directory path must use English characters only; otherwise setting the directory fails
  • NT_PB_SetRecorderFileMaxSize: Sets the maximum size of a single recording file. When this size is exceeded, recording continues in a new file
  • NT_PB_SetRecorderFileNameRuler: Sets the rule for generating recording file names
  • NT_PB_StartRecorder: Starts recording
  • NT_PB_PauseRecorder: Pauses or resumes recording. is_pause: 1: pause recording. 0: resume recording
  • NT_PB_StopRecorder: Stops recording

22 Real-time mute (can be called in real time)

NT_PB_SetMute: Mutes or unmutes the push in real time

23 Snapshot (can be called in real time)

NT_PB_CaptureImage: Takes a real-time snapshot during push or recording

24 Close

NT_PB_Close: The handle is invalid after this interface is called

25 Uninit

NT_PB_UnInit: This is the last interface to call
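The overall call order for RTMP push can be summarized with a small state model in C++. This is a hypothetical sketch that does not call the SDK; it only records the interface names from the sequence above and rejects out-of-order steps. The assumption that Close is called only after the publisher has been stopped is mine and is not stated explicitly above.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical model of the call order; no SDK function is invoked.
class PublisherLifecycle {
public:
    enum State { Uninitialized, Initialized, Opened, Publishing };

    bool init()   { return step(Uninitialized, Initialized, "NT_PB_Init"); }
    bool open()   { return step(Initialized, Opened, "NT_PB_Open"); }
    bool start()  { return step(Opened, Publishing, "NT_PB_StartPublisher"); }
    bool stop()   { return step(Publishing, Opened, "NT_PB_StopPublisher"); }
    bool close()  { return step(Opened, Initialized, "NT_PB_Close"); }
    bool uninit() { return step(Initialized, Uninitialized, "NT_PB_UnInit"); }

    State state() const { return state_; }
    const std::vector<std::string>& calls() const { return calls_; }

private:
    // Performs the transition only when the current state matches `from`.
    bool step(State from, State to, const char* name) {
        if (state_ != from) {
            return false;  // call out of order: rejected
        }
        state_ = to;
        calls_.push_back(name);
        return true;
    }

    State state_ = Uninitialized;
    std::vector<std::string> calls_;
};
```

Configuration calls such as setting callbacks, layers, encoder parameters, and the URL would sit between `open()` and `start()` in this model; RTSP push follows the same shape with NT_PB_StartPushRtsp and NT_PB_StopPushRtsp.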

The above covers part of our module design. Interested developers can use it as a reference.