The writer is a member of Zhang Ye Dong's team, a winner of the RTE 2021 Creative Programming Challenge. In real-time audio and video, video content needs copyright protection, and blind watermarking is one of the available protection measures. The team developed a real-time blind video watermarking plug-in based on the Agora SDK. Other developers using the Agora SDK can also use the plug-in in their own applications. The project's source code is available through the Read Original link.

Project introduction

Blind video watermarking embeds identification information directly into the frequency domain of a video's RGB or YUV data. It has almost no effect on the viewing quality of the original video and is hard to detect or notice. The information hidden in the carrier can be used to identify content creators and users, or to determine whether the video has been tampered with. The technology is usually provided by professional copyright protection service providers for broadcast and television copyright protection, and it has strong commercial value.

This project developed a user-facing plug-in for real-time blind video watermarking based on the Agora SDK, and provides PC-based watermark recognition software for verifying the watermark. It lowers the expertise needed to use blind watermarking and offers individual users a convenient way to protect their privacy and guard their works against piracy.

Implementation principle

Blind watermarking works by superimposing information in the frequency domain. Suitable transforms include the discrete Fourier transform and the wavelet transform. With the Fourier transform, for example, an image of the text is superimposed on the real and imaginary parts of the spectrum, and the video frame is then recovered for display by the inverse transform.
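As a hedged illustration, additive embedding in the Fourier domain can be written as follows. The symbols are assumptions made for this note and are not taken from the project: f is one channel of the frame, W is the rendered text image placed in the spectrum, and α is a small strength factor.

```latex
F(u,v) = \mathcal{F}\{f\}(u,v), \qquad
F'(u,v) = F(u,v) + \alpha\, W(u,v), \qquad
f'(x,y) = \mathcal{F}^{-1}\{F'\}(x,y)
```

For f' to stay real-valued, W is usually mirrored so that F' remains conjugate-symmetric. A larger α makes the mark easier to recover but also more visible in the frame.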

To extract the watermark, take a screenshot of a video frame and apply the Fourier transform to it to obtain the frequency-domain data. Displaying the frequency-domain amplitude, which reflects the energy at each frequency, gives the amplitude map, in which the previously superimposed text is visible.
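The round trip described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the project's code: it uses a tiny 8x8 synthetic frame, a naive DFT from the standard library instead of OpenCV, and a hypothetical mark made of two frequency bins rather than rendered text. The names N, ALPHA, MARK, embed and extract are all made up for the example.

```python
import cmath
import math

N = 8                     # tiny frame so the naive O(n^2) DFT stays fast
ALPHA = 64.0              # embedding strength (hypothetical value)
MARK = [(2, 3), (3, 2)]   # (row, col) frequency bins that carry the mark

def dft(x, inverse=False):
    """Naive 1-D DFT; the inverse divides by n."""
    n = len(x)
    sign = 1 if inverse else -1
    out = [sum(x[k] * cmath.exp(sign * 2j * cmath.pi * m * k / n) for k in range(n))
           for m in range(n)]
    return [v / n for v in out] if inverse else out

def dft2(img, inverse=False):
    """2-D DFT: transform every row, then every column."""
    rows = [dft(r, inverse) for r in img]
    cols = [dft(list(c), inverse) for c in zip(*rows)]
    return [list(r) for r in zip(*cols)]

def embed(frame):
    """Add ALPHA to each marked bin and to its mirror bin, then invert."""
    spec = dft2(frame)
    for v, u in MARK:
        # Writing the mirror bin too keeps the spectrum conjugate-symmetric,
        # so the inverse transform is real up to rounding error.
        for r, c in ((v, u), ((N - v) % N, (N - u) % N)):
            spec[r][c] += ALPHA
    return [[z.real for z in row] for row in dft2(spec, inverse=True)]

def extract(frame, threshold=ALPHA / 2):
    """Return bins outside the low-frequency corner whose amplitude is high."""
    spec = dft2(frame)
    found = set()
    for v in range(N):
        for u in range(N):
            low_freq = min(v, N - v) < 2 and min(u, N - u) < 2
            if not low_freq and abs(spec[v][u]) > threshold:
                found.add((v, u))
    return found

# Smooth synthetic frame: its energy sits in the low-frequency bins only.
frame = [[128 + 20 * math.cos(2 * math.pi * x / N) for x in range(N)]
         for _ in range(N)]
marked = embed(frame)
max_change = max(abs(a - b) for ra, rb in zip(frame, marked)
                 for a, b in zip(ra, rb))
print("largest pixel change:", max_change)
print("bins found in marked frame:", sorted(extract(marked)))
print("bins found in original frame:", sorted(extract(frame)))
```

The extractor never sees the original frame, which is what makes the scheme blind. The sketch keeps pixel values as floats; a real pipeline would quantize to 8 bits and pass through a video codec, both of which weaken the mark and are the reason the strength has to be tuned.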

The complexity of the fast Fourier transform is O(n log n), so in principle blind watermarking can be performed in real time as part of video processing.
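To make the O(n log n) figure concrete, here is a textbook radix-2 Cooley-Tukey FFT together with a count of its butterfly operations. It is a sketch for illustration only; a production plug-in would rely on an optimized library routine, and the helper name butterfly_count is made up for the example.

```python
import cmath

def fft(x):
    """Recursive radix-2 FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return list(x)
    even = fft(x[0::2])   # transform of the even-indexed samples
    odd = fft(x[1::2])    # transform of the odd-indexed samples
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(-2j * cmath.pi * k / n) * odd[k]  # twiddle factor
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out

def butterfly_count(n):
    """Butterflies performed by fft() on n points: (n / 2) * log2(n)."""
    return 0 if n == 1 else 2 * butterfly_count(n // 2) + n // 2

print(fft([1, 1, 1, 1]))
print(butterfly_count(1024))
```

A 1024-point transform needs 5120 butterflies, whereas a direct DFT of the same length needs about a million complex multiplications. A 2-D frame transform applies the same routine to every row and then to every column.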

Design and implementation

The program design has two parts: Agora SDK integration and blind watermarking development. Blind watermarking development is in turn divided into two parts: overlaying the watermark on Android and extracting the watermark on Windows. In the design diagram these parts are shown in gray, yellow, and orange. Since this is a demo, blind watermarking is applied only to the local video preview; it can be extended to the video display side in the future.

The design focuses on SDK integration and third-party compatibility. The main considerations are minimizing copies of YUV data, serializing video processing, third-party compatibility, and generalizing across scenarios.
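The point about copying less YUV data can be illustrated with a zero-copy view of the luma plane. The sketch assumes an I420 buffer (a full-resolution Y plane followed by quarter-size U and V planes); the article does not state the actual pixel format, and the function name y_plane is hypothetical.

```python
def y_plane(buf, width, height):
    """Return a writable view of the Y plane of an I420 frame, without copying."""
    return memoryview(buf)[: width * height]

width, height = 4, 2
frame = bytearray(width * height * 3 // 2)   # Y plane, then U, then V

luma = y_plane(frame, width, height)
luma[0] = 200                                # writes straight into the frame buffer
print(len(luma), frame[0])
```

Because the view shares memory with the frame, a watermark stage can modify the luma in place and hand the same buffer to the next stage of the pipeline.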

Core code

Main process for superimposing the watermark:

OpenCV function calls:

There are two main functions: the Fourier transform and text superposition. The Agora SDK and the OpenCV open-source library work well together.

Results

The first picture is the original video, with watermark text such as WM entered as input. The second picture is the video with the blind watermark superimposed; the visible video is essentially unaffected. The last picture is the watermark image extracted after the user uploads the second picture to the PC, and the characters WM are clearly visible in it. This completes the validation.

Future work

The next steps are to improve the robustness of the watermark, expand its application scenarios, and enrich its data dimensions. For robustness, the plan is to divide the frame into a grid in the spatial domain and superimpose the watermark in the frequency domain of each region; to try different transforms, such as the DWT, for the best results; and to encode the watermark itself redundantly to improve its recognizability and imperceptibility. For broader application, the watermark would be superimposed at the real-time video display end, for deterrence and source tracing. For richer data dimensions, audio processing could be extended with voiceprint watermarks, and feature coding could be extended by drawing on video content features.
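The article does not say which redundancy scheme is planned. As one simple possibility, a repetition code with majority voting shows how redundant encoding lets the payload survive a few corrupted bits. The function names and the repeat factor below are assumptions for illustration.

```python
def encode(bits, repeat=3):
    """Repeat every payload bit `repeat` times."""
    return [b for b in bits for _ in range(repeat)]

def decode(coded, repeat=3):
    """Recover each payload bit by majority vote over its group of copies."""
    return [int(sum(coded[i:i + repeat]) * 2 > repeat)
            for i in range(0, len(coded), repeat)]

payload = [1, 0, 1, 1]
coded = encode(payload)

# Simulate damage from compression or re-capture: flip one bit in two groups.
damaged = list(coded)
damaged[1] ^= 1
damaged[4] ^= 1

recovered = decode(damaged)
print(recovered == payload)
```

The trade-off is capacity: with a repeat factor of 3 the frame must carry three times as many bits, which is one reason to combine redundancy with the grid segmentation mentioned above.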