Video watermarking has long been used and accepted by the public as an important means of protecting intellectual property rights, but the method still has many shortcomings. For the audience, a logo in the corner of the screen degrades the viewing experience somewhat. For the video owner, a watermark displayed directly on the screen is easy to locate and attack. Some vendors respond to these attacks by inserting watermarks at random positions on the screen from time to time, which makes logo removal (delogo) more difficult but degrades the viewing experience even further.

To solve these problems, invisible watermarking technology was proposed and has gradually developed. In “Video Invisible Watermarking Algorithm (Part 1)”, we introduced invisible watermarking techniques based on the encapsulation layer and on LSB. Although these methods are easy to implement and computationally cheap, the watermarks they add are fragile. They are used less to protect copyright than to hide data or carry ancillary information.

Next, this article introduces some invisible watermarking algorithms that operate in the transform domain and cope better with various attacks. You may need some prior knowledge of DCT, DWT, and SVD before reading on. If you are unfamiliar with these topics, the following Zhihu columns give a general overview:

  • Discrete cosine transform (DCT): zhuanlan.zhihu.com/p/85299446
  • Image algorithm and easy to understand explanation I — wavelet transform: zhuanlan.zhihu.com/p/22450818
  • Singular value decomposition (SVD): zhuanlan.zhihu.com/p/29846048

1. DCT invisible watermark

DCT-based invisible watermarking is a common family of invisible watermarking algorithms, and there are several reasons to choose DCT.

First, the human eye is not equally sensitive to all frequency components of an image, so manipulating data directly in the frequency domain makes it easier to control the perceived distortion and keep the watermark “invisible”.

Second, different frequencies differ in signal stability. Adding the watermark in the frequency domain helps control its robustness and ensures that it can still be recovered after the carrier has suffered various kinds of damage.

Third, in theory this kind of method can be embedded directly into some encoders, which reduces the computation.

However, it should be noted that the first two points actually pull against each other. The lower the frequency band that carries the watermark data, the higher the robustness but the greater the image distortion, and vice versa. Therefore, most implementations add the watermark in the mid-frequency range.

The following figure shows a common watermark embedding process. The image is transformed by DCT, the watermark data is added to the selected frequency coefficients, and IDCT then restores the image, which completes the embedding.

Common DCT-based watermark embedding process

If the original image is available as a reference when the watermark is extracted, the embedding step in the figure generally takes one of the following forms. In the formulas, v_i is the original coefficient, X_i is the watermark coefficient, and α is a constant.

Watermark embedding formula with parameters
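Written out in the notation above, the three embedding rules proposed by Cox et al. [1] are as follows. It is assumed here that these are the options the caption refers to, and v'_i (a symbol introduced for this sketch) denotes the watermarked coefficient.

```latex
\begin{align}
v'_i &= v_i + \alpha X_i \\
v'_i &= v_i \,(1 + \alpha X_i) \\
v'_i &= v_i \, e^{\alpha X_i}
\end{align}
```

The first rule changes every coefficient by the same amount regardless of its magnitude, while the second and third scale the change with v_i, so larger coefficients absorb a larger modification.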

Corresponding watermark extraction process
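As a rough illustration of embedding and reference-based extraction, the sketch below uses the additive rule v'_i = v_i + αX_i and inverts it as (v*_i − v_i) / α, where v*_i is the coefficient read back from the watermarked image. The function names, the mid-band limits, and the value of α are all assumptions made for this example, and the DCT is built by hand with NumPy so that nothing else is required. It keeps the image in floating point; rounding back to 8-bit pixels would add noise to the recovered values.

```python
import numpy as np

def dct_matrix(n):
    """Orthonormal DCT-II matrix C: C @ x is the DCT of x, and C.T inverts it."""
    k = np.arange(n).reshape(-1, 1)   # frequency index (rows)
    i = np.arange(n).reshape(1, -1)   # sample index (columns)
    c = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * i + 1) * k / (2 * n))
    c[0, :] = np.sqrt(1.0 / n)        # DC row gets the smaller scale factor
    return c

def dct2(block):
    """2-D DCT: transform the rows and the columns."""
    return dct_matrix(block.shape[0]) @ block @ dct_matrix(block.shape[1]).T

def idct2(coeffs):
    """Inverse 2-D DCT (the matrices are orthogonal, so transpose inverts)."""
    return dct_matrix(coeffs.shape[0]).T @ coeffs @ dct_matrix(coeffs.shape[1])

def mid_band(shape, lo=0.1, hi=0.4):
    """Boolean mask for a middle band of frequencies (limits are illustrative)."""
    u = np.arange(shape[0]).reshape(-1, 1) / shape[0]
    v = np.arange(shape[1]).reshape(1, -1) / shape[1]
    r = u + v
    return (r >= lo) & (r < hi)

def embed(image, watermark, alpha=2.0):
    """Add alpha * watermark to the first mid-band DCT coefficients."""
    coeffs = dct2(image.astype(np.float64))
    idx = np.flatnonzero(mid_band(coeffs.shape))[: watermark.size]
    flat = coeffs.flatten()           # flattened copy of the coefficients
    flat[idx] += alpha * watermark
    return idct2(flat.reshape(coeffs.shape)), idx

def extract(marked, original, idx, alpha=2.0):
    """Recover the watermark by comparing against the original image."""
    diff = dct2(marked.astype(np.float64)) - dct2(original.astype(np.float64))
    return diff.flatten()[idx] / alpha

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    image = rng.integers(0, 256, size=(64, 64)).astype(np.float64)
    watermark = rng.choice([-1.0, 1.0], size=64)
    marked, idx = embed(image, watermark)
    print(np.allclose(extract(marked, image, idx), watermark))
```

A larger α makes the watermark easier to recover after lossy processing but also increases the visible change, which is the robustness and distortion trade-off described earlier.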

If the original image is not available as a reference, the idea of the LSB method from the previous part can be borrowed: during embedding, the original coefficient is quantized to a low precision, and the watermark data is then stored in the remaining high-precision part.
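One way to read this idea is as quantization with a fixed step: the coefficient is snapped to a coarse grid, and the bit decides whether the value sits in the lower or the upper half of its grid cell. The sketch below is a minimal illustration under that assumption; the function names and the step size are hypothetical, and it operates on an array of transform coefficients (for example DCT coefficients) rather than on an image.

```python
import numpy as np

def embed_bits(coeffs, bits, step=16.0):
    """Quantize each coefficient to a multiple of `step`, then place it at
    0.25 * step (bit 0) or 0.75 * step (bit 1) inside that cell."""
    coeffs = np.asarray(coeffs, dtype=np.float64)
    bits = np.asarray(bits)
    return (np.floor(coeffs / step) + 0.25 + 0.5 * bits) * step

def extract_bits(coeffs, step=16.0):
    """Read each bit back from the position inside the cell; no original needed."""
    coeffs = np.asarray(coeffs, dtype=np.float64)
    return (np.mod(coeffs, step) > step / 2).astype(int)

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    coeffs = rng.uniform(-200.0, 200.0, size=32)
    bits = rng.integers(0, 2, size=32)
    marked = embed_bits(coeffs, bits)
    # Any disturbance smaller than step / 4 leaves the bits readable.
    noisy = marked + rng.uniform(-3.9, 3.9, size=32)
    print(np.array_equal(extract_bits(noisy), bits))
```

The step size plays the role that α plays in the reference-based formulas: a larger step tolerates more distortion of the carrier, at the cost of a larger change to each coefficient (up to 0.75 × step in this sketch).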

2. DWT and SVD

As mentioned above, an invisible watermarking algorithm needs both low visual loss and high robustness. DCT is not the only tool that can achieve this; DWT and SVD are two other common options. DWT implementations generally use the Haar wavelet, which is cheap to compute and decomposes the image into four sub-bands covering different frequency ranges. The decomposition can be applied recursively several times, which significantly reduces the amount of data to be processed afterwards, so DWT is often used as a preprocessing step for invisible watermarking. SVD simply treats the image data as a two-dimensional matrix and relies on the stability of the singular values to protect the watermark.

Below is an example of an invisible watermark that combines DWT and SVD. In the figure, only one round of DWT is performed and the low-frequency LL sub-band is selected for processing. However, there are also implementations that apply several rounds of DWT and run SVD on the LH and HL sub-bands. The corresponding watermark extraction is simply the inverse process, so no figure is shown for it here.

Example of watermark embedding based on DWT and SVD
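The combination can be sketched roughly as follows. This is one possible variant chosen for illustration and not necessarily the scheme in the figure: it runs one level of Haar DWT, splits the LL sub-band into 4×4 blocks, and stores one bit per block by quantizing the block's largest singular value. All names, the block size, and the step are assumptions. The sketch also assumes that the largest singular value of each block is well separated from the second one, which typically holds for LL blocks of an ordinary image with non-negative pixel values.

```python
import numpy as np

def haar_dwt2(x):
    """One level of the orthonormal 2-D Haar transform (even height and width).
    Sub-band naming conventions vary between libraries."""
    a, b = x[0::2, 0::2], x[0::2, 1::2]
    c, d = x[1::2, 0::2], x[1::2, 1::2]
    ll = (a + b + c + d) / 2          # low-pass in both directions
    lh = (a - b + c - d) / 2          # difference across columns
    hl = (a + b - c - d) / 2          # difference across rows
    hh = (a - b - c + d) / 2          # diagonal detail
    return ll, lh, hl, hh

def haar_idwt2(ll, lh, hl, hh):
    """Inverse of haar_dwt2."""
    x = np.empty((2 * ll.shape[0], 2 * ll.shape[1]))
    x[0::2, 0::2] = (ll + lh + hl + hh) / 2
    x[0::2, 1::2] = (ll - lh + hl - hh) / 2
    x[1::2, 0::2] = (ll + lh - hl - hh) / 2
    x[1::2, 1::2] = (ll - lh - hl + hh) / 2
    return x

def embed(image, bits, step=32.0, block=4):
    """Store one bit per LL block in the block's largest singular value."""
    ll, lh, hl, hh = haar_dwt2(image.astype(np.float64))
    cols = ll.shape[1] // block
    for n, bit in enumerate(bits):
        r, c = divmod(n, cols)
        rows_sl = slice(r * block, (r + 1) * block)
        cols_sl = slice(c * block, (c + 1) * block)
        u, s, vt = np.linalg.svd(ll[rows_sl, cols_sl])
        # Quantize s[0]: lower half of the cell for bit 0, upper half for bit 1.
        s[0] = (np.floor(s[0] / step) + 0.25 + 0.5 * bit) * step
        ll[rows_sl, cols_sl] = (u * s) @ vt
    return haar_idwt2(ll, lh, hl, hh)

def extract(image, n_bits, step=32.0, block=4):
    """Blind extraction: read each bit from the largest singular value."""
    ll = haar_dwt2(image.astype(np.float64))[0]
    cols = ll.shape[1] // block
    bits = []
    for n in range(n_bits):
        r, c = divmod(n, cols)
        blk = ll[r * block:(r + 1) * block, c * block:(c + 1) * block]
        s = np.linalg.svd(blk, compute_uv=False)
        bits.append(int((s[0] % step) > step / 2))
    return bits

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    image = rng.integers(0, 256, size=(64, 64)).astype(np.float64)
    bits = rng.integers(0, 2, size=32).tolist()
    marked = embed(image, bits)
    print(extract(marked, len(bits)) == bits)
```

The image is kept in floating point for clarity. A real pipeline would round the result back to 8-bit pixels and would need a step large enough to survive that rounding and any later compression.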

3. Machine learning is everything

To improve results further, some researchers have followed the current trend and tried to implement invisible watermarking with machine learning. One such implementation, RivaGAN [3], is included in invisible-watermark [2], the open-source Python library I referred to for this article. Its framework is shown below. The Attention module derives from the original image an attention mask that describes how the target data should be distributed, and the Encoder module uses this mask to embed the watermark data D into the video. During training, RivaGAN uses a Critic network to evaluate picture distortion and an Adversary network to simulate active attacks, and adds a hand-designed Noise network to simulate common transmission distortion (including scaling, cropping, and lossy compression), in order to get better results in terms of both picture distortion and robustness.

RivaGAN’s watermarking process

4. Watermark obfuscation and encryption

Invisible watermarking generally requires open algorithms; after all, no one trusts the extraction results of a black box. However, no matter how complicated the embedding method is, the watermark can still be extracted, erased, or even replaced by attackers. To prevent this, the watermark data itself or the coordinates at which it is embedded are often obfuscated and encrypted with a key. As long as attackers do not have the key, they cannot recover the original watermark data even if they already know how the watermark is embedded.

Complete invisible watermarking system

5. Summary

This article briefly introduces the frequency-domain and machine-learning approaches to invisible watermarking. Because this material involves some specialized knowledge, it cannot explain the principles and details as thoroughly as the previous part did. If you are interested in what was omitted, it is a good idea to read the source code of the Python invisible-watermark library [2], which implements three watermark embedding schemes, all of which are covered in this article.

References

[1] I.J. Cox, J. Kilian, F.T. Leighton, T. Shamoon. Secure spread spectrum watermarking for multimedia. 1997.

[2] github.com/ShieldMnt/i…

[3] K.A. Zhang, L. Xu, A. Cuesta-Infante, K. Veeramachaneni. Robust Invisible Video Watermarking with Attention. MIT EECS, September 2019.

[4] C.I. Podilchuk, E.J. Delp. Digital watermarking: algorithms and applications. 2001.