A note up front: although I mention my own project at the end of this article, this is not an advertorial. I bring it up only because I was pleased to find we had the same idea, and I hope it does not get in the way of your reading.


The big story in tech news these days is that Facebook has released a white paper for Libra, a digital currency built on cryptocurrency concepts that Facebook itself does not fully control. Of course, it is still a relatively centralized currency governed by a handful of giants rather than a truly decentralized cryptocurrency. This is not Facebook's first attempt in this direction, but it is far better prepared than its previous efforts. Although Libra still differs in many ways from a truly decentralized blockchain currency, the giants have at least rushed into the market.

So even if Facebook does not get what it wants, Libra is still a big deal for the industry. But this article is not really about Facebook's cryptocurrency. Blockchain and cryptocurrency are closely related: some projects focus more on the chain, others more on the coin. Coincidentally, on the same day, Stephen Wolfram published an article about an important application of blockchain that has received far less attention amid the frenzy over Facebook's coin. In my opinion, it discusses something more worth thinking about.

Stephen Wolfram does not seem to be particularly well known in the Chinese-speaking world, which understates his actual standing in the industry. If I had to sum him up, he is one of the most talented people in the business, with extraordinary achievements in both scientific research and business. At 15 he began publishing physics papers, with his heart set on becoming a physicist. At 20 he received his PhD in physics from Caltech and began studying complexity theory. Later, to support his research, he decided to build his own tools, which ultimately became Mathematica, one of the most important programs in history. Beyond mathematics and physics, he was also among the first scientists to work on artificial intelligence.

Why did he write this article? It starts with DeepFake, the deep-learning-based video face-swapping technology. If you follow tech news, you will remember DeepFake, the face-swapping hack that went viral last year. In the year since, strange applications have been built on it: it began as a way to swap celebrities' faces, but it soon became clear that the technology could have devastating and unpredictable consequences if used for fake news. Such concerns are growing as the 2020 U.S. presidential election draws closer.

In early June, the U.S. House Intelligence Committee held a hearing on DeepFake and AI issues, and Stephen Wolfram was among those invited. He did not have time to attend, so he wrote up his thoughts and posted them on his blog. This is good news for readers like me, because it let him lay out his thinking at a more leisurely pace, without being constrained by the rhythm of a hearing.

As you know, photo forgery is a very old craft; people were doing it in the days of traditional film, long before Photoshop. Stephen Wolfram notes that photo forgery is almost as old as photography itself. The question today is: when AI can be used to fake photos, is there an easy way to tell which photos are real? We used to say "no picture, no truth"; now even pictures and videos may not be true.

To discuss this, let us set aside the technical details and briefly describe the AI technique used to forge photos and videos: the GAN (generative adversarial network). In the simplest terms, a GAN can be thought of as two systems pitted against each other: one, called the generator, produces fake data; the other, called the discriminator, uses real data to judge whether the generated data is genuine or fake. The two systems train against each other until the discriminator can no longer tell real from fake, so the goal of such a system is to produce data that a machine cannot distinguish from the real thing. A GAN is certainly no panacea; to call it intelligent is flattery, since it has no real intelligence and no understanding of logic. But when it comes to faking videos and photos, its current capabilities are enough to spell trouble for humans. If a generator produces enough content and social networks distribute and retweet it, authentication becomes an enormous job. In an age when most people already have deep experience of the weird rumors circulating in various group chats, consider how much more work verification will take when those rumors arrive with convincing photos and videos.
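To make the adversarial loop concrete, here is a minimal toy sketch in Python. It is an illustration only, not any real DeepFake system: the "real" data is just numbers drawn from a normal distribution, the generator is a two-parameter linear map, the discriminator is a one-variable logistic regression, and all names and hyperparameters are my own assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# "Real" data stands in for genuine photos: samples from N(4, 1.25).
def real_batch(n):
    return rng.normal(4.0, 1.25, n)

a, b = 1.0, 0.0   # generator params: z -> a*z + b
w, c = 0.0, 0.0   # discriminator params: x -> sigmoid(w*x + c)
lr = 0.05

for step in range(2000):
    n = 64
    x = real_batch(n)
    z = rng.normal(0.0, 1.0, n)
    g = a * z + b                      # fake samples

    # Discriminator step: gradient ascent on log D(x) + log(1 - D(g)).
    dx, dg = sigmoid(w * x + c), sigmoid(w * g + c)
    w += lr * (np.mean((1 - dx) * x) - np.mean(dg * g))
    c += lr * (np.mean(1 - dx) - np.mean(dg))

    # Generator step: gradient ascent on log D(G(z)) (non-saturating loss).
    dg = sigmoid(w * g + c)
    a += lr * np.mean((1 - dg) * w * z)
    b += lr * np.mean((1 - dg) * w)

# After training, the generator's output distribution has drifted toward
# the real one, making the discriminator's job progressively harder.
fakes = a * rng.normal(0.0, 1.0, 1000) + b
```

Even in this toy setting the dynamic is visible: the discriminator learns to separate the two distributions, and the generator responds by shifting its output toward the real data until the two are hard to tell apart, which is exactly the arms race described above.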

Nothing stops people from using machines to produce an endless stream of fake images and videos. For a few critical pieces of content, humans can spend heavily, say on forensic expert groups, to verify authenticity. But for the mass of everyday content, especially on social networks, efficiency and labor costs make this approach powerless. Fake videos will be generated much faster than they can be authenticated, and even when authentication succeeds, it arrives too late to matter.

Efficient identification of fake videos can only rely on machines, and many startups are already using AI to detect them. But consider: are machines actually capable of telling whether an image or video is fake? Sadly, the answer is pessimistic, and the reason is already contained in the description of how a GAN works. Since the generator's purpose is to defeat the discriminator, the resulting images and videos are, by construction, ones the machine can no longer distinguish. Humans can still find gaps in image generation and build better discriminators, but those gaps are then filled in by better generators. "It's an arms race," as Stephen Wolfram puts it, and its end result can only be ever-higher-quality fake data. So using machines to help tell real from fake may work in the short term, but in the long run the struggle is futile: generators will eventually beat discriminators, and fake pictures and videos that cannot be judged will win.

Following this corollary, the way we define "real" today will change radically. For example, today you can prove you were not at fault in a traffic accident with dashcam footage. In the future such evidence may no longer hold, because no one can know whether the video has been reprocessed by AI. Even the photos and videos circulating in society will be more fake than real, because a single real video can be transformed by AI into different fake videos for different occasions. Many of the anchors of trust in today's society will disappear, and there is nothing we can do to stop it.

So we need an entirely different way of thinking about authenticity: shifting from today's "photos and videos are presumed real unless proven fake" to "photos and videos are presumed fake unless proven real." Meeting this need requires a system that stores all evidence impartially and neutrally, and that makes it easy for machines to read that evidence and draw conclusions such as "verifiable with high probability." According to Stephen Wolfram, that system is the blockchain: its decentralization, immutability, and machine-friendly read/write access fit these requirements perfectly.

Blockchain technology cannot judge authenticity directly, but if every video created and every photo taken were documented on a blockchain, machines could one day use that evidence to help people determine what is real. What's more, today we cannot fully know how the world will change or how far AI-driven forgery will go, so we should try to save as much metadata as possible: scene data at the time of shooting, such as GPS coordinates, time, temperature, weather, and so on. All kinds of information, relevant or not, should be stored; the more metadata there is, the more material there will be for judging truth from falsehood. The idea is to package all that metadata with the media file, compute a hash of the bundle, and record the hash on a decentralized blockchain. One day, machines will be able to read the information stored today and use it to judge authenticity. Today's "no picture, no truth" will become "no chain, no truth."
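As a rough sketch of that packaging step, the following Python uses only standard-library hashing. The metadata fields and the in-memory "chain" are illustrative assumptions of mine, not any real ledger or file format.

```python
import hashlib
import json

def fingerprint(media_bytes: bytes, metadata: dict) -> str:
    # Canonical JSON (sorted keys) so the same metadata always hashes identically.
    meta_blob = json.dumps(metadata, sort_keys=True).encode("utf-8")
    return hashlib.sha256(meta_blob + media_bytes).hexdigest()

chain = []  # stand-in for a decentralized, append-only ledger

def record(media_bytes: bytes, metadata: dict) -> str:
    h = fingerprint(media_bytes, metadata)
    prev = chain[-1]["hash"] if chain else "0" * 64
    # Each entry links to the previous hash, so later tampering is detectable.
    chain.append({"hash": h, "prev": prev, "metadata": metadata})
    return h

video = b"\x00example-video-bytes"
h1 = record(video, {"gps": "37.42,-122.08",
                    "time": "2019-06-20T10:00:00Z",
                    "temperature_c": 21,
                    "weather": "clear"})

# Later, anyone can recompute the fingerprint from the file plus its
# claimed metadata and compare it against the on-chain record.
assert fingerprint(video, {"weather": "clear",
                           "temperature_c": 21,
                           "gps": "37.42,-122.08",
                           "time": "2019-06-20T10:00:00Z"}) == h1
```

Note that only the hash needs to live on the expensive chain; the media and metadata themselves can be stored anywhere, since any alteration to either changes the fingerprint.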

It is important to note that many technical details would have to be worked out to implement such a system. For one, there is no guarantee that data is genuine before it reaches the chain: an author could shoot a video, forge it, and then sign and store the forgery on the blockchain, making authenticity even harder to judge. The final verdict is therefore always a probability; being on-chain does not mean 100% true. But by storing as much metadata as publicly as possible, the system enables more analysis and cross-comparison, further reducing the likelihood of successful forgery. Continuing the earlier example: if the person appearing in a forged video can point to another on-chain video showing them in a different geographic location during an overlapping time period, then at least one of the two videos must be forged. Moreover, such evidence of forgery cannot be scrubbed from the chain, which permanently damages the uploader's credibility. All in all, such a system brings more rigor and more automation to the problem. It is still not a perfect solution, but it is better than no system at all. In the future, people will increasingly live with the ubiquity of probability: we will be able to say "very likely true," but rarely "absolutely true."
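That kind of cross-comparison is easy for a machine to automate once metadata is on-chain. Here is a hypothetical sketch in Python: the record layout and the 900 km/h travel-speed bound are my own assumptions for illustration.

```python
import math
from datetime import datetime

def km_between(a, b):
    # Haversine distance between two (lat, lon) pairs, in kilometres.
    lat1, lon1, lat2, lon2 = map(math.radians, (*a, *b))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371 * math.asin(math.sqrt(h))

def inconsistent(rec1, rec2, max_speed_kmh=900.0):
    # If covering the distance between the two records within the elapsed
    # time would require implausible speed, at least one record is forged.
    hours = abs((rec1["time"] - rec2["time"]).total_seconds()) / 3600
    distance = km_between(rec1["gps"], rec2["gps"])
    return distance > max_speed_kmh * max(hours, 1e-9)

tokyo = {"person": "alice", "gps": (35.68, 139.69),
         "time": datetime(2019, 6, 20, 10, 0)}
paris = {"person": "alice", "gps": (48.86, 2.35),
         "time": datetime(2019, 6, 20, 11, 0)}

print(inconsistent(tokyo, paris))  # Tokyo to Paris in one hour: prints True
```

A real system would run checks like this across every pair of records mentioning the same person, and the contradiction itself would remain on-chain as evidence.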

The reasoning is clear and simple, and there really is no other solution on offer. In fact, Stephen Wolfram has long been a proponent of recording all personal data. For more than two decades he has recorded everything about himself that he can, from the hours he works to the types of documents he uses to the frequency of his emails, and has written programs to analyze it all. The difference is that in the past he kept the data for his own use; in the future, creators should make some of their data records public for more people to use, which is an interesting change.

To my surprise, this lines up with our own thinking. I have written before that our goal with PRESSone is to help "confirm" rights, and I often try to explain that rights confirmation is not the same as copyright: copyright-related applications are the next step built on top of it, but the two concepts are different. I never found a particularly good example of why they differ, or of why a complex system like blockchain is needed for confirmation, so PRESSone is often pigeonholed as a "content copyright" app. It does handle content copyright, of course, but confirmation is the more important foundation. Blockchain is a far more expensive storage system than a central database, and such an expensive solution needs a reason to exist; the easiest question to ask is "why do I need it at all?" Interestingly, as time goes on, answers keep popping up, because advancing technology forces humans to need such systems in more and more situations. Technology is also pulling more and more people into this field: in the past only people in the technology industry were involved, but now creators have to care too. No one wants their work to become fodder for dubious fake videos in the future, so what you can do now is preserve immutable records and wait for that future to arrive.


Reference Note:

  • Cover Photo by Agence Olloweb on Unsplash

  • Stephen Wolfram’s Blog: https://blog.stephenwolfram.com/2019/06/a-few-thoughts-about-deep-fakes

  • DeepFake: https://en.wikipedia.org/wiki/Deepfake

  • The House Intelligence Committee hearing record: https://intelligence.house.gov/news/documentsingle.aspx?DocumentID=657

  • What is PRESSone: https://static.press.one/files/PRS_whitepaper_1_0_1_cn.pdf