TiDB Hackathon 2020

This year marks the fourth TiDB Hackathon. It was the largest so far: 45 teams from around the world signed up, making it the first edition with global participation. After two days of intense competition, many exciting projects emerged. To share the stories behind the participating teams, we are starting the TiDB Hackathon 2020 outstanding project series. This article introduces the story behind the ‘or 0=0 or’ team.

**In this competition, the ‘or 0=0 or’ team built an elegant and efficient User Defined Function (UDF) engine for TiDB based on WASM. The project earned unanimously high marks from the judges and took first place, along with $100,000 in prize money.** To inspire other developers to build their own applications and projects on TiDB, we interviewed the ‘or 0=0 or’ team and judge Zhang Donghui after the competition and invited them to share their Hackathon experience.

A perfect illustration of the hacker spirit

If you are not a tech geek, the team name ‘or 0=0 or’ (quotes included) is confusing at first sight. The name is styled after a SQL injection string, and it started as a prank: the team members wanted to see whether anything would break when the name was entered into systems in 2021.

This thoroughly geeky team has four members: Zhuang Tianyi, Zhu Hetian, Wang Weizhen, and Shi Wenxuan, who are respectively a Committer, a Reviewer, an Active Contributor, and a Tech Leader of TiKV. They mostly know each other online, having met through TiKV and TiDB community development.

Mr. Zhang Donghui, TiDB product consultant and one of the judges, was deeply impressed by ‘or 0=0 or’ and gave the team a nearly perfect score: “‘or 0=0 or’ left two deep impressions on me during the Demo Show. First, the team name itself expresses the hacker spirit, and it also points to the biggest value of the project, security, which I think is also the hardest part. Second, the demo was beautifully designed: they used storytelling to lead the audience step by step from the simplest case to an amazing finale.”

As a three-time TiDB Hackathon veteran, Shi Wenxuan believes that a Hackathon is the place to try out designs and ideas that are not yet mature, or even not very reliable. Normally such ideas are hard to push forward, but at a Hackathon you can form a team and prove them with code, which is the most interesting part.

For newcomers, Wenxuan suggests that ideas need not be limited to the TiDB and TiKV kernels. Debugging tools, development tools, visualization tools, integrations with big data solutions, and other ecosystem tools all leave more room for imagination. The last two Hackathons have seen more and more high-quality ecosystem projects, and this year there were many Flink-based projects.

A polished UDF delivered in 24 hours of extreme development

**A UDF (user-defined function) engine lets users write complex custom function logic and run the computation directly in the database.** This capability lets users extend the TiDB platform themselves, which matters a great deal for a platform product. Until now, TiDB had not shipped UDFs because of security and other challenges. Security in particular is hard: a year or two of R&D investment might not be enough to solve it.

Surprisingly, the four members of ‘or 0=0 or’ completed the UDF project within 24 hours, including the full push-down, environment preparation, and interface design. It was an impressively complete project.

Some functions, such as bcrypt, were previously not supported by TiDB. With UDFs, there is no need to pull raw data out of TiDB and compute on the client. Running the computation inside the distributed database can significantly improve performance, and it integrates seamlessly with database features such as JOIN. A UDF can also call out to computing resources on cloud services, for example cloud face recognition or Serverless workloads that scale without limit.
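To make the push-down idea concrete, here is a toy Python sketch. It is not the team's implementation or any TiDB API: the names `Node`, `scan_all`, `scan_with_udf`, and `is_long_name` are hypothetical, and the "cluster" is just two in-memory lists. It contrasts shipping every row to the client with letting each storage node apply the user function to its own shard and return only the matching rows.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

Row = Dict[str, object]


@dataclass
class Node:
    """A hypothetical storage node holding one shard of a table."""
    rows: List[Row]

    def scan_all(self) -> List[Row]:
        # Without push-down: every row is shipped to the client.
        return list(self.rows)

    def scan_with_udf(self, udf: Callable[[Row], bool]) -> List[Row]:
        # With push-down: the user function runs next to the data,
        # so only rows that pass the filter leave the node.
        return [row for row in self.rows if udf(row)]


def is_long_name(row: Row) -> bool:
    # Stand-in for a user-defined function; a real UDF could be
    # something heavier, such as a bcrypt check.
    return len(str(row["name"])) > 4


nodes = [
    Node([{"id": 1, "name": "alice"}, {"id": 2, "name": "bob"}]),
    Node([{"id": 3, "name": "carol"}, {"id": 4, "name": "dan"}]),
]

# Client-side evaluation: transfer all rows, then filter on the client.
transferred_client = [row for node in nodes for row in node.scan_all()]
result_client = [row for row in transferred_client if is_long_name(row)]

# Push-down evaluation: each node filters its own shard.
result_pushdown = [
    row for node in nodes for row in node.scan_with_udf(is_long_name)
]
```

Both paths produce the same result, but the push-down path moves only the matching rows over the network. The sketch leaves out everything that makes a real engine hard, in particular sandboxing the user code.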

In the interview, Wenxuan explained why ‘or 0=0 or’ chose UDF. On the one hand, various standardized tests require UDFs, yet TiDB had not implemented them, and doing so was seen as very difficult. On the other hand, UDF requirements were already on the TiDB Cloud product roadmap. For these two reasons the team took on this challenging project, and ultimately chose to implement UDFs on top of WASM.

WASM, short for WebAssembly, is a binary instruction format that is portable, compact, fast to load, and Web compatible. Its core strength is security: it can run untrusted code on a host platform, and it is secure enough to hold up in the most demanding security environments on the Internet. In addition, WASM is cross-language and cross-platform. It runs in browsers, operating systems, cloud services, and more, so developers can work in the language of their choice.

**Zhang Donghui believes that the UDF project from ‘or 0=0 or’ opens a door for TiDB to become a platform product and reveals many possibilities.** For example, building on this project, TiDB could go on to support Machine Learning, triggers, and more. It could even become a core TiDB feature in the future, supporting a wide range of business scenarios.

Regrets

Twenty-four hours is an extreme challenge for any software project. This year, because of the epidemic, the members of ‘or 0=0 or’ were scattered across Beijing, Hangzhou, and other parts of the country, and the whole project was run online, which became the biggest challenge of this year's Hackathon. In previous years, teammates could sit together in an office and talk face to face. Remote collaboration was a new experience, and adapting to it was a new test for every team member.

Working remotely, the four members split up the planned features. Some handled running WASM on the TiDB side, some handled running WASM on the TiKV side, some handled MySQL compatibility, and one person was to merge the code at the end. On the first day they wrote their parts separately, and on the second day the first attempt to merge them failed. Fortunately, with everyone pulling together, the pieces were integrated before the presentation and the demo did not fail.

Asked whether the competition left any regrets, Wenxuan said that the tight schedule led them into quite a few pitfalls. For example, they had planned to support network access from UDFs, but got it working only on TiKV and ran into trouble on TiDB. Because the TiDB side and the TiKV side call different libraries, some APIs that worked on TiKV did not work on TiDB, and network access ultimately did not make it in. The second regret was that one teammate spent an entire day trying to get Java programs to run on WASM. After evaluating various Java-to-WASM solutions, none of them worked, and the attempt was unsuccessful.

Takeaways

The attempts were not wasted, though. The team now has a good idea of what works, and with a little more time they could add these capabilities. Wenxuan also has high hopes for the future of UDF: “Beyond our own 24 hours of intense hacking, UDF could not have been born without the PingCAP colleagues who worked hard to prepare this event and the judges who listened to the teams for more than ten hours. The Hackathon is over, but if the WASM UDF we designed and implemented makes it into the TiDB mainline and becomes a feature users can really use, that would be a great recognition of our design and code.”

At the end of the interview, we asked ‘or 0=0 or’ to name their favorite team in the competition other than their own, and Wenxuan named Senhaifei Xia. Their project, Dynamic copysets, addresses how to significantly improve data reliability across an entire cluster when the cluster is relatively large. The project is based on a paper, but the paper offers only a static solution, and applying it to TiDB requires a dynamic one. Academia has no solution for this yet, so Senhaifei Xia may have filled that gap and explored a new direction that proved feasible.

An interview about that project will also be published soon. Stay tuned!