With the explosion of data volume and the rise of AI, the most common approach in typical big data scenarios is to use batch computing to process full data sets and stream computing to process real-time incremental data. In production environments, users typically run both a batch engine and a stream engine to cover the two cases. The downside is that you have to write two sets of code and maintain two engines, and this architecture undoubtedly adds burden and cost.

Can a single, unified big data engine process both full data and incremental data?

Apache Flink is widely regarded as one of the best stream computing engines in the industry. Its capabilities are not limited to stream processing: it is a big data engine that supports stream processing, batch processing, and machine learning. Users develop one set of code for their business logic, and the same program handles full data, incremental data, and real-time processing alike. To give you a more comprehensive understanding of the technology and application practices behind Apache Flink, today we are making the Apache Flink video course series available for free for the first time.

Why bookmark the Apache Flink course series?

According to a 2018 market research report, Apache Flink was the “fastest growing” engine in the open source big data ecosystem in 2018, growing 125% compared with 2017. Flink’s community ecosystem keeps growing. In China, more and more Internet companies use Flink in production to solve real-time computing, stream computing, risk control, and other problems. Learning Flink has therefore become a pressing need.

This free open course consists of 9 classes covering Flink's basic architecture, application scenarios, cluster deployment, operating mechanism, and programming paradigms, giving a systematic explanation of Flink as a big data computing engine.

1.1 Why Study Apache Flink?

Keywords: the importance of Flink

The course begins with Chen Shouyuan (Bazhen), senior product expert at Alibaba and leader of the real-time computing product team. He covers the original motivation for creating the Apache Flink course series, the definition, architecture, and principles of Apache Flink, and the preparation and learning methods that will help you learn Flink effectively.

1.2 Basic concepts of Flink

Keywords: Apache Flink PMC, stateful stream processing

In this class, Apache Flink PMC member and Ververica software engineer Dai Zhilii discusses the core concepts of Flink as a stateful stream processing engine. How does Flink differ from other big data engines? Why use Flink, and what challenges do stateful stream processing engines face?

1.3 Flink installation, deployment, environment configuration and application running

Keywords: the first lesson every Flink developer must pass

The way past “easier said than done” is hands-on practice. In the third class, Alibaba senior development engineer Sha Shengyang walks you through deploying and configuring a Flink development environment and running applications in different modes, demonstrates how to install and use Flink quickly and correctly, and covers problems you may encounter in practice together with their solutions.

1.4 DataStream API Programming

Keywords: community celebrity, simple, easy to understand

The fourth class moves into actual development. The DataStream API is a core part of Flink. This session is presented by Cui Xingcan, Apache Flink Committer and postdoctoral fellow at York University, Canada. It introduces the basics of the DataStream API, including its concepts and design, demonstrates practical development techniques with examples, and analyzes parts of its source code.

1.5 Client Operations

Keywords: a comprehensive guide to client operations

The fifth class is a comprehensive guide to client operations. Zhou Kai Bo (Baoniu), a technical expert at Alibaba, demonstrates Flink client operations on video, covering the demo environment, the interface, the Flink command line, and Flink’s five job submission methods, laying a solid foundation for later development.

1.6 Window & Time

Keywords: window data flow

The sixth class introduces window-related concepts. It is presented by Qiu Congxian, senior development engineer at Alibaba. He explains the basic concepts and core components of windows and how to handle out-of-order and late data, and walks through the whole window data flow with code.
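To make "out-of-order data" and "late data" concrete before watching the class, here is a minimal, framework-free sketch in plain Python. It is not Flink's API and not the code used in the course: the function name `tumbling_window_counts`, the parameters `window_size` and `max_out_of_orderness`, and the simplified watermark rule (largest event time seen so far minus the allowed delay) are all assumptions made for illustration.

```python
from collections import defaultdict


def tumbling_window_counts(events, window_size, max_out_of_orderness):
    """Count events per key in fixed-size (tumbling) event-time windows.

    events: iterable of (key, event_time) pairs in arrival order.
    Returns (results, late): results is a list of (window_start, key, count)
    in the order windows fire; late holds events whose window had already
    fired when they arrived.
    """
    open_windows = defaultdict(lambda: defaultdict(int))  # start -> key -> count
    results, late = [], []
    watermark = float("-inf")  # hypothetical, simplified watermark

    for key, ts in events:
        start = ts - ts % window_size
        if start + window_size <= watermark:
            # The window this event belongs to has already fired: late data.
            late.append((key, ts))
        else:
            # Out-of-order but still in time: the window is open, so count it.
            open_windows[start][key] += 1

        # Advance the watermark; it never moves backwards.
        watermark = max(watermark, ts - max_out_of_orderness)

        # Fire every window whose end is at or before the watermark.
        ready = sorted(w for w in open_windows if w + window_size <= watermark)
        for w in ready:
            for k, c in sorted(open_windows.pop(w).items()):
                results.append((w, k, c))

    # End of input: flush whatever is still open.
    for w in sorted(open_windows):
        for k, c in sorted(open_windows[w].items()):
            results.append((w, k, c))
    return results, late


events = [("a", 1), ("a", 4), ("b", 12), ("a", 3), ("a", 16), ("a", 2), ("b", 21)]
results, late = tumbling_window_counts(events, window_size=10, max_out_of_orderness=5)
```

In this run the event `("a", 3)` arrives after `("b", 12)` but is still counted, because the watermark has not yet passed the end of its window. The event `("a", 2)` arrives after that window has fired and is reported as late instead.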

1.7 State management and fault tolerance mechanism

Keywords: must-attend class

The seventh class is presented by Sun Mengyao, an R&D engineer at Meituan-Dianping. She covers the basic concepts of state management, the types of Flink state with usage examples, and the fault tolerance and failure recovery mechanisms, and offers advice on how enterprises should choose state types and state storage.
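The idea behind state plus fault tolerance can be sketched without any framework: an operator keeps per-key state, a checkpoint stores that state together with the input position, and recovery restores both and replays from that position. The sketch below is plain Python under those assumptions. The class `CountingOperator`, its methods, and the helper `run_until` are hypothetical names for illustration and are not Flink's API.

```python
import copy


class CountingOperator:
    """Keeps a running count per key plus the offset of the next record to read."""

    def __init__(self):
        self.counts = {}  # keyed state: key -> running count
        self.offset = 0   # position in the (replayable) source

    def process(self, key):
        self.counts[key] = self.counts.get(key, 0) + 1
        self.offset += 1

    def snapshot(self):
        # A checkpoint captures the state AND the input position together.
        return {"counts": copy.deepcopy(self.counts), "offset": self.offset}

    def restore(self, checkpoint):
        self.counts = copy.deepcopy(checkpoint["counts"])
        self.offset = checkpoint["offset"]


def run_until(op, source, end):
    """Feed records from op.offset up to (not including) index `end`."""
    while op.offset < min(end, len(source)):
        op.process(source[op.offset])


source = ["a", "b", "a", "c", "a"]

op = CountingOperator()
run_until(op, source, 3)
checkpoint = op.snapshot()   # state after the first three records
run_until(op, source, 4)     # this progress is lost in the simulated crash

recovered = CountingOperator()
recovered.restore(checkpoint)            # back to the checkpointed state
run_until(recovered, source, len(source))  # replay from the saved offset
```

Because the counts and the offset are saved together, the record processed after the checkpoint is replayed exactly once on recovery, and the final counts match a run that never failed. This relies on the source being replayable, which is an assumption of the sketch.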

1.8 Flink Table API programming

Keywords: a core part of Flink SQL performance

The Table API is a core part of Flink SQL performance. This class is presented by Cheng Hequn, Apache Flink Contributor and senior R&D engineer at Alibaba. Along with the basic concepts and features of the Table API, it demonstrates Table API programming, a WordCount example, and Table API operation code, and shares recent Table API developments in the community.

1.9 Flink SQL programming

Keywords: became a Committer in one year, “first stop for Flink learning blogs”

The last class of the Apache Flink series is presented by Chong Wu (Yunxie), Apache Flink Committer and senior development engineer at Alibaba. It covers how to run SQL queries on streams, how to use the SQL CLI client, how to consume Kafka data with the SQL CLI, and how to write results to Kafka and Elasticsearch with SQL.

What can you gain?

Go from zero to one: understand Flink, build a systematic Flink knowledge framework, and lay a foundation for learning big data engines.

  • Flink is a distributed, high-performance, highly available, and accurate open source stream processing framework for data streaming applications. The course will help you appreciate the beauty of computing.

  • The course focuses on principle analysis and basic application. Through detailed analysis of Flink's stream computing concepts, technical principles, and hands-on operations, it gives you a deep understanding of Flink starting from the most practical application scenarios and helps you grow from a Flink novice into a Flink expert.

The course content includes both experience shared by enterprise Flink users and theory and practice from Flink core developers. It offers both broad sharing and in-depth discussion, making it a must-have for big data enthusiasts learning Flink! (Bazhen, senior product expert at Alibaba)

From the latest news pushed by the media, to real-time data for big shopping festivals, to city-level brain computing, real-time computing has been applied to many areas of life and work. As businesses grow rapidly, enterprises place ever higher demands on big data processing, and Flink is being applied more and more widely. We believe that in the near future Flink will become the mainstream big data processing framework for enterprises of all sizes across industries, and eventually the standard for the next generation of big data processing frameworks. The earlier you learn it, the better you can seize the opportunity of the times.

How to download

Watch the series