1. Introduction to Flink

Apache Flink originated as Stratosphere, a research project at the Technical University of Berlin. The project was donated to the Apache Software Foundation in 2014, entered incubation, and became a top-level Apache project later that year. In January 2019, Alibaba acquired Data Artisans, the company founded by Flink's original creators, and announced that it would open-source Blink, Alibaba's internal fork of Flink. Blink adds a large number of features on top of Flink along with many performance and stability optimizations, and it has been tested against a variety of complex production workloads inside Alibaba. Alibaba also stated that it would gradually merge these features back into the community version of Flink. Flink has since become one of the most popular frameworks for big data processing.

Simply put, Flink is a distributed stream processing framework that can efficiently process both bounded and unbounded data streams. Flink's core is stream processing, but it also supports batch processing, which it treats as a special case of stream processing: a data stream with clear boundaries. This is the opposite of Spark Streaming's design, whose core is batch processing and which treats stream processing as a special case of batch processing by splitting the stream into micro-batches of minimal granularity.
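The "batch is a special case of streaming" idea can be illustrated with a small language-agnostic sketch (plain Python, not Flink code): the same per-record logic runs over a bounded source (a finite list) and an unbounded one (an endless generator); the only difference is whether the source ever ends.

```python
import itertools

def word_lengths(stream):
    """Per-record transformation; works for bounded and unbounded sources."""
    for word in stream:
        yield len(word)

# Bounded stream: a finite collection with a clear boundary (i.e. "batch").
bounded = ["flink", "spark", "storm"]
print(list(word_lengths(bounded)))          # [5, 5, 5]

# Unbounded stream: a source that never ends; we can only ever take a prefix.
def unbounded():
    for i in itertools.count():
        yield f"event-{i}"

first_three = itertools.islice(word_lengths(unbounded()), 3)
print(list(first_three))                    # [7, 7, 7]
```

The bounded case can be fully materialized, while the unbounded case can only ever be consumed incrementally; the processing logic itself is identical.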

Flink's bounded and unbounded data streams:

Spark Streaming:

2. Flink Core Architecture

Flink adopts a layered architecture in which each layer has clear functions and responsibilities. From top to bottom, the layers shown in the following figure are the API & Libraries layer, the Runtime core layer, and the physical deployment layer:

2.1 API & Libraries layer

This layer mainly provides the programming API and the top-level class library:

  • Programming APIs: the DataStream API for stream processing and the DataSet API for batch processing;
  • Top-level class libraries: the CEP library for complex event processing; the SQL & Table libraries for structured data queries; as well as the batch-oriented machine learning library FlinkML and the graph processing library Gelly.

2.2 Runtime core layer

This layer is the core implementation of Flink's distributed computing framework, covering job graph translation, task scheduling, resource allocation, task execution, and other functions. Thanks to this layer, both stream processing and batch programs can run on the same streaming engine.

2.3 Physical Deployment Layer

Flink’s physical deployment layer is used to support the deployment and running of Flink applications on different platforms.

3. Flink Layered API

Within the API & Libraries layer introduced above, Flink draws a finer-grained division, as follows:

In this hierarchy, the level of abstraction and conciseness increases from bottom to top, while expressiveness and fine-grained control decrease from bottom to top. The core functions of each layer are as follows:

3.1 SQL & Table API

The SQL & Table API works for both batch and stream processing, which means you can query bounded and unbounded streams with the same semantics and get the same results. Beyond basic queries, it also supports user-defined scalar, aggregate, and table functions to cover a wide range of query requirements.

3.2 DataStream & DataSet API

The DataStream & DataSet APIs are the core APIs for data processing in Flink. They can be used from Java or Scala and encapsulate a series of common operations such as reading, transforming, and writing data.
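The fluent read-transform-write shape of this API can be mimicked with a toy pipeline class (purely illustrative Python, not the real Flink API; the class and method names are hypothetical stand-ins for DataStream-style chaining):

```python
class DataStream:
    """Toy fluent pipeline mimicking the shape of Flink's DataStream API
    (map/filter chaining over a lazy iterator); illustrative only."""
    def __init__(self, it):
        self._it = iter(it)

    def map(self, fn):
        # data conversion: apply fn to every record
        return DataStream(fn(x) for x in self._it)

    def filter(self, pred):
        # keep only records satisfying the predicate
        return DataStream(x for x in self._it if pred(x))

    def collect(self):
        # data output: a real sink would write to an external system
        return list(self._it)

out = (DataStream(["1", "2", "3", "4"])   # data reading
       .map(int)
       .filter(lambda x: x % 2 == 0)
       .map(lambda x: x * 10)
       .collect())
print(out)  # [20, 40]
```

As in Flink, each operation is lazy: nothing is computed until the terminal `collect()` consumes the pipeline.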

3.3 Stateful Stream Processing

Stateful Stream Processing is the lowest level of abstraction; it is embedded into the DataStream API through the Process Function. The Process Function is the lowest-level API Flink provides and offers the greatest flexibility, giving developers fine-grained control over time and state.
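The essence of a process function can be sketched as follows (a minimal Python model, not Flink's actual interface: the framework invokes a per-record callback, and the function maintains keyed state between invocations; in real Flink, the runtime manages and checkpoints this state):

```python
class CountProcessFunction:
    """Sketch of ProcessFunction-style semantics: process_element is
    called once per record and can read/update keyed state."""
    def __init__(self):
        self.state = {}  # keyed state: key -> running count

    def process_element(self, key, value):
        # update the state for this key and emit a result downstream
        self.state[key] = self.state.get(key, 0) + 1
        return (key, self.state[key])

fn = CountProcessFunction()
out = [fn.process_element(k, v) for k, v in
       [("a", 1), ("b", 7), ("a", 3)]]
print(out)  # [('a', 1), ('b', 1), ('a', 2)]
```

Because the state survives across records, the function can implement logic (running counts, deduplication, custom windows) that a stateless map or filter cannot.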

4. Flink Cluster Architecture

4.1 Core Components

As introduced above, the second layer of Flink's core architecture is the Runtime layer, which follows a standard master-slave structure. The master side consists of three core components: the Dispatcher, the ResourceManager, and the JobManager; the slave side is the TaskManager process. Their functions are as follows:

  • JobManagers (also known as masters): A JobManager receives from the Dispatcher the JobGraph (the logical dataflow graph) to execute, together with all of the job's class files, third-party libraries, and so on. It converts the JobGraph into an ExecutionGraph and requests resources from the ResourceManager to execute the tasks; once resources are obtained, it distributes the ExecutionGraph to the TaskManagers. Every job has at least one JobManager; in a highly available deployment there can be several, one of which is elected leader while the rest remain in standby.
  • TaskManagers (also known as workers): TaskManagers execute the actual subtasks. Each TaskManager has a number of slots, where a slot is a fixed-size set of resources (such as computing power and storage space). After a TaskManager starts, it registers its slots with the ResourceManager, which manages all slots in a unified manner.
  • Dispatcher: Receives jobs submitted by clients and passes them on to a JobManager. It also provides a web UI for monitoring job execution.
  • ResourceManager: Manages slots and coordinates cluster resources. The ResourceManager receives resource requests from the JobManager and assigns TaskManagers with available slots to execute tasks. Flink provides different ResourceManager implementations for different deployment platforms, such as YARN, Mesos, and Kubernetes. When the TaskManagers do not have enough slots to execute a job, the ResourceManager can ask the underlying platform to provision additional resources.
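The slot bookkeeping between these components can be sketched as a toy simulation (hypothetical class and method names; the real protocol is RPC-based and far richer): TaskManagers register their slots with the ResourceManager, and the JobManager asks it for slots when a job starts.

```python
class ResourceManager:
    """Toy model of Flink's slot bookkeeping; illustrative only."""
    def __init__(self):
        self.free_slots = []  # (task_manager_id, slot_index)

    def register(self, tm_id, num_slots):
        # TaskManagers register their slots on startup
        self.free_slots += [(tm_id, i) for i in range(num_slots)]

    def request_slots(self, n):
        # JobManager asks for n slots; allocate them if available
        if len(self.free_slots) < n:
            raise RuntimeError("not enough slots; would ask the platform for more")
        granted, self.free_slots = self.free_slots[:n], self.free_slots[n:]
        return granted

rm = ResourceManager()
rm.register("tm-1", 2)          # two TaskManagers, two slots each
rm.register("tm-2", 2)
print(rm.request_slots(3))      # [('tm-1', 0), ('tm-1', 1), ('tm-2', 0)]
print(len(rm.free_slots))       # 1 slot left unallocated
```

The `RuntimeError` branch corresponds to the case described above where the ResourceManager must turn to the deployment platform for extra resources.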

4.2 Task & SubTask

As mentioned above, TaskManagers actually execute subtasks rather than tasks. The difference is as follows:

When executing a job as a distributed computation, Flink chains together operators that can be chained; such a chain is called a Task. Chaining reduces the overhead of thread switching and buffering, improving overall throughput while reducing latency. However, not all operators can be chained: an operation such as keyBy triggers a network shuffle and repartitioning, so it cannot be chained with its neighbors and forms a separate task. Simply put, a Task is a maximal chain of linkable operators (an operator chain). As shown below, the source and map operators are chained together, so there are only three tasks:

A SubTask is one parallel slice of a Task. As shown in the figure above, source & map has a parallelism of 2, keyBy has a parallelism of 2, and sink has a parallelism of 1; therefore, although there are only 3 tasks in total, there are 5 subtasks. The JobManager is responsible for splitting tasks into these subtasks and handing them to TaskManagers for execution, with each SubTask running in a separate thread.
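The task/subtask arithmetic above can be reproduced with a small sketch (illustrative Python; the operator list and the `starts_new_chain` flag are simplifications of Flink's actual chaining rules):

```python
# Each operator: (name, parallelism, starts_new_chain). A network shuffle
# (e.g. keyBy) or a parallelism change breaks the chain. Values mirror
# the example above.
operators = [
    ("source", 2, True),   # first operator always starts a chain
    ("map",    2, False),  # same parallelism, no shuffle: chains with source
    ("keyBy",  2, True),   # shuffle: must start a new task
    ("sink",   1, True),   # parallelism change: new task
]

tasks = []  # each task: (list of chained operator names, parallelism)
for name, parallelism, new_chain in operators:
    if new_chain or not tasks:
        tasks.append(([name], parallelism))
    else:
        tasks[-1][0].append(name)

subtasks = sum(p for _, p in tasks)
print(len(tasks))  # 3 tasks: source+map, keyBy, sink
print(subtasks)    # 5 subtasks (2 + 2 + 1)
```

A task's subtask count is simply its parallelism, which is why 3 tasks expand into 5 threads of execution.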

4.3 Resource Management

Now that we understand SubTasks, let’s look at how they correspond to Slots. One possible distribution is as follows:

Each SubTask thread runs in a separate TaskSlot, sharing the TCP connections (via multiplexing) and heartbeat messages of the TaskManager process to which it belongs, which reduces overall overhead. This may look optimal, but different operations require different amounts of resources. Suppose the keyBy operation of this job needs far more resources than the Sink; then the resources in the slot where the Sink runs are not effectively utilized.

For this reason, Flink allows multiple subtasks to share a slot, even subtasks of different tasks, as long as they belong to the same Job. If the parallelism of source & map and keyBy is raised to 6 while the number of slots stays the same, the situation is as follows:

You can see multiple SubTasks running in one Task Slot; each SubTask still executes in its own thread but shares the slot's resources. So how does Flink determine how many slots a Job needs? Flink's answer is simple: by default, the number of slots a Job requires equals the maximum parallelism among its operations. In the example below, operations A, B, and D have a parallelism of 4, while C and E have a parallelism of 2, so the whole Job needs at least 4 slots. With this mechanism, Flink does not need to care how many Tasks and SubTasks a Job is split into.
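The default slot requirement therefore reduces to a one-line computation (illustrative sketch; operator names and parallelism values mirror the example above):

```python
# Required slots for a job = maximum parallelism across its operators
# (Flink's default with slot sharing enabled).
parallelism = {"A": 4, "B": 4, "C": 2, "D": 4, "E": 2}
required_slots = max(parallelism.values())
print(required_slots)  # 4
```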

4.4 Component Communication

All Flink components communicate via an actor system (Akka). An actor system is a container for actors playing various roles; it provides scheduling, configuration, logging, and many other services, and contains a thread pool from which actors are started. Messages are delivered through RPC calls.
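The core actor idea can be shown in a few lines (a toy Python model using a queue and a thread, not Akka or Flink's RPC layer): each actor owns a mailbox drained by a single thread, so messages are processed one at a time and senders never block.

```python
import queue
import threading

class Actor:
    """Minimal actor: a mailbox drained by one dedicated thread."""
    def __init__(self, handler):
        self.mailbox = queue.Queue()
        self.handler = handler
        threading.Thread(target=self._run, daemon=True).start()

    def tell(self, msg):
        self.mailbox.put(msg)  # asynchronous, fire-and-forget send

    def _run(self):
        while True:
            self.handler(self.mailbox.get())  # one message at a time

received = []
done = threading.Event()

def handler(msg):
    received.append(msg)
    if msg == "stop":
        done.set()

a = Actor(handler)
for m in ["heartbeat", "status", "stop"]:
    a.tell(m)
done.wait(timeout=5)
print(received)  # ['heartbeat', 'status', 'stop']
```

Because only the mailbox thread touches the actor's state, no explicit locking is needed, which is the main appeal of the model for coordinating distributed components.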

5. Advantages of Flink

Finally, based on the above introduction, the advantages of Flink can be summarized as follows:

  • Flink supports event-driven applications and handles both stream and batch processing;
  • Memory-based computing ensures high throughput and low latency, with excellent performance;
  • Exactly-once semantics are supported, guaranteeing consistency and correctness;
  • The layered API can meet development needs at every level of abstraction;
  • High-availability configuration and the savepoint mechanism ensure safety and stability;
  • Diverse deployment modes support local, remote, cloud, and other deployment schemes;
  • The horizontally scalable architecture can expand dynamically according to user needs;
  • The framework is backed by a highly active community and a complete ecosystem.

