
Main text

Client

The machine from which a Flink job is submitted is called the Client.

The program code written by the user builds the dataflow graph, and the Client submits that graph to the JobManager.
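
To make the "build a graph, then submit it" idea concrete, here is a minimal sketch in plain Python. It does not use the Flink API: `DataflowGraph`, `build_word_count_graph`, `submit`, and the `"SubmitJob"` message name are all hypothetical names made up for illustration. The point is only that the user program describes the computation, and the Client hands the finished description to the JobManager.

```python
from dataclasses import dataclass, field


@dataclass
class DataflowGraph:
    """Toy stand-in for the graph a user program builds; not a Flink class."""
    operators: list = field(default_factory=list)  # operator names, in creation order
    edges: list = field(default_factory=list)      # (upstream, downstream) pairs

    def add_operator(self, name, upstream=None):
        self.operators.append(name)
        if upstream is not None:
            self.edges.append((upstream, name))
        return name


def build_word_count_graph():
    # The user program only describes the computation; nothing runs here.
    graph = DataflowGraph()
    source = graph.add_operator("source")
    tokens = graph.add_operator("flat_map", upstream=source)
    counts = graph.add_operator("key_by_sum", upstream=tokens)
    graph.add_operator("sink", upstream=counts)
    return graph


def submit(graph, job_manager_inbox):
    # Hypothetical Client step: hand the finished graph to the JobManager.
    job_manager_inbox.append(("SubmitJob", graph))


inbox = []  # stands in for the JobManager's incoming message queue
submit(build_word_count_graph(), inbox)
print(inbox[0][0], inbox[0][1].edges)
```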

JobManager

The master node, comparable to the ResourceManager in YARN. High availability (HA) can be configured for production environments.

The JobManager splits a job into tasks and dispatches them to the TaskManagers.
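
As a rough illustration of "split and dispatch", the sketch below expands each operator into parallel subtasks and hands them out round-robin. This is a deliberate simplification: real Flink scheduling is based on task slots and is more involved. The function names and the `tm-1`/`tm-2` identifiers are assumptions for this example only.

```python
from collections import defaultdict
from itertools import cycle


def split_into_tasks(operators, parallelism):
    """Expand each operator into `parallelism` subtasks, e.g. 'map' -> 'map[0]', 'map[1]'."""
    return [f"{op}[{i}]" for op in operators for i in range(parallelism)]


def dispatch(tasks, task_managers):
    """Assign tasks to TaskManagers round-robin (a simplification of real scheduling)."""
    assignment = defaultdict(list)
    for task, tm in zip(tasks, cycle(task_managers)):
        assignment[tm].append(task)
    return dict(assignment)


tasks = split_into_tasks(["source", "map", "sink"], parallelism=2)
plan = dispatch(tasks, ["tm-1", "tm-2"])
for tm, assigned in plan.items():
    print(tm, assigned)
```

With two TaskManagers and a parallelism of 2, each TaskManager ends up with one subtask of every operator.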

TaskManager

The worker (slave) nodes. TaskManagers are the components that actually execute the tasks.

Communication

Client -> JobManager

When the Client submits a job, it communicates with the JobManager through the Akka library. The Client also uses the Netty framework to exchange data with the JobManager.

Akka communication is built on the actor system: components exchange messages rather than calling each other directly. The Client can send the JobManager instructions such as submit job, cancel job, or update job.
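
The actor idea can be sketched without Akka at all. Below is a toy actor in plain Python: its state is private, and the only way to change it is to put a message in its mailbox. The class name and the message names (`"SubmitJob"`, `"CancelJob"`) are hypothetical and are not Flink's or Akka's real message types.

```python
from collections import deque


class JobManagerActor:
    """Toy actor: state changes only by processing messages from the mailbox."""

    def __init__(self):
        self.mailbox = deque()
        self.jobs = {}  # job_id -> status

    def tell(self, message):
        # Senders never call handlers directly; they only enqueue messages.
        self.mailbox.append(message)

    def process_all(self):
        """Handle queued messages in arrival order and collect the replies."""
        replies = []
        while self.mailbox:
            kind, job_id = self.mailbox.popleft()
            if kind == "SubmitJob":
                self.jobs[job_id] = "RUNNING"
            elif kind == "CancelJob" and job_id in self.jobs:
                self.jobs[job_id] = "CANCELED"
            else:
                # Unknown message type, or a cancel for a job we never saw.
                replies.append(("Unknown", job_id))
                continue
            replies.append((self.jobs[job_id], job_id))
        return replies


jm = JobManagerActor()
jm.tell(("SubmitJob", "job-1"))
jm.tell(("CancelJob", "job-1"))
replies = jm.process_all()
print(replies)
```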

JobManager -> Client

The JobManager can also send information back to the Client, such as status updates, statistics, and results.

JobManager -> TaskManager

After the Client submits a job, the JobManager splits it into tasks and submits them to the TaskManagers (the workers).

The JobManager also communicates with the TaskManagers through Akka. It sends commands such as deploy, stop, or cancel tasks, and it triggers checkpoints. In the other direction, each TaskManager reports task status, heartbeats, statistics, and so on back to the JobManager.
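
Heartbeats are what let the JobManager notice that a TaskManager has gone away. Here is a minimal sketch of that bookkeeping, assuming a simple "no heartbeat within N seconds means dead" rule. `HeartbeatMonitor` and its methods are hypothetical names, and timestamps are passed in explicitly so the example is deterministic.

```python
class HeartbeatMonitor:
    """Tracks the last heartbeat time per TaskManager (toy model, not Flink code)."""

    def __init__(self, timeout_seconds):
        self.timeout_seconds = timeout_seconds
        self.last_seen = {}  # task_manager_id -> time of last heartbeat

    def record_heartbeat(self, task_manager_id, now):
        self.last_seen[task_manager_id] = now

    def dead_task_managers(self, now):
        """Return the ids whose last heartbeat is older than the timeout."""
        return sorted(
            tm for tm, seen in self.last_seen.items()
            if now - seen > self.timeout_seconds
        )


monitor = HeartbeatMonitor(timeout_seconds=10)
monitor.record_heartbeat("tm-1", now=0)
monitor.record_heartbeat("tm-2", now=8)
print(monitor.dead_task_managers(now=12))
```

At time 12, `tm-1` has been silent for 12 seconds and is reported, while `tm-2` was last seen 4 seconds ago and is still considered alive.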

TaskManager1 -> TaskManager2

Data is also exchanged between TaskManagers over the network. As operators process a data stream, the output of an operator running on one TaskManager often has to be sent to the next operator running on another TaskManager.
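
One common reason records cross TaskManagers is partitioning by key: every record with the same key must reach the same downstream subtask. The sketch below uses a stable CRC32 hash modulo the parallelism. This is an assumption chosen for simplicity, and Flink's actual key-to-subtask assignment is more involved. The names `target_subtask` and `shuffle` are hypothetical.

```python
import zlib
from collections import defaultdict


def target_subtask(key, parallelism):
    """Pick the downstream subtask for a key using a stable hash."""
    return zlib.crc32(key.encode("utf-8")) % parallelism


def shuffle(records, parallelism):
    """Group (key, value) records into one outgoing buffer per downstream subtask."""
    buffers = defaultdict(list)
    for key, value in records:
        buffers[target_subtask(key, parallelism)].append((key, value))
    return dict(buffers)


records = [("flink", 1), ("akka", 1), ("flink", 1), ("netty", 1)]
for subtask, buffer in sorted(shuffle(records, parallelism=2).items()):
    print(subtask, buffer)
```

Whatever the hash values turn out to be, both `"flink"` records land in the same buffer, which is the property a keyed operator relies on.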

Conclusion

When a Flink cluster starts, it first launches a JobManager and one or more TaskManagers.

The JobManager coordinates the Flink system, and the TaskManagers are the workers that execute the parallel program.

When the system is started locally, a JobManager and a TaskManager are started in the same JVM.

When an application is submitted, the system creates a Client that preprocesses the application and turns it into a parallel dataflow for the JobManager and TaskManagers to execute.