This is the third day of my participation in the Gwen Challenge.

Author: Gao Er egg

Source: Hang Seng LIGHT Cloud Community

With the rise of short video platforms such as Douyin and Kuaishou, stream computing has come into public view. Companies use stream computing to derive user preferences from user behavior and feed them into the recommendation model within a short time. The model then picks up those preferences with low latency and so provides more accurate and timely recommendations, which is why we cannot stop scrolling Douyin. Here is a look at the past and present of stream data processing:

First:

Features: good real-time performance, but it copes poorly with massive data volumes and high concurrency;

Second:

Features: high concurrency is achieved, but low latency is not;

Third (the original first-generation stream processing architecture):

  • Features: whatever the current computation needs is not looked up in a relational database; it is simply kept in local state;
  • Disadvantages: it cannot guarantee that data is processed in order;
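A minimal sketch of the idea above, assuming a hypothetical operator that counts clicks per user and remembers the last item each user viewed. The class name, the event fields (`user`, `item`, `ts`) and the sample data are all made up for illustration. Everything the computation needs lives in in-process dictionaries rather than in a relational database, and because events are handled strictly in arrival order, a late event overwrites newer state:

```python
class LocalStateOperator:
    """Hypothetical stream operator that keeps its working data in local state."""

    def __init__(self):
        # Local state: plain in-memory dicts, no database lookups.
        self.click_count = {}
        self.last_item = {}

    def process(self, event):
        user = event["user"]
        # Update the running count from local state.
        self.click_count[user] = self.click_count.get(user, 0) + 1
        # Overwritten in arrival order; the event timestamp is ignored.
        self.last_item[user] = event["item"]
        return user, self.click_count[user], self.last_item[user]


# The event with ts=2 arrives before the event with ts=1 (out of order).
events = [
    {"user": "u1", "item": "video_b", "ts": 2},
    {"user": "u1", "item": "video_a", "ts": 1},
]

op = LocalStateOperator()
outputs = [op.process(e) for e in events]

# Latest item according to event time, for comparison with the local state.
latest_by_event_time = max(events, key=lambda e: e["ts"])["item"]
```

Here `op.last_item["u1"]` ends up as `video_a`, while the latest item by event time is `video_b`. This is the ordering problem named in the disadvantage.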

Fourth:

Advantages: real-time, low latency;

Disadvantages: harder to maintain, because two systems have to be kept running;

Note: the speed table first provides approximate data for rough queries; accurate data is then obtained by combining it with the batch table.
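The note above can be sketched roughly as follows. This assumes a simple counting workload in which the batch table holds exact counts up to the last batch run and the speed table holds counts for events that arrived since then. The table contents, the video IDs and the `query_views` helper are hypothetical:

```python
def query_views(video_id, batch_table, speed_table):
    """Combine the batch table's count with the speed table's recent count."""
    counted_by_batch = batch_table.get(video_id, 0)
    counted_since_batch = speed_table.get(video_id, 0)
    return counted_by_batch + counted_since_batch


# Hypothetical view counts maintained by the two systems.
batch_table = {"v1": 1000, "v2": 250}  # exact, but only up to the last batch run
speed_table = {"v1": 12, "v3": 4}      # recent events only, updated with low latency

# Rough query: read the speed table alone.
rough = speed_table.get("v1", 0)

# Combined query: merge both tables at read time.
combined = query_views("v1", batch_table, speed_table)
```

The read path has to know about both tables, which hints at why keeping two systems in step adds maintenance work.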

Fifth:

Use Flink for stream processing, which combines all of the advantages above.