Exploration and Practice of the Convenience Bee Intelligent Production Strategy Platform

Introduction

What is a strategy

Every major Internet company has a very important role: the strategy (algorithm) engineer. We know that front-end and back-end engineers collaborate to build the websites and apps that users interact with, but what do strategy engineers do?

To answer that, start with what a strategy is. Baidu Baike explains it as follows:

Strategy means a stratagem or plan. It generally refers to:

A collection of plans that can achieve a goal;

A course of action and methods of struggle drawn up according to how the situation develops;

Skill in the art of struggle, with attention to ways and means.

For example, in the fresh food section of Convenience Bee, we sell hot steamed buns (baozi), such as fresh meat buns, cream buns, shaomai, brown sugar steamed buns, and vegetable tuanzi. We also regularly update the flavors according to customers’ preferences and other factors.

Suppose there are 30 kinds of steamed buns but, because of display space and other factors, only 20 of them can be put on sale. How do we choose?

A strategy focuses on several key points:

What is the optimization objective: sales of steamed buns
What are the constraints: at most 20 kinds of baozi can be selected
What are the control variables: the type and quantity of baozi
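
The three key points above can be sketched in a few lines. This is a minimal illustration, assuming an expected-sales estimate per bun type is already available; the function name `select_buns` and the demand figures are hypothetical, and ranking by expected sales is only the simplest way to respect the limit, not the method Convenience Bee uses.

```python
from typing import Dict, List


def select_buns(expected_sales: Dict[str, float], max_types: int = 20) -> List[str]:
    """Pick at most `max_types` bun types with the highest expected sales."""
    # Objective: sales. Constraint: max_types. Control variable: which types to keep.
    ranked = sorted(expected_sales, key=lambda name: expected_sales[name], reverse=True)
    return ranked[:max_types]


# Hypothetical demand figures, for illustration only.
demand = {
    "fresh meat bun": 120.0,
    "cream bun": 45.0,
    "shaomai": 80.0,
    "brown sugar steamed bun": 30.0,
}
chosen = select_buns(demand, max_types=2)
print(chosen)
```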

Three phases of strategy

Note: The following walks through how a strategy develops, to help you understand how strategies “evolve”. It is not what Convenience Bee actually uses.

Manual rules. Strategies are formed from a small number of samples and a summary of human experience.

Strategy 0.1: “artificial” intelligence

Shop assistants or operators choose the type and quantity of baozi to order for each store based on their own experience.

For example, the types and quantities for each day this week equal those of the same day last week. If there is extra information (such as nearby schools reopening), the order is adjusted accordingly.

With this approach, some bad cases can occur:
- Last week’s quantity was enough, but this week’s is not (or vice versa)
- A type of bun that used to sell well suddenly stops selling this week
- The clerk or operator forgets, or has no time, to make adjustments and ends up ordering too much or too little
- As the number of stores grows, manual ordering stops being sustainable (at one minute per store, compare 100 stores with 1,000)
- ...

If people cannot keep up, let a machine do it, which leads to the next strategy.

Strategy 1.0: classify buns as meat or vegetable and select by proportion

For example, choose five meat buns, five half-meat buns, and five vegetarian buns, an equal split.
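
A minimal sketch of such a quota rule, assuming each bun carries a category label and the catalogue is already ordered by preference. The function name `select_by_quota`, the category labels, and the sample catalogue are hypothetical.

```python
from typing import Dict, List, Tuple


def select_by_quota(buns: List[Tuple[str, str]], quota: Dict[str, int]) -> List[str]:
    """Keep buns in input order until each category's quota is filled."""
    taken = {category: 0 for category in quota}
    selected = []
    for name, category in buns:
        if category in quota and taken[category] < quota[category]:
            selected.append(name)
            taken[category] += 1
    return selected


# Hypothetical catalogue: (name, category), already ordered by preference.
catalogue = [
    ("fresh meat bun", "meat"),
    ("beef bun", "meat"),
    ("pork and cabbage bun", "half-meat"),
    ("vegetable tuanzi", "vegetarian"),
    ("brown sugar steamed bun", "vegetarian"),
]
picked = select_by_quota(catalogue, {"meat": 1, "half-meat": 1, "vegetarian": 1})
print(picked)
```

Because the quota is the same for every store, this rule cannot adapt to the neighbourhood a store serves.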

Under this strategy, new bad cases appear:
- The office buildings around one store are full of IT workers, who prefer meat and half-meat buns
- Another store has a gym nearby, and fitness enthusiasts strongly prefer vegetarian buns
- ...

Statistical analysis. Based on a certain amount of historical samples, the optimal solution for simple variables is obtained through statistical analysis.

Strategy 2.0: predict the future from history

Rationale: assuming the customer flow around a store and its characteristics are stable, the future can be predicted from the historical situation.

Count the sales of each kind of bun on weekdays and rest days over the past week, and select the types and quantities accordingly.
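
A minimal sketch of that statistic, assuming the sales history is available as (day, bun, units) records. The function name `average_daily_sales` and the sample records are hypothetical, and treating Saturday and Sunday as the only rest days is a simplifying assumption that ignores public holidays.

```python
from collections import defaultdict
from datetime import date
from typing import Dict, Iterable, Set, Tuple

Sale = Tuple[date, str, float]  # (day, bun name, units sold)


def average_daily_sales(history: Iterable[Sale]) -> Dict[Tuple[str, str], float]:
    """Average units sold per day, keyed by (day type, bun name)."""
    totals: Dict[Tuple[str, str], float] = defaultdict(float)
    days: Dict[Tuple[str, str], Set[date]] = defaultdict(set)
    for day, bun, units in history:
        # Assumption: Saturday and Sunday are rest days; holidays are ignored.
        day_type = "rest" if day.weekday() >= 5 else "weekday"
        totals[(day_type, bun)] += units
        days[(day_type, bun)].add(day)
    return {key: totals[key] / len(days[key]) for key in totals}


history = [
    (date(2021, 3, 1), "fresh meat bun", 100),  # Monday
    (date(2021, 3, 2), "fresh meat bun", 120),  # Tuesday
    (date(2021, 3, 6), "fresh meat bun", 60),   # Saturday
]
averages = average_daily_sales(history)
print(averages)
```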

Special information still gets special treatment.

This seems to leave fewer bad cases, but in reality demand still changes in various ways.

For example, although I love fresh meat buns, if I eat them every day for a few weeks I get bored and want a change. If you are like me, you will see that even when a store’s customers stay the same, the total demand for each type of bun differs from day to day.

Likewise, if a store only ever sells certain flavors, customers gradually develop taste fatigue, so new flavors need to be introduced. Which flavor to replace, and how much of it, are things the strategy (algorithm) has to work out.

Optimization algorithms (including machine learning, operations research, and so on). For a well-defined goal, an optimization algorithm finds an approximately optimal solution and handles many variables easily.

Strategy 3.0: product selection with operations research algorithms

Through business analysis, the factors with the largest impact are taken as variables, the constraints are refined, an optimization objective is set, and an operations research algorithm is used to find an approximately optimal solution.

Simple business problems can be solved with the statistical analysis described above. More complex ones require optimization algorithms.

For example, suppose a task can be done in three ways: A, B, and C. If there is only this one task, statistical analysis is enough to evaluate and choose. But if there are 100 tasks, each with these three options, the number of combinations (3^100) is enormous. Simple statistical analysis cannot find a good solution in a limited time, so optimization algorithms are needed to solve the problem efficiently.
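
To make the combinatorial growth concrete, here is a small sketch that enumerates every assignment for a toy problem. It assumes each option has a cost and a gain and that all tasks share one budget; that shared constraint is a hypothetical addition, chosen because it prevents picking the best option for each task independently. The names and numbers are illustrative only.

```python
from itertools import product
from typing import List, Optional, Tuple


def brute_force(
    costs: List[List[int]], gains: List[List[int]], budget: int
) -> Tuple[float, Optional[Tuple[int, ...]]]:
    """Try every assignment of one option per task; keep the best within budget."""
    best_gain: float = float("-inf")
    best_choice: Optional[Tuple[int, ...]] = None
    n_options = len(costs[0])
    for choice in product(range(n_options), repeat=len(costs)):
        cost = sum(costs[task][option] for task, option in enumerate(choice))
        if cost > budget:
            continue
        gain = sum(gains[task][option] for task, option in enumerate(choice))
        if gain > best_gain:
            best_gain, best_choice = gain, choice
    return best_gain, best_choice


# Two tasks, three options (A, B, C) each: only 3 ** 2 = 9 assignments to check.
costs = [[1, 2, 3], [1, 2, 3]]
gains = [[1, 3, 4], [2, 3, 6]]
result = brute_force(costs, gains, budget=4)

# With 100 tasks the same loop would have to visit 3 ** 100 assignments.
search_space_100 = 3 ** 100
print(result, len(str(search_space_100)))
```

Enumeration is exact but only feasible for tiny inputs, which is why heuristic or operations research solvers are used once the problem grows.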

Why do we need strategies

Humans are good at single-point, fine-grained decisions, while machines are good at decisions at scale. When a convenience store chain has only a few stores, people can make the decisions. As the number of stores grows, we need machines to decide for us, and that is how strategies came into being.

Convenience Bee uses algorithms (strategies) in many places, including site selection, ordering, display, and scheduling. First, they save a great deal of labor cost. Second, they reduce human error and improve the efficiency and quality of decision-making.

How to develop a strategy

We now know what a strategy is, so how do we develop one?

Strategy development is usually divided into several phases:

The design phase

To develop a strategy, you first need to design it. The design phase involves the following.

Analyzing historical data

By analyzing historical data, we can find problems in the business.

Designing the strategy

Once a problem is identified, we can begin to design (or optimize) the strategy to solve the problem or reduce its impact.

Modeling: the strategy part

Modeling a strategy mainly covers three parts:

The input
The strategy (algorithm)
The output

The strategy computes the output from the input. Chaining these three parts together forms a complete strategy.
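
The input, strategy, and output chain can be sketched with typed records. This is only an illustration of the shape: the class names `StrategyInput` and `StrategyOutput`, their fields, and the `top_two` strategy are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class StrategyInput:
    store_id: str
    expected_sales: Dict[str, float]


@dataclass
class StrategyOutput:
    store_id: str
    selected: List[str]


# A strategy is anything that maps an input to an output.
Strategy = Callable[[StrategyInput], StrategyOutput]


def top_two(data: StrategyInput) -> StrategyOutput:
    """Toy strategy: keep the two bun types with the highest expected sales."""
    ranked = sorted(data.expected_sales, key=data.expected_sales.get, reverse=True)
    return StrategyOutput(store_id=data.store_id, selected=ranked[:2])


def run(strategy: Strategy, data: StrategyInput) -> StrategyOutput:
    return strategy(data)


sample = StrategyInput("store-001", {"shaomai": 80, "cream bun": 45, "fresh meat bun": 120})
output = run(top_two, sample)
print(output)
```

Keeping the input and output as explicit types makes it possible to swap the strategy in the middle without touching the callers.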

Modeling: the system part

System modeling mainly covers:

Upstream trigger mode

How upstream triggers the strategy: RPC, HTTP, MQ, or on a schedule (an agreed time).

Downstream notification mode

How downstream services are notified once the run completes: RPC, HTTP, MQ, or on a schedule (an agreed time).

Data transfer mode

How indicator data is delivered downstream: through a medium such as a cache or database, or directly over HTTP or RPC.

Exception handling

How failures and timeouts are handled, for example by SMS or phone alerts.
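
One way to write these four decisions down is as a small, immutable configuration record. The sketch below is an assumption about how such a record could look; the names `Channel`, `Transport`, `SystemModel`, and the default timeout are hypothetical and do not come from the platform.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Tuple


class Channel(Enum):
    """Ways to trigger a strategy or notify downstream."""
    RPC = "rpc"
    HTTP = "http"
    MQ = "mq"
    SCHEDULED = "scheduled"  # an agreed time


class Transport(Enum):
    """Ways to hand result data to downstream."""
    CACHE = "cache"
    DATABASE = "database"
    HTTP = "http"
    RPC = "rpc"


@dataclass(frozen=True)
class SystemModel:
    trigger: Channel
    notify: Channel
    transport: Transport
    alert_channels: Tuple[str, ...] = ("sms", "phone")
    timeout_seconds: int = 3600  # hypothetical default


model = SystemModel(trigger=Channel.MQ, notify=Channel.HTTP, transport=Transport.DATABASE)
print(model)
```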

The development phase

Once the design is complete and has passed review, development can begin. The development phase focuses on the following.

Technology selection

Choose the technology stack. At Convenience Bee, some strategy colleagues are familiar with Java and others with Python, so each can choose the stack they know best.

Writing code

Once the technology stack is chosen, development can begin. Writing the code usually involves the following steps:

Querying data sources
Reading configuration
Logging
Recording intermediate results
Adding monitoring tracking points
Writing/returning the result
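
The six steps above can be sketched as a single function. This is a skeleton under stated assumptions: the data loader, the `top_n` configuration key, and the result writer are hypothetical stand-ins for real data sources, configuration, and storage.

```python
import json
import logging
import time
from typing import Callable, Dict, List, Tuple

logger = logging.getLogger("strategy")


def run_strategy(
    load_data: Callable[[], List[dict]],
    config: Dict[str, int],
    write_result: Callable[[List[str]], None],
) -> Tuple[List[str], dict]:
    started = time.monotonic()
    rows = load_data()                                    # 1. query data sources
    top_n = config.get("top_n", 20)                       # 2. read configuration
    logger.info("loaded %d rows, top_n=%d", len(rows), top_n)  # 3. log
    ranked = sorted(rows, key=lambda row: row["sales"], reverse=True)
    intermediate = {"ranked": [row["sku"] for row in ranked]}
    logger.debug("intermediate=%s", json.dumps(intermediate))  # 4. intermediate results
    result = intermediate["ranked"][:top_n]
    metrics = {"rows": len(rows), "seconds": time.monotonic() - started}  # 5. monitoring
    write_result(result)                                  # 6. write/return the result
    return result, metrics


written: List[List[str]] = []
sample_rows = [{"sku": "a", "sales": 3}, {"sku": "b", "sales": 5}]
result, metrics = run_strategy(lambda: sample_rows, {"top_n": 1}, written.append)
print(result, metrics["rows"])
```

Passing the loader and writer in as functions keeps the strategy logic testable without a real database.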

Self-test

After development, you need to test the code yourself to verify that it is free of bugs and works as expected. This usually involves:

Checking the logs to validate the logic
Checking tracking-point data to make sure the data at key nodes is correct
Comparing results to confirm the data matches expectations
Single-step debugging to locate faults

Validation phase

The validation phase mainly checks whether the full data set meets expectations. How? Just by looking at the data, it is very hard to spot problems among millions or tens of millions of records. Here we can take some measures:

Classic approach: diff the results against those of the online version of the strategy run on the same data source, and analyze the differences at the store, classification, and category levels to see whether they match what the strategy change is expected to produce. The classic approach easily finds problems that cause large deviations, but for small differences (such as one or two products in a store) the cost of continuing the analysis is huge.
Advanced approach: use a simulation platform. The simulation platform simulates a store’s inventory, deliveries, sales, and other data. Given the strategy’s results, it computes simulated sales, waste, and lost-opportunity figures for the store. Comparing these business figures between the two versions predicts the effect after launch and informs the decision to launch or to optimize further.
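
A minimal sketch of the classic approach at store level, assuming both versions produce a mapping from (store, product) to quantity. The function name `diff_by_store` and the sample data are hypothetical.

```python
from collections import Counter
from typing import Dict, Tuple

Key = Tuple[str, str]  # (store id, product id)


def diff_by_store(online: Dict[Key, int], candidate: Dict[Key, int]) -> Dict[str, int]:
    """Count, per store, the products whose quantity differs between versions."""
    changed: Counter = Counter()
    for key in online.keys() | candidate.keys():
        # A product missing from one side counts as a difference.
        if online.get(key) != candidate.get(key):
            changed[key[0]] += 1
    return dict(changed)


online = {("s1", "meat bun"): 10, ("s1", "shaomai"): 5, ("s2", "meat bun"): 8}
candidate = {("s1", "meat bun"): 12, ("s1", "shaomai"): 5, ("s2", "cream bun"): 4}
report = diff_by_store(online, candidate)
print(report)
```

Sorting such a report by count surfaces the stores with the largest deviations first, which is where the classic approach is most effective.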

Launch and post-launch evaluation

After passing validation, the strategy is ready to go online. The launch stage goes through:

Strategy launch approval
Strategy release
Verifying online results after release

This step verifies that the strategy is working and that the results are as expected.

Because validation uses a few randomly selected days, some special scenarios may be missed. This happened to us once: we tested on weekdays, launched on a Friday, and the weekend data turned out to be wrong.

Therefore the launch needs to be followed up, generally for more than 3 days.

Comparing business indicators before and after the launch

The original strategy development model

Technology-led

At the beginning of the business, we had no dedicated strategy colleagues. When “strategy”-like requirements came in, engineers developed them directly inside the business systems.

The general development mode is shown as follows:

This model has problems, mainly:

Strategy and business logic are coupled and share the same biz/service layer

When business logic is modified, a biz/service that a strategy depends on may be changed by accident, breaking the strategy. Even if testing tools can identify every caller of a modified biz/service, the cost of regression-testing all of those callers is significant.

Strategies and business services depend on resources differently, which wastes resources

For example, strategies may be CPU-intensive, while business services (especially Java services) are often more memory-intensive. To keep both reliable and stable, enough CPU and memory must be requested for both. Strategies and business services each have their own peaks, and provisioning resources for those peaks is a huge waste.

Convenience Bee is a start-up and tries to avoid this kind of waste: good steel should be used on the blade’s edge.

Strategy-led

As the business grew and the system became more complex, some “strategy” logic was handed over to the strategy colleagues. At that time the company had a strategy platform, and strategies were developed on it in the following way:

As the figure shows, the data sources a strategy depends on are developed by the system engineers. Once that is done, strategy code can read the data directly from the context.

Strategy development goes through steps 1 to 6 shown in Figure 6. When a problem is found, steps 1 to 6 are repeated after the code change.

Because environment resources are limited, the whole process takes from tens of minutes to hours. Even for small problems during development, you cannot verify more than a few times a day.

Specifically, this development process has the following problems:

Language support

The existing strategy platform is built on a Java architecture, under which there are usually several ways to run Python:
- Execute Python statements or call scripts directly in Java (Jython)
- Run the Python script using Runtime.getRuntime().exec()
- Wrap the Python code as a service and call it through inter-service communication

Each method has advantages and disadvantages, which we won’t go into here. The existing platform takes the first approach, Jython. Jython makes it easy to use Java objects, but it only supports Python 2 and has limited support for third-party modules, so strategies based on machine learning or operations research algorithms cannot be developed on it.

Environment and resources

As mentioned above, the existing strategy platform uses Jython, and multiple strategies run in the same JVM, competing for each other’s resources.

Fortunately, we have several test environments, and each person usually takes one to validate a strategy. When more people are developing at the same time than there are test environments, they have to queue.

Online, we are not so lucky. When an earlier strategy is delayed for any reason, later strategies have to wait for resources even if they don’t depend on its data. As a result, a failed run in the morning affects the whole day and causes delays.

Code maintenance

The current development model is similar to Jupyter: you paste code into a page and save it, and the strategy takes effect.

This model is convenient, but from a quality perspective it carries many risks, such as not pasting the latest code at launch, or overwriting each other’s changes when developing in parallel.

Execution efficiency

The resources of each test environment are fixed. Small strategies finish in tens of seconds, but larger ones can run for nearly an hour. If problems are found, they cannot be rerun more than a few times a day.

In particular, the data sources a strategy depends on may change in the meantime (for example, when upstream is also testing). Differences between two diff results may then come from something other than the strategy’s own changes, which seriously reduces self-test efficiency.

Locating problems

When something goes wrong, we usually look at what the input was and, given that input, which logic produced the wrong output.

On the current platform, that is impossible without printing everything out. But printing all the data produces too much output, which makes troubleshooting very inconvenient, and sometimes it takes a long time to locate a problem (for example, one that occurs online but not in the test environment).

Exploring the strategy platform

Because of the problems above, strategy iteration could not get faster, and the platform urgently needed to be reworked to empower strategy development and improve its efficiency.

We faced several options:

Iterate on the existing platform

The existing platform is built on SSM (Spring + SpringMVC + MyBatis). Strategy scheduling, execution, and business logic are coupled to some degree. Refactoring on this basis would mean reorganizing the logic and stripping out the business part, while strategies continued to iterate normally in the meantime.

At the time, this option was costly and risky because we did not have enough people.

Choose an open source platform on the market

There are streaming or batch scheduling (execution) platforms on the market, such as Flink, Spark, and Storm. None of them fully meets our needs or can be used directly, but they can serve as part of our strategy platform.

Build a brand-new platform ourselves

In the end, we chose Spring Cloud Data Flow as the execution and scheduling engine and built a complete strategy platform around it.

Spring Cloud Data Flow was chosen because it supports both Java and Python strategies, which matches the team’s technology stack: there is no extra learning cost, and strategy development can start quickly.

The new platform focuses on the following points:
- Quality
- Efficiency
- Cost
- Standards
- Security

Model definition

StrategyGroup

A collection of strategies/algorithms, used to separate the strategies of different teams. Data permissions are also managed per strategy group.

StrategyStream

The level below a strategy group, representing one strategy/algorithm.

Strategies iterate over time (version updates), but the versions share a similar (or identical) structure, which is abstracted as a strategy stream.

A strategy stream typically includes three types of nodes: source, processor, and sink:

The source splits the execution into shards to improve parallelism;
The processor is where the specific strategy/algorithm runs;
The sink stores the execution results.
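
The three node roles can be illustrated in a single process. On the real platform these are separately deployed nodes; the sketch below only mirrors their responsibilities, and the function names, shard size, and placeholder "plan" values are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Dict, List


def source(store_ids: List[str], shard_size: int) -> List[List[str]]:
    """Split the work into shards so that they can be processed in parallel."""
    return [store_ids[i:i + shard_size] for i in range(0, len(store_ids), shard_size)]


def processor(shard: List[str]) -> Dict[str, str]:
    """Placeholder for the real strategy/algorithm, applied to one shard."""
    return {store_id: f"plan-for-{store_id}" for store_id in shard}


def sink(results: List[Dict[str, str]]) -> Dict[str, str]:
    """Store the results; a dict stands in for the real storage."""
    stored: Dict[str, str] = {}
    for partial in results:
        stored.update(partial)
    return stored


shards = source(["s1", "s2", "s3", "s4", "s5"], shard_size=2)
with ThreadPoolExecutor(max_workers=2) as pool:
    stored = sink(list(pool.map(processor, shards)))
print(shards, len(stored))
```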

StrategyInfo

The level below a strategy stream: one version of one type of node in the stream (such as the source node).

It is the smallest unit for publishing and deploying an image.

Strategy task (TaskBaseInfo)

As shown by task 1 and task 2 above, a strategy task is a strategy stream plus a specified version of the strategy/algorithm.

The same strategy stream can run multiple strategy tasks at the same time, using blacklists and whitelists to control which stores (or other business codes) each task applies to, which enables A/B tests and similar functions.
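
A sketch of how blacklists and whitelists could route stores to tasks for an A/B test. The semantics here are an assumption (a blacklist always excludes, and a missing whitelist means "all stores"); the class `StrategyTask` and its fields are hypothetical and are not the platform's TaskBaseInfo.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Set


@dataclass
class StrategyTask:
    name: str
    version: str
    whitelist: Optional[Set[str]] = None        # None means every store is allowed
    blacklist: Set[str] = field(default_factory=set)

    def applies_to(self, store_id: str) -> bool:
        if store_id in self.blacklist:
            return False
        return self.whitelist is None or store_id in self.whitelist


def tasks_for_store(tasks: List[StrategyTask], store_id: str) -> List[str]:
    return [task.name for task in tasks if task.applies_to(store_id)]


# Store s2 is moved from the control version to the experimental one.
control = StrategyTask("task 1", "v1", blacklist={"s2"})
experiment = StrategyTask("task 2", "v2", whitelist={"s2"})
print(tasks_for_store([control, experiment], "s1"), tasks_for_store([control, experiment], "s2"))
```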

Strategy task execution (TaskExecutionInfo)

One execution of a strategy task, created each time the task is triggered.

Strategy task execution detail (TaskExecutionDetailInfo)

A “shard” within one execution of a strategy task. Each element of the list returned by the source node becomes one execution detail.

It is the finest granularity of strategy task execution.

Strategy scheduling

Technology candidates: Storm, Spark, Flink, Spring Cloud Data Flow

Spring Cloud Data Flow (SCDF)

spring.io/projects/sp…

dataflow.spring.io/getting-sta…

The Data Flow Server provides the SCDF services, including the web dashboard and shell interaction. The Skipper Server is responsible for deploying the strategy nodes.

Because it exposes a set of RESTful APIs, SCDF is easy to integrate into our system:

Viewing the stream list
Viewing the details of a stream
Creating/updating a stream
Elastic scaling: dataflow.spring.io/docs/featur…
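
A sketch of how the first three calls could be prepared from Python. The paths follow the stream-definition endpoints as I understand them from the SCDF REST API guide, and the server address is an assumed local default; verify both against the server version in use. The helper names are hypothetical, and the requests are only built here, not sent.

```python
from urllib.parse import quote, urlencode
from urllib.request import Request

BASE_URL = "http://localhost:9393"  # assumption: a local Data Flow Server


def list_streams() -> Request:
    """View the stream list."""
    return Request(f"{BASE_URL}/streams/definitions", method="GET")


def get_stream(name: str) -> Request:
    """View the details of one stream."""
    return Request(f"{BASE_URL}/streams/definitions/{quote(name)}", method="GET")


def create_stream(name: str, definition: str) -> Request:
    """Create a stream from a DSL definition, sent as form parameters."""
    body = urlencode({"name": name, "definition": definition}).encode()
    return Request(f"{BASE_URL}/streams/definitions", data=body, method="POST")


request = create_stream("demo", "time | log")
print(request.get_method(), request.full_url, request.data)
```

Passing a prepared request to `urllib.request.urlopen` would send it to a running server.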

Strategy runtime

The figure describes the runtime of a strategy. After a strategy is triggered, the source, processor, and sink nodes process it in turn. Finally, the source node determines the status of each execution detail and updates the status of the main task.
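
The final status roll-up can be sketched as follows. The three status values and the rule (any failed detail fails the task, all successful details complete it, anything else is still running) are assumptions made for illustration, not the platform's actual state machine.

```python
from enum import Enum
from typing import Iterable


class Status(Enum):
    RUNNING = "running"
    SUCCESS = "success"
    FAILED = "failed"


def main_task_status(details: Iterable[Status]) -> Status:
    """Derive the main task's status from its execution details."""
    statuses = list(details)
    if any(status is Status.FAILED for status in statuses):
        return Status.FAILED
    if statuses and all(status is Status.SUCCESS for status in statuses):
        return Status.SUCCESS
    # No details yet, or some shards have not finished.
    return Status.RUNNING


overall = main_task_status([Status.SUCCESS, Status.RUNNING])
print(overall)
```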

Logging and tracking points

Using EFK, the logs and intermediate-result tracking points emitted by strategies are collected into ES and Hive for strategy analysis.
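
A tracking point is easiest to collect when it is one structured log line. The sketch below writes each point as JSON through the standard `logging` module; the field names and the helper `log_point` are hypothetical, and the in-memory buffer stands in for the log file that a collector would read.

```python
import io
import json
import logging


def log_point(logger: logging.Logger, task_id: str, node: str, name: str, value) -> None:
    """Emit one tracking point as a single JSON log line."""
    logger.info(json.dumps({"task_id": task_id, "node": node, "point": name, "value": value}))


buffer = io.StringIO()  # stands in for the log file a collector would tail
logger = logging.getLogger("strategy.points")
logger.setLevel(logging.INFO)
logger.addHandler(logging.StreamHandler(buffer))
logger.propagate = False  # keep tracking points out of the root logger

log_point(logger, "exec-1", "processor", "selected_count", 20)
record = json.loads(buffer.getvalue())
print(record)
```

Including the execution identifier in every line makes it possible to reconstruct the input and intermediate results of one run when locating a problem.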

Conclusion

This article has shared the Convenience Bee intelligent production team’s exploration and practice in building a strategy platform. We keep improving and optimizing the platform as we use it. If you have any ideas or suggestions, please leave a comment.

If you are interested in the related technologies and strategies, you are welcome to join us. Send your resume to: [email protected] (email subject: Convenience Bee intelligent production team).

About the author

Ma is a back-end development engineer on the Convenience Bee intelligent production team. Ma took part in building the company’s ERP system and wrote some of the strategies for fresh food production, then incubated this strategy platform together with the experts on the team. Ma is now working with teammates toward the mission and vision of “Quality life is convenient for China”.

Recruitment website

bianlifeng.gllue.me/portal/home…