Three elements of a model

To turn a real-world problem into a mathematical optimization model, we need to consider three elements: variables, constraints, and an objective function. We first identify all the factors and variables that affect the problem, then build an objective function that measures how well the system performs with respect to our goal, and finally identify the restrictions that actually apply and use them as the constraints of the model.

[Figure: The formula]

In the formula above, the variables of the actual problem are collected into an n-dimensional vector x, each element of which is a real number. f0(x) is the objective function we built, and our goal is to minimize it (a maximization problem can be converted into a minimization problem by negating the objective). fi(x) and hj(x) are the constraint functions, which are divided into inequality constraints and equality constraints. The constraint functions restrict the space in which a solution may lie. If the problem has no constraints, no constraint functions are needed.

[Figure: The objective function]
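Because the formula itself survives only as a figure, here is one common way to write the general form described above. The direction of the inequality (less than or equal to zero) and the constraint counts m and p are assumed conventions, not taken from the original figure.

```latex
\begin{aligned}
\min_{x \in \mathbb{R}^n} \quad & f_0(x) \\
\text{subject to} \quad & f_i(x) \le 0, \quad i = 1, \dots, m, \\
                        & h_j(x) = 0, \quad j = 1, \dots, p.
\end{aligned}
\qquad
\text{and, for maximization,} \quad
\max_x g(x) = -\min_x \bigl(-g(x)\bigr).
```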

Optimization in artificial intelligence

What does optimization have to do with artificial intelligence? In essence, artificial intelligence is itself an optimization process: the intelligence we want to achieve is obtained by learning, that is, by searching for an optimal solution. This is the general framework, and in the end almost every AI problem comes back to finding an optimal solution.

Whether it is traditional machine learning, deep learning, or reinforcement learning with its great potential, the core idea of each can be formulated as an optimization problem.

[Figure: Optimization]
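As a minimal sketch of "learning as optimization", the snippet below fits a one-parameter model y = w * x by minimizing the mean squared error. The data, the model, and the names `loss` and `w_best` are all hypothetical and chosen only for illustration.

```python
# Hypothetical toy data: y is roughly 2 * x.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.1, 3.9, 6.2, 7.8]


def loss(w):
    """Objective function: mean squared error of the model y = w * x."""
    return sum((w * x - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


# For this one-parameter model the minimizer has a closed form:
# setting d(loss)/dw = 0 gives w = sum(x * y) / sum(x * x).
w_best = sum(x * y for x, y in zip(xs, ys)) / sum(x * x for x in xs)

# The learned parameter has a lower loss than nearby alternatives.
print(w_best, loss(w_best), loss(w_best + 0.1), loss(w_best - 0.1))
```

Here "training" means nothing more than finding the w that makes the objective function as small as possible. Larger models follow the same pattern with more parameters and an iterative solver.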

Constrained optimization

As mentioned earlier, an optimization problem may or may not have constraints, and the constrained case is more complex than the unconstrained one. Constraints are either inequality constraints or equality constraints. Their role is to restrict the space of possible optimal solutions to certain regions.

Even though constraints make the problem more complicated, we still have mathematical tools to handle them. For equality constraints, we can introduce Lagrange multipliers and combine the original objective function and the constraint functions into a single Lagrangian function. The stationary points of the Lagrangian give the candidate optimal solutions of the original constrained problem, so we only need to solve for them. For inequality constraints the approach is similar, but the KKT conditions must be satisfied as well.
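To make the Lagrange multiplier idea concrete, here is a small sketch for a hypothetical problem: minimize x^2 + y^2 subject to x + y = 1. The Lagrangian is L = x^2 + y^2 + lam * (x + y - 1), and setting its partial derivatives to zero gives three linear equations. The helper `solve_linear` is an illustrative plain-Python solver, not part of any library.

```python
def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Build the augmented matrix [a | b] so the inputs are not modified.
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        # Use the row with the largest entry in this column as the pivot.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution.
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


# Stationarity conditions of the Lagrangian, unknowns ordered (x, y, lam):
#   dL/dx   = 2x + lam    = 0
#   dL/dy   = 2y + lam    = 0
#   dL/dlam = x + y - 1   = 0
a = [[2.0, 0.0, 1.0],
     [0.0, 2.0, 1.0],
     [1.0, 1.0, 0.0]]
b = [0.0, 0.0, 1.0]
x, y, lam = solve_linear(a, b)
print(x, y, lam)
```

The solution x = y = 0.5 is the point on the line x + y = 1 that is closest to the origin, which matches the geometric intuition. The sign of the multiplier depends on how the constraint term is written in the Lagrangian.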

In the following figure, for example, suppose there are four constraints, each drawn in a different color; together they enclose a common region. Without constraints, the optimal solution could lie anywhere in the whole space; once the constraints are applied, it must lie inside that region.

[Figure: With constraints]
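The idea of a region enclosed by four constraints can be sketched in a few lines. The four inequalities below are hypothetical (they are not the ones in the figure), and each is written in the assumed form f_i(x, y) <= 0.

```python
# Four hypothetical inequality constraints, each written as f_i(x, y) <= 0.
constraints = [
    lambda x, y: -x,            # x >= 0
    lambda x, y: -y,            # y >= 0
    lambda x, y: x + y - 4.0,   # x + y <= 4
    lambda x, y: x - y - 2.0,   # x - y <= 2
]


def is_feasible(x, y, tol=1e-9):
    """Return True if (x, y) satisfies every constraint (within tol)."""
    return all(f(x, y) <= tol for f in constraints)


print(is_feasible(1.0, 1.0))  # lies inside the region
print(is_feasible(3.0, 3.0))  # violates x + y <= 4
```

Only points for which `is_feasible` returns True are candidates for the optimal solution; everything else is ruled out before the objective function is even considered.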

Unconstrained optimization

A common method for the unconstrained case is gradient descent. The gradient is a vector: its direction is the direction in which the function increases fastest at a given point, and its magnitude is the maximum value of the directional derivative at that point. Gradient descent moves in the direction opposite to the gradient. Simply put, it is like standing on a mountain, considering a step of the same size in every direction, and taking the one that descends fastest.

[Figure: Without constraints]
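The mountain analogy translates into a short loop. This is a minimal sketch on a hypothetical function f(x, y) = (x - 1)^2 + 2 * (y + 2)^2, whose minimum is at (1, -2); the learning rate and step count are arbitrary choices that happen to work for this function.

```python
def grad(x, y):
    """Gradient of f(x, y) = (x - 1)^2 + 2 * (y + 2)^2."""
    return 2.0 * (x - 1.0), 4.0 * (y + 2.0)


def gradient_descent(x, y, learning_rate=0.1, steps=200):
    """Repeatedly step against the gradient, starting from (x, y)."""
    for _ in range(steps):
        gx, gy = grad(x, y)
        # Move opposite to the gradient: the direction of fastest descent.
        x -= learning_rate * gx
        y -= learning_rate * gy
    return x, y


x_min, y_min = gradient_descent(0.0, 0.0)
print(x_min, y_min)
```

If the learning rate is too large the iterates can overshoot and diverge, and if it is too small convergence becomes slow, so in practice this value usually needs tuning.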

In addition, gradient descent may find only a local optimum: once the search falls into one, it may be unable to escape and continue toward the global optimum. The local optimum problem therefore has to be taken into account, and there are dedicated engineering techniques to avoid getting trapped. Sometimes, however, the difference between a local optimum and the global optimum is small while finding the global optimum is very costly, and in that case it is not worth insisting on the global optimum.

[Figure: Local optimum]
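The trap can be reproduced on a hypothetical one-dimensional function with two valleys, f(x) = x^4 - 3x^2 + x. The sketch below also shows one simple countermeasure, restarting from several starting points and keeping the best result. This is only one of many possible techniques and is not necessarily the one the text has in mind.

```python
def f(x):
    """A function with a shallow valley on the right and a deeper one on the left."""
    return x**4 - 3.0 * x**2 + x


def df(x):
    """Derivative of f."""
    return 4.0 * x**3 - 6.0 * x + 1.0


def descend(x, learning_rate=0.01, steps=2000):
    """Plain gradient descent in one dimension."""
    for _ in range(steps):
        x -= learning_rate * df(x)
    return x


# Starting on the right, descent settles in the shallower (local) valley.
stuck = descend(2.0)

# Restarting from several points and keeping the lowest value found
# recovers the deeper valley on the left.
starts = [-2.0, -1.0, 0.0, 1.0, 2.0]
best = min((descend(s) for s in starts), key=f)

print(stuck, f(stuck))
print(best, f(best))
```

Each restart costs a full descent run, which illustrates the trade-off above: a better optimum is bought with extra computation.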

