
This article was first published on the WeChat official account “Beauty of Algorithms and Programming”. Follow the account to keep up with this series of articles.

Today we’re going to talk about linear algebra, one of the mathematical foundations of machine learning. First, let’s split the term “linear algebra” in two and ask: “What is linear?” and “What is algebra?”

What is linear?

In middle school we learned about “first-degree equations” and “first-degree functions”; these are in fact “linear equations” and “linear functions”.

In college, “linear” shows up everywhere: “linear algebra”, “linear transformations”, “linear differential equations”, “linear partial differential equations”, “linear programming”, and so on.

In fact, the algebraic meaning of a linear function comes down to two properties: additivity, f(x + y) = f(x) + f(y), and proportionality (homogeneity), f(kx) = k·f(x).

Here x, y, and k can all be understood as abstract objects; as long as the relationship between them can be written as y = kx, it is a linear function, and its graph passes through the origin.
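To make this concrete, here is a minimal sketch (with made-up numbers) that checks both properties for the linear function f(x) = 3x:

```python
# A minimal check that f(x) = 3x satisfies both properties of linearity.
def f(x, k=3):
    return k * x

x, y, c = 2.0, 5.0, 4.0
assert f(x + y) == f(x) + f(y)   # additivity: f(x + y) = f(x) + f(y)
assert f(c * x) == c * f(x)      # homogeneity: f(cx) = c * f(x)
print("both properties hold, so f is linear")
```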

What is algebra?

Simply put, algebra is all about using letters instead of numbers to perform operations. The letters here are abstract; they can be variables, vectors, matrices, etc.

In fact, the core problem of linear algebra is solving systems of linear equations, and its main ingredients include the matrix product, the inner product, and the rank.

Given a matrix A of size m×n and a matrix B of size n×p, their product C = AB is a matrix of size m×p.

General method: the entry Cij of C is the i-th row of A times the j-th column of B, that is, Cij = Ai1B1j + Ai2B2j + … + AinBnj.
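As an illustration, here is a sketch of this general method in plain Python (the example matrices are made up):

```python
# Naive matrix product: C[i][j] = (i-th row of A) . (j-th column of B).
def matmul(A, B):
    m, n, p = len(A), len(B), len(B[0])
    assert all(len(row) == n for row in A), "inner dimensions must match"
    C = [[0.0] * p for _ in range(m)]
    for i in range(m):
        for j in range(p):
            C[i][j] = sum(A[i][k] * B[k][j] for k in range(n))
    return C

A = [[1, 2, 3],
     [4, 5, 6]]          # 2 x 3
B = [[7, 8],
     [9, 10],
     [11, 12]]           # 3 x 2
print(matmul(A, B))      # 2 x 2 result: [[58.0, 64.0], [139.0, 154.0]]
```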

Let x and y be n-dimensional vectors. Then [x, y] = x1y1 + x2y2 + … + xnyn is called the inner product of the vectors x and y.
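A one-function sketch of this definition, with hypothetical vectors:

```python
# Inner product [x, y] = x1*y1 + x2*y2 + ... + xn*yn.
def inner(x, y):
    assert len(x) == len(y), "vectors must have the same dimension"
    return sum(xi * yi for xi, yi in zip(x, y))

print(inner([1, 2, 3], [4, 5, 6]))  # 1*4 + 2*5 + 3*6 = 32
```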

The definition of rank in linear algebra:

The column rank of a matrix A is the maximum number of linearly independent columns of A; similarly, the row rank of A is the maximum number of linearly independent rows of A. The row rank and the column rank of a matrix are always equal, so they can simply be called the rank of the matrix A, usually written R(A).

In practice, the rank is computed by using elementary row and column transformations to bring the matrix into echelon form; the number of non-zero rows in the echelon form is the rank of the matrix.
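Here is a sketch of that procedure, reducing a made-up matrix to row echelon form and counting the non-zero rows:

```python
# Rank by row reduction: reduce to row echelon form, count non-zero rows.
def matrix_rank(M, eps=1e-10):
    A = [row[:] for row in M]          # work on a copy
    rows, cols = len(A), len(A[0])
    rank = 0
    for col in range(cols):
        # find a row with a non-zero entry in this column to use as pivot
        pivot = next((r for r in range(rank, rows) if abs(A[r][col]) > eps), None)
        if pivot is None:
            continue
        A[rank], A[pivot] = A[pivot], A[rank]
        # eliminate the entries below the pivot
        for r in range(rank + 1, rows):
            factor = A[r][col] / A[rank][col]
            for c in range(col, cols):
                A[r][c] -= factor * A[rank][c]
        rank += 1
    return rank

M = [[1, 2, 3],
     [2, 4, 6],    # = 2 * (row 1), so the rows are linearly dependent
     [1, 0, 1]]
print(matrix_rank(M))  # 2
```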

What about algorithms in linear algebra?

Gaussian elimination

Gaussian elimination is an algorithm in linear algebra that can be used to solve systems of linear equations, find the rank of a matrix, and compute the inverse of an invertible matrix. When applied to a matrix, Gaussian elimination produces its row echelon form.
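A minimal sketch of Gaussian elimination for solving a square system Ax = b (the 2×2 system below is invented for illustration):

```python
# Gaussian elimination with back substitution for a square system A x = b.
def gauss_solve(A, b):
    n = len(A)
    # build the augmented matrix [A | b]
    M = [A[i][:] + [b[i]] for i in range(n)]
    for col in range(n):
        # partial pivoting: pick the row with the largest entry in this column
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    # back substitution on the resulting echelon form
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        x[i] = (M[i][n] - sum(M[i][j] * x[j] for j in range(i + 1, n))) / M[i][i]
    return x

# Example: 2x + y = 5, x + 3y = 10  ->  x = 1, y = 3
print(gauss_solve([[2, 1], [1, 3]], [5, 10]))
```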

Matrix inversion

Generally speaking, there are two common methods for matrix inversion:

1. The adjugate (adjoint) matrix method: A⁻¹ = adj(A)/det(A).

2. The elementary transformation method: construct the augmented matrix [A | I] and apply elementary row transformations until the left half becomes the identity; the right half is then A⁻¹. The elementary row transformations are (see the sketch after this list):

(1) Swap two rows;

(2) Multiply a row by a non-zero scalar;

(3) Add a scalar multiple of one row to another row.
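Here is a sketch of the elementary transformation method, using exactly these three row operations on [A | I] (the example matrix is made up):

```python
# Elementary transformation method: row-reduce [A | I] to [I | A^(-1)].
def invert(A):
    n = len(A)
    # augment A with the identity matrix
    M = [A[i][:] + [float(i == j) for j in range(n)] for i in range(n)]
    for col in range(n):
        # (1) swap in a row with a non-zero pivot
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        if abs(M[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular")
        M[col], M[pivot] = M[pivot], M[col]
        # (2) scale the pivot row so the pivot becomes 1
        p = M[col][col]
        M[col] = [v / p for v in M[col]]
        # (3) add multiples of the pivot row to zero out the other rows
        for r in range(n):
            if r != col and M[r][col] != 0:
                factor = M[r][col]
                M[r] = [v - factor * w for v, w in zip(M[r], M[col])]
    return [row[n:] for row in M]

print(invert([[4.0, 7.0], [2.0, 6.0]]))  # [[0.6, -0.7], [-0.2, 0.4]]
```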

Least square method

The least squares method is the standard way to handle overdetermined systems, that is, systems in which there are more equations than unknowns; an approximate solution is obtained, as in regression analysis. In the overall solution, least squares chooses the answer that minimizes the sum of the squares of the residuals of the individual equations.

Main idea: choose the unknown parameters so as to minimize the sum of squared differences between the theoretical values and the observed values.

Its most important application is curve fitting: the best fit in the least squares sense minimizes the sum of squared residuals, a residual being the difference between an observed value and the fitted value provided by the model.
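As a sketch: for an overdetermined system Ap ≈ y, the least squares solution solves the normal equations (AᵀA)p = Aᵀy. The data below are invented, scattered around the line y = 2x + 1:

```python
import numpy as np

# Overdetermined system: more equations (points) than unknowns (slope a, intercept b).
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = np.array([1.1, 2.9, 5.2, 7.1, 8.8])      # made-up observations near y = 2x + 1

A = np.column_stack([x, np.ones_like(x)])     # each row is one equation [x_i, 1]
# Normal equations: (A^T A) p = A^T y
p = np.linalg.solve(A.T @ A, A.T @ y)
a, b = p
print(f"a = {a:.3f}, b = {b:.3f}")            # close to a = 2, b = 1

residuals = y - A @ p
print("sum of squared residuals:", np.sum(residuals**2))
```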

Applications of linear algebra

1. Solve linear regression

Linear regression is a traditional statistical method for describing the relationship between variables. For simple problems, the least squares method is generally used, because it is relatively simple and easy to understand. The usual procedure has three steps (sketched in code after the list):

(1) Randomly generate sample points;

(2) Calculate the mean square error (MSE);

(3) Obtain the optimal parameters.
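A minimal sketch of these three steps with NumPy (the “true” line y = 2x + 1 and the noise level are made up for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)

# (1) Randomly generate sample points around a "true" line y = 2x + 1.
x = rng.uniform(0, 10, size=50)
y = 2.0 * x + 1.0 + rng.normal(0.0, 1.0, size=50)

# (2) Mean square error of a candidate line y = a*x + b.
def mse(a, b):
    return np.mean((y - (a * x + b)) ** 2)

# (3) Obtain the optimal parameters by least squares (this minimizes the MSE).
A = np.column_stack([x, np.ones_like(x)])
(a_opt, b_opt), *_ = np.linalg.lstsq(A, y, rcond=None)
print(f"a = {a_opt:.3f}, b = {b_opt:.3f}, MSE = {mse(a_opt, b_opt):.3f}")
```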

2. Principal Component Analysis (PCA)

Principal component analysis (PCA) is a technique for analyzing and simplifying data sets. It is commonly used to reduce the dimensionality of a data set while preserving the features that contribute most to its variance. This is done by keeping the lower-order principal components and ignoring the higher-order ones.

The main method is to obtain the principal components (eigenvectors) and their weights (eigenvalues) by eigendecomposition of the covariance matrix of the data. PCA is the simplest of the eigenvector-based methods for analyzing a multivariate statistical distribution. The result can be interpreted as an explanation of the variance in the original data: along which directions do the data vary the most? In other words, PCA provides an effective way to reduce the dimensionality of the data: if the analyst removes the components corresponding to the smallest eigenvalues, the resulting low-dimensional data is optimal in the sense that it loses the least variance.
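Here is a sketch of this method with NumPy: eigendecomposition of the covariance matrix of some made-up 2-D data, followed by projection onto the top component:

```python
import numpy as np

rng = np.random.default_rng(1)
# Made-up 2-D data that mostly varies along one direction.
X = rng.normal(size=(200, 2)) @ np.array([[3.0, 1.0], [1.0, 0.5]])

# 1. Center the data, then form the covariance matrix.
Xc = X - X.mean(axis=0)
cov = np.cov(Xc, rowvar=False)

# 2. Eigendecomposition: eigenvectors = principal components,
#    eigenvalues = their weights (variance along each component).
eigvals, eigvecs = np.linalg.eigh(cov)       # eigh: for symmetric matrices
order = np.argsort(eigvals)[::-1]            # sort by decreasing variance
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

# 3. Keep only the top component to reduce the 2-D data to 1-D.
X_reduced = Xc @ eigvecs[:, :1]
print("variance explained by the top component:", eigvals[0] / eigvals.sum())
```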

Principal component analysis is particularly useful for analyzing complex data, e.g., face recognition.

3. Singular Value Decomposition (SVD)

Singular value decomposition is an important matrix factorization in linear algebra, with important applications in signal processing, image processing, statistics, and other fields.

True to its name, “singular” here carries its everyday sense: outstanding, strange, extraordinary.

Here is an example of SVD: [image omitted] the effect of processing an image with singular value decomposition.
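Since the image itself is not reproduced here, the sketch below uses a random array as a stand-in image and shows the usual recipe: keep only the k largest singular values to obtain a rank-k approximation.

```python
import numpy as np

# A stand-in "image": any grayscale image as a 2-D array would work here.
rng = np.random.default_rng(2)
img = rng.random((64, 64))

# Full SVD: img = U @ diag(S) @ Vt
U, S, Vt = np.linalg.svd(img, full_matrices=False)

# Keep only the k largest singular values -> a rank-k approximation.
k = 10
img_k = U[:, :k] @ np.diag(S[:k]) @ Vt[:k, :]

err = np.linalg.norm(img - img_k) / np.linalg.norm(img)
print(f"rank-{k} approximation, relative error: {err:.3f}")
```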

In fact, mathematics is sometimes not as difficult as we imagine; as long as we put our hearts into learning it, we can overcome many of the difficulties.
