What is linear algebra?

Among college mathematics courses, linear algebra is the most abstract; the leap in thinking from elementary mathematics to linear algebra is much larger than the leap to calculus or to probability and statistics. Many people finish the course knowing how to do things without knowing why. Years later, after working in graphics programming, machine learning, or similar fields, they discover that linear algebra is everywhere, yet they still struggle to understand and master it. Most people grasp the concepts of elementary mathematics easily: functions, equations, and sequences all feel natural. Entering linear algebra, by contrast, is like stepping into a foreign world and getting lost among strange symbols and operations.





When I first learned linear algebra, it seemed like an alien subject, and a question came to my mind:


Is linear algebra an objective law of nature or an artificial design?

I wouldn’t be surprised if your response to this question is, “Of course mathematics is objective and natural.” I used to think so myself. All the way through the elementary mathematics and elementary physics of high school, very few people doubt whether a mathematical discipline is natural. I never had such doubts when studying calculus or probability and statistics. Only linear algebra made me doubt, because its symbols and rules of operation are too abstract and too strange, with no corresponding everyday experience. So I have linear algebra to thank for getting me thinking about the nature of a mathematical discipline. In fact, it is not only students: many math teachers are also unclear about what linear algebra really is and what it is for. This holds both in China and abroad. In China, Meng Yan wrote “Understanding Matrices”; abroad, Professor Sheldon Axler wrote “Linear Algebra Done Right”; but neither fundamentally explains where linear algebra comes from and where it leads. As for myself, I didn’t learn linear algebra well in college, and only later understood it from a programming point of view. A lot of people say that being good at math helps with programming. For me it was the opposite: understanding programs helped me understand math.

This article is aimed at programmers, and I will take you on an in-depth adventure through the world of linear algebra from a programmer’s perspective! Since we are programmers, before entering the field of linear algebra, let us first look at the world of programs and ask this question:

Computing has general-purpose languages like assembly, C/C++, Java, and Python, and DSLs like Makefile, CSS, and SQL. Are these languages an objective law of nature or an artificial design?

Why ask such a seemingly silly question? Because the answer is obvious. You know the programming languages you use every day far better than you know abstract linear algebra, and it is clear that, while programming languages have their own internal logic, they are fundamentally artificial designs. What all programming languages have in common is that they establish a model, define a grammar, and map each piece of syntax to specific semantics. There is a language contract between the programmer and the language implementer: the programmer guarantees that the code conforms to the syntax of the language, and the compiler/interpreter guarantees that the result of executing the code conforms to the semantics of that syntax. For example, C++ specifies that the syntax new A() constructs an object of type A on the heap. If you write this in C++, the compiler must ensure that it is executed accordingly, allocating memory on the heap and calling A’s constructor; otherwise the compiler is violating the language contract.

From an application point of view, can we think of linear algebra as a programming language? The answer is yes, and we can use the language contract as the test. Suppose you have an image, and you want to rotate it 60 degrees and stretch it by a factor of two along the x-axis. Linear algebra tells you, “Okay! Construct a matrix according to my syntax, multiply your image by it according to the rules of matrix multiplication, and I guarantee the result is exactly what you want.”

In fact, linear algebra is very similar to a DSL like SQL. Here are some analogies:

  • Model and semantics: SQL builds the relational model on top of lower-level languages, and its core semantics are relations and relational operations; linear algebra builds the vector model on top of elementary mathematics, and its core semantics are vectors and linear transformations
  • Syntax: SQL defines syntax for each of its semantic concepts, such as SELECT, WHERE, and JOIN; linear algebra likewise defines syntax for semantic concepts such as vectors, matrices, and matrix multiplication
  • Compile/interpret: SQL can be compiled/interpreted into C; the concepts and rules of linear algebra can be explained in terms of elementary mathematics
  • Implementation: we can do SQL programming on MySQL, Oracle, and other relational databases; we can also do linear algebra programming in MATLAB, Mathematica, and other mathematical software

So, from an application point of view, linear algebra is an artificial domain-specific language (DSL) that builds a model and maps syntax to semantics through a system of symbols. In fact, the syntax and semantics of vectors, matrices, and their rules of operation are all artificial designs, just like the concepts of any language. They are creations, but creations that are only valid as long as the language contract is fulfilled.

Why do we have linear algebra?

Some people may feel uneasy about this view: “You hand me a matrix and it rotates my figure 60 degrees and stretches it by a factor of two along the x-axis, but I don’t know what you are doing at the bottom layer.” This is like programmers who never feel secure using a high-level language, who believe the bottom layer is the essence of a program, and who always want to know what assembly a statement compiles to or how much memory an operation allocates. Some people could fetch a web page with a single wget command in the shell, yet insist on spending dozens of minutes writing a pile of C code instead. In fact, “bottom” and “top” are just habitual ways of speaking; neither is more essential than the other. Compiling and interpreting programs is essentially a semantic mapping between different models, usually from a high-level language to a low-level one, but it can also go the other way. Fabrice Bellard wrote a virtual machine in JavaScript and ran Linux on it, mapping the machine model onto the JavaScript model.

Building a new model certainly depends on existing models, but that dependence is a means of modeling, not its end. The purpose of any new model is to make a certain class of problems easier to analyze and solve. When linear algebra was established, its concepts and rules of operation relied on elementary mathematics, but once this layer of abstraction exists, we should get used to analyzing and solving problems directly with the high-level model.

We said that linear algebra makes problems easier to analyze and solve than elementary mathematics does. Let’s use an example to get a feel for the benefit:

Given the vertices of a triangle (x1, y1), (x2, y2), (x3, y3), find the area of the triangle.

The best-known formula for the area of a triangle in elementary mathematics is area = 1/2 * base * height. The area is easy to calculate when one side of the triangle lies exactly on a coordinate axis. But what if we rotate the axes so that none of the sides of the same triangle lies on an axis? Can we still find its base and height? The answer is certainly yes, but the calculation is clearly more complicated, with many cases to handle separately.

In contrast, solving this problem with linear algebra is very easy. In linear algebra, the cross product of two vectors a and b is a vector whose direction is perpendicular to both a and b and whose magnitude equals the area of the parallelogram formed by a and b:

|a × b| = |a| * |b| * sin(θ), where θ is the angle between a and b

We can treat the sides of a triangle as vectors, so the area of the triangle equals half the absolute value of the cross product of two of its side vectors:


area = abs(1/2 * cross_product((x2 - x1, y2 - y1), (x3 - x1, y3 - y1)))


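Note: abs means absolute value, and cross_product means the cross product of two vectors. For two-dimensional vectors (ax, ay) and (bx, by), the cross product points along the z-axis, so cross_product is taken here to be its signed magnitude, ax*by - ay*bx.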
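
The formula above can be sketched in a few lines of Python. This is only a minimal illustration: the function names cross_product and triangle_area are chosen here to mirror the formula and do not come from any particular library, and cross_product is assumed to be the 2D signed form ax*by - ay*bx.

```python
def cross_product(a, b):
    """2D cross product: the z-component of the 3D cross product of a and b."""
    return a[0] * b[1] - a[1] * b[0]


def triangle_area(p1, p2, p3):
    """Area of the triangle with vertices p1, p2, p3, each an (x, y) pair."""
    # Two side vectors that share the vertex p1.
    side1 = (p2[0] - p1[0], p2[1] - p1[1])
    side2 = (p3[0] - p1[0], p3[1] - p1[1])
    # Half the area of the parallelogram spanned by the two sides.
    return abs(0.5 * cross_product(side1, side2))


# A right triangle with legs on the axes, and a rotated and shifted right
# triangle whose sides lie on no axis; the same code handles both.
print(triangle_area((0, 0), (4, 0), (0, 3)))
print(triangle_area((1, 1), (4, 5), (-3, 4)))
```

Notice that the code never asks which side is the base or where the axes are; the case analysis from the elementary approach simply disappears.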

A problem that is somewhat difficult in elementary mathematics is solved in a flash with linear algebra! Some people might say, “Of course it’s easy when you start from the cross product, but isn’t the cross product itself complicated? Try expanding it and see.” Yes, and that is exactly the role of a model: to hide some of the complexity inside the model so that its users can solve problems more easily. When responding to the complaint that C++ is too complex, Bjarne Stroustrup, the father of C++, once said:

Complexity will go somewhere: if not the language then the application code.

The complexity of a problem in a given environment is determined by its nature, and C++ absorbs some of that complexity into the language and standard library to make applications simpler. Of course, C++ does not make every situation easier, but in principle its complexity is justified. The same is true of Java, SQL, CSS, and many other languages and frameworks. Imagine how complicated it would be to store and manage data yourself every time instead of using a database. Seen this way, it is easy to understand why linear algebra defines an operation as strange as the cross product: it is the same reasoning by which C++ puts many common algorithms and containers into the STL. Likewise, you can define your own operations in linear algebra and reuse them. So mathematics is not rigid at all. It is as alive as a program, and once you understand it you can bend it to your needs. While we’re at it, let’s answer a very common question:

The dot product, the cross product, and the matrix operations of linear algebra are all weird, so why define these operations? Why are they defined this way?

In fact, just as with code reuse, linear algebra defines the dot product, the cross product, and matrix operations because they are so widely applicable and so reusable that they can serve as the foundation for analyzing and solving problems. For example, since many problems involve the projection of one vector onto another or the angle between two vectors, it is worth defining a dedicated operation, the dot product:

dot_product(a, b) = a1*b1 + a2*b2 + ... + an*bn = |a| * |b| * cos(θ), where θ is the angle between a and b

The concept of the dot product is a matter of design and leaves room for creativity. Once the design is settled, however, the specific formula cannot be chosen arbitrarily: it must follow logically, so that it maps correctly onto the model of elementary mathematics. This is like a high-level language, which may define many concepts, such as higher-order functions and closures, but must ensure that when they are mapped to the underlying implementation, the resulting behavior conforms to the specification it defines.
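
The two uses of the dot product mentioned above, the angle between two vectors and the projection of one vector onto another, can be sketched as follows. This is a minimal illustration using only the Python standard library; the helper names (dot_product, norm, angle_between, projection) are chosen for this example and are not from any specific package.

```python
import math


def dot_product(a, b):
    """Sum of the products of corresponding components."""
    return sum(x * y for x, y in zip(a, b))


def norm(a):
    """Length of a vector, defined through the dot product."""
    return math.sqrt(dot_product(a, a))


def angle_between(a, b):
    """Angle in radians between two nonzero vectors a and b."""
    cos_theta = dot_product(a, b) / (norm(a) * norm(b))
    # Clamp to [-1, 1] to guard against floating-point rounding.
    return math.acos(max(-1.0, min(1.0, cos_theta)))


def projection(a, b):
    """Projection vector of a onto a nonzero vector b."""
    scale = dot_product(a, b) / dot_product(b, b)
    return tuple(scale * x for x in b)


print(math.degrees(angle_between((1, 0), (0, 1))))  # perpendicular vectors
print(projection((3, 4), (1, 0)))                   # shadow of (3, 4) on the x-axis
```

Both functions are built from dot_product alone, which is the reuse argument in miniature: one well-chosen operation serves several problems.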


What’s great about linear algebra?

As mentioned above, linear algebra is a high-level abstract model, and we can learn its syntax and semantics the same way we learn a programming language. But this understanding is not specific to linear algebra; it applies to every mathematical discipline, so some people may wonder:

Calculus and probability theory are also high-level abstractions, so what is distinctive about linear algebra as a high-level abstraction?

This gets to the core of linear algebra: the vector model. The coordinate system we study in elementary mathematics belongs to the analytic model proposed by Descartes, which is very useful but also has major drawbacks. A coordinate system is an artificial frame of reference, yet the problems we need to solve, such as finding an area or rotating and stretching a figure, are independent of any coordinate system. Imposing a coordinate system often does not help solve the problem, as we saw with the triangle area.

The vector model overcomes the shortcomings of the analytic model well. If the analytic model represents an “absolute” view of the world, then the vector model represents a “relative” one. I recommend regarding the vector model and the analytic model as two opposing models.

The vector model defines the concepts of vectors and scalars. A vector has magnitude and direction and obeys the rules of linear combination. A scalar is a quantity that has only magnitude and no direction (note: a more profound definition of a scalar is a quantity that remains unchanged under coordinate transformations). One of the strengths of the vector model is its independence from the coordinate system, that is, its relativity. From the very start, the definitions of vectors and their operations set aside the constraints of a coordinate system: no matter how you rotate the axes, the model adapts. Linear combinations, inner products, cross products, linear transformations, and the other operations are all independent of the coordinate system. Note that coordinate-system independence does not mean there is no coordinate system. There is one: the vertices in the triangle example are given by coordinates. It just doesn’t matter which coordinate system you use when you solve the problem. By analogy, Java claims to be platform-independent. This does not mean that Java runs on thin air, but that it doesn’t matter whether the underlying platform is Linux or Windows when you program in Java.

What are the benefits of the vector model? Besides the triangle area problem, let me give another geometric example:

Given a point (x0, y0, z0) in a three-dimensional coordinate system and a plane a*x + b*y + c*z + d = 0, what is the perpendicular distance from the point to the plane?



This problem is quite complicated to solve with analytic geometry, unless the plane happens to pass through a coordinate axis, but it is easy to solve with the vector model. From the plane equation, the normal vector of the plane is v = (a, b, c). Let w be the vector from any point (x, y, z) on the plane to (x0, y0, z0). Using the inner product dot_product(w, v), compute the projection vector p of w onto v; its magnitude, abs(dot_product(w, v)) / |v|, is the perpendicular distance from (x0, y0, z0) to the plane a*x + b*y + c*z + d = 0. We used only the basic concepts of the vector model here: normal vector, projection vector, and inner product. The whole solution is short and direct.
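
The solution above can be sketched in Python as follows. The function name point_plane_distance is chosen for this illustration. The sketch relies on one extra step not spelled out in the text: for any point q on the plane, dot_product(v, q) = -d, so dot_product(w, v) simplifies to a*x0 + b*y0 + c*z0 + d and no explicit point on the plane needs to be picked.

```python
import math


def dot_product(a, b):
    return sum(x * y for x, y in zip(a, b))


def point_plane_distance(point, plane):
    """Perpendicular distance from point (x0, y0, z0) to the plane
    a*x + b*y + c*z + d = 0, given as the tuple (a, b, c, d)."""
    a, b, c, d = plane
    v = (a, b, c)  # normal vector of the plane
    # For w running from any point q on the plane to `point`:
    #   dot_product(w, v) = dot_product(point, v) - dot_product(q, v)
    #                     = dot_product(point, v) + d
    w_dot_v = dot_product(point, v) + d
    # Magnitude of the projection of w onto v.
    return abs(w_dot_v) / math.sqrt(dot_product(v, v))


print(point_plane_distance((1, 2, 3), (0, 0, 1, 0)))   # distance to the plane z = 0
print(point_plane_distance((0, 0, 0), (1, 2, 2, -9)))  # a tilted plane
```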

I’ll leave you with a similar exercise (those of you familiar with machine learning may recognize it as linear algebra applied to linear classification):

Given two points (a1, a2, ..., an) and (b1, b2, ..., bn) in n-dimensional space and a hyperplane c1*x1 + c2*x2 + ... + cn*xn + d = 0, determine whether the two points are on the same side or on opposite sides of the hyperplane.
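
If you would like to check your answer, here is one possible approach, sketched in Python; skip it if you prefer to work the exercise out first. It assumes the same idea as the point-to-plane example: the sign of c1*x1 + ... + cn*xn + d tells which side of the hyperplane a point lies on. The names side_value and same_side are made up for this sketch.

```python
def dot_product(a, b):
    return sum(x * y for x, y in zip(a, b))


def side_value(point, normal, d):
    """Value of c1*x1 + ... + cn*xn + d at `point`; its sign identifies the side."""
    return dot_product(normal, point) + d


def same_side(a, b, normal, d):
    """True if a and b lie strictly on the same side of the hyperplane,
    False if strictly on opposite sides, None if either point lies on it."""
    sa = side_value(a, normal, d)
    sb = side_value(b, normal, d)
    if sa == 0 or sb == 0:
        return None
    return (sa > 0) == (sb > 0)


# Hyperplane x1 + x2 - 1 = 0 in two dimensions.
print(same_side((1, 1), (2, 3), (1, 1), -1))
print(same_side((0, 0), (2, 3), (1, 1), -1))
```

With floating-point inputs, an exact comparison against zero is fragile; a small tolerance would be a reasonable refinement.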

Moving on from vectors, we now introduce the other protagonist of linear algebra: the matrix.

Linear algebra defines multiplication of a matrix by a vector and of a matrix by a matrix. The rules are complicated and their purpose is not obvious, so many beginners fail to understand them well; the matrix is arguably the biggest stumbling block in learning linear algebra. When facing something complex, it often helps to avoid diving straight into the details and to grasp the whole first. From a programmer’s point of view, no matter how strange its form, a matrix is nothing more than a piece of syntax, and syntax must correspond to semantics, so the key to understanding a matrix is to understand its semantics. A matrix has more than one semantics: it means different things in different contexts, and can even be interpreted differently within the same context. The most common interpretations are: 1) a linear transformation; 2) a set of column vectors or row vectors; 3) a set of submatrices.

Viewed as a whole, a matrix carries linear-transformation semantics: multiplying matrix A by a vector v gives w, and A represents the linear transformation from v to w. For example, to rotate v0 by 60 degrees counterclockwise to get v’, just multiply v0 by the rotation matrix:

v’ = R * v0, where R = [[cos 60°, -sin 60°], [sin 60°, cos 60°]], written row by row

Besides rotation, stretching is another common transformation. For example, we can stretch a vector by a factor of two along the x-axis using a stretch matrix (try to work out the form of the stretch matrix yourself). More importantly, matrix multiplication has a nice property: it is associative. This means that linear transformations can be composed. For example, given the matrix M for “rotate 60 degrees counterclockwise” and the matrix N for “stretch by a factor of two along the x-axis”, we can multiply them to get a new matrix T = N * M that represents “rotate 60 degrees counterclockwise, then stretch by a factor of two along the x-axis” (with column vectors, the transformation applied first is written on the right). Isn’t this a lot like piping multiple commands together in the shell?
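
The composition described above can be sketched in plain Python. Note that this spells out one possible stretch matrix, so skip it if you would rather derive it yourself. The helper names (mat_vec, mat_mul, rotation, stretch_x) are invented for this example, and the sketch assumes column vectors, so the transformation applied first sits on the right of the product.

```python
import math


def mat_vec(m, v):
    """Multiply matrix m (a list of rows) by column vector v."""
    return tuple(sum(row[k] * v[k] for k in range(len(v))) for row in m)


def mat_mul(a, b):
    """Multiply matrix a by matrix b, both given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]


def rotation(degrees):
    """2D counterclockwise rotation matrix."""
    t = math.radians(degrees)
    return [[math.cos(t), -math.sin(t)],
            [math.sin(t), math.cos(t)]]


def stretch_x(factor):
    """2D matrix that scales the x component and leaves y unchanged."""
    return [[factor, 0.0],
            [0.0, 1.0]]


M = rotation(60)    # rotate 60 degrees counterclockwise
N = stretch_x(2)    # stretch by a factor of two along the x-axis
T = mat_mul(N, M)   # rotate first, then stretch

v0 = (1.0, 0.0)
step_by_step = mat_vec(N, mat_vec(M, v0))  # apply M, then N
combined = mat_vec(T, v0)                  # apply the single matrix T
print(step_by_step, combined)              # the two agree up to rounding
```

Swapping the order, mat_mul(M, N), generally gives a different matrix: matrix multiplication is associative but not commutative, much as the order of commands in a shell pipeline matters.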


Besides coordinate-system independence, another strength of the vector model is linearity, which lets it represent linear relations. Let’s look at an example using the familiar Fibonacci sequence:

The Fibonacci sequence is defined as:
f(n) = f(n-1) + f(n-2), f(0) = 0, f(1) = 1. Problem: given n, give an algorithm that computes f(n) with time complexity no greater than O(log n).

First, we construct two vectors v1 = (f(n+1), f(n)) and v2 = (f(n+2), f(n+1)). From the definition of the Fibonacci sequence, we obtain the recurrence matrix that transforms v1 into v2:

v2 = M * v1, where M = [[1, 1], [1, 0]], written row by row

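And further obtain, by applying the transformation n times starting from (f(1), f(0)):

(f(n+1), f(n)) = [[1, 1], [1, 0]]^n * (f(1), f(0))
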
This transforms the linear recurrence into the classic problem of raising a matrix to the nth power, which can be solved in O(log n) time by repeated squaring. Besides linear recurrences, the well-known systems of n linear equations in n unknowns from elementary mathematics also become easier to solve once they are rewritten as a matrix-vector multiplication. The point of this example is that any system satisfying a linear relation is a place where the vector model applies, and we can often obtain a simple and efficient solution by translating it into linear algebra.
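
A minimal sketch of this idea in Python, using exponentiation by squaring on the matrix [[1, 1], [1, 0]]. The function names mat_mul, mat_pow, and fib are chosen for this illustration. The O(log n) bound counts 2x2 matrix multiplications; for very large n the integers themselves grow, so each multiplication is no longer constant time.

```python
def mat_mul(a, b):
    """Product of two 2x2 matrices given as lists of rows."""
    return [
        [a[0][0] * b[0][0] + a[0][1] * b[1][0],
         a[0][0] * b[0][1] + a[0][1] * b[1][1]],
        [a[1][0] * b[0][0] + a[1][1] * b[1][0],
         a[1][0] * b[0][1] + a[1][1] * b[1][1]],
    ]


def mat_pow(m, n):
    """Raise a 2x2 matrix to the n-th power (n >= 0) by repeated squaring."""
    result = [[1, 0], [0, 1]]  # identity matrix
    while n > 0:
        if n & 1:              # this bit of n is set: fold m into the result
            result = mat_mul(result, m)
        m = mat_mul(m, m)      # square for the next bit
        n >>= 1
    return result


def fib(n):
    """f(n) with f(0) = 0, f(1) = 1."""
    # (f(n+1), f(n)) = M^n * (f(1), f(0)) = M^n * (1, 0),
    # which is the first column of M^n, so f(n) is its bottom-left entry.
    return mat_pow([[1, 1], [1, 0]], n)[1][0]


print([fib(i) for i in range(10)])
print(fib(50))
```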

In short, my experience is that the vector model is the core of linear algebra, and that the concepts, properties, relations, and transformations of vectors are the key to mastering and using it.

Conclusion

This article presents the view that, from an application point of view, linear algebra can be regarded as a domain-specific programming language. Linear algebra builds the vector model on top of elementary mathematics and defines a syntax and semantics that satisfy the language contract of programming languages. The vector model is independent of the coordinate system and is linear; it is the core of linear algebra and the best model for solving problems in linear spaces.
