This article was originally published by AI Frontier. The original link is t.cn/RTzPO4A


Today, artificial intelligence has become a required course of the new era, and its importance goes without saying. As an interdisciplinary subject, however, it covers a vast amount of material, and its many complex models and algorithms can be daunting. For most beginners, how to get started with AI is something of a puzzle: What kind of math is required? Is engineering experience necessary? Which deep learning framework should one focus on?


So where should you start learning artificial intelligence, and what does the learning path look like?

This article is excerpted, with authorization, from Professor Tianyi Wang’s Artificial Intelligence Basics course on the Geek Time app. For more articles, download the Geek Time app and subscribe.

Basic mathematical knowledge contains the fundamental ideas and methods for dealing with problems of intelligence, and it is also a necessary element for understanding complex algorithms. All of today’s artificial intelligence technologies ultimately rest on mathematical models. To understand artificial intelligence, one must first master the necessary mathematical foundations, specifically:

  • Linear Algebra: How to formalize the object of study?
  • Probability Theory: How to describe statistical laws?
  • Mathematical Statistics: How to see the big picture from small samples?
  • Optimization Theory: How to find the optimal solution?
  • Information Theory: How to measure uncertainty quantitatively?
  • Formal Logic: How to achieve abstract reasoning?


Linear Algebra: How to formalize the object of study?

In fact, linear algebra is the foundation not only of artificial intelligence but also of modern mathematics and of the many disciplines that use modern mathematics as their primary method of analysis. Fields from quantum mechanics to image processing are inseparable from vectors and matrices. Behind vectors and matrices, the core significance of linear algebra is that it provides an abstract view of the world: everything can be abstracted into a combination of features and viewed, statically and dynamically, within a framework defined by preset rules.

Focusing on the interpretation of abstract concepts rather than concrete mathematical formulas, the main points of linear algebra are as follows:

  • The essence of linear algebra is to abstract concrete things into mathematical objects and describe their static and dynamic properties.
  • A vector is essentially a static point in an n-dimensional linear space.
  • A linear transformation describes a change in a vector, or in the coordinate system used as its frame of reference, and can be represented by a matrix.
  • The eigenvalues and eigenvectors of a matrix describe the speed and direction of that change.

In short, linear algebra is to artificial intelligence what addition is to higher mathematics: a basic toolset.
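To make the point about eigenvalues concrete, here is a minimal sketch (an illustrative example, not part of the column; the matrix is made up) that checks how a matrix scales its eigenvectors by the corresponding eigenvalues:

    # A minimal sketch, assuming NumPy is available: along an eigenvector, applying
    # the matrix is pure scaling, so the eigenvector gives the direction of change
    # and the eigenvalue its speed.
    import numpy as np

    A = np.array([[2.0, 1.0],
                  [1.0, 2.0]])   # a symmetric matrix viewed as a linear transformation

    eigenvalues, eigenvectors = np.linalg.eig(A)

    for lam, v in zip(eigenvalues, eigenvectors.T):
        assert np.allclose(A @ v, lam * v)   # A acts on v by scaling it by lam
        print(f"eigenvalue {lam:.1f}, eigenvector {v}")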


Probability Theory: How to describe statistical laws?

In addition to linear algebra, probability theory is an essential mathematical foundation for artificial intelligence research. With the rise of connectionism, probability and statistics have replaced mathematical logic as the mainstream tools of artificial intelligence research. In a world of explosive data growth and exponentially increasing computing power, probability theory plays a central role in machine learning.

Like linear algebra, probability theory represents a way of looking at the world, one that focuses on the possibilities present everywhere. The frequentist school regards model parameters as fixed but unknown quantities and estimates them by maximum likelihood; the Bayesian school regards the parameters as random variables with a prior distribution and estimates them by maximizing the posterior probability. The normal distribution is the most important distribution of random variables.
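As a rough illustration of the two schools (a sketch of this article’s editor, not the column’s material), the coin-flip example below contrasts maximum likelihood estimation with maximum a posteriori estimation under an assumed Beta(2, 2) prior:

    # An illustrative sketch: estimating the bias of a coin. The flips and the
    # Beta(2, 2) prior are assumptions made up for this example.
    import numpy as np

    flips = np.array([1, 1, 0, 1, 0, 1, 1, 0, 1, 1])   # 1 = heads
    heads, n = int(flips.sum()), len(flips)

    # Frequentist view: the bias theta is a fixed unknown number; maximum likelihood
    # picks the theta that makes the observed data most probable.
    theta_mle = heads / n

    # Bayesian view: theta is a random variable with a prior; under a Beta(a, b) prior
    # the posterior is Beta(a + heads, b + tails), whose mode is the MAP estimate.
    a, b = 2.0, 2.0
    theta_map = (a + heads - 1) / (a + b + n - 2)

    print(f"MLE estimate: {theta_mle:.3f}, MAP estimate: {theta_map:.3f}")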


Mathematical Statistics: How to see the big picture from small samples?

Mathematical statistics is likewise indispensable in artificial intelligence research. Basic statistical theory helps explain machine learning algorithms and data mining results; only with a reasonable interpretation can the value of the data be realized. Mathematical statistics studies random phenomena based on data obtained from observation or experiment and makes reasonable estimates and judgments about the objective laws governing the object of study.

Although mathematical statistics builds on probability theory, the two differ essentially in method. Probability theory assumes that the distribution of a random variable is known and analyzes the variable’s characteristics and laws from that known distribution. Mathematical statistics studies random variables whose distributions are unknown: it observes them independently and repeatedly and infers the underlying distribution from the observations.

In a loose but intuitive phrase, mathematical statistics can be regarded as probability theory in reverse:

  • The task of mathematical statistics is to infer the properties of the population from observable samples.
  • The tool of inference is the statistic: a function of the sample that is itself a random variable.
  • Parameter estimation uses randomly drawn samples to estimate the unknown parameters of the population distribution, and includes point estimation and interval estimation.
  • Hypothesis testing accepts or rejects a judgment about the population based on a randomly drawn sample, and is often used to estimate the generalization error of machine learning models.
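For intuition, here is a minimal sketch of point and interval estimation; the simulated sample, the random seed, and the normal-approximation interval are illustrative assumptions, not the column’s material:

    # Point and interval estimation on a simulated sample.
    import numpy as np

    rng = np.random.default_rng(0)
    sample = rng.normal(loc=5.0, scale=2.0, size=100)   # stand-in for observed data

    # Point estimate: the sample mean is a statistic, a function of the sample
    # that is itself a random variable.
    mean = sample.mean()
    std_err = sample.std(ddof=1) / np.sqrt(len(sample))

    # Interval estimate: an approximate 95% confidence interval for the population mean.
    lower, upper = mean - 1.96 * std_err, mean + 1.96 * std_err
    print(f"point estimate: {mean:.2f}, 95% CI: ({lower:.2f}, {upper:.2f})")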


Optimization Theory: How to find the optimal solution?

In essence, the goal of artificial intelligence is optimization: making optimal decisions in complex environments and multi-agent interactions. Almost all artificial intelligence problems ultimately boil down to solving an optimization problem, so optimization theory is also necessary background knowledge for artificial intelligence. Optimization theory asks whether the maximum (or minimum) of a given objective function exists and, if so, how to find the value at which the objective function attains it. If the objective function is pictured as a mountain range, optimization is the process of locating the peak and finding a path to it.

In the typical case, the optimization problem is to minimize a given objective function without constraints. Line search methods use the first and second derivatives of the objective function to determine the search direction toward the minimum, while trust region methods fix the search step first and then determine the search direction. Heuristic algorithms, exemplified by artificial neural networks, are another important class of optimization methods.
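As a toy illustration of the simplest first-derivative approach (not from the column, and deliberately reduced to plain gradient descent with a fixed step size; the function, starting point, and step are invented):

    # Gradient descent on f(x) = (x - 3)^2.
    def f(x):
        return (x - 3.0) ** 2          # objective function with its minimum at x = 3

    def grad_f(x):
        return 2.0 * (x - 3.0)         # first derivative of the objective

    x, step = 0.0, 0.1
    for _ in range(100):
        x -= step * grad_f(x)          # move against the gradient, i.e. downhill

    print(f"approximate minimizer: {x:.4f}, f(x) = {f(x):.6f}")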


Information Theory: How to measure uncertainty quantitatively?

In recent years, scientific research has shown that uncertainty is an essential attribute of the objective world. In other words, God really does play dice. An uncertain world can only be described with probabilistic models, and this is what gave birth to information theory.

Information theory uses the concept of “information entropy” to quantify the amount of information from a single source and the amount and efficiency of information transferred in communication, building a bridge between the uncertainty of the world and the measurability of information.

In short:

  • Information theory deals with uncertainty in the objective world.
  • Conditional entropy and information gain are important quantities in classification problems.
  • KL divergence describes the difference between two probability distributions.
  • The maximum entropy principle is a commonly used criterion in classification problems.
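These quantities can be computed directly. The sketch below (with made-up distributions, and assuming strictly positive probabilities so no zero-probability handling is needed) evaluates entropy and KL divergence for small discrete distributions:

    # Shannon entropy and KL divergence for discrete distributions.
    import numpy as np

    def entropy(p):
        """Shannon entropy H(p) = -sum p(x) * log2 p(x), in bits."""
        p = np.asarray(p, dtype=float)
        return float(-np.sum(p * np.log2(p)))

    def kl_divergence(p, q):
        """KL divergence D(p || q) = sum p(x) * log2(p(x) / q(x)), in bits."""
        p, q = np.asarray(p, dtype=float), np.asarray(q, dtype=float)
        return float(np.sum(p * np.log2(p / q)))

    uniform = [0.25, 0.25, 0.25, 0.25]   # maximum-entropy distribution over 4 outcomes
    skewed = [0.70, 0.10, 0.10, 0.10]

    print(f"H(uniform) = {entropy(uniform):.3f} bits")                      # 2.000 bits
    print(f"H(skewed)  = {entropy(skewed):.3f} bits")                       # lower: less uncertainty
    print(f"D(skewed || uniform) = {kl_divergence(skewed, uniform):.3f} bits")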


Formal Logic: How to achieve abstract reasoning?

The Dartmouth conference in 1956 heralded the birth of artificial intelligence. In the field’s infancy, its founders, including future Turing Award winners John McCarthy, Herbert Simon, and Marvin Minsky, had a vision of “programs capable of abstract thought that explain how synthetic matter could possess the human mind.” In layman’s terms, the ideal AI should be able to learn, reason, and generalize in the abstract, making it far more versatile than algorithms that solve specific problems such as chess or Go.

If the cognitive process is defined as the logical manipulation of symbols, then formal logic is the foundation of artificial intelligence:

  • Predicate logic is the main method of knowledge representation.
  • A system based on predicate logic can realize artificial intelligence with automated reasoning ability.
  • Gödel’s incompleteness theorems challenge this school’s basic idea that the essence of cognition is computation.
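As a toy illustration of automated reasoning (not from the column; the facts and rules are invented, and only ground facts are handled, the simplest fragment of predicate logic), the sketch below forward-chains over Horn-clause-style rules:

    # Naive forward chaining over ground Horn clauses.
    facts = {"human(socrates)"}
    rules = [
        ({"human(socrates)"}, "mortal(socrates)"),   # if every premise holds, conclude the head
    ]

    # Repeatedly fire rules whose premises are all known until nothing new is derived.
    changed = True
    while changed:
        changed = False
        for premises, head in rules:
            if premises <= facts and head not in facts:
                facts.add(head)
                changed = True

    print(facts)   # {'human(socrates)', 'mortal(socrates)'}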

Full-year catalogue of the Artificial Intelligence Basics column

This column focuses on the core concepts of machine learning and neural networks and, in combination with today’s popular deep learning techniques, outlines the basic framework and main paths of development of artificial intelligence.

Column Subscription Guide

  1. Search for and download the “Geek Time” app in the Apple App Store or an Android app store, then sign up.
  2. Find the column entry on the Discovery page and tap the column to complete the purchase.

Follow our WeChat account “AI Front” and reply “AI” in the background to obtain the “AI Front” series of PDF e-books.