Machine learning and artificial intelligence are attracting more and more attention. Big data is already well established in industry, and machine learning built on big data is even more popular because it can make predictions from data and inform companies' decisions. The machine learning algorithms most familiar from everyday life are movie and book recommendation algorithms, which make suggestions based on your viewing history or purchase history.

James Le posted an article on KDnuggets about how he got into machine learning, in which he introduced ten commonly used machine learning algorithms one by one. Lei Feng net's translation follows; it may not be reproduced without permission.

If you want to learn machine learning, how do you get started? Here is how I started. First, I took an artificial intelligence course. My teacher was a professor at the Technical University of Denmark whose research focuses on logic and artificial intelligence. The textbook we used was a classic: Stuart Russell and Peter Norvig's Artificial Intelligence: A Modern Approach, which covers intelligent agents, adversarial search, probability theory, multi-agent systems, the philosophy of AI, and more. I took this course over three semesters, and by the end I had built a simple search-based intelligent agent that could complete a transport task in a virtual environment.

I learned a great deal from that course and will keep studying the field. In recent weeks I had the opportunity to speak with a number of machine learning experts at a machine learning conference in San Francisco, where we talked a lot about deep learning, neural networks, and data structures. I also just finished an introductory machine learning course online. In the following sections, I will share some common machine learning algorithms that I learned in that course.

Machine learning algorithms fall into three categories: supervised learning, unsupervised learning, and reinforcement learning. Supervised learning requires labeled training data (with both positive and negative examples), unsupervised learning does not, and reinforcement learning sits between the two, learning from reward signals rather than explicit labels. Here I will introduce the top 10 algorithms in machine learning (supervised and unsupervised only, not reinforcement learning).

Supervised learning

Algorithm 1: Decision Tree

A decision tree is a tree structure that provides a basis for making decisions. It can be used to answer yes/no questions: the tree represents the possible combinations of conditions, each branch represents a choice (yes or no), and following the branches until no choices remain yields the final answer.
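A decision tree can be sketched as nested yes/no tests. The scenario below (loan approval), its feature names, and its thresholds are all invented for illustration:

```python
# A minimal hand-built decision tree for a hypothetical loan-approval
# question; the features and thresholds here are illustrative only.
def approve_loan(income, has_debt):
    """Walk a tiny yes/no decision tree and return the leaf decision."""
    if income >= 50000:      # first split: is income high enough?
        if has_debt:         # second split: is there existing debt?
            return "review"
        return "approve"
    return "deny"

# Each call traces one root-to-leaf path through the tree.
print(approve_loan(60000, False))  # "approve"
```

Learning algorithms such as CART build these splits automatically from labeled data instead of by hand.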

Algorithm 2: Naive Bayes Classifier

The naive Bayes classifier is based on Bayes' theorem together with the "naive" assumption that features are independent and do not affect one another.

Bayes' theorem states P(A|B) = P(B|A) · P(A) / P(B), where P(A|B) is the posterior probability, P(B|A) is the likelihood, P(A) is the prior probability, and P(B) is the evidence that normalizes the prediction.

Specific applications include spam detection, article classification, sentiment analysis, face recognition, and so on.
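The spam-detection application can be sketched directly from Bayes' theorem. The toy training messages below are invented; the classifier picks whichever class has the higher posterior, using Laplace smoothing so unseen words do not zero out the product:

```python
from collections import Counter
import math

# Toy training data, invented for illustration.
spam = ["win money now", "free money offer"]
ham = ["meeting at noon", "lunch at noon tomorrow"]

def train(docs):
    counts = Counter(w for d in docs for w in d.split())
    return counts, sum(counts.values())

spam_counts, spam_total = train(spam)
ham_counts, ham_total = train(ham)
vocab = set(spam_counts) | set(ham_counts)

def log_posterior(msg, counts, total, prior):
    # log P(class) + sum of log P(word|class), with Laplace smoothing
    lp = math.log(prior)
    for w in msg.split():
        lp += math.log((counts[w] + 1) / (total + len(vocab)))
    return lp

def classify(msg):
    s = log_posterior(msg, spam_counts, spam_total, 0.5)
    h = log_posterior(msg, ham_counts, ham_total, 0.5)
    return "spam" if s > h else "ham"

print(classify("free money"))  # "spam"
```

Working in log space avoids numerical underflow when many word probabilities are multiplied together.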

Algorithm 3: Least Squares

If you know anything about statistics, you have probably heard of linear regression, and that is exactly what this is. Picture a series of points in the plane; linear regression draws a line that comes as close as possible to those points. There are many ways to find this line, and the least squares method is one of them: it finds the line that minimizes the sum of the squared vertical distances from all the points to the line. That line is the one we want.

"Linear" refers to fitting the data with a line; the distances from the points to the line represent the error, so the least squares method can be seen as error minimization.
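For a single input variable, the error-minimizing line has a closed-form solution. The data points below are made up so the fit can be checked by eye:

```python
# Closed-form simple linear regression: the slope and intercept that
# minimize the sum of squared vertical distances. Toy data invented here.
xs = [1, 2, 3, 4]
ys = [2, 4, 6, 8]  # exactly y = 2x, so the fit should recover slope 2

n = len(xs)
mean_x = sum(xs) / n
mean_y = sum(ys) / n
# slope = covariance(x, y) / variance(x)
slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) \
        / sum((x - mean_x) ** 2 for x in xs)
intercept = mean_y - slope * mean_x

print(slope, intercept)  # 2.0 0.0
```

The same formula generalizes to many variables via the normal equations, which libraries solve with linear algebra rather than explicit loops.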

Algorithm 4: Logistic Regression

The logistic regression model is a binary classifier. It weights the selected features and passes the weighted sum through the logistic (sigmoid) function to compute the probability that a sample belongs to each class; the sample has some probability of belonging to one class and some probability of belonging to the other, and the class with the higher probability is the prediction.

Applications include credit scoring, estimating the probability that a marketing campaign succeeds, product sales forecasting, and even predicting whether an earthquake will occur on a given day.
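The model can be trained with simple gradient descent on the log-loss. This is a minimal sketch on a made-up one-dimensional dataset, not a production implementation:

```python
import math

# Toy 1-D dataset, invented here: negative x means class 0, positive x class 1.
data = [(-2.0, 0), (-1.0, 0), (1.0, 1), (2.0, 1)]

def sigmoid(z):
    """Map a weighted sum to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

w, b = 0.0, 0.0
lr = 0.5
for _ in range(1000):
    for x, y in data:
        p = sigmoid(w * x + b)
        # gradient of the log-loss for one sample: (p - y) * input
        w -= lr * (p - y) * x
        b -= lr * (p - y)

def predict(x):
    # the class with probability above 0.5 wins
    return 1 if sigmoid(w * x + b) > 0.5 else 0
```

After training, `predict` thresholds the learned probability at 0.5 to choose the more likely class.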

Algorithm 5: Support Vector Machine (SVM)

A support vector machine is a binary classification algorithm that finds an (n-1)-dimensional hyperplane in an n-dimensional space dividing the points into two classes. In other words, if two classes of points are linearly separable in the plane, SVM finds the optimal separating line: the one with the largest margin. SVM has a wide range of applications.

Specific applications include: advertising display, gender detection, large-scale image recognition and so on.
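The "optimal" line is the one whose margin, the distance to the nearest point, is largest. This sketch does not solve the SVM optimization itself; it only scores candidate lines w·x + b = 0 by their margin on a toy linearly separable set invented here:

```python
import math

# Toy linearly separable points: (coordinates, label in {-1, +1}).
points = [((1, 1), -1), ((1, 2), -1), ((3, 3), 1), ((4, 3), 1)]

def margin(w, b):
    """Smallest signed distance from any point to the line w.x + b = 0.
    Positive means the line separates the classes; larger is better."""
    norm = math.hypot(*w)
    return min(y * (w[0] * p[0] + w[1] * p[1] + b) / norm
               for p, y in points)

# Both lines separate the data, but the vertical line x = 2 has the
# larger margin, so an SVM would prefer it.
print(margin((1, 1), -4))  # a separating line
print(margin((1, 0), -2))  # a separating line with a larger margin
```

A real SVM solver (quadratic programming, or libraries such as LIBSVM) searches for the w and b that maximize this quantity.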

Algorithm 6: Ensemble Learning

Ensemble learning combines many classifiers, each with its own weight, and merges their individual predictions into the final classification. The earliest ensemble method was Bayesian averaging, but more recent approaches include error-correcting output coding, bagging, and boosting.

So why is an ensemble classifier better than a single classifier?

1. It averages out bias: if you average a set of Democratic-leaning polls with a set of Republican-leaning polls, you get a result that leans neither way. In the same manner, an ensemble averages away the individual biases of its member models.

2. It reduces variance: the aggregate result is better than any single model's because it considers the problem from multiple perspectives. It is similar to the stock market, where a portfolio of many stocks fluctuates less than a single stock; combining many models smooths out their individual noise.

3. It is less likely to overfit: if the individual models do not overfit, then an ensemble that combines their predictions will not overfit either.
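The simplest ensemble is an unweighted majority vote. The three deliberately weak rules below are invented for illustration; the ensemble returns whichever label most of them agree on:

```python
from collections import Counter

# Three weak classifiers with different decision thresholds (invented here).
def rule_a(x): return "pos" if x > 0 else "neg"
def rule_b(x): return "pos" if x > -1 else "neg"
def rule_c(x): return "pos" if x > 1 else "neg"

def ensemble(x):
    """Majority vote: the label chosen by most member classifiers wins."""
    votes = Counter(rule(x) for rule in (rule_a, rule_b, rule_c))
    return votes.most_common(1)[0][0]

print(ensemble(0.5))   # two of three rules say "pos"
print(ensemble(-0.5))  # two of three rules say "neg"
```

Bagging and boosting differ mainly in how the member classifiers are trained and weighted, not in this final combining step.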

Unsupervised learning

Algorithm 7: Clustering

A clustering algorithm takes a collection of data points and groups them according to their similarity.

There are many kinds of clustering algorithms, including: centroid-based clustering, connectivity-based clustering, density-based clustering, probabilistic clustering, dimensionality reduction, and neural network / deep learning methods.
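K-means is the classic centroid-based method: alternately assign each point to its nearest center, then move each center to the mean of its points. This one-dimensional sketch uses made-up data and hand-picked starting centers:

```python
# Minimal 1-D k-means (centroid-based clustering); data and initial
# centers are invented for illustration.
def kmeans_1d(data, centers, iters=10):
    for _ in range(iters):
        # assignment step: each point joins its nearest center
        clusters = [[] for _ in centers]
        for x in data:
            idx = min(range(len(centers)), key=lambda i: abs(x - centers[i]))
            clusters[idx].append(x)
        # update step: each center moves to the mean of its cluster
        centers = [sum(c) / len(c) if c else centers[i]
                   for i, c in enumerate(clusters)]
    return centers

data = [1.0, 1.2, 0.8, 9.0, 9.5, 8.5]
print(kmeans_1d(data, [0.0, 10.0]))  # centers settle near 1.0 and 9.0
```

In practice the result depends on the initial centers, which is why libraries rerun k-means from several random starts.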

Algorithm 8: Principal Component Analysis (PCA)

Principal component analysis (PCA) uses an orthogonal transformation to convert a set of possibly correlated variables into linearly uncorrelated variables, the principal components.

PCA is mainly used for data compression, simplification, and visualization. However, it has limitations: interpreting the components requires domain knowledge, and it does not work well on noisy data.
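The first principal component is the top eigenvector of the data's covariance matrix. A minimal way to find it without a linear algebra library is power iteration, shown here on a toy 2-D dataset (invented) that lies roughly along the line y = x:

```python
import math

# Toy 2-D data lying roughly along y = x, invented for illustration.
data = [(1.0, 1.1), (2.0, 1.9), (3.0, 3.2), (4.0, 3.9)]
n = len(data)
mx = sum(p[0] for p in data) / n
my = sum(p[1] for p in data) / n
centered = [(x - mx, y - my) for x, y in data]

# entries of the 2x2 covariance matrix
cxx = sum(x * x for x, _ in centered) / n
cyy = sum(y * y for _, y in centered) / n
cxy = sum(x * y for x, y in centered) / n

# power iteration: repeatedly apply the matrix and renormalize;
# the vector converges to the top eigenvector (first principal component)
v = (1.0, 0.0)
for _ in range(100):
    w = (cxx * v[0] + cxy * v[1], cxy * v[0] + cyy * v[1])
    norm = math.hypot(*w)
    v = (w[0] / norm, w[1] / norm)

print(v)  # roughly (0.71, 0.70): the diagonal direction y = x
```

Projecting the centered data onto `v` compresses each 2-D point to a single coordinate with minimal information loss.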

Algorithm 9: Singular Value Decomposition (SVD)

SVD is a factorization of a real or complex matrix. Given a matrix M with m rows and n columns, it can be decomposed as M = UΣV*, where U and V are unitary matrices and Σ is a diagonal matrix of singular values.

PCA can actually be viewed as a simplified application of SVD. In computer vision, the first face recognition algorithms were built on PCA and SVD: they represented faces with features, reduced the dimensionality, and then performed face matching. Although today's face recognition methods are far more complex, the basic principles are similar.
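The connection to PCA can be made concrete: for a real matrix M, the right singular vectors are the eigenvectors of MᵀM, so the same power iteration used for PCA recovers the leading singular triple. The 2x2 example matrix is invented so the answer (σ₁ = √45) can be checked by hand:

```python
import math

# Example matrix invented here; for M = [[3,0],[4,5]], M^T M = [[25,20],[20,25]]
# has eigenvalues 45 and 5, so the largest singular value is sqrt(45).
M = [[3.0, 0.0],
     [4.0, 5.0]]

def matvec(A, v):
    return [sum(A[i][j] * v[j] for j in range(2)) for i in range(2)]

# build M^T M
MtM = [[sum(M[k][i] * M[k][j] for k in range(2)) for j in range(2)]
       for i in range(2)]

# power iteration on M^T M gives the leading right singular vector v1
v = [1.0, 0.0]
for _ in range(200):
    w = matvec(MtM, v)
    norm = math.hypot(*w)
    v = [w[0] / norm, w[1] / norm]

sigma1 = math.hypot(*matvec(M, v))          # largest singular value
u = [x / sigma1 for x in matvec(M, v)]      # leading left singular vector

print(sigma1)  # ~6.708, i.e. sqrt(45)
```

Keeping only the leading singular triples gives the low-rank approximations used for compression and dimensionality reduction.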

Algorithm 10: Independent Component Analysis (ICA)

ICA is a statistical technique for uncovering hidden factors that underlie sets of random variables. It defines a generative model for the observed data: the data variables are assumed to be linear mixtures of latent variables, combined by an unknown mixing system. The latent factors are assumed to be non-Gaussian and mutually independent, and they are called the independent components of the observed data.

ICA is related to PCA, but it is better at uncovering latent factors. It has applications in digital images, document databases, economic indicators, psychometrics, and so on.
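The generative model described above can be written out directly. This sketch only builds the mixtures x = A·s from two invented non-Gaussian sources; an actual ICA algorithm (FastICA, for example) would then recover the sources from the mixtures alone, without knowing A:

```python
import random

random.seed(0)

# Two independent, non-Gaussian sources (invented): uniform noise and a
# square wave. Gaussian sources would make the problem unidentifiable.
s1 = [random.uniform(-1, 1) for _ in range(200)]
s2 = [1.0 if i % 20 < 10 else -1.0 for i in range(200)]

# The mixing matrix A is unknown to the ICA algorithm; we pick one here
# only to generate the observations.
A = [[1.0, 0.5],
     [0.5, 1.0]]

# Observed signals: each is a linear mixture of the hidden sources.
x1 = [A[0][0] * a + A[0][1] * b for a, b in zip(s1, s2)]
x2 = [A[1][0] * a + A[1][1] * b for a, b in zip(s1, s2)]
```

Given only `x1` and `x2`, ICA exploits the non-Gaussianity and independence assumptions to estimate both A and the original sources.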

That concludes my brief introduction to these machine learning algorithms. With it, and your own understanding, you can start to think about where machine learning appears in our daily lives.

In fact, these machine learning algorithms are not as complex as they might seem; some are closely related to high school mathematics. But applying what you have learned is the heart of machine learning, and of learning in general.

via The 10 Algorithms Machine Learning Engineers Need to Know


Lei Feng net copyright article, unauthorized reprint prohibited. See instructions for details.