Machine learning

Concept:

  • Machine learning is a branch of artificial intelligence. Its main subject of study is how specific algorithms can improve their performance by learning from experience.

  • Machine learning is the study of computer algorithms that can be improved automatically through experience.

  • Machine learning is the use of data or past experience to optimize a computer program’s performance criteria.

    (Formal definitions from Wikipedia)

    Put simply, a machine runs a learning algorithm on data to gain predictive ability, and then uses that ability to make predictions. The essence of machine learning is that machines, given data, look for correlations in that data.

Diagrams of concepts related to artificial intelligence

Put simply, artificial intelligence includes machine learning algorithms, search algorithms, and more. Deep learning is a subfield of machine learning.

Data

Data set: a collection of data, usually containing features and labels. Each row is one sample, each column except the last is a feature, and the last column is the label. When an algorithm is applied, the data set is divided into a training set and a test set. Each sample is a point in the feature space, whose dimension equals the number of features: with two or three features the space can be visualized directly, and with more features it becomes a high-dimensional feature space.
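
The layout described above can be sketched in a few lines of plain Python. This is only an illustration: the numbers in `data` are made up, and the helper names `split_features_labels` and `train_test_split` are hypothetical, not taken from any particular library.

```python
import random

# Hypothetical toy data set: each row is one sample, the first two
# columns are features, and the last column is the label.
data = [
    [5.1, 3.5, 0],
    [4.9, 3.0, 0],
    [6.2, 2.9, 1],
    [5.9, 3.0, 1],
    [4.7, 3.2, 0],
    [6.7, 3.1, 1],
]

def split_features_labels(rows):
    """Separate every row into its feature vector and its label."""
    X = [row[:-1] for row in rows]
    y = [row[-1] for row in rows]
    return X, y

def train_test_split(rows, test_ratio=0.33, seed=0):
    """Shuffle a copy of the rows and cut it into training and test sets."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    n_test = int(len(shuffled) * test_ratio)
    return shuffled[n_test:], shuffled[:n_test]

train_rows, test_rows = train_test_split(data)
X_train, y_train = split_features_labels(train_rows)
X_test, y_test = split_features_labels(test_rows)

n_features = len(X_train[0])  # dimension of the feature space
```

With six rows and a test ratio of 0.33, one row ends up in the test set and five in the training set; the feature space here is two-dimensional.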

Process

General process:

Training data –> machine learning algorithm –> model –> input sample –> output result
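
The pipeline above can be sketched with a deliberately simple learning algorithm. The nearest-centroid rule used here is an assumption chosen for brevity (the notes do not name a specific algorithm at this point), and the function names `fit` and `predict` as well as the sample values are illustrative.

```python
def fit(X_train, y_train):
    """Learning algorithm: turn training data into a model.

    The model is a dict mapping each label to the centroid (mean
    feature vector) of the training samples with that label.
    """
    sums, counts = {}, {}
    for features, label in zip(X_train, y_train):
        acc = sums.setdefault(label, [0.0] * len(features))
        for i, value in enumerate(features):
            acc[i] += value
        counts[label] = counts.get(label, 0) + 1
    return {label: [s / counts[label] for s in acc] for label, acc in sums.items()}

def predict(model, sample):
    """Feed an input sample to the model and return the output result."""
    def sq_dist(a, b):
        return sum((p - q) ** 2 for p, q in zip(a, b))
    return min(model, key=lambda label: sq_dist(model[label], sample))

# Training data -> learning algorithm -> model
X_train = [[1.0, 1.0], [1.0, 2.0], [8.0, 8.0], [9.0, 8.0]]
y_train = ["small", "small", "large", "large"]
model = fit(X_train, y_train)

# Input sample -> model -> output result
result = predict(model, [2.0, 1.5])
```

Here `result` is `"small"`, because the input sample lies closer to the centroid of the "small" samples than to that of the "large" ones.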

Categories

Prediction results:

Classification and regression

  • Machine learning tasks fall into two types, according to what is predicted

    • Classification: the model is expected to predict a category

      • Common forms: binary classification, multi-class classification
    • Regression: the model is expected to predict a continuous numeric value

      • In some cases a regression task can be simplified to a classification task, for example by grouping the target values into ranges
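
One common way to simplify a regression task into a classification task is to cut the continuous target into ranges. The sketch below assumes a hypothetical exam score as the target; the thresholds 60 and 80 and the name `to_category` are made up for illustration.

```python
def to_category(score, boundaries=(60, 80)):
    """Map a continuous target value to a class index by thresholding.

    With the default boundaries: below 60 -> class 0,
    60 up to (not including) 80 -> class 1, 80 and above -> class 2.
    """
    for index, bound in enumerate(boundaries):
        if score < bound:
            return index
    return len(boundaries)

# Continuous targets (regression) ...
scores = [35.5, 59.9, 60.0, 72.3, 80.0, 91.0]
# ... become discrete class labels (classification).
labels = [to_category(s) for s in scores]
# labels is [0, 0, 1, 1, 2, 2]
```

The trade-off is a loss of detail: after the conversion, the model can only say which range a value falls in, not the value itself.
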
Supervision or not:

Supervised learning, unsupervised learning, semi-supervised learning, and reinforcement learning

  • Supervised learning: the training data given to the machine is labeled

    • Common supervised learning: k-nearest neighbors (kNN), linear regression, polynomial regression, logistic regression, SVM, decision tree, random forest
  • Unsupervised learning: the training data given to the machine has no labels

    • Common unsupervised learning: cluster analysis, dimensionality reduction, feature extraction
  • Semi-supervised learning: the training data given to the machine is partly labeled and partly unlabeled

    • Cause: labels are missing for some samples, for various reasons
    • Semi-supervised data is quite common in practice. In most cases the data has to be processed first and then handed to the machine for learning
  • Reinforcement learning: the machine takes actions based on the surrounding environment and learns how to act from the results of those actions

    • Builds on supervised and semi-supervised learning
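
As a concrete instance of supervised learning, here is a minimal k-nearest neighbors sketch. It assumes Euclidean distance and a simple majority vote, which is one common formulation; the data points and the name `knn_predict` are made up for illustration. The key point is that every training sample comes with a label in `y_train`.

```python
import math
from collections import Counter

def knn_predict(X_train, y_train, sample, k=3):
    """Vote among the labels of the k training samples closest to `sample`."""
    distances = sorted(
        (math.dist(features, sample), label)
        for features, label in zip(X_train, y_train)
    )
    nearest_labels = [label for _, label in distances[:k]]
    return Counter(nearest_labels).most_common(1)[0][0]

# Labeled training data: this is what makes the task supervised.
X_train = [[1.0, 1.1], [1.2, 0.9], [0.8, 1.0],
           [5.0, 5.2], [5.1, 4.9], [4.8, 5.0]]
y_train = ["A", "A", "A", "B", "B", "B"]

prediction_near_a = knn_predict(X_train, y_train, [1.0, 1.0])
prediction_near_b = knn_predict(X_train, y_train, [5.0, 5.0])
```

An unsupervised method would receive only `X_train`, and a semi-supervised one would receive `y_train` with some entries missing.
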
Learning environment:

Batch learning, online learning

  • Batch learning: the model is trained on all samples at once

    • Advantages: simple; once the model is trained, it is used as-is and no longer changes

    • Disadvantages: cannot adapt to changes in the environment; adapting to a change requires running batch learning again

  • Online learning: during training, the error is calculated and the parameters are adjusted for each incoming sample

    • Advantages: reflects changes in the environment promptly

    • Disadvantages: bad new data can change the model for the worse
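
The difference between the two modes can be sketched on a one-parameter model y = w * x. The closed-form least-squares fit stands in for batch learning and a per-sample gradient step stands in for online learning; both are assumptions chosen for brevity, and the data, the learning rate of 0.1, and the function names are made up.

```python
def batch_fit(samples):
    """Batch learning: least-squares slope of y = w * x from all samples at once."""
    sum_xy = sum(x * y for x, y in samples)
    sum_xx = sum(x * x for x, y in samples)
    return sum_xy / sum_xx

def online_update(w, x, y, learning_rate=0.1):
    """Online learning: adjust w using the error on a single sample."""
    error = y - w * x
    return w + learning_rate * error * x

old_data = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)] * 3  # environment: y = 2x
new_data = [(1.0, 3.0), (2.0, 6.0), (3.0, 9.0)] * 3  # environment changes: y = 3x

# Batch model: trained once, stays at w = 2 until batch_fit is run again.
w_batch = batch_fit(old_data)

# Online model: updated sample by sample as the data arrives.
w_online = 0.0
for x, y in old_data:
    w_online = online_update(w_online, x, y)
w_online_before_change = w_online  # close to 2

for x, y in new_data:
    w_online = online_update(w_online, x, y)
# w_online is now close to 3, while w_batch is still 2
```

The same mechanism explains the disadvantage of online learning: if the incoming samples were bad data rather than a real change, the model would follow them just as readily.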

Learning Methods:

Parametric learning, non-parametric learning

  • Parametric learning: assume a form for the relationship in the data, then find the parameters of that relationship

    • Characteristic: the parameters are learned from the data set. Once they are learned, the original data set is no longer needed
  • Non-parametric learning: makes few assumptions about the form of the model

    • Note: non-parametric does not mean the model has no parameters
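
A minimal sketch of the contrast, assuming a straight-line model for the parametric case and a k-nearest neighbors average for the non-parametric case. Both choices, the function names, and the data are illustrative only.

```python
def fit_line(xs, ys):
    """Parametric: assume y = a * x + b and solve for a and b by least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    a = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sum(
        (x - mean_x) ** 2 for x in xs
    )
    b = mean_y - a * mean_x
    return a, b

def knn_regress(xs, ys, x_new, k=2):
    """Non-parametric: average the targets of the k nearest stored samples."""
    nearest = sorted(zip(xs, ys), key=lambda pair: abs(pair[0] - x_new))[:k]
    return sum(y for _, y in nearest) / k

xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 5.0, 7.0, 9.0]  # generated from y = 2x + 1

# Parametric: after fitting, only a and b are needed to predict.
a, b = fit_line(xs, ys)
line_prediction = a * 2.5 + b

# Non-parametric: the training data itself is needed at prediction time.
knn_prediction = knn_regress(xs, ys, 2.5)
```

Note that `knn_regress` still has a parameter, `k`. What makes it non-parametric is that it assumes no fixed functional form and keeps the training data instead of compressing it into a fixed set of learned values.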