
The term “regression” was coined by Francis Galton, an English scientist and cousin of Charles Darwin, the biologist who founded the theory of evolution. Galton found that, although tall parents tend to have tall children and short parents tend to have short children, the average height of children born to parents of a given height tends to move, or “regress”, toward the average height of the population as a whole. In other words, even when both parents are unusually tall or unusually short, their children's heights tend to lie closer to the population average than the parents' heights do. This is known as Galton’s law of universal regression. It was confirmed by his friend Karl Pearson, a British mathematician and a founder of mathematical statistics. Pearson collected more than a thousand height records from members of family groups and found that the sons of tall fathers were, on average, shorter than their fathers, while the sons of short fathers were, on average, taller than their fathers. Tall and short sons alike thus “regress” toward the average height of all men; in Galton’s words, this was “regression to mediocrity.”

Unary linear regression

  • Regression analysis establishes an equation that models how two or more variables are related.
  • The variable being predicted is called the dependent variable (the output).
  • The variable used to make the prediction is called the independent variable (the input).
  • Unary (simple) linear regression involves one independent variable and one dependent variable, and models the relationship between them with a straight line.
  • If more than one independent variable is included, the model is called multiple regression.

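The regression equation can be written as y = 𝜃0 + 𝜃1·x, where x is the independent variable and y is the dependent variable. The graph of this equation is a straight line, called the regression line. 𝜃1 is the slope of the regression line, and 𝜃0 is the intercept of the regression line.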
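To make the roles of the slope 𝜃1 and the intercept 𝜃0 concrete, here is a minimal Python sketch that fits a regression line of the form y = 𝜃0 + 𝜃1·x to a handful of points. It assumes the line is chosen by ordinary least squares, which the text above does not specify; the function names `fit_line` and `predict` and the sample data are made up for illustration.

```python
def fit_line(xs, ys):
    """Return (theta0, theta1) for the least-squares line y = theta0 + theta1 * x.

    Assumption: ordinary least squares is used as the fitting criterion.
    """
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need at least two (x, y) pairs of equal length")

    x_mean = sum(xs) / n
    y_mean = sum(ys) / n

    # Sum of squared deviations of x, and sum of cross-deviations of x and y.
    sxx = sum((x - x_mean) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("all x values are identical; the slope is undefined")
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))

    theta1 = sxy / sxx                  # slope of the regression line
    theta0 = y_mean - theta1 * x_mean   # intercept of the regression line
    return theta0, theta1


def predict(theta0, theta1, x):
    """Evaluate the regression line at the input x."""
    return theta0 + theta1 * x


# Made-up sample data lying exactly on the line y = 1 + 2x.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]

theta0, theta1 = fit_line(xs, ys)
print("intercept:", theta0, "slope:", theta1)
print("prediction at x = 4:", predict(theta0, theta1, 4.0))
```

Because the sample points lie exactly on a line, the fitted intercept and slope recover that line; with noisy data the same formulas give the line that minimizes the sum of squared vertical distances to the points.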