This is the seventh day of my participation in the August More Text Challenge. For details, see: August More Text Challenge

1. Knowledge of statistics and data analysis

Basic concepts: random variable, distribution function, probability density function

  1. What is a random variable? What’s the relationship between random variables and random experiments?

Random experiment: an experiment in which a random phenomenon is observed over a large number of repetitions under the same conditions. It has three characteristics:

  1. Before the experiment, it is impossible to determine which result will occur.
  2. All possible results of the experiment can be clearly stated.
  3. The experiment can be repeated under the same conditions, and the results of the repetitions appear in a random manner.

Random variable: describes the result of a random experiment and is usually denoted X. It can be the result of a single random experiment, or a combination of the results of several random experiments.

  1. How do you differentiate between the different random variables?

Randomness follows certain rules, and those rules are the distribution of the random variable. Different random variables are distinguished by their distributions. Once the distribution is known, the probability of each possible result can be stated before the experiment starts, even though the result itself cannot be known in advance.

  1. What is a sample? What’s the relationship between the sample and the random variable?

Sample: the result of each random experiment, also called an “observation”. Depending on how the sample size is counted, the same results of a random variable X can be viewed in two ways:

  1. Treat all the results together as one random experiment with sample size N, with corresponding samples X1, X2, X3…
  2. Treat each result as an independent random experiment with sample size 1, so that X1, X2, and X3 are independent samples from the same random experiment.

X̄ denotes the mean of the results of these random experiments.
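The relationship between a random experiment, its observations, and the sample mean can be sketched in a few lines of Python. The fair six-sided die and the names `roll_die`, `draw_sample`, and `sample_mean` are assumptions chosen for illustration; they are not part of the original notes.

```python
import random

def roll_die(rng):
    """One run of the random experiment: roll a fair six-sided die (assumed example)."""
    return rng.randint(1, 6)

def draw_sample(n, seed=0):
    """Repeat the experiment n times and return the observations X1..Xn."""
    rng = random.Random(seed)  # fixed seed only so the sketch is reproducible
    return [roll_die(rng) for _ in range(n)]

def sample_mean(xs):
    """X-bar: the mean of the observed results."""
    return sum(xs) / len(xs)

observations = draw_sample(10)
print(observations)               # ten observations, each between 1 and 6
print(sample_mean(observations))  # the sample mean of those ten observations
```

Each call to `roll_die` is an experiment with sample size 1, while the list returned by `draw_sample(10)` can be read as one experiment with sample size 10.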

  1. How are random variables classified? What are the categories based on?

Random variables can be divided into discrete random variables and continuous random variables. The classification is based on whether all possible outcomes of the described random experiment are countable.

Countable: all possible outcomes can be listed one by one in some order, even if the list never ends (for example 0, 1, 2, …).

  1. What are the common discrete random variables? What is the distribution law of each of them?
    1. Bernoulli distribution (0-1 distribution): the trial has only two possible results
    2. Binomial distribution: the number of successes in n repeated independent Bernoulli trials
      1. The event occurs with the same probability in each Bernoulli trial
      2. The results of the trials are independent of each other and are not affected by the results of the other trials
    3. Poisson distribution: a discrete probability distribution for the number of random events occurring in a unit of time (or space)
  2. What are the common continuous random variables? What’s the probability density function for each of them?

    PDF: probability density function

    CDF: cumulative distribution function
    1. Uniform distribution: the probability density function takes a constant value over the interval of possible results
    2. Normal distribution
    3. Exponential distribution: a probability distribution describing the time between events in a Poisson process, that is, a process in which events occur continuously and independently at a constant average rate
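The distribution laws and density functions of the six distributions above are standard textbook formulas. As a minimal sketch, they can be written with Python's `math` module. The function names and the parameter names (`p`, `n`, `lam`, `a`, `b`, `mu`, `sigma`) are assumptions made for this example.

```python
import math

def bernoulli_pmf(k, p):
    """P(X = k) for a single trial with success probability p."""
    if k == 1:
        return p
    if k == 0:
        return 1 - p
    return 0.0

def binomial_pmf(k, n, p):
    """P(X = k): k successes in n independent Bernoulli trials."""
    return math.comb(n, k) * p**k * (1 - p) ** (n - k)

def poisson_pmf(k, lam):
    """P(X = k): k events in a unit interval with average rate lam."""
    return math.exp(-lam) * lam**k / math.factorial(k)

def uniform_pdf(x, a, b):
    """Constant density 1 / (b - a) on [a, b], zero elsewhere."""
    return 1 / (b - a) if a <= x <= b else 0.0

def normal_pdf(x, mu, sigma):
    """Density of the normal distribution with mean mu and standard deviation sigma."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2 * math.pi))

def exponential_pdf(x, lam):
    """Density of the waiting time between events of a Poisson process with rate lam."""
    return lam * math.exp(-lam * x) if x >= 0 else 0.0

# A probability mass function sums to 1 over all possible results (up to rounding).
print(sum(binomial_pmf(k, 10, 0.3) for k in range(11)))
```

For the discrete distributions the function returns a probability directly. For the continuous ones it returns a density, and probabilities come from integrating the density over an interval.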

Common features of random variables

  1. What are the numerical characteristics that describe random variables?
    1. Expectation: E(X), the average value of the random variable X
    2. Variance and standard deviation: D(X), which characterizes the fluctuation of the random variable X; the greater the variance, the greater the uncertainty of the result
    3. Quantile: the position of a sample value x within the overall distribution
    4. Covariance and correlation coefficient: concerned with the relationship between two or more random variables
  2. What is the relationship between the expectations of X + Y and XY and the expectations of X and Y?
    1. E(X + Y) = E(X) + E(Y): holds without any constraints on X and Y
    2. E(XY) = E(X)E(Y): holds when X and Y are uncorrelated (independence is sufficient but not required); the equality does not imply that X and Y are independent of each other
  3. What’s the relationship between the expectation of a distribution and its median?

The relationship between the expectation of a distribution and its median varies from distribution to distribution:

  1. Positive skew: the median is typically less than the expectation
  2. Normal: the median is equal to the expectation
  3. Negative skew: the median is typically greater than the expectation
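The positive-skew case can be checked with a quick simulation. The sketch below uses the exponential distribution as an assumed example of a positively skewed distribution, since its long right tail pulls the mean above the median. The rate `lam`, the seed, and the sample size are arbitrary choices made for this example.

```python
import random
import statistics

rng = random.Random(42)  # fixed seed only so the sketch is reproducible
lam = 1.0                # rate parameter (assumed value)
sample = [rng.expovariate(lam) for _ in range(100_000)]

mean = statistics.fmean(sample)
median = statistics.median(sample)

print(f"mean   = {mean:.3f}")    # theoretical value: 1 / lam
print(f"median = {median:.3f}")  # theoretical value: ln(2) / lam, about 0.693 / lam
```

With these settings the sample median comes out below the sample mean, which matches the positive-skew case.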

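  1. Describe the difference between independent and uncorrelated variables

Independent: the two variables have no relationship of any kind, so knowing one gives no information about the other. Uncorrelated: the two variables have no linear relationship (their covariance is zero), but they may still be related in a nonlinear way. Independence implies zero correlation, but zero correlation does not imply independence.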
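A small exact example shows that E(XY) = E(X)E(Y) can hold for variables that are clearly dependent. The choice of X uniform on {-1, 0, 1} with Y = X² is an assumption made for illustration, as are the helper names `expectation` and `probability`.

```python
from fractions import Fraction

# Assumed example: X is uniform on {-1, 0, 1} and Y = X**2, so Y is fully determined by X.
outcomes = [(-1, 1), (0, 0), (1, 1)]  # the three equally likely (x, y) pairs
prob = Fraction(1, 3)

def expectation(f):
    """Exact expectation of f(X, Y) over the three outcomes."""
    return sum(prob * f(x, y) for x, y in outcomes)

def probability(event):
    """Exact probability that event(X, Y) is true."""
    return sum(prob for x, y in outcomes if event(x, y))

e_x = expectation(lambda x, y: x)
e_y = expectation(lambda x, y: y)
e_xy = expectation(lambda x, y: x * y)
covariance = e_xy - e_x * e_y

# Independence would require P(X=0, Y=0) == P(X=0) * P(Y=0).
p_joint = probability(lambda x, y: x == 0 and y == 0)
p_x0 = probability(lambda x, y: x == 0)
p_y0 = probability(lambda x, y: y == 0)

print("E(X+Y) == E(X)+E(Y):", expectation(lambda x, y: x + y) == e_x + e_y)  # True
print("covariance:", covariance)                                             # 0
print("independent:", p_joint == p_x0 * p_y0)                                # False
```

The covariance is exactly zero, so X and Y are uncorrelated and E(XY) = E(X)E(Y), yet the joint probability 1/3 differs from the product 1/9, so they are not independent.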

  1. What are the expectation and variance of the common distributions?

Discrete random variables:

Continuous random variables:
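The expectation and variance of the distributions discussed earlier have standard closed forms. The sketch below states them in code and checks each one numerically, by summation for the discrete distributions and by a simple midpoint-rule integration for the continuous ones. The parameter values and helper names are assumptions made for this example, and the Poisson sum and the integration ranges are truncated where the remaining probability is negligible.

```python
import math

# Example parameters (assumed values, chosen only for illustration).
n, p = 10, 0.3        # binomial / Bernoulli
lam = 2.0             # Poisson and exponential rate
a, b = 1.0, 5.0       # uniform interval
mu, sigma = 1.0, 2.0  # normal

def mean_var_discrete(pmf, support):
    """E(X) and D(X) from a probability mass function by direct summation."""
    mean = sum(k * pmf(k) for k in support)
    var = sum((k - mean) ** 2 * pmf(k) for k in support)
    return mean, var

def mean_var_continuous(pdf, lo, hi, steps=100_000):
    """E(X) and D(X) from a density, approximated with the midpoint rule."""
    h = (hi - lo) / steps
    xs = [lo + (i + 0.5) * h for i in range(steps)]
    mean = sum(x * pdf(x) for x in xs) * h
    var = sum((x - mean) ** 2 * pdf(x) for x in xs) * h
    return mean, var

def bernoulli_pmf(k):
    return p if k == 1 else 1 - p

def binomial_pmf(k):
    return math.comb(n, k) * p**k * (1 - p) ** (n - k)

def poisson_pmf(k):
    return math.exp(-lam) * lam**k / math.factorial(k)

def uniform_pdf(x):
    return 1 / (b - a)  # only evaluated inside [a, b]

def normal_pdf(x):
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2 * math.pi))

def exponential_pdf(x):
    return lam * math.exp(-lam * x)  # only evaluated for x >= 0

# name -> ((numeric E, numeric D), (closed-form E, closed-form D))
cases = {
    "Bernoulli(p)": (mean_var_discrete(bernoulli_pmf, range(2)), (p, p * (1 - p))),
    "Binomial(n, p)": (mean_var_discrete(binomial_pmf, range(n + 1)), (n * p, n * p * (1 - p))),
    "Poisson(lam)": (mean_var_discrete(poisson_pmf, range(100)), (lam, lam)),
    "Uniform(a, b)": (mean_var_continuous(uniform_pdf, a, b), ((a + b) / 2, (b - a) ** 2 / 12)),
    "Normal(mu, sigma)": (
        mean_var_continuous(normal_pdf, mu - 12 * sigma, mu + 12 * sigma),
        (mu, sigma**2),
    ),
    "Exponential(lam)": (mean_var_continuous(exponential_pdf, 0.0, 50 / lam), (1 / lam, 1 / lam**2)),
}

for name, ((mean, var), (mean_cf, var_cf)) in cases.items():
    print(f"{name:<18} E = {mean:.4f} (closed form {mean_cf:.4f})  "
          f"D = {var:.4f} (closed form {var_cf:.4f})")
```

The second tuple in each entry is the closed form: for example np and np(1 - p) for the binomial distribution, and 1/λ and 1/λ² for the exponential distribution.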

Normal distribution, law of large numbers, central limit theorem

  1. What are the basic properties of a normal distribution?

The graph of the probability density function is symmetric about the expectation, and the expectation is equal to the median.

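  1. What is the relationship between the three-sigma rule and the normal distribution?

For a normal distribution, the probabilities that a sample falls within σ, 2σ, and 3σ of the expectation are 68.27%, 95.45%, and 99.73% respectively. The probability of a sample falling outside the 3σ interval is only 0.27%, so such data are treated as gross errors and eliminated.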
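The three percentages can be reproduced from the normal CDF with `statistics.NormalDist` from the Python standard library. The helper `drop_gross_errors` is a hypothetical sketch of the elimination step; estimating the expectation and σ from the data itself, with the mean and the population standard deviation, is an assumption of this example.

```python
from statistics import NormalDist, fmean, pstdev

std_normal = NormalDist(mu=0.0, sigma=1.0)

def prob_within(k):
    """P(|X - mu| <= k * sigma) for a normally distributed X."""
    return std_normal.cdf(k) - std_normal.cdf(-k)

for k in (1, 2, 3):
    print(f"within {k} sigma: {prob_within(k):.2%}")
# within 1 sigma: 68.27%
# within 2 sigma: 95.45%
# within 3 sigma: 99.73%

def drop_gross_errors(data):
    """Keep only the values within 3 sigma of the mean (both estimated from the data)."""
    mu = fmean(data)
    sigma = pstdev(data)
    return [x for x in data if abs(x - mu) <= 3 * sigma]

readings = [10.0] * 20 + [100.0]  # made-up readings with one obvious outlier
print(drop_gross_errors(readings))  # the value 100.0 is removed
```

Because the outlier itself inflates the estimated σ, this simple filter only works when the sample is large enough relative to the size of the outlier.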

  1. The law of large numbers: repeat the random experiment for the random variable X many times; as the number of trials increases, the mean of the observed values of X gets closer and closer to E(X).
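The law of large numbers can be watched in a short simulation. The sketch below assumes a fair six-sided die, for which E(X) = 3.5, and tracks the running mean as the number of rolls grows. The function name `running_means`, the seed, and the number of trials are choices made for this example.

```python
import random

def running_means(n_trials, seed=0):
    """Return the mean of the first i rolls for i = 1..n_trials."""
    rng = random.Random(seed)  # fixed seed only so the sketch is reproducible
    total = 0
    means = []
    for i in range(1, n_trials + 1):
        total += rng.randint(1, 6)  # one roll of a fair die (assumed example)
        means.append(total / i)
    return means

expected = 3.5  # E(X) for a fair six-sided die
means = running_means(100_000)
for n in (10, 100, 1_000, 10_000, 100_000):
    gap = abs(means[n - 1] - expected)
    print(f"n = {n:>7}: mean = {means[n - 1]:.4f}, distance from E(X) = {gap:.4f}")
```

The distance from E(X) does not have to shrink at every step, but it tends toward zero as n grows.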