Preface

This is a small machine learning experiment that puts into practice the knowledge from “TensorFlow Notes (1): Basic Concepts in TensorFlow”. On the one hand, it is a way to see what machine learning actually looks like; on the other hand, it applies some of what was learned before. The environment used this time:

  • macOS Mojave 10.14.1
  • Python 3.7.0 (pyenv)
  • TensorFlow 1.12.0
  • NumPy 1.15.4

So what is machine learning?

A comparison of machine learning and the human learning process:

Here’s an example in case the picture above isn’t clear:

Suppose you have a girlfriend who cares a lot about how she looks, and the two of you go out for dinner every weekend. Over eight weeks of dates she has always shown up 10 to 30 minutes later than the agreed time, so you start arriving 10 to 30 minutes late as well, and you work out a rule: if she calls and says she is just leaving home, she will be about 30 minutes late; if she says she is almost there, she will be about 10 minutes late. Recently, however, she has been 15 or 45 minutes late instead of 10 or 30, so you adjust your own arrival time accordingly.

Based on this example 🌰, let’s draw an analogy:

Machine learning is a method in which a computer takes existing data (the eight dates), builds a model from it (the pattern of her lateness), and uses that model to predict the future (how late she will be next time).

TensorFlow basics

TensorFlow Notes (1): Basic Concepts in TensorFlow

Build a linear model

Let’s first explain the simple linear model we are going to build:

Suppose we have a linear model: y = 0.1x + 0.2. We already know what this model looks like: it’s a straight line. But now we want the machine to discover this line. Remember what we said above: we can provide a series of (x, y) points, plug them into y = k*x + b, and solve for the values of k and b; then the machine knows what the linear model looks like.

Here’s a more mathematical introduction:

Given a set of N points S = {(x_1, y_1), (x_2, y_2), …, (x_n, y_n)},

the goal of the linear model is to find a line y = kx + b

that makes the loss over all the points, loss = (1/N) Σ (y_i − (k·x_i + b))², as small as possible, because a smaller loss indicates a smaller difference between the predicted values and the actual values.

Once we have found this pair k and b, we can predict the y value for any x.
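As an aside, for this quadratic loss the best k and b can even be written down in closed form (the ordinary least-squares solution, stated here for reference only; the code below does not use it):

k = Σ (x_i − mean(x))·(y_i − mean(y)) / Σ (x_i − mean(x))²,  b = mean(y) − k·mean(x)

Gradient descent, which the code uses instead, reaches the same k and b iteratively and also works for models that have no closed-form solution.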

A few more words here: linear models don’t necessarily predict well in practice. The actual data distribution may not be linear at all; it may be quadratic, cubic, circular, or even irregular, so it is important to judge when a linear model can be used. We use one here only because we already know the data comes from a linear model.
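One quick way to make that judgment is to fit a line to a sample of the data and look at the residual error. A minimal sketch using NumPy (np.polyfit performs a least-squares polynomial fit; the variable names here are just for illustration):

import numpy as np

x = np.random.rand(1000)
y = x * 0.1 + 0.2                        # data that really is linear

k_fit, b_fit = np.polyfit(x, y, 1)       # fit a degree-1 polynomial (a line)
residual = np.mean((y - (k_fit * x + b_fit)) ** 2)
print(k_fit, b_fit, residual)            # roughly 0.1, 0.2, and a residual near 0

# If the residual stayed large, a straight line would be the wrong model.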

Code reading

Without further ado, let’s look at the code section by section:

  1. Import the required Python packages: TensorFlow and NumPy
# Tensorflow simple example
import tensorflow as tf
import numpy as np
  2. Generate 1000 random points using NumPy. For how to use NumPy, see my NumPy series of notes
# Use numpy to generate 1000 random points
x_data = np.random.rand(1000)
y_data = x_data*0.1+0.2         # real value
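Note that np.random.rand(1000) draws 1000 samples uniformly from [0, 1), so the data differs on every run; if you want reproducible runs, you can fix the seed first (optional, not part of the original code):

np.random.seed(42)              # any fixed integer makes runs repeatable
x_data = np.random.rand(1000)   # 1000 uniform samples in [0, 1)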
  3. Construct a linear model
# Construct a linear model
b = tf.Variable(0.)
k = tf.Variable(0.)
y = k*x_data+b                  # predicted
  4. Define the loss function, which measures the gap between the real value y_data and the predicted value y
# Quadratic cost function (loss function)
loss = tf.reduce_mean(tf.square(y_data-y))

Let’s explain what each part means in turn:

  • y_data - y: nothing special here; it is simply the difference between the actual value and the predicted value
  • tf.square: squares each element of its input
  • tf.reduce_mean: computes the mean of a tensor’s elements, optionally along a given axis; it is mainly used to reduce dimensions or to average a tensor. Called without an axis argument, as here, it averages over all elements.

Together, these three operations compute the loss value.
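If the chain of operations is hard to picture as graph nodes, the same three steps in plain NumPy (with made-up numbers) look like this:

import numpy as np

y_data = np.array([0.25, 0.30, 0.22])    # pretend real values
y_pred = np.array([0.20, 0.35, 0.22])    # pretend predictions

# difference -> square -> mean: the same three steps as above
loss = np.mean(np.square(y_data - y_pred))
print(loss)                              # 0.001666...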

  5. Use the GradientDescentOptimizer class to create an optimizer that trains the model and reduces the loss value. This class implements gradient descent; what gradient descent actually is deserves its own tutorial and will be covered later. For now it is enough to know how to write it (though see the NumPy sketch after this code block for a taste).
# Define a gradient descent method to train the optimizer
optimizer = tf.train.GradientDescentOptimizer(0.2)
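To get a feel for what the optimizer does on every step, here is a hand-rolled sketch of the same gradient-descent update in plain NumPy. The gradient formulas come from differentiating the quadratic loss by hand; this is an illustration, not how TensorFlow implements it internally:

import numpy as np

x_data = np.random.rand(1000)
y_data = x_data * 0.1 + 0.2

k, b, lr = 0.0, 0.0, 0.2                 # lr matches the 0.2 above
for _ in range(3000):
    y = k * x_data + b
    # gradients of mean((y_data - y)^2) with respect to k and b
    grad_k = -2 * np.mean((y_data - y) * x_data)
    grad_b = -2 * np.mean(y_data - y)
    k -= lr * grad_k                     # step against the gradient
    b -= lr * grad_b
print(k, b)                              # approaches 0.1 and 0.2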
  6. Use the optimizer to reduce the loss value; minimize is a method of the optimizer
# Minimize the cost function
train = optimizer.minimize(loss)
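In TensorFlow 1.x, minimize is a convenience wrapper around two other optimizer methods; splitting it apart makes the mechanics more explicit (an equivalent sketch for this graph):

grads_and_vars = optimizer.compute_gradients(loss)   # list of (gradient, variable) pairs
train = optimizer.apply_gradients(grads_and_vars)    # op that applies the updates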
  7. Initialize all of the above variables
# Initialize variables
init = tf.global_variables_initializer()
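global_variables_initializer simply bundles the initializer op of every variable created so far; for this graph, initializing k and b explicitly would do the same thing (and running any variable before its initializer raises a FailedPreconditionError):

init = tf.group(k.initializer, b.initializer)   # equivalent here to global_variables_initializer()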
  8. Use a Session to train our model
with tf.Session() as sess:              # define the session context
    sess.run(init)                      # perform initialization
    for step in range(3000):            # train for 3000 steps
        sess.run(train)                 # perform one training step
        if step % 20 == 0:              # every 20 steps
            print(step, sess.run([k, b]))   # print the current values of k and b
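It is also instructive to watch the loss shrink as k and b converge; a small variation of the same loop fetches it in the same run call:

with tf.Session() as sess:
    sess.run(init)
    for step in range(3000):
        sess.run(train)
        if step % 20 == 0:
            # fetch k, b and the current loss together
            print(step, sess.run([k, b, loss]))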

At this point, the simplest linear model is complete. Here is all the code:

# Tensorflow simple example
import tensorflow as tf
import numpy as np

# Use numpy to generate 1000 random points
x_data = np.random.rand(1000)
y_data = x_data*0.1+0.2         # real value

# Construct a linear model
b = tf.Variable(0.)
k = tf.Variable(0.)
y = k*x_data+b                  # predicted

# Quadratic cost function (loss function)
loss = tf.reduce_mean(tf.square(y_data-y))
# Define a gradient descent method to train the optimizer
optimizer = tf.train.GradientDescentOptimizer(0.2)
# Minimize the cost function
train = optimizer.minimize(loss)

# Initialize variables
init = tf.global_variables_initializer()

with tf.Session() as sess:
    sess.run(init)
    for step in range(3000):
        sess.run(train)
        if step % 20 == 0:
            print(step, sess.run([k, b]))

Partial screenshots of the results:

As the screenshots show, by training step 2980 the value of k is very close to 0.1 and the value of b is very close to 0.2, so the model has essentially learned the correct line.
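Finally, to confirm the learned model predicts well, you can fetch the final k and b inside the session and compare predictions against the true line y = 0.1x + 0.2 (the test inputs below are arbitrary):

# add at the end of the training loop, still inside the session
k_final, b_final = sess.run([k, b])
x_test = np.array([0.05, 0.5, 0.95])
print(k_final * x_test + b_final)        # should be close to the true values
print(0.1 * x_test + 0.2)                # the true values for comparison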