Nowadays, WeChat has become an important interface through which people in China interact with society. We use it to read feel-good articles on the subway, pay electronically at the mall, scroll through WeChat Moments while sprawled on the sofa, and carry on romances (cross that one out). In short, WeChat handles almost everything in our lives, which gives the impression that in the near future WeChat will be the only app we need on our phones.

To prove that the next stage of human society is WeChat-ism, let’s set a small goal: controlling deep learning training with WeChat.

Generally, monitoring deep learning training means using SSH or TensorBoard, and both are awkward to operate from a phone. Wouldn’t it be wonderful if WeChat could monitor deep learning too?

In the age of the command line, using a graphical interface was showing off; in the age of graphical interfaces, using the command line is showing off. It is all about being trendy!

@Coldwings
Use WeChat to supervise your TF training (Zhihu column)

The project now supports passive monitoring, active queries, remotely stopping training, remote shutdown, and more.

Demo:

Communication between the program and WeChat is handled mainly by littlecodersh/ItChat.

You are welcome to fork the project on GitHub: QuantumLiu/wechat_callback. I bit the bullet and added bilingual comments to make it easier to read.

I am a little embarrassed that I have done only a small amount of work. I hope this rough start draws out better ideas from others, and that you will help improve the project.

I. Main features

  1. Real-time monitoring: After each epoch ends, that epoch's training information and two charts covering all batch-level and epoch-level metrics are automatically sent to the File Transfer assistant.
  2. Active query: At any time after training starts, you can send a specially formatted message to query specific items. Batch and epoch metrics and GPU status are currently supported.
  3. Remote commands: It is important to be able to end training gracefully, or even shut the machine down, when you feel training has stalled or when a labmate is urging you to free up the machine. Inside a callback passed to the Keras fit method, you can set
    self.model.stop_training = True

    to terminate training at the end of the current epoch; otherwise, the only option is a brute-force Ctrl+C. With this plug-in, you can send specially formatted messages to stop at a specified epoch, stop training immediately, or even shut down the machine and cancel a shutdown.
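The stop-at-epoch-end mechanism from item 3 can be sketched roughly as follows. This is only an illustration, not the project's actual code: `RemoteStop`, `request_stop`, and `FakeModel` are hypothetical names, and the class is a plain object that mirrors the `on_epoch_end(epoch, logs)` hook of a Keras callback so that the sketch runs without Keras installed. In real use it would subclass `keras.callbacks.Callback`, and Keras would supply `self.model`.

```python
import threading


class RemoteStop(object):
    """Hypothetical sketch: stop training at the end of an epoch on request.

    Mirrors the on_epoch_end hook of a Keras callback; in real use this
    would subclass keras.callbacks.Callback, which provides self.model.
    """

    def __init__(self):
        self.model = None                          # supplied by Keras in real use
        self._stop_requested = threading.Event()   # safe to set from another thread

    def request_stop(self):
        # Would be called from the message-handling thread.
        self._stop_requested.set()

    def on_epoch_end(self, epoch, logs=None):
        # Keras checks model.stop_training once the epoch has finished.
        if self._stop_requested.is_set():
            self.model.stop_training = True


class FakeModel(object):
    """Stand-in for a Keras model, for illustration only."""
    stop_training = False


cb = RemoteStop()
cb.model = FakeModel()
cb.on_epoch_end(0)                       # no request yet: training continues
stopped_before = cb.model.stop_training
cb.request_stop()                        # e.g. triggered by an incoming message
cb.on_epoch_end(1)                       # the flag is now raised
stopped_after = cb.model.stop_training
```

Using a `threading.Event` keeps the message thread and the training thread from touching the model at the same time; the training thread alone flips `stop_training`.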


II. Try it out

1. Preparation


git clone https://github.com/QuantumLiu/wechat_callback.git
cd wechat_callback


Required libraries include ItChat, Keras, and NumPy.

Make sure that nvidia-smi is available.
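One quick way to check this from Python is sketched below. It assumes Python 3.3 or later (for `shutil.which`); `tool_available` is a hypothetical helper name, not something from the project.

```python
import shutil


def tool_available(name):
    """Return True if an executable called `name` is found on PATH."""
    return shutil.which(name) is not None


if tool_available("nvidia-smi"):
    print("nvidia-smi found; GPU status queries should work")
else:
    print("nvidia-smi not found; GPU status queries will be unavailable")
```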

2. Run the test script

python wechat_test.py

Explanation:

wechat_test.py begins by importing wechat_utils:

import wechat_utils  # logs in automatically on import
# wechat_utils.sendmessage() is the Keras callback class; pass it in the callbacks list when calling fit

In wechat_utils.py:

# Automatically log in when imported
# ==============================================================================
itchat.auto_login(enableCmdQR=0.5, hotReload=True)
itchat.dump_login_status()  # dump

As you can see, itchat.auto_login() is called when wechat_utils is imported. If all goes well, a QR code is displayed on the command line; scan it with WeChat on your phone to log in to your account.


In the test script, I use numpy.random to generate training data and build a multi-layer fully connected network:

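import numpy as np
from keras.models import Sequential
from keras.layers import Dense, Activation

# Ten fully connected hidden layers of 2048 units, then a sigmoid output
model = Sequential()
model.add(Dense(2048, input_dim=784))
model.add(Activation('relu'))
for i in range(9):
    model.add(Dense(2048))
    model.add(Activation('relu'))
model.add(Dense(1, activation='sigmoid'))

# Random inputs and binary labels; nb_sample and dim are set elsewhere in the script
# (dim must equal the input_dim of 784 above)
x = np.random.rand(nb_sample, dim)
y = np.random.randint(2, size=(nb_sample, 1))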

Using the plug-in is as simple as passing an instance of the Keras callback class wechat_utils.sendmessage() in the callbacks list when calling fit:

model.fit(x=train_x, y=train_y, batch_size=batch_size, nb_epoch=60, validation_data=(val_x, val_y), callbacks=[wechat_utils.sendmessage()])

Training then starts, and the phone receives the following feedback:

We can also send it query commands, which generally consist of keywords and parameters.

For example, any message containing one of the following keywords is recognized as a chart request:

[u'get the graph', 'Show me the figure']

Parameters are specified with {} or []. Every command also works without parameters; by default, a chart request returns all information. For example:

‘Show me the figure’ triggers the command, {batches} requests batch-level information, and [loss hinge] requests the loss and hinge metrics (parameters of the same kind are separated by spaces).
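To make the keyword-plus-parameter format concrete, here is a minimal parsing sketch. The function name `parse_command`, the substring matching, and the sample message are my own assumptions for illustration; the project's real parser may work differently.

```python
import re

# Keywords quoted in the text; the matching logic below is an assumption.
FIGURE_KEYWORDS = ["get the graph", "Show me the figure"]


def parse_command(text, keywords):
    """Return (matched, brace_params, bracket_params) for a message."""
    matched = any(k in text for k in keywords)
    # Values inside {...} and [...] are split on whitespace.
    brace = [p for group in re.findall(r"\{([^{}]*)\}", text)
             for p in group.split()]
    bracket = [p for group in re.findall(r"\[([^\[\]]*)\]", text)
               for p in group.split()]
    return matched, brace, bracket


# Hypothetical message combining a keyword with both parameter styles.
result = parse_command("Show me the figure{batches}[loss hinge]", FIGURE_KEYWORDS)
# The same keyword without parameters is still recognized.
no_params = parse_command("Show me the figure", FIGURE_KEYWORDS)
```

An empty parameter list would then stand for "use the defaults", which matches the rule that every command works without parameters.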

Similarly, ['GPU', 'gpu', u'graphics card'] are the keywords for a GPU status query. Parameters go in []; the example in the figure queries GPU memory and temperature. GPU parameters follow nvidia-smi's preset display types and are all uppercase. See the README on GitHub or the source code for the available query properties.
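A GPU query along these lines could be turned into an nvidia-smi call roughly as below. This is a sketch under assumptions: `build_gpu_query` and `run_gpu_query` are hypothetical names, the sample message is made up, and mapping the [] parameters directly onto `nvidia-smi -q -d` display types (such as MEMORY and TEMPERATURE) is my reading of the description rather than confirmed project behavior.

```python
import re
import shutil
import subprocess


def build_gpu_query(message):
    """Build an nvidia-smi argument list from the [] parameters in `message`."""
    params = [p.upper()
              for group in re.findall(r"\[([^\[\]]*)\]", message)
              for p in group.split()]
    argv = ["nvidia-smi", "-q"]            # -q: full query output
    if params:
        argv += ["-d", ",".join(params)]   # -d: restrict to these display types
    return argv


def run_gpu_query(message):
    """Run the query and return its text output, or None without nvidia-smi."""
    if shutil.which("nvidia-smi") is None:
        return None
    output = subprocess.check_output(build_gpu_query(message))
    return output.decode("utf-8", "replace")


# Hypothetical message asking for memory and temperature.
argv = build_gpu_query("GPU[memory temperature]")
```

Uppercasing the parameters means the sender does not need to remember the exact case on a phone keyboard.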

The shutdown command keywords are [u'Shut down', 'Shut down', 'Shut down the computer', u"don't waste power", u'wash sleep']. Use {sec} and [name] to specify the wait time in seconds and the filename for the saved model; the filename does not include the .h5 extension. The model is saved by default. If you do not want to save it, include one of [u"don't save the model", "don't save"] in the message, for example:

Shut down now{120},don't save

To cancel the shutdown, the message only needs to contain one of [u'cancel', 'cancel', 'aaaa']. If you are in a hurry, just send it a string of a's.
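The shutdown and cancel messages could be handled along these lines. Everything here is a hedged sketch: the function names, the 60-second default delay, and the platform handling are assumptions, not the project's code. The functions only build the command lists and never execute them.

```python
import re
import sys

NO_SAVE_KEYWORDS = ["don't save the model", "don't save"]


def parse_shutdown(message, default_delay=60):
    """Return (delay_seconds, save_model) parsed from a shutdown message.

    default_delay is a made-up value used when no {sec} is given.
    """
    m = re.search(r"\{\s*(\d+)\s*\}", message)
    delay = int(m.group(1)) if m else default_delay
    save_model = not any(k in message for k in NO_SAVE_KEYWORDS)
    return delay, save_model


def shutdown_argv(delay, platform=sys.platform):
    """Build (but do not run) the OS shutdown command."""
    if platform.startswith("win"):
        return ["shutdown", "/s", "/t", str(delay)]   # Windows counts seconds
    minutes = (delay + 59) // 60                       # round up to whole minutes
    return ["shutdown", "-h", "+%d" % minutes]


def cancel_argv(platform=sys.platform):
    """Build (but do not run) the command that cancels a pending shutdown."""
    if platform.startswith("win"):
        return ["shutdown", "/a"]
    return ["shutdown", "-c"]


delay, save_model = parse_shutdown("Shut down now{120},don't save")
```

Keeping command construction separate from execution makes it easy to check what would run before handing the list to something like `subprocess.call`.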

The keywords to stop training immediately are ['Stop now', "That's enough", u'Stop training', u'give up therapy']. The keyword to stop at a given epoch is "Stop at"; its parameter can be given as a plain integer without [].
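As a rough illustration of telling the two stop commands apart, consider the sketch below. `parse_stop` and its return values are hypothetical; the keywords are the ones listed above, and accepting the epoch with or without [] is my interpretation of the description.

```python
import re

STOP_NOW_KEYWORDS = ["Stop now", "That's enough", "Stop training", "give up therapy"]


def parse_stop(message):
    """Return ("at", epoch), ("now", None), or None if not a stop command."""
    # "Stop at 30" and "Stop at [30]" both yield epoch 30.
    m = re.search(r"Stop at\D*(\d+)", message)
    if m:
        return "at", int(m.group(1))
    if any(k in message for k in STOP_NOW_KEYWORDS):
        return "now", None
    return None


stop_at = parse_stop("Stop at 30")
stop_at_bracketed = parse_stop("Stop at [30]")
stop_now = parse_stop("That's enough")
not_a_command = parse_stop("hello")
```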

III. Summary

The project took only two and a half days from the initial idea to the comments, the GitHub release, and the Zhihu post. It was done in a rush and is rough around the edges, especially in the details of plotting and multi-threading.

I am just a freshman (currently on leave from school), and my skills are very limited. I sincerely ask for your advice so that I can improve. If this project brings you a little convenience or inspiration, I will feel honored and gratified.

Thanks again to @Coldwings for the original idea.