This article shows you how to upload and use your own data set on a GPU deep learning cloud service.

Questions

In my previous article, "How to Speed Up Your Python Deep Learning with a Cloud GPU?", I introduced FloydHub, a cloud deep learning environment service.

After the article was published, some readers sent me two questions:

  1. I don’t have a foreign-currency credit card, so I can’t pay for more time once the free hours run out. Is there a similar domestic service?
  2. I want to train with my own data set. How do I do that?

The first question was answered by a reader.

I took a look: Russell Cloud is indeed a GPU deep learning cloud service similar to FloydHub.

When I thanked him, I found out that he is one of Russell Cloud’s developers.

So over the past few days, whenever I ran into a problem while using the service, I simply asked him directly.

Thanks to this direct channel, responses have been very fast and the user experience has been good.

The advantages of this domestic service are as follows:

First, you can pay with Alipay or WeChat Pay; no Visa or Mastercard is needed, which is very convenient.

Second, Russell Cloud runs on Alibaba Cloud, so access is relatively fast and the connection is stable. The advantage is obvious when uploading or downloading large amounts of data. By contrast, my connection to FloydHub dropped twice while uploading 500 MB of data.

Third, all documentation is in Chinese and questions are answered in Chinese, which is friendlier to those who are not comfortable with English.

Fourth, the development team has made some nice small innovations. For example, you can check running results and the remaining GPU time at any time in the WeChat mini program.

With the first question answered, I will use Russell Cloud to show you how to upload your own data set and run a deep learning training exercise.

Registration

To use it, sign up for a free account on Russell Cloud.

Because the interface is in Chinese, I won’t repeat the specific steps here.

Once registered, you get 1 hour of free GPU usage.

If you sign up with my invitation link, you can get 4 more hours of free GPU usage.

I only have five of these invitation slots, so grab one if you’d like.

Let’s see who’s quickest.

After registering, you can go to the console and see information about yourself.

There is a Token field; this is your login credential. Let me show you how to use it.

You need to install the command line tool. Go to the terminal and execute:

pip install -U russell-cli

Then you need to log in:

russell login

When prompted, enter your Token, and you are logged in.

Unlike FloydHub, most of Russell Cloud’s identity and project verification uses this Token approach.

If you are not familiar with the terminal command line, please refer to my article "How to Install the Python Runtime Environment Anaconda?", which explains the basic use of the terminal in detail.

Environment

I have put the data and run scripts used below on GitLab.

You can just click here to download the zip and unzip it.

The decompressed directory contains two subfolders.

The cats_dogs_small_vgg16 folder contains our run script; there is only one file in it.

How to use it will be described later.

Let’s start with the data set upload issue that you’re most concerned about.

Data

Another folder in the unzipped directory, cats_and_dogs_small, contains the data set we will use and upload.

As shown in the figure above, the image data is divided into three subsets.

This is also the standard directory layout that Keras expects for image classification data.

Open the training set directory train; it contains two subdirectories, "cat" and "dog".

With this directory structure in Keras, you can simply call the flow_from_directory method of ImageDataGenerator to turn the image data in the directory into tensors that your model can use.
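Here is a minimal sketch of that call, assuming the data sits in the local cats_and_dogs_small folder; the target size and batch size are my own illustrative choices, and the original run script may use different values:

from keras.preprocessing.image import ImageDataGenerator

# Point the generator at the training directory described above.
train_dir = 'cats_and_dogs_small/train'

train_datagen = ImageDataGenerator(rescale=1. / 255)

train_generator = train_datagen.flow_from_directory(
    train_dir,               # one subdirectory per class: "cat" and "dog"
    target_size=(150, 150),  # resize every image to 150 x 150 pixels
    batch_size=20,
    class_mode='binary'      # two classes, so binary labels
)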

Open the test and validation directories and you will see the same structure as train.

Now let’s create your first dataset on Russell Cloud.

On the home page, click the “Console” button.

In the dataset section, select "Create Dataset".

As shown in the figure above, set the name of the dataset to cats_and_dogs_small.

Afterwards, we will use this dataset’s ID to connect the dataset in the cloud with our local directory.

Go to the terminal, cd into the cats_and_dogs_small directory inside the unzipped folder, and run the following commands:

russell data init --id <your dataset ID>
russell data upload

Replace "<your dataset ID>" above with the actual ID of your dataset.

Execute these two commands and the data is uploaded to Russell Cloud.

After the upload succeeds, go back to Russell Cloud’s dataset page, and you will see a newly generated version under the Versions tab.

Note that on the right side of the image above there is a "copy" button. Click it to copy the ID of that version of the dataset.

It is important that you copy this information from here, not the dataset ID shown on the dataset’s front page.

I wasted a lot of time getting this wrong.

Run

To run your own deep learning code, you need to create a new project on Russell Cloud.

You have to give the project a name.

You can simply call it cats_dogs_small_vgg16.

Leave other items as default and click Create Project.

When the following page appears, the project is successfully created.

Again, you need to connect your local code folder to the project you just created.

Here’s how:

Copy the ID information from the previous page.

Go to the terminal, cd into the cats_dogs_small_vgg16 directory inside the unzipped folder, and run the following command:

russell init --id <the ID you just copied>

This links your local folder to the project, so Russell Cloud can record your local changes and update the task execution configuration.

Now, with the following command, you can run the convolutional neural network training script on Russell Cloud’s remote GPU:

russell run "python cats_dogs_small_vgg16_augmentation_freeze_russell.py" --gpu --data 92e239eca8e649928610d95d54bb3602:cats_and_dogs_small --env tensorflow1.4

Let me explain the arguments in this command:

  • run: the part inside the quotes that follows is the actual command to be executed;
  • gpu: tells Russell Cloud to run the task on a GPU rather than a CPU;
  • data: the string of characters before the colon is the identifier of the dataset version you just generated; the part after the colon is the name you give the dataset’s mount directory. If the mount directory is called "potato", then in your code the dataset’s location is "/input/potato" (see the sketch after this list);
  • env: the name of the pre-built deep learning environment. Here we specify TensorFlow 1.4. For more options, please refer to the documentation.
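To make the mount path concrete, here is a minimal sketch of how the training script might locate the mounted data; the variable names are my own illustration, and the mount name matches the --data argument above:

import os

# Russell Cloud mounts the chosen dataset version under /input/<mount name>.
base_dir = '/input/cats_and_dogs_small'
train_dir = os.path.join(base_dir, 'train')
validation_dir = os.path.join(base_dir, 'validation')
test_dir = os.path.join(base_dir, 'test')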

Once you enter the above command, Russell Cloud synchronizes your project code to the cloud and executes it according to the parameters you specified.

You can’t see the results locally.

You need to go to the web page and open "Run Log" under "Task", where you can view the running results in the emulated terminal the system provides.

To preserve the hard-earned results of your deep learning run, you need to save the model with the following statements:

saved_model = output_dir / 'cats_and_dogs_small_finetune.h5'
model.save(saved_model)

The history.history object contains evaluation data recorded during training, such as accuracy (acc) and loss, which is also worth saving.

Here you can use pickle to do this:

import pickle
from pathlib import Path

with open(Path(output_dir, 'data.pickle'), 'wb') as f:
    pickle.dump(history.history, f)

If you read carefully, you will have noticed that output_dir appears in the code above; its actual path is output/.

This is the default output path that Russell Cloud provides. Anything written there is also saved to your cloud storage space after the run finishes.
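In case you are wondering where output_dir comes from: here is a minimal sketch, assuming the script defines it with pathlib (the original script’s exact definition may differ):

from pathlib import Path

# Files written under output/ are collected into cloud storage after the run.
output_dir = Path('output')
output_dir.mkdir(exist_ok=True)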

You can find the saved data under "Output" in the "Task Record"; it is packaged as a compressed archive.

Download it and decompress it, and you’re ready to enjoy the fruits of your cloud GPU’s labor.

You can use the saved history data to plot training curves, or load the trained model to classify new data.
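For example, here is a minimal sketch of how you might plot the accuracy curve and reload the model locally; it assumes the file names from the saving code above and that validation metrics were recorded during training:

import pickle
from keras.models import load_model
import matplotlib.pyplot as plt

# Load the training history that was saved with pickle.
with open('data.pickle', 'rb') as f:
    history = pickle.load(f)

# Plot training and validation accuracy; the key names follow the Keras version of that era.
plt.plot(history['acc'], label='training accuracy')
plt.plot(history['val_acc'], label='validation accuracy')
plt.xlabel('epoch')
plt.legend()
plt.show()

# Reload the saved model to classify new data.
model = load_model('cats_and_dogs_small_finetune.h5')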

Room for Improvement

You may run into some problems with Russell Cloud in practice.

Let me list the problems I ran into, so you don’t fall into the same pits I did.

First, the deep learning environment versions are not updated promptly enough.

At the time of writing, the stable TensorFlow release is 1.8, while the highest version Russell Cloud supports is only 1.6, and the highest version mentioned in the documentation is still 1.4. The default Keras environment is Python 3.5 + TensorFlow 1.1.

Be careful not to use the default Keras environment directly, or some of the nice features added in Python 3.6 and later will not work. For example, some functions raise an error if you pass a PosixPath (instead of a string) as a file path argument. That is not your code’s fault; it is the old environment.
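As a minimal illustration of the workaround (using open() as a stand-in for any function that only accepts string paths in the old environment), converting the Path to a string first avoids the error:

import pickle
from pathlib import Path

output_dir = Path('output')
output_dir.mkdir(exist_ok=True)
data_file = output_dir / 'data.pickle'

# In the old Python 3.5 environment, pass str() of the Path rather than the Path itself.
with open(str(data_file), 'wb') as f:
    pickle.dump({'acc': [], 'loss': []}, f)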

Second, when there is too much screen output (for example, I ran 100 epochs, each printing the progress of 100 training steps), the emulated terminal on the "Run Log" page easily becomes unresponsive. The workaround is to download the log file and read and analyze it directly.

Third, many Keras and TensorFlow code paths (for example, using a pre-trained model) automatically download data from GitHub. However, the connection between domestic servers and GitHub is not stable enough, so downloads fail from time to time, causing the program to time out and exit abnormally.

I have reported the above issues to the developer team, who say they will resolve them as soon as possible.

It would be great if none of these pitfalls existed by the time you read this.

Summary

This article recommends Russell Cloud, a domestic GPU deep learning cloud service. If you prefer to read Chinese documentation, don’t have a foreign-currency credit card, or have trouble accessing FloydHub and Google Colab, give it a try.

Through a practical deep learning model training process, I showed you how to upload your own data set to the cloud and mount and invoke it during training.

You can use the free GPU time the platform gives you to run a deep learning task or two of your own, and compare it with running on your local CPU.

If you found this useful, please give it a like. You can also follow and pin my WeChat official account "Nkwangshuyi".

If you’re interested in data science, check out my tutorial index post "How to Get Started in Data Science Effectively", where there are more interesting problems and solutions.