
According to the White Paper on China’s ICT Talent Ecology, China’s AI talent gap had exceeded 1 million by the end of 2018 and was projected to reach 2.26 million by 2020. However, the 367 universities worldwide that offer AI research programs graduate only about 20,000 students in the field each year, far short of market demand.

To help close this talent gap, Tencent Cloud launched the Supernova AI Talent Training Plan to cultivate the higher-education ecosystem. As the only machine learning platform in the program, Tencent Cloud’s Intelligent Titanium TI-One has entered university classrooms and has been well received by teachers and students.

Whether you are an AI beginner or an AI expert, you can find a modeling approach that suits you on the Intelligent Titanium machine learning platform.

Yu Zukun, Senior Product Manager at Tencent Cloud, was invited to share the course “Tencent Cloud Machine Learning Platform TI-One”. Let’s start a machine learning modeling journey on the cloud and get acquainted with Intelligent Titanium TI-One!

This talk covers three parts: the TI-One product architecture, TI-One product features, and building a model with TI-One.

1. TI-One product architecture

1.1 Overview of machine learning

Machine learning builds models by learning from existing data so that it can predict or classify new, unseen inputs.

Machine learning is like cooking rice: you put rice (data) into a rice cooker (the machine learning system), choose a cooking mode (an algorithm), and get rice or porridge (different models).
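As a minimal, generic sketch of this idea (scikit-learn is assumed here purely for illustration and is not tied to any platform), the snippet below fits a model on a handful of known data points and then predicts the outcome for a new input:

```python
# A minimal sketch of "learn from existing data, then predict new inputs".
# scikit-learn is assumed here purely for illustration.
from sklearn.linear_model import LinearRegression

# Toy data: hours studied -> exam score (numbers invented for illustration).
X = [[1], [2], [3], [4], [5]]
y = [52, 58, 65, 71, 78]

model = LinearRegression()
model.fit(X, y)                  # "study" the existing data

print(model.predict([[6]]))      # predict the outcome for a new, unseen input
```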

At present, machine learning has a wide range of application scenarios, such as image recognition, financial risk control, intelligent investment research, precise recommendation, disease diagnosis, engineering inspection, and so on.

1.2 Traditional machine learning and deep learning

Traditional machine learning works as a complete pipeline with the following steps:

1. Data acquisition: read data from the data source for subsequent algorithm training.

2. Data preprocessing: handle missing values and normalize data formats.

3. Feature extraction: derive values from the raw data that help the training result; this depends heavily on the modeler’s experience, especially their understanding of the business.

4. Feature selection: choose the most useful features and feed them into model training.

5. Algorithm selection: choose a modeling method, such as random forest, decision tree, or support vector machine.

A defining characteristic of traditional machine learning is that feature selection requires human involvement. With the advent of deep learning, the feature learning step can instead be handled by the neural network itself.
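The traditional pipeline above can be sketched roughly as follows, using scikit-learn as an assumed stand-in (TI-One wraps these stages as visual workflow nodes instead):

```python
# A rough sketch of the traditional pipeline: acquisition, preprocessing,
# feature selection, and algorithm choice, using scikit-learn for illustration.
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.impute import SimpleImputer
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

X, y = load_breast_cancer(return_X_y=True)               # 1. data acquisition
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

pipeline = Pipeline([
    ("impute", SimpleImputer(strategy="mean")),           # 2. preprocessing: missing values
    ("scale", StandardScaler()),                          #    preprocessing: value scaling
    ("select", SelectKBest(f_classif, k=10)),             # 3-4. feature extraction/selection
    ("model", RandomForestClassifier(random_state=0)),    # 5. algorithm selection
])

pipeline.fit(X_train, y_train)
print("test accuracy:", pipeline.score(X_test, y_test))
```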

When we talk about machine learning in today’s class, we mean machine learning in a broad sense, including both traditional machine learning and deep learning.

1.3 Machine learning modeling process

Process: user data → data preprocessing → feature engineering → machine learning algorithm → model evaluation → generation of offline or online services.

We need a way to evaluate whether the resulting model is good. A good model then has to be turned into an offline or online service for the actual application, and the whole process requires substantial computing resources.

Intelligent Titanium TI-One provides end-to-end algorithm development and deployment support for all of the stages above.
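As a hedged sketch of the tail end of this process, the snippet below evaluates a trained model and then persists it so an offline or online service could load it later. The use of joblib and a local file is an assumption for illustration; on TI-One these steps are handled by the platform.

```python
# Evaluate a trained model, then persist it so a serving process can load it.
# joblib and a local file path are assumptions for illustration only.
from joblib import dump, load
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

model = LogisticRegression(max_iter=1000).fit(X_train, y_train)

# Model evaluation: decide whether the model is good enough to ship.
print("accuracy:", accuracy_score(y_test, model.predict(X_test)))

# "Generate a service": save the artifact; a serving process would load it.
dump(model, "model.joblib")
service_model = load("model.joblib")
print(service_model.predict(X_test[:3]))
```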

1.4 Value proposition of machine learning platform

We have now seen what machine learning is and how a model is built. When it comes to modeling, algorithm engineers have two options:

One is to build everything yourself, using frameworks such as Caffe, PyTorch, or TensorFlow.

The other is to use a machine learning platform directly, such as Intelligent Titanium TI-One.

We can look at the difference between the two:

Framework perspective

With a self-built setup, each framework has to be installed, deployed on a machine, and maintained. Each framework also has multiple versions, and maintaining the dependency environment for every version is a time overhead.

With TI-One, the frameworks have already been integrated into the platform and debugged, providing platform-level algorithm modeling services out of the box.

Algorithm perspective

Self-built users constantly have to find algorithms in the open-source community and often fix bugs in them themselves.

With TI-One, the frequently used algorithms have been tuned and deployed on the platform. Users can invoke them directly via drag-and-drop, notebooks, or the SDK. Together with supporting engineering services, the platform is ready for algorithm engineers to focus entirely on model building.

Model perspective

With a self-built setup, each framework behaves differently for model construction and deployment. Just as an engineer becomes familiar with one framework, another emerges, and each has its own learning curve.

With TI-One, the platform wraps everything into a visual, page-based workflow, simplifying the entire model life cycle.

Evaluation perspective

With a self-built framework, you have to write the evaluation code yourself and print the results to see how the model is doing.

With a machine learning platform, model performance can be evaluated dynamically and in real time.

Collaboration perspective

With self-built tasks, you have to write monitoring scripts for independently running jobs and discover and handle exceptions yourself; algorithms are shared by copying code or via Git.

TI-One offers a variety of monitoring configurations with a complete alerting system, and supports publishing algorithms and models with controllable sharing granularity and access levels.

1.5 User Positioning

AI is a general trend: every industry has a demand for AI applications, and more and more users need algorithmic modeling products. We consider three questions:

How can algorithm beginners get started quickly, use algorithms to build AI models, and gain the confidence to begin?

For practitioners with some experience, how can we lower the barrier to use and improve modeling efficiency?

For algorithm experts, how can we meet more demanding requirements for performance and distributed computing power?

Intelligent Titanium TI-One provides suitable solutions for all three types of users.

1.6 TI-One Product Architecture

Resource layer

Data storage supports multiple modes, such as HDFS, Ceph, object storage (COS), and file storage (CFS). For computation, it offers abundant cloud computing resources and also supports local computing power.

Scheduling layer

Cloud modeling involves many users and many computing clusters, so different training tasks require distributed scheduling tools. The distributed resource scheduling suite, built on Tencent’s in-house resource scheduling platform, supports large-scale cluster tasks.

Framework layer

Supports Spark, TensorFlow, Angel, PyCaffe, PySpark, PyTorch, and other mainstream machine learning frameworks.

Algorithm layer

It supports hundreds of machine learning algorithms, including traditional machine learning algorithms, graph algorithms, and deep learning algorithms, and the library is constantly being enriched.

Interaction layer

Three different interaction modes serve different user groups.

Visual modeling

Build workflows by drag-and-drop; simple and easy to use, suitable for AI beginners.

Notebook

An interactive data exploration and modeling workflow that offers greater flexibility, suitable for users with some algorithmic background.

SDK

Best suited to modeling experts, offering the greatest flexibility.

1.7 Machine learning platform TI-One logical architecture

TI-One adopts a classic layered architecture: from top to bottom, the interaction layer, the TI kernel engine layer, the computing power layer, and the storage layer.

The interaction layer offers different product forms, including DAG drag-and-drop modeling capabilities, AutoML automatic modeling, and Notebook interactive programming.

The kernel engine mainly contains the core training scheduling engine and the model service engine, which together form TI-One’s core capability. It is highly available and highly extensible, supports user-defined operators through plug-ins, connects to a variety of algorithm frameworks, and supports different scheduling methods (parallel, serial, cycle-driven, and so on).

The computing power layer is an abstraction of the underlying computing clusters. It supports common cluster resources (GaiaStack/K8S/Yarn) and can easily be extended to other computing resources. The built-in algorithms are optimized for the underlying compute and support both standalone and distributed training.

The storage layer abstracts different data sources. Currently, TI-One supports COS, HDFS, CephFS, local files, and various JDBC data sources (GP, Hive, Kudu, Impala, MySQL, etc.).

2. TI-One product features

2.1 Feature Overview

2.2 Drag-and-drop task flow design

Visual drag-and-drop

Data, algorithms, and components can be dragged and dropped directly. What you see is what you get.

Automatic node connection

Nodes connect automatically and data inputs and outputs are generated automatically, keeping things simple and efficient.

Freely draw workflow

Customize your workflow and train multiple models in parallel to get twice the result with half the effort.

2.3 Flexible operation mode

Supports scheduling policies based on running resources, including parallel and serial.

Supports parameter settings, including numerical and enumeration-type run parameters.

Supports periodic scheduling.

Supports viewing historical instance details, comparing models, and resuming runs.

2.4 Supports a variety of machine learning frameworks

It covers Spark, TensorFlow, PyCaffe, PyTorch, and Tencent’s own Angel framework, providing diversified framework support.

AI modeling scenarios vary widely. If the platform’s built-in algorithms do not meet your requirements, you can use a custom framework and upload your own scripts to build models.

2.5 Visual Analysis

Supports previewing intermediate results, so you can check whether the data output of intermediate workflow nodes matches your expectations and stay informed while modeling.

Rich and varied chart presentations; multiple model evaluation methods; charts can be displayed on hover.

2.6 Interactive Modeling with Notebook

Notebook is a flexible interactive development tool, ideal for data preparation, data processing, algorithm debugging, and model training. The platform wraps JupyterLab and adds resource monitoring, so that task resource consumption can be tracked while the original interactive experience is preserved.

2.7 AutoML Automatic Parameter Tuning

Supports automatic parametric modeling of structured data: the system adjusts hyperparameters intelligently through a learning mechanism, without manual parameter setting, so that the whole machine learning training process can be automated. It also supports real-time monitoring of training progress and analysis of the automatic tuning.
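The idea behind automatic hyperparameter tuning can be illustrated loosely with a random search over a parameter space. The sketch below is a generic scikit-learn example under that assumption, not a description of TI-One’s AutoML mechanism.

```python
# A loose illustration of automatic hyperparameter tuning via random search.
# Generic scikit-learn sketch; not TI-One's AutoML implementation.
from scipy.stats import randint
from sklearn.datasets import load_digits
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import RandomizedSearchCV

X, y = load_digits(return_X_y=True)

search = RandomizedSearchCV(
    RandomForestClassifier(random_state=0),
    param_distributions={
        "n_estimators": randint(50, 300),   # hyperparameters the system explores
        "max_depth": randint(3, 20),
    },
    n_iter=10,               # number of configurations to try
    cv=3,                    # cross-validation provides the evaluation signal
    random_state=0,
)
search.fit(X, y)

print("best params:", search.best_params_)
print("best CV score:", round(search.best_score_, 3))
```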

2.8 Model Management

The model repository page is used to manage all saved models. It supports the following functions (illustrated generically in the sketch after this list):

Version each model and switch between versions.

Filter models using automatically generated tags.

Create model-based offline batch prediction jobs.
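As a loose, platform-agnostic illustration of those ideas, the sketch below stores version-tagged model artifacts on disk and runs an offline batch prediction job against one version. The file layout and helper names are assumptions for illustration only, not TI-One’s storage or API.

```python
# A generic sketch of a model repository: save versioned, tagged model
# artifacts, then run an offline batch prediction job against one version.
import json
from pathlib import Path

import pandas as pd
from joblib import dump, load
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier

REPO = Path("model_repo/iris_classifier")   # illustrative local layout

def save_version(model, version: str, tags: dict) -> None:
    """Persist a model under a version directory with searchable tags."""
    vdir = REPO / version
    vdir.mkdir(parents=True, exist_ok=True)
    dump(model, vdir / "model.joblib")
    (vdir / "tags.json").write_text(json.dumps(tags))

def batch_predict(version: str, input_csv: str, output_csv: str) -> None:
    """Offline batch prediction job: load one model version, score a file."""
    model = load(REPO / version / "model.joblib")
    data = pd.read_csv(input_csv)
    data["prediction"] = model.predict(data.values)
    data.to_csv(output_csv, index=False)

# Example usage with toy data.
X, y = load_iris(return_X_y=True)
save_version(DecisionTreeClassifier().fit(X, y), "v1", {"algo": "decision_tree"})
pd.DataFrame(X[:5]).to_csv("batch_input.csv", index=False)
batch_predict("v1", "batch_input.csv", "batch_output.csv")
```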

In the future, TI-One will also integrate with cloud inference services, with features such as one-click deployment of models to TI-EMS.

2.9 Full life cycle modeling task support

To review: to build a model you first need to access data, preprocess it, choose an appropriate algorithm to construct the model, and train it. Once training is complete you evaluate the model, store the well-performing models in the repository (which can keep different versions of the same model), and then select a suitable version to release as an online service.

The application then calls the model. After it has been in use for a while, you can see how it performs in practice; if the results are not good, you can adjust the training data, algorithm, or parameters to optimize the model, then re-release the new version and start the next cycle. Every stage of this complete modeling loop is supported on the TI-One platform.

3. Building a model with TI-One

Next, you can try building a model on the TI-One platform using visual drag-and-drop.

Before starting to use Intelligent Titanium TI-One, complete the steps in [Registration and Service Activation]; for details, see the official product documentation: cloud.tencent.com/document/pr… .

3.1 Titanic survival prediction model

Visual workflow modeling practices

Background:

On April 15, 1912, the Titanic, carrying 1,316 passengers and 861 crew, sank after colliding with an iceberg. The shipwreck is considered one of the top ten human disasters of the 20th century. In 1985, the wreck of the Titanic was found two and a half miles down in the North Atlantic Ocean.

In 1998, audiences were moved by the Hollywood blockbuster “Titanic”; its stunning effects, beautiful cinematography, and the tragic love story of its leads stirred many young hearts. Years later, as a beginner in data analysis, do you see this disaster differently through the lens of data? Because there were not enough lifeboats for the passengers and crew, survival involved an element of luck, yet some people were still more likely to survive than others. So who was more likely to survive this disaster? What characteristics did the survivors share? Would those characteristics help someone survive at other disaster scenes? Using the sinking of the Titanic as the backdrop, we will use multi-dimensional passenger data to build a model that predicts whether a passenger would have been rescued. Walk through each of the following steps to experience how to create and successfully run a workflow on the TI-One console.

Goal:

Predict which kinds of passengers were more likely to survive after the Titanic hit the iceberg.

For the steps to build the model, see the official documentation: cloud.tencent.com/document/pr…
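For a rough sense of what the console workflow does conceptually, here is a hedged local sketch in scikit-learn using the public Titanic sample dataset bundled with seaborn; the actual TI-One steps are those in the documentation above.

```python
# A hedged local sketch of a Titanic survival model (scikit-learn, using the
# public "titanic" sample dataset that ships with seaborn). The TI-One
# workflow builds this kind of model with visual nodes instead.
import seaborn as sns
from sklearn.compose import ColumnTransformer
from sklearn.ensemble import RandomForestClassifier
from sklearn.impute import SimpleImputer
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder

df = sns.load_dataset("titanic")          # multi-dimensional passenger data
features = ["pclass", "sex", "age", "sibsp", "parch", "fare"]
X, y = df[features], df["survived"]
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

preprocess = ColumnTransformer([
    ("num", SimpleImputer(strategy="median"),
     ["pclass", "age", "sibsp", "parch", "fare"]),        # fill missing ages, etc.
    ("cat", OneHotEncoder(handle_unknown="ignore"), ["sex"]),
])
model = Pipeline([
    ("prep", preprocess),
    ("clf", RandomForestClassifier(random_state=0)),
])
model.fit(X_train, y_train)

print("test accuracy:", round(model.score(X_test, y_test), 3))
```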

3.2 Text sentiment classification

Visual workflow modeling practices

Background:

Text classification is a basic and important task in natural language processing. Its applications are wide-ranging, covering fields such as finance, commerce, military, and politics, with use cases including public opinion monitoring, news classification, news polarity analysis, spam detection, and user sentiment analysis. Text classification models therefore have significant practical and commercial value; for example, merchants can adjust purchasing strategy and improve sales by gauging users’ emotional attitudes toward their goods.

Existing text classification algorithms fall into two main categories: those based on traditional machine learning and those based on deep learning. Traditional machine-learning methods classify text through preprocessing, feature extraction, vectorization, and common classification algorithms such as LR and SVM; their quality depends on the quality of feature extraction. Deep-learning methods train on the data with deep models, commonly FastText or LSTM, and their quality depends mainly on the amount of data and the number of training iterations.

Goal:

Use the FastText algorithm to build a deep learning text classification model that solves practical problems in text classification scenarios.

For the steps to build the model, see the official documentation: cloud.tencent.com/document/pr…
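As a rough illustration of the traditional machine-learning route described earlier (TF-IDF features plus logistic regression, rather than the FastText model this tutorial uses), a minimal sketch might look like this; the tiny labeled corpus is made up purely for illustration.

```python
# A hedged sketch of the traditional text-classification route:
# TF-IDF vectorization + a logistic regression (LR) classifier.
# The tiny labeled corpus below is invented for illustration only.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline

texts = [
    "the product arrived quickly and works great",
    "terrible quality, it broke after one day",
    "very satisfied, will buy again",
    "waste of money, extremely disappointed",
]
labels = [1, 0, 1, 0]   # 1 = positive sentiment, 0 = negative

clf = Pipeline([
    ("tfidf", TfidfVectorizer()),     # preprocessing + vectorization
    ("lr", LogisticRegression()),     # common ML classifier (LR)
])
clf.fit(texts, labels)

print(clf.predict(["happy with this purchase", "broke immediately, awful"]))
```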

3.3 Iris classification with TensorFlow

Custom-code modeling based on a framework

Background:

The TensorFlow framework on the Intelligent Titanium machine learning platform provides a TensorFlow runtime based on the Python API. Users can upload scripts and dependency files to the framework for algorithm training. Taking the iris classification task as an example, we demonstrate how to run custom code with the platform’s TensorFlow deep learning framework, how to pass parameters to the custom code through the workflow page, and how to view the code’s logs and error messages. The entire workflow takes only a few seconds to run. After training is complete, you can deploy the model as a service and test it online.

Goal:

Use a custom framework to complete the iris flower classification task.

For the steps to build the model, see the official documentation: cloud.tencent.com/document/pr…
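For reference, a minimal custom TensorFlow script of the kind described above might look like the sketch below. The command-line arguments stand in for the parameters a workflow page could pass to the script; the argument names and network shape are illustrative assumptions, not TI-One’s actual interface.

```python
# A hedged sketch of a custom TensorFlow/Keras training script for iris
# classification. --epochs / --learning_rate stand in for values a workflow
# page might pass in; their names are illustrative assumptions.
import argparse

import tensorflow as tf
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split

def main() -> None:
    parser = argparse.ArgumentParser()
    parser.add_argument("--epochs", type=int, default=50)
    parser.add_argument("--learning_rate", type=float, default=0.01)
    args = parser.parse_args()

    X, y = load_iris(return_X_y=True)
    X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

    model = tf.keras.Sequential([
        tf.keras.Input(shape=(4,)),                        # 4 iris features
        tf.keras.layers.Dense(16, activation="relu"),
        tf.keras.layers.Dense(3, activation="softmax"),    # 3 iris classes
    ])
    model.compile(
        optimizer=tf.keras.optimizers.Adam(args.learning_rate),
        loss="sparse_categorical_crossentropy",
        metrics=["accuracy"],
    )
    model.fit(X_train, y_train, epochs=args.epochs, verbose=0)

    loss, acc = model.evaluate(X_test, y_test, verbose=0)
    print(f"test accuracy: {acc:.3f}")                     # appears in the job log

if __name__ == "__main__":
    main()
```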



Tencent Cloud University is Tencent Cloud’s one-stop learning and growth platform for cloud ecosystem users. Its expert-sharing series invites internal technology experts every week to share free, professional insights into the latest industry technology trends.