Apsara Clouder Big Data special skills certification: building a social friend recommendation system with MaxCompute

This certification helps students learn how to use Alibaba Cloud big data computing services to quickly build an enterprise-level social friend recommendation system, and to develop and test with the related cloud big data services.

MaxCompute (the Big Data Computing Service, formerly ODPS) is a fast, fully managed data warehouse solution for data at the GB, TB, and PB scale. It provides comprehensive data import options and a variety of classic distributed computing models, so that massive data computing problems can be solved quickly while reducing enterprise costs and keeping data secure.

DataWorks is closely tied to MaxCompute: it provides MaxCompute with one-stop data synchronization, task development, data workflow development, data management, and data operations and maintenance. For details, refer to DataWorks (formerly the Big Data Development Suite).

MaxCompute mainly stores and computes batch structured data, and provides data warehouse solutions and analysis and modeling services for big data. As the methods for collecting social data have grown richer, more and more industry data has accumulated, and data volumes have reached a scale (100 GB, TB, and even PB) that traditional software cannot handle.

When analyzing massive data, the processing capacity of a single server is a limit, so data analysts usually turn to distributed computing. However, distributed computing models place higher demands on analysts and are difficult to maintain: analysts must understand both the business requirements and the underlying computing model. MaxCompute is designed to give you a convenient way to analyze massive amounts of data without worrying about the details of distributed computing.

MaxCompute is widely used in Alibaba Group, for example in data warehousing and BI analysis for large Internet enterprises, website log analysis, e-commerce transaction analysis, and mining of user characteristics and interests.

Product advantages

Massive computing storage

MaxCompute suits storage and computing workloads larger than 100 GB and scales up to the EB level.

Multiple computing models

MaxCompute supports several computing models: SQL, MapReduce, Graph, and MPI-based iterative algorithms.

Strong data security

MaxCompute has stably supported all of Alibaba's offline analysis services for more than 7 years, and provides multi-layer sandbox protection and monitoring.

Low cost

Compared with enterprise-built private clouds, MaxCompute provides more efficient computing and storage, and reduces procurement costs by 20% to 30%.
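Of the computing models listed under "Multiple computing models", MapReduce maps naturally onto the friend recommendation task this certification targets. The following is a minimal local sketch in plain Python, not MaxCompute code: it counts mutual friends for every pair of users who are not yet connected. The function names (`map_phase`, `reduce_phase`, `recommend`) and the sample data are assumptions made for illustration only.

```python
from collections import defaultdict
from itertools import combinations


def map_phase(user, friends):
    """Emit (pair, value) records for one user's friend list."""
    for friend in friends:
        # None marks a pair that is already connected and must not be recommended.
        yield tuple(sorted((user, friend))), None
    for a, b in combinations(sorted(friends), 2):
        # a and b both know `user`, so `user` is a mutual friend of the pair.
        yield (a, b), user


def reduce_phase(values):
    """Return the mutual-friend count, or None if the pair is already friends."""
    if any(v is None for v in values):
        return None
    return len(set(values))


def recommend(friend_lists):
    """Run map, shuffle (group by key), and reduce in a single local process."""
    grouped = defaultdict(list)
    for user, friends in friend_lists.items():
        for pair, value in map_phase(user, friends):
            grouped[pair].append(value)
    scores = {}
    for pair, values in grouped.items():
        count = reduce_phase(values)
        if count is not None:
            scores[pair] = count
    return scores


# Hypothetical sample data: each user mapped to their current friends.
friend_lists = {
    "A": ["B", "C", "D"],
    "B": ["A", "C"],
    "C": ["A", "B", "D"],
    "D": ["A", "C"],
}
print(recommend(friend_lists))
```

In this sample, B and D are the only pair not yet connected, and they share two mutual friends (A and C), so they are the only recommendation candidates. On a real cluster the grouping step would be done by the framework's shuffle rather than by an in-memory dictionary.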

Functions overview

Data channel

Batch and historical data channel (Tunnel)

Tunnel is a data transmission service provided by MaxCompute for highly concurrent offline data upload and download. It supports importing and exporting TB to PB of data per day, and is especially suitable for batch import of full or historical data. Tunnel provides a Java programming interface, and the MaxCompute client tool also provides commands for moving data between local files and the service.
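To make the idea of highly concurrent batch upload concrete, here is a rough Python sketch that splits a batch of records into blocks and writes the blocks in parallel before committing them together. It does not use the Tunnel SDK: `FakeUploadSession`, `split_into_blocks`, and `upload` are hypothetical names, and the block-then-commit session model is an assumption made for illustration.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


class FakeUploadSession:
    """Hypothetical in-memory stand-in for a batch upload session (not a real SDK class)."""

    def __init__(self):
        self._lock = threading.Lock()
        self.blocks = {}     # block_id -> records staged but not yet visible
        self.committed = []  # records visible after commit

    def write_block(self, block_id, records):
        # Blocks are independent of each other, so they can be written in parallel.
        with self._lock:
            self.blocks[block_id] = list(records)

    def commit(self, block_ids):
        # Only the blocks named here become visible, in block-id order.
        for block_id in sorted(block_ids):
            self.committed.extend(self.blocks[block_id])


def split_into_blocks(records, block_size):
    """Cut the record list into consecutive blocks of at most block_size records."""
    return [records[i:i + block_size] for i in range(0, len(records), block_size)]


def upload(records, block_size=3, workers=4):
    """Write all blocks concurrently, then commit them in one step."""
    session = FakeUploadSession()
    blocks = split_into_blocks(records, block_size)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = [
            pool.submit(session.write_block, block_id, block)
            for block_id, block in enumerate(blocks)
        ]
        for future in futures:
            future.result()  # re-raise any error from a worker thread
    session.commit(range(len(blocks)))
    return session


# Hypothetical historical data: ten (id, name) rows.
rows = [(i, f"user{i}") for i in range(10)]
session = upload(rows)
print(len(session.blocks), len(session.committed))
```

Committing only after every block has been written means a failed block can be retried without exposing partial data, which is one reason a block-oriented design suits large historical imports.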

Real-time and incremental data channel (DataHub)

For real-time data upload, MaxCompute provides DataHub, a low-latency and easy-to-use service that is especially suited to incremental data import. DataHub also supports a variety of data transfer plug-ins, such as Logstash, Flume, Fluentd, and Sqoop. In addition, log data in Log Service can be delivered to MaxCompute with one click and then analyzed and mined in DataWorks.

Official website of Alibaba Cloud University (Aliyun University, an innovative talent workshop for the cloud ecosystem)