Aliyun University's online workshop has launched, combining detailed theory with hands-on exercises so that you can truly master cloud computing and big data skills.

One of the first courses to go online is a Python crawler course by Wei Wei, author of the bestselling “Mastering Python Web Crawlers”, which takes you from beginner to expert in two months.

What is a crawler good for?

If you’re looking for a job and want to know which positions are in demand right now, you can crawl the listings on recruitment websites and analyze them;


If you want to know how World Cup fans are distributed, you can analyze each team’s jersey sales on Taobao, or data from relevant forums and Baidu Tieba;


If you want to know which topics the top answerers on Zhihu care about, you can crawl and analyze their answers.


Data analysis is impossible without data, and the Internet offers massive data sources; the question is whether you can get at them. Python is the best choice for writing crawlers: a basic crawler takes only a few lines of code, so it is easy to learn and quickly rewarding.
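To give a sense of "a few lines of code", here is a minimal sketch of a basic crawler. It assumes the third-party requests library and uses example.com as a placeholder URL; it is an illustration, not course material.

import requests

# Fetch one page and print the start of its HTML.
response = requests.get("https://example.com", timeout=10)
response.raise_for_status()          # fail fast on HTTP errors
print(response.text[:200])           # first 200 characters of the page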

Crawlers are one of the best ways to get started with Python. They are also the foundation for big data analysis and machine learning. After mastering basic crawlers, you will be much more comfortable with other Python skills.

How to master crawler technology

So how do you learn Python crawlers? We’ve mapped out a learning path for you. In the advanced section, you will learn how websites detect and block crawlers, how to bypass those defenses, and how to write distributed crawlers to improve crawling efficiency.

Stage 1: Crawler introduction and framework learning

Regular expressions are an essential tool for Python crawlers, letting you filter text and match content by pattern. Urllib, Requests, and Scrapy are the three most popular Python crawling libraries and frameworks.
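As a quick taste of combining the two, the sketch below uses the standard-library urllib together with a regular expression to pull the page title out of fetched HTML. The URL and pattern are illustrative assumptions, not part of the course.

import re
import urllib.request

# Download the page HTML with urllib.
with urllib.request.urlopen("https://example.com", timeout=10) as resp:
    html = resp.read().decode("utf-8", errors="ignore")

# Extract every <title> element with a regular expression.
titles = re.findall(r"<title>(.*?)</title>", html, re.S)
print(titles)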

Lesson 1:
Introduction to Python web crawlers and regular expression basics


Lesson 2:
A crawler based on the Python Urllib library


Lesson 3:
Packet capture and analysis techniques for web crawlers


Lesson 4:
Building a crawler project with the Python Requests library


Lesson 5:
Building a crawler project with the Scrapy framework

Stage 2: Project practice

Drawing on everything learned so far, you will crawl data from the websites below and, in hands-on projects, analyze and solve the tricky problems that come up while crawling.

Lesson 6:
Hands-on development of a recruitment website information crawler project


Lesson 7:
Hands-on development of a Taobao product information crawler project


Lesson 8:
Hands-on development of a Zhihu information crawler project (class starts at 19:30 on 2018-08-22)

Stage 3: Advanced applications

Some websites deploy anti-crawler measures that block data collection outright. The advanced section introduces the most common anti-crawler techniques and how to get around them.
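As a small illustration of two common countermeasures, the sketch below sends a browser-like User-Agent header and paces its requests. The header string, URL, and page range are assumptions for illustration only, not the course’s own code.

import time
import requests

headers = {
    "User-Agent": ("Mozilla/5.0 (Windows NT 10.0; Win64; x64) "
                   "AppleWebKit/537.36 (KHTML, like Gecko) "
                   "Chrome/68.0 Safari/537.36"),
}

for page in range(1, 4):
    # Request each listing page with a browser-like header.
    resp = requests.get(f"https://example.com/list?page={page}",
                        headers=headers, timeout=10)
    print(page, resp.status_code)
    time.sleep(2)   # slow down so the site is less likely to block us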

When a large amount of data has to be crawled, a single crawler quickly hits its limits. We will also explain how to write a clustered, distributed crawler to improve data-collection efficiency.
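A rough sketch of the distributed idea: multiple worker processes share one Redis list as a URL queue. This assumes the redis-py package and a local Redis server; the queue name and URLs are illustrative, and the course’s actual approach may differ.

import redis
import requests

# Connect to a locally running Redis server (assumed).
r = redis.Redis(host="localhost", port=6379, db=0)

def worker():
    """Pop URLs from the shared queue and fetch them until the queue is empty."""
    while True:
        url = r.lpop("crawl:queue")
        if url is None:
            break
        resp = requests.get(url.decode("utf-8"), timeout=10)
        print(resp.url, resp.status_code)

if __name__ == "__main__":
    # Seed the queue once, then run worker() on as many machines as needed.
    r.rpush("crawl:queue", "https://example.com/a", "https://example.com/b")
    worker()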

Lesson 9:
Common anti-crawler strategies and how to defeat them


Lesson 10:
Hands-on distributed crawler development (class starts at 19:30 on September 5, 2018)

Experts answer questions online to help you master Python crawlers

Join the Python Crawler Exchange group (21723932) to swap crawler tips with other learners and master Python crawlers in two months.