Purpose

Write a real, working crawler and save the crawled data to a TXT file, a JSON file, and an existing MySQL database.

Objective analysis:


Data filtering:

We use the Chrome Developer Tools to inspect the element at the corresponding position on the page:

You can see that the data we need is all wrapped up inside the highlighted element.

With that, our preparation is done.

Take a look at the current directory:
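For reference, a freshly generated project looks roughly like this (the project name `weather_project` is only an illustration; yours will match whatever you passed to `scrapy startproject`):

```
weather_project/
├── scrapy.cfg
└── weather_project/
    ├── __init__.py
    ├── items.py
    ├── middlewares.py
    ├── pipelines.py
    ├── settings.py
    └── spiders/
        └── __init__.py
```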

Write items.py:

Writing items.py is very simple: just declare a field for each piece of data you want to extract:
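As a sketch, assume we are scraping a weather forecast with a date, a temperature range, and a weather description; the field names below are illustrative and should match whatever you decide to extract:

```python
# items.py - one Field per piece of data we want to extract.
# The field names below are illustrative.
import scrapy


class WeatherItem(scrapy.Item):
    date = scrapy.Field()         # e.g. "2021-05-01"
    temperature = scrapy.Field()  # e.g. "12°C ~ 24°C"
    weather = scrapy.Field()      # e.g. "Sunny"
```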

Write the spiders:

This part is the core of our entire crawler!

The main objective is:

Extract the data we need from the Response that the Downloader sends us, and hand it over to the PIPELINE for processing.

Let’s look at the code below:
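Here is a minimal sketch of such a spider, assuming a hypothetical weather page where each forecast row sits in an `<li>` element; the start URL, CSS selectors, and the spider name are all placeholders to adapt to the page structure you found with the developer tools:

```python
# spiders/weather_spider.py - a sketch; URL and selectors are placeholders.
import scrapy

from ..items import WeatherItem


class WeatherSpider(scrapy.Spider):
    name = "weather"
    start_urls = ["https://example.com/forecast"]  # placeholder URL

    def parse(self, response):
        # Assume each forecast row lives in an <li> under ul.forecast.
        for row in response.css("ul.forecast > li"):
            item = WeatherItem()
            item["date"] = row.css("h2::text").get()
            item["temperature"] = row.css("p.tem::text").get()
            item["weather"] = row.css("p.wea::text").get()
            # Yielding the item hands it to the enabled pipelines.
            yield item
```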

Write the PIPELINE:


Normally, we would store the data locally:

Text: The most basic form of storage

JSON format: easy to load and reuse

Database: suited to storing large amounts of data

TXT format:
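A sketch of a text pipeline, continuing with the `WeatherItem` fields assumed above (the output file name is arbitrary):

```python
# pipelines.py - append each item to a plain text file.
class TxtPipeline:
    def process_item(self, item, spider):
        with open("weather.txt", "a", encoding="utf-8") as f:
            f.write(f"{item['date']}\t{item['temperature']}\t{item['weather']}\n")
        # Always return the item so any later pipelines still receive it.
        return item
```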

JSON format:

We want to output the data in JSON format, and the most convenient way is to define another class in the PIPELINE:
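A sketch of such a class, writing one JSON object per line (JSON Lines style); the file name and the item fields are again assumptions:

```python
# pipelines.py - dump each item as a JSON object, one per line.
import json


class JsonPipeline:
    def open_spider(self, spider):
        self.file = open("weather.json", "w", encoding="utf-8")

    def process_item(self, item, spider):
        self.file.write(json.dumps(dict(item), ensure_ascii=False) + "\n")
        return item

    def close_spider(self, spider):
        self.file.close()
```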

Database format (MySQL):

Python supports practically every mainstream database on the market.

But the most commonly used free database today is MySQL.

Install MySQL:

Linux and macOS both have powerful package managers such as APT and Homebrew.

On Windows, you can download the installer directly from the official website.

Since I'm on a Mac, I'll describe the macOS installation (for example, `brew install mysql`).

Let’s see what the weather table looks like:

Finally, let's write the corresponding pipeline code:
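A sketch using `pymysql`; the connection parameters, database name, and the `weather(date, temperature, weather)` table are assumptions, and the table is expected to exist already, as mentioned at the start:

```python
# pipelines.py - insert each item into an existing MySQL table via pymysql.
import pymysql


class MysqlPipeline:
    def open_spider(self, spider):
        self.conn = pymysql.connect(
            host="localhost",
            user="root",
            password="your_password",  # placeholder credentials
            database="scrapy_demo",    # placeholder database name
            charset="utf8mb4",
        )
        self.cursor = self.conn.cursor()

    def process_item(self, item, spider):
        sql = "INSERT INTO weather (date, temperature, weather) VALUES (%s, %s, %s)"
        self.cursor.execute(sql, (item["date"], item["temperature"], item["weather"]))
        self.conn.commit()
        return item

    def close_spider(self, spider):
        self.cursor.close()
        self.conn.close()
```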

Write settings.py:

We need to register our pipelines in `ITEM_PIPELINES` in settings.py so that Scrapy will actually run them.


The number assigned to each pipeline can be set to any value you like; the smaller the number, the higher the priority, so that pipeline runs earlier.
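For example, assuming the project is named `weather_project` and uses the three pipeline classes sketched above:

```python
# settings.py - enable the pipelines; lower numbers run first.
ITEM_PIPELINES = {
    "weather_project.pipelines.TxtPipeline": 300,
    "weather_project.pipelines.JsonPipeline": 400,
    "weather_project.pipelines.MysqlPipeline": 500,
}
```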

Get the project running:
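Assuming the spider's `name` is `weather` as in the sketch above, run it from the project root with `scrapy crawl weather`.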

Results:

Text format:

JSON format:

Database format:

That concludes the example. The focus was on how to customize a PIPELINE to save the crawled data in different ways.