
This article is participating in Python Theme Month. See the event link for details.

Let's start with the result

Here is how it started: my girlfriend is a big fan of handsome male celebrities. She asked me whether I could write a program that downloads photos of a certain handsome star, so that she could take her time picking one as her wallpaper.

Wow, swooning over another guy right in front of me. I was going to refuse and say I didn't know how. But I have been working for a few years now, and that would make me look like a rookie. What to do? I figured I would just write a program that crawls pictures. But I won't crawl handsome guys, I'll crawl One Piece. I have the skills; I just won't use them the way she asked. I'm only playing around.

No sooner said than done.

The imports you need

import re
import requests
import os

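The crawler sends network requests and parses the response, so it needs the requests and re packages. re ships with Python; requests is a third-party package, so install it yourself if you don't have it. os talks to the operating system; here it is used to create the folder where the pictures are saved.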

Making the request

# i is the page index; pn is the offset of the first result on that page
url = 'https://image.baidu.com/search/flip?tn=baiduimage&ie=utf-8&word=' + name + '&pn=' + str(i * 30)
result = requests.get(url, headers=headers)
dowmloadPic(result.content.decode(), name, i)

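Observant readers will notice that Baidu Images serves its results in two ways: dynamically loaded and statically loaded. Changing index to flip in the URL path switches to the static version. The static pages are paginated through the pn parameter, so the crawler can simply step through them one page at a time, which is much simpler.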
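As a rough sketch of that pagination, the page URLs can be built with the standard library instead of string concatenation. The helper name build_flip_url and its page_size parameter are hypothetical; the query parameters and the step of 30 are taken from the snippet above, not from any Baidu documentation.

```python
from urllib.parse import urlencode

BASE_URL = 'https://image.baidu.com/search/flip'


def build_flip_url(keyword, page, page_size=30):
    """Return the static ("flip") search URL for one page of results."""
    params = {
        'tn': 'baiduimage',
        'ie': 'utf-8',
        'word': keyword,         # urlencode escapes spaces and non-ASCII text
        'pn': page * page_size,  # offset of the first result on this page
    }
    return BASE_URL + '?' + urlencode(params)


# Print the URLs of the first three pages for one keyword.
for page in range(3):
    print(build_flip_url('One Piece', page))
```

Letting urlencode do the escaping means a keyword with spaces or Chinese characters still produces a valid URL.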

Once you have the HTML, extract the image URLs with a regular expression

pic_url = re.findall('"objURL":"(.*?)",', html, re.S)

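re.findall returns every non-overlapping match of the pattern in the string. Because the pattern has one capturing group, the result is a list of the captured text, which here is the image URLs that follow each "objURL" key. See the documentation of the re module for the details of findall.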
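To see what that call returns, here is a small sketch that runs the same pattern over a made-up fragment. The sample_html string and its example.com addresses are invented for illustration; the real page source is much longer.

```python
import re

# Made-up fragment imitating the data embedded in the result page.
sample_html = (
    '{"thumbURL":"https://example.com/t1.jpg",'
    '"objURL":"https://example.com/a.jpg","fromURL":"x"},'
    '{"objURL":"https://example.com/b.png","fromURL":"y"}'
)

# (.*?) is lazy: it stops at the first '",' after each "objURL":" prefix.
# re.S lets '.' match newlines too, in case a value spans lines.
pic_urls = re.findall(r'"objURL":"(.*?)",', sample_html, re.S)
print(pic_urls)
```

Only the two objURL values are captured; the thumbURL entry is ignored because the pattern requires the exact "objURL" key.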

Downloading the images

# pic is the response for one image URL; dir is the target file path
with open(dir, 'wb') as fp:
    fp.write(pic.content)

Finally, each image is saved to the local hard disk; we only need to set the path.
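The fragment above does not show where pic comes from. A minimal sketch of the download-and-save step, assuming each extracted URL is fetched and its bytes written to a numbered file, might look like this. The function save_images, its fetch parameter, and the file-name scheme are hypothetical and are not the author's code.

```python
import os


def save_images(urls, folder, keyword, start_index, fetch):
    """Fetch each URL with fetch(url) -> bytes and write it into folder."""
    os.makedirs(folder, exist_ok=True)  # create the folder if it is missing
    saved = []
    index = start_index
    for url in urls:
        try:
            data = fetch(url)
        except Exception as exc:        # skip images that fail to download
            print('Skipping', url, '-', exc)
            continue
        path = os.path.join(folder, keyword + '_' + str(index) + '.jpg')
        with open(path, 'wb') as fp:    # the file is closed even on error
            fp.write(data)
        saved.append(path)
        index += 1
    return saved
```

With requests, fetch could be something like `lambda url: requests.get(url, timeout=10).content`. Passing the fetch function in keeps the saving logic separate from the network call, so one broken link does not stop the whole run.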

Complete code:

#!/usr/bin/python
# -*- coding: UTF-8 -*-
import re
import requests
import os


def dowmloadPic(html, keyword, i):
    # Extract every image URL that follows an "objURL" key in the page source.
    pic_url = re.findall('"objURL":"(.*?)",', html, re.S)

    # Starting index for the file names of this page.
    abc = i * 60
    print('Found pictures for keyword: ' + keyword + ', now starting to download...')
    # ... (lines omitted in the original. Complete code: see the public account "like the code poem")
        dir = r'D:\image\i' + keyword + '_' + str(abc) + '.jpg'
        if not os.path.exists(r'D:\image'):
            os.makedirs(r'D:\image')

        # Write the downloaded bytes to disk.
        fp = open(dir, 'wb')
        fp.write(pic.content)
        fp.close()
        abc += 1


if __name__ == '__main__':
    #word = input("Input key word: ")
    headers = {'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/84.0.4147.125 Safari/537.36'}
    name = input('Enter the keyword of the images to download: ')
    num = 0
    x = input('How many pages do you want to crawl? (n * 60): ')

    # Fetch the static result pages one by one and hand each to dowmloadPic.
    for i in range(int(x)):
        url = 'https://image.baidu.com/search/flip?tn=baiduimage&ie=utf-8&word=' + name + '&pn=' + str(i * 30)
        result = requests.get(url, headers=headers)
        dowmloadPic(result.content.decode(), name, i)
    print("Download complete")


Observant readers will notice that entering a different keyword crawls different pictures. I'll stick with "One Piece" and then send the results to my girlfriend. The technique is exactly what she asked for, but the pictures that come out are One Piece.

Getting started with crawlers is this simple. Have you got it?

Hey, guys, it’s not easy to be original. Leave a like before you go.
