This article was first published on the public account “Python Knowledge Circle”. Please go directly to the public account

“It takes about 3.1 minutes to read the text

The last article climbed the singer’s name and singer’s ID, this article climbed the singer’s ID to directly download the corresponding singer’s lyrics. I can actually write this as a big project, break this big project into small projects to make it easier for everyone to understand, and once the small projects are mastered, put together into a complete project.

The previous article did not learn also does not matter, the public number reply “singer” can get the last crawling result CSV file, the file has the singer name and singer ID.

Ok, first look at the result of crawling lyrics, I input is Angela Chang’S ID: 10562, crawling lyrics of 50 popular songs.

Project environment

Language: Python tool: Pycharm

Guide package

Requests: get the page source based on the URL. BeautifulSoup: Parse the extracted source code.

Program structure:


The program consists of six parts: get_HTML () : extract page source get_top50() : extract singer song information. Get_lyrics () : Extracts the lyrics of the song. Save2txt () : Save the lyrics to a TXT file. Main: the main function.

Parsing the page

This is the first step of crawler. Pay attention to the setting of proxy IP to prevent its OWN IP from being blocked. If there is a large amount of crawler data, it is suggested to add waiting time in the main function, which will not increase the pressure on the crawler target server and reduce the risk of being blocked.

This function returns the source files of the popular songs page. In this article, we used selenium library. Selenium library is slow to crawl, so we used the Requests library instead. Url is based on the previous article to get the ID splicing together, but the page URL is a false URL, with a false URL can not extract the page source. Later, I checked the API of netease Cloud Music and found that the real URL did not have #. You can remove the redundant # and add the corresponding singer ID. I put this URL in the main function.

Get the singer’s song information

The source code extracted by get_HTML (URL) contains the name of the song and the id of the song. This selector tag is very hidden. I print soup first, find the location of the first song, and look forward to the tag. The selector is analyzed as.f-hide #song-list-pre-cache A, and then the extracted elements are processed to remove unwanted information, retain valid information and return it in the form of ZIP one by one.

To get the lyrics

Get the corresponding lyrics from the song ID obtained by the previous function. If the url on the page is not used to get the content, only the API link provided by them can be used to join the song ID. It uses json.loads() to decode the Python JSON format, extract the lyric information, and use re.sub() to do string substitution processing, removing unwanted elements.

Save the data

Directly use the song name as the name to save a plain text file, if you want to save in a specific directory, you need to create this directory in advance.

Execute main

Finally, execute the main function, input the need to crawl the singer ID, run, every time I get a song lyrics I set a waiting time, this is also a small strategy to prevent IP blocked.

Ok, if you want to see the lyrics of a popular song by a certain singer in the future, find their ID and run the code of this article to download the lyrics. In the future, when I go to concerts, I will no longer have to cry aloud with singers because I forget lyrics.

Public reply “lyrics” to obtain the source code.

The next article will teach you how to download music.

If this article is a little help to you, I hope we can give more support, the attention to the attention, the point like point like, the forward forward, if there is any problem, welcome to contact me in the background, you can also join the technical exchange group in the background, there are gods in the group, you can exchange and learn together.

Recommended reading

This article gives you an easy introduction to python crawlers

Python crawler to get information about netease Cloud music singer

Let the code dance with Michael Jackson



Public account “Python Knowledge Circle”

Long click the QR code to follow us

Our public account focuses on:

1. Share Python technology

2. Python crawler sharing

3. Sharing of materials and tools

Welcome to pay attention to us, grow together!