Background

I originally listened to the audiobook on platform A, but the later chapters were only available on platform B. Platform B is cumbersome to use, the experience is poor, and it has no caching support, so mobile data gets extremely expensive (last month I was charged more than 60 for data beyond my plan). That left manually downloading the audio as the only option, but that process is tedious too.

Requirements

Downloading a single audio file by hand goes like this:

  1. Open the list page; you may need to scroll to load more entries
  2. Open the playback page in a new tab and click the “Play” button
  3. Open the developer tools and find the audio request; it only shows up after you click play
  4. Copy the audio link and open it in a new tab
  5. Download it with the browser's built-in download function, which takes two clicks
  6. Wait for the download to complete
  7. Switch to the other tab and copy the name of the audio
  8. Open the download folder and rename the file; otherwise every file has the same name and they cannot be told apart

The process is tedious: five or six downloads were painful enough, and there are more than two hundred chapters…

Implementation

The first thing that came to mind was a headless browser. After a lot of trial and error, with progress coming in fits and starts, I finally got it working. The actual audio addresses are not disclosed here.

const puppeteer = require("puppeteer");
const axios = require("axios");
const querystring = require("querystring");
const fs = require("fs");

Puppeteer drives a “headless browser”, so human-computer interaction can be scripted in code. English documentation: www.puppeteerjs.com/#?product=P…

(async () => {
  // Split the list interface address into the endpoint and its query string.
  const [reqUrl, search] = "interface address for paging list information".split("?");

The “interface address for paging list information” is obtained with the developer tools. Calling it directly keeps the simulated human-computer interaction to a minimum; otherwise the script would have to simulate mouse scrolling here to trigger the pull-up loading.

  let params = Object.assign({}, querystring.parse(search), {
    begin: 20,
    count: 10,
  });


Override the paging parameters (begin and count) while keeping the rest of the original query string.
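As a standalone illustration of this parameter substitution, here is a minimal sketch. The URL and the helper name buildPageUrl are hypothetical (the real interface address is not disclosed in this post); only the begin and count parameters come from the script.

```javascript
const querystring = require("querystring");

// Hypothetical interface address; the real one comes from the developer tools.
const listUrl =
  "https://example.com/list?action=getalbum&begin=0&count=10&f=json";

// Split once on "?" into the endpoint and the raw query string.
const [reqUrl, search] = listUrl.split("?");

// Build the URL for one page by overriding only the paging parameters.
function buildPageUrl(begin, count) {
  const params = Object.assign({}, querystring.parse(search), { begin, count });
  return reqUrl + "?" + querystring.stringify(params);
}

console.log(buildPageUrl(20, 10));
// https://example.com/list?action=getalbum&begin=20&count=10&f=json
```

Wrapping the substitution in a function like this would make it easy to walk through the pages by increasing begin in steps of count.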

  const audioRsp = await axios.post(
    reqUrl + "?" + querystring.stringify(params)
  );

  const audioInfoList = audioRsp.data.appmsg_list.map((item) => ({
    name: item.title,
    url: item.link,
  }));

Make an Ajax request and record each audio's name and the address of its playback page.

  for (let i = 0; i < audioInfoList.length; i++) {
    const audioItem = audioInfoList[i];
    try {
      const browser = await puppeteer.launch();

      const page = await browser.newPage({});
      await page.goto(audioItem.url, {
        waitUntil: "networkidle0",
      });
  1. Launch a browser
  2. Open a new tab
  3. Go to the audio playback page

      await page.waitForSelector(".audio_card_switch");
      await page.click(".audio_card_switch");

Wait for the play button, identified by the selector .audio_card_switch, to appear before clicking it, because the button only shows up after a network request completes.
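The idea of waiting for something to appear before acting on it can be sketched without a browser. The helper below is a hypothetical stand-in that polls a condition until it holds or a timeout expires; it is not how Puppeteer implements waitForSelector, and the names waitFor, intervalMs, timeoutMs and playButton are assumptions made for this illustration.

```javascript
// Poll `check` every `intervalMs` until it returns a truthy value,
// or reject once `timeoutMs` has elapsed.
function waitFor(check, { intervalMs = 50, timeoutMs = 2000 } = {}) {
  const started = Date.now();
  return new Promise((resolve, reject) => {
    const tick = () => {
      const result = check();
      if (result) return resolve(result);
      if (Date.now() - started >= timeoutMs) {
        return reject(new Error("timed out waiting for condition"));
      }
      setTimeout(tick, intervalMs);
    };
    tick();
  });
}

// Stand-in for a button that only "appears" after a simulated network delay.
let playButton = null;
setTimeout(() => {
  playButton = { label: "play" };
}, 120);

// Only act on the button once the wait has resolved.
waitFor(() => playButton).then((button) => {
  console.log("button is ready:", button.label);
});
```

The important part is that the click happens only after the returned promise resolves, which is why the wait has to be awaited in the script.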

      // Read the real audio address from the <audio> element.
      const downloadSrc = await page.evaluate(() => {
        return document.querySelector("audio").src;
      });
      // Fetch the audio as raw bytes; responseType "arraybuffer" is required.
      const audioContent = await axios.get(downloadSrc, {
        responseType: "arraybuffer",
      });
      fs.writeFileSync(`download/${audioItem.name}.mp3`, audioContent.data);
      console.log(
        `${i}/${audioInfoList.length - 1} ${audioItem.name} : ${downloadSrc}`
      );
      await browser.close();
    } catch (error) {
      console.log(
        "Download failed: " + i + " " + audioItem.name + " " + audioItem.url
      );
    }
  }
})();

  1. Get the audio address.
  2. Fetch the audio content with an Ajax request. Note that responseType: "arraybuffer" is required; otherwise the saved audio won't play and you may get the error “This file can’t play. This could be because the file type is unsupported, the file extension is incorrect, or the file is corrupted. 0xc00d36c4”.
  3. Store the audio file locally.
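A small sketch of why the raw bytes matter: if binary audio data is treated as text somewhere along the way, bytes that are not valid UTF-8 get replaced and the file no longer matches the original. The sample bytes below are made up for illustration and only stand in for the start of an audio file.

```javascript
// Sample bytes; 0xff and 0xfb are not valid UTF-8 on their own.
const original = Buffer.from([0xff, 0xfb, 0x90, 0x00]);

// What effectively happens when binary data is decoded as text and written back.
const asText = original.toString("utf8");
const roundTripped = Buffer.from(asText, "utf8");

// The bytes no longer match, so a player would see a corrupted file.
console.log(original.equals(roundTripped)); // false
```

Requesting an arraybuffer keeps the response as bytes, so what is written to disk is exactly what the server sent.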

Download strategy

With for (let i = 0; i < audioInfoList.length; i++) {, one audio file is downloaded completely before the next one starts. The alternative, audioInfoList.forEach(async (audioItem, audioIndex) => {, downloads all the audio on the current page at once. The latter is very resource hungry, the download order is unpredictable, and the probability of a failed download is high. When something does go wrong it is hard to tell which item caused it, and what ends up in the directory may not be valid audio.
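The difference between the two approaches can be sketched with simulated downloads. Everything here is hypothetical: fakeDownload, the chapter names and the delays stand in for real requests. The concurrent version uses Promise.all over map rather than forEach so that completion can be observed; like forEach with an async callback, it starts every download immediately.

```javascript
// Simulated download: resolves after `ms` milliseconds and records completion.
function fakeDownload(name, ms, finished) {
  return new Promise((resolve) => {
    setTimeout(() => {
      finished.push(name);
      resolve(name);
    }, ms);
  });
}

// Hypothetical items; `ms` stands in for how long each download takes.
const items = [
  { name: "chapter-1", ms: 30 },
  { name: "chapter-2", ms: 10 },
  { name: "chapter-3", ms: 20 },
];

// One at a time: the next download starts only after the previous one ends.
async function downloadSequentially(list) {
  const finished = [];
  for (let i = 0; i < list.length; i++) {
    await fakeDownload(list[i].name, list[i].ms, finished);
  }
  return finished;
}

// All at once: every download starts immediately, so completion order
// depends on timing rather than on list order.
async function downloadConcurrently(list) {
  const finished = [];
  await Promise.all(
    list.map((item) => fakeDownload(item.name, item.ms, finished))
  );
  return finished;
}

(async () => {
  console.log("sequential:", await downloadSequentially(items));
  console.log("concurrent:", await downloadConcurrently(items));
})();
```

The sequential run finishes in list order, while the concurrent run finishes in whatever order the individual downloads complete, which is one reason failures are harder to trace in that mode.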

0xc00d36fa

“Can’t find audio equipment. Make sure headphones or speakers are connected. For more information, search for manage Audio Devices in devices.”

Do not mistake this error for a problem with the downloaded audio file.