With the groundwork laid by the previous five articles, it is time to hook up some real business features.

This article starts from a front-end multilingual feature and shows how to use Puppeteer to control Chrome/Chromium and download files.

1 Contents

What's the difference between a front-end developer who never tinkers and a salted fish?

Contents
1 Contents
2 Preface
3 Puppeteer
 3.1 Capturing a Snapshot
 3.2 Downloading Files
4 References

2 Preface


Puppeteer is a Node library that provides a high-level API for controlling Chromium or Chrome via the DevTools protocol.

As explained in the GitHub introduction, Puppeteer can be used for most of the things you do manually in your browser!

  • Capture a page snapshot
  • Generate a PDF of a page
  • Automate manipulation of the page DOM

For detailed examples, see the GitHub README (readme.md) or the Chinese documentation listed in the references at the end of this article.

3 Puppeteer


  • Installation: npm i puppeteer

jsliang hit an error during installation:

  • (node:7584) ExperimentalWarning: The fs.promises API is experimental

My installed Node.js version was too old, so Node.js needed an upgrade.

There are two ways to upgrade: download the latest installer and install over the existing version, or manage versions with nvm/nvmw.

jsliang's network connection was good, so the latest version was downloaded directly from the official Node.js website.

Check the latest version after installation:

  • node -v: v14.17.1
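If a CLI like this one should fail fast on an outdated runtime instead of surfacing warnings during installation, a small version comparison is enough. The following is a minimal sketch; the helper names and the minimum version string are assumptions made for illustration, not something the original project defines.

```typescript
// Turn "v14.17.1" or "14.17.1" into [14, 17, 1]; non-numeric parts count as 0.
const parseVersion = (version: string): number[] =>
  version
    .replace(/^v/, '')
    .split('.')
    .map((part) => Number.parseInt(part, 10) || 0);

// True when `current` is the same as or newer than `minimum`.
const isAtLeast = (current: string, minimum: string): boolean => {
  const a = parseVersion(current);
  const b = parseVersion(minimum);
  for (let i = 0; i < Math.max(a.length, b.length); i++) {
    const diff = (a[i] ?? 0) - (b[i] ?? 0);
    if (diff !== 0) {
      return diff > 0;
    }
  }
  return true; // All parts are equal
};

// Hypothetical minimum, chosen only for this example.
const MIN_NODE = '12.0.0';
console.log(isAtLeast('v14.17.1', MIN_NODE)); // true
```

In an entry file such as src/index.ts, the same check could be run against process.version before any command is registered.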

package.json: "puppeteer": "^10.0.0"

Installing Puppeteer is a real test of your network speed (it downloads Chromium along the way), so expect all kinds of errors.

The installation is complete

3.1 Capturing a Snapshot


Let’s take a simple example of grabbing a page snapshot:

src/index.ts

import program from 'commander';
import common from './common';
import './base/console';
import puppeteer from 'puppeteer';

program
  .version('0.0.1')
  .description('Library of Tools')

program
  .command('jsliang')
  .description('Jsliang help instruction')
  .action(() => {
    common();
  });

program
  .command('test')
  .description('Test channel')
  .action(async () => {
    // Start the browser
    const browser = await puppeteer.launch({
      headless: false, // Open a visible (headed) browser
    });

    // Create a new TAB and open it
    const page = await browser.newPage();
    await page.goto('https://www.baidu.com/s?wd=jsliang');

    // Take a snapshot and store it locally
    await page.screenshot({
      path: './src/baidu.png',
    });

    // Close the window
    await browser.close();
  });

program.parse(process.argv);

After running npm run test, the src folder contains the image file baidu.png, shown below:

This step can be disrupted by a VPN/proxy tool or by 360 Safeguard, so turn those apps off first to spare your blood pressure.

That is a first taste of Puppeteer. It can of course also export a page as a PDF and more; see the references at the end of this article.
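To make the PDF remark concrete, here is a small sketch that picks between page.screenshot and page.pdf based on the output file extension. The CapturablePage interface and the function names are assumptions for illustration; the interface lists only the three calls used, so a real Puppeteer page may need a cast depending on the installed version. Puppeteer's documentation for older releases also notes that page.pdf only works in headless mode, so the headless: false option above may have to change.

```typescript
// Minimal shape of the page object used here (hypothetical, for this sketch only).
interface CapturablePage {
  goto(url: string): Promise<unknown>;
  screenshot(options: { path: string }): Promise<unknown>;
  pdf(options: { path: string }): Promise<unknown>;
}

type CaptureKind = 'pdf' | 'screenshot';

// Decide the output kind from the file extension.
const captureKind = (outPath: string): CaptureKind =>
  outPath.toLowerCase().endsWith('.pdf') ? 'pdf' : 'screenshot';

// Open the URL, then save either a PDF or an image snapshot to outPath.
const capture = async (
  page: CapturablePage,
  url: string,
  outPath: string,
): Promise<CaptureKind> => {
  await page.goto(url);
  const kind = captureKind(outPath);
  if (kind === 'pdf') {
    await page.pdf({ path: outPath });
  } else {
    await page.screenshot({ path: outPath });
  }
  return kind;
};
```

Inside the test command this could be called as capture(page, 'https://www.baidu.com/s?wd=jsliang', './src/baidu.pdf') in place of the direct page.screenshot call.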

3.2 Downloading Files


Since we can take screenshots, it should be no surprise that we can manipulate the DOM too. Let's download a file!

Taking an online document as the example, let's first create an Excel file:

How to create one is not covered here; the document site is https://www.kdocs.cn/

Our next step is to download this Excel file (assuming we have hired someone to do the translation work), which looks like this:

This image comes from the internet and is shared for reference only; it will be removed on request if it infringes.

Then let's make a simple one ourselves:

How thorough the multilingual content is does not matter; the goal is to fetch this Excel file by driving Puppeteer.

OK, we have the file. How can we download it? The situation is as follows:

  • Puppeteer launches a browser that behaves much like an incognito window, with no saved session. With a normal login-protected link, you would have to log in again, open the link, and then click the button to download.

So here we use the document's login-free share link:

Login-free means exactly that: no login required. A silly explanation, but it felt necessary…

Here is the demo address from above. You can practice with it, but I can't guarantee the link won't be deleted someday, so it's best to follow the steps above and set up your own!

  • Excel trial file.xlsx: https://www.kdocs.cn/l/sdwvJUKBzkK2

OK, enough rambling, let's get down to business: how to download the file.

  1. Drive the browser to open https://www.kdocs.cn/l/sdwvJUKBzkK2
  2. Sleep 6.66s (make sure the browser has opened the link and loaded the page)
  3. Trigger a click on the “More Menu” button
  4. Sleep 2s (make sure the “More Menu” button has been clicked)
  5. Set the download path (fix the download location, otherwise the save dialog is hard to handle)
  6. Finally, trigger a click on the “Download” button
  7. Sleep 10s (make sure the file finishes downloading)
  8. Close the window

The only step that needs attention is step 5: on Windows, clicking download opens a save dialog (rather than saving to a default location), so the download path has to be set in advance (this shows up in the code).
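Step 5 boils down to one DevTools Protocol command, Page.setDownloadBehavior, sent before the download click. Here is a minimal sketch of that step in isolation; the CdpSender interface and the allowDownloads name are assumptions made for illustration, so the helper can be tried without a browser.

```typescript
import * as fs from 'fs';

// Anything exposing a CDP-style send method (hypothetical interface for this sketch).
interface CdpSender {
  send(method: string, params: Record<string, unknown>): Promise<unknown>;
}

// Make sure the folder exists, then ask the browser to save downloads there
// instead of showing a save dialog.
const allowDownloads = (client: CdpSender, downloadPath: string): Promise<unknown> => {
  fs.mkdirSync(downloadPath, { recursive: true });
  return client.send('Page.setDownloadBehavior', {
    behavior: 'allow',
    downloadPath,
  });
};
```

With Puppeteer, a session obtained from await page.target().createCDPSession() should be usable as the client (possibly with a cast, depending on the version), which avoids reaching into the page's private fields.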

So, code!

src/common/index.ts

import { inquirer } from '../base/inquirer';
import { Result } from '../base/interface';
import { sortCatalog } from './sortCatalog';
import { downLoadExcel } from './downLoadExcel';

const common = (): void => {
  // Question route: see questionlist.ts
  const questionList = [
    // q0
    {
      type: 'list',
      message: 'May I help you? ',
      choices: ['Public Services', 'File Management'],
    },
    // q1
    {
      type: 'list',
      message: 'Current public services are:',
      choices: ['File sort'],
    },
    // q2
    {
      type: 'input',
      message: 'Which folder do you want to sort? (Absolute path) ',
    },
    // q3
    {
      type: 'list',
      message: 'What kind of support do you need? ',
      choices: ['multilingual', 'Markdown to Word'],
    },
    // q4
    {
      type: 'list',
      message: 'What kind of support do you need? ',
      choices: [
        'Download multilingual Resources',
        'Import multilingual Resources',
        'Export multilingual Resources',
      ],
    },
    // q5
    {
      type: 'input',
      message: 'Resource download address (HTTP)? ',
      default: 'https://www.kdocs.cn/l/sdwvJUKBzkK2',
    },
  ];

  const answerList = [
    // q0
    async (result: Result, questions: any) => {
      if (result.answer === 'Public Services') {
        questions[1]();
      } else if (result.answer === 'File Management') {
        questions[3]();
      }
    },
    // q1
    async (result: Result, questions: any) => {
      if (result.answer === 'File sort') {
        questions[2]();
      }
    },
    // q2
    async (result: Result, _questions: any, prompts: any) => {
      const sortResult = await sortCatalog(result.answer);
      if (sortResult) {
        console.log('Sort succeeded! ');
        prompts.complete();
      }
    },
    // q3
    async (result: Result, questions: any) => {
      if (result.answer === 'multilingual') {
        questions[4]();
      }
    },
    // q4
    async (result: Result, questions: any) => {
      if (result.answer === 'Download multilingual Resources') {
        questions[5]();
      }
    },
    // q5
    async (result: Result, _questions: any, prompts: any) => {
      if (result.answer) {
        const downloadResult = await downLoadExcel(result.answer);
        if (downloadResult) {
          console.log('Download successful! ');
          prompts.complete();
        }
      }
    },
  ];

  inquirer(questionList, answerList);
};

export default common;


Regrettably, inquirer.ts was reworked so clumsily that jsliang had to write a separate file mapping out the order of the questions before the flow made sense:

src/common/questionList.ts

// Question route for the common section
export const questionList = {
  'Public Services': { // q0
    'File sort': { // q1
      'Folders to sort': 'the Work Work', // q2
    },
  },
  'File Management': { // q0
    'multilingual': { // q3
      'Download multilingual Resources': { // q4
        'Download address': 'the Work Work', // q5
      },
      'Import multilingual Resources': { // q4
        'Import address': 'the Work Work',
      },
      'Export multilingual Resources': { // q4
        'Export full resource': 'the Work Work',
        'Export single gate resource': 'the Work Work',
      },
    },
    'Markdown to Word': 'Not currently supported', // q3
  },
};

With the route written down, move on to writing the function itself:

src/common/downLoadExcel.ts

import puppeteer from 'puppeteer';
import path from 'path';
import fs from 'fs';

export const downLoadExcel = async (link: string): Promise<boolean> => {
  // Start the browser
  const browser = await puppeteer.launch({
    headless: false, // Open a visible (headed) browser
    devtools: true, // Open DevTools
  });

  // 1. Create a new TAB and open it
  const page = await browser.newPage();
  await page.goto(link);

  // 2. Sleep 6.66s - Make sure the page opens normally
  await page.waitForTimeout(6666);

  // 3. Trigger the click of "More Menu" button
  const moreBtn = await page.$('.header-more-btn');
  await moreBtn?.click();

  // 4. Sleep 2s - Make sure the button has been clicked
  await page.waitForTimeout(2000);

  // 5. Set the download path
  const dist = path.join(__dirname, './dist');
  if (!fs.existsSync(dist)) {
    fs.mkdirSync(dist);
  }
  // _client is a private Puppeteer field, hence the cast to any
  await (page as any)._client?.send('Page.setDownloadBehavior', {
    behavior: 'allow',
    downloadPath: dist,
  });

  // 6. Trigger the click of the download button
  const elements = await page.$$('.header-menu-item');
  let downloadBtn;
  if (elements.length) {
    downloadBtn = elements[8];
  }
  if (!downloadBtn) {
    console.error('Download button not found');
    await browser.close();
    return false; // Stop here instead of continuing with a closed browser
  }
  await downloadBtn.click();

  // 7. Sleep 10s - Make sure the resource is downloaded
  await page.waitForTimeout(10000);

  // 8. Close the window
  await browser.close();

  return true;
};
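Picking elements[8] works only as long as the menu keeps its order. A slightly more robust variant could look the entry up by its visible label. The sketch below covers just the lookup; the findByLabel name and the sample labels are assumptions for illustration, and the real label text on the page has to be checked in the browser.

```typescript
// Return the index of the first label containing the wanted text, or -1 if none matches.
const findByLabel = (labels: string[], wanted: string): number =>
  labels.findIndex((label) => label.trim().includes(wanted));

// Hypothetical menu labels, not taken from the real page.
const sampleLabels = [' Share ', ' History ', ' Download '];
console.log(findByLabel(sampleLabels, 'Download')); // 2
```

With Puppeteer, the labels could be collected with page.$$eval('.header-menu-item', (els) => els.map((el) => el.textContent ?? '')), and the resulting index used in place of the hard-coded 8.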

After running like this, if the console does not report an error, VS Code will display:

The Excel trial file (.xlsx) now appears in the dist folder under src/common.
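The fixed 10-second sleep in step 7 is a guess about network speed. A hedged alternative is to poll the download folder: Chrome keeps a .crdownload suffix on files that are still in progress, so the download can be treated as finished once the folder has files and none of them carries that suffix. The function names, timeout, and polling interval below are assumptions for illustration, and the check assumes the folder starts out empty.

```typescript
import * as fs from 'fs';

// True once the folder holds at least one file and none is a partial Chrome download.
const isDownloadFinished = (dir: string): boolean => {
  if (!fs.existsSync(dir)) {
    return false;
  }
  const files = fs.readdirSync(dir);
  return files.length > 0 && !files.some((name) => name.endsWith('.crdownload'));
};

// Poll until the download is finished or the timeout elapses.
// Resolves to true if a finished download was seen, false otherwise.
const waitForDownload = async (
  dir: string,
  timeoutMs = 10000,
  intervalMs = 250,
): Promise<boolean> => {
  const deadline = Date.now() + timeoutMs;
  while (Date.now() < deadline) {
    if (isDownloadFinished(dir)) {
      return true;
    }
    await new Promise((resolve) => setTimeout(resolve, intervalMs));
  }
  return isDownloadFinished(dir);
};
```

In downLoadExcel, await waitForDownload(dist) could stand in for the final sleep, and its boolean result could become the function's return value.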

See you next time!

4 References


  • GitHub: Puppeteer
  • Puppeteer
  • Puppeteer, a powerful front-end tool
  • An introduction to Puppeteer

jsliang's document library is licensed by Junrong Liang under the Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License. It is based on the works at github.com/LiangJunron… Permissions beyond the scope of this license may be obtained from creativecommons.org/licenses/by…