Preface

A few days ago I wrote about how html2canvas's many compatibility problems almost made me give up on it. Following advice from readers in the comment section, I found a simple, reusable approach to poster generation: Node + Puppeteer.

The core idea is that the client calls a poster-generation API, and the API uses Puppeteer to open the URL it was given and returns a screenshot of the requested element.

Advantages of Puppeteer-generated posters over Canvas-generated posters:

  • No browser or platform compatibility issues.
  • High code reuse: H5 pages, mini programs, and apps can all use the same poster generation service.
  • More room for optimization. Because posters are now generated behind an API, server-side techniques such as adding servers or adding a cache can improve response time.

Introducing Puppeteer

Puppeteer is a Node library that provides a high-level API for controlling Chromium or Chrome via the DevTools protocol. Puppeteer runs in headless mode by default, but it can run a full (non-headless) browser if you pass headless: false to puppeteer.launch. Most things you can do manually in a browser can be done with Puppeteer. Here are some examples:

  • Generate a PDF or screenshot of the page.
  • Crawl a SPA (single-page application) and generate pre-rendered content (that is, “SSR”, server-side rendering).
  • Automatic form submission, UI testing, keyboard input, etc.
  • Create an automated test environment that is constantly updated. Perform tests directly in the latest version of Chrome using the latest JavaScript and browser features.
  • Capture the Timeline trace for the site to help analyze performance issues.
  • Test browser extensions.

Implementation

1. Write a simple API endpoint

Express is a concise and flexible Node.js web application framework. Use it to write a simple Node service with one endpoint that receives the screenshot configuration and passes it to Puppeteer.

const express = require('express')
const createError = require("http-errors")
const app = express()
// Middleware: parse JSON request bodies
app.use(express.json())
app.post('/api/getShareImg', (req, res) => {
    // Business logic
})
// Requests that match no route become a 404 error
app.use(function(req, res, next) {
    next(createError(404));
});
// Error handler: return the error as JSON
app.use(function(err, req, res, next) {
    let result = {
        code: 0,
        msg: err.message,
        err: err.stack
    }
    res.status(err.status || 500).json(result)
})
// Start the service, listening on port 7000
const server = app.listen(7000, '0.0.0.0', () => {
    const host = server.address().address;
    const port = server.address().port;
    console.log('app start listening at http://%s:%s', host, port);
});
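The "business logic" placeholder first has to turn the JSON body into options for the screenshot step. The following is only a sketch of that validation: the field names mirror the opt.* properties used later in this article, while parseScreenshotOptions, httpError, the default values, and the limits are all hypothetical. httpError is a dependency-free stand-in for http-errors so the snippet runs on its own.

```javascript
// Stand-in for http-errors: an Error carrying the status read by the error middleware.
function httpError(status, message) {
    const err = new Error(message);
    err.status = status;
    return err;
}

// Hypothetical helper: validate and normalize the request body.
// Defaults (375x667, 'body', 0 ms) and limits are illustrative assumptions.
function parseScreenshotOptions(body) {
    const { url, ele = 'body', width = 375, height = 667, waitTime = 0 } = body || {};

    let parsed;
    try {
        parsed = new URL(url);
    } catch (e) {
        throw httpError(400, 'url must be an absolute URL');
    }
    // Only allow web pages, not file: or other schemes
    if (parsed.protocol !== 'http:' && parsed.protocol !== 'https:') {
        throw httpError(400, 'url must use http or https');
    }
    if (typeof ele !== 'string' || ele.length === 0) {
        throw httpError(400, 'ele must be a non-empty CSS selector');
    }
    for (const [name, value] of [['width', width], ['height', height]]) {
        if (!Number.isInteger(value) || value <= 0 || value > 4096) {
            throw httpError(400, `${name} must be an integer between 1 and 4096`);
        }
    }
    if (!Number.isFinite(waitTime) || waitTime < 0 || waitTime > 10000) {
        throw httpError(400, 'waitTime must be between 0 and 10000 ms');
    }
    return { url: parsed.href, ele, width, height, waitTime };
}
```

Because the thrown errors carry a status property, the error middleware shown above would answer invalid requests with HTTP 400 rather than 500.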

2. Create a screenshot module

Open a browser => open a tab => take a screenshot => close the browser

const puppeteer = require("puppeteer");

module.exports = async (opt) => {
    const browser = await puppeteer.launch();
    try {
        const page = await browser.newPage();
        await page.goto(opt.url, {
            waitUntil: ['networkidle0']
        });
        await page.setViewport({
            width: opt.width,
            height: opt.height,
        });
        const ele = await page.$(opt.ele);
        const base64 = await ele.screenshot({
            fullPage: false,
            omitBackground: true,
            encoding: 'base64'
        });
        return 'data:image/png;base64,' + base64
    } finally {
        // Close the browser even if navigation or the screenshot fails
        await browser.close();
    }
};

  • puppeteer.launch([options]): launches a browser
  • browser.newPage(): creates a tab
  • page.goto(url[, options]): navigates to a page
  • page.setViewport(viewport): sets the size of the page's viewport
  • page.$(selector): selects an element
  • elementHandle.screenshot([options]): takes a screenshot of the element. The encoding option specifies whether the return value is a base64 string or a Buffer
  • browser.close(): closes the browser and all of its tabs
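To connect the screenshot module to the endpoint from step 1, the route handler can simply await it. The sketch below is one possible wiring, not the author's code: createShareImgHandler and the { code: 1, data } response shape are assumptions (the article only shows code: 0 for errors). The screenshot function is passed in as a parameter so the handler can be exercised without launching a browser.

```javascript
// Hypothetical factory: wraps a screenshot function (such as the module above)
// in an Express-style handler.
function createShareImgHandler(takeScreenshot) {
    return async function shareImgHandler(req, res, next) {
        try {
            const { url, ele, width, height } = req.body;
            const dataUrl = await takeScreenshot({ url, ele, width, height });
            // Assumed success shape, mirroring the { code, msg } error shape
            res.json({ code: 1, data: dataUrl });
        } catch (err) {
            // Express 4 does not catch rejected promises from async handlers,
            // so hand the failure to the error middleware explicitly.
            next(err);
        }
    };
}

// Usage (assumes the screenshot module is saved as ./screenshot.js):
// app.post('/api/getShareImg', createShareImgHandler(require('./screenshot')));
```

Forwarding errors with next(err) keeps all failure responses in the single error-handling middleware defined in step 1.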

3. Optimization

1. Request time optimization

The waitUntil option of page.goto(url[, options]) determines when navigation is considered complete. The default is when the load event fires. The possible values are:

 await page.goto(url, {
     waitUntil: [
         'load', // the page "load" event has fired
         'domcontentloaded', // the page "DOMContentLoaded" event has fired
         'networkidle0', // no network connections for at least 500ms
         'networkidle2' // no more than two network connections for at least 500ms
     ]
 });

Waiting for networkidle0 makes the API slow to respond, because it requires 500ms without network activity, and many real scenarios do not need that. Instead, you can wrap a timer in a promise and choose the wait time yourself. For example, our poster page only renders a background image and a QR code image. By the time the page fires load, both are already loaded, so no extra wait is needed and we can pass 0 to skip it.

 const waitTime = (n) => new Promise((r) => setTimeout(r, n));
 // Omit some code
 await page.goto(opt.url);
 await waitTime(opt.waitTime || 0);

If a fixed delay is not enough, the page itself has to signal to Puppeteer when it is ready. Use page.waitForSelector(selector[, options]) to wait for a specific element to appear. For example, the page inserts an element with id="end" once its work is done, and Puppeteer waits for that element.

 await page.waitForSelector("#end")

Similar methods include:

  • page.waitForXPath(xpath[, options]): waits for an element matching the XPath expression to appear on the page.
  • page.waitForSelector(selector[, options]): waits for an element matching the selector to appear on the page, and returns immediately if a matching element already exists when the method is called.
  • page.waitForResponse(urlOrPredicate[, options]): waits for a matching response to arrive.
  • page.waitForRequest(urlOrPredicate[, options]): waits for a matching request to be issued.
  • page.waitForFunction(pageFunction[, options[, ...args]]): waits until the function, evaluated in the page, returns a truthy value.
  • page.waitFor(selectorOrFunctionOrTimeout[, options[, ...args]]): dispatches to the methods above depending on the type of its first argument. For example, given a string, it decides whether it is an XPath or a selector and then behaves like waitForXPath or waitForSelector. Later Puppeteer releases deprecate this method in favor of the specific ones.
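The fixed delay and the selector-based wait can be combined so that each request picks its own strategy. This is a sketch only: waitForReady, opt.readySelector, opt.timeout, and the 5000 ms fallback are hypothetical names and values, while opt.waitTime and page.waitForSelector are the ones used earlier in this section.

```javascript
// Promise-based delay, as defined earlier in this section
const waitTime = (n) => new Promise((resolve) => setTimeout(resolve, n));

// Hypothetical helper: wait for a "ready" element if the caller names one,
// otherwise fall back to a fixed delay (0 by default).
async function waitForReady(page, opt) {
    if (opt.readySelector) {
        // Resolves once the element exists; rejects if the timeout elapses first
        await page.waitForSelector(opt.readySelector, { timeout: opt.timeout || 5000 });
        return 'selector';
    }
    await waitTime(opt.waitTime || 0);
    return 'delay';
}

// Usage inside the screenshot module, after page.goto(opt.url):
// await waitForReady(page, opt);
```

Giving waitForSelector an explicit timeout keeps a page that never signals readiness from holding the request open indefinitely.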

2. Optimize launch options

Chromium starts with many features that this service does not need. Command-line arguments can disable them.

    const browser = await puppeteer.launch({
        headless: true,
        slowMo: 0,
        args: [
            '--no-zygote',
            '--no-sandbox',
            '--disable-gpu',
            '--no-first-run',
            '--single-process',
            '--disable-extensions',
            '--disable-xss-auditor',
            '--disable-dev-shm-usage',
            '--disable-popup-blocking',
            '--disable-setuid-sandbox',
            '--disable-accelerated-2d-canvas',
            '--enable-features=NetworkService'
        ]
    });

3. Reuse the browser

Every call to the API starts a browser and closes it after the screenshot. This wastes resources, and starting a browser takes time. If too many browsers run at the same time, the program also throws an exception. The fix is a pool of browsers: launch several browsers, open the page in a new tab of one of them, and after the screenshot close only the tab, leaving the browser running. The next request creates a tab in an existing browser instead of launching a new one. A browser is closed once it has been used a certain number of times or has been idle for a certain period. generic-pool already provides a ready-made pool implementation, so I use it directly.

const genericPool = require('generic-pool')
const puppeteer = require('puppeteer')

const initPuppeteerPool = () => {
 // Drain and clear the previous pool, if any. Keep a reference to it,
 // because global.pp is reassigned before the drain completes.
 const oldPool = global.pp
 if (oldPool) oldPool.drain().then(() => oldPool.clear())
 const opt = {
   max: 4, // Maximum number of Puppeteer instances in the pool
   min: 1, // Minimum number of instances kept alive in the pool
   testOnBorrow: true, // Validate an instance before handing it to a caller
   autostart: false, // Whether to create instances as soon as the pool is constructed
   idleTimeoutMillis: 1000 * 60 * 60, // Destroy an instance that has been idle for 60 minutes
   evictionRunIntervalMillis: 1000 * 60 * 3, // Check for idle instances every 3 minutes
   maxUses: 2048, // Custom option: maximum number of times an instance is reused
   validator: () => Promise.resolve(true)
 }
 const factory = {
   create: () =>
     puppeteer.launch({
       // Launch options: see "2. Optimize launch options"
     }).then(instance => {
       instance.useCount = 0;
       return instance;
     }),
   // Return the promise so the pool knows when the browser has closed
   destroy: instance => instance.close(),
   validate: instance =>
     opt.validator(instance).then(valid =>
       valid && (opt.maxUses <= 0 || instance.useCount < opt.maxUses))
 }
 const pool = genericPool.createPool(factory, opt)
 const genericAcquire = pool.acquire.bind(pool)
 // Wrap the pool's acquire method to count how many times each instance is used
 pool.acquire = () =>
   genericAcquire().then(instance => {
     instance.useCount += 1
     return instance
   })

 pool.use = fn => {
   let resource
   return pool
     .acquire()
     .then(r => {
       resource = r
       return resource
     })
     .then(fn)
     .then(
       result => {
         // Release the instance whether the caller's function succeeded or failed
         pool.release(resource)
         return result
       },
       err => {
         pool.release(resource)
         throw err
       }
     )
 }
 return pool;
}
global.pp = initPuppeteerPool()
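With the pool in place, the screenshot step no longer launches or closes a browser itself. A minimal sketch of how it might borrow one through pool.use, closing only the tab afterwards, follows. screenshotWithPool is a hypothetical name, and the pool is passed in as a parameter (in this article it would be global.pp) so that the function can be tried with a stub.

```javascript
// Sketch: take a screenshot in a pooled browser instead of launching a new one.
// `pool` is expected to expose use(fn) as defined above.
async function screenshotWithPool(pool, opt) {
    return pool.use(async (browser) => {
        const page = await browser.newPage();
        try {
            await page.setViewport({ width: opt.width, height: opt.height });
            await page.goto(opt.url);
            const ele = await page.$(opt.ele);
            if (!ele) throw new Error(`Element not found: ${opt.ele}`);
            const base64 = await ele.screenshot({ omitBackground: true, encoding: 'base64' });
            return 'data:image/png;base64,' + base64;
        } finally {
            // Close only the tab; pool.use returns the browser to the pool
            await page.close();
        }
    });
}

// Usage: const dataUrl = await screenshotWithPool(global.pp, opt);
```

Closing the tab in a finally block matters here: a tab left open after a failed screenshot would stay alive inside a browser that keeps being reused.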

4. Cache results to avoid regenerating the same image

Repeated calls with the same parameters each start a browser process to take the same screenshot, so a cache can absorb the repeated requests. Pass in a unique key (such as user ID + activity ID) and store the image's base64 under that key in Redis or in memory. When a request arrives, first check whether the image is already in the cache. If it is, return the cached image directly. Otherwise, go through the poster generation process.
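The in-memory variant of this idea can be sketched as follows. createPosterCache, getPoster, and the 10-minute TTL are hypothetical. The generate function stands for whatever produces the poster, for example the Puppeteer screenshot. Caching the promise rather than the finished string also lets concurrent requests for the same key share one generation.

```javascript
// Minimal in-memory cache sketch; the TTL default is an illustrative assumption.
function createPosterCache(ttlMs = 10 * 60 * 1000) {
    const entries = new Map(); // key -> { promise, expiresAt }

    return function getPoster(key, generate) {
        const now = Date.now();
        const hit = entries.get(key);
        if (hit && hit.expiresAt > now) {
            return hit.promise; // already generated, or generation in progress
        }
        const entry = { expiresAt: now + ttlMs };
        entry.promise = Promise.resolve()
            .then(generate)
            .catch((err) => {
                // Do not cache failures, so the next request retries
                if (entries.get(key) === entry) entries.delete(key);
                throw err;
            });
        entries.set(key, entry);
        return entry.promise;
    };
}

// Usage: the key combines user ID and activity ID, as suggested above
// const getPoster = createPosterCache();
// const dataUrl = await getPoster(`${userId}:${activityId}`, () => takeScreenshot(opt));
```

This sketch only replaces expired entries when the same key is requested again, so memory grows with the number of distinct keys. A shared store such as Redis with key expiry avoids that and also works across multiple servers.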

Closing thoughts

This approach is now in trial use in our project, and it is very friendly to me as a front-end developer. There is no need to draw on a canvas step by step in the mini program, no need to worry about cross-origin resources, and no need to worry about compatibility with the WeChat browser and the various built-in browsers. The time saved is what let me write this article. As for performance, which was my main worry: generation is triggered only by the share action, so concurrency is low, and no performance problems have surfaced so far. If experienced readers can point out further optimizations or pitfalls to guard against, I would appreciate it.

Code

View the full code: GitHub

Related reading

About how I almost ran away using HTML2Canvas
