Two ways to take web page screenshots: html2canvas & Puppeteer

Requirement

Last week I was given a requirement: save the contents of a web page as a sequence of images, upload them to OSS, and return the URLs so the back end can generate a PPT. This post shares how to save web page content as an image. What happens after the image is produced, such as saving it locally or uploading it to a server, is not covered here.

Implementation

html2canvas

html2canvas is a screenshot library that captures a whole web page, or part of it, directly in the user's browser. It does not take an actual screenshot; it builds the image from the DOM and the style information available on the page, so the result may not be 100% accurate. It can only render the properties it understands, which means a number of CSS properties do not work, animations among them.

As the website says, it’s very simple to use.

    npm install html2canvas

    import html2canvas from 'html2canvas'

    html2canvas(document.body).then(function(canvas) {
        document.body.appendChild(canvas);
    });

In fact, a single line is enough to get the rendered canvas: await html2canvas(element), where element is the DOM node you want to convert.

My sample code

    <button @click="shot">click me to convert</button>
    <div>below here are something that i want to take them to screenshot</div>
    <div class="wrap" ref="wrap">
      <div>Yuxi said, "If you can't read the document, go home and feed the pigs." IE says, why are you all looking at me?</div>
      <img alt="Vue logo" src="../assets/logo.png" />
    </div>

    async shot() {
      // The element registered with ref="wrap" in the template.
      const ele = this.$refs["wrap"];
      const canvas = await html2canvas(ele);
      // Encode the canvas as a base64 PNG data URL.
      const img = canvas.toDataURL("image/png");
      this.debugBase64(img);
    },
    // This method opens the rendered image in a new page.
    // An iframe is used because Chrome blocks top-level navigation to data: URLs.
    debugBase64(base64URL) {
      const win = window.open();
      win.document.write(
        '<iframe src="' +
          base64URL +
          '" frameborder="0" style="border:0; top:0px; left:0px; bottom:0px; right:0px; width:100%; height:100%;" allowfullscreen></iframe>'
      );
    },
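
The string returned by canvas.toDataURL is a base64 data URL. If the raw image bytes are needed later, it can be decoded as in the sketch below. The helper name dataUrlToBytes is hypothetical and not part of html2canvas; the sketch assumes the data URL is base64 encoded, which is what toDataURL produces.

```javascript
// Hypothetical helper (not part of html2canvas): decode a base64 data URL,
// such as the result of canvas.toDataURL("image/png"), into raw bytes.
function dataUrlToBytes(dataUrl) {
  // Expected shape: data:<mime type>;base64,<payload>
  const match = /^data:([^;,]+);base64,(.+)$/.exec(dataUrl);
  if (!match) {
    throw new Error("Expected a base64-encoded data URL");
  }
  const [, mimeType, base64] = match;
  // atob turns base64 into a binary string with one character per byte.
  const binary = atob(base64);
  const bytes = new Uint8Array(binary.length);
  for (let i = 0; i < binary.length; i++) {
    bytes[i] = binary.charCodeAt(i);
  }
  return { mimeType, bytes };
}
```

In the shot method above this could be called as dataUrlToBytes(img). In a browser, canvas.toBlob is another way to get binary data without going through a string.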

The effect is shown below:

But I actually fell into a big pit here. The content I need to render contains formulas, which are rendered with MathJax, and the MathJax output is SVG, so my formulas could not be converted to canvas. That is why I ended up using Puppeteer instead.

puppeteer

A previous article covered using this Node library to render web pages as PDFs and save them locally. Rendering to PDF is only one of the library's features; taking screenshots is another.

screenshot is a method on a Page instance, and it accepts a set of configuration options. You can capture the full page, or capture a specified region by giving its coordinates, width, and height.
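
As a rough illustration of the two modes, the sketch below builds the options object for either a full-page capture or a region capture. The path, type, fullPage, and clip options belong to Puppeteer's screenshot method; the helper buildScreenshotOptions and the file names are hypothetical. The two modes are kept separate here on the assumption that clip and fullPage should not be combined.

```javascript
// Hypothetical helper: build the options object for page.screenshot().
// Only the option names (path, type, fullPage, clip) come from Puppeteer.
function buildScreenshotOptions(path, clip) {
  const options = { path, type: "png" };
  if (clip) {
    // Capture only the rectangle { x, y, width, height }.
    options.clip = clip;
  } else {
    // No rectangle given: capture the whole scrollable page.
    options.fullPage = true;
  }
  return options;
}

// Usage, assuming `page` is a Puppeteer Page:
//   await page.screenshot(buildScreenshotOptions("full.png"));
//   await page.screenshot(
//     buildScreenshotOptions("part.png", { x: 0, y: 0, width: 800, height: 600 })
//   );
```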

Using Puppeteer is the easy part; installing it is harder, because the install step downloads a full copy of Chrome and can fail for network reasons. If the download fails, you can list only puppeteer-core in package.json and point it at a copy of Chrome you downloaded yourself. I won't go into details here; you can search for the relevant solutions. The example below takes a full-page screenshot of the Zhihu home page.

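    const puppeteer = require("puppeteer-core");

    async function getPic() {
      // puppeteer-core does not bundle a browser, so point it at a local Chrome.
      const browser = await puppeteer.launch({
        executablePath: "./chrome-win/chrome.exe"
      });
      const page = await browser.newPage();
      await page.goto("https://zhihu.com", {
        // Wait until there have been no network connections for at least 500 ms.
        waitUntil: "networkidle0"
      });
      await page.setViewport({ width: 1980, height: 1080 });
      await page.content();
      await page.screenshot({
        // The file extension matches type "jpeg"; quality only applies to JPEG.
        path: "zhihu.jpg",
        type: "jpeg",
        quality: 100,
        fullPage: true
      });

      await browser.close();
    }

    getPic();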

executablePath: "./chrome-win/chrome.exe" tells Puppeteer where to find the Chrome you downloaded yourself. For some single-page applications you may also need to wait until all requests have finished loading, i.e. waitUntil: 'networkidle0'.

The screenshot looks like this:

To capture a specific region, you need the coordinates, width, and height of that region. Usually the part to capture is rendered inside a single DOM element; if it is not, wrap it in one.

To get the position of a DOM element on the page, use the DOM properties offsetLeft, offsetTop, offsetWidth, and offsetHeight. If the layout has special cases, computing the coordinates may need extra handling.
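
One common way to combine these properties is to add up offsetLeft and offsetTop along the offsetParent chain. The sketch below does that; the function name getPageRect is hypothetical, and it assumes a simple layout, since ancestor borders, CSS transforms, and scrolled containers are not accounted for.

```javascript
// Hypothetical helper, meant to run inside the page: compute an element's
// rectangle relative to the page by walking up the offsetParent chain.
// Assumes a simple layout; ancestor borders, CSS transforms and scrolled
// containers are not accounted for.
function getPageRect(element) {
  let x = 0;
  let y = 0;
  // offsetLeft/offsetTop are relative to offsetParent, so sum every level.
  for (let node = element; node; node = node.offsetParent) {
    x += node.offsetLeft;
    y += node.offsetTop;
  }
  return {
    x,
    y,
    width: element.offsetWidth,
    height: element.offsetHeight
  };
}
```

The returned object has the same x, y, width, height shape that the clip option of a region screenshot expects.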

Once I had the coordinates, the screenshot itself worked. But there is another problem: the coordinates exist only in the page, while it is the Node side that needs them. How do we bridge the page and Node so that the coordinates can be passed across? This is done with the evaluate method of the Page instance, which takes two parameters: pageFunction and args. The former is the function to execute in the context of the current page, and the latter holds the arguments passed to that function. In this case, the page stores the element's coordinate information on the window object, and pageFunction returns that information from window. Sometimes you also need to pass data into the page for it to use; that is what the second parameter is for. The code is as follows:

    const position = await page.evaluate(() => {
        // eslint-disable-next-line no-undef
        return window.elePostion
    })
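
For the other direction, passing data into the page through the second parameter, a minimal sketch might look like this. It is a variation rather than the approach used above: instead of reading a value the page stored on window, it passes a CSS selector in and measures the element inside pageFunction. The helper name getElementRect and the selector are hypothetical, and page is assumed to be a Puppeteer Page.

```javascript
// Hypothetical helper: ask the page for the rectangle of the first element
// matching `selector`. `page` is assumed to be a Puppeteer Page.
async function getElementRect(page, selector) {
  // The arrow function runs inside the browser page. `selector` is
  // serialized on the Node side and arrives as the argument `sel`.
  return page.evaluate((sel) => {
    const el = document.querySelector(sel);
    if (!el) {
      return null;
    }
    const { left, top, width, height } = el.getBoundingClientRect();
    // getBoundingClientRect is relative to the viewport, so add the scroll
    // offset to get coordinates relative to the page.
    return {
      x: left + window.scrollX,
      y: top + window.scrollY,
      width,
      height
    };
  }, selector);
}

// Usage, assuming an element with class "wrap" exists on the page:
//   const position = await getElementRect(page, ".wrap");
```

Only serializable values can cross this boundary, which is why the function returns a plain object instead of the DOM element itself.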

After you get the coordinates, you can take the screenshots one by one. Each screenshot can be saved as a local file, or you can get the image data as a buffer.
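
Taking the screenshots in turn could be sketched as follows. The helper name captureRegions is hypothetical; page is assumed to be a Puppeteer Page and rects an array of { x, y, width, height } objects obtained from the page.

```javascript
// Hypothetical helper: capture each rectangle in turn and collect the
// image data. `page` is assumed to be a Puppeteer Page.
async function captureRegions(page, rects) {
  const images = [];
  for (const rect of rects) {
    // No `path` option is given, so nothing is written to disk; the call
    // resolves with the binary image data instead.
    const image = await page.screenshot({ type: "png", clip: rect });
    images.push(image);
  }
  return images;
}
```

The screenshots are awaited one after another, not started in parallel, so that each capture finishes before the next one begins. To save files instead, a path option can be added per rectangle.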
