What is Puppeteer?

The Puppeteer website describes it as follows: Puppeteer is a Node library which provides a high-level API to control Chrome or Chromium over the DevTools Protocol. Puppeteer runs headless by default, but can be configured to run full (non-headless) Chrome or Chromium.

In plain terms: Puppeteer can run Chrome or Chromium without a user interface (it can also run on a machine with a graphical interface, of course) and lets you control the browser's behavior through code. Even in headless mode, Chrome or Chromium still renders web pages correctly in memory. So what can Puppeteer do?

  • Generate screenshots or PDFs of web pages
  • Crawl an SPA (single-page application) to produce pre-rendered content for server-side rendering (SSR)
  • Build advanced crawlers that can scrape large amounts of asynchronously rendered page content
  • Simulate keyboard input, automatic form submission, web page login, and so on, for automated UI testing
  • Capture a timeline trace of your site to help analyze performance issues
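As a small illustration of the first item, the sketch below renders a page to a PDF file. It assumes Puppeteer is installed locally. The helper name `renderPdf` and the choice to pass the `puppeteer` module in as a parameter are illustrative assumptions, not something taken from this article or from the Puppeteer documentation.

```javascript
// Sketch: render a URL to a PDF file with Puppeteer.
// `renderPdf` is a hypothetical helper. The puppeteer module is passed in
// as an argument so the function can be exercised without a real browser.
async function renderPdf(puppeteer, url, outputPath) {
  const browser = await puppeteer.launch({ headless: true });
  try {
    const page = await browser.newPage();
    // Wait until the network is nearly idle so async content has rendered.
    await page.goto(url, { waitUntil: 'networkidle2' });
    await page.pdf({ path: outputPath, format: 'A4' });
    return outputPath;
  } finally {
    // Always close the browser, even if navigation or rendering fails.
    await browser.close();
  }
}

// Usage (assumes `npm install puppeteer` has been run):
// renderPdf(require('puppeteer'), 'https://example.com', '/tmp/example.pdf');
```

The screenshot scenario used in the rest of this article follows the same launch, navigate, capture, close pattern.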

This article uses the screenshot scenario as its demonstration.

Without further ado, let's look at how to quickly deploy a Puppeteer web application using Function Compute.

How to quickly deploy a distributed Puppeteer Web application?

To rapidly deploy a distributed Puppeteer web application, this article uses the Function Compute service as an example.

Function Compute: Function Compute is an event-driven service that lets users write and upload code without managing servers. It prepares computing resources and runs user code elastically, and users pay only for the resources the code actually consumes while running. For more information, see the Function Compute documentation.

With Function Compute, the goal is still to build a distributed application, but all we have to do is write the business code and deploy it to Function Compute. That's it.

With Function Compute, the system architecture looks like this:

Demo

You can see the result directly at the following link: 1911504709953557.cn-hangzhou.fc.aliyuncs.com/2016-08-15/… PS: The first request may incur a few seconds of cold start latency, but cold starts can be eliminated entirely by using reserved mode. I'll cover that in a later article.

Setup Steps:

The overall process is shown in the figure below:

fun init, fun install, and fun deploy are the only three commands we need to run. Each step is completed automatically by one of them.

1. Install tools

Install Fun:

We recommend downloading the binary executable and unzipping it. It can then be used directly.

Install Docker: You can do this as described here.

2. Initialize the project:

With Fun, you can quickly initialize the scaffolding for a Puppeteer Web application using the following command:

fun init -n puppeteer-test http-trigger-node-puppeteer

In the preceding command, -n puppeteer-test specifies the name of the directory in which the project is initialized, and http-trigger-node-puppeteer is the name of the template to use. The template name can be omitted, in which case you select a template from the list shown in the terminal. After the command finishes, you will see the following directory structure:

.
├── index.js
├── package.json
└── template.yml

Compared with a traditional Puppeteer application, there is only one extra file, template.yml, which describes the Function Compute resources. index.js contains our business code. You can write your own business logic by following the official Puppeteer documentation. The core code is as follows:

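const puppeteer = require('puppeteer');

// The calls below run inside an async function, such as the function handler.
const browser = await puppeteer.launch({
  headless: true,
  // These flags disable Chromium's sandbox.
  args: ['--no-sandbox', '--disable-setuid-sandbox']
});
const page = await browser.newPage();
// Render the page as if the browser were in the Asia/Shanghai time zone.
await page.emulateTimezone('Asia/Shanghai');
// Wait until the network is nearly idle before taking the screenshot.
await page.goto('https://www.baidu.com', {
  'waitUntil': 'networkidle2'
});
await page.screenshot({ path: '/tmp/example', fullPage: true, type: 'png' });
await browser.close();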
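To show how the core code might sit inside an HTTP-triggered function, here is a minimal sketch. It is not the template's actual index.js. It assumes the Function Compute Node.js HTTP-trigger convention of a `(request, response, context)` handler whose `response` object offers `setStatusCode`, `setHeader`, and `send`. The `createHandler` factory is a hypothetical name introduced here so that the `puppeteer` module can be injected.

```javascript
// Sketch of an HTTP-trigger handler wrapping the screenshot code.
// Assumptions: the (request, response, context) signature and the
// response.setStatusCode / setHeader / send methods follow the Function
// Compute Node.js HTTP-trigger convention. `createHandler` is hypothetical.
function createHandler(puppeteer) {
  return async function handler(request, response, context) {
    let browser;
    try {
      browser = await puppeteer.launch({
        headless: true,
        args: ['--no-sandbox', '--disable-setuid-sandbox'],
      });
      const page = await browser.newPage();
      await page.emulateTimezone('Asia/Shanghai');
      await page.goto('https://www.baidu.com', { waitUntil: 'networkidle2' });
      // Without a `path` option, screenshot() resolves to a Buffer.
      const image = await page.screenshot({ fullPage: true, type: 'png' });
      response.setStatusCode(200);
      response.setHeader('content-type', 'image/png');
      response.send(image);
    } catch (err) {
      // Report failures as plain text instead of leaving the request hanging.
      response.setStatusCode(500);
      response.setHeader('content-type', 'text/plain');
      response.send(String(err && err.message ? err.message : err));
    } finally {
      if (browser) await browser.close();
    }
  };
}

// In index.js this might be wired up as:
// module.exports.handler = createHandler(require('puppeteer'));
```

Returning the image as a Buffer avoids writing to /tmp first, and closing the browser in `finally` keeps a failed request from leaving a Chromium process behind.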

package.json contains the following:

{
  ...
  "dependencies": {
    "puppeteer": "^2.0.0"
  },
  ...
}

As you can see, the puppeteer dependency is declared in package.json. This is standard practice in Node development, and there is nothing special about it.

3. One-click dependency installation

Installing Puppeteer is not easy, even on a traditional Linux machine. Puppeteer depends on many system libraries, and working out which ones to install and how to install them becomes a headache.

Fortunately, Fun, the Function Compute command-line tool, already integrates a solution for Puppeteer. As long as package.json declares the puppeteer dependency, running fun install -d installs all system dependencies in one step.

fun install -d

4. Run and debug functions locally

Running and debugging the Puppeteer function locally works exactly the same way as for other functions, so it is not repeated here. We only show the result:

5. One-click application deployment

Almost all FaaS platforms limit code package size to reduce cold start time, and Function Compute is no exception. Puppeteer itself is around 350 MB, and together with its system dependencies the function reaches about 450 MB. Deploying a 450 MB function to a FaaS platform is a tedious headache.

Fun, the command-line tool for Function Compute, now natively supports deploying such large dependencies (as of version 3.1.1, only the Node runtime is supported). No additional action is required. Just run fun deploy:

$ fun deploy

Fun automates the deployment of large dependencies. When it detects that the packaged dependencies exceed the platform limit, it enters a configuration wizard that helps the user automate the setup.
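To make the idea of a size check concrete, the sketch below sums the sizes of all files under a directory and compares the total with a limit. This only illustrates the general technique. `directorySize` and `exceedsLimit` are hypothetical names, the limit is a parameter rather than the platform's real value, and Fun's actual detection logic is not shown in this article.

```javascript
const fs = require('fs');
const path = require('path');

// Recursively sum the size in bytes of all regular files under `dir`.
// Symbolic links and other special entries are skipped.
function directorySize(dir) {
  let total = 0;
  for (const entry of fs.readdirSync(dir, { withFileTypes: true })) {
    const full = path.join(dir, entry.name);
    if (entry.isDirectory()) {
      total += directorySize(full);
    } else if (entry.isFile()) {
      total += fs.statSync(full).size;
    }
  }
  return total;
}

// Return true when the directory is larger than `limitBytes`.
// The caller supplies the limit. No platform value is assumed here.
function exceedsLimit(dir, limitBytes) {
  return directorySize(dir) > limitBytes;
}

// Usage (the limit is a placeholder chosen for illustration):
// exceedsLimit('./node_modules', 100 * 1024 * 1024);
```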

The recommended path is to enter yes when asked whether to use NasConfig: Auto, so that the NAS is handled automatically. After that, no further action is needed until the function is deployed successfully.

If you have other requirements, such as using an existing NAS service, enter no at the NasConfig: Auto prompt to go through the corresponding flow. For more details, see the FAQ below.

FAQ

What does Fun do when it installs Puppeteer?

Puppeteer itself is an npm package and is easy to install with npm install. The problem is that Puppeteer depends on Chromium, which in turn depends on a number of system libraries, so npm install also triggers a Chromium download. Users often run into two problems here:

  1. Because Chromium is large, the download often fails due to network problems.
  2. npm only downloads Chromium. The system libraries that Chromium depends on are not installed automatically, so users have to track down and install the missing dependencies themselves.

The main optimizations Fun makes are:

  1. It detects the network environment and, for users in mainland China, configures the Taobao npm mirror to speed up downloads.
  2. It automatically installs the system dependencies that Chromium is missing.

How does Fun deploy large dependencies to Function Compute? Isn't there a code package size limit?

Almost all FaaS platforms limit the size of the function code package in order to optimize cold starts, and Function Compute is no exception. However, through its built-in NAS (Alibaba Cloud File Storage) solution, Fun can create and configure a NAS and upload the dependencies to it with one click. At run time, Function Compute then reads the function's dependencies from the NAS automatically.

To help users automate these operations, Fun has a built-in wizard (version 3.1.1 supports only Node; more runtimes will follow, and GitHub issues are welcome). When Fun detects that the code package exceeds the platform limit, it offers to move the dependencies to NAS. The overall logic of the wizard is as follows:

  1. Ask whether to let Fun automatically configure NAS to manage the dependencies. (If yes, enter the wizard; if no, continue the normal publishing process.)
  2. Check whether a NAS is already configured in the user's template.yml.
  3. If one is configured, prompt the user to select a configured NAS to store the function dependencies.
  4. If none is configured, ask whether to use NasConfig: Auto to create the NAS configuration automatically.
  5. If the user selects yes, Fun automatically configures the NAS and VPC resources.
  6. If the user selects no, Fun lists the NAS resources that already exist in the NAS console for the user to choose from.
  7. Either way, the NAS and VPC configuration is eventually generated in template.yml.
  8. Based on the detected runtime (for example, the Node runtime), node_modules and .fun/root are mapped to NAS directories (via .nas.yml).
  9. Fun automatically runs fun nas sync to upload the local dependencies to the NAS service.
  10. Fun automatically runs fun deploy to upload the code to Function Compute.
  11. Fun prints help information. For an HTTP trigger, this includes the function's endpoint, which can be opened directly in a browser.
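The branching in the steps above can be modeled as a small pure function. This is a hypothetical model written for this article, not Fun's source code. The function name `planNasDeployment`, its three boolean inputs, and the step descriptions it returns are all illustrative.

```javascript
// Hypothetical model of the wizard's decision flow. Not Fun's actual code.
// Inputs:
//   useFunForNas            - answer to "let Fun configure NAS automatically?"
//   nasConfiguredInTemplate - whether template.yml already has a NAS config
//   useNasConfigAuto        - answer to "use NasConfig: Auto?"
function planNasDeployment({ useFunForNas, nasConfiguredInTemplate, useNasConfigAuto }) {
  if (!useFunForNas) {
    // Step 1 answered "no": skip the wizard entirely.
    return ['continue the normal deploy'];
  }
  const steps = [];
  if (nasConfiguredInTemplate) {
    steps.push('select a NAS already configured in template.yml');
  } else {
    steps.push(
      useNasConfigAuto
        ? 'create NAS and VPC resources automatically (NasConfig: Auto)'
        : 'select an existing NAS from the NAS console'
    );
    steps.push('write NAS and VPC configuration to template.yml');
  }
  steps.push('map node_modules and .fun/root to NAS directories in .nas.yml');
  steps.push('run fun nas sync');
  steps.push('run fun deploy');
  steps.push('print help information');
  return steps;
}
```

Calling it with `useFunForNas: true`, `nasConfiguredInTemplate: false`, and `useNasConfigAuto: true` yields the recommended path described earlier: automatic NAS creation, followed by the sync and deploy steps.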

Can I specify the version of Puppeteer?

Yes, just modify the Puppeteer version in package.json and reinstall it.

Function Compute instances use the UTC time zone. Is there a way to change it?

How some web pages are displayed depends on the time zone, and different time zones may produce different content. Puppeteer 2.0, the latest version at the time of writing, provides a new API, page.emulateTimezone(timezoneId), that makes it easy to change the time zone.
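A minimal sketch of how this API might be used is shown below. `timezoneOffsetFor` is a hypothetical helper name, and `page` is assumed to be a Puppeteer Page obtained from `browser.newPage()`.

```javascript
// Sketch: emulate a time zone and report the offset the page observes.
// `timezoneOffsetFor` is a hypothetical helper; `page` is a Puppeteer Page.
async function timezoneOffsetFor(page, timezoneId) {
  await page.emulateTimezone(timezoneId);
  // getTimezoneOffset() runs inside the page and returns the difference,
  // in minutes, between UTC and the page's local time.
  return page.evaluate(() => new Date().getTimezoneOffset());
}

// Usage inside an async function, after `const page = await browser.newPage()`:
// const offset = await timezoneOffsetFor(page, 'Asia/Shanghai');
```

Checking the offset from inside the page is a quick way to confirm that the emulation took effect before taking a screenshot.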

If later versions of Puppeteer are updated and rely on more system dependencies, will the approach described in this article still apply?

Fun has a built-in mechanism for detecting missing .so files. When the function runs locally, Fun identifies the missing shared libraries from the error output and then gives the exact installation commands, so they can be installed in one step.
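The general idea of spotting a missing shared library in error output can be sketched as follows. This is an illustration only and not Fun's implementation. The function name `findMissingSharedLibrary` is hypothetical, and the pattern targets the message the Linux dynamic loader prints when a library cannot be found.

```javascript
// Hypothetical illustration of detecting a missing shared library from an
// error message. This is not Fun's actual implementation.
function findMissingSharedLibrary(errorText) {
  // Matches the dynamic loader's message, for example:
  // "error while loading shared libraries: libnss3.so: cannot open shared object file"
  const pattern = /error while loading shared libraries: (\S+?): cannot open shared object file/;
  const match = pattern.exec(errorText);
  // Return the library file name, or null when the text has no such error.
  return match ? match[1] : null;
}

// Usage:
// findMissingSharedLibrary(stderrOutput); // a library name such as "libnss3.so", or null
```

Mapping the library name to the system package that provides it is a separate step and is not covered by this sketch.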

If new dependencies are added, how are they updated?

If a new dependency is added, such as a new library in the node_modules directory, simply run fun nas sync again to synchronize it.

If you change the code, simply redeploy with fun deploy. Because NAS separates the large dependencies from the code and dependencies rarely change, fun nas sync is needed far less often than fun deploy, and fun deploy itself becomes faster because the code package no longer carries the large dependencies.

What other ways can you install Puppeteer with one click?

Fun provides many ways to install dependencies. Besides declaring dependencies in package.json and installing them with fun install -d, there are several other methods, each suited to its own scenarios:

  1. Imperative installation, for example: fun install -f functionName -p npm puppeteer. The advantage of this method is that even users unfamiliar with Fun can get started with no learning curve.
  2. Declarative installation. The advantage of this method is a Dockerfile-like experience, in which most Dockerfile instructions are directly available. Dependencies declared this way can be committed directly to the code repository, and anyone who pulls the code can install all dependencies with one command.
  3. Interactive environment installation. The advantage of this method is an installation experience similar to a traditional physical machine. Most Linux commands are available in the interactive environment, so you can freely experiment by trial and error.

Conclusion

This article introduces a relatively simple way to build a distributed Puppeteer web service from scratch. With this approach, deployment goes smoothly without worrying about how dependencies are installed or uploaded.

Once deployed, you can enjoy the benefits of Function Compute:

  • No need to purchase or manage servers and other infrastructure. You can focus on business logic, which greatly reduces delivery time and labor costs
  • Log query, performance monitoring, and alerting are provided to help locate and fix faults quickly
  • No operations and maintenance required. Millisecond-level elastic scaling quickly expands the underlying capacity to handle peak load, with excellent performance
  • Very competitive cost
