Some time ago, we received a requirement at work: take text content selected by users (rich text, including pictures), paginate it automatically, and generate a PDF in which every page has the same header and footer. Implementing the typesetting in a back-end language would mean writing our own dynamic pagination logic for rich text, which is quite troublesome. So the idea was to use web technology for the layout and puppeteer to generate the PDF.

The implementation ran into a few pitfalls, which are recorded here.

Preface

Puppeteer is an official Google tool for headless Chrome. It can be understood as using JavaScript to drive Chrome through tasks (crawling, taking screenshots, and so on) without opening Chrome's graphical interface. There are already many articles on Juejin about setting up the environment and using puppeteer, so I won't go into that here. This article only covers how to generate automatically paginated PDFs.

How to implement paging

Simple pagination is easy to implement. When page.pdf() is given a page size, puppeteer automatically paginates the captured content according to that size. The effect is similar to pressing Ctrl+P in Chrome and saving as PDF. The following code generates a PDF with A4-sized pages:

import puppeteer from 'puppeteer';
const browser = await puppeteer.launch({ headless: true });
const page = await browser.newPage();
await page.goto(url);
const pdfBuffer = await page.pdf({
  format: 'A4',
  scale: 1,
  margin: { top: '0', bottom: '0', left: '0', right: '0' },
  landscape: false,
  displayHeaderFooter: false,
});
await browser.close();
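The snippet above leaves the browser running if page.goto or page.pdf throws. A minimal sketch of one way to guard against that, assuming the puppeteer module is passed in as a parameter; the function name renderPdf is hypothetical and not part of the original demo:

```javascript
// Hypothetical helper: renders `url` to an A4 PDF buffer and always closes the browser.
// `puppeteer` is the imported puppeteer module, passed in so the helper has no hard dependency.
async function renderPdf(puppeteer, url) {
  const browser = await puppeteer.launch({ headless: true });
  try {
    const page = await browser.newPage();
    await page.goto(url);
    // Same options as in the article's snippet
    return await page.pdf({
      format: 'A4',
      scale: 1,
      margin: { top: '0', bottom: '0', left: '0', right: '0' },
      landscape: false,
      displayHeaderFooter: false,
    });
  } finally {
    // Runs on success and on failure, so no headless Chrome process is left behind
    await browser.close();
  }
}
```

The returned buffer can then be written to disk or passed on to a later processing step.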

The PDF generated from the Juejin home page looks like this:

Implement fixed header and footer for each page

1. Use fixed layout

Pagination is easy; the difficulty is getting every page to show the same header and footer. I first tried position: fixed, which does pin the header and footer to every page. However, because of the fixed layout, the content ends up hidden behind the header. Take the Juejin home page as an example:

As shown, the area in the red box is covered by the fixed header.

2. Use table layout

After a lot of Googling and debugging, I finally found that a table layout meets the requirement better. The approach is to put the header in thead and the footer in tfoot. The code is as follows:

<table class="table-container">
  <!-- Fixed header for each page -->
  <thead class="table-header">
    <tr class="table-row">
      <th class="table-row-item">
        <div class="page-header-wrapper">
          <header class="page-header">
            <div class="left">
              <div class="logo-wrapper">
                <img class="logo" src="@/assets/images/logo.svg" alt="logo" />
                <div class="user-name">The header</div>
              </div>
            </div>
          </header>
        </div>
      </th>
    </tr>
  </thead>

  <!-- Container that wraps the page content -->
  <tbody class="table-body">
    <tr class="table-row">
      <td class="table-row-item">
        <div class="container">
          <!-- Place the page content here -->
        </div>
      </td>
    </tr>
  </tbody>

  <!-- Fixed footer for each page -->
  <tfoot class="table-footer">
    <tr class="table-row">
      <td class="table-row-item">
        <div class="page-footer">footer</div>
      </td>
    </tr>
  </tfoot>
</table>
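Chrome repeats a table's thead and tfoot on every printed page, which is what makes this layout work. A minimal sketch of building such a document as a string for puppeteer; the function name buildPrintableHtml and its parameters are hypothetical, and the markup is trimmed compared with the template above:

```javascript
// Hypothetical helper: wraps header, body and footer HTML in the table layout.
function buildPrintableHtml(headerHtml, bodyHtml, footerHtml) {
  return `<!DOCTYPE html>
<html>
<head>
<meta charset="utf-8">
<style>
  .table-container { width: 100%; border-collapse: collapse; }
  /* Default display values for thead/tfoot, stated explicitly */
  thead { display: table-header-group; }
  tfoot { display: table-footer-group; }
</style>
</head>
<body>
<table class="table-container">
  <thead><tr><th>${headerHtml}</th></tr></thead>
  <tbody><tr><td>${bodyHtml}</td></tr></tbody>
  <tfoot><tr><td>${footerHtml}</td></tr></tfoot>
</table>
</body>
</html>`;
}
```

The resulting string could be loaded with page.setContent(html) before calling page.pdf().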

The result looks like this:

3. Use pdf-lib to generate the footer

With fixed headers and footers on every page in place, a new problem arose. Since the content is dynamic, the last page is not necessarily filled to the bottom, so with the approach above the footer on the last page may not sit at the bottom of the page, leaving the last page styled inconsistently with the others. As shown in the figure:

After debugging for a long time, I could not solve this problem on the web side, so I changed my approach: leave only a blank area at the bottom of each page, and after the PDF is generated, draw the footer with a separate tool. I used pdf-lib, a Node.js library. The code is as follows:

import { PDFDocument, rgb } from 'pdf-lib';
import fontkit from '@pdf-lib/fontkit';
const pdfDoc = await PDFDocument.load(pdfBuffer);
pdfDoc.registerFontkit(fontkit);
const customFont = await pdfDoc.embedFont(SimSun);
const pages = pdfDoc.getPages();
const firstPage = pages[0];
const { width: pageWidth } = firstPage.getSize();
pages.forEach((page, index) => {
  // Random motto on the left side of the footer
  const text = motto[this.getRandom(motto.length, 0)];
  page.drawText(text, {
    x: 41, y: 23, size: 11, font: customFont,
    color: rgb(0.302, 0.302, 0.302),
  });
  // Page counter on the right side of the footer
  page.drawText(`Page ${index + 1} of ${pages.length}`, {
    x: pageWidth - 100, y: 23, size: 11, font: customFont,
    color: rgb(0.302, 0.302, 0.302),
  });
  // Thin separator line above the footer text
  page.drawRectangle({
    x: 41, y: 45, width: pageWidth - 82, height: 0.6,
    borderColor: rgb(0.941, 0.941, 0.941), borderWidth: 0.6,
  });
});
const editedPdfBuffer = await pdfDoc.save();
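The page counter above is placed at a fixed offset (pageWidth - 100), which only lines up with the right margin if the text width stays roughly constant. A sketch of an alternative that measures the text first, assuming pdf-lib's PDFFont.widthOfTextAtSize; the helper name rightAlignedX is hypothetical, and reusing 41 as the right margin is only an illustration:

```javascript
// Hypothetical helper: x coordinate that right-aligns `text` against the right margin.
// `font` is a pdf-lib PDFFont (for example the result of pdfDoc.embedFont).
function rightAlignedX(font, text, size, pageWidth, rightMargin) {
  const textWidth = font.widthOfTextAtSize(text, size);
  return pageWidth - rightMargin - textWidth;
}

// Possible use inside the forEach loop (label text is up to you):
//   const x = rightAlignedX(customFont, label, 11, pageWidth, 41);
//   page.drawText(label, { x, y: 23, size: 11, font: customFont });
```

This keeps the counter flush with the separator line regardless of how many digits the page numbers have.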

PS: if you need a specific Chinese font with pdf-lib, the font is embedded in the PDF file, which greatly increases the file size. You can first use a font subsetting tool to keep only the characters that will actually be used.
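Besides trimming the font file beforehand, pdf-lib's embedFont accepts a subset option that asks it to embed only the glyphs actually used. A minimal sketch, assuming fontBytes holds the contents of the font file and fontkit is the @pdf-lib/fontkit module; the helper name embedSubsetFont is hypothetical, and subsetting results can vary by font, so the output is worth checking:

```javascript
// Hypothetical helper: embeds a custom font with subsetting enabled.
async function embedSubsetFont(pdfDoc, fontkit, fontBytes) {
  // Custom (non-standard) fonts require fontkit to be registered first
  pdfDoc.registerFontkit(fontkit);
  // subset: true asks pdf-lib to embed only the glyphs that end up being drawn
  return pdfDoc.embedFont(fontBytes, { subset: true });
}
```

The returned font can be passed to page.drawText in the same way as customFont above.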

The result looks like this:

With that, the requirement is fully met!

Other

1. Prevent pagination of specific content

If you want to keep a piece of content from being split across pages, use the CSS property page-break-inside: avoid; to control it.
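A minimal sketch of applying this rule from the puppeteer side, assuming the style is injected with page.addStyleTag before page.pdf is called. The selectors are placeholders and the helper name buildAvoidBreakCss is hypothetical. break-inside is the current name of the property, and page-break-inside is kept as its legacy alias:

```javascript
// Hypothetical helper: builds a CSS rule that keeps each matched element on one page.
function buildAvoidBreakCss(selectors) {
  return `${selectors.join(', ')} {
  page-break-inside: avoid; /* legacy property, as used in the article */
  break-inside: avoid;      /* current equivalent */
}`;
}

// Placeholder selectors; replace with the elements that must not be split
const css = buildAvoidBreakCss(['img', '.no-split']);
// With a puppeteer page: await page.addStyleTag({ content: css });
```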

2. Vertical margins cause misaligned content

Vertical margins can sometimes cause content to be misaligned, because Chrome has no way to split a margin automatically when it paginates. Instead of margin, we use an empty placeholder div.

<!-- Vue component -->
<div class="place-holder">
  <div
    class="place-holder-item"
    style="width: 100%; height: 1px;"
    v-for="(item, index) in Array(height)"
    :key="index"
  ></div>
</div>

<!-- Usage -->
<placeholder :height="30" />

Code address

Finally, the address of the demo code is attached. If it helps you, please give it a star. If anything in the article is wrong, corrections are welcome. Demo address