Introduction: Today's mainstream layout engines (Gecko, Blink, Trident, WebKit/WebCore, and others) meet the basic requirements of mixed text-and-image typesetting, but complex layouts, especially those found in books, are difficult or impossible to achieve with them, and the results are not professional enough. For this reason, Baidu Wenku/Baidu Reading developed its own cross-platform typesetting engine. This article introduces the technology behind that engine and shares some implementation techniques for book (content) typesetting.

The full text is 3680 words, and the expected reading time is 12 minutes.

1. Background

Baidu Wenku is Baidu's online interactive document-sharing platform, where users upload and share their own documents and search, read, and download documents shared by others.

Since 2013, Baidu Wenku has launched the Baidu Reading app and the Baidu Wenku app to meet users' need to read documents and books on mobile devices, aiming to create a small, polished reading app through refined, professional typesetting and an original reading experience. The core of these apps is the typesetting engine. This article uses the technology behind the Baidu Reading/Wenku typesetting engine to introduce some implementation techniques for book (content) typesetting.

2. Technology selection

The following goals were set at the beginning of the design of the typesetting engine:

  • Android and iOS share a single typesetting engine, so that both platforms render consistently and development is more efficient;

  • Follow the standards of the book-printing industry to achieve professional book typesetting;

  • Support a personalized reading experience and rapid iteration on complex typesetting effects;

  • Deliver a native reading experience while keeping the engine package as small as possible.

The mainstream layout engines on the market (Gecko, Blink, Trident, WebKit/WebCore) meet the basic requirements of mixed text-and-image typesetting, but complex layouts such as vertical text in text boxes and text wrapping are difficult or impossible to achieve with them. Their typesetting results are also not professional enough, and they cannot keep up with rapid iteration on future typesetting effects. Baidu Reading therefore decided to implement its own cross-platform typesetting engine.

3. Technical architecture

3.1 Overall Architecture

Figure 1 Architecture of the Baidu Reading/Wenku typesetting engine

The typesetting engine is mainly composed of four parts:

Control/Interface module: accepts content passed in from outside, and coordinates and manages typesetting. Once typesetting is complete, it returns the results to the display layer as instructions for rendering.

Parser module: the content users read comes in many formats, such as EPUB, TXT, DOCX, or custom formats defined by individual business lines. The parser module maps every format onto a custom DOM tree, so subsequent typesetting only needs to process this one tree. This improves development efficiency and makes it easy to add more formats later.

The custom DOM tree is modeled on the OOXML structure description, simplified and adapted. Its basic structure is as follows:

Figure 2 Abstract DOM tree structure

A CBox node is a virtual node that may be a character, a fragment, or an object such as a text box, a mathematical formula, or a pinyin structure. It can have a structure similar to that of a CDoc node and supports arbitrarily deep nesting, which allows it to describe a wide variety of complex structures.
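
The nesting described here can be sketched in Python as follows. This is only an illustration, not the engine's actual data model: the names CDoc and CBox come from the text, while CPara, the field names, and the nesting_depth helper are assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class CBox:
    """A virtual node: a character, a text fragment, or an object."""
    kind: str                     # illustrative values: "text", "textbox", "formula", "pinyin"
    text: str = ""                # payload for plain text fragments
    children: List["CPara"] = field(default_factory=list)  # nested sub-document content


@dataclass
class CPara:
    """Hypothetical paragraph node holding a run of boxes."""
    boxes: List[CBox] = field(default_factory=list)


@dataclass
class CDoc:
    """Document root: a list of paragraphs."""
    paras: List[CPara] = field(default_factory=list)


def nesting_depth(paras: List[CPara]) -> int:
    """Return how many levels of object nesting the content contains."""
    deepest = 0
    for para in paras:
        for box in para.boxes:
            if box.children:
                deepest = max(deepest, 1 + nesting_depth(box.children))
    return deepest


# A text box that contains a pinyin structure, which in turn contains text.
pinyin = CBox("pinyin", children=[CPara([CBox("text", "zhong")]),
                                  CPara([CBox("text", "中")])])
textbox = CBox("textbox", children=[CPara([CBox("text", "Title "), pinyin])])
doc = CDoc([CPara([CBox("text", "Hello "), textbox])])
```

Because a CBox carries the same paragraph list as a CDoc, any code that walks a document can walk an object's content without a special case.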

Layout module: reads the DOM tree, traverses its nodes, and, based on the layout size and a set of layout rules, lays out text, content, and objects into the display area. The result is output as LDF.

This module is the core of the whole typesetting engine and is described in more detail later.

LDF module: the structure that describes typesetting results.

After the typesetting engine finishes laying out a page, it does not call drawing commands to render it directly. Instead, it outputs LDF data and keeps it in memory. There are two reasons for this. The first is performance and memory: while the user reads the current page, the engine typesets the pages before and after it in the background and caches the results, then draws them asynchronously when the user turns the page. The second is that LDF provides the data needed for interactive operations: regions and data are computed on the LDF, so the engine can quickly locate a given node and extract the corresponding content.

How many pages of typesetting results to prepare in advance, and how many pages to pre-draw, can be set dynamically according to the product's needs and the device's capabilities to achieve the best reading experience.
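
The caching idea can be sketched as a small window of typeset pages around the current one. The PageCache class, the layout_page callback, and the window parameter are all hypothetical names chosen for this example, and the sketch runs synchronously, whereas the text describes background typesetting.

```python
from typing import Callable, Dict


class PageCache:
    """Keeps typesetting results for the pages around the current one."""

    def __init__(self, layout_page: Callable[[int], str], page_count: int, window: int = 1):
        self.layout_page = layout_page   # hypothetical: typesets one page, returns its LDF
        self.page_count = page_count
        self.window = window             # how many pages to keep before and after
        self.cache: Dict[int, str] = {}

    def go_to(self, page: int) -> str:
        """Return the LDF for `page` and refresh the surrounding cache."""
        first = max(0, page - self.window)
        last = min(self.page_count - 1, page + self.window)
        wanted = range(first, last + 1)
        # Drop results that fell out of the window to bound memory use.
        for cached in list(self.cache):
            if cached not in wanted:
                del self.cache[cached]
        # Typeset whatever is missing; a real engine would do this in the background.
        for index in wanted:
            if index not in self.cache:
                self.cache[index] = self.layout_page(index)
        return self.cache[page]


calls = []


def fake_layout(index: int) -> str:
    """Stand-in for the real typesetter; records which pages were requested."""
    calls.append(index)
    return f"LDF for page {index}"


cache = PageCache(fake_layout, page_count=10, window=1)
cache.go_to(0)   # typesets pages 0 and 1
cache.go_to(1)   # only page 2 still needs typesetting
```

Making the window size a parameter reflects the point above: it can be tuned per device rather than fixed in the engine.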

The LDF data structure is oriented toward describing layout. Its structure is as follows:

Figure 3 LDF structure

The LDF data definitions cover all the elements of a layout, including pages, blocks, columns, paragraphs, lines, and content fragments, and express a variety of complex effects through nesting.
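
As a rough illustration of how a nested layout description can serve interactive operations, the sketch below models every layout element as one node type and locates the deepest node under a point. The LdfNode class, its fields, the use of page-space coordinates, and the hit_test function are assumptions for the example; the real LDF format is not shown in this article.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class LdfNode:
    """One layout element; coordinates are in page space (an assumption)."""
    kind: str        # "page", "block", "column", "paragraph", "line", "fragment"
    x: float
    y: float
    width: float
    height: float
    text: str = ""
    children: List["LdfNode"] = field(default_factory=list)

    def contains(self, px: float, py: float) -> bool:
        """True if the point lies inside this node's rectangle."""
        return (self.x <= px < self.x + self.width
                and self.y <= py < self.y + self.height)


def hit_test(node: LdfNode, px: float, py: float) -> Optional[LdfNode]:
    """Return the deepest node under the point, or None if the point is outside."""
    if not node.contains(px, py):
        return None
    for child in node.children:
        found = hit_test(child, px, py)
        if found is not None:
            return found
    return node


line = LdfNode("line", 10, 10, 200, 20, children=[
    LdfNode("fragment", 10, 10, 80, 20, text="Hello"),
    LdfNode("fragment", 90, 10, 120, 20, text="world"),
])
page = LdfNode("page", 0, 0, 320, 480, children=[
    LdfNode("paragraph", 10, 10, 200, 20, children=[line]),
])

tapped = hit_test(page, 100, 15)   # the fragment under a tap at (100, 15)
```

The same traversal could back text selection or annotation, since each fragment keeps both its rectangle and its content.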

3.2 Main Technologies

Typesetting involves many technical details and processing techniques, such as glyph processing, punctuation handling, line alignment, and block typesetting. This article describes several of the main techniques; for the others, readers can refer to open-source typesetting engines, which handle many of these details in similar ways.

3.2.1 Basic Layout

The typesetting process can be understood simply as mapping the document's logical structure (the DOM tree) onto a layout structure (LDF). The basic typesetting process consists of the following steps:

  • Para_layout:

Paragraph typesetting processes paragraph properties, such as spacing before, spacing after, indentation, and hanging indentation, and invokes line typesetting.

  • Line_layout:

Content is ultimately displayed line by line, so line typesetting is a common step in typesetting, and other typesetting engines also provide a line interface.

Line typesetting processes attributes such as line spacing and invokes section typesetting. After typesetting is complete, it adjusts alignment in the X and Y directions according to the paragraph's alignment attributes.

  • Section_layout:

In some cases a line of text is made up of several smaller areas. For example, when text wraps around an image and is read from left to right, the line consists of two sections, one on each side of the image, and a text selection made with the cursor spans across the image.

Section typesetting computes the space available to the current section, passes down paragraph and content properties, and invokes object typesetting.

  • Obj_layout:

Object typesetting is the basic unit of the typesetting engine. It covers both character typesetting and structure typesetting, and handles each object specifically and separately according to its attributes and type.
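
The four levels above can be sketched as nested function calls. This is a deliberately reduced model, not the engine's code: the lowercase function names mirror the step names, but their signatures are assumptions, every character has a fixed width, and alignment, indentation, and line spacing rules are left out. Sections are given as horizontal spans, so a line that wraps around an image is simply a line with two spans.

```python
from typing import List, Tuple

CHAR_WIDTH = 10     # assumed fixed advance per character
LINE_HEIGHT = 20    # assumed fixed line height

Placed = Tuple[str, int, int]   # (character, x, y)
Span = Tuple[int, int]          # (x0, x1): horizontal extent of one section


def obj_layout(ch: str, x: int, y: int) -> Tuple[Placed, int]:
    """Place one character and return it with its advance width."""
    return (ch, x, y), CHAR_WIDTH


def section_layout(text: str, start: int, span: Span, y: int) -> Tuple[List[Placed], int]:
    """Fill one section with characters from text[start:]; return items and next index."""
    x0, x1 = span
    placed, x, i = [], x0, start
    while i < len(text) and x + CHAR_WIDTH <= x1:
        item, advance = obj_layout(text[i], x, y)
        placed.append(item)
        x += advance
        i += 1
    return placed, i


def line_layout(text: str, start: int, sections: List[Span], y: int) -> Tuple[List[Placed], int]:
    """Lay out one line made of one or more sections."""
    placed: List[Placed] = []
    for span in sections:
        part, start = section_layout(text, start, span, y)
        placed.extend(part)
    return placed, start


def para_layout(text: str, sections: List[Span], y: int = 0, space_before: int = 0) -> List[List[Placed]]:
    """Apply paragraph spacing, then emit lines until the text is consumed."""
    y += space_before
    lines, i = [], 0
    while i < len(text):
        placed, nxt = line_layout(text, i, sections, y)
        if nxt == i:    # no section can hold a character; stop instead of looping forever
            break
        lines.append(placed)
        i = nxt
        y += LINE_HEIGHT
    return lines


# An image occupies x = 30..60, so each line has a section on either side of it.
lines = para_layout("abcdefgh", sections=[(0, 30), (60, 90)])
```

Here the first line holds "abc" to the left of the image and "def" to its right, and the remaining "gh" flows onto the second line.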

3.2.2 Reuse and abstraction

Content comes in many types, from ordinary Chinese and English characters to mathematical formulas, pinyin structures, block structures, and even objects nested inside objects. The typesetting engine therefore needs to abstract over all objects, and to achieve complex effects by reusing the basic typesetting steps through recursive calls.

Structurally, a complex object is composed of one or more segments of content. In this sense it can be understood as a CDoc node, that is, a sub-document (and an object nested inside an object can be understood as a sub-document nested inside a sub-document). This data-structure abstraction is what makes typesetting reuse possible.

Laying out an object therefore means laying out a sub-document within a sub-area. It goes through the same basic paragraph, line, section, and content layout, and the result is then corrected into position according to the object's type. For a pinyin structure, for example, the upper and lower sub-boxes are laid out, their positions are adjusted in the Y direction, and they are center-aligned in the X direction.

3.2.3 Relative layout

The end result of typesetting is to place a character, pixel, or image at a specified position on the screen, that is, to assign X/Y coordinates to the content. For pure text this is simple: place a character, compute the X coordinate of the next character, and so on, until the current line is full.

But what about complex structures? Take a mathematical formula as an example: a 2×2 matrix whose first element is a quadratic formula.

The first element is typeset first. The layout engine does not care about the X/Y coordinates of the element's entry position; it typesets the quadratic formula relative to the origin (0, 0).

The quadratic formula, however, is itself composed of a numerator and a denominator. These are typeset under the same relative-layout principle, each starting from the origin (0, 0). Their layout results are then each treated as an opaque block, and the fraction typesetting rules adjust the two blocks so that their center lines are aligned and each sits in its proper position.

Once the quadratic formula has been adjusted, its typesetting result is in turn treated as an opaque block that takes part in matrix typesetting and position adjustment. The relative positions of the content inside the quadratic formula no longer change at this point.

Relative layout means that each sub-object ignores the coordinates of its entry position and is typeset relative to the origin. After typesetting, correct_XY is called to correct the X/Y coordinates in one pass, based on the object's characteristics and the entry coordinates, which ultimately yields the layout of complex objects and of the whole page.
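
A minimal sketch of this idea for a fraction follows. The Block class, layout_fraction, the bar thickness, and the signature of correct_xy are assumptions for the example; only the name correct_XY and the overall approach come from the text.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class Block:
    """A typeset result treated as an opaque block; x and y are relative to the parent."""
    width: float
    height: float
    x: float = 0.0
    y: float = 0.0
    label: str = ""
    children: List["Block"] = field(default_factory=list)


def layout_fraction(numerator: Block, denominator: Block, bar: float = 2.0) -> Block:
    """Stack two already typeset blocks and center them on a common vertical axis."""
    width = max(numerator.width, denominator.width)
    numerator.x = (width - numerator.width) / 2
    numerator.y = 0.0
    denominator.x = (width - denominator.width) / 2
    denominator.y = numerator.height + bar
    height = numerator.height + bar + denominator.height
    return Block(width, height, label="fraction", children=[numerator, denominator])


def correct_xy(block: Block, origin_x: float, origin_y: float,
               out: Dict[str, Tuple[float, float]]) -> Dict[str, Tuple[float, float]]:
    """Resolve relative coordinates into absolute ones once the entry position is known."""
    abs_x, abs_y = origin_x + block.x, origin_y + block.y
    out[block.label] = (abs_x, abs_y)
    for child in block.children:
        correct_xy(child, abs_x, abs_y, out)
    return out


num = Block(30, 10, label="numerator")
den = Block(50, 10, label="denominator")
frac = layout_fraction(num, den)    # typeset from (0, 0), unaware of its final position
frac.x, frac.y = 100, 40            # an outer structure later places the fraction as one block
positions = correct_xy(frac, 0, 0, {})
```

Moving the fraction changes only its own offset; the numerator and denominator keep their positions relative to it, which is why the outer matrix can reposition the block without typesetting its contents again.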

3.2.4 Multi-direction typesetting

Different languages are typeset in different directions. For example, modern Chinese is set horizontally, ancient Chinese books vertically, Mongolian vertically, and Uyghur horizontally from right to left. How can these directions be unified so that all languages are supported?

Consider reverse horizontal typesetting first. The figure below simulates forward and reverse horizontal typesetting:

Figure 4 Forward and reverse horizontal typesetting

As the figure shows, reverse horizontal typesetting is the mirror image of forward horizontal typesetting about the central axis of the type area. Seen this way, reverse horizontal typesetting becomes forward horizontal typesetting followed by mirroring the X coordinates. Forward and reverse vertical typesetting can be handled in a similar way, or left unhandled in the typesetting layer altogether and implemented as a coordinate transformation matrix at drawing time. Of course, each language has its own characteristics, and the details may need to be tuned.
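
The mirroring step can be sketched as a reflection of each box about the vertical center axis of the type area. The mirror_line function and the tuple layout are assumptions for the example, and the sketch mirrors positions only; it does not model the language-specific details mentioned above.

```python
from typing import List, Tuple

Box = Tuple[str, float, float]   # (glyph, x of left edge, advance width)


def mirror_line(boxes: List[Box], area_left: float, area_width: float) -> List[Box]:
    """Reflect each box about the vertical center axis of the type area."""
    mirrored = []
    for glyph, x, width in boxes:
        # The box's old right edge maps onto its new left edge.
        new_x = 2 * area_left + area_width - (x + width)
        mirrored.append((glyph, new_x, width))
    return mirrored


# Forward horizontal result: boxes advance from the left edge of a 100-unit type area.
forward = [("A", 0, 10), ("B", 10, 10), ("C", 20, 5)]
reverse = mirror_line(forward, area_left=0, area_width=100)
```

Applying mirror_line twice returns the original positions, which is a quick sanity check that the transform is a true reflection.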

4. Summary and outlook

Figure 5 Typesetting results

Future directions for the Baidu Reading/Wenku typesetting engine:

  • Accurately produce the specified effect from the content and its attributes, support typesetting for more kinds of objects, and keep improving typesetting standards and the reading experience;

  • Understand the content and the device, and typeset intelligently to provide a better reading experience.

For example, in journal-style documents, abstracts and keywords can use a different font and size and be given specific left and right margins.

As another example, for text that wraps around a picture, on a large screen the picture is placed at one of the nine-grid positions where possible, while on a small screen the picture size is adjusted dynamically, or text wrapping is dropped altogether.

In essence, typesetting engines differ little from one another, and none is simply better or worse than the rest. A self-developed typesetting engine is driven mainly by the needs of its own business, and the techniques it uses are partly universal and partly specific to it. The purpose of this article is to examine the underlying technology of typesetting and to explore its development together with readers.
