Author: Sam

Preface

I recently published two MOOC courses on developing Web reading applications with the full Vue stack. The free course is an entry-level course that builds a basic reader. The practical course is an advanced course that builds a high-performance Web reading application. Both projects use an adaptive layout and support both PC and mobile devices. Anyone who wants to improve their front-end skills should not miss them.

Free course: Quick Start Web Reader Development

Free course demo

Practical course: "Vue 2.5 in practice: a WeChat Read-style, enterprise-grade Web bookstore comparable to a native App"

Practical course demo

Introduction

This article is part of a tutorial series that shows you how to develop a full-featured Web reader. The technical stack is Vue.

Common ebook formats

In the previous article, I introduced the common features of reading apps and how a reader works; if you haven't read it, it is worth reading first. Before developing the reader, we should understand the common ebook formats, such as TXT, PDF, ePub and MOBI. A brief introduction to each follows:

  • TXT is a plain-text format. It cannot carry multimedia content. It is widely used by fiction apps, but it cannot meet the needs of electronic publications;
  • PDF is a very mainstream ebook format, but its fixed layout displays poorly on mobile devices, so mobile reading apps rarely use it;
  • ePub is currently the most mainstream ebook format. The content is written in HTML and styled with CSS, so ePub displays well on both PC and mobile;
  • MOBI is the ebook format for Amazon's Kindle and is generally read on a Kindle.

This tutorial focuses on the development of the ePub reader.

Reader development path

  • Parsing: read the ebook's basic information, table of contents, chapter information, etc.
  • Rendering: display the ebook content on screen and adapt it to the screen size
  • Page turning: support moving to the previous and next page
  • Font size: support changing the font size
  • Font: support changing the typeface, including CSS3 Web fonts
  • Theme: support switching the reader's background color and text color
  • Progress: support displaying and changing the reading progress dynamically
  • Table of contents: support a multi-level table of contents; clicking an entry jumps straight to that section
  • Search: support full-text search and chapter search
  • Bookmark: support saving the current reading position as a bookmark and returning to it later
  • Note: support selecting a passage of text and adding a note to it
  • Adaptation: support both PC and mobile devices

In this series of tutorials, I will take you through the basics of how to develop a reader.

ePub standard

Before we start development, we need to know what the ePub standard is. ePub stands for Electronic Publication. It is an electronic publication standard developed by the IDPF.

IDPF introduction

  • The IDPF is an international organization whose full name is the International Digital Publishing Forum. Its official website is www.idpf.org, its mission is to promote electronic publications, and ePub is the standard it developed.
  • The IDPF's board of directors has 14 seats, and its revenue comes from membership fees. The IDPF currently has more than 200 members. Membership fees scale with revenue: for example, companies with annual revenues of $1 million pay $775 a year.
  • The IDPF's main work is to develop, improve and promote the ePub standard, which addresses the problems of turning paper books into electronic ones, including distribution, management and encryption.

Source: idpf.org/about-us

Overview of the ePub standard

At the time of writing, the ePub standard is at version 3.0, which consists of the following parts:

  • EPUB Publications: specifies the overall structure and basic semantics of the ePub standard, such as what the spine and the manifest are. Its purpose is to have ebooks from different publishers produced to the same standard, so that different readers can parse the same ebook correctly;
  • EPUB Content Documents: defines the ePub content standard, that is, how ebook content is presented with XHTML, SVG and CSS;
  • EPUB Open Container Format: defines the ePub packaging standard, that is, how a group of files is packaged into a single ePub file;
  • EPUB Media Overlays: defines a model for synchronizing text with audio. Most ePub ebooks do not use it.

Readers who want to dig deeper into the ePub standard can consult the official documentation, which is very detailed.

ePub ebook analysis

Let's look inside an ePub ebook, using the test ebook provided with this article.

Unpacking

Change the file extension of the ebook to .zip, then extract it with any decompression tool. The result is a folder containing the ebook's files.
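The same inspection can be done in code without renaming anything, because an ePub file is an ordinary ZIP archive. The following is a minimal Python sketch using only the standard library. The function name `list_epub_entries` and the tiny in-memory archive are illustrative assumptions standing in for the test ebook, not part of the original project.

```python
import io
import zipfile


def list_epub_entries(epub_file):
    """Return the names of all files stored inside an ePub (ZIP) archive.

    `epub_file` may be a path to a .epub file or a binary file-like object.
    """
    with zipfile.ZipFile(epub_file) as archive:
        return archive.namelist()


# Assumption: a tiny stand-in archive built in memory, so the sketch runs
# without a real book. The file contents are placeholders.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    archive.writestr("mimetype", "application/epub+zip")
    archive.writestr("META-INF/container.xml", "<container/>")
    archive.writestr("OEBPS/content.opf", "<package/>")

entries = list_epub_entries(buffer)
print(entries)
```

With a real book, passing the path of the `.epub` file to `list_epub_entries` should work the same way.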

container.xml

In the META-INF directory, there is a container.xml file. This is the container file of the ebook. The ePub standard places the following requirements on it:

  • There must be a META-INF directory in the root path of the ePub ebook
  • The META-INF directory must contain a file named container.xml (note: the name cannot be changed, or parsing will fail)
  • The container.xml file must not be encrypted
  • container.xml must contain a rootfiles tag, which must contain one or more rootfile tags. Each rootfile tag must have a full-path attribute pointing to the ebook's configuration file, which is in OPF format (in practice, an XML file).

Readers who want to dig deeper into the container standard can refer to the EPUB Open Container Format specification. The container.xml of the test ebook reads as follows:

<?xml version="1.0"?>
<container version="1.0" xmlns="urn:oasis:names:tc:opendocument:xmlns:container">
  <rootfiles>
    <rootfile full-path="OEBPS/content.opf" media-type="application/oebps-package+xml"/>
  </rootfiles>
</container>
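Locating the configuration file from container.xml can be sketched as follows, in Python with the standard library. This is a minimal illustration under stated assumptions: the function name `find_opf_paths` is hypothetical, and the error handling only covers the rootfiles and full-path rules listed above.

```python
import xml.etree.ElementTree as ET

# Namespace taken from the xmlns attribute of the container element.
CONTAINER_NS = {"c": "urn:oasis:names:tc:opendocument:xmlns:container"}

CONTAINER_XML = """<?xml version="1.0"?>
<container version="1.0" xmlns="urn:oasis:names:tc:opendocument:xmlns:container">
  <rootfiles>
    <rootfile full-path="OEBPS/content.opf" media-type="application/oebps-package+xml"/>
  </rootfiles>
</container>"""


def find_opf_paths(container_xml):
    """Return the full-path of every rootfile declared in container.xml."""
    root = ET.fromstring(container_xml)
    rootfiles = root.find("c:rootfiles", CONTAINER_NS)
    if rootfiles is None:
        raise ValueError("container.xml has no rootfiles element")
    paths = [rf.get("full-path") for rf in rootfiles.findall("c:rootfile", CONTAINER_NS)]
    if not paths or any(path is None for path in paths):
        raise ValueError("every rootfile needs a full-path attribute")
    return paths


opf_paths = find_opf_paths(CONTAINER_XML)
print(opf_paths)
```

A reader would normally open the first returned path next, since that is where the OPF file lives inside the archive.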

The full-path attribute of rootfile tells us that there is a content.opf file in the OEBPS directory. Opening the OEBPS directory, we find the file:

<?xml version="1.0"?>
<package xmlns="http://www.idpf.org/2007/opf" unique-identifier="bookid" version="2.0">
    <metadata xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:opf="http://www.idpf.org/2007/opf">
        <dc:title>Agile Processes in Software Engineering and Extreme Programming</dc:title>
        <dc:creator>Juan Garbajosa, Xiaofeng Wang and Ademar Aguiar</dc:creator>
        <dc:language>En</dc:language>
        <dc:rights></dc:rights>
        <dc:publisher>Springer International Publishing, Cham</dc:publisher>
        <dc:identifier id="bookid">978-3-319-91602-6</dc:identifier>
        <meta name="cover" content="A978-3-319-91602-6_CoverFigure_IMG"/>
        <meta name="epub-converter-version" content="v3.47"/>
    </metadata>

    <manifest>
        <item id="ncx" href="toc.ncx" media-type="application/x-dtbncx+xml"/>
        <item id="css1" href="springer_epub.css" media-type="text/css"/>
        <item id="A468350_1_En_1_ChapterPart1" href="A468350_1_En_1_ChapterPart1.html"
              media-type="application/xhtml+xml"/>
        <item id="A468350_1_En_1_Chapter" href="A468350_1_En_1_Chapter.html" media-type="application/xhtml+xml"/>
        <item id="A468350_1_En_5_Fig1_HTML_IMG" href="A468350_1_En_5_Fig1_HTML.gif" media-type="image/gif"/>
        ...
    </manifest>
    
    <spine toc="ncx">
        <itemref idref="ACoverHTML"/>
        <itemref idref="A468350_1_En_BookFrontmatter_OnlinePDF"/>
        <itemref idref="A468350_1_En_1_ChapterPart1"/>
        <itemref idref="A468350_1_En_1_Chapter"/>
        <itemref idref="A468350_1_En_2_Chapter"/>
        <itemref idref="A468350_1_En_3_ChapterPart2"/>
        ...
    </spine>
    
    <guide>
        <reference type="cover" title="Cover" href="ACoverHTML.html"/>
        <reference type="toc" title="Table of Contents" href="A468350_1_En_BookFrontmatter_OnlinePDF.html#Toc"/>
        <reference type="text" title="Cosmic User Story Standard" href="A468350_1_En_1_Chapter.html"/>
    </guide>
</package>

The ePub standard specifies that the root tag of an OPF file is package, which must contain metadata, manifest, and spine. The guide, bindings, and collection tags are optional.

  • metadata: the publishing information of the ebook
  • manifest: the path and id of each resource file in the ebook
  • spine: the reading order of the ebook
  • guide: guide information, such as where the cover, the table of contents and the main text are
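The rule about required sections can be checked mechanically. Below is a minimal Python sketch using the standard library that reports which of the three required sections an OPF document lacks. The function name `missing_sections` and the shortened sample documents are illustrative assumptions.

```python
import xml.etree.ElementTree as ET

REQUIRED_SECTIONS = ("metadata", "manifest", "spine")

# Assumption: heavily shortened OPF documents, just enough to exercise the check.
COMPLETE_OPF = """<package xmlns="http://www.idpf.org/2007/opf" version="2.0">
  <metadata/>
  <manifest/>
  <spine toc="ncx"/>
  <guide/>
</package>"""

INCOMPLETE_OPF = """<package xmlns="http://www.idpf.org/2007/opf" version="2.0">
  <metadata/>
  <manifest/>
</package>"""


def missing_sections(opf_xml):
    """Return the required sections that are absent from the package element."""
    root = ET.fromstring(opf_xml)
    # ElementTree writes namespaced tags as "{namespace}name"; keep only the name.
    present = {child.tag.split("}")[-1] for child in root}
    return [name for name in REQUIRED_SECTIONS if name not in present]


print(missing_sections(COMPLETE_OPF))
print(missing_sections(INCOMPLETE_OPF))
```

An empty result means all three required sections are present. Optional sections such as guide are simply ignored.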

metadata

Metadata follows the Dublin Core Metadata Element Set (DCMES), whose purpose is to standardize the description of electronic information. For more information, see the DCMES documentation. ePub specifies that metadata must contain at least the following three elements:

  • dc:title: the title of the ebook
  • dc:language: the language of the ebook
  • dc:identifier: the unique identifier of the electronic publication, usually an International Standard Book Number (ISBN)

    <metadata xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:opf="http://www.idpf.org/2007/opf">
        <dc:title>Agile Processes in Software Engineering and Extreme Programming</dc:title>
        <dc:language>En</dc:language>
        <dc:identifier id="bookid">978-3-319-91602-6</dc:identifier>
    </metadata>
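Reading the three required elements might look like the following Python sketch, which uses only the standard library. The function name `read_required_metadata` is hypothetical, and the sample OPF is a shortened copy of the listing above wrapped in a package element so that it parses on its own.

```python
import xml.etree.ElementTree as ET

NS = {
    "opf": "http://www.idpf.org/2007/opf",
    "dc": "http://purl.org/dc/elements/1.1/",
}

OPF_XML = """<package xmlns="http://www.idpf.org/2007/opf" unique-identifier="bookid" version="2.0">
  <metadata xmlns:dc="http://purl.org/dc/elements/1.1/">
    <dc:title>Agile Processes in Software Engineering and Extreme Programming</dc:title>
    <dc:language>En</dc:language>
    <dc:identifier id="bookid">978-3-319-91602-6</dc:identifier>
  </metadata>
</package>"""


def read_required_metadata(opf_xml):
    """Return title, language and identifier, failing if any of them is missing."""
    root = ET.fromstring(opf_xml)
    metadata = root.find("opf:metadata", NS)
    if metadata is None:
        raise ValueError("OPF file has no metadata element")
    result = {}
    for name in ("title", "language", "identifier"):
        element = metadata.find(f"dc:{name}", NS)
        if element is None or not (element.text or "").strip():
            raise ValueError(f"required element dc:{name} is missing or empty")
        result[name] = element.text.strip()
    return result


book_info = read_required_metadata(OPF_XML)
print(book_info)
```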

manifest

The manifest is an exhaustive list of the ebook's resource files:

<manifest>
        <item id="ncx" href="toc.ncx" media-type="application/x-dtbncx+xml"/>
        <item id="ACoverHTML" href="ACoverHTML.html" media-type="application/xhtml+xml"/>
</manifest>

  • item: represents a resource
  • id: the unique identifier of the resource within the ebook
  • href: the path where the resource is stored
  • media-type: the media type of the resource
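Since the id is what other parts of the OPF file use to refer to a resource, a natural in-memory form for the manifest is a lookup table keyed by id. The following Python sketch builds one with the standard library. The function name `read_manifest` and the dictionary layout are illustrative choices, not something the ePub standard prescribes.

```python
import xml.etree.ElementTree as ET

NS = {"opf": "http://www.idpf.org/2007/opf"}

OPF_XML = """<package xmlns="http://www.idpf.org/2007/opf" version="2.0">
  <manifest>
    <item id="ncx" href="toc.ncx" media-type="application/x-dtbncx+xml"/>
    <item id="ACoverHTML" href="ACoverHTML.html" media-type="application/xhtml+xml"/>
  </manifest>
</package>"""


def read_manifest(opf_xml):
    """Map each manifest item id to its href and media type."""
    root = ET.fromstring(opf_xml)
    manifest = {}
    for item in root.findall("opf:manifest/opf:item", NS):
        manifest[item.get("id")] = {
            "href": item.get("href"),
            "media_type": item.get("media-type"),
        }
    return manifest


manifest = read_manifest(OPF_XML)
print(manifest["ACoverHTML"]["href"])
```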

spine

The spine defines the order in which the ebook's content is read:

<spine toc="ncx">
        <itemref idref="ACoverHTML"/>
        <itemref idref="A468350_1_En_BookFrontmatter_OnlinePDF"/>
</spine>

  • itemref: represents a resource
  • idref: the id of the resource, which corresponds to an id in the manifest. During parsing, each idref in the spine is used to look up the corresponding resource in the manifest, and that resource is then handed to the reader for display
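The lookup from spine to manifest can be sketched in Python with the standard library as follows. Several things here are assumptions for illustration: the function name `reading_order`, the manifest entry for the front matter (its href is inferred from the guide section of the test ebook), and the choice to return archive paths by joining each href with the directory of the OPF file, since hrefs are relative to that file.

```python
import posixpath
import xml.etree.ElementTree as ET

NS = {"opf": "http://www.idpf.org/2007/opf"}

OPF_XML = """<package xmlns="http://www.idpf.org/2007/opf" version="2.0">
  <manifest>
    <item id="ncx" href="toc.ncx" media-type="application/x-dtbncx+xml"/>
    <item id="ACoverHTML" href="ACoverHTML.html" media-type="application/xhtml+xml"/>
    <item id="A468350_1_En_BookFrontmatter_OnlinePDF"
          href="A468350_1_En_BookFrontmatter_OnlinePDF.html"
          media-type="application/xhtml+xml"/>
  </manifest>
  <spine toc="ncx">
    <itemref idref="ACoverHTML"/>
    <itemref idref="A468350_1_En_BookFrontmatter_OnlinePDF"/>
  </spine>
</package>"""


def reading_order(opf_xml, opf_path):
    """Resolve each spine itemref to the archive path of its manifest resource."""
    root = ET.fromstring(opf_xml)
    hrefs = {
        item.get("id"): item.get("href")
        for item in root.findall("opf:manifest/opf:item", NS)
    }
    base_dir = posixpath.dirname(opf_path)  # ZIP entries always use "/" separators
    ordered_paths = []
    for itemref in root.findall("opf:spine/opf:itemref", NS):
        idref = itemref.get("idref")
        if idref not in hrefs:
            raise KeyError(f"spine refers to unknown manifest id: {idref}")
        ordered_paths.append(posixpath.join(base_dir, hrefs[idref]))
    return ordered_paths


pages = reading_order(OPF_XML, "OEBPS/content.opf")
print(pages)
```

Note that the ncx item is in the manifest but not in the spine, so it does not appear in the reading order.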

ePub standard summary

At its core, the ePub standard uses ZIP to package files of different types into a single ePub file, uses XML for configuration, and organizes the resource files (such as HTML, CSS, images, audio and video) in a defined way, so that every reader parses the same ebook consistently.

Conclusion

This article introduced the common ebook formats and explained the ePub standard in detail. Understanding the ePub standard is a prerequisite for developing an ebook reader. You can unpack the test ebook provided with this article and analyze it yourself. You can also download other ePub ebooks from the Internet and check whether they conform to the ePub standard.