EPUB file structure
An EPUB file is essentially a ZIP archive. E-book readers can open such files without unpacking them. However, unpacking an EPUB archive and viewing its contents shows how the publication is organized.
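Because an EPUB is a ZIP archive, its entries can be listed without extracting anything. The following is a minimal sketch using Python's standard zipfile module; the function name list_epub_entries and the in-memory sample archive are assumptions made for illustration, and a real call would pass the path of an actual EPUB file.

```python
import io
import zipfile


def list_epub_entries(source):
    """Return the entry names of an EPUB (ZIP) archive without extracting it.

    `source` may be a file path or a binary file-like object.
    """
    with zipfile.ZipFile(source) as archive:
        return archive.namelist()


# Build a tiny in-memory archive for demonstration only; the entry
# contents are placeholders, not a valid publication.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    archive.writestr("mimetype", "application/epub+zip")
    archive.writestr("META-INF/container.xml", "<container/>")
    archive.writestr("OPS/content.opf", "<package/>")

entries = list_epub_entries(buffer)
print(entries)
```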
The EPUB standard describes the elements that make up an EPUB file:
- OPS folder (the folder name is a convention; some tools use OEBPS instead)
- mimetype file
- container.xml file
The OPS folder holds the book contents, such as:
- fonts used in the publication
- illustrations (image files)
- formatting file (CSS stylesheet)
- text files (HTML)
- content.opf file
- toc.ncx file
The OPF file is commonly called the content file because it holds the e-book's metadata. It is an XML file whose root element, package, describes the publication. The package is divided into the following elements:
- metadata – describes properties such as the title, identifier (for example an ISBN), and language
- manifest – lists all the files comprising the publication
- spine – defines the reading order by referencing items from the manifest; in EPUB 2 its toc attribute must point to the manifest entry of the toc.ncx file
- guide – optional references to additional content such as a bibliography or a list of figures
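The relationship between metadata, manifest, and spine can be sketched as follows. This is an illustrative example, not a full OPF parser: the function name summarize_opf, the sample document, and its identifier value are all made up for the demonstration, and the sketch assumes the EPUB 2 layout described above.

```python
import xml.etree.ElementTree as ET

# Namespaces used by OPF package documents and Dublin Core metadata.
NS = {
    "opf": "http://www.idpf.org/2007/opf",
    "dc": "http://purl.org/dc/elements/1.1/",
}


def summarize_opf(opf_xml):
    """Extract the title, language, manifest, and reading order from OPF text."""
    package = ET.fromstring(opf_xml)
    # The manifest maps each item id to the file it refers to.
    manifest = {
        item.get("id"): item.get("href")
        for item in package.findall("opf:manifest/opf:item", NS)
    }
    spine = package.find("opf:spine", NS)
    itemrefs = spine.findall("opf:itemref", NS) if spine is not None else []
    return {
        "title": package.findtext("opf:metadata/dc:title", namespaces=NS),
        "language": package.findtext("opf:metadata/dc:language", namespaces=NS),
        "manifest": manifest,
        # Resolve each spine idref to the href declared in the manifest.
        "reading_order": [manifest.get(ref.get("idref")) for ref in itemrefs],
        "toc_id": spine.get("toc") if spine is not None else None,
    }


# Hypothetical sample; the identifier is a placeholder, not a real ISBN.
SAMPLE_OPF = """<package xmlns="http://www.idpf.org/2007/opf" version="2.0" unique-identifier="bookid">
  <metadata xmlns:dc="http://purl.org/dc/elements/1.1/">
    <dc:title>Sample Book</dc:title>
    <dc:language>en</dc:language>
    <dc:identifier id="bookid">sample-id-0001</dc:identifier>
  </metadata>
  <manifest>
    <item id="ncx" href="toc.ncx" media-type="application/x-dtbncx+xml"/>
    <item id="ch1" href="chapter1.html" media-type="application/xhtml+xml"/>
    <item id="css" href="style.css" media-type="text/css"/>
  </manifest>
  <spine toc="ncx">
    <itemref idref="ch1"/>
  </spine>
</package>"""

summary = summarize_opf(SAMPLE_OPF)
print(summary["title"], summary["reading_order"])
```

Note that the spine lists only the files that form the reading order; the stylesheet and the NCX file appear in the manifest but not in the spine.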
The toc.ncx file is an XML-formatted table of contents.
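As a rough illustration of what the NCX table of contents holds, the sketch below reads the navigation points from a small sample document. The function name read_toc and the sample chapter names are assumptions for the example; real NCX files usually carry additional elements that are ignored here.

```python
import xml.etree.ElementTree as ET

NCX_URI = "http://www.daisy.org/z3986/2005/ncx/"
NCX_NS = {"ncx": NCX_URI}


def read_toc(ncx_xml):
    """Return (label, target) pairs for every navPoint, in document order."""
    root = ET.fromstring(ncx_xml)
    entries = []
    # iter() also visits nested navPoint elements (sub-chapters).
    for point in root.iter("{%s}navPoint" % NCX_URI):
        label = point.findtext("ncx:navLabel/ncx:text", namespaces=NCX_NS)
        content = point.find("ncx:content", NCX_NS)
        target = content.get("src") if content is not None else None
        entries.append((label, target))
    return entries


# Hypothetical sample table of contents with two chapters.
SAMPLE_NCX = """<ncx xmlns="http://www.daisy.org/z3986/2005/ncx/" version="2005-1">
  <navMap>
    <navPoint id="np1" playOrder="1">
      <navLabel><text>Chapter 1</text></navLabel>
      <content src="chapter1.html"/>
    </navPoint>
    <navPoint id="np2" playOrder="2">
      <navLabel><text>Chapter 2</text></navLabel>
      <content src="chapter2.html"/>
    </navPoint>
  </navMap>
</ncx>"""

toc = read_toc(SAMPLE_NCX)
print(toc)
```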
The mimetype file identifies the archive as an EPUB publication: it contains the single string application/epub+zip. Validation starts with this file, so it must be the first entry in the archive and must be stored uncompressed.
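A basic check of the mimetype file might look like the sketch below. It is not a substitute for a full validator; the helper names has_valid_mimetype and build_archive are assumptions for the example, and the archives are built in memory only to show a passing and a failing case.

```python
import io
import zipfile

EPUB_MIMETYPE = b"application/epub+zip"


def has_valid_mimetype(source):
    """Check that `mimetype` is the first entry, uncompressed, with the EPUB type."""
    with zipfile.ZipFile(source) as archive:
        entries = archive.infolist()
        if not entries or entries[0].filename != "mimetype":
            return False
        first = entries[0]
        return (
            first.compress_type == zipfile.ZIP_STORED
            and archive.read(first) == EPUB_MIMETYPE
        )


def build_archive(names_and_data):
    """Create an in-memory ZIP archive (demo helper, entries stored uncompressed)."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        for name, data in names_and_data:
            archive.writestr(name, data)
    return buffer


good = build_archive([
    ("mimetype", "application/epub+zip"),
    ("META-INF/container.xml", "<container/>"),
])
# Same entries, but mimetype is no longer the first one.
bad = build_archive([
    ("META-INF/container.xml", "<container/>"),
    ("mimetype", "application/epub+zip"),
])

print(has_valid_mimetype(good), has_valid_mimetype(bad))
```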
The container.xml file, stored in the META-INF folder, points to the location of the content.opf file.
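The lookup from container.xml to the OPF file can be sketched as follows. The function name find_opf_path and the sample document are assumptions for illustration; in practice the XML text would be read from META-INF/container.xml inside the archive.

```python
import xml.etree.ElementTree as ET

CONTAINER_NS = {"c": "urn:oasis:names:tc:opendocument:xmlns:container"}


def find_opf_path(container_xml):
    """Return the full-path of the first rootfile, or None if there is none."""
    root = ET.fromstring(container_xml)
    rootfile = root.find("c:rootfiles/c:rootfile", CONTAINER_NS)
    return rootfile.get("full-path") if rootfile is not None else None


# Hypothetical sample pointing at an OPF file in the OPS folder.
SAMPLE_CONTAINER = """<container version="1.0" xmlns="urn:oasis:names:tc:opendocument:xmlns:container">
  <rootfiles>
    <rootfile full-path="OPS/content.opf" media-type="application/oebps-package+xml"/>
  </rootfiles>
</container>"""

opf_path = find_opf_path(SAMPLE_CONTAINER)
print(opf_path)
```

The returned path is relative to the root of the archive, which is why a reader can find the OPF file even when the content folder is not named OPS.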