The Inclusive EPUB 3 guide is intended for content creators and educators who wish to use EPUB 3 and are interested in making their published content broadly usable by a full spectrum of users.
This guide covers a broad range of topics that will help make an EPUB 3 book more usable in different contexts. The information provided illustrates how concepts may be applied but won't make you an "expert" - you are encouraged to use this guide as a starting point and to follow the resources provided to gain additional insight.
What is Inclusivity and how is it different from accessibility?
Technology has given users of all abilities access to information in many different environments and contexts - they are no longer limited in where and how they access information. Because of this freedom, users now find themselves in a wide range of situations where their ability to access information depends on many factors, both internal and external:
Is the user in a library where she cannot listen to audio at a reasonable volume without being disruptive?
Is the user on a noisy, cramped subway train where network connectivity is limited, there is a lot of ambient noise, and mobility is restricted?
Is the user unable to concentrate fully on the task but still in need of a particular piece of information on his mobile device - how can this be done?
Inclusiveness, or the practice of inclusivity, is the belief that the design of a "thing" – whether it is a piece of technology, an everyday object, or even information itself – should be mindful of a broad range of users, their variable abilities, their variety of environments, situations, and contexts.
Inclusiveness is different from accessibility in that inclusivity doesn't specifically address a particular need or problem - rather, inclusivity provides a spectrum of tools and features that end users can choose from to fit their requirements in the given context. In short, inclusiveness is not prescriptive, since users choose how best to help themselves.
The challenge for content creators is: How do you create content that allows for flexible usage?
EPUB 3 is an open content distribution format which has many features that allow a broad range of users to consume content in a variety of situations. This guide outlines some of the different features of EPUB 3 that will make your content usable, robust, and resilient.
What is EPUB?
EPUB is an ebook format which is a container for web content that can be distributed as a whole and interpreted by supporting EPUB reader systems. At its most basic, EPUB is a ZIP archive of:
HTML containing the structure and text content,
CSS defining the visual and audio style,
audio and visual media,
scripts that add some interactivity (Note: scripting is optional in EPUB, so the content must remain accessible when scripting is disabled), and
XML data that define the EPUB container and describe its contents using metadata.
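As a rough illustration of the XML container data mentioned in the list above, a minimal META-INF/container.xml might look like the sketch below. The folder name OEBPS and the file name package.opf are assumptions chosen for this example; the file only needs to point at wherever the package document actually lives.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Sketch of META-INF/container.xml. The full-path value is an
     assumed location for the package document, not a required name. -->
<container version="1.0"
           xmlns="urn:oasis:names:tc:opendocument:xmlns:container">
  <rootfiles>
    <!-- Tells the reading system where to find the package document -->
    <rootfile full-path="OEBPS/package.opf"
              media-type="application/oebps-package+xml"/>
  </rootfiles>
</container>
```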
What does EPUB offer?
Special semantics which help give basic HTML elements additional functions or meaning when used in an EPUB reading system.
A "spine" which defines the author's intended path through the content. This may be different from a table of contents, as a table of contents may include references to content not considered part of the author's primary path. An example of this scenario would be a textbook with diagrams in boxes or content in a sidebar - the EPUB spine may skip over this material, even though it may appear in the book's table of contents.
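To make the spine idea more concrete, here is a sketch of the relevant part of a package document. The item ids and file names are hypothetical. The linear="no" attribute marks an item as outside the primary reading path, which is one way the diagram scenario described above could be expressed.

```xml
<!-- Fragment of a package document; ids and hrefs are illustrative only -->
<manifest>
  <item id="nav" href="nav.xhtml"
        media-type="application/xhtml+xml" properties="nav"/>
  <item id="chapter1" href="chapter1.xhtml"
        media-type="application/xhtml+xml"/>
  <item id="diagrams" href="diagrams.xhtml"
        media-type="application/xhtml+xml"/>
  <item id="chapter2" href="chapter2.xhtml"
        media-type="application/xhtml+xml"/>
</manifest>
<spine>
  <!-- Primary reading order: chapter1, then chapter2 -->
  <itemref idref="chapter1"/>
  <!-- Supplementary material: reachable by link, not part of the main path -->
  <itemref idref="diagrams" linear="no"/>
  <itemref idref="chapter2"/>
</spine>
```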
EPUB is an open format managed by the International Digital Publishing Forum (or IDPF) - a group of industry stakeholders who are primarily technology and content providers. The latest version of the EPUB specification is EPUB 3.
Why Choose EPUB 3 over HTML5?
EPUB 3 is a way to package and distribute HTML 5 content. Therefore EPUB 3 can be perceived as a way to distribute web content offline and includes unique features that make the EPUB 3 format more suitable for page-by-page consumption and academic settings. Whether you choose straight HTML 5 over EPUB 3 largely depends on your requirements and your content.
EPUB 3 is a viable content distribution format because it:
allows ease of offline access to content,
provides native content ordering and navigation,
is being consumed by an increasingly large group of people, at an increasing rate, as a result of the prominence of e-readers,
offers additional semantics that help describe text structure and function, and
helps give content authors a platform to seamlessly include rich features such as text-to-speech, content narration, and media alternatives.
Building Inclusive EPUB 3
EPUB 3 books are increasingly diverse and complex, and so are their end users. Given the wide variety of personal devices available, the different environments in which users consume content, and the varying ability of each user in a given context and environment, publishing a single book that can be used across this spectrum can be challenging.
The following guide helps an EPUB 3 author create more inclusive publications and so improve an EPUB 3 book's utility across all these devices, environments, and contexts.
Broadly, the approach to making an inclusive EPUB 3 publication takes the following into account (we will go into more detail on each point shortly):
Content is composed and structured in well-formed HTML 5 using standards-compliant formatting and appropriate usage of HTML 5 semantic markup and tags.
Visual styling is done using CSS 2.1 with sensitivity to reader platforms with monochromatic, aural, or tactile displays.
Embedded graphical, audio, and video media are accompanied by alternate formats and textual descriptions.
Interactive content through scripting is accessible if scripting is disabled or unsupported by the user's reader platform.
Media overlays are used to give voiced narrations.
Text-to-speech is facilitated and enhanced by using special audio markup and aural styles.
Content is translated to other languages.
Good descriptive metadata is available that includes accessibility information.
EPUB 3 books are created from HTML5, and the same principles for creating well-formed, semantically sound web content apply. For an EPUB 3 author, there are two tools available: HTML 5 semantic elements, and epub:type.
HTML 5 semantic markup gives meaning to otherwise anonymous containers or text structures. For example, you can use the <section> element to denote changes in topic, or <aside> to indicate content that is secondary in importance.
EPUB 3 also provides semantics more suitable for publications. In addition to using HTML 5 semantic markup, you can describe the same HTML 5 markup as being chapters, or as a sidebar.
Here is an example:
<section epub:type="chapter">
  <h1>Chapter 1: At the Start</h1>
  <p>We always start with a good idea.</p>
  <aside epub:type="sidebar">
    <p>In 2013, 9 out of 10 ideas were great.</p>
  </aside>
</section>
While HTML 5 semantic markup and epub:type go a long way to help your publication's structure and content be understood by reader systems, there are times when WAI-ARIA is needed to help convey relationships and function of elements on a page.
WAI-ARIA is used primarily with scripting, but it can also be used to describe the roles and functions of elements on a page. For example, a list of text items used as a toolbar or navigation menu can be described using WAI-ARIA this way:
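A minimal sketch of such a navigation menu is shown below. The link targets and labels are hypothetical, and the roles chosen (menubar and menuitem) are one reasonable option rather than the only one. A real menu would usually also need scripted keyboard handling.

```xml
<!-- Sketch: a plain list exposed to assistive technology as a menu.
     The href values and labels are made up for illustration. -->
<ul role="menubar" aria-label="Book navigation">
  <!-- role="presentation" hides the list-item semantics so that only
       the menu items themselves are announced -->
  <li role="presentation"><a role="menuitem" href="toc.xhtml">Contents</a></li>
  <li role="presentation"><a role="menuitem" href="glossary.xhtml">Glossary</a></li>
  <li role="presentation"><a role="menuitem" href="index.xhtml">Index</a></li>
</ul>
```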
Even though scripting is optional in EPUB 3, scripting can be used to add meaningful interactions to an otherwise static EPUB 3 book. If scripting is used in EPUB 3, WAI-ARIA can help describe content that is controlled or updated dynamically. For example, consider the case of a script on the page which behaves like a timer that begins counting down to zero once the user presses a button. Typically, the area that counts down the seconds would not be accessible to users using audio only, but with WAI-ARIA, you can describe the updating numbers this way:
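One possible markup for the countdown scenario is sketched below. The id values and the starting value of 60 are assumptions, and the script that updates the number is omitted. The aria-live attribute asks assistive technology to announce changes to the region, and aria-atomic asks it to read the whole value each time.

```xml
<!-- Sketch only: the script that decrements the value is not shown -->
<button type="button" id="start-timer" aria-controls="countdown">Start timer</button>
<p>Time remaining:
  <!-- role="timer" identifies the region; aria-live="polite" announces
       updates without interrupting what is currently being read -->
  <span id="countdown" role="timer" aria-live="polite" aria-atomic="true">60</span>
  seconds
</p>
```

Announcing every second may be tiresome for listeners, so a script might reasonably update the live region less often than the visual display.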
Visual styling in EPUB 3 books should be approached differently than styling for a webpage residing on the Internet. Generally, styling for an EPUB 3 publication should:
Avoid fixed position layouts - displays on reader systems vary greatly, and fixed position content may appear offscreen.
Make sparing use of colours - reader systems have default colours for content, and adding or overriding these colours should be done cautiously. For example, while black text on a blue background may look fine on a colour LCD equipped reader system, the same content on an e-ink device may render as an indistinguishable, dark mass.
Make good use of white space, including line spacing, margins, and indentation, to help improve legibility of content.
Use clear logical layouts with sparing use of asides and sidebars to make it clear what is the primary and secondary content. Using a single column layout also ensures that the content appears consistent across platforms.
Using a gentle approach to visual styling ensures that EPUB 3 content is inclusive to a wide variety of reader platforms and users. Simple styling also makes it easier for users to make their own customizations to suit their own preferences and environments.
Although we strive to be inclusive, some types of content have their challenges. Books that rely heavily on illustrations, graphics, or imagery (like children's books) cannot be reflowed, reformatted, or customized easily; in the next section we will discuss some solutions to make your content more usable and accessible.
The notion of a "typical end user" is increasingly difficult to define. With EPUB 3 and eBooks easily accessible through different devices, even users who were once described as "typical" find themselves in situations where their ability to consume or comprehend content is affected by their environment or context. The challenge for EPUB 3 authors is to create content that is robust and can adapt to the wide spectrum of users and their situations.
For example, a user may need to be discreet and may not want the audio in an EPUB 3 book to disturb others - can the content be understood without the audio?
The key to making embedded media understood in a wide variety of situations is to provide ample alternatives that can be consumed in different ways. The following are some examples of what you can do:
Do not rely on a single media type to convey critical meaning - provide alternates:
For videos, audio, illustrations or graphics: provide a text synopsis.
For audio or video: a text transcript is useful.
For videos: user-selectable audio descriptions (not just captions) can enhance comprehension.
Content fallbacks: allow content to be accessed if a particular reader cannot interpret it, e.g. for a video element, a poster image and text can be a fallback.
Media support and media formats vary between reader platforms. Do not assume that content accessible on one platform will be accessible to everyone:
For video, it is recommended by the IDPF to provide both H.264 and WebM (VP8) versions.
For audio, EPUB 3 supports MP3 format.
Use a variety of complementary media to reinforce concepts. For example, an illustration may be better understood with an accompanying video.
Provide links to alternatives so users can help themselves. Alternatives can appear alongside the content, be located in an Appendix, or be made available online and accessible using a web link.
Provide captions in plain text as a transcript, or use the <track> element to add synchronized captions to audio and video.
Multiple <track> elements can be used to provide captions or subtitles in different languages, or to provide timed text descriptions of the visuals. Note that <track> carries timed text rather than audio, and that some platforms may not support <track> elements.
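The recommendations above could be combined along the following lines. All file names, labels, and the synopsis text are hypothetical, and WebVTT is assumed as the timed text format. Support for each piece varies by reading system.

```xml
<!-- Sketch: two video encodings, a poster image, timed text tracks,
     and fallback content. Paths and labels are illustrative only. -->
<video controls="controls" poster="images/intro-poster.jpg"
       aria-describedby="intro-synopsis">
  <source src="video/intro.mp4" type="video/mp4"/>
  <source src="video/intro.webm" type="video/webm"/>
  <track kind="captions" src="captions/intro.en.vtt" srclang="en"
         label="English captions" default="default"/>
  <track kind="subtitles" src="captions/intro.fr.vtt" srclang="fr"
         label="French subtitles"/>
  <track kind="descriptions" src="captions/intro.desc.en.vtt" srclang="en"
         label="English descriptions"/>
  <!-- Shown only when the video element itself is not supported -->
  <p>This video cannot be played here.
     A <a href="transcripts/intro.xhtml">text transcript</a> is available.</p>
</video>
<p id="intro-synopsis">Synopsis: the author introduces the three themes of the book.</p>
```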
Scripting in EPUB 3 should be considered an optional feature, and the main content of the book should be understandable without scripting. According to IDPF:
"when the document is rendered by a Reading System without scripting support or with scripting support disabled, the top-level document content must retain its integrity, remaining consumable by the User without any information loss or other significant deterioration."
Using a similar approach to handling embedded media, the following are recommendations on dealing with scripted content in EPUB 3:
Scripted content should not contain information that is critical to understanding the material.
If the scripted content is critical, a suitable alternative should be provided. For example, an alternative to a scripted quiz would be an answer key.
Scripted content can also benefit from a synopsis so users can gain an understanding without having to run the script. This is particularly useful for reviewing material for study, or for situations where time constraints or the environment hinder the ability to interact with the script.
Interactables contained in <canvas> elements should have transcripts, answer keys / solutions, etc. Use aria-describedby to properly associate any canvas alternatives to the canvas itself.
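As a sketch of the canvas recommendation, the fragment below associates a text alternative with a canvas through aria-describedby. The ids, dimensions, and descriptive text are invented for illustration.

```xml
<!-- Sketch: the canvas points at its text alternative by id -->
<canvas id="cell-diagram" width="400" height="300"
        aria-describedby="cell-diagram-alt">
  <!-- Fallback content, used when canvas is unsupported -->
  <p>Interactive diagram of a plant cell. A text description follows.</p>
</canvas>
<div id="cell-diagram-alt">
  <p>The diagram lets you select each part of a plant cell to see its name
     and function: the cell wall, the nucleus, and the chloroplasts.</p>
</div>
```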
Media overlays and text-to-speech in EPUB 3 are powerful features that unlock the ability for users to listen to the contents of the book, instead of just reading it.
"Media overlay" is EPUB's term for pre-recorded audio "overlaid" on the content. Media overlays are typically used for narration of text, but they can be used for other purposes, such as commentary. Media overlays are recorded in blocks, typically by section or paragraph, and synchronized with the actions of the user. In the most basic implementation, as a user advances through an EPUB book, a capable reader system plays back the narration like an audio book.
To add media overlays to an EPUB 3 book, you would follow these basic steps:
Record your narration for each segment of text,
Add unique identifiers in your EPUB content for each corresponding segment of text, and
Create another file (a SMIL document, which is an XML format) that associates each text segment with its recorded audio.
Matt Garrish has written an excellent resource called "EPUB 3 Media Overlays" which explains how media overlays work in EPUB 3 and how to add them to your own work.
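To give a feel for the basic steps, here is a sketch of a small media overlay document. The file names, fragment identifiers, and clip times are hypothetical. Each par element pairs one identified segment of text with the stretch of audio that narrates it.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Sketch of a media overlay (SMIL) file, e.g. chapter1.smil.
     File names, ids, and timings are made up for illustration. -->
<smil xmlns="http://www.w3.org/ns/SMIL" version="3.0">
  <body>
    <par id="par1">
      <!-- Points at the element with id="heading1" in the content file -->
      <text src="chapter1.xhtml#heading1"/>
      <audio src="audio/chapter1.mp3" clipBegin="0s" clipEnd="3.2s"/>
    </par>
    <par id="par2">
      <text src="chapter1.xhtml#para1"/>
      <audio src="audio/chapter1.mp3" clipBegin="3.2s" clipEnd="8.5s"/>
    </par>
  </body>
</smil>
```

The overlay file also has to be listed in the package document and linked to the content file it narrates. That step is not shown here.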
Another feature EPUB 3 offers is the ability for book creators to specify information that will enhance the experience of text-to-speech (or TTS) on capable reader systems. Unlike media overlays, which are pre-recorded, TTS is generated on-the-fly by the reader's platform (like Apple's VoiceOver feature on iOS devices). Special instructions can be embedded within the EPUB book to change the quality of voice feedback - such as putting emphasis on certain words, adding different speaker styles for different sections of text, specifying the pronunciation of complex words such as "hemocyanin", or clarifying the pronunciation of homographs such as "record" the noun and "record" the verb.
TTS in EPUB 3 consists of 3 working parts:
an aural style sheet to give content clarity and richness (such as different voice and intonation on different page elements, or pauses after reading headers),
SSML (Speech Synthesis Markup Language) mark-up to provide information on pronunciation, volume, pitch, rate, etc., and
PLS (Pronunciation Lexicon Specification) information, which defines the pronunciation of special words (e.g. medical terms).
<html xmlns:ssml="http://www.w3.org/2001/10/synthesis" ssml:alphabet="x-sampa">
<link rel="pronunciation" href="lex/en.pls" type="application/pls+xml" hreflang="en" />
<p>The agent from the <abbr style="-epub-speak-as: spell-out">FBI</abbr> was playing a <span ssml:ph="beIs">bass</span> that was shaped like a <span ssml:ph="b{s">bass</span>, while studying acetaminophen.</p>
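The lexicon file referenced by the link element above (lex/en.pls) could look roughly like the sketch below. The X-SAMPA transcription of "acetaminophen" is an approximation written for this example and should be checked against your own pronunciation source.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Sketch of a PLS lexicon. The phoneme string is an approximate
     X-SAMPA rendering, included only to show the structure. -->
<lexicon version="1.0"
         xmlns="http://www.w3.org/2005/01/pronunciation-lexicon"
         alphabet="x-sampa" xml:lang="en">
  <lexeme>
    <!-- The written form as it appears in the book text -->
    <grapheme>acetaminophen</grapheme>
    <!-- How a TTS engine is asked to pronounce it -->
    <phoneme>@%si:t@"mIn@f@n</phoneme>
  </lexeme>
</lexicon>
```

A lexicon applies to every occurrence of the word in documents that link to it, whereas the ssml:ph attribute applies only to the element that carries it. This is why the homograph "bass" is handled inline and the medical term is handled in the lexicon.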
Choosing Between Media Overlays and Text-to-Speech
In the context of publishing, metadata is typically used to convey information such as the author, publisher, ISBN, and other identifying information. However, metadata can be much more useful and can be an important tool for matching content to diverse consumers.
Accessibility metadata describes the different ways content can be accessed and the quality of the content itself. Using this metadata, a user can decide for themselves whether the content will meet their accessibility requirements. Accessibility metadata can describe:
the different accessibility features, such as large print, captions (for audio or video), or text-to-speech support;
possible cautions or "hazards," such as flashing visuals; or
supported accessibility APIs, such as MSAA, WAI-ARIA, or iOSAccessibility.
Note: Accessibility metadata also describes accessibility control supports such as keyboard or touch controls. This metadata is relevant to the reader platform, rather than the EPUB 3 publication itself. This guide will not cover accessibility control metadata.
Accessibility Metadata in EPUB 3
The information captured by accessibility metadata can be useful to many users; describing a publication as containing transcripts, large print, or having text-to-speech allows users to quickly find what they want, and gives creators a way of differentiating themselves.
Investing the time to write good, rich metadata ensures that your publication is discoverable and relevant as content searching and matching techniques advance.
To describe the accessibility features of an EPUB 3 publication as a *whole*, the metadata would be written in the META-INF/metadata.xml file.
What if your publication has two embedded videos, but only one has captions - how would you write metadata in this case? Since only some of the videos are captioned, it’s incorrect to specify accessibilityFeature:captions in the metadata.xml file.
To specify metadata to a *part* of your publication, you would embed microdata directly into the HTML of the content itself. The following example illustrates how metadata would be created for embedded content.
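A sketch of the two-video case is shown below, using Schema.org microdata. The file names and titles are hypothetical, and support for microdata attributes depends on the EPUB version and validator in use. Only the captioned video is marked with accessibilityFeature.

```xml
<!-- Sketch: metadata scoped to individual videos. Paths and names
     are illustrative only. -->
<div itemscope="itemscope" itemtype="http://schema.org/VideoObject">
  <meta itemprop="name" content="Introduction"/>
  <!-- This video has captions, so the feature is declared here -->
  <meta itemprop="accessibilityFeature" content="captions"/>
  <video controls="controls" src="video/intro.mp4">
    <track kind="captions" src="captions/intro.en.vtt" srclang="en"
           label="English captions"/>
  </video>
</div>
<div itemscope="itemscope" itemtype="http://schema.org/VideoObject">
  <meta itemprop="name" content="Interview"/>
  <!-- No captions for this video, so no captions feature is declared -->
  <video controls="controls" src="video/interview.mp4"></video>
</div>
```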
EPUB 3 uses an XHTML document type that is based on HTML5 and inherits almost all definitions of semantics, structure and processing behaviors from the HTML5 specification. This means that you can create valid HTML5 documents and update the head of the document to define it as XML and declare the epub namespace.
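A minimal sketch of such a document follows. The title and body content are placeholders. The XML declaration and the two namespace declarations on the html element are the parts that differ from a typical HTML5 page.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE html>
<!-- xmlns marks the document as XHTML; xmlns:epub enables epub:type -->
<html xmlns="http://www.w3.org/1999/xhtml"
      xmlns:epub="http://www.idpf.org/2007/ops"
      xml:lang="en" lang="en">
  <head>
    <meta charset="utf-8"/>
    <title>Chapter 1</title>
  </head>
  <body>
    <section epub:type="chapter">
      <h1>Chapter 1: At the Start</h1>
      <p>We always start with a good idea.</p>
    </section>
  </body>
</html>
```

Because the document is parsed as XML, every element must be closed and every attribute needs a quoted value.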
Note: The Schema.org accessibility property accessibilityFeature does not yet have a value that can convey the fact that an EPUB contains an audio narration through the media overlay, but such a value is being proposed.
Until validators recognize the schema: prefix, you must declare it in the package.opf file:
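The declaration could look like the sketch below, which uses the prefix attribute on the package element. The identifier, title, date, and chosen property values are placeholders. The manifest and spine are omitted, so this is a fragment rather than a complete package document.

```xml
<!-- Sketch: declaring the schema: prefix and using it in metadata.
     All values are placeholders for illustration. -->
<package xmlns="http://www.idpf.org/2007/opf" version="3.0"
         unique-identifier="pub-id"
         prefix="schema: http://schema.org/">
  <metadata xmlns:dc="http://purl.org/dc/elements/1.1/">
    <dc:identifier id="pub-id">urn:uuid:00000000-0000-0000-0000-000000000000</dc:identifier>
    <dc:title>Example Book</dc:title>
    <dc:language>en</dc:language>
    <meta property="dcterms:modified">2013-01-01T00:00:00Z</meta>
    <!-- Features and hazards that apply to the publication as a whole -->
    <meta property="schema:accessibilityFeature">structuralNavigation</meta>
    <meta property="schema:accessibilityHazard">noFlashingHazard</meta>
  </metadata>
  <!-- manifest and spine omitted from this sketch -->
</package>
```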