Beyond the Page: EPUB 3 Interactive Features That Power Readers Actually Want

Nothing replaces the feeling of a physical book. We agree.
But most ebooks are not even trying to offer something different. They are text on a screen pretending to be a page. No wonder the paper version still wins.
In this blog post, we ask:
What if digital books stopped imitating print and started doing what only digital can?
A human or AI narrator’s voice that follows your eyes across the page. Ambient sounds that pull you into the world of a story. Technical exercises you can actually do inside the chapter, not just read about. The ability to switch between reading and listening without losing your place.
None of that belongs on paper. All of it belongs in a book.

That is what EPUB 3 was built for. We have spent a lot of time at BookFusion talking about what EPUB 3 interactive features can do for children and students. Through our work with Accessible Digital Textbooks (ADTs) and partners like UNICEF, we have explored how layered audio, embedded quizzes, and tap-to-define glossaries can transform the learning experience for young readers. That work has been groundbreaking, and it continues.
Today, though, we want to talk about you: the adult reader, the power user, the person who reads not because they have to, but because reading is woven into the fabric of how they think, grow, and engage with the world.
EPUB 3 has so much to offer you. And most of it is hiding in plain sight.
The Immersive Reading Revolution
There is a debate in the reading community: Is listening to audiobooks even reading?
Some say yes, some say no.

We say, lol why choose?
In fact, a growing community of readers already combines the two. Immersive reading, sometimes called guided narration, synchronises professional audio narration with the text on your screen. As you read, the words highlight in time with the narrator’s voice. You can follow along visually while listening, or seamlessly switch between reading and listening depending on whether you are at your desk or on the go.

Research consistently supports this approach. Studies on multimodal learning show that engaging both visual and auditory processing simultaneously leads to deeper comprehension and longer-lasting retention of the text. This is not just useful for children learning to read; it is a powerful tool for any reader who wants to absorb complex material more deeply, learn a language through literature, or simply experience a favourite novel in a richer way.
EPUB 3 makes this possible through a feature called Media Overlays. This is an open standard built right into the EPUB specification. It uses SMIL (Synchronized Multimedia Integration Language) files to map audio clips to specific text fragments, paragraph by paragraph or even sentence by sentence. Unlike proprietary systems (looking at you, Amazon Whispersync), Media Overlays are part of an open format, meaning the books you create or purchase in this format are not locked to a single platform.
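To make that mapping concrete, here is a minimal sketch of how a Media Overlay file could be generated with Python's standard library. Each par element pairs one text fragment with one audio clip. The file names, fragment ids, and timings are hypothetical placeholders, and a real book would also need the overlay declared in its package file, which this sketch leaves out.

```python
import xml.etree.ElementTree as ET

SMIL_NS = "http://www.w3.org/ns/SMIL"


def clock(seconds: float) -> str:
    """Format seconds as a SMIL clock value such as 0:00:03.500."""
    millis = round(seconds * 1000)
    hours, rest = divmod(millis, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    secs, ms = divmod(rest, 1000)
    return f"{hours}:{minutes:02d}:{secs:02d}.{ms:03d}"


def build_overlay(text_doc: str, audio_file: str, clips) -> str:
    """Return a SMIL document as a string.

    clips is a list of (fragment_id, begin_seconds, end_seconds) tuples,
    one per sentence or paragraph in the chapter (hypothetical data).
    """
    # Serialise SMIL elements without a prefix (default namespace).
    ET.register_namespace("", SMIL_NS)
    smil = ET.Element(f"{{{SMIL_NS}}}smil", {"version": "3.0"})
    body = ET.SubElement(smil, f"{{{SMIL_NS}}}body")
    for index, (fragment_id, begin, end) in enumerate(clips, start=1):
        # One <par> = one text fragment played in parallel with one clip.
        par = ET.SubElement(body, f"{{{SMIL_NS}}}par", {"id": f"par{index}"})
        ET.SubElement(par, f"{{{SMIL_NS}}}text",
                      {"src": f"{text_doc}#{fragment_id}"})
        ET.SubElement(par, f"{{{SMIL_NS}}}audio", {
            "src": audio_file,
            "clipBegin": clock(begin),
            "clipEnd": clock(end),
        })
    return ET.tostring(smil, encoding="unicode")


if __name__ == "__main__":
    # Hypothetical chapter with two sentences marked up as #s1 and #s2.
    print(build_overlay("chapter1.xhtml", "audio/chapter1.mp3",
                        [("s1", 0.0, 3.5), ("s2", 3.5, 7.25)]))
```

The fragment ids must match id attributes in the chapter's XHTML, which is why alignment tools usually add those ids to the text at the same time as they write the timing file.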
BookFusion supports EPUB 3 Media Overlays natively. That means if you have (or create) an EPUB with synchronised narration, you can upload it and enjoy the full read-aloud experience on your phone, tablet, or desktop. You keep your highlights. You keep your notes. You get the best of all worlds.
Fantasy, Sci-Fi, and the Case for Soundscapes

Now, here is where things get exciting for fiction lovers.
Imagine reading a fantasy novel and hearing the ambient sounds of a tavern as a scene opens: the crackle of a fire, the murmur of conversation, the clink of tankards. Or imagine a sci-fi thriller where the hum of a spaceship fills the background as the tension builds.
This is not a hypothetical. The audiobook world is already moving in this direction. Productions from companies like Graphic Audio and BBC Radio have been layering full cast performances, sound effects, and atmospheric music into audio dramas for years, to extraordinary effect. Audible’s adaptation of The Sandman is frequently praised as a cinematic experience.
The question is: why should these immersive elements be limited to standalone audiobooks?
EPUB 3’s support for embedded audio and JavaScript means that an ebook can contain these atmospheric layers right alongside the text. A publisher or author could embed ambient soundscapes that play in the background as you read, with controls for you to adjust volume, skip audio, or turn it off entirely. The reading app handles playback, and you, the reader, stay in control.
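As a rough illustration, the simplest version of this needs no scripting at all: an audio element with visible controls placed in the chapter's XHTML. The sketch below builds such a fragment in Python. The file path and label are hypothetical, and whether the audio loops or how it is styled would be up to the creator and the reading system.

```python
import xml.etree.ElementTree as ET


def ambient_audio(src: str, label: str, loop: bool = True) -> str:
    """Return an XHTML <audio> fragment for an ambient sound layer."""
    audio = ET.Element("audio", {
        "src": src,
        # XHTML needs boolean attributes written with a value.
        "controls": "controls",  # reader can pause, seek, and change volume
        "preload": "none",       # do not fetch audio until the reader asks
        "aria-label": label,
    })
    if loop:
        audio.set("loop", "loop")
    # Deliberately no autoplay attribute: the reader starts playback.
    audio.text = "Your reading system does not support embedded audio."
    return ET.tostring(audio, encoding="unicode")


if __name__ == "__main__":
    # Hypothetical sound file bundled inside the EPUB.
    print(ambient_audio("audio/tavern.mp3", "Tavern ambience"))
```

Leaving out autoplay is a design choice in this sketch, in keeping with the idea that the reader stays in control of what plays and when.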
This is the frontier we are watching closely. We would love to see more creators experimenting with layered audio experiences inside EPUB 3 books. If you are an author, publisher, or content creator interested in exploring this, read on. We have some tools to help you get started.
Interactive Technical Books: Learn by Doing, Not Just Reading
If you have ever tried to learn a programming language or a technical skill from an ebook, you know the frustration. You read a chapter about, say, writing a function in Python. The code snippet sits there on the page, inert. You have to switch to a separate environment, type it out, and run it yourself.
What if the book let you do it right there?
EPUB 3 supports JavaScript execution, which opens the door to genuinely interactive technical books. We are talking about:
- Embedded code playgrounds where you can edit and run code snippets directly within the reading experience. Imagine a chapter on CSS where you can tweak values and see the results update in real time, without leaving the page.
- Drag-and-drop exercises where you arrange code blocks into the correct sequence to solve a problem. This approach reinforces pattern recognition and syntax comprehension in a way that passive reading simply cannot.
- Inline quizzes and self-checks positioned right after a concept is introduced, letting you test your understanding immediately. These are not traditional end-of-chapter tests. They are woven into the flow of reading, providing instant feedback and reinforcement.
- Interactive diagrams with labelled hotspots that reveal deeper explanations when tapped. Technical documentation often struggles with the tension between being comprehensive and being approachable. Interactive diagrams solve that problem by layering information.
With our work in the education sector, we have already demonstrated that accessible, interactive EPUB 3 activities can work reliably across browsers and mobile apps, including multiple choice questions, fill-in-the-blank exercises, drag-and-drop interactions, and even essay prompts. Research from Graz University of Technology has also confirmed that EPUB 3 is suitable for a variety of exercise types and can serve as a foundational format for digital textbooks.
We are also inspired by what we see happening in the web development education space. Documentation websites and interactive tutorials have long offered the ability to run code and test knowledge inline. The question we keep asking is: why not EPUB 3 technical books? The standard already supports it.
BookFusion’s platform supports JavaScript-enabled EPUB 3 content. If you are a technical author or publisher, this is your invitation to build books that teach by doing.
Tools for Creators: Build Your Own Interactive EPUB 3
If you are an author, publisher, or content creator who wants to start building interactive EPUB 3 content, here is a roundup of tools that can help. We have prioritised free and open-source options where possible.
For Audio Narration and Media Overlays
Storyteller (free, open source, self-hosted) is a platform that automatically aligns ebooks and audiobooks using OpenAI’s Whisper for transcription and a fuzzy matching algorithm to synchronise text with audio. It outputs EPUB 3 compliant files with Media Overlays. Books created with Storyteller can be uploaded to BookFusion for reading with synchronised narration.
Sigil (free, open source) is a long-standing EPUB editor that supports EPUB 3 features including audio, video, and SMIL files for Media Overlays. The icarus plugin for Sigil streamlines the process of adding synchronisation attributes to your text and generating the SMIL timing files. Paired with aeneas (a free, open-source forced alignment tool), you can automate the synchronisation of text and audio.
syncabook (free, open source) is another tool specifically designed for creating EPUB 3 books with synchronised text and audio. It uses forced alignment to automatically generate the timing data needed for Media Overlays.
Tobi (free, open source) is a multimedia book production tool designed for creating EPUB 3 and DAISY audiobooks. It provides a visual interface for recording narration and aligning it with text.
For Interactive Content and Exercises
Calibre’s built-in Edit Book tool gives you direct access to the HTML, CSS, and file structure inside your EPUB, with a live preview that updates as you make changes. For creators comfortable working in that environment, that access is the door to interactivity: you can add JavaScript files directly to the EPUB’s file tree, reference them from your HTML chapters, and build custom interactive elements the same way you would on the web. And since BookFusion integrates directly with Calibre, the books you build and refine there move seamlessly into a reading experience designed to honour every layer of what you created.
Sigil also supports adding JavaScript directly to your EPUB files if you are comfortable with HTML and CSS. For creators with web development skills, this means you can build custom interactive elements from scratch.
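One detail is easy to miss when adding scripts by hand: the package file has to list the script, and any chapter that uses scripting should carry the scripted property on its manifest entry. Editors may handle this for you. If you are scripting your own build, a minimal Python sketch could look like the following. The item id, file names, and sample package document are hypothetical.

```python
import xml.etree.ElementTree as ET

OPF_NS = "http://www.idpf.org/2007/opf"


def mark_scripted(opf_xml: str, chapter_href: str, script_href: str) -> str:
    """Add a script to the manifest and flag the chapter that uses it."""
    # Serialise package elements without a prefix (default namespace).
    ET.register_namespace("", OPF_NS)
    package = ET.fromstring(opf_xml)
    manifest = package.find(f"{{{OPF_NS}}}manifest")
    if manifest is None:
        raise ValueError("package document has no manifest")

    for item in manifest.findall(f"{{{OPF_NS}}}item"):
        if item.get("href") == chapter_href:
            # properties is a space-separated list; keep existing values.
            props = item.get("properties", "").split()
            if "scripted" not in props:
                props.append("scripted")
            item.set("properties", " ".join(props))

    # Every file in the EPUB, scripts included, needs a manifest entry.
    ET.SubElement(manifest, f"{{{OPF_NS}}}item", {
        "id": "quiz-js",  # hypothetical id; must be unique in the manifest
        "href": script_href,
        "media-type": "text/javascript",
    })
    return ET.tostring(package, encoding="unicode")


if __name__ == "__main__":
    sample = (
        '<package xmlns="http://www.idpf.org/2007/opf" version="3.0">'
        '<manifest><item id="ch1" href="chapter1.xhtml" '
        'media-type="application/xhtml+xml"/></manifest></package>'
    )
    print(mark_scripted(sample, "chapter1.xhtml", "js/quiz.js"))
```

Running a validator such as EPUBCheck after a change like this is a sensible way to confirm the package still conforms.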
For Sound Design and Ambient Audio
Zapsplat and BBC Sound Effects offer extensive libraries of free sound effects, from ambient environments to specific atmospheric sounds.
Freesound.org is a collaborative database of Creative Commons licensed sounds, perfect for sourcing ambient layers.
What We Are Building Toward

The future of reading won’t be a choice between print, digital, and audio. It will be blended experiences, created to honour how people actually read: in fragments, across devices, on the move, and in community.
EPUB 3 gives us the foundation. It is an open standard that supports audio narration, synchronised text, JavaScript interactivity, embedded video, and more. Our job is to make sure these features work beautifully and reliably, on whatever device you happen to have in your hand.
Realistic Voices TTS
From this month, BookFusion will take text-to-speech further than it has ever gone inside a reading app. We are building realistic AI voices directly into the platform, so that any book in your library can become an immersive listening experience at the tap of a button.
This is not generic robotic narration. These are expressive, natural-sounding voices that bring characters to life, honour the rhythm of a sentence, and make every genre feel the way it was meant to feel. A gripping thriller. A sweeping fantasy. A dense piece of non-fiction. Whatever you are reading, the voice fits.
Paired with EPUB 3 Media Overlays, the experience goes even further: the text highlights in sync with the voice as you read, letting you follow along visually or switch seamlessly between reading and listening without ever losing your place. It is the read-along experience, powered by AI, available for every title across web, iOS, and Android.
This is the first step toward a much bigger vision: a future where every reader, regardless of budget or accessibility needs, can experience books the way premium audiobooks allow today. It also removes the need for readers to build synchronised narration manually with Storyteller, Sigil, or any other tool.
No separate purchase required. No waiting for a publisher to commission a recording. Just open the book and listen.
The Next Frontier
We want to see more fiction for all ages with layered audio experiences that go beyond basic narration. We want fantasy readers to feel the world around them. We want thriller readers to hear the tension building. We want literary fiction readers to encounter a narrator’s voice that adds emotional dimension to what is already on the page. As good as AI TTS can be, nothing beats a true human touch.
We want to see technical publishers embrace interactive exercises as a standard, not a novelty. The days of code snippets sitting lifeless on a page should be numbered.
We want readers to have more control over their experience. That means better tools for highlights, notes, and annotations within interactive content. It means easier ways to share and loan books. It means cross-device syncing that just works.
And we want to keep contributing to the broader conversation about what books can become. Our work with the Book Industry Study Group (BISG) and our contributions to the Field Guide to Fixed Layout for Ebooks reflect our commitment to pushing the industry forward. Not just for education. For all of us.
Try It Yourself
If you are curious about what interactive EPUB 3 content feels like on BookFusion, start by exploring our shelves for titles that support audio narration and Media Overlays. Realistic Voices TTS is currently available on Web and coming cross-platform this month. Upload your own EPUB 3 content and experience the read-aloud features firsthand. And if you are a creator, we would genuinely love to see what you build.
The best digital reading experience is not one that replicates paper. It is one that goes further.
Experience the future of reading:
📲 Download for Android 📲 Download for iOS 💻 Use on Web
Join the conversation on Discord: https://discord.gg/MkTJqZb9ev

