Parse EPUB electronic book files with Node.js
epub is a node.js module to parse EPUB electronic book files.
NB! Only ebooks in UTF-8 are currently supported!
npm install epub
Or, if you want a pure-JS version (useful, for example, in a Node-Webkit app):
npm install epub --no-optional
var EPub = require("epub");
var epub = new EPub(epubfile, imagewebroot, chapterwebroot);
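Where
epubfile is the file path to an EPUB file.
imagewebroot is the prefix for image URLs. With /images/ as the prefix, the actual URL (inside chapter HTML <img> blocks) is going to be /images/IMG_ID/IMG_FILENAME. IMG_ID can be used to fetch the image from the ebook with getImage.
chapterwebroot is the prefix for chapter URLs. With /chapters/ as the prefix, the actual URL (inside chapter HTML <a> links) is going to be /chapters/CHAPTER_ID/CHAPTER_FILENAME. CHAPTER_ID can be used to fetch the chapter from the ebook with getChapter.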
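A server that receives one of these rewritten URLs needs to recover the ID before it can call getImage or getChapter. The following is a minimal sketch of that step. The helper name idFromUrl is hypothetical and not part of the epub module, and it assumes URLs have exactly the WEBROOT/ID/FILENAME shape described above.

```javascript
// Hypothetical helper (not part of the epub module): extracts the ID segment
// from a URL of the form WEBROOT + ID + "/" + FILENAME.
function idFromUrl(url, webroot) {
    if (url.indexOf(webroot) !== 0) {
        return null; // URL does not start with the expected prefix
    }
    var rest = url.slice(webroot.length); // e.g. "IMG_ID/IMG_FILENAME"
    var slash = rest.indexOf("/");
    // Require a non-empty ID followed by a slash
    return slash > 0 ? rest.slice(0, slash) : null;
}
```

The same helper works for both prefixes, for example idFromUrl(requestUrl, "/images/") for images and idFromUrl(requestUrl, "/chapters/") for chapters.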
Before the contents of the ebook can be read, it must be opened (EPub is an EventEmitter).
epub.on("end", function(){
    // epub is now usable
    console.log(epub.metadata.title);
    epub.getChapter("chapter_id", function(err, text){});
});
epub.parse();
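Because opening is event based, it can be convenient to fold the listener setup and the parse call into one Node-style callback. The sketch below shows one way to do that. The name openBook is hypothetical, and the "error" event is an assumption: only the "end" event is documented here, so check the module source before relying on it.

```javascript
// Hypothetical wrapper (not part of the epub module): opens a book and
// reports the outcome through a single Node-style callback.
function openBook(epub, callback) {
    function onEnd() {
        epub.removeListener("error", onError);
        callback(null, epub); // book is now usable
    }
    function onError(err) {
        epub.removeListener("end", onEnd);
        callback(err);
    }
    epub.once("end", onEnd);
    epub.once("error", onError); // assumption: failures are emitted as "error"
    epub.parse();
}
```

With this in place, openBook(new EPub(epubfile), function(err, book){ ... }) would stand in for the separate on and parse calls.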
metadata is a property of the epub object that holds several metadata fields about the book.
epub = new EPub(...);
...
epub.metadata;
Available fields:
flow is a property of the epub object that holds the actual list of chapters (the TOC is only an indication and can link to a # URL inside a chapter file).
epub = new EPub(...);
...
epub.flow.forEach(function(chapter){
    console.log(chapter.id);
});
The chapter id is needed to load a chapter with getChapter.
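Since flow gives the reading order and getChapter takes one id at a time, loading a whole book means chaining the callbacks. The following is a rough sketch of that pattern. The name readAllChapters is hypothetical, and it assumes every flow item has an id that getChapter accepts.

```javascript
// Hypothetical helper (not part of the epub module): loads every chapter
// in flow order, one at a time, and collects the texts in an array.
function readAllChapters(epub, callback) {
    var texts = [];
    function next(index) {
        if (index >= epub.flow.length) {
            return callback(null, texts); // all chapters loaded
        }
        epub.getChapter(epub.flow[index].id, function (err, text) {
            if (err) {
                return callback(err); // stop at the first failure
            }
            texts.push(text);
            next(index + 1);
        });
    }
    next(0);
}
```

Loading chapters one after another keeps the result in reading order without extra bookkeeping, at the cost of not reading them in parallel.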
toc is a property of the epub object that holds a list of titles and URLs for the TOC. The actual chapter and its ID need to be detected from the href property.
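One way to do that detection is to drop the # fragment from the TOC entry's href and look for the flow item with the same file path. This is only a sketch: the name chapterForTocEntry is hypothetical, and it assumes both TOC entries and flow items expose an href string in the same form, which should be verified against real data.

```javascript
// Hypothetical helper (not part of the epub module): finds the flow item
// that a TOC entry points into. Assumes both carry an href string.
function chapterForTocEntry(flow, tocEntry) {
    var file = tocEntry.href.split("#")[0]; // drop the "#fragment" part
    for (var i = 0; i < flow.length; i++) {
        if (flow[i].href === file) {
            return flow[i];
        }
    }
    return null; // no chapter file matches this TOC entry
}
```

The id of the returned item can then be passed to getChapter.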
Load chapter text from the ebook.
var epub = new EPub(...);
...
epub.getChapter("chapter1", function(error, text){});
Load raw chapter text from the ebook.
Load image (as a Buffer value) from the ebook.
var epub = new EPub(...);
...
epub.getImage("image1", function(error, img, mimeType){});
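The Buffer and MIME type passed to the callback are enough to embed an image directly in HTML. As a small illustration, the helper below builds a data URI from them. The name toDataUri is hypothetical and not part of the epub module.

```javascript
// Hypothetical helper (not part of the epub module): turns an image Buffer
// and its MIME type into a data URI usable as an <img> src value.
function toDataUri(img, mimeType) {
    return "data:" + mimeType + ";base64," + img.toString("base64");
}
```

Inside the getImage callback this could be called as toDataUri(img, mimeType) once the error argument has been checked.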
Load any file (as a Buffer value) from the ebook.
var epub = new EPub(...);
...
epub.getFile("css1", function(error, data, mimeType){});
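Since getFile returns a Buffer for every kind of file, text resources such as stylesheets need decoding before use. The sketch below decodes text-like files as UTF-8, the only encoding the module supports, and leaves everything else as a Buffer. The name decodeIfText and the list of MIME types treated as text are assumptions made for illustration.

```javascript
// Hypothetical helper (not part of the epub module): decodes text-like
// files to a UTF-8 string and returns binary files unchanged.
function decodeIfText(data, mimeType) {
    var isText = mimeType.indexOf("text/") === 0 ||
        mimeType === "application/xhtml+xml" ||
        mimeType === "application/xml";
    return isText ? data.toString("utf8") : data;
}
```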