Hacker News | acabal's comments

I'm shocked and saddened to hear this. Greg was a deep source of knowledge and support as I started and shepherded Standard Ebooks. He was generous with his time and experience, and unbelievably patient with me, some guy he had never heard of or met before who was just another cold-email in what must have been an endless stream in his inbox. We should all aspire to his high spirit of camaraderie, charity, and kindness. The world has lost a champion of both literature and the free web.

Why are there no unique numbers assigned to Standard Ebooks' ebooks? I understand that there is a cost associated with ISBNs, but it's very irritating to not have something that identifies them uniquely. Most (all?) aren't even in Worldcat, so I can't use OCLC numbers for that purpose either.

> no unique numbers

This suggests a misunderstanding of the Standard Ebooks process, which allows continual incremental corrections to the authoritative source of individual books (in XHTML, on GitHub). So, a truly unique identifier would only be valid for the production output(s) from a particular state of the Git-repo sources.

https://standardebooks.org/contribute/report-errors

Recall also that final user content is made available in multiple formats, currently at least six. Example:

https://standardebooks.org/ebooks/geronimo/geronimos-story-o...

Asynchronous to the correction process, Standard Ebooks updates its own production tools. So if an individual book's content requires correction, should the "respin" be done with TOT (top-of-tree, i.e. latest) tools, or with the versions available at time of first publication? Disclaimer: I don't actually know which is current practice -- but using the TOT tool suite is obviously vastly easier.

For most practical purposes, I'd suggest the git-commit date, along with short substrings of author name and title, would suffice.


the ebook identifier uniquely identifies every ebook. standard ebook ebooks use the url as their unique identifier

Those are poor identifiers. A numeric or short alphanumeric identifier that can be part of the filename is important... I have as many as 5 different editions of the same title so title+author doesn't do the trick. Nor am I putting a url into the filename, couldn't if I wanted to as there are disallowed characters in a url in every filesystem I've ever heard of. How difficult is it to keep an incrementing catalog number like Project Gutenberg does? Anything that doesn't have a proper unique identifier just seems unprofessional.


This isn't a solution either. Not sure why you think it is. Here's how I name files, just as an example:

    Meditationes de Prima Philosophia - GTNB•0023306 (2007) - Descartes, René (aut)

    Meditations on First Philosophy - 9780203417621 (2013) - Descartes, René (aut); Haldane, Elizabeth (trl); Ross, G. R. T. (trl) & Tweyman, Stanley (edt,wfw)

Where and how should I put a URI in there, especially considering that they at minimum need the colon (:), which is a problematic character in filenames on NTFS/HFS/APFS/XFS? They're not exactly disallowed, but they create a resource fork or some shit and so it doesn't behave as you would expect. If Standard Ebooks just started numbering their books, then I'd slap the STBK• in front of the number and use that. They're not in Worldcat, or I could use OCLC numbers (but it shouldn't be other people's job to keep the catalog of their own books).

choose your favorite hash

    hash(<dc:identifier>)

Hashes are too long, aren't human-recognizable as to meaning, etc. I don't want half-assed workarounds. They need to uniquely number their books.

- they don’t need to do anything to conform to your arbitrary organization choices

- hashes are as long or short as you need them to be

- publication timestamp is in every ebook’s metadata, is almost guaranteed to be unique, monotonically increases, and has actual semantic meaning compared to an isbn or oclc


>they don’t need to do anything to conform to your arbitrary organization choices

They don't need to. It'd be smart. It's not "arbitrary". It's fucking library science.

>hashes are as long or short as you need them to be

Hashes might uniquely identify a computer file, but they don't uniquely identify an edition/release of a published book. Some jackass on libgen decides to tweak a single byte, now it has a new hash... but it's not a new edition.

>publication timestamp is in every ebook’s metadata

As someone who takes a look at every internal opf file, no... they're not in every ebook.

You're suggesting I go to the extra trouble of doing a job they could do easily, when I can only do it poorly, and I don't know why... because the first person to respond was a dumbass and thought I was attacking him? I swear, 99% of humans are still monkeys.


You don't need to hash file contents (though that is often a useful thing to do). You can hash e.g. the URL that was earlier claimed to be the canonical identifier. Running it through your favorite hash function fixes your complaints about file names (choose your favorite hash function such that it is not too long and only outputs allowed characters).

Ah. The url, so I can substitute one difficult-for-human-readability with another difficult-for-human-readability, both of which are excessively long and opaque-by-design.

>choose your favorite hash function such that it is not too long

ISBN's 13 digits is about as long as is tolerable. Any time there is a list of authors six names long (academic titles) along with a subtitle, it's very easy to bump up against max filename size.

This isn't a problem I can solve on my own. Just trying to bring attention to it. My solution thus far is to just avoid publishers who are so unprofessional as to not provide numbers. It's not tough, Project Gutenberg does it. Anyone can do it. If you're some amateur whose entire catalog is 8 books published, you say "this book is 1, and this book is 2" etc, and it's a done deal. Again, I don't expect anyone to use ISBNs (in the US, you have to pay for them unless you're one of the big 5 publishing houses), but just use your own for god's sake.


Hashes are not excessively long unless you choose to make it so. They might be opaque/random if you want, or they might not. "Remove all special characters and keep only the first 5 characters with space padding" is a string hash function. "Keep only the first 5 vowels with space padding" is a string hash function.

Here's a friendly AI generated hash function to give you an opaque 13 digit number if you're into that:

    echo -n "$URL" | sha1sum | awk '{print $1}' | xxd -r -p | od -An -t u8 | tr -d ' \n' | cut -c1-13

For example, for https://standardebooks.org/ebooks/denis-diderot/the-indiscre... you get the ID 4897562473051.
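
The same idea can be sketched in Python. This is a minimal illustration, not the pipeline above: it takes the first 8 bytes of the SHA-1 digest as a big-endian integer and reduces it modulo 10^13, so it will generally produce a different number than the shell one-liner for the same URL. The function name `short_numeric_id` is made up for this example.

```python
import hashlib


def short_numeric_id(identifier: str, digits: int = 13) -> str:
    """Derive a fixed-width decimal ID from an identifier string such as a URL.

    Hypothetical helper for illustration; not part of any Standard Ebooks tool.
    """
    digest = hashlib.sha1(identifier.encode("utf-8")).digest()
    # Interpret the first 8 bytes of the digest as a big-endian unsigned integer.
    value = int.from_bytes(digest[:8], "big")
    # Keep only the requested number of decimal digits and zero-pad on the left.
    return str(value % 10**digits).zfill(digits)
```

The result is stable for a given input string and safe to use in a filename, but it is still opaque, and truncating a hash to 13 digits makes collisions possible (if unlikely for a catalog of a few thousand books).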

It looks like their ebook sources are all published in git repos online, so you could check out the repos, get the timestamp of the initial commits, and do a monotonic ID on that if you wanted. You could also contribute the change back to them if you think it's something others would benefit from.
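
A rough sketch of that numbering scheme, assuming you have already collected each repo's first-commit Unix timestamp (for instance with something like `git log --reverse --format=%ct` and taking the first line). The function name, the `SE` prefix, and the six-digit width are arbitrary choices for this example, not anything Standard Ebooks defines.

```python
def assign_catalog_numbers(
    first_commit_times: dict[str, int],
    prefix: str = "SE",
    width: int = 6,
) -> dict[str, str]:
    """Number books in order of their first commit time.

    `first_commit_times` maps a book slug to the Unix timestamp of the
    repo's initial commit. Ties are broken by slug so the result is stable.
    """
    ordered = sorted(first_commit_times.items(), key=lambda item: (item[1], item[0]))
    return {
        slug: f"{prefix}{number:0{width}d}"
        for number, (slug, _timestamp) in enumerate(ordered, start=1)
    }
```

Because new repos get later first-commit times, existing numbers stay put as the catalog grows, provided no repo history is rewritten and no older repo is added after the fact.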


> very irritating

I think it’s possible to express this in a less caustic way. Because Standard E-books is high quality and free of charge right?


Have a little respect, for fucks sake. This does not belong here.

Firefox does support XSLT. At Standard Ebooks, our ebook OPDS/RSS feeds are styled with XSLT when viewed with a browser. See for example https://standardebooks.org/feeds/opds/new-releases (use view source to see that it's an XML document).


This is not entirely correct - Kobo also expects a bunch of special <span>s inserted for things like highlighting and page numbers to work.

It kills me that Kobo is so close to having plain epubs rendered with Webkit but for some reason they just won't take the leap!


The manual has some known issues on mobile, I believe there's a GitHub issue open about it. It's low priority because the manual is rarely read on mobile. PRs welcomed!


It makes a lot of sense when you recall that HTML and its ancestors were designed to mark up and format documents, i.e. books. One of the most fundamental elements is <p>, which stands for... paragraph.

Each renderer differs in capabilities, and most are stuck in a subset of early-2000s capabilities, so designing an ebook is very much like designing for the 90s era web. Lots of hacks are required to get the same file to look good on many different renderers, and achieving that is one of the goals of Standard Ebooks.


TEI is something like that, but the amount of effort required to mark a book up like that would be astronomical.


Starts to sound like the kind of task an AI could do reasonably well though


If the goal of these tags is metadata for AI consumption, and the solution to generate them is “use an AI”… what is the point?


Specialization I presume, so one produces the metadata that can be consumed by another.

Also, the thing from the above post that stood out to me would be to act as a reminder for the reader. Not so much the location and emotion, but the character data. I've often found myself wondering who the character is that's appeared in a scene, forgetting that they previously appeared earlier.


You can also join our Patrons Circle to have this book added to our Wanted Ebooks list, which is a list of suggestions for our volunteers to work on: https://standardebooks.org/donate#patrons-circle


Editor-in-chief here, happy to answer any questions, as always. We also recently celebrated Public Domain Day with an especially notable crop of books, including The Sound and the Fury, All Quiet on the Western Front, John Steinbeck's first novel, some Hemingway, Gandhi, two Dashiell Hammett novels, and more: https://standardebooks.org/blog/public-domain-day-2025


Another question - in https://standardebooks.org/contribute/producing-an-ebook-ste... you talk about "modernising" spelling, e.g. changing "some one" to "someone". This may be against the implicit goal of making these accessible for a general reader, but I prefer to read what was originally written, and it feels like it crosses a line into editorialising rather than letting the original feel stand as-is. (Although of course these texts have already been "editorialised" by their original editors!) Totally your decision given the amount of effort that has clearly gone into this, but I'd be interested to read the rationale for that decision.


I respect this choice of modernization, and I suppose some readers enjoy it, but it makes the publisher's whole work useless to me. When a text has been altered, I can't trust it respects the intent of the author, and any style inconsistency I find may be a by-product of the publisher's mangling.

So, when I care about a book, I never read Standard Ebooks' edition.

By the way, the modernization is more than joining a few words. Sometimes, Standard Ebooks replaces the word used at the time the book was written. For instance:

    This time, however, the mountain was going to [-Mahomet;-]{+Muhammad;+}

The previous quote is from Galsworthy's "Forsyte Saga". The author used many French words and French spellings – like "Tchekov" for the Russian playwright who was living in Paris. These subtleties are lost with the modernization.

I also think some alterations are plain mistakes. For instance in the same book:

    if she wanted a good book she should read [-“Job”-]{+Job+};
    his father was rather like Job while Job still had land.


Anyone who has read books for classes in high school and above knows that even classics are routinely fucked with by publishers. Even early in the work's history. I remember even in middle school someone would invariably end up with a different publisher's edition of a book for summer reading or whatnot and we'd find changes.

Unless the book is specifically declared to be the original text - and it may have to specify which original text - they're going to be edited.

However, in electronic form it should be possible to include both in one file, or two files with the original in a repo branch once all the document structure stuff has been added. That text will never change, so merging formatting-only changes should be pretty painless.


For every book, Standard Ebooks provides a hyperlink to the original scan, a hyperlink to the original transcription, and a full revision history in which all spelling updates have been clearly marked. To me, this already seems to be going above and beyond—most ebook repositories provide less. I can’t imagine that the marginal benefit from keeping multiple parallel branches would be worth the cost in volunteer time and labor, when maintaining pristine first editions isn’t even a goal of the project.


And of course, none of this matters in the slightest for translated works, which almost by definition includes the vast majority of works ever written.

"As it was written" is a very high bar that is simply not attainable for anything other than fairly recent works in your native language.


> I also think some alterations are plain mistakes. For instance in the same book:

That one appears not to be a mistake: [0] suggests that not quoting the name of the book of the Bible being referred to (so [Job] rather than ["Job"]) is the style accepted by Chicago, MLA, and APA.

[0] https://en.wikipedia.org/wiki/Bible_citation#Common_formats


I respect their choice too, but like you the reason for my question was that I feel I can't trust the end product. Alex said "We only make sound-alike changes, like to-morrow -> tomorrow", which I could just about get along with, but Mahomet -> Muhammad creates an entirely different flavour for me. As Alex said, that's fine, in that it doesn't mean the other editions aren't available, but it is a shame for me when I essentially don't want to use something that has been put together so painstakingly.


I'm disappointed to learn of this editing in Standard Ebooks, having had the misfortune to buy a Barnes & Noble copy of the complete Sherlock Holmes that had a similar approach taken. Book looks lovely, but has an altered chapter order, Americanised spellings and lots of typos. There is a certain amount of editing needed to render the likes of Shakespeare and Samuel Pepys readable, as Middle/Old English is quite a different language, but slight variants from 150ish years ago, or dialects, or the correct spelling according to the Queen's English, add flavour and should not be altered.


That's fine! Our editions didn't erase any of the other editions you can find online and in print. You're more than welcome to select any edition that fits your reading preferences.


Apologies if that came across as at all critical. Genuinely interested in the rationale rather than it being a how-dare-you demand for you to explain yourself!


Spelling varies widely across the eras our ebooks were published in. Therefore we attempt to standardize spelling to what a modern reader might be familiar with. We only make sound-alike changes, like to-morrow -> tomorrow.

This is a common practice that editors and publishers have quietly engaged in for centuries. For example, today you are not reading Shakespeare in the way it was spelled in its first printing.


A wonderful project!

After reading this comment I couldn't help but picture medieval monks, toiling away copying old manuscripts into "modern" English. Normally a thankless task, so thank you!


And you're for sure not speaking it like he would have


Is there epub-specific html markup you could add to changed words to indicate their original spelling? Like alt text for images, but in a span around a word? There's the html "title" attribute, of course, which would work (mouseover shows the title attribute's value), but that isn't semantically correct for the purpose.


No, there are too many things to track, but all of it is in the git history. Editorial changes have a commit message prefaced with [Editorial].
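
For anyone who wants to pull those out of a repo, here is a minimal sketch. It assumes you have a list of commit subject lines (for example from `git log --format=%s`) and relies only on the `[Editorial]` prefix mentioned above; the function name is made up for this example.

```python
def editorial_commits(subjects: list[str], tag: str = "[Editorial]") -> list[str]:
    """Return only the commit subjects that start with the editorial tag."""
    return [subject for subject in subjects if subject.strip().startswith(tag)]
```

Given subjects such as `"[Editorial] to-morrow -> tomorrow"` and `"Fix typo"` (both invented here), only the first would be returned.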


Fair enough - thanks for the explanation.


> For example, today you are not reading Shakespeare in the way it was spelled in its first printing.

However, we call modernised Shakespeare “abridged”.


Abridged means shortened, not modernized.


I appreciate this service you are doing, but it would be much much better to also have an original version with archaic spelling. Double bonus points for having optional (hidden by default) explanations of words. This would be tremendously helpful to some students.


[flagged]


> "Don't like it? Here is a full refund and you are free to read some other version."

That is not at all what I said.

> You can't claim to care about preserving the works while changing them, and that is changing them.

We do not and have never made that claim. We are creating our own editions of these public domain books, not engaging in historical preservation.

If you want to read classic books in their original spelling, then you must locate first editions. Editors and publishers have updated both spelling and punctuation as a matter of course for centuries. Just look at any three editions of any Jane Austen novel - and you could never read an edition of Shakespeare more recent than 1800.


That’s how I read it. What do you mean then? It sounds like the only edition you may offer is the editorialized one, if applicable.

As someone who writes I greatly dislike this. These are my words, not yours.

A translation across time and generations is a completely different matter.


I think it's important to note that in the past, typesetters and printers had a much more editorial role than the process today. Authors would submit handwritten manuscripts and the typesetters in many cases would have to fix the author's mistakes, spelling, etc. to conform the manuscripts to printing standards with the author having limited communication or ability to proof the final plates

Today, it's much easier for authors to have a greater say in the final presentation due to the digital composition process


You can't use an appeal to tradition as the argument for revision.

I don't see why anyone should care that publishers have edited in the past anyway, even in this particular discussion where my own argument is for conservation. Publishers have done all kinds of things that this very project itself criticises and pointedly set themselves apart by doing differently. So, it's a weak argument for them.

Aside from that, what any other publishers do, even if it's totally common and even universal, doesn't change the argument that they were making that they wish to suggest that those edits cross a line that fixing typos doesn't cross.


By the time they reach the public domain they aren't though, and the public can and should do with them as they see fit

Modernizing / adapting is the least damaging change to be done here


For what it’s worth, that’s also exactly how I read your response, which was (to repeat) ‘That's fine! Our editions didn't erase any of the other editions you can find online and in print. You're more than welcome to select any edition that fits your reading preferences.’

I think that Standard Ebooks is a great-sounding project, but I honestly found your response not just flippant, but passive-aggressively rude to the original poster.

But, full disclosure, I also think that it would be a good idea to preserve the spellings found in the original editions you are digitising. So perhaps I am inclined to feel the bite of your response more than someone who just doesn’t care.


> I honestly found your response not just flippant, but passive-aggressively rude to the original poster.

I didn’t read it that way at all. How would you have worded it in such a way as to sincerely express the stated sentiment without coming across as passive‐aggressively rude?


> How would you have worded it in such a way as to sincerely express the stated sentiment without coming across as passive‐aggressively rude?

Something like ‘While we understand that some people would prefer to read the original texts (modulo typos, formatting errors and the like), we think that it is preferable to modernize spelling because X, Y and Z.’

In other words, the polite response to ‘I like most of what you’re doing, but I dislike this particular thing’ is not ‘Fine! You’re free to go elsewhere,’ with an implied ‘don’t let the door hit you on the backside on your way out,’ but rather to engage and explain.

Again, I have to admit my own bias against the policy and consequent bias in favour of the original poster.


It is what you said. And for the record, I love the idea of this project. I just agree with the other poster about the location of this line that's all.


The text you have in your “quote” is a lot more snarky and rude than the original message. Did they edit their comment or something? Otherwise—why not quote an actual quote?


Considering the thrust of my comment, I don't understand the question. Obviously paraphrasing someone else's words into ones you like better is a fine and acceptable thing to do. So clearly I am just illustrating the problem by example.

The real answer is twofold.

1. We don't have a special 3rd kind of quote or other punctuation mark for reinterpreted references.

2. The real one: This is not a quote that lies as you imply. It is a new message, that merely uses quotes to denote a speaker, as in a pure fictional work, where the characters' dialog is in quotes, even though no actual human was actually quoted.

Are there any other conundrums and baffling mysteries I can clear up for you?


When you use that syntax it looks like you are calling out an explicit quote; you may think that it's a reasonable paraphrase but I think most readers will see what you did as a strawman instead of a paraphrase.

Better to write inline "I feel like what you said amounts to [...]" to reduce the perception that you're making up quotes that someone didn't say or even clearly imply.


No one literate is in any danger of misinterpreting this very basic technique. I don't care about anyone else because it doesn't matter, they will misinterpret regardless, deliberately.


“I wanted a pure fictional speaker to argue against.”

Ok, thanks, that makes sense.


Ah but I did paraphrase, and you did not. My paraphrasing was not a lie, and yours is.


> Obviously paraphrasing someone else's words into ones you like better is a fine and acceptable thing to do.

Wrong. Not only is it tasteless and dishonest (not "fine"), it is against the rules of this site. But regardless of whether it's allowed elsewhere, you still shouldn't do it. (See "tasteless and dishonest".)


What's the point of including books that aren't public domain yet in your collections?

It makes it hard to browse those collections to find actual books to read. The first 3 series I clicked on all said "not P.D." (at first I didn't know what "P.D." meant; remember your audience does not have your level of familiarity with your context, so perhaps a tooltip on that badge would help).

Then I see "this book will enter public domain in 2050".

I commend you for this project, it's really awesome work. From a user's perspective, it would be great to have a filter on your various lists that restricts them to books that are available, and excludes the books that are not yet in your collection.


In addition to what Robin mentioned below, some of these placeholders are for books on our Wanted list. I also think it's useful to show readers that particular books are looking for volunteers to produce, and also to show that some books they might want are locked away by copyright for possibly decades. In that sense it's partly a political message.


It sounds like implementing the filter gp suggested would still send the political message though.


Whenever we add a collection, the books that are in that collection but not yet in PD in the US get placeholders. But a filter might not be a bad idea.


Which ebook reader works well with standard ebooks in 2025?

(More concretely my reader is a 2nd-gen kindle which is basically useless these days and I’d love an idea of something that can display standard ebooks with all their advanced formatting)

Thanks!


I read on an old Kobo, using Kepub files. Their Kepub renderer is quite good.

I think Kindle's renderer hasn't changed significantly for many years, and it had always been pretty bad. I always say that Kindle seems to have been created by people who hate books.

The best renderer around is iBooks on an iPad, which as far as I can tell uses an up-to-date Webkit.


I'd suggest KOReader, on various devices, as the best renderer and interface.


I read standard .epub files with KOReader on my Kobo Aura H2O. It's faster, nicer-looking, and more customizable than the stock reader, and the installation instructions were complete, correct, and easy to follow.


Thanks! I don’t like reading on a backlit screen (hurts the eyes) so iPad is a no-go, but a kobo would probably work!


Kobo Libra 2 is a great e-reader. Works well one-handed (screen rotates for left/right hands), has buttons for page turns. Integrates with Overdrive (what Libby uses). Drawbacks are Kobo's bookstore is weaker than Amazon/Apple. Screen is also not flush which means some dust can collect in the recess.


I also use a Kobo and occasionally an iPad. Do you know if it's possible to sync progress between the two?

I've been meaning to try calibre-web, but I'm doubtful iBooks will support OPDS.


A note for Kobo users: a lot of us (myself included) use Calibre to manage and upload our ebooks. Something about Calibre messes up Kepub files and strips out a lot of the formatting (including the book’s cover).

If I want to appreciate a nice Kepub from Standard Ebooks, I upload it directly to the Kobo.


A Kobo would be a great choice. I use a Kobo Libra 2 and love it a lot more than my old Kindle Paperwhite that got stolen: https://gl.kobobooks.com/products/kobo-libra-2 The Kobo Sage is also good because it has an 8" screen.

Standard Ebooks offers files in kepub format for Kobo devices, which use Kobo's advanced Webkit-based renderer: https://standardebooks.org/help/how-to-use-our-ebooks#kobo-f...


What did you do with purchased books you had in your kindle? Rebuy them? Just “let them go”?

Thanks for the recommendation!


Fortunately, I had them backed up to a cloud folder. I remember almost deciding not to go to the trouble to back them up, but isn't that how it always works with backups? The Kobo also works with epub.


I recently purchased a Pocketbook Era. It is pretty much the perfect device for me - supports open standards and does not require any cloud account signups to start using it. It is not hostile to the user, 3rd party applications such as Koreader can be simply dropped in and they appear in the menus without any shenanigans like jailbreaking or custom launchers needed.

In my ideal world all devices would be like this.


Piggybacking: for computers, what is a good epub viewer?

What I'm personally looking for:

- Linux and/or OS X

- No ‘import’ requirement (a viewer, not a collection manager)

- Single page or continuous (no forced double spread)

- No required animations

- At least basic control over font size, spacing, margins.

- Keyboard navigation (at least next/previous page)


Check out Foliate, it's a really nice reader and Standard Ebooks display quite nicely using Foliate IMO.


For Linux, Foliate is very nice.


Apple Books on macOS is pretty nice


That’s calibre viewer, but it may require some customization to get something nice. Foliate is ok, but it’s a library. I’d say that’s OK because epub is a zip file and you need to extract it to read it.


Zathura is nice. Has vim bindings and a minimal UI.


OS X: FB Reader


Alexandria.


KOReader for Kindle? https://github.com/koreader/koreader

It does a good job of modernising old Kindles.


For Android, Moon Reader Pro.

Unmatched UI tweaking features which make reading a pleasure. Syncs bookmarks with cloud services, thus across different devices.


My Kindle is 8 years old and works excellent with standard ebooks. I think you can select any device that you prefer and it will be good.


Oh so you have one of the new Kindles!!

For reference my gen 2 kindle is 16 years old.


You are not alone. I know my partner is going to hate it when their antique reader dies and they will need to deal with a touch screen.


I love this. However, I couldn't find an alphabetical list of authors, which is the way I wanted to browse on my first visit. Instead my only option is to show 48 on a page and paginate through, which is tedious. I know there are author pages - e.g. https://standardebooks.org/ebooks/william-makepeace-thackera... - so I presume it's feasible. An author index would significantly increase my likelihood of understanding what's available and engaging with the content.


We don't have a list of authors yet, but that's a good idea to add!


You could reuse whatever process generates the sitemap: https://standardebooks.org/sitemap

All the author pages come before any pages with books from those authors.



Hi, Alex. Is there any way to browse the ebooks filtered by language? I tried to find some texts in French, but there don't seem to be any.


Standard Ebooks only works on English-language books, as typography varies between languages and we're only experts in English.


I can tell you there is a lot of appetite for other languages. I looked at the project and the amount of stuff that would need to be rewritten to work with multiple languages was daunting. I would consider working on making your documentation and workflow functional with multiple languages.


Lots of people have tried similar projects in other languages but as far as I know none have persevered.

Personally I think it's important to have one person in charge who is able to approve of the quality of all the project's output; for now, at SE, that person is me and I'm only an expert in English.


Project Runeberg seems to be still going after 30-odd years.


Project Runeberg is trying to be a nordic Project Gutenberg, not a nordic Standard Ebooks.


Enlightening comment!


Same for me. I think it's English only.


Great work! Gutenberg project books have always been a pain to read. Thank you for caring!


I am from India. Could you add local UPI based donation option at some point? Not everyone has card here.


Wonderful project! One thing I wish the website would have is being able to find the right book to read out of this enormous list — e.g. showing / sorting by Goodreads ratings (which I realize you might not want to do), or at least having some kind of a "Featured" section with the most critically acclaimed / must read books of the project on one page.


There are around a dozen collections on the (not prominently featured) collections page[1] like Le Monde's 100 Best Books of the Century and Modern Library's 100 Best Novels, etc.

1. <https://standardebooks.org/collections>


Steinbeck was the first name I searched for, so this was great to see even if his major works won't be available for some time. I do wonder how badly the Steinbeck or Faulkner estates are hurt by the sudden loss of royalties? Imagine working hard to write a book to make a living and then just under a hundred years it's taken away from you. Also, AI.


Is there an API or downloadable catalog of the titles? Happy to feature them on meetnewbooks.com so more readers can find them.


Yes, we have complete feeds available for our Patrons: https://standardebooks.org/feeds


Been using Standard Ebooks for a while now, but wanted to drop by here and say how great this site is! It's replaced P.G. for me (for whatever is on this site, at least) and I like the much nicer formatting on the texts. It's great on both my physical Kindle and Apple Books on my iPhone.


I’d love to know more about the pattern of keeping each book in individual repos, rather than in a singular repo.


Each repo is a history of the ebook including editorial changes, typos fixes, and the like. Having a single repo containing thousands of ebooks and their histories would be pretty annoying to browse.


Presumably to keep the repo size reasonable. Say I want to make an ad hoc contribution to a book, if step 1 is "download this multi-gigabyte repo" then that's a fairly big hurdle.


Really appreciate the work Standard Ebooks puts into making these texts not just available, but readable


Roughly speaking, how long does it take you to produce a single ebook?


Once you're very familiar with the process, you could get a draft of a basic prose novel ready for proofreading in a few hours. Then it has to be proofread and completed.

Beginners, and people working on more advanced books, can take much, much, much longer.


it varies widely depending on the length and type of book and how much free time the volunteer has to devote to it

Anywhere between 1 week for the simplest (straight narrative, not too much verse or endnotes) and ~1 year (thousands of endnotes, pages of verse, drama, in-line references to book titles, use of technical terms, etc)


In your opinion, what is the ebook reader you like the most?


ooo tempted to reprint faulkner as part of a small press, thanks for the idea


We use the Flesch-Kincaid algorithm to calculate reading ease. For most books it works pretty well, but for avant-garde prose like The Sound and the Fury it fails pretty badly. It also considers Ulysses to be "fairly easy"!
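
To show why that kind of score can go wrong on unusual prose, here is a minimal sketch of the standard Flesch reading-ease formula, 206.835 − 1.015 × (words / sentences) − 84.6 × (syllables / words). This is not Standard Ebooks' actual implementation, which isn't shown in this thread; the tokenization and the syllable counter below are deliberately crude assumptions made for illustration.

```python
import re


def count_syllables(word: str) -> int:
    """Rough heuristic: count runs of consecutive vowels, with a floor of 1."""
    lowered = word.lower()
    count = len(re.findall(r"[aeiouy]+", lowered))
    # Treat a trailing "e" as silent (e.g. "made"), except for "-le" endings.
    if lowered.endswith("e") and not lowered.endswith("le") and count > 1:
        count -= 1
    return max(count, 1)


def flesch_reading_ease(text: str) -> float:
    """Compute the Flesch reading-ease score; higher means easier to read."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    if not sentences or not words:
        raise ValueError("text must contain at least one sentence and one word")
    syllables = sum(count_syllables(w) for w in words)
    words_per_sentence = len(words) / len(sentences)
    syllables_per_word = syllables / len(words)
    return 206.835 - 1.015 * words_per_sentence - 84.6 * syllables_per_word
```

The formula only sees sentence length and syllable counts, so a text built from short, common words scores as easy no matter how hard it is to follow.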


Some placeholders like this one exist because they are part of other collections for which we have more items. So when the reader is viewing the other collection, they can see which items we have and which we don't.

