
800 GiB torrents with 1500k public domain paywalled papers from before 1923

Happy public domain day to all! A special thought for people in the USA, who until yesterday suffered a 20-year-long winter of ever-expanding copyright.

Let me return to the racket of academic publishers who abuse copyright to enslave thousands of researchers and leech billions in public funding from cultural institutions every year. Following my release of public domain IEEE papers and seeing the format of some other releases by other users, today I bring your attention to Scholarly works published until 1909 (torrent) and in 1909–1922 (torrent).

Please download and seed the torrents above! Or if you prefer you can add the hashes/magnet links directly, but not all clients support the web seeds provided this way.

Hashes:

5a17b09511034fcf8dfebcf00a0499660154cfb6

70ecab072b2792c9239ab8197d3f52cc1d075be1

Magnet links:

magnet:?xt=urn:btih:5a17b09511034fcf8dfebcf00a0499660154cfb6&tr=udp%3A%2F%2Ftracker.coppersurfer.tk%3A6969%2Fannounce&tr=http%3A%2F%2Fbt1.archive.org%3A6969%2Fannounce&as=https%3A%2F%2Farchive.org%2Fdownload%2F

magnet:?xt=urn:btih:70ecab072b2792c9239ab8197d3f52cc1d075be1&tr=udp%3A%2F%2Ftracker.coppersurfer.tk%3A6969%2Fannounce&tr=http%3A%2F%2Fbt1.archive.org%3A6969%2Fannounce&as=https%3A%2F%2Farchive.org%2Fdownload%2F
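For anyone scripting this, here is a minimal sketch of how a magnet link like the ones above breaks down into its parts. The function name `parse_magnet` is made up for illustration; it assumes the `xt=urn:btih:` form used here and treats the `as=` parameter as the web seed source, as the post describes it.

```python
from urllib.parse import urlsplit, parse_qs


def parse_magnet(uri: str) -> dict:
    """Split a magnet URI into its info hash, trackers, and web seeds."""
    parts = urlsplit(uri)
    if parts.scheme != "magnet":
        raise ValueError("not a magnet URI")
    # parse_qs percent-decodes values and collects repeated keys into lists.
    params = parse_qs(parts.query)
    xt = params.get("xt", [""])[0]
    prefix = "urn:btih:"
    if not xt.startswith(prefix):
        raise ValueError("no BitTorrent info hash in xt")
    return {
        "info_hash": xt[len(prefix):].lower(),
        "trackers": params.get("tr", []),
        "web_seeds": params.get("as", []),
    }
```

Passing the first magnet link above should give back the hash `5a17b09511034fcf8dfebcf00a0499660154cfb6`, the two tracker URLs in decoded form, and `https://archive.org/download/` as the only web seed.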

These datasets contain 1,518,078 PDF files from various sources, often with added OCR. They were all published before 1923 in international journals. I'm not providing legal advice, but if you consider them simultaneously published in the USA, they should all be in the public domain in the USA. Yet publishers apply indiscriminate copyright statements to the contrary, which may constitute copyfraud, and lock nearly all of them behind paywalls or other hurdles, hoping to milk some more profit for who knows how many centuries.

You can also download PDFs by individual publisher, going by their DOI prefix and checking the full list of DOIs (1909, 1923). Internet Archive can also support the direct download of individual files inside the ZIP files but that's probably best handled by other repositories.
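A rough sketch of the "going by their DOI prefix" step: a DOI has the form `prefix/suffix`, and the prefix identifies the registrant, so grouping a DOI list by prefix gives a per-publisher breakdown. The function name and the sample DOIs below are invented for illustration and are not taken from the actual lists.

```python
from collections import defaultdict


def group_by_prefix(dois):
    """Group DOIs by the part before the first slash (the registrant prefix)."""
    groups = defaultdict(list)
    for doi in dois:
        doi = doi.strip()
        prefix, sep, _suffix = doi.partition("/")
        if not sep:
            continue  # skip blank lines and anything not shaped like prefix/suffix
        groups[prefix].append(doi)
    return dict(groups)


# Hypothetical DOIs, made up for this example only.
sample = ["10.1234/abc.1905.001", "10.1234/abc.1907.042", "10.5678/xyz-17"]
by_publisher = group_by_prefix(sample)
```

With the real list you would read one DOI per line from the file instead of using `sample`.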

Library users have already reused and curated these public domain datasets to enrich some knowledge bases and open repositories which make the works more accessible. Given publishers fail to perform their duties, it's on us to comb through the copyright status and metadata for all the scientific knowledge.

48 comments
98% Upvoted
level 1

Link them up on academictorrents.com

level 2
Original Poster · 10 points · 23 days ago

Thanks for reminding me! I registered years ago, I think, but I've never used it.

level 2

Thanks I'm drooling over it

level 2
Original Poster · 15 points · 23 days ago

Here you go with the first: http://academictorrents.com/details/70ecab072b2792c9239ab8197d3f52cc1d075be1/tech

The admin kindly upgraded my account to uploader, but I had forgotten that the upload form requires the torrent file to already contain their own tracker. Luckily torrent-file-editor lets you change such details without changing the info_hash; otherwise it would have fragmented the swarm.
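Why the tracker can change without the info_hash changing: the info_hash is the SHA-1 of the bencoded `info` dictionary only, and the `announce` URL sits outside that dictionary. Below is a minimal sketch with a toy bencoder and a made-up torrent structure; the second tracker URL is a placeholder, not the real academictorrents tracker, and a real tool would hash the raw `info` bytes from the file rather than re-encode them.

```python
import hashlib


def bencode(value) -> bytes:
    """Encode ints, str/bytes, lists, and dicts in bencode."""
    if isinstance(value, int):
        return b"i%de" % value
    if isinstance(value, str):
        value = value.encode("utf-8")
    if isinstance(value, bytes):
        return b"%d:%s" % (len(value), value)
    if isinstance(value, list):
        return b"l" + b"".join(bencode(v) for v in value) + b"e"
    if isinstance(value, dict):
        # Dictionary keys are byte strings and must be emitted in sorted order.
        items = [(k.encode("utf-8") if isinstance(k, str) else k, v)
                 for k, v in value.items()]
        items.sort(key=lambda kv: kv[0])
        return b"d" + b"".join(bencode(k) + bencode(v) for k, v in items) + b"e"
    raise TypeError(f"cannot bencode {type(value).__name__}")


def info_hash(torrent: dict) -> str:
    """SHA-1 over the bencoded 'info' dictionary only."""
    return hashlib.sha1(bencode(torrent["info"])).hexdigest()


# Toy metadata, invented for illustration.
info = {"name": "example.zip", "length": 12345,
        "piece length": 262144, "pieces": b"\x00" * 20}
original = {"announce": "udp://tracker.coppersurfer.tk:6969/announce", "info": info}
edited = dict(original, announce="http://tracker.example.invalid/announce")

same_hash = info_hash(original) == info_hash(edited)
```

Here `same_hash` is `True` even though the two encoded files differ, which is why peers from both versions of the file end up in one swarm.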

level 1
48TB · 49 points · 23 days ago

Trying to jump on with a seedbox

I'm getting this from the Tracker: [Failure reason "Requested download is not authorized for use with this tracker."]

For both .torrent files. Same result using DHT (I've enabled DHT for this).

level 2
Original Poster · 32 points · 23 days ago

Thanks for helping! Yes, the reply from the archive.org tracker is bogus at the moment. You should still get the webseed working, and if you add the other tracker (e.g. udp://tracker.coppersurfer.tk:6969/announce) you should now see several nodes.
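If your client takes magnet links rather than a tracker list, adding the other tracker amounts to appending one more percent-encoded `tr=` parameter. A small sketch, assuming nothing beyond the standard library; `add_tracker` is a name made up here.

```python
from urllib.parse import urlsplit, parse_qs, quote


def add_tracker(magnet: str, tracker: str) -> str:
    """Append a tr= parameter unless that tracker is already listed."""
    existing = parse_qs(urlsplit(magnet).query).get("tr", [])
    if tracker in existing:
        return magnet
    # safe="" makes quote() encode ':' and '/' as well.
    return f"{magnet}&tr={quote(tracker, safe='')}"


bare = "magnet:?xt=urn:btih:70ecab072b2792c9239ab8197d3f52cc1d075be1"
with_tracker = add_tracker(bare, "udp://tracker.coppersurfer.tk:6969/announce")
```

Calling it a second time with the same tracker returns the link unchanged.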

level 3
48TB · 6 points · 23 days ago

Thanks

level 2
4 points · 23 days ago

Yeah, IA does this. It'll work, just slow

level 2
3 points · 23 days ago

Can you or anyone recommend a seedbox?

level 3
Original Poster · 5 points · 23 days ago

Looking for a Seedbox recommendation? Read this First. There are plenty of suggestions in other parts of Reddit. I can say that the fastest node I currently see connected to these torrents is on Whatbox.

level 3

feralhosting

level 4
2 points · 23 days ago

+1 for certain

level 3

This is a perfect job for a KS-2 on Kimsufi, since they give you 1TB of storage on a dedicated server:

https://www.kimsufi.com/us/en/servers.xml

4.99€ / mo or $5.99 USD / mo

level 4
1 point · 20 days ago

I can't find it anywhere but do they have a data usage charge for egress and ingress?

level 5

Not that I know of. It is best-effort 100 Mbps or better, so you could theoretically push 35 TB per month. Neither the CPU nor the memory is great, so you can't saturate the bandwidth; you'd need a $20/mo server for that. But as just a seedbox it is fine.
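For reference, the back-of-the-envelope arithmetic behind a monthly figure like that, assuming a link saturated around the clock and decimal units (1 Mbps = 10^6 bits/s, 1 TB = 10^12 bytes). The function name is made up for this sketch.

```python
def monthly_transfer_tb(mbps: float, days: int = 30) -> float:
    """Upper bound on data moved in `days` days at a sustained `mbps`."""
    bytes_per_second = mbps * 1_000_000 / 8
    return bytes_per_second * 86_400 * days / 1e12


at_100_mbps = monthly_transfer_tb(100)  # about 32.4 TB over 30 days
```

At exactly 100 Mbps this comes out a little under the figure quoted above; the "or better" part of best-effort bandwidth would account for the rest.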

level 6
1 point · 19 days ago

I’ll check em out when they replenish the servers, thanks.

level 7
2 points · 19 days ago

There was a Discord that kept track of availability. I see there's this service to tell you: https://kimsufi-notifier.com/

I got a KS-1 but it took me several weeks (3.99€/mo)... Good luck with KS-2. If I had a budget, I'd seed one for science, but I'm not that motivated.

level 3
12TB - Raid 10 · 1 point · 23 days ago

Seedhost is great.

level 1

This is the kind of technology rebel I like... the kind who steals fire and gives it to the world.

level 1
Someone's else computer · 12 points · 23 days ago

Your first link seems to be broken (Scholarly works published until 1909 (torrent)), specifically the torrent link. Here's the good link: https://archive.org/download/crossref-pre-1909-scholarly-works/crossref-pre-1909-scholarly-works_archive.torrent
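To avoid this kind of typo, the link can be built from the item identifier. This sketch assumes the `<identifier>_archive.torrent` naming seen in the working link above; the helper name is invented.

```python
def ia_torrent_url(identifier: str) -> str:
    """Build the Internet Archive torrent URL for an item identifier."""
    return f"https://archive.org/download/{identifier}/{identifier}_archive.torrent"


url = ia_torrent_url("crossref-pre-1909-scholarly-works")
```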

level 1
4 points · 20 days ago

Any particular reason 1923 is not included? The US threshold changed to 1924 on January 1 of this year, so anything published before 1924 (including 1923) is now in the public domain in the USA. I don't particularly need anything, just curious.

level 2
Original Poster · 3 points · 20 days ago

I started assembling the dataset months ago and I posted it on the Internet Archive when it was still 2018, so 1923 was the threshold at the time.

level 1
4 points · 19 days ago

Are those papers available in Sci-Hub and Library Genesis? I think the best and easiest way to keep those papers alive would be to add them to those projects and help them there. They already have a good infrastructure that can be replicated :)

level 2

+1 Does anyone know how to get in touch with Sci-Hub? They'd have to resolve the DOIs to get the rest of the metadata. But someone who knows what they're doing could code this pretty fast... if Sci-Hub doesn't already have that coded.

Silly publishers, giving away all that delicious metadata for free. (rubs hands together)

level 3
1 point · 15 days ago

If I remember correctly, Sci-Hub first integrates the PDFs into Library Genesis. If this torrent contains the PDFs and the metadata, it should be simple to upload there. There's a forum for Library Genesis discussions, mainly in Russian, but some of the guys there speak English too. I'm on holiday now, so I can't help much.

level 2
Original Poster · 1 point · 17 days ago

AFAIK they are, didn't check recently. This dataset is mostly to aid those who need a corpus large enough and easy enough to download at once, or who need a more "mainstream" access venue.

level 3
1 point · 10 days ago

Good to know! AFAIK you can download library genesis via torrent.

level 1
6 points · 23 days ago

The first torrent doesn't work.

level 2
Original Poster · 10 points · 23 days ago

Typo in the link, fixed. Thanks!

level 3
5 points · 23 days ago

Thanks. Added to my seedbox, though it always hates IA torrents for some reason and runs a bit slow

level 4
Original Poster · 6 points · 23 days ago

Thanks! The web seeders rarely go above 1-2 MiB/s in my experience, sometimes much less from Europe. That's one reason it's helpful to have more seeders from around the world. :-)
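To put those web seed rates in perspective, a quick estimate of how long the roughly 800 GiB from the post title would take from a web seed alone. It assumes a constant rate with no other peers; the function name is made up.

```python
def download_days(size_gib: float, rate_mib_s: float) -> float:
    """Days needed to fetch `size_gib` at a steady `rate_mib_s`."""
    seconds = size_gib * 1024 / rate_mib_s
    return seconds / 86_400


slow = download_days(800, 1)  # roughly 9.5 days at 1 MiB/s
fast = download_days(800, 2)  # roughly 4.7 days at 2 MiB/s
```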

level 1
Nice Try FBI · 7 points · 23 days ago

Thank you for sharing.

level 1

Downloading now, was looking for this specific project. I'll plan to seed forever. Thanks!

level 1
1 point · 20 days ago

Should one build a seedbox for local/private use, or pay for a monthly one?

level 2
Original Poster · 1 point · 20 days ago

This is not the latest blockbuster in ultra-high resolution, it doesn't move that much traffic. If you don't already have a seedbox it's frankly better that you just torrent it at home from an external hard disk and let it sit there as a distributed backup for the future.

level 1
4TB · 1 point · 12 days ago

You should put this on legittorrents too, just for duplication. I don't have the storage space to seed this at the moment, but I hope to later this year when I finally have the funds to build my NAS (HDDs are expensive, yo).

level 1
-23 points · 23 days ago(0 children)
level 2
5TB peasant · 10 points · 23 days ago

What in the ever loving fuck is wrong with you? What does Trump have to do with any of this?

Copyright was extended years ago, by Congress. He had nothing to do with it. And no, he could not undo it via EO, because Presidential EO does not carry the same authority as Congressional law.

level 3
-8 points · 23 days ago(0 children)
level 4
5TB peasant · 9 points · 23 days ago

Not the president.

See my second point.

Trump is literally powerless regarding current standing law. That's the checks part of the checks and balances.

level 5
-6 points · 22 days ago(0 children)
level 6

400.0 miles ≈ 643.7 kilometres 1 mile ≈ 1.6km

I'm a bot. Downvote to remove.



level 2
Original Poster · 6 points · 23 days ago

Denouncing the Berne Convention? Eh. Meanwhile: The USMCA and Copyright Reform: Who is Writing Canada’s Copyright Law Anyway?; A Mix Of Good And Bad Ideas In NAFTA Replacement; NAFTA Replacement Extends Canada's Copyright Term to Life +70 years.

EFF regularly has some call to action for copyright issues in USA where it's useful to call representatives, follow: https://www.eff.org/issues/innovation

level 2

Get trump cum out of your mouth dude. I get it you like him but calling him "Big Dick Dadyo" is pathetic. Seems like you want him to fuck you up your fat incel ass.

level 3

The Democrats whose dick is in Hollywood's mouth would make all copyright content in 20 years? Never in a million years.

level 2

Copyright terms are excessive today; most things are not commercially viable within a decade or two after release anyway. You're off your rocker if you think Trump would do anything about it, even if he could.
