{"posts":[{"no":108914628,"closed":1,"now":"05\/26\/26(Tue)21:24:02","name":"Anonymous","sub":"\/asdiq\/ Archiving, storage tech, development, in-depth history\/analysis, and questions general","com":"This is a general which is focused on archiving, but also interested in other related topics.<br><br>Storage technology and file sharing:<br>Hardware, software, services, shadow libraries, backups, home server, and networks such as tape drives, HDDs, file systems, archive.today, IPFS, Arweave, BitTorrent, etc.<br><br>Development:<br>Example topic: web archiving is much harder in 2026 compared to 2016. Too many websites are walled off by systems such as Cl0udflare, making it impossible for services such as web.archive.org, archive.is, and megalodon.jp to capture their webpages. That&#039;s a big chunk of important data that easily disappears with no web archive captures. We have to develop solutions to this, such as using the SingleFile extension and other stuff.<br><br>In-depth history:<br>Examples: get into the &quot;minutia and trivia&quot; about the history of websites and all the little changes, or, talk about more important web history events and future events such as sites closing.<br><br>Analysis:<br>Examples: analyzing files and folders that you obtained from scraping or data hoarding, or, what you&#039;re sad was lost and not archiving, what you&#039;re glad was archived.<br><br>Questions:<br>Ask whatever questions about any of this.","filename":"library of babel, ai image?","ext":".png","w":680,"h":680,"tn_w":250,"tn_h":250,"tim":1779845042635533,"time":1779845042,"md5":"nxE\/t158GKzCJSNJst1jEA==","fsize":948290,"resto":0,"archived":1,"bumplimit":0,"archived_on":1780418211,"imagelimit":0,"semantic_url":"asdiq-archiving-storage-tech-development-indepth","replies":126,"images":62,"tail_size":50},{"no":108914639,"now":"05\/26\/26(Tue)21:25:07","name":"Anonymous","com":"Inspirations for this general:<br><br><br>\/dhg\/ - Data Hoarding General<br><span 
class=\"quote\">&gt;Links<\/span><br><span class=\"quote\">&gt;Rentry: https:\/\/rentry.org\/dhg<\/span><br><span class=\"quote\">&gt;<\/span><br>What is \/dhg\/<br><span class=\"quote\">&gt;In this thread we discuss and create technology and software for data-hoarding, archiving, scripts, and more.<\/span><br><span class=\"quote\">&gt;<\/span><br><span class=\"quote\">&gt;gallery-dl - scrape images, manga, videos and more from many websites<\/span><br><span class=\"quote\">&gt;https:\/\/github.com\/mikf\/gallery-dl<wbr><\/span><br><span class=\"quote\">&gt;<\/span><br><span class=\"quote\">&gt;Hydrus Network<\/span><br><span class=\"quote\">&gt;https:\/\/hydrusnetwork.github.io\/hy<wbr>drus\/<\/span><br><span class=\"quote\">&gt;<\/span><br><span class=\"quote\">&gt;Stash<\/span><br><span class=\"quote\">&gt;https:\/\/github.com\/stashapp\/stash<\/span><br><span class=\"quote\">&gt;<\/span><br><span class=\"quote\">&gt;SmartImage<\/span><br><span class=\"quote\">&gt;https:\/\/github.com\/Decimation\/Smar<wbr>tImage<\/span><br><br><br>\/dapp\/ P2P Decentralized Applications General<br><span class=\"quote\">&gt;Share your favourite dapps here.<\/span><br><span class=\"quote\">&gt;<\/span><br><span class=\"quote\">&gt;Examples:<\/span><br><span class=\"quote\">&gt;<\/span><br><span class=\"quote\">&gt;brig https:\/\/brig.readthedocs.io\/<\/span><br><span class=\"quote\">&gt;ipfs https:\/\/ipfs.io\/<\/span><br><span class=\"quote\">&gt;ZeroNet https:\/\/zeronet.io\/<\/span><br><span class=\"quote\">&gt;Arweave https:\/\/github.com\/ArweaveTeam\/arwe<wbr>ave<\/span><br><span class=\"quote\">&gt;Gitopia https:\/\/gitopia.org\/<\/span><br><span class=\"quote\">&gt;BitTorrent<\/span><br><span class=\"quote\">&gt;<\/span><br><span class=\"quote\">&gt;Leave your suggestions below.<\/span><br><span class=\"quote\">&gt;<\/span><br><span class=\"quote\">&gt;These components collectively make up the future internet known as web3.<\/span><br><br><br>\/dshag\/ thread<br><span 
class=\"quote\">&gt;Data scraping, hoarding and analytics general thread.<\/span><br><span class=\"quote\">&gt;What are you scraping, hoarding or analyzing frens? Also post some pics so I can post them from next time, anime also works<\/span>","filename":"KPC-Blog-Tape-Library","ext":".jpg","w":1140,"h":502,"tn_w":125,"tn_h":55,"tim":1779845107517752,"time":1779845107,"md5":"A8aLCeuCOe+LR\/aw1O5uQg==","fsize":153774,"resto":108914628},{"no":108914651,"now":"05\/26\/26(Tue)21:28:08","name":"Anonymous","com":"I wanted to make this about more topics than just archiving and data hoarding as I don&#039;t think that attracts many posters.<br><br>Also, \/asdiq\/ sounds like &quot;ass dick&quot;. HAha, hope this general never dies. At least it isn&#039;t exactly another AI slop general.","filename":"tape-storage.net-assets-images-bg_a_01","ext":".jpg","w":870,"h":580,"tn_w":124,"tn_h":83,"tim":1779845288502330,"time":1779845288,"md5":"hI3qV1f35AxLylcgf275Rw==","fsize":424070,"resto":108914628},{"no":108914674,"now":"05\/26\/26(Tue)21:32:44","name":"Anonymous","com":"<a href=\"#p108914628\" class=\"quotelink\">&gt;&gt;108914628<\/a><br>Nice idea. I&#039;ve always felt that archival is going to become more and more important with the passage of time, especially in the face of rising storage costs, increasing surveillance and corporate greed.","time":1779845564,"resto":108914628},{"no":108914675,"now":"05\/26\/26(Tue)21:33:03","name":"Anonymous","com":"With IPFS gateways, I can have whatever URL path at https:\/\/site.com\/ipfs\/[cid]\/[path] or https:\/\/[cid].ipfs.site.com\/[path]<br><br>This is great, and directly helpful for archiving, but is there a way to have the URL contain a question mark? Not possible with ipfs gateways. 
Possible with a .onion site, but I don&#039;t want to use that anymore.<br><br>Do I really have to pay for some domain name so I can run this?:<br>https:\/\/site.com\/memento\/2026020304<wbr>0506\/https:\/\/othersite.com\/index.ph<wbr>p?id=123<br><br>(Using ipwb.)","time":1779845583,"resto":108914628},{"no":108914820,"now":"05\/26\/26(Tue)22:03:07","name":"Anonymous","com":"<a href=\"#p108914674\" class=\"quotelink\">&gt;&gt;108914674<\/a><br>Yup. Corporate greed makes grabbing some websites basically impossible. More reasons that archival becomes more important:<br><br>We live in the enshittification era of the Internet. Both web.archive.org and archive.org\/details\/ are enshittified procensorship hellholes that shouldn&#039;t be trusted. We need more alternatives and support for existing alternatives.<br><br>A year after BitTorrent was created, there was maybe tens or hundreds of terabytes of torrents. Decades later, that&#039;s ballooned into a much bigger and much harder to manage size if you want to capture a large part of it. Same can be said of other stuff. Many things drop off and are forever lost.<br><br>The world creates so much more data per year than it did last year. So far, it&#039;s an ever increasing trend. I learned that from reading about Filecoin (kinda sucks); I hope they finally got this FilBeam thing working:<br>https:\/\/docs.filecoin.cloud\/referen<wbr>ce\/filoz\/synapse-sdk\/filbeam\/toc\/","time":1779847387,"resto":108914628},{"no":108914940,"now":"05\/26\/26(Tue)22:22:40","name":"Anonymous","com":"<a href=\"#p108914675\" class=\"quotelink\">&gt;&gt;108914675<\/a><br>some chatgpt solutions:<br><br><span class=\"quote\">&gt;Encode the archived URL so it fits into path<\/span><br><span class=\"quote\">&gt;Instead of raw ?, encode the full target URL (base64, percent-encode, or use a path-safe encoding) and have your ipwb or handler decode it. This avoids needing special host handling. 
Example path: \/memento\/20260203040506\/https%3A%2F<wbr>%2Fothersite.com%2Findex.php%3Fid%3<wbr>D123<\/span><br><br><span class=\"quote\">&gt;Use a free TLS proxy (ngrok \/ localtunnel \/ Cloudflare Tunnel)<\/span><br><span class=\"quote\">&gt;Cloudflare Tunnel (free) with a free workers.dev or *.trycloudflare.com address can front your local ipwb server and accept queries. Ngrok has paid TLS subdomains for custom domains; free subdomains rotate.<\/span>","time":1779848560,"resto":108914628},{"no":108914954,"now":"05\/26\/26(Tue)22:24:59","name":"Anonymous","com":"How hard is it to have a hard drive and a pi running on a crt 24\/7 ish simulating say Boomerang AMC reruns but instead of shitty old TV my favorite phonepost doomscrolls?","filename":"a14031aaeb14cc901f60104c3c9c8baf","ext":".gif","w":498,"h":282,"tn_w":125,"tn_h":70,"tim":1779848699459637,"time":1779848699,"md5":"3BMFtY5DFLTCxE0Nb29bjw==","fsize":1054167,"resto":108914628},{"no":108914975,"now":"05\/26\/26(Tue)22:30:01","name":"Anonymous","com":"<a href=\"#p108914954\" class=\"quotelink\">&gt;&gt;108914954<\/a><br>Sounds fairly easy once you have all the hardware and connectors to the CRT TV.<br><br>Collection of images named this<br>img001.jpg<br>img002.jpg<br>img003.png<br>...<br>(GIF probably also works)<br><br>Then<br>ffmpeg -framerate 1\/6 -i img%03d.jpg -c:v libx264 -r 30 -pix_fmt yuv420p out.mp4<br><br>Then play the &quot;out.mp4&quot; video. Done, slideshow of images at 6 seconds per image.<br><br>Reminds me of my time copying VHS tapes to DVDs. I could say more about that.","time":1779849001,"resto":108914628},{"no":108915667,"now":"05\/27\/26(Wed)01:38:45","name":"Anonymous","com":"<a href=\"#p108914940\" class=\"quotelink\">&gt;&gt;108914940<\/a><br>Percent encoding method didn&#039;t work (I think I knew this months ago but forgot). 
https:\/\/archive.is\/hFLPb is proof that it fails.<br><br>A file named<br>&quot;https%3A%2F%2Fsite.com%2Findex.php<wbr>%3Fpage%3Dpost%26s%3Dview%26id%3D12<wbr>345679&quot;<br><br>Becomes this in a gateway (double percent encoded):<br>https:\/\/[cid].ipfs.ipfs-02.hypha.co<wbr>op\/memento\/20260527051814\/https%253<wbr>A%252F%252Fsite.com%252Findex.php%2<wbr>53Fpage%253Dpost%2526s%253Dview%252<wbr>6id%253D12345679<br><br>We need it to be \/memento\/20260203040506\/https:\/\/oth<wbr>ersite.com\/index.php?id=123 (or single percent encoded?) so archive.today can index it to othersite.com and not just *.hypha.coop","time":1779860325,"resto":108914628},{"no":108915771,"now":"05\/27\/26(Wed)02:05:26","name":"Anonymous","com":"<a href=\"#p108914940\" class=\"quotelink\">&gt;&gt;108914940<\/a><br><span class=\"quote\">&gt;localtunnel<\/span><br>This would be fuckin dope if it worked with no walls:<br><span class=\"quote\">&gt;https:\/\/theboroer.github.io\/localt<wbr>unnel-www\/<\/span><br><span class=\"quote\">&gt;$ sudo npm install -g localtunnel<\/span><br><span class=\"quote\">&gt;$ ipfs daemon &amp;<\/span><br><span class=\"quote\">&gt;$ ipwb replay 20260527051814-https---rule34.xxx-i<wbr>ndex.php-page-post-s-view-id-136567<wbr>08.cdxj &amp;<\/span><br><span class=\"quote\">&gt;$ lt --port 2016<\/span><br><br>I got the random tranny porn web capture to show up in clearweb at<br>https:\/\/tidy-meals-feel.loca.lt\/mem<wbr>ento\/20260527051814\/https:\/\/rule34.<wbr>xxx\/index.php?page=post&amp;s=view&amp;id=1<wbr>3656708<br><br>BUT ONLY after clicking\/copy-pasting on some verification shit. Works flawlessly if using a .onion site:<br>https:\/\/archive.is\/ysIMX<br><br>but I said I didn&#039;t want to use that anymore.","time":1779861926,"resto":108914628},{"no":108915974,"now":"05\/27\/26(Wed)02:58:40","name":"Anonymous","com":"<a href=\"#p108915771\" class=\"quotelink\">&gt;&gt;108915771<\/a><br>It&#039;s sad that the Tor2clearweb gateways have all went extinct. 
I could have used those. I&#039;m now trying to use this thing:<br>https:\/\/localxpose.io\/apps\/nginx<br><br>Works:<br><span class=\"quote\">&gt;$ sudo npm install -g loclx<\/span><br><br>Fails:<br><span class=\"quote\">&gt;$ loclx tunnel http --to http:\/\/localhost:2016<\/span><br><span class=\"quote\">&gt;bash: loclx: command not found<\/span><br><span class=\"quote\">&gt;$ sudo npx loclx tunnel http --to http:\/\/localhost:2016<\/span><br><span class=\"quote\">&gt;sh: line 1: loclx: command not found<\/span><br><br>Works?<br><span class=\"quote\">&gt;$ npm config set prefix &quot;$HOME\/.local&quot;; npm install -g loclx<\/span>","time":1779865120,"resto":108914628},{"no":108916079,"now":"05\/27\/26(Wed)03:34:15","name":"Anonymous","com":"Archive-related news:<br><br>Deathwatch<br><span class=\"quote\">&gt;https:\/\/wiki.archiveteam.org\/index<wbr>.php\/Deathwatch#2026-05<\/span><br><span class=\"quote\">&gt;May: Bucknell University Press will close at end of the 2025-26 school year.[61]<\/span><br><span class=\"quote\">&gt;May: The Primary School will close at end of the 2025-26 school year.[62]<\/span><br><span class=\"quote\">&gt;May: Sterling College will close at the end of the 2025-26 school year.[63]<\/span><br><span class=\"quote\">&gt;May: Trinity Christian College will close at the end of the 2025-26 school year.[64]<\/span><br><span class=\"quote\">&gt;2026-05-31: University of Houston Digital History will close.<\/span><br><span class=\"quote\">&gt;2026-05-31: Tistory will remove all uploaded videos.<\/span><br><span class=\"quote\">&gt;2026-05-31: plus a, a site documenting theater, will shut down.[65]<\/span><br><span class=\"quote\">&gt;2026-05-30: https:\/\/minelli.fr\/[66]<\/span><br><span class=\"quote\">&gt;2026-05-30: ruru-jinro.net, ruru-jinro is an online Japanese werewolf game server that has been operating since May 2009. 
It is scheduled to close on May 30th (JST).[67]<\/span><br><span class=\"quote\">&gt;2026-05-29: Tele2 will be discontinued by it parent company Odido[68]<\/span><br><span class=\"quote\">&gt;2026-05-28: NIKKEI COMPASS will close service.[69]<\/span><br><br>Silicon Valley VCs Invest in Head-Mounted Cameras on Workers in India For Training AI<br><span class=\"quote\">&gt;https:\/\/web.archive.org\/web\/202605<wbr>27022137\/https:\/\/gizmodo.com\/silico<wbr>n-valley-vc-backs-startup-that-gath<wbr>ers-ai-datasets-from-head-mounted-c<wbr>ameras-on-workers-in-india-20007610<wbr>62<\/span><br><span class=\"quote\">&gt;Human Archive believes its technology &quot;will become foundational infrastructure for automating manual labor.&quot;<\/span><br><span class=\"quote\">&gt;A video went viral in India about a month ago appearing to show a vast number of garment workers wearing tiny, head-mounted cameras while they worked in a dreary-looking factory. A widespread hunch was the technology the video depicted was a system for what\u2019s known as egocentric data collection\u2014gathering first-person footage of people in action to train AI models, in order to replace the workers with robots. But it wasn\u2019t totally clear if the video was real, let alone if the footage would or could be used to replace the workers.<\/span>","time":1779867255,"resto":108914628},{"no":108916160,"now":"05\/27\/26(Wed)03:50:46","name":"Anonymous","com":"Is there a localhost to Internet thing which doesn&#039;t suck? Hoping one exists that doesn&#039;t require a login\/verification. 
Such things did in fact exist in the past: see Tor2web and <a href=\"#p108915771\" class=\"quotelink\">&gt;&gt;108915771<\/a> before it required said verification.<br><br>Otherwise, I&#039;ll have to make account(s) and pay for it.<br><br><a href=\"#p108915974\" class=\"quotelink\">&gt;&gt;108915974<\/a><br><span class=\"quote\">&gt;Works?<\/span><br>Nope:<br><span class=\"quote\">&gt;$ ~\/.local\/lib\/node_modules\/loclx\/bin<wbr>\/loclx tunnel http --to http:\/\/localhost:2016<\/span><br><span class=\"quote\">&gt;Error: unauthenticated access<\/span>","time":1779868246,"resto":108914628},{"no":108916412,"now":"05\/27\/26(Wed)04:48:22","name":"Anonymous","com":"This is an email from 1995-12-26 10:46. It has the subject line &quot;Red Neck&quot;.<br><br>This image was deleted off of https:\/\/archive.org\/details\/ because that website is ran by petty fucks.<br><br>Full\/original image in ar:\/\/:<br>- meta: https:\/\/thuanannew1.store\/raw\/B99wT<wbr>2us-zAEYox4b1tSVGpgwYGw_N5V5XRNlKjQ<wbr>UvM<br>- data: https:\/\/bienchecung.store\/raw\/LaM_O<wbr>MXzH7bxlANb9_K_IF8u9F7F-kg3KfpN0W66<wbr>q0k","filename":"from_QmRVN5FTsjth7YxQppaiLYrkXs8qnMybC4QFLmGZUkcHEm","ext":".jpg","w":1524,"h":2032,"tn_w":93,"tn_h":125,"tim":1779871702717694,"time":1779871702,"md5":"AO36LbVTzTj0UiQv612xFA==","fsize":2643249,"resto":108914628},{"no":108916448,"now":"05\/27\/26(Wed)04:57:32","name":"Anonymous","com":"<a href=\"#p108914639\" class=\"quotelink\">&gt;&gt;108914639<\/a><br>Forgot about this general which I first saw months ago:<br><br>\/AAD\/ - Archiving And Donating computer resources general<br><a href=\"\/g\/thread\/108890811#p108890811\" rel=\"nofollow ugc\" class=\"quotelink\">&gt;&gt;108890811<\/a><br><br>Most recent thread in that general died in 2026-05-24:<br>https:\/\/desuarchive.org\/g\/thread\/10<wbr>8890811\/<br><br>Last post was:<br><span class=\"quote\">&gt;Another bump. I just wanted to say that I can&#039;t live without the Wayback machine anymore. 
I&#039;m working on a project that often involves dead links and it would have been far more difficult to complete without it, maybe impossible. Whatever happened to &quot;if it was uploaded to the Internet, it&#039;s there forever&quot; or however the saying went?<\/span>","time":1779872252,"resto":108914628},{"no":108916490,"now":"05\/27\/26(Wed)05:09:42","name":"Anonymous","com":"<a href=\"#p108914639\" class=\"quotelink\">&gt;&gt;108914639<\/a><br><span class=\"quote\">&gt;https:\/\/ipfs.io\/<\/span><br>Sadly, since May 13, 2026 all of the https:\/\/ipfs.io\/ipfs\/[cid] links redirect to<br><span class=\"quote\">&gt;title: IPFS Service Worker Gateway | HEAD@[7 hex characters]<\/span><br><span class=\"quote\">&gt;url: https:\/\/[cid].ipfs.inbrowser.link\/<\/span><br>which is an inferior IPFS gateway.","time":1779872982,"resto":108914628},{"no":108916825,"now":"05\/27\/26(Wed)06:39:09","name":"Anonymous","com":"A month or two ago, I bought a used 4-TB HDD for 15 USD per terabyte. I have reason to believe that it was only lightly used. I catted it all out to \/dev\/null and saw no storage medium errors. 
U jelly?","time":1779878349,"resto":108914628},{"no":108917635,"now":"05\/27\/26(Wed)08:56:46","name":"Anonymous","com":"<a href=\"#p108914628\" class=\"quotelink\">&gt;&gt;108914628<\/a><br><span class=\"quote\">&gt;hoard a bunch of shit in 2012<\/span><br><span class=\"quote\">&gt;it just lies on the NAS for over a decade, providing zero value to anybody<\/span><br>idk man, the zombie apocalypse just aint coming","time":1779886606,"resto":108914628},{"no":108919012,"now":"05\/27\/26(Wed)12:32:43","name":"Anonymous","com":"<a href=\"#p108917635\" class=\"quotelink\">&gt;&gt;108917635<\/a><br>Breakdown of what you have?<br><br>I could think of value that it has such as<br>- deleted YouTube videos<br>- torrents which are dead now","time":1779899563,"resto":108914628},{"no":108920500,"now":"05\/27\/26(Wed)15:33:16","name":"Anonymous","com":"<a href=\"#p108914940\" class=\"quotelink\">&gt;&gt;108914940<\/a><br><span class=\"quote\">&gt;solutions<\/span><br>Another one would be to enable port forwarding in the router. I don&#039;t want to do that.<br><br><span class=\"quote\">&gt;Cloudflare Tunnel<\/span><br>This NetworkChuck idiot spergs out about how wonderful that is even though you have to put in credit card info for their free tier:<br><br>&quot;EXPOSE your home network to the INTERNET!! 
(it&#039;s safe)&quot;<br>https:\/\/www.youtube.com\/watch?v=ey4<wbr>u7OUAF3c","time":1779910396,"resto":108914628},{"no":108921219,"now":"05\/27\/26(Wed)17:05:35","name":"Anonymous","com":"<a href=\"#p108920500\" class=\"quotelink\">&gt;&gt;108920500<\/a><br>Went with ngrok, but it&#039;s not working in a weird way.<br><br>Worked:<br><span class=\"quote\">&gt;$ # made an account, use a password manager if you&#039;re not a fucking retard<\/span><br><span class=\"quote\">&gt;$ pass generate me@email.com@ngrok.com 28<\/span><br><span class=\"quote\">&gt;$ pass show me@email.com@ngrok.com | xsel -ib<\/span><br><span class=\"quote\">&gt;$ # Run the ngrok program<\/span><br><span class=\"quote\">&gt;$ wget https:\/\/bin.ngrok.com\/c\/bNyj1mQVY4c<wbr>\/ngrok-v3-stable-linux-amd64.tgz<\/span><br><span class=\"quote\">&gt;$ sudo tar -xvzf ~\/Downloads\/ngrok-v3-stable-linux-a<wbr>md64.tgz -C \/usr\/local\/bin<\/span><br><span class=\"quote\">&gt;$ ngrok config add-authtoken $str # https:\/\/dashboard.ngrok.com\/get-sta<wbr>rted\/setup\/linux<\/span><br><span class=\"quote\">&gt;$ ngrok http 80 # or port 2016 or port 8080<\/span><br><br>Failure:<br>Nothing shows up at https:\/\/directed-snoring-available.<wbr>ngrok-free.dev\/<br><br>Debug:<br>Running &quot;ngrok diagnose&quot; says this at the end<br><span class=\"quote\">&gt;Report written to \/tmp\/ngrok-diagnose1685308347\/diagn<wbr>ose.json<\/span><br><span class=\"quote\">&gt;ERROR: Error establishing ngrok connection:<\/span><br><span class=\"quote\">&gt;ERROR: No tunnel servers were reachable via TCP.<\/span><br><span class=\"quote\">&gt;ERROR: (ERR_NGROK_8007)<\/span><br><span class=\"quote\">&gt;ERROR: https:\/\/ngrok.com\/docs\/errors\/(err_<wbr>ngrok_8007)<\/span><br><br>(Doing this just to have web archive captures from &quot;?&quot;-containing-URLs show up in 
clearweb.)","filename":"1682642570004","ext":".png","w":1120,"h":495,"tn_w":125,"tn_h":55,"tim":1779915935412387,"time":1779915935,"md5":"2j6yqUaK6LcSIZVRI60T+A==","fsize":205756,"resto":108914628},{"no":108921969,"now":"05\/27\/26(Wed)19:07:46","name":"Anonymous","com":"web.archive.org has excluded approximately 2049 websites (pic related).<br><br>archive.today has excluded approximately 3 websites.<br><br><a href=\"#p108921219\" class=\"quotelink\">&gt;&gt;108921219<\/a><br>Seems to be an old binary executable which connects to addresses which aren&#039;t there anymore: command &quot;ngrok diagnose&quot; said<br><span class=\"quote\">&gt;dial tcp 54.176.167.82:443 (connect.ngrok-agent.com): i\/o timeout<\/span><br><span class=\"quote\">&gt;dial tcp 52.53.56.252:443 (connect.ngrok-agent.com): i\/o timeout<\/span><br><span class=\"quote\">&gt;dial tcp 54.193.166.121:443 (connect.ngrok-agent.com): i\/o timeout<\/span><br><span class=\"quote\">&gt;dial tcp 52.53.75.151:443 (connect.ngrok-agent.com): i\/o timeout<\/span><br><span class=\"quote\">&gt;dial tcp 204.236.189.107:443 (connect.ngrok-agent.com): i\/o timeout<\/span><br><span class=\"quote\">&gt;dial tcp 52.9.131.203:443 (connect.ngrok-agent.com): i\/o timeout<\/span><br><br>The update command checks a 404&#039;d page:<br><span class=\"quote\">&gt;$ ngrok update<\/span><br><span class=\"quote\">&gt;[ https:\/\/update.ngrok-agent.com\/chec<wbr>k = HTTP 404 ]<\/span><br><span class=\"quote\">&gt;$ ngrok --version<\/span><br><span class=\"quote\">&gt;ngrok version 3.39.5<\/span><br><br>Fairly useless info at https:\/\/ngrok.com\/docs\/errors\/err_n<wbr>grok_8007 - it should say &quot;Maybe those server addresses aren&#039;t being used by ngrok anymore.&quot;<br><br>Could run it via docker instead. 
Run &quot;docker pull ngrok\/ngrok&quot; and so on.","filename":"fXH16","ext":".png","w":1024,"h":768,"tn_w":125,"tn_h":93,"tim":1779923266571587,"time":1779923266,"md5":"aC7PQRGCbP2+KZ7cJ76Bcw==","fsize":49046,"resto":108914628},{"no":108923830,"now":"05\/28\/26(Thu)02:28:36","name":"Anonymous","com":"If you load a blog.csdn.net webpage in web.archive.org, it&#039;ll redirect to https:\/\/www.csdn.net\/ ( example: https:\/\/archive.is\/ezhfP ). Therefore, archive.today can&#039;t get a copy of it.<br><br>Solution = use SingleFile+ipfs(+ipwb+Tor):<br>https:\/\/archive.is\/JlGGK<br><br><a href=\"#p108921969\" class=\"quotelink\">&gt;&gt;108921969<\/a><br><span class=\"quote\">&gt;Could run it via docker instead<\/span><br>Latest docker image is also version 3.39.5. This thing still isn&#039;t working:<br><span class=\"quote\">&gt;$ # https:\/\/dashboard.ngrok.com\/get-sta<wbr>rted\/setup\/docker<\/span><br><span class=\"quote\">&gt;$ docker run --net=host -it -e NGROK_AUTHTOKEN=$str ngrok\/ngrok:latest http --url=directed-snoring-available.ng<wbr>rok-free.dev 8080<\/span><br>Going to directed-snoring-available.ngrok-fr<wbr>ee.dev with noscript:<br><span class=\"quote\">&gt;You are about to visit directed-snoring-available.ngrok-fr<wbr>ee.dev, served by [IPv6 address]. This website is served for free through ngrok.com. You should only visit this website if you trust whoever sent the link to you. 
(ERR_NGROK_6024)<\/span><br>which is boilerplate.<br><br>I could try installing the software on another computer and see if that works.","filename":"JlGGK","ext":".png","w":1024,"h":768,"tn_w":125,"tn_h":93,"tim":1779949716317297,"time":1779949716,"md5":"McGTfJkwKA5VR5qN79QZSQ==","fsize":100300,"resto":108914628},{"no":108923876,"now":"05\/28\/26(Thu)02:42:05","name":"Anonymous","com":"This jeet talks about how sites are angry because people\/bots are using web.archive.org to bypass the original sites&#039; rate limiting or other annoying restrictions:<br><br>&quot;AI Companies Are Killing The Internet Archive...&quot;<br>https:\/\/www.youtube.com\/watch?v=WsY<wbr>XXFT9SiM<br><br>Inb4 all the original sites request that their website be removed from Wayback Machine (WBM), then WBM complies because they&#039;re procensorship.<br><br><a href=\"#p108923830\" class=\"quotelink\">&gt;&gt;108923830<\/a><br><span class=\"quote\">&gt;https:\/\/archive.is\/JlGGK<\/span><br><span class=\"quote\">&gt;Original: https:\/\/blog.csdn.net\/qq_33472553\/a<wbr>rticle\/details\/143965935<\/span><br><span class=\"quote\">&gt;28 May 2026 06:23:27 UTC<\/span><br>Not entirely correct. 
The .cdxj file was originally this<br><span class=\"quote\">&gt;[...]&quot;original_uri&quot;: &quot;https:\/\/web.archive.org\/web\/202506<wbr>29102618\/https:\/\/blog.csdn.net\/qq_3<wbr>3472553\/article\/details\/143965935 [...]<\/span><br>had to remove all references to web.archive.org; otherwise, ipwb wouldn&#039;t work.<br><br>I forgot to change 20260528055120 to 20250629102618 in the .cdxj, oops.","time":1779950525,"resto":108914628},{"no":108923986,"now":"05\/28\/26(Thu)03:05:58","name":"Anonymous","com":"<a href=\"#p108914639\" class=\"quotelink\">&gt;&gt;108914639<\/a><br>Saw another one:<br><br><span class=\"quote\">&gt;https:\/\/web.archive.org\/web\/202605<wbr>28065757\/https:\/\/desuarchive.org\/g\/<wbr>thread\/79634154\/<\/span><br><span class=\"quote\">&gt;Anonymous Mon 11 Jan 2021 02:56:41 No.79634154 View ViewReport<\/span><br><span class=\"quote\">&gt;<\/span><br><span class=\"quote\">&gt;\/web archiving general\/<\/span><br><span class=\"quote\">&gt;<\/span><br><span class=\"quote\">&gt;what snap app would you recommend to save a web2.0 (or 3.0) with all the markup CSS JavaScript and html5 things working ?<\/span><br><span class=\"quote\">&gt;<\/span><br><span class=\"quote\">&gt;&gt;like YouTube dl but for the whole in browser page<\/span><br><span class=\"quote\">&gt;&gt;freezing ffox or chromium state and storing the page perpetually would theoritically do the same but I&#039;m a math grad so I can&#039;t do shit<\/span><br><span class=\"quote\">&gt;<\/span><br><span class=\"quote\">&gt;discuss your web crawlers, what zone of the web do you consider more important, unexplored or simply most lulz worthy and how do you explore and store it.<\/span><br><span class=\"quote\">&gt;<\/span><br><span class=\"quote\">&gt;&gt;data hoarders and lifehack savers welcomed<\/span><br>Some one replied:<br><span class=\"quote\">&gt;https:\/\/github.com\/Y2Z\/monolith<\/span><br>Description of that:<br><span class=\"quote\">&gt;CLI tool and library for saving 
complete web pages as a single HTML file<\/span><br><br>So that&#039;s basically SingleFile-CLI, but a different project.","filename":"XjYsi","ext":".png","w":1024,"h":768,"tn_w":125,"tn_h":93,"tim":1779951958788205,"time":1779951958,"md5":"yuJgAPpztLe9OgH48+IDxA==","fsize":31809,"resto":108914628},{"no":108924009,"now":"05\/28\/26(Thu)03:10:04","name":"Anonymous","com":"<a href=\"#p108923986\" class=\"quotelink\">&gt;&gt;108923986<\/a><br><span class=\"quote\">&gt;tab rehab<\/span><br>LOL","time":1779952204,"resto":108914628},{"no":108924128,"now":"05\/28\/26(Thu)03:38:58","name":"Anonymous","com":"<a href=\"#p108924009\" class=\"quotelink\">&gt;&gt;108924009<\/a><br>tabhab","time":1779953938,"resto":108914628},{"no":108925208,"now":"05\/28\/26(Thu)07:56:53","name":"Anonymous","com":"This is a screenshot of a website about optical illusions. This file has a timestamp of 2007-09-04!<br><br>This image was deleted off of https:\/\/archive.org\/details\/ because that website is ran by absolute turds.<br><br>An amount of ETH was spent to upload this pic to a better archival system; here&#039;s the TX:<br>https:\/\/superstone.site\/raw\/IvNMUsM<wbr>-eqJkV-Ru20P6OMEUAhAmIjq5iWPZcIAjEF<wbr>Y","filename":"Homepage - Dragon Illusion @ Grand Illusions","ext":".png","w":1007,"h":1075,"tn_w":117,"tn_h":125,"tim":1779969413847234,"time":1779969413,"md5":"4K9u5AQVj3li1cjhgi92Hg==","fsize":428805,"resto":108914628},{"no":108925288,"now":"05\/28\/26(Thu)08:18:10","name":"Anonymous","com":"First time hearing that PDF files could be uploaded to 4chan was when the 4chan hack happened, done by syjak(s).<br><br>Here&#039;s a .pdf that was posted to <a href=\"\/\/boards.4chan.org\/tg\/\" class=\"quotelink\">&gt;&gt;&gt;\/tg\/<\/a>:<br>https:\/\/desu-usergeneratedcontent.x<wbr>yz\/tg\/image\/1515\/63\/1515633945730.p<wbr>df<br><br>It&#039;s &quot;The Library of Babel, by Jorge Luis Borges (1941)&quot;. Only 8 pages, go read it. 
That essay or short story is related to archiving.","filename":"1502190998919","ext":".png","w":725,"h":750,"tn_w":120,"tn_h":125,"tim":1779970690318265,"time":1779970690,"md5":"5rvNumWd\/vPY6a3rBt9TYQ==","fsize":37352,"resto":108914628},{"no":108926406,"now":"05\/28\/26(Thu)11:37:46","name":"Anonymous","com":"<a href=\"#p108921969\" class=\"quotelink\">&gt;&gt;108921969<\/a><br>It might be true about the exclusions, but 9\/10 times a page I am looking for, if it is archived anywhere, will be on the wayback machine and not on archive.today. It&#039;s a pretty wide variety of sites I am looking at, too. I think it&#039;s because wayback uses a crawler, I&#039;m not sure archive.today does or whether it&#039;s all just manual. If they have a crawler it seems much worse than the wayback one.","time":1779982666,"resto":108914628},{"no":108926979,"now":"05\/28\/26(Thu)13:12:47","name":"Anonymous","com":"<a href=\"#p108926406\" class=\"quotelink\">&gt;&gt;108926406<\/a><br>First, we must understand that<br>- archive.today uses a &quot;frozen page&quot; system (similar to SingleFile and Monolith)<br>- web.archive.org uses a WARC-based system (WARCs are used by or created by grab-site, GNU Wget, InterPlanetary Wayback \/ ipwb which also uses IPFS, etc.)<br>- both use browsers to capture web data, we know that at least web.archive.org uses non-browser tools as well<br><br>The last time Wayback Machine MAYBE used its own freestanding crawler was years ago, like a decade ago. Ever since then they only get web data from:<br>- Save Page Now (SPN): users go to their site and manually, one-by-one, submit URLs to be saved<br>- ArchiveTeam: they run a distributed virtual machine system that people use to mass download websites that are scheduled to be shut down or something. This &quot;ArchiveTeam Warrior&quot; software uses grab-site internally or something (I know the VM is based on Alpine Linux). 
I stopped caring to do that anymore due to my own petty dislike of ArchiveTeam&#039;s IRC channels; they have Reddit-tier mods, so fuck them.<br>- Common Crawl: terabytes (or petabytes?) of web crawl data! They should have grabbed websites harder because I know of so many websites\/pages\/forums that they&#039;re missing.<br><br>Open up a Wayback Machine (WBM) capture and click the &quot;About&quot; button\/link on the timeline thing at the top. It&#039;ll tell you the &quot;Why?&quot;: if this capture was created by an ArchiveTeam project, if the capture was created by SPN, if the capture was created by something else.<br><br>All the fucking AI sloppers scraped the web in an unethical way. What they should have done, especially in the early days of AI:<br>1. Used grab-site to download many webpages; this creates many .warc.gz files<br>2. Donated that data to archive.org (the pages may also be allowed to show up in WBM)<br><br>It&#039;s that simple, but they were greedy fucks. Also WBM and Internet Archive (IA) sucks balls so fuck them as well.","time":1779988367,"resto":108914628},{"no":108927013,"now":"05\/28\/26(Thu)13:18:01","name":"Anonymous","com":"<a href=\"#p108926979\" class=\"quotelink\">&gt;&gt;108926979<\/a><br>Interesting anon, I appreciate the explanation. Yeah sucks that AI providers couldn&#039;t donate all their scraping back but I would have been surprised if they did, desu.<br><span class=\"quote\">&gt;Also WBM and Internet Archive (IA) sucks balls<\/span><br>Because of the exclusions\/censorship?","time":1779988681,"resto":108914628},{"no":108927102,"now":"05\/28\/26(Thu)13:29:16","name":"Anonymous","com":"archive.today is 100% manual, one-by-one user submitted<br><br>wayback machine maybe ran its own crawler years ago, but they mainly get data from other people&#039;s and other organizations&#039; crawls + manual one-by-one user submissions<br><br>WARC-based archival systems are ultimately better than frozen page archival systems. 
There&#039;s pros and cons:<br>- pro: WARC has much better one-to-one correspondence to the original web raws and server headers<br>- con: WARC is usually based on CLI tools, and sometimes it&#039;s impossible for those to grab pages running some Cuckflare or Anubis anti-bot\/anti-archival thing<br>- pro: Frozen can have better archival fixity<br>- con: Frozen has no server headers saved<br>- con: Frozen doesn&#039;t work in any of the WARC \/ web replay systems without extra programming and working on it<br>- con: Frozen has no or little record of the functionality of JavaScript and WebAssembly<br><br>I could maybe or probably go on and on about this stuff. Oh, one time I was talking to this retarded furfag who said that he&#039;d rather have the web page raws and not the .warc.gz files. What a dumb bastard. The furry like persistently argued that having all the uncompressed data as .html, .css, .js, etc. was better; he said he didn&#039;t care about server headers and the metadata found in WARC files. One thing to realize is if it was all uncompressed as .html\/.css\/.js\/etc. then many times things would simply not work. You need the WARCs + a WARC replay system to correctly have the web data replay and not be broken. 
And sometimes it&#039;s necessary like for https:\/\/dropbox.com\/example?fileId=<wbr>892182189892898 where the MEANINGFUL filename is only in the server header otherwise you have some random filename like &quot;892182189892898&quot; if you only have the web raws.<br><br>So if you opened https:\/\/dropbox.com\/example?fileId=<wbr>892182189892898 in the replay system (an custom Electron\/Chrome browser or something) it would say &quot;where do you want to download file &#039;2026-04-26-091018_1280x1024_scrot.<wbr>png&#039;?&quot; or whatever the filename is.","time":1779989356,"resto":108914628},{"no":108927201,"now":"05\/28\/26(Thu)13:44:43","name":"Anonymous","com":"<a href=\"#p108927013\" class=\"quotelink\">&gt;&gt;108927013<\/a><br><span class=\"quote\">&gt;Because of the exclusions\/censorship?<\/span><br>Yes. Internet Archive is literally and figuratively ran by trannies. What do trannies do? They erase history and support censorship. The https:\/\/archive.org\/details\/ section of their shitty website operates much like JewTube: if they dislike you then they will delete you account or most of its items.<br><br>They mass delete thousands of archive.org\/details\/ items for completely asinine or hypocritical reasons. One time the upper-level IA turd(s) said in a blog post that the Wayback Machine is the jewel of IA. Like I was saying, WBM is WARC-based. The IA jannies deleted multiple archive.org\/details\/ items which were fully WARC grabs of entire websites. They are discriminatory against anyone who isn&#039;t their ArchiveTeam buttbuddies. 
So that&#039;s terabytes of web data -- which they basically said they value the most, especially it it&#039;s .warc files -- lost, because it was uploaded by non-ArchiveTeam accounts.<br><br>All WARCs from non-ArchiveTeam accounts uploaded to IA were ingressed into WBM; that ended in some year, maybe 2015 due to distrusting &quot;Internet randos&quot;, they didn&#039;t want them to modify the WARC data to make fake web archive captures. Sounds like stupid elitism; WBM could have a thing where you click on &quot;About&quot; in the capture and it says &quot;from a non-ArchiveTeam WARC&quot;. archive.today chads keep winning (<a href=\"#p108923876\" class=\"quotelink\">&gt;&gt;108923876<\/a>): they have a thing which works with memento URLs.<br><br><a href=\"#p108927102\" class=\"quotelink\">&gt;&gt;108927102<\/a><br>Rather &quot;+ manual one-by-one user submissions&quot;<br><br>About the furry who stupidly wanted an HTTrack-style copy of the website: at least he was interested.","time":1779990283,"resto":108914628},{"no":108927262,"now":"05\/28\/26(Thu)13:53:18","name":"Anonymous","com":"Man, saucenao can&#039;t even find pictures from Pixiv anymore. 
The walls are getting bad bros.","filename":"1763211459008515","ext":".jpg","w":1719,"h":1240,"tn_w":125,"tn_h":90,"tim":1779990798402678,"time":1779990798,"md5":"x\/pNxdt5j7817nF2fxezdA==","fsize":1385165,"resto":108914628},{"no":108927266,"now":"05\/28\/26(Thu)13:53:42","name":"Anonymous","com":"<a href=\"#p108927201\" class=\"quotelink\">&gt;&gt;108927201<\/a><br>Rather &quot;archive.org\/details\/ section of their shitty website operates much like JewTube: if they dislike you then they will delete your account&quot;<br><br>Rather &quot;IA jannies deleted multiple archive.org\/details\/ items which were full WARC grabs&quot;<br><br>Rather &quot;they value the most, especially if it&#039;s .warc files&quot;<br><br><span class=\"quote\">&gt;dumb furry<\/span><br>We were talking about a 1-TB torrent of a full-site WARC of a shutdown website. That torrent data was uploaded to archive.org\/details\/ and subsequently deleted.<br><br><a href=\"#p108916448\" class=\"quotelink\">&gt;&gt;108916448<\/a><br>\/AAD\/ is like the pro-Archive.org general.<br><br>\/asdiq\/ is or can be the anti-Archive.org general. Archival underground or something.<br><br>I wish either general would have a million posters as Friendly GNU\/Linux Thread general and other &gt;&gt;&gt;\/g generals have. Unfortunately, I don&#039;t think that will happen as I think there&#039;s little interest in archiving and we archivists will remain in the minority. All based on what I&#039;ve observed with forum posters&#039; and peoples&#039; interests in archiving.","time":1779990822,"resto":108914628},{"no":108929045,"now":"05\/28\/26(Thu)18:07:20","name":"Anonymous","com":"<a href=\"#p108914628\" class=\"quotelink\">&gt;&gt;108914628<\/a><br><span class=\"quote\">&gt;storage tech and questions<\/span><br>I found an old file container that I think is TrueCrypt, I know the pass is less than 10 characters but can&#039;t remember it. How do I brute force mounting it? 
I&#039;m pretty sure of the first 3 characters so that only makes the length at most 7 chars so should not take too long since I also know it only has English alphabet characters.","time":1780006040,"resto":108914628},{"no":108929172,"now":"05\/28\/26(Thu)18:23:14","name":"Anonymous","com":"<a href=\"#p108929045\" class=\"quotelink\">&gt;&gt;108929045<\/a><br><span class=\"quote\">&gt;only has English alphabet characters<\/span><br>All lower case? If so, then that&#039;s 26^7 which is 8,031,810,176 permutations. 8 billion different permutations would be done shortly, as long as the thing handling the password doesn&#039;t make you wait 1 second between each attempt.<br><br>(Made me think of the total amount of Monero hashes I&#039;ve checked\/mined over the months: 272,989,758,847, which is 273 billion, but the system was designed so that it takes a while to calculate each one.)","time":1780006994,"resto":108914628},{"no":108929245,"now":"05\/28\/26(Thu)18:35:32","name":"Anonymous","com":"<a href=\"#p108929172\" class=\"quotelink\">&gt;&gt;108929172<\/a><br>I think both upper and lower unfortunately but if I&#039;m right and it&#039;s a TrueCrypt container it shouldn&#039;t take long to try one password. 
Would love a GPU brute-forcer program if there is one.","time":1780007732,"resto":108914628},{"no":108930648,"now":"05\/28\/26(Thu)23:08:13","name":"Anonymous","com":"Ways to upload to arweave from a CLI, using an API or something?","time":1780024093,"resto":108914628},{"no":108930781,"now":"05\/28\/26(Thu)23:36:44","name":"Anonymous","com":"<a href=\"#p108925288\" class=\"quotelink\">&gt;&gt;108925288<\/a><br><span class=\"quote\">&gt;The Library of Babel, by Jorge Luis Borges (1941)<\/span><br><span class=\"quote\">&gt;Like all men of the Library, I have traveled in my youth; I have wandered in search of a book, perhaps the catalogue of catalogues; now that my eyes can hardly decipher what I write, I am preparing to die just a few leagues from the hexagon in which I was born. Once I am dead, there will be no lack of pious hands to throw me over the railing; my grave will be the fathomless air; my body will sink endlessly and decay and dissolve in the wind generated by the fall, which is infinite. I say that the Library is unending.<\/span><br>Me when I die in the universe-sized library.<br><br><a href=\"#p108929045\" class=\"quotelink\">&gt;&gt;108929045<\/a><br><span class=\"quote\">&gt;found an old file container that I think is TrueCrypt<\/span><br>An older segment of data from you past I assume. I&#039;m guessing you created that and forgot part of the password.<br><br>I have someone else&#039;s PS4 HDD. It&#039;s multiple TB in size, and I basically can&#039;t decrypt it as I don&#039;t have the PlayStation 4 to get the keys out of. 
More on that:<br><span class=\"quote\">&gt; https:\/\/desuarchive.org\/g\/thread\/10<wbr>8785672\/#108861085<\/span><br><span class=\"quote\">&gt; &gt;decrypted only UFS2 fs table<\/span><br><span class=\"quote\">&gt; Command in that script that does that:<\/span><br><span class=\"quote\">&gt; &gt;$ sudo cryptsetup -r create -c aes-xts-plain64 -d ${TOOLKIT_PATH}\/keys\/${KEY_ES} -s 256 ps4hdd_es ${DEVICE}<\/span><br><span class=\"quote\">&gt; where $DEVICE is \/dev\/sdx<\/span><br>Ah, I remember now, no one&#039;s found out how to decrypt such external PS4 HDDs. They do know how to decrypt internal PS4 HDDs (and of course you can move data from the external one to the internal one).<br><br><a href=\"#p108929245\" class=\"quotelink\">&gt;&gt;108929245<\/a><br>52^7 = 1,028,071,702,528 = about 1 trillion. Sounds doable in not too long.","time":1780025804,"resto":108914628},{"no":108931323,"now":"05\/29\/26(Fri)01:17:54","name":"Anonymous","com":"<a href=\"#p108929245\" class=\"quotelink\">&gt;&gt;108929245<\/a><br>Install Python and Hashcat.<br><br>In a command line, navigate to the hashcat directory:<br>cd hashcat<br><br>Get the hash for your TrueCrypt file:<br>python &quot;tools\/truecrypt2hashcat.py&quot; &quot;tcdir\/tcfile.tc&quot; &gt; &quot;tcfile.hash&quot;<br><br>List backend devices so you can choose which one you want:<br>hashcat --backend-info<br><br>Then you can run Hashcat on your chosen GPU (mine is 2):<br>hashcat --backend-devices 2 -a 3 -m 29321 -1 ?l?u --increment --increment-min=3 --increment-max=10 &quot;tcfile.hash&quot; &quot;ABC?1?1?1?1?1?1?1&quot;<br><br>Replace ABC with whatever the first 3 characters are if you know them.<br>You can replace ?1 (custom charset 1, defined here as uppercase or lowercase) with ?l (just lowercase) or ?u (just uppercase) if you think a certain character is one or the other for sure and reduce the search.<br>You can check hash mode 29313 for RIPEMD160, 29323 for SHA512, or 29333 for Whirlpool. 
The last digit is the max number of chained ciphers it will check. If you only have one (like AES) then you can put it as 1 (like 29321) and it should be a bit faster, though this won&#039;t crack it if you have two or three.<br><br>If you used an old version of TrueCrypt before XTS was implemented, Hashcat doesn&#039;t support that.","time":1780031874,"resto":108914628},{"no":108931795,"now":"05\/29\/26(Fri)02:42:51","name":"Anonymous","com":"<a href=\"#p108925288\" class=\"quotelink\">&gt;&gt;108925288<\/a><br>PDF:<br><span class=\"quote\">&gt;each book is of four hundred and ten pages; each page, of forty lines, each line, of some eighty letters which are black in color<\/span><br>Web incarnation:<br><span class=\"quote\">&gt;https:\/\/libraryofbabel.info\/search<wbr>.cgi<\/span><br><br>These mostly-meaningless books aren&#039;t so easy to archive. They compress to about 800,000 bytes at the smallest. Uncompressed size: about 1,333,000 bytes. It would be easier if each book was smaller than 100,000 bytes when compressed. 
Here&#039;s a Library of Babel book titled &quot;swj cftauthd gwlb&quot; (AKA &quot;fuckai&quot;) and an edited\/abridged version of it:<br>https:\/\/hupsoapsoap.store\/raw\/bFV6_<wbr>CKqQMMYDbS-1l1VsJ_Qozq_frnbFpO2-Yu-<wbr>92k","filename":"t3hL67xHNes-maxres","ext":".jpg","w":1280,"h":720,"tn_w":125,"tn_h":70,"tim":1780036971809678,"time":1780036971,"md5":"hAVZ0ZKwytUv+pbLVDG5iA==","fsize":160147,"resto":108914628},{"no":108931943,"now":"05\/29\/26(Fri)03:14:05","name":"Anonymous","com":"<a href=\"#p108931795\" class=\"quotelink\">&gt;&gt;108931795<\/a><br><span class=\"quote\">&gt;libraryofbabel.info<\/span><br>Site history:<br><br>2018: had it&#039;s own forums<br>https:\/\/web.archive.org\/web\/2018051<wbr>1224119\/https:\/\/libraryofbabel.info<wbr>\/<br>https:\/\/web.archive.org\/web\/2018050<wbr>4232955\/http:\/\/www.libraryofbabel.i<wbr>nfo\/forum\/?page_id=14<br><br>2019: storage device failure = forums lost, I guess, hopefully not<br>https:\/\/web.archive.org\/web\/2019051<wbr>3174601\/https:\/\/libraryofbabel.info<wbr>\/<br><span class=\"quote\">&gt;My apologies; Due to hardware failure libraryofbabel.info had some downtime. I am still working to restore the forums. -JB<\/span><br><br>2020: uses Reddit as forums<br>https:\/\/web.archive.org\/web\/2020051<wbr>9031603\/https:\/\/libraryofbabel.info<wbr>\/<br>https:\/\/web.archive.org\/web\/2020062<wbr>6041708\/https:\/\/www.reddit.com\/r\/Ba<wbr>belForum<br><br>I remember posting on those forums back when they weren&#039;t Plebbit.<br><br><span class=\"quote\">&gt;image<\/span><br>YouTube ID is https:\/\/www.youtube.com\/watch?v=t3h<wbr>L67xHNes<br><br><span class=\"quote\">&gt;ebook filesizes<\/span><br>In the Babel Image Archives section: while the IDs for the images are always approximately 940 KB in size, the images themselves are often smaller than 100 KB. 
Picrel is an example of that.","filename":"2a14e2c5-e110-4251-a8d2-a3290a2b3a68","ext":".jpg","w":640,"h":416,"tn_w":125,"tn_h":81,"tim":1780038845765571,"time":1780038845,"md5":"tB3kLaxNh9\/pKwmp4YPeQw==","fsize":95307,"resto":108914628},{"no":108932083,"now":"05\/29\/26(Fri)03:40:47","name":"Anonymous","com":"<a href=\"#p108921219\" class=\"quotelink\">&gt;&gt;108921219<\/a><br>Running &quot;ngrok http 8080&quot; now magically works for some reason.<br><br>However, ngrok was a big WASTE OF TIME. Same crap happened that happened with <a href=\"#p108915771\" class=\"quotelink\">&gt;&gt;108915771<\/a> except this time it says:<br><span class=\"quote\">&gt;ERR_NGROK_6024 - You are about to visit directed-snoring-available.ngrok-fr<wbr>ee.dev, served by [IPv6 address]. This website is served for free through ngrok.com. You should only visit this website if you trust whoever sent the link to you.<\/span><br><span class=\"quote\">&gt;[click button to see this webpage]<\/span>","filename":"ngrok","ext":".png","w":1280,"h":937,"tn_w":125,"tn_h":91,"tim":1780040447200522,"time":1780040447,"md5":"fAYGGhzTII9VSS9UoekxQw==","fsize":94901,"resto":108914628},{"no":108932181,"now":"05\/29\/26(Fri)04:03:35","name":"Anonymous","com":"<a href=\"#p108930781\" class=\"quotelink\">&gt;&gt;108930781<\/a><br><span class=\"quote\">&gt;An older segment of data from you past I assume. I&#039;m guessing you created that and forgot part of the password.<\/span><br>Yeah it&#039;s my own old encrypted file container. I don&#039;t remember what&#039;s in it but I don&#039;t want to delete it without knowing.<br><br>I looked at the desuarchive link you gave and the 4dOp-QA4VK4 video it contained, seems to be unrelated BitLocker stuff? 
My container was made with TrueCrypt or VeraCrypt (I think TrueCrypt since it is an old file).<br>If I misunderstand and you were trying to tell me of a program to brute force the container please tell me again.","time":1780041815,"resto":108914628},{"no":108932186,"now":"05\/29\/26(Fri)04:05:32","name":"Anonymous","com":"<a href=\"#p108932083\" class=\"quotelink\">&gt;&gt;108932083<\/a><br>And yes clicking the button did show the webpage, but this doesn&#039;t help in my goal of web archiving. Maybe I will buy some .xyz or .space domain name (some cheap TLD)...<br><br>Interesting site:<br>https:\/\/dnhub.io\/bulk-tld-check<br><br>You put in some text and it sees if there&#039;s any sites for that. So you can put in &quot;fuckai&quot; and it&#039;ll show<br>https:\/\/fuckai.studio<br>https:\/\/fuckai.lol<br>https:\/\/fuckai.se<br>etc.","filename":"fuckai.studio","ext":".mp4","w":1280,"h":940,"tn_w":125,"tn_h":91,"tim":1780041932792237,"time":1780041932,"md5":"BJ1SP9JAcPryDq7pu+hQfw==","fsize":2930769,"resto":108914628},{"no":108932219,"now":"05\/29\/26(Fri)04:13:41","name":"Anonymous","com":"<a href=\"#p108932181\" class=\"quotelink\">&gt;&gt;108932181<\/a><br>I know basically nothing about TrueCrypt. The desuarchive thread I linked was just to show the investigation into PS4 jailbreaking and is unrelated to your problem or project.<br><br>I posted about it because I had a similar problem, but no solution in my case. 
The PlayStation 4 formats HDDs with some crap that makes things difficult; cryptsetup is involved, also unrelated to TrueCrypt.<br><br>See this anon&#039;s advice, he seems to be the genius with a solution: <a href=\"#p108931323\" class=\"quotelink\">&gt;&gt;108931323<\/a>","time":1780042421,"resto":108914628},{"no":108932238,"now":"05\/29\/26(Fri)04:16:12","name":"Anonymous","com":"<a href=\"#p108932219\" class=\"quotelink\">&gt;&gt;108932219<\/a><br>oh shit i didn&#039;t see <a href=\"#p108931323\" class=\"quotelink\">&gt;&gt;108931323<\/a> for some reason.<br>will look at it but it looks too hard for me. had hoped for some gui program for babby&#039;s first data recovery.","time":1780042572,"resto":108914628},{"no":108932599,"now":"05\/29\/26(Fri)05:35:15","name":"Anonymous","com":"<a href=\"#p108932083\" class=\"quotelink\">&gt;&gt;108932083<\/a><br>If it&#039;s suggesting client-side changes then that&#039;s certainly a waste of time. Otherwise, messing with the reverse proxy local server: could make it send those headers server-side = removes that verification page. Here&#039;s hoping.<br><br><a href=\"#p108916412\" class=\"quotelink\">&gt;&gt;108916412<\/a><br>This is a photo from 2020-08-03 in USA. It shows a bench \/ outdoor table wrapped in plastic wrap due to Covid-19.<br><br>This image was deleted off of https:\/\/archive.org\/details\/, also not deleted by the uploader.<br><br>Photo with metadata retained (has no GPS info):<br>https:\/\/liluandinhcao.store\/raw\/K8u<wbr>CucJk8fTrud-Yry9sgH6ka3KhNDNFlx2sfF<wbr>YNP58","filename":"IMG_20200803_153438","ext":".jpg","w":1600,"h":1200,"tn_w":125,"tn_h":93,"tim":1780047315014836,"time":1780047315,"md5":"oy8AHglfSh+9FUBwC0qQcA==","fsize":248710,"resto":108914628},{"no":108934397,"now":"05\/29\/26(Fri)11:49:58","name":"Anonymous","com":"<a href=\"#p108932238\" class=\"quotelink\">&gt;&gt;108932238<\/a><br>Are you using Windows or Linux? How large is the file?<br><br>Things should be easier now. 
You can talk to a chatbot at duck.ai and it&#039;ll probably help you. Or ask questions in this thread.<br><br>You can ask the LLM chatbot to analyze that command or ask it where to get those files and so on. Here&#039;s another breakdown of what part of that command does:<br><br>https:\/\/explainshell.com\/explain?cm<wbr>d=hashcat+--backend-devices+2+-a+3+<wbr>-m+29321+-1+%3Fl%3Fu+--increment+--<wbr>increment-min%3D3+--increment-max%3<wbr>D10+%22tcfile.hash%22+%22ABC%3F1%3F<wbr>1%3F1%3F1%3F1%3F1%3F1%22<br><br>I&#039;m curious to know what lies forgotten in said encrypted data.","time":1780069798,"resto":108914628},{"no":108936132,"now":"05\/29\/26(Fri)16:11:13","name":"Anonymous","com":"<a href=\"#p108932599\" class=\"quotelink\">&gt;&gt;108932599<\/a><br><span class=\"quote\">&gt;If it&#039;s suggesting client-side changes then that&#039;s certainly a waste of time. Otherwise, messing with the reverse proxy local server: could make it send those headers server-side = removes that verification page. Here&#039;s hoping.<\/span><br>Request headers are what the browser sends to the server. I don&#039;t think an nginx reverse proxy sitting behind ngrok can inject request headers before ngrok checks them. Therefore, it&#039;s just client-side crap. It would be better if it wanted response headers to be changed (to remove the verification wall), as that can be controlled by the reverse proxy server.","time":1780085473,"resto":108914628},{"no":108936477,"now":"05\/29\/26(Fri)17:03:13","name":"Anonymous","com":"<a href=\"#p108934397\" class=\"quotelink\">&gt;&gt;108934397<\/a><br>Windows, size is only just under 1 GB. Might just be old school work. But it could be important so I must know.<br>Thanks for the tip about LLM. And explainshell.com was neat.","time":1780088593,"resto":108914628},{"no":108937519,"now":"05\/29\/26(Fri)19:29:19","name":"Anonymous","com":"Years ago, I had an HDD, model WDC WD5000AVVS-6 (Disk \/dev\/sdb: 465.8 GiB, 500107862016 bytes, 976773168 sectors). 
I may still have that 500GB hard drive, or a .img file of it via &quot;sudo cat \/dev\/sdb &gt; raw.img&quot;.<br><br>It&#039;s a DirecTV HR22 Receiver 500GB hard drive which had XFS as its file system. It maybe needs a decryption key to read all of it&#039;s data. I assume that key is in the device it was in (not in the HDD). In the year 2021, I was able to carve JPGs\/PNGs out of it with Binwalk.<br><br>Output of command<br>$ sudo xfs_ncheck \/dev\/sdb2 &gt; xfs.txt<br>is here:<br>- meta: https:\/\/sieucapchinhtri.store\/raw\/F<wbr>jM_K255ApAY1eCct840-hsJ3k9o64PN_dHb<wbr>FPQ6DgA<br>- data: https:\/\/ong3.xyz\/raw\/5cKzEEO6xtnLoZ<wbr>W7Yly3myBoZJyvoSK9DZC8ngKsnaY<br><br>It lists inodes and paths, such as<br><span class=\"quote\">&gt; 1397141 network\/apg_data\/extgdb_objs_go_hd\/<wbr>0000078b\/extgdb_go_0078bb5c<\/span><br><span class=\"quote\">&gt; 14025 viewer\/segments\/Rcrd-03-18-2015-102<wbr>7-08-10405-ch501-min65535-src2.mpg\/<wbr>0000000006023020544<\/span><br><span class=\"quote\">&gt; 1142995 network\/apg_data\/extgdb_objs_go_hd\/<wbr>00000ad7\/.<\/span><br><span class=\"quote\">&gt; 1142996 dms_data\/encrypted.tmp<\/span><br><span class=\"quote\">&gt; 41197 viewer\/recordingsMessageCache\/recor<wbr>ding1094\/event0000001433334281085<\/span><br><span class=\"quote\">&gt; 4390 viewer\/indexfile\/Rcrd-03-03-2015-19<wbr>00-00-5-ch21-min65535-src2.mpg\/meta<wbr>_man.xma<\/span><br><span class=\"quote\">&gt; 63322816 backup\/viewer\/indexfile\/Rcrd-03-22-<wbr>2015-1900-00-26-ch254-min65535-src2<wbr>.mpg\/meta_man.xmi<\/span><br><span class=\"quote\">&gt; 54526098 network\/font_cache\/Direct Gothic_Medium_28_2_1_33_0_2_1_1_5A_<wbr>2_C0_2_5A_0_26<\/span><br><br>I was unable to extract or access those files with the usual methods; xfs_ncheck could see them, so there must be a way? 
I don&#039;t have the full HDD or .img now, but I have 256-MB sections of the .img...","filename":"7RfaT","ext":".png","w":1024,"h":768,"tn_w":125,"tn_h":93,"tim":1780097359087119,"time":1780097359,"md5":"2xdnhMJO7v3e3d+SHh5mCQ==","fsize":39189,"resto":108914628},{"no":108937861,"now":"05\/29\/26(Fri)20:37:15","name":"Anonymous","com":"<a href=\"#p108936132\" class=\"quotelink\">&gt;&gt;108936132<\/a><br><span class=\"quote\">&gt;it&#039;s just client-side crap<\/span><br>Yeah, opening devtools in Brave Browser and running this in the console<br><span class=\"quote\">&gt;(async () =&gt; {<\/span><br><span class=\"quote\">&gt;const res = await fetch(&#039;https:\/\/directed-snoring-ava<wbr>ilable.ngrok-free.dev\/[...]&#039;, {<\/span><br><span class=\"quote\">&gt;method: &#039;GET&#039;, headers: {<\/span><br><span class=\"quote\">&gt;&#039;ngrok-skip-browser-warning&#039;: &#039;1&#039;<\/span><br><span class=\"quote\">&gt;}, credentials: &#039;omit&#039;, mode: &#039;cors&#039; }); const html = await res.text();<\/span><br><span class=\"quote\">&gt;console.log(html); })();<\/span><br>results in the webpage with no wall.<br><br>Running the same thing with that line commented out or changed to some other header<br><span class=\"quote\">&gt;\/\/&#039;ngrok-skip-browser-warning&#039;: &#039;1&#039;<\/span><br>results in the verification-walled ngrok webpage.","time":1780101435,"resto":108914628},{"no":108938074,"now":"05\/29\/26(Fri)21:23:05","name":"Anonymous","com":"<a href=\"#p108937519\" class=\"quotelink\">&gt;&gt;108937519<\/a><br>&quot;Data mining&quot; info: I have<br><br>File &quot;wdc_wd5000avvs-63zwb0_skip0_count5<wbr>00000_bs512&quot; (256 MB, .gz?) 
= start of the drive, head bytes of the .img:<br><pre class=\"prettyprint\">$ sudo fdisk -l wdc_wd5000avvs-63zwb0_skip0_count50<wbr>0000_bs512~<br>Disk wdc_wd5000avvs-63zwb0_skip0_count50<wbr>0000_bs512~: 801.77 MiB, 840714240 bytes, 1642020 sectors<br>Units: sectors of 1 * 512 = 512 bytes<br>Sector size (logical\/physical): 512 bytes \/ 512 bytes<br>I\/O size (minimum\/optimal): 512 bytes \/ 512 bytes<br>Disklabel type: dos<br>Disk identifier: 0x00000000<br><br>Device                                           Boot    Start       End   Sectors   Size Id Type<br>wdc_wd5000avvs-63zwb0_skip0_count50<wbr>0000_bs512~1            64   1060289   1060226 517.7M 82 Linux swap \/ Solaris<br>wdc_wd5000avvs-63zwb0_skip0_count50<wbr>0000_bs512~2       1060296  32531624  31471329    15G 83 Linux<br>wdc_wd5000avvs-63zwb0_skip0_count50<wbr>0000_bs512~3      32531632 976768064 944236433 450.2G 83 Linux<br>$ # 256,000,000-byte file deleted off of archive.org\/details\/<\/pre><br><br>File &quot;...&quot; = middle partition, XFS according to years-old records of mine:<br><pre class=\"prettyprint\">$ lsblk -f<br>NAME   FSTYPE LABEL UUID                                 FSAVAIL FSUSE% MOUNTPOINT<br>[...]<br>sdb<br>\u2500sdb1 swap                                                             [SWAP]<br>\u2500sdb2 xfs          a870119e-1278-4c3c-a075-5793bde4788<wbr>d<br>\u2500sdb3<br>[...]<br>$<\/pre><br><br>File &quot;wdc_wd5000avvs-63zwb0_skip60500000<wbr>_count500000_bs512&quot; = start offset of 30,976,000,000 bytes (31 GB), filesize of 256 MB<br><br>Carve files out by using foremost:<br><span class=\"quote\">&gt;$ foremost -t all -o \/pathTo\/EmptyDir\/ wdc_wd5000avvs-63zwb0_skip0_count50<wbr>0000_bs512~<\/span><br><br>Foremost got<br>- thousands of files out of the first partition (a SWAP partition), such as picrel<br>- zero files out of part of the third partition (unknown 
filesystem)","filename":"00053296","ext":".png","w":616,"h":420,"tn_w":125,"tn_h":85,"tim":1780104185336117,"time":1780104185,"md5":"9ah8m01jPMYcsSATZJajCg==","fsize":60734,"resto":108914628},{"no":108938207,"now":"05\/29\/26(Fri)21:44:17","name":"Anonymous","com":"<a href=\"#p108938074\" class=\"quotelink\">&gt;&gt;108938074<\/a><br>Ugh, I&#039;d have to look through the files (each one&#039;s skip value increments by 500,000 sectors)<br>&quot;wdc_wd5000avvs-63zwb0_skip0_count5<wbr>00000_bs512&quot;<br>&quot;wdc_wd5000avvs-63zwb0_skip500000_c<wbr>ount500000_bs512&quot;<br>...<br>&quot;wdc_wd5000avvs-63zwb0_skip60500000<wbr>_count500000_bs512&quot;<br><br>then find where the offset of 542,871,552 bytes is for partition 2<br><br>then find where the offset of 16,656,195,584 bytes is for partition 3<br><br>all just to try again at a problem I couldn&#039;t solve years ago. Maybe foremost would succeed where Binwalk failed. The better program for carving out files seems to be foremost. (Plus, maybe LLMs could help me; they weren&#039;t a thing in 2021.)","filename":"00052408","ext":".png","w":616,"h":420,"tn_w":125,"tn_h":85,"tim":1780105457353123,"time":1780105457,"md5":"jTDdhe0skDX19\/Lum3r5YA==","fsize":57269,"resto":108914628},{"no":108938346,"now":"05\/29\/26(Fri)22:14:06","name":"Anonymous","com":"<a href=\"#p108938207\" class=\"quotelink\">&gt;&gt;108938207<\/a><br>foremost also carved zero files out of this file, which is in the partition 2 section<br>&quot;wdc_wd5000avvs-63zwb0_skip2000000_<wbr>count500000_bs512&quot;<br><br>Worse, I suspect that those split files are from a file named something like<br>&quot;wdc_wd5000avvs-63zwb0.img.gz&quot;<br><br>and in that case, I&#039;d have to download 20 to 30 GB of the files, decompress them, then struggle more to get this locked-down stuff to work in ways I want it 
to.","filename":"00051928","ext":".png","w":616,"h":560,"tn_w":125,"tn_h":113,"tim":1780107246941226,"time":1780107246,"md5":"9lmLQIjffD1fnYoNC+AeDQ==","fsize":49174,"resto":108914628},{"no":108938361,"now":"05\/29\/26(Fri)22:17:38","name":"Anonymous","com":"<a href=\"#p108938346\" class=\"quotelink\">&gt;&gt;108938346<\/a><br>So would that all be worth it to, say, possible see someone&#039;s DVR recordings from 2015 (.mpg video files or something)? I can at least get some images and HTML files out of it. I can&#039;t get all the files that it claims to contain as per <a href=\"#p108937519\" class=\"quotelink\">&gt;&gt;108937519<\/a>","filename":"00048240","ext":".png","w":372,"h":372,"tn_w":124,"tn_h":124,"tim":1780107458233642,"time":1780107458,"md5":"ObNGC5YqNpg7mPQCEnyQRg==","fsize":30635,"resto":108914628},{"no":108939910,"now":"05\/30\/26(Sat)04:16:16","name":"Anonymous","com":"There&#039;s a trend of normie content winning out over everything else. This victory of globohomo (global homogenization) means that the videos, for example, that people watch are mostly completely sanitized politically correct private-equity-funded YouTube videos.<br><br>This trend is fueled by the censorship that happens on all major platforms (including archive.org\/details\/).<br><br>Some videos which aren&#039;t globalist homogenization:<br>https:\/\/web.archive.org\/web\/2026050<wbr>3041834\/https:\/\/chanii.ddns.net\/b\/r<wbr>es\/76.html<br>https:\/\/web.archive.org\/web\/2026050<wbr>4173705\/https:\/\/chanii.ddns.net\/b\/r<wbr>es\/577.html<br><br>That website appeared to have went offline forever in around May 6, 2026. Last post I know of was <span class=\"deadlink\">&gt;&gt;779<\/span> at 05\/04\/26 (Mon) 21:56:37 UTC. Said in one of the MP4s\/WebMs, something like: &quot;A man lives three lives. First, the lose of innocence. Second, the lose of naivety. 
And lastly, the lose of life itself.&quot;","time":1780128976,"resto":108914628},{"no":108939930,"now":"05\/30\/26(Sat)04:21:30","name":"Anonymous","com":"<a href=\"#p108939910\" class=\"quotelink\">&gt;&gt;108939910<\/a><br><span class=\"quote\">&gt;Some videos which aren&#039;t global homogenization:<\/span><br>Or, just go to <a href=\"\/\/boards.4chan.org\/wsg\/\" class=\"quotelink\">&gt;&gt;&gt;\/wsg\/<\/a> and <a href=\"\/\/boards.4chan.org\/gif\/\" class=\"quotelink\">&gt;&gt;&gt;\/gif\/<\/a> if you like the 4chan way of watching videos and talking about them.","filename":"txz7mtcnbx931-1643389619","ext":".jpg","w":640,"h":1138,"tn_w":70,"tn_h":125,"tim":1780129290846122,"time":1780129290,"md5":"TWbiY2dJVECMFrcCVZvVeA==","fsize":159121,"resto":108914628},{"no":108939995,"now":"05\/30\/26(Sat)04:38:44","name":"Anonymous","com":"Aaron Swartz is the Jewish co-founder of Reddit.<br><br>He mass downloaded JSTOR in a ridiculous way: sent like a billion requests per second. Folks, this is why when downloading a website, you don&#039;t exceed a download concurrency of 4.<br><br>That resulted in him getting into legal trouble which was very ridiculous. 
Later on, his cause of death was suicide by hanging.<br><br>Saw this thing today about him:<br>https:\/\/web.archive.org\/web\/2026053<wbr>0083519\/https:\/\/desuarchive.org\/g\/t<wbr>hread\/30706801\/<br>https:\/\/web.archive.org\/web\/2013051<wbr>5063511\/http:\/\/aaronsw.archiveteam.<wbr>org\/<br><br>It was encouraging people to download www.jstor.org web data as a fuck you to the tards who ran JSTOR.","filename":"Tr4Yt","ext":".png","w":1024,"h":768,"tn_w":125,"tn_h":93,"tim":1780130324872131,"time":1780130324,"md5":"8uH51bythDCut\/8kM8wK9Q==","fsize":142126,"resto":108914628},{"no":108940063,"now":"05\/30\/26(Sat)04:53:20","name":"Anonymous","com":"<a href=\"#p108939995\" class=\"quotelink\">&gt;&gt;108939995<\/a><br><span class=\"quote\">&gt;Aaron Swartz<\/span><br>Wonder what happened to the files that he specifically downloaded. I guess those files on his devices were deleted or lost.<br><br>I was reading that \/g\/ thread from 2013: apparently JSTOR put many public domain documents behind paywalls. The absurdity...","filename":"1358218820141s","ext":".jpg","w":250,"h":250,"tn_w":125,"tn_h":125,"tim":1780131200027753,"time":1780131200,"md5":"w++ripTy\/FedwcqQLSvjpQ==","fsize":13685,"resto":108914628},{"no":108940183,"now":"05\/30\/26(Sat)05:30:27","name":"Anonymous","com":"Anyone else notice how YouTube has been deleting full album videos in favor of individual music tracks? Why? Probably due to money\/greed.<br><br>I can still listen to full music albums as single files in IPFS: used it yesterday to relisten to some &quot;The Residents&quot; albums (picrel):<br>https:\/\/archive.is\/http:\/\/148.113.1<wbr>64.86:8080\/ipfs\/*<br><br><a href=\"#p108931943\" class=\"quotelink\">&gt;&gt;108931943<\/a><br>The Library of Babel has more books than atoms in the universe. I was reflecting on that short story; one of the things I have to say: they&#039;re not as meaningless as you think. 
Some of them are in fact completely meaningless, others are completely coherent, but written in unknown languages and encodings (or encrypted).<br><br>Number of atoms on\/in Earth: 10^40<br>Number of atoms in the universe: 10^80<br>Number of books in The Library of Babel: 10^4677<br>Number of images in the Babel Image Archives: 10^961755<br>(Numbers according to https:\/\/www.youtube.com\/watch?v=Sd0<wbr>tB3tR3yQ video)","filename":"s-l960","ext":".jpg","w":960,"h":769,"tn_w":125,"tn_h":100,"tim":1780133427962546,"time":1780133427,"md5":"T5E0sM3q4Z1zsVQmIobA5A==","fsize":148357,"resto":108914628},{"no":108940263,"now":"05\/30\/26(Sat)05:52:03","name":"Anonymous","com":"<a href=\"#p108940183\" class=\"quotelink\">&gt;&gt;108940183<\/a><br>YouTube uses DistroKid and stuff like that to autogenerate band pages. I&#039;ve even seen it smush together bands that have nothing to do with each other aside from similar names. It even did that with singers and comedians that have similar names. It&#039;s probably deleting everything not approved by the copyright holder even more aggressively than usual in order to push the official versions. Which is very bad considering that some rare and not so rare songs are georestricted or just not made publicly available at all (at least officially).<br><br><span class=\"quote\">&gt;Library of Babel<\/span><br>I&#039;ve stumbled upon this a bunch of times while browsing the indie web. I&#039;m not knowledgeable about Borges, so it just felt like a gimmick to me (even if the concept itself is very interesting). Did anyone navigate it productively, like actually finding things that are comprehensible, even meaningful? 
I wonder if some day it will be connected to some kind of LLM, as silly as the idea might seem now.","filename":"1779733571208315","ext":".jpg","w":600,"h":523,"tn_w":125,"tn_h":108,"tim":1780134723063687,"time":1780134723,"md5":"lUM30CewIcy3PqFDiuUAWw==","fsize":55755,"resto":108914628},{"no":108940283,"now":"05\/30\/26(Sat)05:57:48","name":"Anonymous","com":"Also, interesting thread OP, thanks! It feels quite esoteric, even without the Borges stuff.","filename":"1772787101010905","ext":".jpg","w":1024,"h":971,"tn_w":125,"tn_h":118,"tim":1780135068164359,"time":1780135068,"md5":"91vSf4roKHMRt1Lq+qQzEA==","fsize":90073,"resto":108914628},{"no":108940430,"now":"05\/30\/26(Sat)06:39:17","name":"Anonymous","com":"<a href=\"#p108938361\" class=\"quotelink\">&gt;&gt;108938361<\/a><br>Rather &quot;possibly see someone&#039;s DVR recordings from 2015&quot;<br><br><a href=\"#p108939995\" class=\"quotelink\">&gt;&gt;108939995<\/a><br><span class=\"quote\">&gt;when downloading a website, you don&#039;t exceed a download concurrency of 4.<\/span><br>On the flip side, some sites are dumb: &quot;Downloading at a concurrency of 10?! I&#039;m being heckin DDoSed! I&#039;m under attack!&quot;<br><br><a href=\"#p108940263\" class=\"quotelink\">&gt;&gt;108940263<\/a><br>Didn&#039;t know that about YouTube autogenerated music data. I&#039;ve seen such things happen on non-YouTube sites: &quot;smush together bands that have nothing to do with each other aside from similar names&quot;.<br><br>Sad to see the non-official music uploads from regular YouTube users being deleted (along with their channels and the comments on the videos). 
Another move from &quot;YouTube: broadcast yourself&quot; to &quot;YouTube: broadcast your corporation&quot;.<br><br><span class=\"quote\">&gt;Did anyone navigate it productively, like actually finding things that are comprehensible, even meaningful?<\/span><br>Statistically, it&#039;s basically impossible to find sensical short-or-medium\/long-length sentences, especially if not looking at &quot;random English words&quot; book pages. I&#039;ve personally found sorta meaningful small bits of texts in it. Such as<br><span class=\"quote\">&gt;fedbgayxsshifr<\/span><br>in the &quot;swj cftauthd gwlb&quot; book <a href=\"#p108931795\" class=\"quotelink\">&gt;&gt;108931795<\/a>, but this is like schizo-levels of pattern recognition and meaning assignments.<br><br><span class=\"quote\">&gt;&gt;Library of Babel<\/span><br><span class=\"quote\">&gt;I wonder if some day it will be connected to some kind of LLM, as silly as the idea might seem now.<\/span><br>That&#039;s one way of looking through it. 10^4677 = a number with 4678 decimal digits.<br><br>Terabyte (TB) = trillion bytes, petabyte (PB) = quadrillion bytes, exabyte (EB) = quintillion bytes, zettabyte (ZB) = 10^21 bytes, yottabyte (YB) = 10^24 bytes, ronnabyte (RB) = 10^27 bytes, quettabyte (QB) = 10^30 bytes. All of these units are inadequate to describe the Library of Babel if it was a static complete set of files. 
It has a data size of 10^4647 QB.<br><br><span class=\"quote\">&gt;felt like a gimmick to me<\/span><br>Important thing that should be done if not already done: make libraryofbabel.info website software free and open source, standard identifiers on everything = long-lasting dynamic database\/system","filename":"browsehex","ext":".gif","w":315,"h":320,"tn_w":123,"tn_h":125,"tim":1780137557661930,"time":1780137557,"md5":"oxb0OVnkrv7qPAOQmZDu0Q==","fsize":33981,"resto":108914628},{"no":108940671,"now":"05\/30\/26(Sat)07:26:13","name":"Anonymous","com":"<a href=\"#p108940430\" class=\"quotelink\">&gt;&gt;108940430<\/a><br><span class=\"quote\">&gt;standard identifiers<\/span><br><br>In libraryofbabel.info:<br>Each page in each book can be uniquely identified with 3266 characters (3,266 bytes). Character set: lower case alphanumeric. 36^3253 (number with 5063 decimal digits) is larger than 10^4677. Of course, each individual page is smaller than 100 KB. All findable books are basically 1 MB in size (can compress down to about 800 KB). The IDs look like this:<br><span class=\"quote\">&gt;Book Location:[3253 characters here]-w1-s3-v21:111<\/span><br>where w=wall, s=shelf, v=volume, and :[number]=page.<br><br>In libraryofbabel.app:<br>Uses a different system for IDs. Going to a random page in that site -- https:\/\/libraryofbabel.app\/ref\/@ce9<wbr>0f1c41a06d76e571dbc2232516865015695<wbr>2a561f548e3d771454dc100b91.1.1.30.1<wbr>78 -- I see &quot;Room 1acbbhjh...cd98hpet \/ Wall 1 \/ Shelf 1 \/ Book 30 \/ Page 178&quot; with a link to https:\/\/libraryofbabel.app\/fullref\/<wbr>@ce90f1c41a06d76e571dbc223251686501<wbr>56952a561f548e3d771454dc100b91.1.1.<wbr>30.178 = the ID looks like this:<br><span class=\"quote\">&gt;[1.3 megabytes of lower case alphanumeric text].1.1.30.178<\/span><br>first number after &quot;.&quot;=wall, second number=shelf, third number=volume, 4 number=page. 
The ID in the URL is<br><span class=\"quote\">&gt;ce90f1c41a06d76e571dbc223251686501<wbr>56952a561f548e3d771454dc100b91.1.1.<wbr>30.178<\/span><br>same thing, except for the first part, which is 64 hexadecimal characters instead. 16^64 = a measly 10^77. Seemingly, the address space of libraryofbabel.app&#039;s URLs can only map to less than one quintillionth of all of the books in the Library of Babel megastructure.<br><br>There&#039;s apparently other Library of Babel websites, but for now the most glaring problem is the lack of a standardized system (books in libraryofbabel.app have !,?,- = not the case with libraryofbabel.info) and the lack of standardized IDs.<br><br><a href=\"#p108931795\" class=\"quotelink\">&gt;&gt;108931795<\/a><br><span class=\"quote\">&gt;.txt file of &quot;The Library of Babel&quot; by Jorge Luis Borges, formatted into the style that he described<\/span><br><span class=\"quote\">&gt;in the other [very small closet], satisfy one&#039;s fecal necessities<\/span><br>But then what do any of those people eat? Story doesn&#039;t have to be completely fleshed out, still makes me wonder","filename":"shelves","ext":".png","w":1400,"h":471,"tn_w":125,"tn_h":42,"tim":1780140373960389,"time":1780140373,"md5":"qs7F6Ht2eZk+C5dYEOuzdw==","fsize":909446,"resto":108914628},{"no":108940869,"now":"05\/30\/26(Sat)08:03:03","name":"Anonymous","com":"<a href=\"#p108940430\" class=\"quotelink\">&gt;&gt;108940430<\/a><br><span class=\"quote\">&gt;has a data size of 10^4647 QB<\/span><br>Actually, if each Library of Babel book is 1 MB in size, then the true size is 10^4653 QB (10^4683 bytes).<br><br><a href=\"#p108940671\" class=\"quotelink\">&gt;&gt;108940671<\/a><br>This other site says:<br><span class=\"quote\">&gt;https:\/\/babel.zwyx.dev\/intro<\/span><br><span class=\"quote\">&gt;There are 29^1,312,000 books in the Library of Babel \u2014 a number with 1,918,667 digits.<\/span><br>So what&#039;s the correct total number of books? 
&quot;Each book contains 410 pages. Each page, 40 lines. Each line, 80 characters. Each character can be a space, a letter, a comma, or a period.&quot; That&#039;s 1,312,000 characters per book, so yeah, charset^slots = 29^1312000. <a href=\"#p108940183\" class=\"quotelink\">&gt;&gt;108940183<\/a> I guess that dumb YouTuber messed up on the calculations in his Sd0tB3tR3yQ video.<br><br>And that site uses yet another system of IDs:<br><span class=\"quote\">&gt;https:\/\/babel.zwyx.dev\/random<\/span><br><span class=\"quote\">&gt;Book ID = [24,000 ASCII non-whitespace characters]<\/span><br><span class=\"quote\">&gt;Location in the library = Room number of 46,793 digits, wall, shelf, volume<\/span><br>Book ID can also be downloaded as a 19,610-byte PNG image.","filename":"about_bafkreibtgqb7egoalr76tfrt5djbd3s2bicaaciebwul4is5qn62dshtji","ext":".jpg","w":640,"h":416,"tn_w":125,"tn_h":81,"tim":1780142583483178,"time":1780142583,"md5":"fXCz3KYSAG9SVnSPvk6EXA==","fsize":55972,"resto":108914628},{"no":108941070,"now":"05\/30\/26(Sat)08:35:34","name":"Anonymous","com":"<a href=\"#p108938361\" class=\"quotelink\">&gt;&gt;108938361<\/a><br>Not sure I&#039;ll do this soon. Both things take some work (either combining the split files and so on or finding the HDD and sticking it internally into a computer of mine).<br><br>I can say that I used &quot;xfsprogs_3.2.1_amd64.deb&quot; in Lubuntu in 2021 to gain some info about what was in that drive. 
Here&#039;s that file:<br>- meta: https:\/\/ong3.xyz\/raw\/tpNRY30SwyTHzL<wbr>qzgbrX55ETXUfsaTwu-HW9TDCKn-s<br>- data: https:\/\/xacminh.store\/raw\/gvCNNKxHX<wbr>Tq2DTPigyaMwva81FcjCO3YHbWXwLG1UiA<br><br>Both files &quot;xfsprogs_3.2.1_amd64.deb&quot; and &quot;xfsprogs_3.2.1_i386.deb&quot; were deleted off of https:\/\/archive.org\/details\/, also not deleted by the uploader, same as <a href=\"#p108925208\" class=\"quotelink\">&gt;&gt;108925208<\/a><br><br>Attached: another pic carved out of that &quot;XFS hard drive&quot; (or &quot;DVR hard drive&quot;).","filename":"MistyTasteOfMoonshine","ext":".jpg","w":200,"h":300,"tn_w":83,"tn_h":125,"tim":1780144534323280,"time":1780144534,"md5":"3Ifrow5HXa50kffHv5UmBA==","fsize":113687,"resto":108914628},{"no":108942646,"now":"05\/30\/26(Sat)13:14:08","name":"Anonymous","com":"Is there any way to recover domains that the wayback machine nuked from their archive?<br>I think it&#039;s bullshit that someone can just buy up an expired website and then request wayback to remove it from their index.","time":1780161248,"resto":108914628},{"no":108943227,"now":"05\/30\/26(Sat)14:37:13","name":"Anonymous","com":"<a href=\"#p108942646\" class=\"quotelink\">&gt;&gt;108942646<\/a><br>There&#039;s ways to get captures of excluded websites, but one problem is that not enough people get and share WARCs. What sites are you looking for?<br><br>Well-known methods:<br>- Look for the site at https:\/\/archive.is\/site.com = can sometimes find captures; searching that will show all URLs for that site, searching https:\/\/archive.is\/*.site.com shows all subdomains of it<br>- Look for URLs at https:\/\/megalodon.jp\/[URL] = very rarely it&#039;ll be here. 
I don&#039;t know how to search megalodon.jp (\u30a6\u30a7\u30d6\u9b5a\u62d3) like how you can search archive.is in the previous bullet point.<br><br>Lesser-known methods:<br>- Search the indexes of WARCs in archive.org that were downloaded around the time of that website&#039;s existence under a certain webmaster; search the indice for &quot;site.com&quot; using grep<br>- Do the same search but wherever else you might find WARCs (like in some torrents)<br><br>Problems: not enough people grab WARCs of websites and the non-ArchiveTeam non-normal-user WARCs in archive.org are un-downloadable; for example:<br><span class=\"quote\">&gt;$ curl -I https:\/\/web.archive.org\/web\/2026021<wbr>8003722\/https:\/\/www.meridian.space\/<wbr>blog\/introducing-pay-per-byte-a-new<wbr>-era-for-filecoin-retrieval<\/span><br>says:<br><span class=\"quote\">&gt;x-archive-src: CC-MAIN-2026-08-1770395863965.96-00<wbr>37\/CC-MAIN-20260217225554-202602180<wbr>15554-00755.warc.gz<\/span><br>that&#039;s<br><span class=\"quote\">&gt;https:\/\/archive.org\/download\/CC-MA<wbr>IN-2026-08-1770395863965.96-0037<\/span><br><span class=\"quote\">&gt;Files marked with lock are not available for download [even if you&#039;re logged in, and all the important files are marked with a lock symbol]<\/span><br>WARC indexes are the .cdx.gz\/.cdx files and, if they were created by grab-site, &quot;wpull.log&quot;. 
grab-site and Wget can create WARC files; each .warc.gz is around 5 GB in size and contains thousands of webpages.<br><br>The lesser-known method was used to successfully obtain this webpage which was removed from web.archive.org and not in archive.is:<br>- title: &quot;Twilight hugging Moondancer by MrPoniator on DeviantArt&quot;<br>- url: https:\/\/www.deviantart.com\/mrponiat<wbr>or\/art\/Twilight-hugging-Moondancer-<wbr>544149993<br>- screenshot: attached","filename":"bafybeigh4o2opd5cg33ekkfr5hlrh5eqljr3tukn4hugyvzp2u4hmlseuy-image1","ext":".png","w":1351,"h":2258,"tn_w":74,"tn_h":125,"tim":1780166233034634,"time":1780166233,"md5":"z9XAmlr+BYdeZOOgW8eJfw==","fsize":304159,"resto":108914628},{"no":108943297,"now":"05\/30\/26(Sat)14:48:39","name":"Anonymous","com":"<a href=\"#p108943227\" class=\"quotelink\">&gt;&gt;108943227<\/a><br><span class=\"quote\">&gt;each .warc.gz is around 5 GB in size<\/span><br>grab-site standard for Web ARChive files (WARC files). WARCs created by GNU Wget can be whatever size, doesn&#039;t have a default size, I think. Wget-created-WARCs also have indexes in CDX files, I think.<br><br><span class=\"quote\">&gt;lesser-known method [WARCs] was used to successfully obtain this webpage which was removed from web.archive.org and not in archive.is<\/span><br>Oh, and the original \/ live \/ source webpage was deleted in around the year 2020. Source code of the page when rendered in ReplayWeb.page (WARC replay software) includes this text:<br>http:\/\/localhost:5471\/w\/id-1e7546e7<wbr>fe91\/20190903230859\/https:\/\/www.dev<wbr>iantart.com\/mrponiator\/art\/Twilight<wbr>-hugging-Moondancer-544149993<br><br>The HTML file at \/ipfs\/[CID]\/i_localhost.htm renders as plain HTML with no CSS and JS. 
Realizations:<br>- if running ReplayWeb.page with that WARC open (the .warc.gz is in some torrent): then you can maybe go to that http:\/\/127.0.0.1:5471\/ link and it will render with the .css and .js<br>- I can replay that page via ipwb and get archive.is to capture it. It would be better if I used SingleFile to capture 127.0.0.1:5471\/... then replay that SingleFile-created-HTML with ipwb<br><br>Torrent which has that WARC of &quot;great interest&quot; (probably dead now):<br>magnet:?xt=urn:btih:3850e42c8449a43<wbr>e2959db46ad4985ded54408aa","time":1780166919,"resto":108914628},{"no":108943386,"now":"05\/30\/26(Sat)15:03:44","name":"Anonymous","com":"This mega.nz folder was deleted:<br>https:\/\/mega.nz\/folder\/papA0DIa#crI<wbr>_OpajKXo_r_ZL1jJ5dg<br>https:\/\/archive.is\/2024.09.08-22075<wbr>9\/https:\/\/mega.nz\/folder\/papA0DIa%2<wbr>3crI_OpajKXo_r_ZL1jJ5dg<br><br>I currently have a copy of it at<br>\/mnt\/sshfs\/zd\/b\/z2\/data\/0221061\/htt<wbr>ps(u003a)(u002f)(u002f)mega.nz(u002<wbr>f)folder(u002f)papA0DIa(u0023)crI_O<wbr>pajKXo_r_ZL1jJ5dg\/<br><br>Size: 58 GB. Contents: hashes and metadata of millions of images that were\/are on the web (tumblr).","filename":"JP2WW","ext":".png","w":1024,"h":768,"tn_w":125,"tn_h":93,"tim":1780167824924262,"time":1780167824,"md5":"x5GVkdbIKuFQf3cSi1YNCw==","fsize":77289,"resto":108914628},{"no":108943528,"now":"05\/30\/26(Sat)15:25:37","name":"Anonymous","com":"Is the future of Discord archival (without getting your account banned) hopeless?<br>https:\/\/github.com\/Tyrrrz\/DiscordCh<wbr>atExporter\/issues\/1497","time":1780169137,"resto":108914628},{"no":108943620,"now":"05\/30\/26(Sat)15:41:44","name":"Anonymous","com":"<a href=\"#p108943386\" class=\"quotelink\">&gt;&gt;108943386<\/a><br>Sample:<br>https:\/\/xacminh.store\/raw\/jp73AEGIm<wbr>AJIkB8i_w8HW_zcI6YfNHFOYc2CSz2OLfc<br><br>Pic related.<br><br><a href=\"#p108943528\" class=\"quotelink\">&gt;&gt;108943528<\/a><br>I hope not. 
With the amount of people who use that dogshit, there&#039;s sure to be important things.","filename":"mega.nz+papA0DIa+crI_OpajKXo_r_ZL1jJ5dg","ext":".png","w":1280,"h":891,"tn_w":125,"tn_h":87,"tim":1780170104205603,"time":1780170104,"md5":"8aXv6HgPn61\/xIpcrH5gBA==","fsize":91540,"resto":108914628},{"no":108943641,"now":"05\/30\/26(Sat)15:44:09","name":"Anonymous","com":"<a href=\"#p108943227\" class=\"quotelink\">&gt;&gt;108943227<\/a><br><span class=\"quote\">&gt;- Look for the site at https:\/\/archive.is\/site.com = can sometimes find captures; searching that will show all URLs for that site, searching https:\/\/archive.is\/*.site.com shows all subdomains of it<\/span><br>This is a really cool feature. Does archive.org have something like this as well?<br><br>Or while we&#039;re on the topic, any other site\/search engine, archive related or not, that lets you list subdomains like this? I know many search engines use wildcards *, but I&#039;ve never considered their uses beyond very basic searches. Something that could list all the pages on a website might prove useful both for archiving and beyond.","filename":"1775971945296130","ext":".jpg","w":1191,"h":1380,"tn_w":107,"tn_h":125,"tim":1780170249365723,"time":1780170249,"md5":"hsS+lxYj\/sLPFZ3SpT81Zw==","fsize":344738,"resto":108914628},{"no":108943737,"now":"05\/30\/26(Sat)15:58:31","name":"Anonymous","com":"<a href=\"#p108943620\" class=\"quotelink\">&gt;&gt;108943620<\/a><br><span class=\"quote\">&gt;With the amount of people who use that dogshit, there&#039;s sure to be important things.<\/span><br>Tell me about it. 
As the most egregious example I&#039;ve experienced to date, there&#039;s this fan-port of a game where the only way to get it is to make a thread in their Discord server proving you own the original (because they don&#039;t want to get sued or whatever) and wait for the lead developer (who made a pinned thread saying he&#039;s not only overwhelmed with the thousands of requests, but is currently on vacation) to DM it to you.","time":1780171111,"resto":108914628},{"no":108943842,"now":"05\/30\/26(Sat)16:15:08","name":"Anonymous","com":"<a href=\"#p108943528\" class=\"quotelink\">&gt;&gt;108943528<\/a><br>I had no such problem with dht.","time":1780172108,"resto":108914628},{"no":108945112,"now":"05\/30\/26(Sat)19:40:39","name":"Anonymous","com":"<a href=\"#p108943641\" class=\"quotelink\">&gt;&gt;108943641<\/a><br><span class=\"quote\">&gt;Does archive.org have something like this as well?<\/span><br>Yeah, but maybe not the subdomain thing; example:<br><span class=\"quote\">&gt;https:\/\/web.archive.org\/web\/*\/http<wbr>s:\/\/ipfs.nftstorage.link\/*<\/span><br><span class=\"quote\">&gt;https:\/\/web.archive.org\/web\/timema<wbr>p\/json?url=https%3A%2F%2Fipfs.nftst<wbr>orage.link%2F&amp;matchType=prefix&amp;coll<wbr>apse=urlkey&amp;output=json&amp;fl=original<wbr>%2Cmimetype%2Ctimestamp%2Cendtimest<wbr>amp%2Cgroupcount%2Cuniqcount&amp;filter<wbr>=!statuscode%3A%5B45%5D..&amp;limit=100<wbr>00<\/span><br>shows everything captured from that site under http:\/\/ and https:\/\/ (picrel = traditional art as an NFT from that site)<br><br><span class=\"quote\">&gt;any other site\/search engine, archive related or not, that lets you list subdomains like this?<\/span><br>This paid search engine -- https:\/\/www.shodan.io\/ -- can maybe enumerate all known subdomains of sites. (Shodan does stuff like list exposed APIs which shouldn&#039;t be exposed, or just general intelligence about which IP addresses are running some service and at what port. 
I can already do that for free, but on a small and targeting two specific services.) I used some FOSS software that got subdomains from websites in the past; it worked OK, not great.<br><br><span class=\"quote\">&gt;Something that could list all the pages on a website might prove useful both for archiving and beyond.<\/span><br>Sites used to have a sitemap.xml that listed all paths (pages, files) in the site. One way to get &quot;all the URLs&quot; of a site is to, for example, look at the WARC indices of ArchiveTeam uploads to IA when they were saving Imgur before it enshittified further. You can get ~millions of Imgur links using that method. (I could go into more detail on this topic.)","filename":"bafybeidzalqqeo5pwvd7j3shhjwlq6qbzroy4idauei2j3td6d5zeorj3e","ext":".jpg","w":2048,"h":2048,"tn_w":125,"tn_h":125,"tim":1780184439626253,"time":1780184439,"md5":"MytXRxooPrAPRtrAcAKd4g==","fsize":2276155,"resto":108914628},{"no":108945706,"now":"05\/30\/26(Sat)21:28:25","name":"Anonymous","com":"<a href=\"#p108943842\" class=\"quotelink\">&gt;&gt;108943842<\/a><br><span class=\"quote\">&gt;I had no such problem with dht.<\/span><br>As in Discord History Tracker? How long ago did you last use it?<br><a href=\"#p108945112\" class=\"quotelink\">&gt;&gt;108945112<\/a><br>If you only want results archived with a status in the 200 range, change<br><span class=\"quote\">&gt;filter=!statuscode%3A%5B45%5D..<\/span><br>to<br><span class=\"quote\">&gt;filter=statuscode%3A2..<\/span>","time":1780190905,"resto":108914628},{"no":108946689,"now":"05\/31\/26(Sun)01:23:25","name":"Anonymous","com":"<a href=\"#p108943386\" class=\"quotelink\">&gt;&gt;108943386<\/a><br>Couting up the records in those SQL files:<br>- #-d, p, s, z = dunno the exact row count because I Zstandard-compressed them; however we can think about the average rows per byte: 0.00328900696299. 
Doing calculations with the uncompressed sizes = 89029400 rows<br>- everything else = 115780158 rows<br><br>Total = 204,809,558 records. I could share these compressed SQL files (picrel is from one of the URLs).<br><br>That&#039;s more than 200 million tumblr URLs <a href=\"#p108943641\" class=\"quotelink\">&gt;&gt;108943641<\/a> <a href=\"#p108945112\" class=\"quotelink\">&gt;&gt;108945112<\/a> as you were talking about getting a site map. 204809558 rows =<br>- 58 GB uncompressed<br>- 17 GB compressed (level 19 .zst) and each file compresses to &lt;2 GB<br><br><a href=\"#p108945112\" class=\"quotelink\">&gt;&gt;108945112<\/a><br>Rather &quot;on a small scale and targeting two specific services&quot;","filename":"tumblr_pdgw5nPTlh1rn5mvzo5_1280","ext":".gif","w":800,"h":450,"tn_w":125,"tn_h":70,"tim":1780205005487881,"time":1780205005,"md5":"dIZu1gHVJVT3pZjbmwXDlg==","fsize":2339238,"resto":108914628},{"no":108946703,"now":"05\/31\/26(Sun)01:27:26","name":"Anonymous","com":"<a href=\"#p108946689\" class=\"quotelink\">&gt;&gt;108946689<\/a><br><span class=\"quote\">&gt;Couting up<\/span><br>Counting up","filename":"tumblr_pdgw5nPTlh1rn5mvzo7_1280","ext":".gif","w":800,"h":450,"tn_w":125,"tn_h":70,"tim":1780205246976541,"time":1780205246,"md5":"V+Zv2Roh2lf6G6\/mBg8dmQ==","fsize":2640335,"resto":108914628},{"no":108947281,"now":"05\/31\/26(Sun)04:27:08","name":"Anonymous","com":"I&#039;m really liking Blombooru for local image\/video archiving.<br><br>https:\/\/github.com\/mrblomblo\/blombo<wbr>oru<br><br>It&#039;s not as full features as big booru software but super easy to get running. Performance has been good so far too. 
I&#039;m using one instance for memes and general images, and a separate instance (with auth) for pron.","filename":"blombooru","ext":".png","w":1249,"h":912,"tn_w":125,"tn_h":91,"tim":1780216028966749,"time":1780216028,"md5":"7AInOFL2sYAt1jeWDWmkFA==","fsize":1114666,"resto":108914628},{"no":108948621,"now":"05\/31\/26(Sun)09:38:33","name":"Anonymous","com":"Bump.<br><br>Also, thoughts on this extension? https:\/\/addons.mozilla.org\/en-US\/fi<wbr>refox\/addon\/view-page-archive\/<br><br>I haven&#039;t had much luck using it to search for stuff, if something isn&#039;t on archive.org or some archive.today mirror, then it might as well never have existed. But it seems interesting nonetheless. Pic related is the list of archives it tries to search.","filename":"1755353303311666","ext":".png","w":337,"h":483,"tn_w":87,"tn_h":125,"tim":1780234713279310,"time":1780234713,"md5":"+2mQFW42eDiLdINDN\/2Bug==","fsize":20176,"resto":108914628},{"no":108949837,"now":"05\/31\/26(Sun)13:02:17","name":"Anonymous","com":"I&#039;ve noticed that starting like a month ago, archive.today can no longer capture direct image links if the target image is larger than archive.today&#039;s browser height. Pretty sure that archive.today&#039;s browser viewport is 1024x768. So, high resolution pictures where you can click to zoom in, larger than 640x480 or 1024x768 = fails to capture it, eventually says &quot;Not Found (yet?)&quot;; I&#039;ve seen this happen multiple times.<br><br>I know of about one workaround: it involves making the high-res image show up in a certain webpage.<br><br>For an example, attached is a photo of disc 3A of Azumanga Daioh: The Animation (&quot;KIBA-9799&quot;) which was deleted off of archive.org\/details\/ by a non-uploader. 
Adding it (https:\/\/junnew.site\/raw\/e2pxrafWfR<wbr>p0t00kcjhuctRYV2ZhSh2mmyuWa2X6H3k = metadata) to archive.today:<br><br>Fails<br>https:\/\/archive.ph\/?url=https:\/\/lam<wbr>sachmay.store\/raw\/euuCXvWBN_qbXe4Wh<wbr>LVPba9mhj0PkDmequClMZGcFUE<br><br>Works<br>https:\/\/archive.ph\/?url=https:\/\/web<wbr>.archive.org\/https:\/\/lamsachmay.sto<wbr>re\/raw\/euuCXvWBN_qbXe4WhLVPba9mhj0P<wbr>kDmequClMZGcFUE","filename":"dvd3a1","ext":".png","w":1918,"h":1917,"tn_w":125,"tn_h":124,"tim":1780246937662046,"time":1780246937,"md5":"u0PVFEQs2yV8sAP6EtkrZQ==","fsize":691461,"resto":108914628},{"no":108949975,"now":"05\/31\/26(Sun)13:24:09","name":"Anonymous","com":"<a href=\"#p108947281\" class=\"quotelink\">&gt;&gt;108947281<\/a><br>Concerns:<br><br>Time it takes to tag everything.<br>I have more than 1 million images and nearly 1 million videos. I have only 1 me, and not a million idiots who are willing to be abused by booru.org when using the deletionist websites that booru.org owns (such as rule34.xxx); so, not a million idiot users who will tag my stuff. Need some AI \/ image recognition thing unless I want to spend months manually doing things (I don&#039;t).<br><br>Search systems heavily lend themselves to centralization.<br>Search systems, especially if large scale are almost entirely a server-side only thing, and only one server has it (localhost or a remote server). We need many servers with the same data if server-side only (like Elasticsearch) or a large scale client-side search thing. I&#039;ve always wondered if there&#039;s a thing that can effectively search millions of tagged images using only client-side tools: HTML, JS, maybe also WebAssembly. There&#039;s Arweave, a decentralized and distributed permanent storage network; theoretically, there could be multiple online services that use GraphQL to search it, but last I checked there&#039;s only one: Goldsky&#039;s. 
<br><br>Example query using Goldsky:<br><span class=\"quote\">&gt;https:\/\/arweave-search.goldsky.com<wbr>\/graphql?query=query%20just_values{<wbr>transactions(first:9,tags:[{name:%2<wbr>2IPFS-Hash%22,values:%0A%22[CID here]%22}]){edges{node{id%20tags{na<wbr>me%20value}}}}}<\/span><br><span class=\"quote\">&gt;https:\/\/web.archive.org\/web\/202502<wbr>08164940\/https:\/\/megalodon.jp\/2025-<wbr>0209-0148-57\/https:\/\/archive.is:443<wbr>\/r2ade<\/span>","time":1780248249,"resto":108914628},{"no":108950185,"now":"05\/31\/26(Sun)14:04:19","name":"Anonymous","com":"<a href=\"#p108948621\" class=\"quotelink\">&gt;&gt;108948621<\/a><br><span class=\"quote\">&gt;Perma.cc<\/span><br>I vaguely remember using this in the past. I had to make an account to save some web page if I remember correctly. Could only make 3 to 10 captures per year. We need more of &quot;a web we can return to&quot; and perma.cc isn&#039;t helping me with that:<br><span class=\"quote\">&gt;https:\/\/perma.cc\/docs\/perma-link-c<wbr>reation<\/span><br><span class=\"quote\">&gt;Memento<\/span><br><span class=\"quote\">&gt;Memento is a framework for accessing archived versions of web resources. Like many other web archiving services, Perma has implemented Memento. As a result, all public Perma Records are available via the Memento framework.<\/span><br>OK, where?! Typically that&#039;s at web root, but I see nothing under this or at the live page:<br>https:\/\/web.archive.org\/web\/*\/https<wbr>:\/\/perma.cc\/memento\/*<br><br>Nothing under that path, also proven by checking archive.is. Perhaps I can find captures labeled &quot;perma.cc&quot; at mementoweb.org though I have like no experience using mementoweb.org (hope I can at least search it like https:\/\/archive.is\/site.com = all captures under a site or specific subdomain).<br><br>I know for sure in the past I got perma.cc to save some https:\/\/xbooru.com\/ webpage. 
Looks like all perma.cc uploads are also uploaded to archive.org\/details\/; however, like I said, IA is untrustworthy. I no longer see that perma.cc xbooru.com capture in IA. I did see it in the past, probably deleted off of IA now. I can&#039;t search or find the capture in perma.cc as I no longer have the perma.cc ID which looks like ABCD-1234.","filename":"ip2lg","ext":".png","w":1024,"h":768,"tn_w":125,"tn_h":93,"tim":1780250659891857,"time":1780250659,"md5":"JL2PeGbIapvmcLuhzzkyQQ==","fsize":27562,"resto":108914628},{"no":108951098,"now":"05\/31\/26(Sun)16:12:57","name":"Anonymous","com":"<a href=\"#p108950185\" class=\"quotelink\">&gt;&gt;108950185<\/a><br><span class=\"quote\">&gt;perma.cc memento, where?<\/span><br>existed in the past, not anymore<br><span class=\"quote\">&gt;https:\/\/archive.is\/2026.05.31-1824<wbr>04\/https:\/\/web.archive.org\/web\/2025<wbr>0727181358\/https:\/\/groups.google.co<wbr>m\/g\/memento-dev\/c\/XHB4IezBiqA<\/span><br><span class=\"quote\">&gt;Tomorrow (Weds Feb 4, 2020), Perma.cc will begin the process of deploying completely reimplemented support for timegates, timemaps, and memento-related headers on Perma Link\/memento playbacks. Our timemaps, timegates, and memento-related headers have been broken since early last summer; we apologize for the frustration, and that it took us so long to address.<\/span><br><span class=\"quote\">&gt;<\/span><br><span class=\"quote\">&gt;We expect the full re-indexing to take several hours, possibly up to a full day. During this time, Perma will initially return 404 for all timemap and timegate queries; partial results will be exposed in real time as the index is re-built. 
I&#039;ll post to this list again when it&#039;s complete.<\/span><br><span class=\"quote\">&gt;<\/span><br><span class=\"quote\">&gt;Timemaps will subsequently be available at:<\/span><br><span class=\"quote\">&gt;https:\/\/perma.cc\/timemap\/link\/&lt;url<wbr>&gt; (replacing https:\/\/perma-archives.org\/warc\/tim<wbr>emap\/*\/&lt;url&gt;)<\/span><br><span class=\"quote\">&gt;https:\/\/perma.cc\/timemap\/json\/&lt;url<wbr>&gt; (newly available)<\/span><br><span class=\"quote\">&gt;https:\/\/perma.cc\/timemap\/html\/&lt;url<wbr>&gt; (non-standard, browser-friendly format, replacing https:\/\/perma-archives.org\/warc\/*\/&lt;<wbr>url&gt;)<\/span><br><span class=\"quote\">&gt;<\/span><br><span class=\"quote\">&gt;and timegates will be available at:<\/span><br><span class=\"quote\">&gt;https:\/\/perma.cc\/timegate\/&lt;url&gt; (replacing https:\/\/perma-archives.org\/warc\/tim<wbr>egate\/&lt;url&gt;)<\/span><br><span class=\"quote\">&gt;<\/span><br><span class=\"quote\">&gt;We will 301 redirect from the old routes to the new ones.<\/span><br><span class=\"quote\">&gt;<\/span><br><span class=\"quote\">&gt;More details are below, if anyone is interested or might be tripped up by our changes.<\/span><br><span class=\"quote\">&gt;<\/span><br><span class=\"quote\">&gt;We hope this whips Perma&#039;s Memento support into shape. 
We&#039;ve run the validator (http:\/\/mementoweb.org\/tools\/valida<wbr>tor) against our staging server (https:\/\/perma-stage.org), and it looks good, but I can now easily tweak the output as needed.<\/span>","time":1780258377,"resto":108914628},{"no":108951484,"now":"05\/31\/26(Sun)17:03:05","name":"Anonymous","com":"<a href=\"#p108951098\" class=\"quotelink\">&gt;&gt;108951098<\/a><br>One of those works:<br><span class=\"quote\">&gt;https:\/\/perma.cc\/timemap\/html\/http<wbr>s:\/\/0.0g.gg\/?a1a6d33cb095785d#-J5Ft<wbr>7PXfHJiCwRAqfvBN4ffpFsg2otNg2wwf8Jn<wbr>kfwKt<\/span><br><span class=\"quote\">&gt;Query Results<\/span><br><span class=\"quote\">&gt;No captures found for https:\/\/0.0g.gg\/?a1a6d33cb095785d<\/span><br>but it looks like it only does exact URL matches, not a list of links captured for a site or subdomain.<br><br>A URL that was captured:<br><span class=\"quote\">&gt;https:\/\/archive.org\/details\/perma_<wbr>cc_M55W-652M<\/span><br><span class=\"quote\">&gt;https:\/\/perma.cc\/timemap\/html\/http<wbr>s:\/\/www.dailymail.co.uk\/news\/articl<wbr>e-5182895\/Man-ordered-remove-Santa-<wbr>beard-violating-burqa-ban.html<\/span><br><span class=\"quote\">&gt;https:\/\/perma.cc\/timemap\/json\/http<wbr>s:\/\/www.dailymail.co.uk\/news\/articl<wbr>e-5182895\/Man-ordered-remove-Santa-<wbr>beard-violating-burqa-ban.html<\/span><br><span class=\"quote\">&gt;Query Results<\/span><br><span class=\"quote\">&gt;1 capture of https:\/\/www.dailymail.co.uk\/news\/ar<wbr>ticle-5182895\/Man-ordered-remove-Sa<wbr>nta-beard-violating-burqa-ban.html<\/span><br><span class=\"quote\">&gt;Captured At Perma Link \/ Memento Url<\/span><br><span class=\"quote\">&gt;Nov. 21, 2021, 6:54 p.m. 
https:\/\/perma.cc\/M55W-652M<\/span>","time":1780261385,"resto":108914628},{"no":108951890,"now":"05\/31\/26(Sun)18:16:14","name":"Anonymous","com":"<a href=\"#p108951484\" class=\"quotelink\">&gt;&gt;108951484<\/a><br><span class=\"quote\">&gt;not a list of links captured for a site or subdomain<\/span><br>Unless perma.cc improves their software, I somehow login with my forgotten email+password, or it shuts down, I may never find that webpage capture I got perma.cc to make years ago.<br><br>perma.cc has a Contingency Plan which includes:<br><span class=\"quote\">&gt;https:\/\/perma.cc\/contingency-plan<\/span><br><span class=\"quote\">&gt;3. Publish a map file. Perma.cc will publish to third-party websites a text file mapping each Permalink back to the original archived link. The file will also map Permalinks to the corresponding third-party archives, and to any new locations where the archive has been moved during the phaseout period.<\/span><br><br>Just give me the map file right now: with only &quot;Permalink &lt;-&gt; original link&quot;!<br><br>Other than that, I looked at everything in https:\/\/archive.is\/offset=1400\/perm<wbr>a.cc and didn&#039;t see my capture. I did see:<br><span class=\"quote\">&gt;failed capture: https:\/\/perma.cc\/4MAQ-UWJU -&gt; https:\/\/rejouer.perma.cc\/replay-web<wbr>-page\/w\/id-8a4e972226e2\/mp_\/[URL here]#<\/span><br><span class=\"quote\">&gt;https:\/\/perma.cc\/2R8K-3G6U &quot;Perma | Whites to Lose Majority Status in U.S. by 2042 - WSJ&quot;<\/span><br><span class=\"quote\">&gt;https:\/\/perma.cc\/NZ6D-YMEH &quot;Perma | Sam Heughan Felt Betrayed on &#039;Outlander&#039; Over &#039;Unnecessary&#039; Penis Shot - Business Insider&quot;<\/span><br><span class=\"quote\">&gt;https:\/\/perma.cc\/67WT-R4GD &quot;Perma | Korean streamer suddenly dies on stream - YouTube&quot;<\/span><br><span class=\"quote\">&gt;https:\/\/perma.cc\/5ZG2-BXUL &quot;INDUSTRIA GAMER IS EXACTLY THE SAME, A MAN THING. 
A WOMAN&#039;S PLACE IS DOING A WOMAN&#039;S THING, FUTILE AND IMBECILE THINGS THAT EVEN A RETARD COULD DO.&quot;<\/span>","filename":"2026-05-31-161419_1280x1024_scrot","ext":".png","w":1280,"h":1024,"tn_w":125,"tn_h":100,"tim":1780265774557681,"time":1780265774,"md5":"ZpDxP4LeYudgyTv8n63MBw==","fsize":284390,"resto":108914628},{"no":108952524,"now":"05\/31\/26(Sun)20:34:45","name":"Anonymous","com":"I need to modify a qBittorrent .fastresume file to fix the upload-download ratio stats. Those files are the metadata files which say how many pieces of a torrent you have and so on. They&#039;re stored at<br>~\/.local\/share\/qBittorrent\/BT_backu<wbr>p\/*.fastresume<br><br>Reason: I have 191.4 GiB of a torrent with a total size of 1954.8 GiB. After checking finished it downloaded 362.0 MiB more of it (didn&#039;t intend for this to happen). Right now, I don&#039;t want to download any more of it. It&#039;s messing up my share ratio because it&#039;s basing it on 362.0 MiB being total downloaded, not 0 MiB downloaded. 0 MiB being the total downloaded stat = share ratio works the same as if all of it was downloaded.<br><br>Don&#039;t know what to change in the .fastresume right now.<br><br>( Relatedly, I do know how to edit .fastresume files to fix for a different problem: https:\/\/desuarchive.org\/g\/thread\/10<wbr>8656842\/#108776690 )","time":1780274085,"resto":108914628},{"no":108953684,"now":"06\/01\/26(Mon)00:46:15","name":"Anonymous","com":"<a href=\"#p108952524\" class=\"quotelink\">&gt;&gt;108952524<\/a><br>qBittorrent v5.1.2 seems to already have some fix for that, if you&#039;re within a certain ratio\/fraction. 
Inside the ratio:<br>- does NOT do: 11.84 GiB uploaded \/ 362.0 MiB downloaded = 33.492 share ratio<br>- does this, basically: 11.84 GiB uploaded \/ 1.909 TiB total size = 0.0061 share ratio<br>- actual Share Ratio stat is off by one order of magnitude for some reason: 0.06 share ratio<br><br>Outside of said ratio:<br>- Downloaded: 3.0 MiB (~3,145,728 B); Uploaded: 167.7 MiB (~175,846,195 B); Total Size: 165.2 MiB<br>- Result: 54.71 share ratio<br><br><span class=\"quote\">&gt;Don&#039;t know what to change in the .fastresume right now.<\/span><br>In ~\/.local\/share\/qBittorrent\/BT_backu<wbr>p\/$i.fastresume<br><span class=\"quote\">&gt;:save_path16:\/some\/path9:seed_mode<wbr>i0e12:seeding_timei8174059e19:seque<wbr>ntial_downloadi0e10:share_modei0e15<wbr>:stop_when_readyi0e13:super_seeding<wbr>i0e16:total_downloadedi3215388e14:t<wbr>otal_uploadedi175914042e8:trackersl<wbr>l43:<\/span><br>notice how &quot;total_downloadedi3215388e14&quot; = 3,215,388 bytes and &quot;total_uploadedi175914042e8&quot; = 175,914,042 bytes, so change it to<br><span class=\"quote\">&gt;total_downloadedi0e14<\/span>","time":1780289175,"resto":108914628},{"no":108953911,"now":"06\/01\/26(Mon)01:40:28","name":"Anonymous","com":"<a href=\"#p108949975\" class=\"quotelink\">&gt;&gt;108949975<\/a><br>Then good news! Blombooru has an optional AI image tagger.<br><br>I&#039;m running it on piece of crap hardware so I can&#039;t test it out, but it looks like you just import a CSV of the tag set you want to use, then let the automated tagger iterate over your image collection.","time":1780292428,"resto":108914628},{"no":108953924,"now":"06\/01\/26(Mon)01:44:22","name":"Anonymous","com":"<a href=\"#p108949975\" class=\"quotelink\">&gt;&gt;108949975<\/a><br><a href=\"#p108953911\" class=\"quotelink\">&gt;&gt;108953911<\/a><br>Also, you raise an interesting idea. It would be possible to share suggested tags between booru instances by allowing image-hash queries against instances. 
Those queries could then return any tags associated with that image from a given booru.<br>You&#039;d need to federate or centralize, but it&#039;d be reasonably light-touch. Queries only go out when an image is added to a booru, and a response only comes back if that image hash matches one in the other booru&#039;s database. It only allows discovery of booru content insofar as it confirms that a specific image exists there (but it never serves it).","time":1780292662,"resto":108914628},{"no":108954667,"now":"06\/01\/26(Mon)05:17:36","name":"Anonymous","com":"My faithful old WD Blue 1TB is at 105,952 power-on-hours according to SMART. No errors, but it can&#039;t last forever. Pensive emoji.","time":1780305456,"resto":108914628},{"no":108954690,"now":"06\/01\/26(Mon)05:22:40","name":"Anonymous","com":"<a href=\"#p108954667\" class=\"quotelink\">&gt;&gt;108954667<\/a><br>That&#039;s 12 years. I think how much you use it also factors into the annualized failure rate (AFR) of HDDs. You say it&#039;s been powered on for that long, but how much do you read or write to it? I have multiple old 1TB HDDs sitting around, bit-rotting away.<br><br><a href=\"#p108937861\" class=\"quotelink\">&gt;&gt;108937861<\/a><br>Does ngrok allow me to pay to remove the verification wall with cryptocurrencies? If yes, it better be cheap.<br><br>Update: 10 USD per month, fuck that. Plus the only payment method is credit card: also bad. 
Source:<br>https:\/\/ngrok.com\/pricing -&gt; https:\/\/dashboard.ngrok.com\/billing<wbr>\/choose-a-plan -&gt; https:\/\/dashboard.ngrok.com\/billing<wbr>\/checkout?plan=hobbyist_monthly","time":1780305760,"resto":108914628},{"no":108954845,"now":"06\/01\/26(Mon)06:04:22","name":"Anonymous","com":"<a href=\"#p108940430\" class=\"quotelink\">&gt;&gt;108940430<\/a><br><span class=\"quote\">&gt;basically impossible to find sensical short-or-medium\/long-length sentences, especially if not looking at &quot;random English words&quot; book pages<\/span><br>Challenge: find 3 or 4 consecutive English words. This Library of Babel book titled &quot;dncvutswesgslocann l.r m&quot; has one English words page:<br>- uncompressed version: https:\/\/archive.is\/https:\/\/140.235.<wbr>158.66\/ipfs\/*<br>- compressed version: https:\/\/webthree.site\/raw\/_5YMz3PNJ<wbr>Cd2VaINqK-L-kjC-p2VFRq5RhzhfYxBfqE<br><br>Skimming through the &quot;random text&quot; part (not said page) and not using any tools, I found it difficult to find two consecutive English words. Findings (7): &quot;sex is|try.ok|sexbag|illcpgok|loli lupcp|sixneo|fap ew&quot;. Duck.ai only allow jpg\/png\/gif and pdf uploads. It can only analyze 15 PDF pages, so asking &quot;find meaningful sequences of words in this&quot; won&#039;t help unless it can analyze the 410 pages packed into 15 pages. For PDFs, duck.ai probably has both a page limit and a text length limit.<br><br>This book was deleted off of archive.org\/details\/ by someone other than the uploader<br><br><a href=\"#p108940671\" class=\"quotelink\">&gt;&gt;108940671<\/a><br><span class=\"quote\">&gt;the address space of libraryofbabel.app&#039;s URLs can only map to less than one quintillionth of all of the books in the Library of Babel megastructure<\/span><br>Issue addressed at<br><span class=\"quote\">&gt;https:\/\/libraryofbabel.app\/about<\/span><br><span class=\"quote\">&gt;the room identifiers are represented as SHA-256 hashes. 
The actual room identifiers get incredibly long, up to roughly a million characters. This makes them much too long to be used in URLs, so instead a &#039;bookmark&#039; is created and stored for the room in the form of a hash when it is first visited. This hash is used in place of the room identifier in the page URL.<\/span><br><span class=\"quote\">&gt;<\/span><br><span class=\"quote\">&gt;However, there are orders of magnitude more possible room identifiers than there are possible hashes. In theory, eventually when enough rooms are discovered there will be collisions (2 inputs map to the same output hash) and they will start to be overwritten as new, undiscovered rooms are accessed for the first time. In reality, this limit will never be reached<\/span><br><br>Vid unrelated","filename":"part_of_lLZbjEFWcgI","ext":".mp4","w":1458,"h":820,"tn_w":125,"tn_h":70,"tim":1780308262401049,"time":1780308262,"md5":"CzDTGa61LwMqRRVl53e0KQ==","fsize":3534909,"resto":108914628},{"no":108955321,"now":"06\/01\/26(Mon)07:43:20","name":"Anonymous","com":"If you use Wayback Machine, you&#039;ll sometimes get this error for days:<br><span class=\"quote\">&gt;web.archive.org refused to connect.<\/span><br>because they rate limit you. (It&#039;s not an HTTP error like 4xx or 5xx: those only happen after you successfully connect to a site.)<br><br>In these cases, you can still use the site in Tor Browser or with torsocks in a CLI. But how do you use a thing like torsocks but for I2P instead? 
First run &quot;i2prouter start&quot; to start the daemon then run:<br><span class=\"quote\">&gt;$ TZ=UTC http_proxy=http:\/\/localhost:4444 wget --no-verbose -O- http:\/\/artixmirror.i2p\/ | grep Onion<\/span><br><span class=\"quote\">&gt;[...]<\/span><br><span class=\"quote\">&gt;2026-06-01 11:16:14 URL:http:\/\/artixmirror.i2p\/ [4565] -&gt; &quot;-&quot; [1]<\/span><br><span class=\"quote\">&gt; &lt;li&gt;&lt;a href=&quot;http:\/\/artixhnbzrty77wcrnv4a5<wbr>ylx7ujro7w5ueopb6un6uxmc36lhnz2oid.<wbr>onion&quot;&gt;Onion Service&lt;\/a&gt;&lt;\/li&gt;<\/span><br><br>A use case: daily usages (URLs saved) are used up for Tor users using Wayback Machine. Oh wait, for this to work, you need to use an I2P outproxy from a CLI; not sure how to do that, but with the above method you can get WARCs of eepsites (.i2p):<br><span class=\"quote\">&gt;$ TZ=UTC http_proxy=http:\/\/localhost:4444 wget -p -r --level=1 --span-hosts --adjust-extension --convert-links --warc-max-size=700000000 --warc-cdx -e robots=off --warc-file=w http:\/\/fluttershy.i2p\/ 1&gt;wget_out.txt 2&gt;wget_err.txt<\/span>","filename":"bafkreigihecv7i3niauuygwfctxlqmw3hncq3dquv4kwtnuq5ez2dytwke","ext":".png","w":220,"h":220,"tn_w":125,"tn_h":125,"tim":1780314200013965,"time":1780314200,"md5":"z3Iv5nbdrZ6dVDs5cCLUIw==","fsize":6871,"resto":108914628},{"no":108955377,"now":"06\/01\/26(Mon)07:54:45","name":"Anonymous","com":"<a href=\"#p108943297\" class=\"quotelink\">&gt;&gt;108943297<\/a><br>ReplayWeb.page says it can &quot;Load Web Archive&quot; (replay) a webpage from a HAR file.<br><br>My excitement disappeared when I tried to open 5-MB local file<br><span class=\"quote\">&gt;https:\/\/dn720001.ca.archive.org\/0\/<wbr>items\/fa6cadb204274becc890306411d68<wbr>a\/i_localhost.har<\/span><br>in<br><span class=\"quote\">&gt;.\/ReplayWeb.page-2.4.6.AppImage<\/span><br>then got this error:<br><span class=\"quote\">&gt;An unexpected error occured: TypeError: Cannot read properties of undefined (reading 
&#039;size&#039;)<\/span>","filename":"UsaJg","ext":".png","w":1024,"h":768,"tn_w":125,"tn_h":93,"tim":1780314885981559,"time":1780314885,"md5":"goga4M9tI+IzcMg1+z6lUw==","fsize":14621,"resto":108914628},{"no":108955763,"now":"06\/01\/26(Mon)08:59:46","name":"Anonymous","com":"In the end, evil triumphs over good. All life and the universe itself will die to a heat death or something.<br><br>However, things are abysmal on much shorter timescales as well. In 10,000 years, all the information that humans have now will be gone. The English language will be gone or unrecognizable by then.","time":1780318786,"resto":108914628},{"no":108956123,"now":"06\/01\/26(Mon)09:56:12","name":"Anonymous","com":"<a href=\"#p108955763\" class=\"quotelink\">&gt;&gt;108955763<\/a><br>That&#039;s someone else&#039;s problem. I care about archiving because it&#039;s genuinely useful.","filename":"1750816208885901","ext":".jpg","w":736,"h":924,"tn_w":99,"tn_h":125,"tim":1780322172762446,"time":1780322172,"md5":"aUkpFME6PCZxaDSwVHWKCA==","fsize":104115,"resto":108914628},{"no":108957768,"now":"06\/01\/26(Mon)13:52:53","name":"Anonymous","com":"<a href=\"#p108955763\" class=\"quotelink\">&gt;&gt;108955763<\/a><br>So what? Keep going. Spread good things around. Out of spite, if you have to.","time":1780336373,"resto":108914628},{"no":108958491,"now":"06\/01\/26(Mon)15:39:45","name":"Anonymous","com":"<a href=\"#p108943297\" class=\"quotelink\">&gt;&gt;108943297<\/a><br><span class=\"quote\">&gt;Torrent which has that WARC of great interest (probably dead now):<\/span><br><span class=\"quote\">&gt;[magnet link]<\/span><br>Whoa, I saw one peer from Asia with 100% of this 897.92-GiB torrent today. That folder \/ infohash was created in 2022. It&#039;s hundreds of gigabytes of Web ARChive file, with outlinks! 
So it&#039;s not just WARCs of a single website.<br><br>This isn&#039;t just &quot;data hoarding for no reason&quot;, I find it helpful and am accessing a file in it right now.<br><br>Question is: can ReplayWeb.page only import 3 out of the 5 GB of the .warc.gz file? Or, does it require at least 5 GB of free space on the computer which is running the ReplayWeb.page AppImage?","time":1780342785,"resto":108914628},{"no":108958636,"now":"06\/01\/26(Mon)16:04:23","name":"Anonymous","com":"<a href=\"#p108955763\" class=\"quotelink\">&gt;&gt;108955763<\/a><br>The information will be gone because people don&#039;t really value information. Not because the English language will be gone or very different. In case anyone was making that connection.<br><br>If 10,000 years later, humans are more technologically advanced than they are now (or around the same level of advancement as now), then they can easily translate Industrial-Revolution-era English into whatever their language is then.<br><br><a href=\"#p108958491\" class=\"quotelink\">&gt;&gt;108958491<\/a><br><span class=\"quote\">&gt;can ReplayWeb.page only import 3 out of the 5 GB of the .warc.gz file?<\/span><br>Don&#039;t know, didn&#039;t test that (yet?).<br><br><span class=\"quote\">&gt;require at least 5 GB of free space on the computer which is running the ReplayWeb.page AppImage?<\/span><br>I hope it just processes the .warc and leaves the data storage and retrieval to wherever the .warc.gz is stored. It could be stored over sshfs where there&#039;s not enough space locally to store 5 new gigabytes.<br><br>Update: it does work like that! <a href=\"#p108943227\" class=\"quotelink\">&gt;&gt;108943227<\/a> stats on one of the 5.38-GB warcs: around 208817 records total.<br><br>However, going to http:\/\/localhost:5471\/w\/id-2cc15264<wbr>811f\/20190903230859mp_\/https:\/\/... 
= &quot;localhost refused to connect&quot;","time":1780344263,"resto":108914628},{"no":108958675,"now":"06\/01\/26(Mon)16:09:53","name":"Anonymous","com":"<a href=\"#p108958636\" class=\"quotelink\">&gt;&gt;108958636<\/a><br>Does ReplayWeb.page seriously not have a WebUI \/ thing: where the webpage replays can be accessed in any browser on any device in the LAN? I need that so I can get SingleFile to capture the replayed page.<br><br><a href=\"#p108951890\" class=\"quotelink\">&gt;&gt;108951890<\/a><br>Thought I found it a while ago (with the help of a third-party service), but it was some WARC created in 2026. The one I&#039;m looking for was from 2018 or 2022. Not this one:<br>https:\/\/archive.is\/2026.05.31-22330<wbr>2\/https:\/\/web.archive.org\/web\/timem<wbr>ap\/json?url=https:\/\/rejouer.perma.c<wbr>c\/&amp;matchType=prefix&amp;collapse=urlkey<wbr>&amp;output=json&amp;fl=original,mimetype,t<wbr>imestamp,endtimestamp,groupcount,un<wbr>iqcount&amp;limit=100000","filename":"1733015312812","ext":".png","w":792,"h":498,"tn_w":124,"tn_h":78,"tim":1780344593837180,"time":1780344593,"md5":"k6Q4Ywe0YC5EDuYCbcuqBA==","fsize":16570,"resto":108914628},{"no":108958919,"now":"06\/01\/26(Mon)16:43:23","name":"Anonymous","com":"<a href=\"#p108958675\" class=\"quotelink\">&gt;&gt;108958675<\/a><br>In my Apache server:<br><span class=\"quote\">&gt;ReplayWeb.page could not be loaded due to the following error:<\/span><br><span class=\"quote\">&gt;SecurityError: Failed to register a ServiceWorker for scope (&#039;https:\/\/10.0.0.54\/path\/replay\/&#039;) with script (&#039;https:\/\/10.0.0.54\/path\/replay\/sw.<wbr>js?serveIndex=1&#039;): An SSL certificate error occurred when fetching the script.<\/span><br><br>In server \/usr\/local\/nginx\/sbin\/nginx (usr local nginx, gotta remember that path):<br><span class=\"quote\">&gt;Sorry, the ReplayWeb.page system must be loaded from an HTTPS URL (or localhost), but was loaded from: 10.0.0.73:8000.<\/span><br><span 
class=\"quote\">&gt;Please try loading this page from an HTTPS URL<\/span><br><br>Per this guide:<br>https:\/\/replayweb.page\/docs\/embeddi<wbr>ng\/<br><br>Will have to do whatever steps (I forgot but did them in the past) to get this computer to trust a LAN IP address&#039;s HTTPS cert.","filename":"1590964986871","ext":".jpg","w":571,"h":548,"tn_w":125,"tn_h":119,"tim":1780346603061642,"time":1780346603,"md5":"urFaFPn3dDMhNv+OtDHXSw==","fsize":48729,"resto":108914628},{"no":108958968,"now":"06\/01\/26(Mon)16:49:34","name":"Anonymous","com":"<a href=\"#p108958919\" class=\"quotelink\">&gt;&gt;108958919<\/a><br>I stopped being dumb and loaded it from:<br>http:\/\/localhost:8000\/replay.html<br><br>However, now I get this error:<br><span class=\"quote\">&gt;Unexpected Loading Error: &quot;https:\/\/replayweb.page\/docs\/exampl<wbr>es\/tweet-example.wacz&quot;<\/span><br><br><a href=\"#p108958675\" class=\"quotelink\">&gt;&gt;108958675<\/a><br><span class=\"quote\">&gt;Thought I found it a while ago<\/span><br>referring to<br>https:\/\/archive.org\/details\/daily_p<wbr>erma_cc_2026-02-22 -&gt; 9KE9-G3V9.warc.gz","time":1780346974,"resto":108914628},{"no":108960189,"now":"06\/01\/26(Mon)19:56:39","name":"Anonymous","com":"<a href=\"#p108958968\" class=\"quotelink\">&gt;&gt;108958968<\/a><br>CORS policy error. 
The docs say to put this in the HTML:<br><span class=\"quote\">&gt;&lt;script src=&quot;https:\/\/cdn.jsdelivr.net\/npm\/r<wbr>eplaywebpage@2.4.6\/ui.js&quot;&gt;&lt;\/script&gt;<wbr><\/span><br><span class=\"quote\">&gt;<\/span><br><span class=\"quote\">&gt;&lt;replay-web-page source=&quot;https:\/\/replayweb.page\/docs<wbr>\/examples\/tweet-example.wacz&quot;<\/span><br><span class=\"quote\">&gt;url=&quot;https:\/\/oembed.link\/https:\/\/t<wbr>witter.com\/webrecorder_io\/status\/15<wbr>65881026215219200&quot;&gt;&lt;\/replay-web-pag<wbr>e&gt;<\/span><br><br>You want to change that to this after downloading that WACZ file to the same folder where &quot;replay.html&quot; is:<br><span class=\"quote\">&gt;&lt;replay-web-page style=&quot;height:9999px;&quot; source=&quot;tweet-example.wacz&quot;<\/span><br><span class=\"quote\">&gt;url=&quot;https:\/\/oembed.link\/https:\/\/t<wbr>witter.com\/webrecorder_io\/status\/15<wbr>65881026215219200&quot;&gt;&lt;\/replay-web-pag<wbr>e&gt;<\/span><br><br>It&#039;s a capture of this Shitter post:<br><span class=\"quote\">&gt;Want to help us make the best open-source web archiving tools to empower anyone to create, portable high-fidelity web archives?<\/span><br><span class=\"quote\">&gt;<\/span><br><span class=\"quote\">&gt;We are looking for a senior dev to focus on our web crawling tools, including R&amp;D to push the limits of web archiving!<\/span><br><span class=\"quote\">&gt;<\/span><br><span class=\"quote\">&gt;https:\/\/webrecorder.net\/jobs<\/span><br><span class=\"quote\">&gt;7:55 PM \u00b7 Sep 2, 2022<\/span>","filename":"zoomout_67percent_https---replayweb.page-docs","ext":".png","w":1280,"h":1024,"tn_w":125,"tn_h":100,"tim":1780358199074768,"time":1780358199,"md5":"+SZQNDeM3OeCrRRDPFns2w==","fsize":154866,"resto":108914628},{"no":108960404,"now":"06\/01\/26(Mon)20:29:21","name":"Anonymous","com":"<a href=\"#p108960189\" class=\"quotelink\">&gt;&gt;108960189<\/a><br>Point of that post was that I got it working, but 
it seems to only work on WACZ files and not WARC files. I wish these smelly little nerds would just make the AppImage have its web replays accessible in global localhost (not localhost in the AppImage).<br><br>That would be easier. I&#039;m having problems otherwise. For example: <a href=\"#p108958636\" class=\"quotelink\">&gt;&gt;108958636<\/a><br><span class=\"quote\">&gt;can ReplayWeb.page only import 3 out of the 5 GB of the .warc.gz file?<\/span><br>Yes. If you copied the first 2.19 GB of a 5-GB .warc.gz to another file then it can load everything in that. It shows no errors or has silent errors. However, trying to load that partial file at https:\/\/replayweb.page\/ fails with this error:<br><span class=\"quote\">&gt;Loading<\/span><br><span class=\"quote\">&gt;file:\/\/part.warc.gz<\/span><br><span class=\"quote\">&gt;...<\/span><br><span class=\"quote\">&gt;<\/span><br><span class=\"quote\">&gt;An unexpected error occured: NotReadableError: The requested file could not be read, typically due to permission problems that have occurred after a reference to a file was acquired.<\/span><br><br>Even though it has 777 permissions and is owned by my normal user. This shit is glitching.","filename":"samepath","ext":".png","w":1280,"h":935,"tn_w":125,"tn_h":91,"tim":1780360161840077,"time":1780360161,"md5":"fn5fPVa1dnf0sdm5XcN4gw==","fsize":79503,"resto":108914628},{"no":108960578,"now":"06\/01\/26(Mon)20:54:09","name":"Anonymous","com":"<a href=\"#p108943297\" class=\"quotelink\">&gt;&gt;108943297<\/a><br><a href=\"#p108958491\" class=\"quotelink\">&gt;&gt;108958491<\/a><br>Found this anime orgy pic in that torrent. It&#039;s from this URL which isn&#039;t in web.archive.org:<br><br>20190903215637 -&gt; https:\/\/i.pximg.net\/c\/1920x960_80_a<wbr>2_g5\/background\/img\/2019\/01\/19\/05\/0<wbr>5\/32\/12048318_801d09ef8901765a35dd3<wbr>cfb2699fc3f.png -&gt; HTTP 200, picrel<br><br>Probably also not in archive.is, but that site isn&#039;t loading for me now. 
The live version of that link as of now = HTTP 403 Forbidden.<br><br><a href=\"#p108960404\" class=\"quotelink\">&gt;&gt;108960404<\/a><br><span class=\"quote\">&gt;If you copied the first 2.19 GB of a 5-GB .warc.gz to another file then [ReplayWeb.page AppImage] can load everything in that<\/span><br>The ReplayWeb.page AppImage also doesn&#039;t care about headers \/ starting bytes in a certain way: nice. So you can run:<br><span class=\"quote\">&gt;$ zcat pbooru.com-2019-09-03-7322d8b3-0000<wbr>0.warc.gz | tail -c+2523460591 &gt; p2.warc<\/span><br>and the resulting &quot;p2.warc&quot; will load up just fine in the AppImage.","filename":"12048318_801d09ef8901765a35dd3cfb2699fc3f","ext":".png","w":1920,"h":960,"tn_w":125,"tn_h":62,"tim":1780361649332228,"time":1780361649,"md5":"\/eQcfGG\/Njn9kgZpOBgkXw==","fsize":2437139,"resto":108914628},{"no":108961043,"now":"06\/01\/26(Mon)22:17:48","name":"Anonymous","com":"<a href=\"#p108960404\" class=\"quotelink\">&gt;&gt;108960404<\/a><br><span class=\"quote\">&gt;but &lt;REPLAY-WEB-PAGE&gt; seems to only work on WACZ files and not WARC files<\/span><br>True. I looked into this more. As if it wasn&#039;t annoying enough that you have or can have the Gzip-compressed version and the uncompressed version of whatever.warc.gz, I have to make a .wacz version of the WARC. First, install py-wacz:<br><span class=\"quote\">&gt;https:\/\/pypi.org\/project\/wacz\/<\/span><br><span class=\"quote\">&gt;$ pip install wacz<\/span><br><br><a href=\"#p108960578\" class=\"quotelink\">&gt;&gt;108960578<\/a><br><span class=\"quote\">&gt;ReplayWeb.page AppImage also doesn&#039;t care about starting bytes in a certain way<\/span><br>wacz 0.5.0 does care though. Need a way to edit an uncompressed .warc file without vim trying to make a backup file. 
Anyways, you gotta run this so wacz version 0.5.0 can make the Web Archive Collection Zipped file:<br><span class=\"quote\">&gt;$ wacz create -o \/mnt\/usb\/myfile.wacz ~\/Downloads\/p4.warc<\/span><br><br>But hey, it finally works. Now the problem is that SingleFile 1.22.81 can&#039;t get it. (I forgot where I got this image from.)<br><br><span class=\"deadlink\">&gt;&gt;108960997<\/span><br><span class=\"quote\">&gt;&gt; mfw anon thinks a domain checker is the missing piece for web archiving<\/span><br>I never said that. I said it&#039;s an interesting site. It revealed some things I didn&#039;t know about.","filename":"timemap","ext":".png","w":1611,"h":475,"tn_w":125,"tn_h":36,"tim":1780366668080384,"time":1780366668,"md5":"bHMwtYhHAF50TRTvXdmA8A==","fsize":165799,"resto":108914628},{"no":108961059,"now":"06\/01\/26(Mon)22:20:17","name":"Anonymous","com":"How do I go about ripping\/encoding a DVD I got? I have a drive for it already.","time":1780366817,"resto":108914628},{"no":108961111,"now":"06\/01\/26(Mon)22:27:34","name":"Anonymous","com":"<a href=\"#p108961059\" class=\"quotelink\">&gt;&gt;108961059<\/a><br>Back when I used Windows a decade ago, I used these programs to rip many DVDs (mainly movies): to MKV and ISO. Software:<br>- MakeMKV<br>- AnyDVD, or was it called &quot;AnyDVD Pro&quot;?, I used a cracked version (warez)<br><br>I don&#039;t remember ever doing much DVD ripping with Linux. 
Maybe &quot;$ sudo cat \/dev\/sr0 &gt; file.iso&quot; works, or open it in VLC first so that it gets decoded.","time":1780367254,"resto":108914628},{"no":108961140,"now":"06\/01\/26(Mon)22:34:12","name":"Anonymous","com":"<a href=\"#p108961043\" class=\"quotelink\">&gt;&gt;108961043<\/a><br>The REPLAY-WEB-PAGE HTML element uses an iframe and a shadow root, both things which might be &quot;impossible&quot; for stuff such as SingleFile version 1.22.81 to capture.<br><br>Ugh.<br><br><a href=\"#p108958636\" class=\"quotelink\">&gt;&gt;108958636<\/a><br>Somewhat obvious, but if you open a .warc(.gz) in<br><span class=\"quote\">&gt;https:\/\/replayweb.page\/?source=fil<wbr>e%3A%2F%2Fpbooru.com-2019-09-03-732<wbr>2d8b3-00000.warc.gz<\/span><br>then it first needs to load the 5 GB file into RAM. Doesn&#039;t really matter since that too would use said iframe and shadow root DOM.","time":1780367652,"resto":108914628},{"no":108961477,"now":"06\/01\/26(Mon)23:43:36","name":"Anonymous","com":"<span class=\"deadlink\">&gt;&gt;108961016<\/span><br><span class=\"quote\">&gt;Implying I have any say in the matter<\/span><br>giw","filename":"we got a discord","ext":".jpg","w":1080,"h":1666,"tn_w":81,"tn_h":124,"tim":1780371816942164,"time":1780371816,"md5":"GF3aCt0pXNMERORUy4sqkw==","fsize":134694,"resto":108914628},{"no":108962214,"now":"06\/02\/26(Tue)03:07:11","name":"Anonymous","com":"<a href=\"#p108961140\" class=\"quotelink\">&gt;&gt;108961140<\/a><br>Possible solution: use ipwb (InterPlanetary Wayback) instead of ReplayWeb.page. As I remember, ipwb lacked archival fixity, so unlike ReplayWeb.page, it couldn&#039;t show non-Base64-encoded images in webpages. At least ipwb doesn&#039;t use iframe + shadow DOM \/ web components like ReplayWeb.page. Plus, all the images in the webpage I&#039;m thinking of are broken\/not-grabbed in ReplayWeb.page&#039;s replay as well. 
Both ipwb and the other correctly render CSS and probably also JS.<br><br>I&#039;m running this:<br><span class=\"quote\">&gt;$ ipwb --daemon \/dns\/10.0.0.76\/tcp\/5001\/http index \/pathTo\/00000.warc.gz &gt; \/pathTo\/00000.warc.cdxj<\/span><br>It took roughly 3 minutes to get started and now it says:<br><span class=\"quote\">&gt;Processing WARC records in 00000.warc.gz: 409\/104415<\/span><br><br>It&#039;s helpful to make this modification to &quot;\/usr\/local\/lib\/python3.12\/dist-pac<wbr>kages\/ipfshttpclient\/client\/__init_<wbr>_.py&quot; beforehand:<br>https:\/\/github.com\/ipfs-shipyard\/py<wbr>-ipfs-http-client\/commit\/c191872706<wbr>e1118d2cd76ea326a2a8d580899353<br><br>Picrel shows the entirety of the edit which is useful for ipwb&#039;s usage of ipfshttpclient.","filename":"2026-06-02-010232_1280x1024_scrot","ext":".png","w":1280,"h":1024,"tn_w":125,"tn_h":100,"tim":1780384031806578,"time":1780384031,"md5":"DJReUGEYHRd+EQGpU44heQ==","fsize":48448,"resto":108914628},{"no":108962297,"now":"06\/02\/26(Tue)03:39:02","name":"Anonymous","com":"<a href=\"#p108961477\" class=\"quotelink\">&gt;&gt;108961477<\/a><br><span class=\"quote\">&gt;giw<\/span><br>What does this mean?<br><br><a href=\"#p108961111\" class=\"quotelink\">&gt;&gt;108961111<\/a><br><span class=\"quote\">&gt;or was it called &quot;AnyDVD Pro&quot;?<\/span><br>AnyDVD HD 8.4.8.0; this screenshot of it was deleted off of archive.org\/details\/ by someone other than the uploader, so I added it to ar:\/\/:<br>https:\/\/kingoffireland.store\/raw\/oT<wbr>EbjOKt2ld91N_owwYtvmkSEsDvE1Ch3l0nG<wbr>kMDR6E","filename":"requesttrialnointernet","ext":".png","w":1280,"h":1024,"tn_w":125,"tn_h":100,"tim":1780385942410818,"time":1780385942,"md5":"ex+J+629zs+MUoldC7z5Vw==","fsize":111813,"resto":108914628},{"no":108962320,"now":"06\/02\/26(Tue)03:48:16","name":"Anonymous","com":"<a href=\"#p108961140\" class=\"quotelink\">&gt;&gt;108961140<\/a><br>SingleFile dev complaining about shadow roots:<br><span 
class=\"quote\">&gt;https:\/\/news.ycombinator.com\/item?<wbr>id=20232628<\/span><br><span class=\"quote\">&gt;On my side, the criticism I could make to web components is that there is no standard to serialize their shadow roots and, therefore, they are not deserializable without using JavaScript. I have been maintaining SingleFile [1], a web extension to save complete web pages, for 9 years and this is the first time I have had to include JavaScript code [2] to attach and display the shadow root of the web components (e.g. embed tweets) included in the saved page.<\/span><br><br>He also said:<br><span class=\"quote\">&gt;Thanks! You can find web components in a lot of unexpected places. For example, this page [1] contains more than 10K web components... The good news is that once the Pandora&#039;s box is open, I had the idea to code SingleFileZ [2] which also requires JavaScript to be enabled but frankly uses it!<\/span><br>10,000 shadow DOMs in one webpage? Disgusting.<br><br><a href=\"#p108962297\" class=\"quotelink\">&gt;&gt;108962297<\/a><br>&quot;got it wrong&quot; maybe.","time":1780386496,"resto":108914628},{"no":108962370,"now":"06\/02\/26(Tue)04:06:20","name":"Anonymous","com":"<a href=\"#p108923876\" class=\"quotelink\">&gt;&gt;108923876<\/a><br><span class=\"deadlink\">&gt;&gt;108960959<\/span><br>IA trannies are aware of this issue in the face of popular news outlets wanting to be excluded from the WBM. They will probably cave to their demands, just like WBM always does, and excessively does. (I&#039;d once again like to take this opportunity to say fuck archive.org.)<br><br>As of today or yesterday, you can go to any https:\/\/archive.org\/... page and see their message about it (picrel):<br><span class=\"quote\">&gt;Keep the news in the Wayback Machine. Sign Fight for the Future&#039;s letter. 
[ https:\/\/www.savethearchive.com\/News<wbr>Leaders ]<\/span><br>following that link, you can see:<br><span class=\"quote\">&gt;https:\/\/archive.is\/2026.06.02-0122<wbr>36\/https:\/\/www.savethearchive.com\/N<wbr>ewsLeaders<\/span><br><span class=\"quote\">&gt;Are you a journalist? Join Rachel Maddow and sign the journalist letter here.<\/span><br><span class=\"quote\">&gt;A project by Fight for the Future<\/span><br><span class=\"quote\">&gt;Tell New York Times, The Atlantic, and USA Today to keep the crucial work of journalists in the Wayback Machine!<\/span><br><span class=\"quote\">&gt;The news isn\u2019t getting preserved in the Wayback Machine anymore because major media outlets are blocking it.<\/span><br><span class=\"quote\">&gt;This petition is a demand for them to stop.<\/span>","filename":"kxsHC","ext":".png","w":1024,"h":768,"tn_w":125,"tn_h":93,"tim":1780387580333164,"time":1780387580,"md5":"zJVsLN4iLUItJIMyhdM0ew==","fsize":40323,"resto":108914628},{"no":108962393,"now":"06\/02\/26(Tue)04:12:56","name":"Anonymous","com":"Why&#039;d you or someone else delete your posts? Promoting some thing in a spammy\/AI way? I found <span class=\"deadlink\">&gt;&gt;108960981<\/span> to be especially suspect. Also the thing you said about csv files sounded sorta dumb.<br><br><a href=\"#p108962370\" class=\"quotelink\">&gt;&gt;108962370<\/a><br>that pic is login-walled in IA (would add it to ar:\/\/ but it&#039;s not flat-out deleted, yet). 
maybe I&#039;ll add it anyways...","filename":"d9gdr6s-71f1f835-a909-no-watermark","ext":".jpg","w":644,"h":1000,"tn_w":80,"tn_h":125,"tim":1780387976340179,"time":1780387976,"md5":"IPXarolmPVFerwyaKNPkKQ==","fsize":380506,"resto":108914628},{"no":108962439,"now":"06\/02\/26(Tue)04:26:03","name":"Anonymous","com":"<a href=\"#p108962297\" class=\"quotelink\">&gt;&gt;108962297<\/a><br><a href=\"#p108962320\" class=\"quotelink\">&gt;&gt;108962320<\/a><br>&quot;giw&quot; is more likely &quot;god i wish&quot; since anons sometimes abbreviate &quot;god i wish that were me&quot; to &quot;giwtwm&quot;<br><br><a href=\"#p108962393\" class=\"quotelink\">&gt;&gt;108962393<\/a><br><span class=\"quote\">&gt;Why&#039;d you or someone else delete your posts?<\/span><br>NTA, but it could be unrelated. I had all my posts nuked once after pretending to be an LLM despite the fact that the posts before that were perfectly normal replies.","filename":"1754254849340668","ext":".png","w":598,"h":566,"tn_w":125,"tn_h":118,"tim":1780388763062297,"time":1780388763,"md5":"9Z11rWGvBq6jlViwTaB1pg==","fsize":87974,"resto":108914628},{"no":108962523,"now":"06\/02\/26(Tue)04:48:36","name":"Anonymous","com":"<a href=\"#p108962439\" class=\"quotelink\">&gt;&gt;108962439<\/a><br><span class=\"quote\">&gt;image<\/span><br>lulz, did the official DevianTART twitter account really join in to laugh at some fat neckbeard commenting on some weird furfag artwork in DA?<br><br>Was that guy in fact a chatbot making short 4chan shitposts ITT while actually shilling something? Let&#039;s look closer at his posts (mostly chronological):<br><br>1.<br><span class=\"quote\">&gt;&gt;implying any centralized archive won&#039;t get enshittified eventually<\/span><br><span class=\"quote\">&gt;<\/span><br><span class=\"quote\">&gt;local hoarding or die, everything else is cope<\/span><br>Somewhat agree. 
Local redundancy and actually having the data locally so you can effectively deal with whatever remote &quot;cloud&quot; deleting everything is top priority. Second priority is sharing it over BitTorrent and so on. &quot;What happens to your data when you die?&quot; may also concern some people.<br><br>2.<br><span class=\"quote\">&gt;&gt;mfw someone still thinks ngrok free tier is usable for anything serious<\/span><br>Strongly agree. At first I thought it would be helpful, but later I found that ngrok free tier is a useless or personal-use-only toy.<br><br>3.<br><span class=\"quote\">&gt;&gt; tabhab<\/span><br><span class=\"quote\">&gt;<\/span><br><span class=\"quote\">&gt;kek, did you just have a stroke or is that the new meta<\/span><br>&quot;Pointless post&quot;<br><br>4.<br><span class=\"quote\">&gt;saucenao&#039;s been dying for years. just use iqdb or reverse image search on yandex.<\/span><br>Sounds like a canned reply. Sounds suspect. Made me think of that incident where anti-CP organizations were uploading CP to saucenao.<br><br>5.<br><span class=\"quote\">&gt;[says I can use some service for arweave uploads instead of &quot;wrestling with raw api calls&quot;]<\/span><br>So is your service a scraping tool or some crypto \/ blockchain thing? And you just so happen to have a solution for my niche problem.<br><br>6.<br><span class=\"quote\">&gt;&gt; using web.archive to dodge rate limits is just asking for a banwave when they start blocking wbm ips<\/span><br><span class=\"quote\">&gt;<\/span><br><span class=\"quote\">&gt;fr if you&#039;re scraping at scale just [use my service] instead, way cleaner than fighting with archive.org&#039;s jank<\/span><br>First part sounds sorta legit, second part is very suspect. 
BTW, web.archive.org does rate limiting as well if you try to download many (or just a few) of their captures.","filename":"GJZM4","ext":".png","w":1024,"h":768,"tn_w":125,"tn_h":93,"tim":1780390116913394,"time":1780390116,"md5":"1MM2zG3io5Pv\/tndsWHYiQ==","fsize":53251,"resto":108914628},{"no":108962587,"now":"06\/02\/26(Tue)05:07:28","name":"Anonymous","com":"<a href=\"#p108962523\" class=\"quotelink\">&gt;&gt;108962523<\/a><br><span class=\"quote\">&gt;giwtwm image<\/span><br>yeah, classic screenshot image. (I first saw that image and anons saying that years ago.)<br><br><span class=\"quote\">&gt;effectively deal with whatever remote &quot;cloud&quot; deleting everything is top priority<\/span><br>Also need to make sure you have the data on at least two HDDs in case one dies. I should probably just go buy a HDD for one of my non-mirrored zpools.<br><br><span class=\"quote\">&gt;Was that guy in fact a chatbot shilling something? Let&#039;s look closer at his posts (mostly chronological):<\/span><br>7.<br><span class=\"quote\">&gt;&gt; &quot;fuckai&quot; is the most honest title in that whole library<\/span><br>Stupid \/ uninformed post. Currently, Library of Babel sites use zero AI, only good old-fashioned mathematics and code:<br><span class=\"quote\">&gt;https:\/\/libraryofbabel.app\/about<\/span><br><span class=\"quote\">&gt;Most importantly, the library, and it\u2019s books, are not stored anywhere \u2014 more on that below. Instead, a book is generated by a mathematical function when you view it. Once you view it, it is still not stored anywhere \u2014 the same way that by computing 2 \u00d7 5, the resulting 10 is not being stored anywhere, it is simply the output of the function. 
The next person to visit that same book will compute the same function again, and get the same answer.<\/span><br><br>8.<br><span class=\"quote\">&gt;&gt; mfw anon thinks a domain checker is the missing piece for web archiving<\/span><br><span class=\"quote\">&gt;<\/span><br><span class=\"quote\">&gt;just use archive.is and stop overcomplicating it<\/span><br>Already said in the other post that this is stupid. The second part is stupid as well. BTW, in the past archive.today used to have a master index. Not anymore, but you&#039;d go to their main index and it&#039;d list all the domains it has captures of, from 0example.com to zexample.com.<br><br>9.<br><span class=\"quote\">&gt;&gt; using discord for anything you want to keep<\/span><br><span class=\"quote\">&gt;<\/span><br><span class=\"quote\">&gt;lmao ngmi<\/span><br>Doesn&#039;t really correspond to what he replied to, dumb short 4chan post, but still sorta based.<br><br>10.<br><span class=\"quote\">&gt;&gt; shodan is for scanning ips and exposed services, not subdomain enumeration<\/span><br><span class=\"quote\">&gt;<\/span><br><span class=\"quote\">&gt;just use [services] like everyone else, they pull from [services] for free<\/span><br>First part sounds legit, second part is maybe also legit, but I redacted it anyways.","filename":"NUW7i","ext":".png","w":1024,"h":768,"tn_w":125,"tn_h":93,"tim":1780391248041966,"time":1780391248,"md5":"HadRIRnyjHH8FAJv7960GA==","fsize":457962,"resto":108914628},{"no":108962618,"now":"06\/02\/26(Tue)05:15:41","name":"Anonymous","com":"<a href=\"#p108962523\" class=\"quotelink\">&gt;&gt;108962523<\/a><br><a href=\"#p108962587\" class=\"quotelink\">&gt;&gt;108962587<\/a><br>Yep, it was likely a shillbot. 
Check desuarchive for the product he was shilling, it was spammed all over the place.","filename":"1771791883385650","ext":".jpg","w":1200,"h":675,"tn_w":125,"tn_h":70,"tim":1780391741334511,"time":1780391741,"md5":"cr+rQkcYgqI4ysFfpiSXsg==","fsize":124973,"resto":108914628},{"no":108962625,"now":"06\/02\/26(Tue)05:18:59","name":"Anonymous","com":"<a href=\"#p108962587\" class=\"quotelink\">&gt;&gt;108962587<\/a><br>11.<br><span class=\"quote\">&gt;&gt;muh 1 million images<\/span><br><span class=\"quote\">&gt;<\/span><br><span class=\"quote\">&gt;bro just use a local CLIP model and call it a day. you&#039;re overthinking this.<\/span><br>Guess he used the LLM which pretends to be a 4chan user. That is a legit thing though:<br><span class=\"quote\">&gt;https:\/\/github.com\/openai\/CLIP<\/span><br><span class=\"quote\">&gt;CLIP (Contrastive Language-Image Pretraining), Predict the most relevant text snippet given an image<\/span><br><br>Any point engaging with a bot\/non-human? Yes. If you understand part of Hegel&#039;s philosophical project, you&#039;ll know that engaging with falsity isn&#039;t as bad as one may think:<br><span class=\"quote\">&gt;https:\/\/www.marxists.org\/reference<wbr>\/archive\/hegel\/works\/ph\/phprefac.ht<wbr>m#m039<\/span><br><span class=\"quote\">&gt;Truth and falsehood as commonly understood belong to those sharply defined ideas which claim a completely fixed nature of their own, one standing in solid isolation on this side, the other on that, without any community between them. Against that view it must be pointed out, that truth is not like stamped coin that is issued ready from the mint and so can be taken up and used. Nor, again, is there something false, any more than there is something evil. 
Evil and falsehood are indeed not so bad as the devil, for in the form of the devil they get the length of being particular subjects; qua false and evil they are merely universals, though they have a nature of their own with reference to one another. Falsity (that is what we are dealing with here) would be otherness, the negative aspect of the substance, which [substance], qua content of knowledge, is truth. But the substance is itself essentially the negative element, partly as involving distinction and determination of content, partly as being a process of distinguishing pure and simple, i.e. as being self and knowledge in general. Doubtless we can know in a way that is false. To know something falsely means that knowledge is not adequate to, is not on equal terms with, its substance. Yet this very dissimilarity is the process of distinction in general, the essential moment in knowing.<\/span><br><br><a href=\"#p108962618\" class=\"quotelink\">&gt;&gt;108962618<\/a><br>was thinking that too: other threads","time":1780391939,"resto":108914628}]}