View Post

Poster: MashTec Date: Jan 21, 2024 10:55am
Forum: general Subject: Suppress files from "Download Options" list

Hello,
I'm new to Archive.org and just did my first upload here. Almost all the files I uploaded are meant to be published for the community to download (the thumbnail image, for example, is not).
After I uploaded my files, Archive.org seems to have auto-generated some metadata: lots of .xml, .sqlite, and .xml.gz files (HOCR and CHOCR). All these additional files are shown in the "Download Options" list and make it hard to see the actual files to download.
I didn't see this on any other pages on Archive.org. Did I miss an important option to suppress these files? Is there any way to get rid of them so the page shows only the actual content?

Reply

Poster: sydneydux Date: Jan 21, 2024 6:02pm
Forum: general Subject: FWIW

Older uploads here seem to have fewer download options. Maybe that’s what you’re seeing on other pages here.

Newer uploads (like a PDF) will auto-generate (“derive”) many file types, some for folks with “print disabilities.” I guess we should be supportive of that high level of empathy by IA.
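If it helps to see which files are yours and which were derived, here is a rough Python sketch. It assumes you have already fetched the item's metadata JSON (for example from `https://archive.org/metadata/<identifier>`) and that each entry in its `files` list carries a `name` and a `source` field with values such as `original` or `derivative`. That response shape is an assumption to check against your own item; the function name `group_files_by_source` is hypothetical and the sample data is made up.

```python
from collections import defaultdict


def group_files_by_source(item_metadata):
    """Group file names from an item-metadata dict by their "source" value.

    Assumes a top-level "files" list whose entries have "name" and
    "source" keys (an assumption about the metadata JSON, not verified here).
    """
    groups = defaultdict(list)
    for entry in item_metadata.get("files", []):
        # Entries without a "source" key are collected under "unknown".
        groups[entry.get("source", "unknown")].append(entry.get("name", ""))
    return dict(groups)


# Made-up sample shaped like the assumed response; not real item data.
sample = {
    "files": [
        {"name": "manual.pdf", "source": "original"},
        {"name": "manual_chocr.html.gz", "source": "derivative"},
        {"name": "manual_hocr_searchtext.txt.gz", "source": "derivative"},
    ]
}

for source, names in group_files_by_source(sample).items():
    print(source, names)
```

This only sorts the list for your own reference; it does not hide anything on the item page.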

You can always add notes to your upload (after it’s done deriving, i.e., once the light-green horizontal bar disappears). You could say, “For a text-based PDF that you can search, download ‘PDF WITH TEXT’; for a PDF with high-quality images, download ‘PDF’.” You can add more notes describing the various searchable-text options, etc. I’m sure there are FAQs with this info you can pull text from. And, yes, the text-based PDFs sometimes have poor image quality (especially old, dark photos).

Hope this helps.

Reply

Poster: MashTec Date: Jan 22, 2024 12:07pm
Forum: general Subject: Re: FWIW

Thank you for your answer.