PDF Forgeries Are Surprisingly Rare
One kind of fraud is striking in its absence online: tampered or forged PDFs. People create malicious videos, photos, chat logs, and Microsoft Word documents all the time to scam and propagandize others, and they publish entire new PDFs full of garbage science, but they don’t edit existing PDFs.
Once in a while I see someone object to a paper I host on Gwern.net by saying “but that’s not on a real journal website! it’s not peer-
There is plenty of incompetence, fraud, and malice online, often in PDFs… but only new PDFs. I can’t think of a single fraud accomplished by editing a real PDF & just uploading it for Google Scholar etc. to index, or a case where I’ve been burned by even a mislabeled PDF.
You can just search for a paper title, download it, and trust that ~100% of the time, you are getting what you thought you were getting, with the main caveat being that you may be downloading the author’s draft or a preprint and not the finalized version (particularly in economics, where papers might go through many preprints, sometimes changing the results substantially along the way, and take anywhere up to a decade to reach final publication). And when you do find a PDF asserting something malicious, such as claiming to use statistics to show that ‘Trump won the 2020 US presidential election’, it’s always a ‘new’ PDF, which is forthright about being a new unpublished ‘white paper’ or somesuch, and doesn’t purport to be a published paper. Or if it was a forged or edited document, it was usually clearly exported from Microsoft Word or another word processor (eg. all the forgeries exposed by anachronistic use of the Calibri font).
Whereas, if you were so epistemically careless with images on, say, Facebook, you would wind up with a folder stuffed full of lying images which have been Photoshopped, claimed to be things other than what they are, ‘deep faked’, etc.
PDF forgery is striking because it’d be so easy to do: find a useful research paper, edit it in any of the many PDF utilities, upload anywhere, wait for people to copy it (as they do), then take down yours; now you have an authoritative peer-
Given how rarely people check the original papers, and how retracted studies like the Wakefield autism/
And it’s not as if there are no zealots or fanatics or malefactors willing to do so: historically, scribes tampered with documents all the time! (“Written by Confucius” or “apropos of nothing, now I, Josephus the Jew, will tell you how wonderful Jesus Christ was”…)
Why can you just download PDFs off any random asshole’s website (like mine)?
Because there’s no Photoshop for PDFs, maybe? Places like arXiv provide TeX sources, but that’s still a dark art for most would-
This is enough to push malefactors into other approaches. After all, if editing photos can work so well, why bother with the much harder editing of PDFs?
As the joke goes, PDFs don’t need to outrun the (Russian?) bear, they just need to outrun the other formats.