
Knowledge tech that's subtly wrong is more dangerous than tech that's obviously wrong. (Or, where I disagree with Robin Sloan.)

Baldur Bjarnason

So, I’ve long been a fan of Robin Sloan’s. This made one of his latest blog posts a bit of a disappointment.

I disagree with pretty much both its core premise and every step of its reasoning.

And, unfortunately, because I’ve long been a fan, I feel obligated to explain why, which is tedious and generally fruitless because, at this point in the AI Bubble, debate isn’t going to change anybody’s mind.

Maybe if I scream into a pillow the urge to write a reply blog post will go away?

One muffled howl of despair later.

Okay, that didn’t work.

So, first off, the core framing of the argument is wrong. It portrays any firm conclusion, for or against, as inherently unreasonable, as if you couldn’t have reached an unambiguous “LLMs are wrong” stance through proper reasoning.

This passage, specifically, outright claims that anybody who has already come to a conclusion based on their research is wrong:

First, if you (you engineer, you AI acolyte!) think the answer is obviously “yes, it’s okay”, or if you (you journalist, you media executive!) think the answer is obviously “no, it’s not okay”, then I will suggest that you are not thinking with sufficient sensitivity and imagination about something truly new on Earth.

Consider that the most consistent, most forceful, and least ambiguous warning cries about LLMs have come from AI and Machine Learning academics like Timnit Gebru, Dr Abeba Birhane, Emily M. Bender, and Dr Damien P. Williams (I’m using “Dr” based on how they represent themselves on social media), to name just a few. They have come to their conclusions precisely because they have thought about the topic with “sufficient sensitivity and imagination”, not to mention extensive domain knowledge, a deep understanding of how the tech works, and a clear view of how these models interact with the larger context of society and culture.

They’ve been consistent in their criticism and have repeated it again and again to anybody who wants to listen. Everybody who wants to use LLMs because they think they’re nifty keeps ignoring what they say.

Or, not to put too fine a point on it: if you find the arguments against LLMs easy to dismiss, that’s because you are choosing to only pay attention to the arguments that are easy to dismiss.

The model is not the data set

Robin Sloan’s next argument, after dismissing both copyright concerns and the common “AI” boosterism that LLMs learn just like people (a claim that is, in and of itself, obviously wrong given even a superficial understanding of the neurology of learning), is that LLMs are all of writing (emphasis original).

In this formulation, language models are not merely trained on human writing. They are the writing, granted the ability to speak for itself. I imagine the PyTorch code as a mech suit, with squishy language strapped in tight…

The problem with this statement is that the map is not the territory; a model of a data set is not the data set. To say that a language model is the writing is equivalent to saying that cyanide is an apple just because you can get cyanide from processing apple seeds.

Exacerbating the problems with this framing are the assumptions made about the data set. Namely, that text on the internet is a fair representation of the entirety of human writing.

Honestly, I find it hard to think of an idea that’s further from the truth.

The medium is the message

Text on the web is shaped by the web: its culture, form, incentives, and economics. As McLuhan was fond of saying, “the medium is the message”. The message delivered by the web is bound, modified, structured, created, and delivered under the conditions and precepts of the web as a medium.

And most of what gets delivered is outright poisonous.

The web, taken in its near entirety as LLM vendors seem to have done, isn’t the wellspring of all written knowledge but is instead a poisoned well. It may be only a “little bit” poisoned from the perspective of those of us swimming in a sea of aggro and trolling, but that toxicity gets magnified when the ocean is condensed into a model. The result is tools with magnified biases and errors, wrapped in the capability to replicate the formal and academic writing styles that we associate with truth, factuality, and meaning.
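To make the magnification point concrete, here’s a toy sketch. Everything in it is invented for illustration – the corpus, the 70/30 split, and the trivial next-word “model” – and real systems are vastly more complicated, but it shows how a skew in the training text can become an absolute in the output once you decode greedily.

```python
from collections import Counter

# Invented toy corpus with a 70/30 skew; not a measurement of any real training set.
corpus = ["the doctor said he"] * 70 + ["the doctor said she"] * 30

# "Train" a trivial next-word model: count which word follows the prompt.
counts = Counter(sentence.rsplit(" ", 1)[1] for sentence in corpus)

prompt = "the doctor said"

# Greedy (most-likely) decoding: the 70% majority pattern is emitted 100% of the time.
completion = counts.most_common(1)[0][0]

print("corpus skew:", counts)                  # Counter({'he': 70, 'she': 30})
print("model output:", prompt, completion)     # always 'he'
```

A 70/30 imbalance in the well becomes a 100/0 imbalance at the tap; that’s the magnification.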

This is what makes LLMs so dangerous. Assume that competitive forces will drive costs and associated climate impacts down (not a given by any means, considering how irrational the US tech market has become), and look past the fact that these tools are explicitly being made and marketed as methods for reducing the bargaining power of labour (which is certainly a look). The biggest risk with these models is still that they are consistently wrong, with an inherent base error rate that none of the vendors have managed to remove because it’s baked into the concept of modelling a poisoned well. And it all comes wrapped up in writing styles we associate with correctness and formality, but with enough variability and subtlety to make the errors impossible to detect programmatically or automatically.

That means the harm done by these systems compounds the more widely they are used, as errors pile up at every stage of work, in every sector of the economy. It builds up an ambient radiation of system variability and error that magnifies every other systemic issue with the modern state and economy.

It’s as if homeopathy and naturopathy got adopted as standard practices in the healthcare system. Society might be fine if they’re only used by a small collection of cranks, but once they were adopted by half of the medical profession, things would get bad, fast.

That includes the current poster child of “LLMs are awesome”: coding assistants. These tools have the same error rates, repetition, and biases as other LLM applications and they consistently perpetuate harmful or counterproductive practices from the past (like an over-reliance on React or poor accessibility). A world where every programmer has adopted a coding copilot, unless there’s a revolutionary improvement in the effectiveness of these tools, is a world where the software industry collapses in on itself in a cascade of bugs, poor design, and systemic failures.

And generating novel training data – whether automatically or through labour – doesn’t get past this problem. The biases are an artefact of the environment, and AI vendors are some of the most irrational and biased environments available. Automatically generated training data also just perpetuates the issues with the original data set.

None of this works the way boosters are claiming. None of it.


Aside: calling LLMs for translation a “Babel Fish” glosses over a major flaw in how these tools work for translation: they only sorta kinda work for languages with a large body of text in the training data set. For most languages in the world that don’t have huge pools of textual sludge all over the web, like Icelandic or Danish, LLMs are less reliable machine translators than the old pre-LLM approaches.

And even when it does kinda sorta work, the reader is doing a lot of heavy lifting by interpreting the model’s incredibly random and shitty translations based on context. What’s worse is that these tools have a thick layer of biases baked into the experience, because most of them are fundamentally English-language tools made by English-speakers, using heavily biased collections of English-language texts as the foundation. This means that LLMs often translate gendered or non-gendered pronouns incorrectly based on stereotypes (doctors are male, nurses female, according to these models) and, as anybody who studies international queer media has already discovered, regularly translate local words for “partner” or “lover” as “boyfriend” if the speaker or writer is a woman, or as “girlfriend” if they’re a man.
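Here’s a toy sketch of that failure mode. Finnish “hän” really is a gender-neutral third-person pronoun, but the occupations, the co-occurrence counts, and the frequency-based “translator” below are invented for illustration; real systems are far more elaborate, yet the gap-filling logic is the same kind of statistical fallback.

```python
# Invented co-occurrence counts from a hypothetical, skewed training corpus.
stereotype_counts = {
    "doctor": {"he": 90, "she": 10},
    "nurse":  {"he": 5,  "she": 95},
}

def translate_pronoun(occupation: str) -> str:
    """Pick the English pronoun most often seen with this occupation,
    i.e. resolve a genuinely ambiguous pronoun by falling back on the stereotype."""
    counts = stereotype_counts[occupation]
    return max(counts, key=counts.get)

# "Hän on lääkäri" / "Hän on sairaanhoitaja" leave gender unspecified,
# but a frequency-driven system fills that gap with the majority pattern.
print("hän (doctor) ->", translate_pronoun("doctor"))  # he
print("hän (nurse)  ->", translate_pronoun("nurse"))   # she
```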

This shit is translating the world through the lens of a bigoted, middle-aged American.


Finally, the last resort of the tech booster: complete and utter science fiction

If super science is a possibility—if, say, Claude 13 can help deliver cures to a host of diseases—then, you know what? Yes, it is okay, all of it. I’m not sure what kind of person could insist that the maintenance of a media status quo trumps the eradication of, say, most cancers. Couldn’t be me. Fine, wreck the arts as we know them. We’ll invent new ones.

This is science-fiction. There is no path that can take the current text synthesis models and turn them into super-scientists.

You don’t get super-science by modelling text that is itself an inaccurate model of reality. That’s not how any of it works.

LLM boosters keep saying that AI will help science. They point to studies where statistical modelling is used to further biological or medical research, and then conclude that LLMs will enable super-science.

But the statistical modelling used in biological, medical, or “hard science” research is not modelling of the scientific texts themselves. It’s modelling of data and statistical observations. That statistical modelling of data might help with statistical research is an obvious conclusion. That statistical modelling of text might do the same is not. That’s an extraordinary claim, and extraordinary claims require extraordinary evidence.
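The distinction is easy to show with a toy sketch. The numbers below are fabricated purely for illustration; the point is the contrast between the two kinds of “statistical modelling” that keep getting conflated.

```python
import statistics
from collections import Counter

# 1. Modelling observations: noisy (dose, response) measurements.
#    A least-squares fit recovers the underlying relationship.
doses     = [1.0, 2.0, 3.0, 4.0, 5.0]
responses = [2.1, 3.9, 6.2, 8.0, 9.9]                  # roughly response ≈ 2 × dose
fit = statistics.linear_regression(doses, responses)   # Python 3.10+
print(f"estimated effect per unit dose: {fit.slope:.2f}")   # ≈ 1.97

# 2. Modelling text about the observations: token frequencies in a
#    sentence describing the experiment. This tells you which words are
#    common, not what the effect size is.
abstract = "the dose increased the response in the treated group"
print(Counter(abstract.split()).most_common(3))        # [('the', 3), ...]
```

The first kind of modelling is what actually helps researchers; the second is what a language model does.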

There is no path from language modelling to super-science. That claim exists only to help inflate the massive financial bubble that has grown around LLMs.

So—is super science really on the menu? We don’t have any way of knowing; not yet.

We absolutely do know. It is not on the menu. ML and statistical models – not LLMs – help with scientific research, but not to a degree that would enable super-science. Just regular, plodding scientific progress. There is no magic solution to scientific research on the horizon, and believing in that magical solution is exactly why people like Elon Musk and the US’s tech oligarchy are comfortable with the idea of ending public funding for scientific research.

That’s the shit you’re enabling with your AI boosterism.

It’s not super-science. It’s anti-science.