I understand that it has become a term of art, but I still dislike talking about generative AI “hallucinating.”
It’s not an apt metaphor; a hallucination is an error in perception, seeing something that isn’t there. That doesn’t apply to LLMs or diffusion models, though.
The process that produces erroneous output is exactly the same as the process that produces correct output; nothing has gone wrong, it's just that the probabilistically generated plausible answer was incorrect.
Relatedly, and more fundamentally: an actual hallucination is an error in PERCEPTION. With AI (including so-called multi-modal systems), there is no one in there to perceive.
Indeed, when we converse with an AI-based product, the only possible hallucination is happening on our side of the keyboard (and reading some people’s AI takes, there’s definitely some hallucinating going on).
@maxleibman Some folks have used “confabulation,” but it didn’t stick.
@maxleibman I’ve pretty much chalked it up to the same people who made devops a job title, or the various other product-person words that everyone starts repeating.
@elebertus Speaking as someone whose job title is “solutions architect,” I’m gonna keep quiet about that one!
@maxleibman oh don’t worry I’ve been “a devops” before
@maxleibman I have never been happy with the term, for many of the reasons you give. It is not a hallucination.
It is not that we see a tree and imagine it is blue. It is that someone has actually painted a tree blue.
@maxleibman Beautifully put
@maxleibman when an LLM is wrong, you can call it wrong
Today's immense training data sets include absolute boatloads of wrong
@maxleibman Another way someone put it was that LLMs hallucinate 100% of their output, and it coincides with reality and facts just enough that you can market it as not completely useless.
@maxleibman as @tante says, it’s *all* hallucinations.
@maxleibman I consider it more gaslighting than hallucination, albeit unintentional on the LLM's end
The use of that term applied to LLMs has bothered me as well. I would prefer they just call them what they are: incorrect answers.
It is a bit baffling that people so nonchalantly adopt these models as agents of fact when they often fail to produce correct answers.
@maxleibman The urge to anthropomorphize LLMs is very strong.
encouraging folks to act like they are not robots feels as hard as discussing the ways in which a robot is not alive.
both of these seem harder every day.
when we did a bit of star wars day media critique (theme snacks and arguing), we talked a lot about fantasy v sci-fi and the role of droids.
@maxleibman I used to say it’s all hallucination, but it may be more apt (and relatable) to say it’s all dreaming.
Just like in our own dreams, it starts with a premise or idea, then a believable story is built around it, with plausible details, progression, even plot twists.
These are all built on our own knowledge and experiences, just like with an AI.
But sometimes, the dream takes an odd turn. This can be crazy and unexpected, and those are the “hallucinations” that make the press.
They can also be believable, and even logically consistent. But they’re still wrong — the worst result for an AI you’re hoping for an answer from.
@darthnull Great reply! I don’t like the anthropomorphizing of “dreams,” but in terms of analogy you’re right, it is a much better fit.
@maxleibman We’re gonna anthropomorphise anyway, so we can at least use a more accurate metaphor. :)
@darthnull Well put!
@maxleibman I 100% agree. Here’s a blog post I wrote on the subject a while back: https://mattjhayes.com/2024/01/21/when-computers-hallucinate/
@maxleibman a better word is "confabulation," but it didn't catch on because it is less familiar and maybe less dramatic-sounding.