universeodon.com is part of the decentralized social network powered by Mastodon.

LLMs are like slot machines, in that an incorrect answer (the slot machine eating your dollar) is unremarkable, while the LLM solving a problem (a jackpot) is amazing, and the latter stands out in your memory, causing you to overestimate the reliability of LLMs.

blog.glyph.im/2025/08/futzing-

@pluralistic Recently at work, a manager pushing AI was presenting the "great stats" of AI use, and presented a "24% prompt acceptance rate" as if that didn't mean "76% prompt failure rate", aka, it fucks up the vast majority of the time. Even in these cases, of the 24% of prompts that were "accepted", there were no stats for how much additional work was needed to get the "accepted" result into an actually acceptable state.

JPeck

@Azuaron @pluralistic 🤣
Mirrors what I've seen. Our team tried to implement LLMs into our workflow, then decided it was like working with a dense intern. If I have to walk it through each step and then double-check the work, I may as well do the work myself.

@boscoandpeck @Azuaron @pluralistic

Found the same thing. But working with that slow learning student is more rewarding. And the student will remember what you worked through with them (sometimes) and will be able to do it on their own at some point. Didn't manage to do that with the LLM so far.

@tschenkel And the student won’t gaslight you when you point out their mistakes. @boscoandpeck @Azuaron @pluralistic

@boscoandpeck @Azuaron @pluralistic but, but, 10000 lines of code per 12 hour workday 🙃

@loke @Azuaron @pluralistic yeah, i know it's amazing! and think of all the hidden vulnerabilities in those 10000 lines. we'll be cleaning that up for decades... so, job security?😂

@loke @boscoandpeck @Azuaron @pluralistic the person who does the job with fewer lines of code (after formatting for normalisation) usually wins long-term

(and the LoC of a program does include the total LoC of all frameworks and libraries it includes, independent of whether the functions are actually needed)

@boscoandpeck @Azuaron @pluralistic Isn't the plan for you to train the LLM and then fire you, then after a short interval hire you or someone like you to fix the work the LLM fucks up, for less money and no benefits?

@tinydoctor @boscoandpeck @pluralistic In my case, I think the plan is to just fire 3/4 the people like me, and tell the remaining 1/4 to be 5 times as productive.

@tinydoctor @Azuaron @pluralistic I like the part where you say plan. There's no plan. The suits at all the companies are talking to salesmen at AI companies. They are getting sold AI, then handing it to their IT managers and saying "okay, now make it work". Then when it doesn't work, the manager gets blamed. No plan or conspiracy needed, just good old-fashioned greed and incompetence.

@tinydoctor @boscoandpeck @Azuaron @pluralistic No, after an LLM thoroughly destroys the software architecture, skilled engineers will be paid even more to fix it.
