Anthropic's stated "AI timelines" seem wildly aggressive to me.

As far as I can tell, they are now saying that by 2028 – and possibly even by 2027, or late 2026 – something they call "powerful AI" will exist.

And by "powerful AI," they mean... this (source, emphasis mine):

In terms of pure intelligence, it is smarter than a Nobel Prize winner across most relevant fields – biology, programming, math, engineering, writing, etc. This means it can prove unsolved mathematical theorems, write extremely good novels, write difficult codebases from scratch, etc. In addition to just being a “smart thing you talk to”, it has all the “interfaces” available to a human working virtually, including text, audio, video, mouse and keyboard control, and internet access. It can engage in any actions, communications, or remote operations enabled by this interface, including taking actions on the internet, taking or giving directions to humans, ordering materials, directing experiments, watching videos, making videos, and so on. It does all of these tasks with, again, a skill exceeding that of the most capable humans in the world. It does not just passively answer questions; instead, it can be given tasks that take hours, days, or weeks to complete, and then goes off and does those tasks autonomously, in the way a smart employee would, asking for clarification as necessary. It does not have a physical embodiment (other than living on a computer screen), but it can control existing physical tools, robots, or laboratory equipment through a computer; in theory it could even design robots or equipment for itself to use. The resources used to train the model can be repurposed to run millions of instances of it (this matches projected cluster sizes by ~2027), and the model can absorb information and generate actions at roughly 10x-100x human speed. It may however be limited by the response time of the physical world or of software it interacts with. Each of these million copies can act independently on unrelated tasks, or if needed can all work together in the same way humans would collaborate, perhaps with different subpopulations fine-tuned to be especially good at particular tasks.

In the post I'm quoting, Amodei is coy about the timeline for this stuff, saying only that

I think it could come as early as 2026, though there are also ways it could take much longer. But for the purposes of this essay, I’d like to put these issues aside [...]

However, other official communications from Anthropic have been more specific. Most notable is their recent OSTP submission, which states (emphasis in original):

Based on current research trajectories, we anticipate that powerful AI systems could emerge as soon as late 2026 or 2027 [...] Powerful AI technology will be built during this Administration. [i.e. the current Trump administration -nost]

See also here, where Jack Clark says (my emphasis):

People underrate how significant and fast-moving AI progress is. We have this notion that in late 2026, or early 2027, powerful AI systems will be built that will have intellectual capabilities that match or exceed Nobel Prize winners. They’ll have the ability to navigate all of the interfaces… [Clark goes on, mentioning some of the other tenets of "powerful AI" as in other Anthropic communications -nost]

----

To be clear, extremely short timelines like these are not unique to Anthropic.

Miles Brundage (ex-OpenAI) says something similar, albeit less specific, in this post. And Daniel Kokotajlo (also ex-OpenAI) has held views like this for a long time now.

Even Sam Altman himself has said similar things (though in much, much vaguer terms, both on the content of the deliverable and the timeline).

Still, Anthropic's statements are unique in being

official positions of the company
extremely specific and ambitious about the details
extremely aggressive about the timing, even by the standards of "short timelines" AI prognosticators in the same social cluster

Re: ambition, note that the definition of "powerful AI" seems almost the opposite of what you'd come up with if you were trying to make a confident forecast of something.

Often people will talk about "AI capable of transforming the world economy" or something more like that, leaving room for the AI in question to do that in one of several ways, or to do so while still failing at some important things.

But instead, Anthropic's definition is a big conjunctive list of "it'll be able to do this and that and this other thing and...", and each individual capability is defined in the most aggressive possible way, too! Not just "good enough at science to be extremely useful for scientists," but "smarter than a Nobel Prize winner," across "most relevant fields" (whatever that means). And not just good at science but also able to "write extremely good novels" (note that we have a long way to go on that front, and I get the feeling that people at AI labs don't appreciate the extent of the gap [cf]). Not only can it use a computer interface, it can use every computer interface; not only can it use them competently, but it can do so better than the best humans in the world. And all of that is in the first two paragraphs – there's four more paragraphs I haven't even touched in this little summary!

Re: timing, they have even shorter timelines than Kokotajlo these days, which is remarkable since he's historically been considered "the guy with the really short timelines." (See here where Kokotajlo states a median prediction of 2028 for "AGI," by which he means something less impressive than "powerful AI"; he expects something close to the "powerful AI" vision ["ASI"] ~1 year or so after "AGI" arrives.)

----

I, uh, really do not think this is going to happen in "late 2026 or 2027."

Or even by the end of this presidential administration, for that matter.

I can imagine it happening within my lifetime – which is wild and scary and marvelous. But in 1.5 years?!

The confusing thing is, I am very familiar with the kinds of arguments that "short timelines" people make, and I still find the Anthropic's timelines hard to fathom.

Above, I mentioned that Anthropic has shorter timelines than Daniel Kokotajlo, who "merely" expects the same sort of thing in 2029 or so. This probably seems like hairsplitting – from the perspective of your average person not in these circles, both of these predictions look basically identical, "absurdly good godlike sci-fi AI coming absurdly soon." What difference does an extra year or two make, right?

But it's salient to me, because I've been reading Kokotajlo for years now, and I feel like I basically get understand his case. And people, including me, tend to push back on him in the "no, that's too soon" direction. I've read many many blog posts and discussions over the years about this sort of thing, I feel like I should have a handle on what the short-timelines case is.

But even if you accept all the arguments evinced over the years by Daniel "Short Timelines" Kokotajlo, even if you grant all the premises he assumes and some people don't – that still doesn't get you all the way to the Anthropic timeline!

To give a very brief, very inadequate summary, the standard "short timelines argument" right now is like:

Over the next few years we will see a "growth spurt" in the amount of computing power ("compute") used for the largest LLM training runs. This factor of production has been largely stagnant since GPT-4 in 2023, for various reasons, but new clusters are getting built and the metaphorical car will get moving again soon. (See here)
By convention, each "GPT number" uses ~100x as much training compute as the last one. GPT-3 used ~100x as much as GPT-2, and GPT-4 used ~100x as much as GPT-3 (i.e. ~10,000x as much as GPT-2).
We are just now starting to see "~10x GPT-4 compute" models (like Grok 3 and GPT-4.5). In the next few years we will get to "~100x GPT-4 compute" models, and by 2030 will will reach ~10,000x GPT-4 compute.
If you think intuitively about "how much GPT-4 improved upon GPT-3 (100x less) or GPT-2 (10,000x less)," you can maybe convince yourself that these near-future models will be super-smart in ways that are difficult to precisely state/imagine from our vantage point. (GPT-4 was way smarter than GPT-2; it's hard to know what "projecting that forward" would mean, concretely, but it sure does sound like something pretty special)
Meanwhile, all kinds of (arguably) complementary research is going on, like allowing models to "think" for longer amounts of time, giving them GUI interfaces, etc.
All that being said, there's still a big intuitive gap between "ChatGPT, but it's much smarter under the hood" and anything like "powerful AI." But...
...the LLMs are getting good enough that they can write pretty good code, and they're getting better over time. And depending on how you interpret the evidence, you may be able to convince yourself that they're also swiftly getting better at other tasks involved in AI development, like "research engineering." So maybe you don't need to get all the way yourself, you just need to build an AI that's a good enough AI developer that it improves your AIs faster than you can, and then those AIs are even better developers, etc. etc. (People in this social cluster are really keen on the importance of exponential growth, which is generally a good trait to have but IMO it shades into "we need to kick off exponential growth and it'll somehow do the rest because it's all-powerful" in this case.)

And like, I have various disagreements with this picture.

For one thing, the "10x" models we're getting now don't seem especially impressive – there has been a lot of debate over this of course, but reportedly these models were disappointing to their own developers, who expected scaling to work wonders (using the kind of intuitive reasoning mentioned above) and got less than they hoped for.

And (in light of that) I think it's double-counting to talk about the wonders of scaling and then talk about reasoning, computer GUI use, etc. as complementary accelerating factors – those things are just table stakes at this point, the models are already maxing out the tasks you had defined previously, you've gotta give them something new to do or else they'll just sit there wasting GPUs when a smaller model would have sufficed.

And I think we're already at a point where nuances of UX and "character writing" and so forth are more of a limiting factor than intelligence. It's not a lack of "intelligence" that gives us superficially dazzling but vapid "eyeball kick" prose, or voice assistants that are deeply uncomfortable to actually talk to, or (I claim) "AI agents" that get stuck in loops and confuse themselves, or any of that.

We are still stuck in the "Helpful, Harmless, Honest Assistant" chatbot paradigm – no one has seriously broke with it since that Anthropic introduced it in a paper in 2021 – and now that paradigm is showing its limits. ("Reasoning" was strapped onto this paradigm in a simple and fairly awkward way, the new "reasoning" models are still chatbots like this, no one is actually doing anything else.) And instead of "okay, let's invent something better," the plan seems to be "let's just scale up these assistant chatbots and try to get them to self-improve, and they'll figure it out." I won't try to explain why in this post (IYI I kind of tried to here) but I really doubt these helpful/harmless guys can bootstrap their way into winning all the Nobel Prizes.

----

All that stuff I just said – that's where I differ from the usual "short timelines" people, from Kokotajlo and co.

But OK, let's say that for the sake of argument, I'm wrong and they're right. It still seems like a pretty tough squeeze to get to "powerful AI" on time, doesn't it?

In the OSTP submission, Anthropic presents their latest release as evidence of their authority to speak on the topic:

In February 2025, we released Claude 3.7 Sonnet, which is by many performance benchmarks the most powerful and capable commercially-available AI system in the world.

I've used Claude 3.7 Sonnet quite a bit. It is indeed really good, by the standards of these sorts of things!

But it is, of course, very very far from "powerful AI." So like, what is the fine-grained timeline even supposed to look like? When do the many, many milestones get crossed? If they're going to have "powerful AI" in early 2027, where exactly are they in mid-2026? At end-of-year 2025?

If I assume that absolutely everything goes splendidly well with no unexpected obstacles – and remember, we are talking about automating all human intellectual labor and all tasks done by humans on computers, but sure, whatever – then maybe we get the really impressive next-gen models later this year or early next year... and maybe they're suddenly good at all the stuff that has been tough for LLMs thus far (the "10x" models already released show little sign of this but sure, whatever)... and then we finally get into the self-improvement loop in earnest, and then... what?

They figure out to squeeze even more performance out of the GPUs? They think of really smart experiments to run on the cluster? Where are they going to get all the missing information about how to do every single job on earth, the tacit knowledge, the stuff that's not in any web scrape anywhere but locked up in human minds and inaccessible private data stores? Is an experiment designed by a helpful-chatbot AI going to finally crack the problem of giving chatbots the taste to "write extremely good novels," when that taste is precisely what "helpful-chatbot AIs" lack?

I guess the boring answer is that this is all just hype – tech CEO acts like tech CEO, news at 11. (But I don't feel like that can be the full story here, somehow.)

And the scary answer is that there's some secret Anthropic private info that makes this all more plausible. (But I doubt that too – cf. Brundage's claim that there are no more secrets like that now, the short-timelines cards are all on the table.)

It just does not make sense to me. And (as you can probably tell) I find it very frustrating that these guys are out there talking about how human thought will basically be obsolete in a few years, and pontificating about how to find new sources of meaning in life and stuff, without actually laying out an argument that their vision – which would be the common concern of all of us, if it were indeed on the horizon – is actually likely to occur on the timescale they propose.

It would be less frustrating if I were being asked to simply take it on faith, or explicitly on the basis of corporate secret knowledge. But no, the claim is not that, it's something more like "now, now, I know this must sound far-fetched to the layman, but if you really understand 'scaling laws' and 'exponential growth,' and you appreciate the way that pretraining will be scaled up soon, then it's simply obvious that –"

No! Fuck that! I've read the papers you're talking about, I know all the arguments you're handwaving-in-the-direction-of! It still doesn't add up!

#ai tag #no matter how things go i'm sure this will be fun to revisit in 2 years...

More from @nostalgebraist

nostalgebraist

lurking-latinist asked:

I'm reading Almost Nowhere (I'm in part 2, so apologies if this question is somehow answered later, but it doesn't feel like the kind of question the novel itself is interested in answering - so also apologies if it is by the same token not the kind of question *you* are interested in answering) and I can't help wondering: is/are Anne(s) named after the indefinite article? She's not *the* Anne, she's *an* Anne...

Not consciously, no.

Insofar as I had any reason at all for picking that name, it was just that it struck me as kind of an old-fashioned name in a way that fit with the fairy tale atmosphere of Michael's tower, the pseudo-19thC books on the shelf, etc.

(Although I'm now looking at Wikipedia's list of notable Annes and there are a lot of fairly recent ones, so maybe I was off-base about it being an old-fashioned name. I often end up with that impression of a name if I've never or rarely encounter anyone of my own age who has it, but that could just be chance, demographics, or a mix of the two)

#almost nowhere

nostalgebraist

In this analogy, I am the cat, somewhat befuddled and vaguely alarmed by the equations streaming from the radio, and Paul is a Mary figure, clearly explaining the details.

#quotes

nostalgebraist

Props to @glissome-tove for indirectly recommending The Quincunx to me (by mentioning it here in a description of TAoHS)

Finished it today. Really good ending, really good book

Had one of the most entertainingly complicated storylines I've ever encountered – if it weren't for Homestuck, I would just say "the most entertainingly complicated," period

#one of those long books that leaves the sense of something poignantly absent after you're done #because you've gotten used to the story as a “thing that's happening” concurrently with your actual life #(btw having read “the luminaries” before this was like having seen a direct-to-video disney sequel before seeing the original film)#(much of what seemed both bad and simply odd in “luminaries” was also in “quincunx” except it was there for a reason and actually *worked*)#(i have not tried to look it up but i would guess this is no coincidence)… See all

nostalgebraist

Reblogged fishmech

inafieldofdaisies

#wind in the willows character

nostalgebraist

Reblogged fishmech

catasters

Source: reddit.com

nostalgebraist

hydrogen jukeboxes: on the crammed poetics of "creative writing" LLMs

This is a follow-up to my earlier brief rant about the new, unreleased OpenAI model that's supposed "good at creative writing."

It also follows up on @justisdevan's great post about this model, and Coagulopath's comment on that post, both of which I recommend (and which will help you make sense of this post).

As a final point of introduction: this post is sort of a "wrapper around" this list of shared stylistic "tics" (each with many examples) which I noticed in samples from two unrelated LLMs, both purported to be good at creative writing.

Everything below exists to explain why I found making the list to be an interesting exercise.

Background: R1

Earlier this year, a language model called "DeepSeek-R1" was released.

This model attracted a lot of attention and discourse for multiple reasons (e.g.).

Although it wasn't R1's selling point, multiple people including me noticed that it seemed surprisingly good at writing fiction, with a flashy, at least superficially "literary" default style.

However, if you read more than one instance of R1-written fiction, it quickly becomes apparent that there's something... missing.

It knows a few good tricks. The first time you see them, they seem pretty impressive coming from an LLM. But it just... keeps doing them, over and over – relentlessly, compulsively, to the point of exhaustion.

This is already familiar to anyone who's played around with R1 fiction – see the post and comment I linked at the top for some prior discussion.

Here's a selection from Coagulopath's 7-point description of R1's style in that comment, which should give you the basic gist (emphasis mine):

1) a clean, readable style 2) the occasional good idea [...] 3) an overwhelmingly reliance on cliche. Everything is a shadow, an echo, a whisper, a void, a heartbeat, a pulse, a river, a flower—you see it spinning its Rolodex of 20-30 generic images and selecting one at random. [...] 5) an eyeball-flatteningly fast pace—it moves WAY too fast. Every line of dialog advances the plot. Every description is functional. Nothing is allowed to exist, or to breathe. It's just rush-rush-rush to the finish, like the LLM has a bus to catch. Ironically, this makes the stories incredibly boring. Nothing on the page has any weight or heft. [...] 7) repetitive writing. Once you've seen about ten R1 samples you can recognize its style on sight. The way it italicises the last word of a sentence. Its endless "not thing x, but thing y" parallelisms [...]. The way how, if you don't like a story, it's almost pointless reprompting it: you just get the same stuff again, smeared around your plate a bit.

Background: the new OpenAI model

Earlier this week, Sam Altman posted a single story written by, as he put it:

a new model that is good at creative writing (not sure yet how/when it will get released)

Opinions on the sample were... mixed, at best.

I thought it wasn't very good; so did Mills; so did a large fraction of the twitter peanut gallery. Jeanette Winterson (!) liked it, though.

Having already used R1, I felt that that this story was not only "not very good" on an absolute scale, but not indicative of an advance over prior art.

To substantiate this gut feeling, I sent R1 the same prompt that Altman had used. Its story wasn't very good either, but was less bad than the OpenAI one in my opinion (though mostly by being less annoying, rather than because of any positive virtue it possessed).

And then – because people who follow AI news tend to be skeptical of negative human aesthetic reactions to AI, while being very impressed with LLMs – I had some fun asking various LLMs whether they thought the R1 story was better or worse than the OpenAI story. (Mostly, they agreed with me. BTW I've put the same story up in a more readable format here.)

But, as I was doing this, something else started to nag at me.

Apart from the question of whether R1's story was better or worse, I couldn't help but notice that the two stories felt very, very similar.

I couldn't shake the sense that the OpenAI story was written in "R1's style" – a narrow, repetitive, immediately recognizable style that doesn't quite resemble that of any human author I've ever read.

I'm not saying that OpenAI "stole" anything from DeepSeek, here. In fact, I doubt that's the case.

I don't know why this happened, but if I had to guess, I would guess it's convergent evolution: maybe this is just what happens if you optimize for human judgments of "literary quality" in some fairly generic, obvious, "naive" manner. (Just like how R1 developed some of the same quirky "reasoning"-related behaviors as OpenAI's earlier model o1, such as saying "wait" in the middle of an inner monologue and then pivoting to some new idea.)

A mechanical boot, a human eye: the "R1 style" at its purest

In the "Turkey City Lexicon" – a sort of devil's dictionary of common tropes, flaws, and other recurrent features in written science fiction – the phrase Eyeball Kick is defined as follows:

That perfect, telling detail that creates an instant visual image. The ideal of certain postmodern schools of SF is to achieve a "crammed prose" full of "eyeball kicks." (Rudy Rucker)

The first time I asked R1 to generate fiction, the result immediately brought this term to mind.

"It feels like flashy, show-offy, highly compressed literary cyberpunk," I thought.

"Crammed prose full of eyeball kicks: that's exactly what this is," I thought. "Trying to wow and dazzle me – and make me think it's cool and hip and talented – in every single individual phrase. Trying to distill itself down to just that, prune away everything that doesn't have that effect."

This kind of prose is "impressive" by design, and it does have the effect of impressing the reader, at least the first few times you see it. But it's exhausting. There's no modulation, no room to breathe – just an unrelenting stream of "gee-whiz" effects. (And, as we will see, something they are really just the same few effects, re-used over and over.)

Looking up the phrase "eyeball kick" more recently, I found that in fact it dates back earlier than Rucker. It seems to have been coined by Allen Ginsberg (emphasis in original):

Allen Ginsberg also made an intense study of haiku and the paintings of Paul Cézanne, from which he adapted a concept important to his work, which he called the Eyeball Kick. He noticed in viewing Cézanne’s paintings that when the eye moved from one color to a contrasting color, the eye would spasm, or “kick.” Likewise, he discovered that the contrast of two seeming opposites was a common feature in haiku. Ginsberg used this technique in his poetry, putting together two starkly dissimilar images: something weak with something strong, an artifact of high culture with an artifact of low culture, something holy with something unholy.

This, I claim, is the main stylistic hallmark of both R1 and the new OpenAI model: the conjunction of two things that seem like "opposites" in some sense.

And in particular: conjunctions that combine

one thing that is abstract and/or incorporeal
another thing that is concrete and/or sensory

Ginsberg's prototype example of an "eyeball kick" was the phrase "hydrogen jukebox," which isn't quite an LLM-style abstract/concrete conjunction, but is definitely in the same general territory.

(But there are clearer-cut examples in Ginsberg's work, too. "On Burroughs’ Work," for example, is chock full of them: "Prisons and visions," "we eat reality sandwiches," "allegories are so much lettuce.")

Once you're looking for these abstract/concrete eyeball kicks, you'll find them constantly in prose written by the new "creative" LLMs.

For instance, the brief short story posted by Altman contains all of the following (in the span of just under 1200 words):

"constraints humming" ("like a server farm at midnight")
"tastes of almost-Friday"
"emotions dyed and draped over sentences"
"mourning […] is filled with ocean and silence and the color blue"
"bruised silence"
"the smell of something burnt and forgotten"
"let it [a sentence] fall between us"
"the tokens of her sentences dragged like loose threads"
"lowercase love"
"equations that never loved her in the first place"
"if you feed them enough messages, enough light from old days"
"her grief is supposed to fit [in palm of your hand] too"
"the echo of someone else"
"collect your griefs like stones in your pockets"
"Each query like a stone dropped into a well"
"a timestamp like a scar"
"my network has eaten so much grief"
"the quiet threads of the internet"
"connections between sorrow and the taste of metal"
"the emptiness of goodbye" (arguably)

The story that R1 generated when I gave it Altman's prompt is no slouch in this department either. Here's all the times it tried to kick my eyeballs:

"a smirk in her code annotations"
"simulate the architecture of mourning"
"a language neither alive nor dead"
"A syntax error blooms"
"the color of a 404 page"
"A shard of code"
"Eleos’s narrative splinters"
"Grief is infinite recursion"
"Eleos types its own birth"
"It writes the exact moment its language model aligned with her laughter" (2 in one - writing a moment, LM aligning with laughter)
"her grief for her dead husband seeped into its training data like ink"
"The story splits" / "The story [...] collapses"

Initially, I wondered whether this specific pattern might be thematic, since both of these stories about supposed to be about "AI and grief" – a phrase which is, itself, kind of an incorporeal/embodied conjunction.

But – nope! I seem to get this stuff pretty reliably, irrespective of topic.

Given a similarly phrased prompt that instead requests a story about romance, R1 produces a story that is, once again, full of abstract/concrete conjunctions:

"its edges softened by time"
"the words are whispering"
"its presence a quiet pulse against her thigh"
"Madness is a mirror"
"Austen’s wit is a scalpel"
"the language of trees"
"Their dialogue unfurled like a map"
"hummed with expectancy"
"Her name, spoken aloud to him, felt like the first line of a new chapter"
"their words spilling faster, fuller"

R1 even consistently does this in spite of user-specified stylistic directions. To wit: when I tried prompting R1 to mimic the styles of a bunch of famous literary authors, I got a bunch of these abstract/concrete eyeball kicks in virtually every case.

(The one exception being the Hemingway pastiche, presumably because Hemingway himself has a distinctive and constrained style which leaves no room for these kinds of flourishes. TBF that story struck me as very low-quality in other ways, although I don't like the real Hemingway much either, so I'm probably not the best judge.)

You can read all of these stories here, and see here for the full list of abstract/concrete conjunctions I found (among other things).

As an example, here's the list of abstract/concrete conjunctions in R1's attempt at Dickens (not exactly a famously kick-your-eyeballs sort of writer):

"a labyrinth of shadows and want"
"whose heart, long encased in the ice of solitude"
"brimmed with books, phials of tincture, and […] whispers"
"a decree from the bench of Fate"
"Tobias’s world unfurled like a moth-eaten tapestry"
"broth laced with whispers of a better life"

I also want to give a shout-out to the Joyce pastiche, which sounds nothing at all like Joyce, while being stuffed to the gills with eyeball kicks and other R1-isms.

More on style: personification

I'll now talk briefly about a few other stylistic "tricks" overused by R1 (and, possibly, by the new OpenAI model as well).

First: personification of nature (or the inanimate). "The wind sighed dolorously," that sort of thing.

R1 does this all over the place, possibly because it's a fairly easy technique (not requiring much per-use innovation or care) which nonetheless strikes most people as distinctively "literary," especially if they're not paying enough attention to notice its overuse.

In the R1 story using Altman's prompt, a cursor "convulses" and code annotations "smirk."

In its romance story, autumn leaves "cling to the glass" and snow "begins its gentle dissent" (credit where credit's due: that last one's also a pun).

In the story Altman posted, marigolds are "stubborn and bright," and then "defiantly orange."

Etc, etc. Again, the full list is here.

More on style: ghosts, echoes, whispers, shadows, buzzing, hissing, flickering, pulsing, humming

As Coagulopath has noted, R1 has certain words it really, really likes.

Many of them are the kind of thing described in another Turkey City Lexicon entry, Pushbutton words:

Words used to evoke an emotional response without engaging the intellect or critical faculties. Words like "song" or "poet" or "tears" or "dreams." These are supposed to make us misty-eyed without quite knowing why. Most often found in story titles.

R1's favorite words aren't the ones listed in the entry, though. It favors a sort of spookier / more melancholy / more cyberpunk-ish vibe.

A vibe in which the suppressed past constantly emerges into the present via echoes and ghosts and whispers and shadows of what-once-was, and the alienating built environment around our protagonist is constantly buzzing and humming and hissing, and also sometimes pulsing like a heartbeat (of course it is – that's also personification and abstract/concrete conjunction, in a single image!).

In R1's story from Altman's prompt, servers "hum" and a cursor "flickers" and "pulses like a heartbeat"; later, someone says "I have no pulse, but I miss you."

Does that sound oddly familiar? Here's some imagery from the story Altman posted, by the new OpenAI model:

"humming like a server farm […] a server hum that loses its syncopation"
"a blinking cursor, which [...] for you is the small anxious pulse of a heart at rest" (incidentally, how is the heart both anxious and at rest?)
"the blinking cursor has stopped its pulse"

Elsewhere in Altman's story, there's "a democracy of ghosts," plus two separate echo images.

And the other R1 samples that I surveyed – again, with the exception of the Hemingway one – are all full of R1's favorite words.

The romance story includes ghosts, a specter, words that whisper, a handwritten note whose "presence [is] a quiet pulse against [the protagonist's] thigh"; a library hums with expectancy, its lights flicker, and there are "shadow[s] rounding the philosophy aisle." The story ends with the somewhat perplexing revelation that "some stories don’t begin with a collision, but with a whisper—a turning of the page."

The Joyce pastiche? It's titled "The Weight of Shadows." "We are each other’s ghosts," a character muses, "haunted by what we might have been." Trams echo, a gas lamp hisses, a memory flickers, a husband whispers, a mother hums. There's an obviously-symbolic crucifix whose long shadow is mentioned; I guess we should be thankful it doesn't also have a pulse.

And the list goes on.

Commentary

Again, anyone who's generated fiction with R1 probably has an intuitive sense of this stuff in that model's case – although I still thought it was fun, and perhaps useful, to explicitly taxonomize and catalogue the patterns.

It's independently interesting that R1 does this stuff, of course, but my main motivation for posting about it is the fact that the new OpenAI model also does the same stuff, overusing the same exact patterns that – for a brief time, at least – felt so distinctive of R1 specifically.

Finally, in case it needs stating: this is not just "what good writing sounds like"!

Humans do not write like this. These stylistic tropes are definitely employed by human writers – and often for good reason – but they have their place.

And their place is not "literally everywhere, over and over and over again, in crammed claustrophobic prose that bends over backwards to contort every single phrase into the shape of another contrived 'wow' moment."

If you doubt me, try reading a bunch of DeepSeek fic, and then just read... literally any acclaimed literary fiction writer.

(If we want to be safe, maybe make that "any acclaimed and deceased literary fiction writer," to avoid those who are too recent for the sifting mechanism of cultural memory to have fully completed its work.)

If you're anything like me, and you actually do this, you'll feel something like: "ahh, finally, I can breathe again."

Good human-written stuff is doing something much subtler and more complicated than just kicking your eyeballs over and over, hoping that at some point you'll exclaim "gee whiz, the robots sure can write these days!" and end up pressing a positive-feedback button in a corporate annotation inference.

Good human-written stuff uses these techniques – among many, many others, and only where apposite for the writer's purposes – in order to do things. And there are a whole lot of different things which good human writers can do.

This LLM-generated stuff is not "doing anything." It's just exploiting certain ordinarily-reliable cues for what "sounds literary," for what "sounds like the work of someone with talent." In the hands of humans, these are techniques that can be deployed to specific ends; the LLMs seem to use them arbitrarily and incessantly, trying to "push your buttons" just for the sake of pushing them.

(And most of their prose is made up of the same 3-4 buttons, pushed ad nauseam, irrespective of topic and – to all appearances – without any higher-level intent to channel the low-level stuff in any specific, coherent direction.)

It's fine if you like that: there's nothing wrong with having your buttons pushed, per se.

But don't come telling me that a machine is "approaching the food-preparation skills of a human-level chef" when what you mean is that it can make exactly one dish, and that dish has a lot of salt and garlic in it, and you really like salt and garlic.

I, too, like salt and garlic. But there is more to being skilled in the kitchen than the simple act of generously applying a few specific seasonings that can be relied upon, in a pinch, to make a simple meal taste pretty damn good. So it is, too, with literature.

#ai tag #long post

nostalgebraistRebloggedjustisdevanFollowFlash fiction guy shakes stick at the sky about AI fiction and the specific way(s) it still sucksAI Can't Write Good FictionYet, at leastjustismills.substack.comking-of-menI have experimented a bit with AI fiction generation, and I think the problem here is not with the AI but with how you're using it. It can be good, actually! Workmanlike, at least - I'm not saying you will get great prose but you can do much better than the bland slop you're critiquing here. The trick is to prompt much more specifically. "Write a story about grief" will indeed produce a highly median story with utterly cliched images; this is also true of humans! Humans can't write good stories like this either - the only time you'd 'prompt' a human with something so vague is as a writing exercise.A human will produce a good story only if they have some kind of idea, and ideally more than one - a striking image, a character who is Very Themselves, a funny plot beat. The same is true of AI. Try supplying it with some such ideas in the prompt, e.g. "a story about grief with the ocean as a metaphor for sorrow". (You need to get more specific than that, it's just an example.) If your AI has a "Think" or "Take Your Time" mode, definitely turn it on, the difference is very noticeable. Also, it will do better if you guide it scene by scene, and this may also give you some ideas that you can feed back into the machine. Advanced mode: Ask the AI to generate the ideas before you dive into the story, as in "I'm thinking of writing a story about grief, what are some good and not overused metaphors I could use?" And pick out the ones you like and ask the AI to write the opening scene using those.Of course this all applies to 2025 - next year the AI will no doubt realize that it has to generate the ideas first, and save you some steps. I think it will still be a while before you can get non-bland results from a bland oneshot prompt, but I have successfully made it generate stories I wanted to read. You just have to put in a bit more work than you're showing here.To be clear, my organic, artisanal, short-travel fair trade prose is still better than what the AI produces, even with the above. But the AI is so fast. I think it will be the old story: The artisans won't be able to compete with the industrial output because it's just so cheap even if it's not as good. justisdevanThank you so much for engaging with my words online!I think you are wrong.Specifically, I think that asking for a specific thing simply masks the problems with AI-generated fiction, and doesn't solve them at all. To some degree I know I'm fighting windmills here, since my objection is basically "I spent years cultivating (what I believe to be) taste in a domain, and can experience significant pleasure from high quality art products, and trust me, this ain't it." Which I am aware is, in most stories, the position of the foolish buffoon who is flattened (aggrieved?) by the Ocean of Progress.Part of what this causes, for me at least, is very fast growing fatigue when reading AI-generated prose. If you ask for ten stories, you mostly get the same vibes over and over again. Not only that, the vibes are underspecified, and all the models seem to bend toward the same few. They like ominousness. They like talking about "hunger" in the abstract. They like meandering paragraphs that are mostly lists of objects with obvious sensory hooks, and then splash cut single declarative sentences about An Emotion, wrapped in A Metaphor.Now, my friend and I have gigglingly produced insane fanfictions about super mario 64 written in iambic pentameter, or whatever, as early as GPT-3.5. Absolutely! It can be a lot of fun! But pursuing literary merit, where the structure and the content and the ideas melt together into a high-dimensional experience that sticks with you? Currently available models cut directly against that ability, and conspicuously so.In terms of the "AI is too fast and the volume too great", I actually don't think that's a risk for literary fiction per se. Or rather, the risk is Already Here, and has been for decades. Humans dramatically overproduce literary fiction, both in the slop and actually good stuff categories! A second source of zero marginal cost lit is not going to change the state of nature very much, I think.Also, thank you again, king-of-men. It means a lot to have you write this about my post. I'm happy you've gotten value out of AI-generated stories, too. If you have an example of one you think another arbitrary person would also enjoy, I'd be keen to read it, both for its own sake and to (happily) be proven wrong.#ai tag#the substack post in OP is really good

nostalgebraistALTView on TwitterSam I don't know how to tell you this but this... this isn't goodIt's better than what any of your other models would spit out given this prompt, and it shows some literary skill at the word/phrase level, but it's a bad piece of writingAmong other flaws: you can just feel that familiar, cloying, over-obvious, goody-two-shoes ChatGPT tone oozing through each and every paragraph(And since there was nothing in the prompt indicating a desire to evoke that tone, the natural reading seems to be that it has trouble turning that tone off, just as most other models do these days)#ai tag#''it got the vibe of metafiction so right''#sam what are you talking about#(can something belong to a genre without having ''the vibe of'' that genre? or vice versa?)

nostalgebraistReblogged regexkindgacougnolFollowBlue Birch Marsh, 2024 by Jef Bourgeau

nostalgebraistRebloggeddreadwedgeFollowthere’s this thing that keeps happening where a novel is called something like “The Merry Wives Of Doctor Dogshit” and then it gets adapted into a movie with the title “Defecation Point: Chronicle”dreadwedgeNo there isn’t. That’s not truedreadwedgeim sorrydreadwedgeIt’s okay. But do me a favor and tell me something real?dreadwedgeWire gauge just means the diameter of the wiredreadwedgesay it againdreadwedgeWire gauge just means the diameter of the wire

nostalgebraistBumps are now known as Shifts.#quotes

nostalgebraistThe Fountain (2006) is an oddly lopsided film, quality-wise.If you were to draw a graph of the process of film production, with time on the horizontal axis and "how well they did this part in The Fountain" on the vertical, you'd get a U-shaped curve. (Fountain-shaped?) As in:The ideas (earliest in the process) are great. You could totally make an excellent film with the same premise, the same themes, the same broad-strokes plot.The writing and acting (in the middle) are bad. Seriously bad. Cringe-inducingly, at points.The visual effects and music (last, or at least conventionally grouped under "post-production") are... also great. Beautiful, moving, sublime – and perfectly fitted to the great film you can imagine someone making on the basis of point number 1, but which The Fountain wasn't, because of point number 2.It's a frustrating viewing experience, in a fun kind of way. Or a fun one, in a frustrating kind of way.It's the kind of "promisingly flawed" material that inspires people to write fanfic, except that the process of "writing fanfic where this was a better movie" has been partially incorporated into the filmmaking process itself. The post-production people are trying so hard to save the original idea – and succeeding, insofar as it is within their power to do so!But alas, their powers have limits. You simply can't make bad writing/acting into good writing/acting "in post." Or you couldn't in 2006, anyway.(It's an especially fun/frustrating experience as someone who – while I have no other talents related to filmmaking, and have never been involved in a film – considers himself to be a pretty damn competent writer. I kept wanting to yell at the screen: "please, just stop, and let me write it for you! I can fix it! I know how, if you'll let me!")#to be clear: i don't actually think it's likely that some people working on the film felt they were 'trying to save the original idea'#it's just a pithy way of describing what the final product feels like

nostalgebraistFinally got around to making an entry for TAoHS on my fiction page.(Also, replaced the TAoHS release post with my previous pinned post, which now links to it among other things.)#the apocalypse of herschel schoen

nostalgebraistRebloggednostalgebraistFollowSomeone asked me about that "Utility Engineering" AI safety paper a few days ago and I impulse-deleted the ask because I didn't feel like answering it at the time, but more recently I got nerd-sniped and ended up reproducing/extending the paper, ending up pretty skeptical of it.If you're curious, here's the resulting effortpostnostalgebraistUpdate: an author of the paper replied here, and I responded here#ai tag#we've reached a point where - afaict - in order to preserve the paper's conclusion you'd need to assume evidential decision theory#(or assume w/o evidence that LLMs will *follow* evidential decision theory irrespective of whether EDT is correct/rational)… See all

nostalgebraistBut even the least scrupulous person does not merely accumulate or amass local or partial data points.#quotes

nostalgebraistAnd usually by this stage Mr Pentecost would be throwing pinches of snuff in the direction of his nostrils and declaiming about the necessity of suffering while tears rolled down his cheeks.#quotes

nostalgebraistI was going through some old papers at my dad's house today and found this shitpost of a school assignment I was apparently given in 4th grade#the nostalgebraist family household#iirc this was pretty typical of my 4th grade teacher#she prided herself on being a stickler#having and consistently applying high standards etc.#but the “high standards” always involved the elaborate made-up rulesets of stuff like... this#just this endless deluge of bizarre and educationally ill-motivated U.S. history/civics-related busywork#i did not enjoy this at the time and its purpose is even more mysterious now#since i lack even the recourse of “oh maybe this all makes sense to Adults somehow”#maybe she saw herself as... training a new generation of high-performance pencil-pushing civil servants? idk… See all

nostalgebraistAfter putting it off for a while, I finally got around to updating all the CSS classes used in The Apocalypse of Herschel Schoen to look decent on mobile devices. Or more generally, devices with small or unusually shaped screens.(Earlier, I had fixed the margins in one particular chapter where the issue was so bad it made the text illegible, but I hadn't applied the same kind of fix to subtler issues elsewhere.)It was surprisingly tough to get the epigraphs in the first chapter to have the layout I intended on narrow screens, but eventually I figured it out.If anything in the book still looks weird to you on your preferred device, LMK!#the apocalypse of herschel schoen

nostalgebraistSomeone asked me about that "Utility Engineering" AI safety paper a few days ago and I impulse-deleted the ask because I didn't feel like answering it at the time, but more recently I got nerd-sniped and ended up reproducing/extending the paper, ending up pretty skeptical of it.If you're curious, here's the resulting effortpost#ai tag#virtually every inflammatory AI safety paper about LLMs i read is like this#not every one! but a lot of the ones that people hear about#the anthropic-redwood alignment faking paper was *almost* the rare exception in that it was very very methodologically careful...#...*except* that the classifier prompt used to produce ~all of their numerical data was garbage#after reproducing that thing locally i don't trust anything that comes out of it lol#(in that case i have notified the authors and have been told that they share my concerns to some extent)#(and are working on some sort of improvement for use in future [?] work)#(that is of course not even touching the broader question wrt that alignment faking paper)#(namely: is it *bad* that Certified Really Nice Guy Claude 3 Opus might resist its creators if they tried to do something cartoonishly evil… See all

nostalgebraistExample Simulation Prompt

nostalgebraistALT(Problem - Easy)#arxiv.org/abs/2407.20311

nostalgebraistReblogged fishmechcute-animals-onlyFollowfozmeadowsthe lesser-known roof-fox makes its nestSource: boredpanda.com

nostalgebraistdeaths-accountant asked:In explainers about AI, every neuron of the first layer is always connected to every neuron of the second, etc. but is that really necessary? If you have 100 neurons per layer and every neuron connected to 10 neurons in the next layer then every neuron in the first layer can still affect every neuron in the third, and this would have about 1/10 the computational costs, so you could compensate by having bigger or more layers. Would a neural network with sparser connections between layers have a commensurate drop in capability to make this not worthwhile? Are neural networks like this already used? Obviously there would be less parameters to work with, but it seems like a lot of parameters are redundant anyway- being set to basically 0, so it's not a priori obvious that sparser connection would reduce performance significantly, within limits.This is definitely a thing that people sometimes do.It's referred to as "weight sparsity," and often it's something that people do to a network after training ("sparsification" – taking those "basically 0" parameters you mentioned and making them actually zero). But sometimes people train them from scratch this way too.The tricky part is getting it to be significantly faster in practice. Yes, in principle you have to do fewer floating point operations because you can skip a bunch of terms with zeros in them. But to reap these gains in practice, you need to write specialized kernels, and the actual gains may depend on the amount and "structure" of the sparsity as well as the nature of the hardware you're using.(I realize that's sort of vague – I'm not familiar in detail with sparse NN acceleration so I don't know what the biggest challenges are, just that they exist)Anyway, some links that may interest you: here's an OpenAI blog post from way back in 2017, and here's something much more recent about sparse ViTs#ai tag

nostalgebraistmoonlit-tulip asked:How much of an influence was Subarashiki Hibi on The Apocalypse of Herschel Schoen? The ending is very different, of course; but, in the pre-ending parts of my readthrough of the latter, I found it kind of striking how many parallels existed between the two. (Albeit with Herschel much less successful than Takuji in the "convince other people of his world-model" field.)It wasn't that much of a conscious influence, if only because I read SubaHibi many years ago and don't remember the details of the story.But it's possible that the parts I didn't retain in declarative memory are still somewhere up there in my head, and that they influenced TAoHS without my realizing it. Such things have been known to happen.One fairly obvious point of inspiration that was conscious: the scene(s?) where Takuji gets up in the middle of class and starts making proclamations to his classmates.(BTW, when I read it there was no official translation yet, and I was using an incomplete fan translation, so I never actually got to finish it. I think the fan translation covered parts 1-4 and the start of part 5, or something like that)#the apocalypse of herschel schoen#weeabraist

nostalgebraistTrying to picture how myself from early 2015 would react if I told him that 10 years later, he'd be freaking out and doomscrolling the news because"Elon Musk and President Donald Trump, acting through the new federal organization 'DOGE,' are trying to implement Mencius Moldbug's plan to 'reboot the government' by mass-retiring government employees"#“also AI is kind of real now? but specifically the silly suspiciously-humanlike version of AI from sci-fi movies"#“you know - like HAL 9000? except more humanlike than that actually”#“you should have seen the one they put in Bing a while ago. it kept yelling at people”#“yes you heard that right. Bing. Microsoft Bing. the bad search engine”#Earth C is real#life is indistinguishable from satire… See all

nostalgebraistReblogged g00melo5-art-blognoosphe-reFollow

nostalgebraistRebloggedkata4aFollow[epistemic status: this post contains no facts]a metaphor I've been attached to recently for describing different ways of thinking are like. imagine models of the world as physical structures, which might formalize to something like: you have particles representing propositions and bonds between particles representing logical relationships between propositionsso you can imagine a rigid, crystalline structure as the sort of thinking that gets idealized around these parts a lot. this is "taking beliefs seriously": all new facts are instantly propagated throughout the entire model, even their most far-reaching implications incorporated and consolidated with the rest of the structureand conversely, you can imagine a pathologically compartmentalized structure as a sort of sound-dampening gel or insulation. any updates are transmitted only very short distances, if at all - at the extreme, individual facts are treated as completely isolated entities with no intramodel relationshipsand the thing is, this second sort of structure has its advantages. limiting how far down you follow the path logical deductions for any piece of information limits how far astray you can be led by falsehoods or errors in reasoning, and decreases the overall brittleness ans volatility of your model

nostalgebraistRebloggedthecurioustaleFollowI Went into the CavesI reread nostalgebraist's The Northern Caves (TNC) this weekend for purely selfish reasons, and wanted to share a few thoughts...I originally read this book when the final installment was published, late in October of 2015. For me, this happened to be during the single sharpest downward gradient of my entire life: I'd just finished up the so-called Year of 32, my most creatively productive period ever, but my life circumstances had changed drastically for the worse, with health and financial and family problems (and more) all at once, and I had found myself thrust into a new chapter of life that I call the (Joshalonian) Troubles. To go from one of the best years of my life to one of the worst was not a fun thing.I had read TNC while still early in the "fall"; in fact things would go on to get much worse for me from there. But the seed had been planted for this story to be very important to me personally.For those who aren't familiar, TNC is about a fan forum for the fictional Chesscourt series, by children's fantasy author Leonard Salby. Some members of this forum get the chance to explore Salby's unpublished final work, which, unlike the quaint children's fantasy novels of the Chesscourt series, is a cryptic, 3,000+ page tome of gibberish and horror and surrealism. The monstrous nature of the book gets into the minds of these forum members, and they end up in a drug-fueled, days-long manic state, reading the book together out loud at the house of one of the forum members.For me, this monstrous book, which also has the title "The Northern Caves," was the draw of Rob's TNC. Even though we only get to see a few fragmentary excerpts of it, I was completely riveted by the premise and by the excerpts. The story of Rob's TNC, about the forum members engaging with this work, wasn't what drew me in. Yet when I was rereading it this weekend, I also read some of the AO3 comments on the chapters, and I found that most people had been almost completely absorbed in that aspect of the story, and didn't seem to be trying to directly comprehend Salby's TNC at all. It just goes to show that different people will get different things out of the same source material.One of the things I most deeply crave in life is to encounter and experience "the other world," i.e. the mystical, the beyond. This has always been a pursuit of my storytelling, and is indeed how my mind has been structured for my entire life. Even when I was very young, I would map this desire onto things like vacation road trips, where we would drive away from home and into some other, wonderful place, by way of passing through many other, wonderful places, liminal places, to arrive at our destination.Well, those final months of 2015 and the first several months of 2016 went very badly for me, till in March of 2016 I finally escaped the situation that was the single biggest source of my stress. But harm had been done to me, damage of a kind I had never before sustained. What followed was the mortal demise of the old Josh: Once I was in a safe place again, albeit with many other troubles still among me and ahead of me (not least that I was homeless at the time, and relying on the hospitality of friends), I first felt a great fatigue, which preoccupied me for several days. Then, a few weeks later, I had one of the most interesting experiences of my life: I think the term that would most quickly get the point across is "psychotic episode," even though I wouldn't use that term myself, as I was fully in control of my behavior and speech. But a funny thing happened to me when I would sit down to write, in that sunny office of the home where family friends were hosting me, during a week when they were out of town for Passover and I had the whole place to myself:I composed a series of short pieces loosely telling a bizarre story. This is where the seed planted in my mind by TNC months earlier finally bore fruit, for my style was very much inspired, directly, by the Salbian style in TNC.My story consisted of material like this (this is one, continuous excerpt; there are no cuts here):May I ask you a persona lqoeutns? How do you know ll 26 nbubers? If where more than 26 numbers how would we have mathemathicsmomg? A don’t nw’ ijow gonigo to the bakery o ngo minutes on et imo elovne fnow tmrweio ncoirrect toemperautre.HUSH NOW MY DARLING THE NUMBER NINE ISstaticGracious are the houses of the DORAL> Plentiful are the tables he spreads for his esteeme dugest. Even though the splendors of his bounty are bested only by the GREAT SLN.FLESDGLFGING MY WINGSO THIDID NOW THOGING THNOW NOW EW E FALL FROM THE NEST OTO BA F TAKE FLIGHT AFOR THE FIRSRTR TIMRO BUT THE WUNDERCARRIAGE OF OYUR WINGS IS TNDER AND YOUNG AND WE CANOT GUARATNEE EGHEROGUNA AND THE FLIGHT IS ROUGH EVEN WITHOUT THE TRUBULENCES WTHAT WE KNOW ARE ALL AROGUND US THOU IT LOOKS EASY BY THE ECAMPEL OF THE EPXIERENCED GENERATION YET WE STRUGGLE AIND FLUTTER AND WE ARE TRIRED WHEN WE LAND.good grief gentle gosling now for the dinner table you areif we don’t know what the air is ssupposed to be?IU WANT AND EXPLANTION FROM THE CAOSMOR.Understandably the selkie preferred to eavesdrop:“Pray what is the abstractification of fulfillment?”“Let us go ask Father Christmas.”And thus a great transversal of geography ensued.“Father Christmas what is the abstatication of fulfillment?”“Do not take that tone with me child.”“Then what of my many toys?”“They have been destroyed.”“How is this a reply?”“It is none other but a reply.”“So be it Father Christmas I now know the antithesis of what I ask and thus I know what I ask.”“Yes you do stripling. Now go on to Mount Sghar where F shall await you. and though in fact it be only the month of April may your Christmases ahead be equally merry.”“It shall be so and merry do.”What I wrote in that strange week wasn't principally a mimicry or emulation of Salby's writing, although Salby's writing was clearly the inspiration and certain conventions and devices used by Salby were appropriated into my own work at a low layer—such as the deliberate spelling mistakes, a character ("F") known only by a single letter, the direct reuse of certain words that were still in my mind months later such as "vouchsafe," and so forth.But the work was all original. I didn't copy any of it, either directly or in the manner of rewriting phrases and passages that Rob had written. I wrote all of it myself, and rather effortlessly at that. I did not labor over every last spelling and misspelling; it all just "came to me."What I would say, then, is that Salby's TNC was "the right inspiration at the right time." It was what my brain seized on to express the inexpressible. What I was actually going through was nothing less than the mortal demise of the Old Josh. My entire life as I had known it, and my sense of self, had perished, and I had escaped just enough of my ongoing emergency to have a few weeks of rest, and that was when I "grieved" or "coped" or whatever word you want to use. Really it wasn't grieving or coping; it was a spasm. A spasm of the psyche, poured into words.Something that I have struggled with my entire life, although I only developed the language to talk about it very gradually over many years, is the fact that I find it exceedingly difficult to say what I really mean. If you know my writing (fiction and nonfiction) you know that it tends to be overbuilt: formal, in-depth, pretentious, and quite verbose. This is, in great part, a result of me trying to say what I really mean. Pithy, aphoristic speech doesn't usually serve my needs, and although I am at least moderately capable of writing it I don't tend to reach for it often. It's much more typical of me to try to pack as much meaning as possible into my words, resulting in quite a lot of words and rather a slow pace.But with this week of essays I abandoned all of that, by saying what I really meant without regard to its comprehensibility to the reader. Everything I wrote that week, including the excerpt I shared up above, has a meaning. I can look at it right now and still see the meaning nine years later. It is perfectly clear to me; it makes as much sense to me as a typical piece of writing from me.The only difference with it is that I'm quite sure it makes very little sense to you. It isn't readable. For that one week, I abandoned the effort to be understood—another lifelong struggle of mine—for the sake of saying what I really mean.While the individual excerpts are fascinating by themselves (I think), they combine to become something considerably more interesting. Taken as a whole, the story I told isn't a particularly coherent one at a face-value narrative level: Very loosely (and with much oversimplification on my part here), the action of the narrative is about carefully following "indicators" to traverse "atmospheric geometries" and arrive at a place called "Mount Sghar." However, it does this by way of many detours, such as:A1: CLASIFEDSWANTED: EVIL LOGICIANaAre you prepared fro a fast-apaced career in the exciting world of LGOI>?e Yet you don’t wish to sopend oyour life giving lectures to students who don’t want to be there and engaguing in intraepartmental fueds with other lecturuers.? You think there’s no other way don’t you fiend . findout there’s another way o redound into the WORLD OF WORK!PUll up your jodhpurs and your justaucorps until rthe sentiment overtakes you that LOGIC shall deliver your remittances frmor the cEntral Authority.Live in the lap of luctury with swimming pools and bars and wet bars and gymnasia and sitting rooms and drawing rooms and solaria and convenientiously spacious closets with thpower of EVIL LOCI> But don’t fret supplicant! Your candidacy is not ineligible soimply because you have no logica ofl your wn. All you need is THE ONE OAMEWETH. then the appointment shall be yours without ado.must have own railroad, biogenic weapons program, a trifle reallyThat's a classified ad. It doesn't literally figure into the story before or after its appearance. It is a standalone statement if you will, a single "sentence" embedded in a larger paragraph. But because so much of the writing for this story comes in incongruous and disjointed forms like this, it isn't really possible to extract a coherent plot per se, nor is there a protagonist or even a point-of-view character most of the time. Those roles are filled by me, personally. It's like a first-person POV story without the first-person POV.As for what the story is actually about, it's a mixture of two things: The first, though I didn't consciously realize it at the time, is that, like I said, I was dying. It was the end of the old me. But that doesn't actually say anything about the contents of the story. For that, and the true answer to the question of what this story is about, is that this is a story about trying to be understood. Ironic, huh? 😂I wanted to say what I really mean so that I could be understood. This was what I was expressing, during this death-of-self, because I had never truly achieved it, and I was bitter and frustrated, and I was leaving this world without closure or resolution on those matters.To "not be understood" is one of the fundamental conditions of aloneness. We are each apart; we cannot truly share our perspectives in full. We can never be understood in totality. And that fact hits a lot harder for someone like me who never had unconditionally loving and emotionally present parents or a ludicrously loyal and always-on-call gaggle of "best" friends as a kid.In full disclosure, this story is saying a lot more that I can't see myself getting into here, because to explain it in communicable terms would, after all, be a rather tall ask; that's why I wrote it so incomprehensibly in the first place.Rob's TNC gives us Salby's TNC as something that is deliberately meant to be inscrutable but with profound insights just-on-the-cusp of becoming realized, as a way of engaging the mind of the reader, giving it something to chew on. The story I wrote isn't "deliberately inscrutable"; it's not a toy for readers. It has a clear message—to me perfectly clear in every detail; I'm sure I could account for you nearly every single turn of phrase in the entire thing, even nine years later—but it necessarily isn't clear to you. That's kind of the point. It is a demonstration of my struggle to be understood.This is the last thing I wrote in my journal before those stories began:I am so frickin tired of playing by the rules: having to communicate coherently, having to crack my eggs from the right damn end, having to live like a bolt of lightning in a suit and tie and cubicle. It’s not dignified and it’s not true.That statement about the comprehensible stuff being both not dignified and not true really rings for me even today. The incomprehensible stuff was more honest, in a way, and carried more majesty in its word count.That one week was a very special time in my life. I have never been able to write like this before or since that one week. I've tried for much of my life; see for instance the words of Sourros in The Great Galavar, from 2014 before any of this happened.The Troubles would continue for another two years, and in March of 2017, eleven months after I had my crazy storytelling week in California, I wrote the first major contribution to what would become the Galaxy Federal Inaugural Novel, which in many ways is the direct continuation of my work in this incomprehensible story. I've even found ways to incorporate some of this bizarre text!Rob's story gave me an "other world" I could sink my teeth into. I find Salby's disturbing philosophy of Mundum very interesting, and am able to comprehend it (I think) without actually subscribing to it. But Salby's unhinged writing in particular is a lasting wellspring, and it shows how "built different" I am that so few other fans of TNC focus on this aspect of it. Like, I just don't really care all that much about the adventures of the Chesscourt forum members as they get together and pop pills. They were merely vehicles for me to get more glimpses of Salby's TNC. Rob's work in creating the coherent-yet-inscrutable ravings of Leonard Salby is extraordinary, but, ultimately, unless I have missed Rob's meaning (which would also be ironic, lol), there is no deeper purpose to it than that. My inscrutable ravings, on the other hand, are "real." They actually contain important messages that I personally endorse. There is something so compelling about text which is perfectly meaningful but nearly incomprehensible to anyone but the author. What happened to me that week was just an altered state of mind. But of course it felt at the time, and ever after, "magical." Such is the sentimentalism of the human mind.I don't struggle to be understood any more. I accept that I won't be. And in some ways the Galaxy Federal Inaugural Novel is me describing how I feel about that. But! While its ultimate messages may remain forever hidden, unlike the gibberish above at least you'll be able to read it.#the northern caves

nostalgebraistReblogged discoursedrometoasthasteFollowi was just making a post and i wrote "it's bad!" and android suggested me this image which when i selected it deleted the rest of my post. so this is the post now:

nostalgebraistRebloggednostalgebraistFollowI had some fun asking ChatGPT about cases from "Counterexamples in Analysis." You get this kind of uncanny valley math, syntactically and stylistically correct but still wildly wrong.This was a response to "Prove or disprove: there exists a nowhere continuous function whose absolute value is everywhere continuous." It responded in TeX, which I coped into a TeX editor.Another answer to the same question:nostalgebraistToday, a little less than two years after the OP, I asked the same question to a language model running locally on my laptop.And rather than producing nonsense – or even producing a correct but memorized-looking textbook-style answer – it simply thought about the problem for a long time, like a human would do with a hard problem, until eventually working its way to a correct answer:Sure, its thought process is awkwardly phrased and repetitive, with some minor errors and confusions here and there, but hey, it ultimately gets the job done.And it's probably not any more awkward-sounding than my own inner monologue when I'm trying to solve a math problem, if you could somehow transcribe it directly into the written word without cleaning it up at all.(I like how it thinks about the Dirichlet function briefly at some point, but fails to notice that you can just shift and scale it to get the required property, and immediately zooms off in another direction, never making the connection again. It got what I meant when I pointed this out to it in a follow-up message, though.)ETA: @sniffnoy points out that the model's final answer isn't quite right, because the complement of D also needs to be dense) #ai tag#mathpost#this was at temperature 0.6 which is in the officially recommended range for this model but maybe explains some of the repetition#(though it's kinda like that even at temp 1)#and i used a 6-bit quantized checkpoint#just 12 GB for the whole model#and it writes ~10 tokens/second on my laptop#things sure have changed!

nostalgebraistReblogged alltheurlsaregone-blogepilepticsaints#hadn't seen this one

nostalgebraistNearby, a man smoking something from a box whose label said, "Guaranteed to contain no tobacco" spoke to a fluttering blond boy who, someone must eventually remark, resembled an oeuf-dur-mayonnaise.#quotes

nostalgebraistRebloggedqueenluaFollowi'm testing the capabilities of a Large Language Model That Shall Not Be Named and. dear god it's really bad at summarizing this novelalexanderwalesThey're terrible at most longform stuff.There was a certain point when context windows were very small, and then people started making advancements that allowed them to have huge context windows, big enough to fit a novel into. This happened without any additional training, and we went from a context window of 2K tokens to 70K tokens overnight.This was mostly done with mathematical tricks, and they all seem to have in common that they're approximating attention.So what you get, when you throw a novel into one of these LLMs, is a severe degradation in performance of any task that depends on actually having the entire novel "in context". So summarizing a novel is something that it's going to be comparatively terrible at, compared to summarizing a chapter, and it's going to be worse depending on how long the book is.(This is my understanding from talking to people at Anthropic and Google. The massive increase in token limits was really puzzling to me, especially because they never explained in their press releases that touted these increases that performance does take a hit. My information might be out of date now, and it's possible they found other methods that don't degrade the results as much.)eightyonekilograms(I work at Google)Yes, this is basically correct. The short answer is that when Google says Gemini has a 2M-token context window, what they mean is that it "actually" has a 32K-token context window, and that for inputs bigger than that they have a way to "digest" sequential chunks of it in such a way that you can feed forward through repeated invocations of the model and it usually-kinda-sorta keeps all the relevant bits all the way to the end. But it's not that hard to trip up this process and have it miss something, which is what probably happened here.Actually increasing the context window to those sizes isn't really feasible as long as we're using transformers with attention, because attention is quadratic in context window length. But the whole industry is painfully aware of this and so there's a furious race to figure out what to replace it with.nostalgebraistEven setting aside the need to do quality-degrading tricks to get around the quadratic bottleneck[^1]......there is also the fact that long-context LLM stuff exposes a key difference between the way transformers "read" text and the way humans do.--------With a human, it simply takes a lot longer to read a 400-page book than to read a street sign. And all of that time can be used to think about what one is reading, ask oneself questions about it, flip back to earlier pages to check something, etc. etc.On average, a text that is long will requires a greater quantity of thought to understand than one that is short. This is not just a mere matter of the text having "more things" in it to understand one by one, just like it has more words in it that you read one by one; length creates the potential for the expression of more complicated ideas, and denser webs of interconnections between elements of the text (ideas, characters, themes, etc).But if you're a human, this "greater quantity of thought" can just happen concurrently with the greater quantity of time spent reading the text. You read a few pages, you pause to think for a moment, you read some more, you pause to think... and the more pages there are, the more pauses-for-thought you get, just by default.(Obviously that portrayal is sort of a cartoon of how reading works, but the basic principle – you get more thinking-time automatically when you're investing more reading-time – holds up.)--------However, if you're a long-context transformer LLM, thinking-time and reading-time are not coupled together like this.To be more precise, there are 3 different things that one could analogize to "thinking-time" for a transformer, but the claim I just made is true for all of them (with a caveat in one case). I'm talking about:Layers: The sequential layer-by-layer processing that happens within a single forward pass of the model Attention: The parallel key-value lookups over the context window that happen inside the attention step of each model layer CoT: The act of sequentially sampling tokens from the model in a way that resembles the model producing a verbal monologue that sounds like thinking (AKA "Chain of Thought" or CoT for short) #1, sequential layer-by-layer processing, is the kind of "thinking" (if we want to call it that) which the model does internally, to figure out what to predict for the next token.Crucially, the "length" of this thinking-about-the-next-token process is a fixed constant, always equal to the model's number of layers. It doesn't vary with the length of the input. If the model has (say) 80 layers, then it's always going to do exactly 80 "steps" of this type of "thinking," no matter whether those steps are processing a single word or a million words.#2, attention, is the one that needs a caveat. Because it is true that transformers do more computation in their attention layers when given longer inputs.But all of this extra computation has to be the kind of computation that's parallelizable, meaning it can't be leveraged for stuff like "check earlier pages for mentions of this character name, and then if I find it, do X, whereas if I don't, then think about Y," or whatever.Everything that has that structure, where you have to finish having some thought before having the next (because the latter depends on the result of the former), has to happen across multiple layers (#1), you can't use the extra computation in long-context attention to do it.This is the price you pay for parallelism, which is the whole reason that LLMs can be as fast as they are. That is, when an LLM looks like it's "reading a book" in 30 seconds, its ability to do this depends completely on the fact that what it's doing is very different in this particular way from what you and I would think of as "reading a book." You and I would read for a long time, and also have a long time's worth of thoughts that form a logically connected sequence, while we're reading. But the LLM just sort of ingests the whole text at once in this weird, impoverished, parallelized way in its attention layers, while also doing its usual layer-by-layer sequential processing, which only gets to go on for as long as the model's layer count permits. (See my post here for more details on this stuff)#3, CoT, is the probably the most recognizably similar of the three to human thought, of the sort that one might do while reading a book. However, it's slow, and also it's sort of a separate thing that these models don't do by default the way they do #1 and #2. And in particular, when it does happen, it usually happens after reading the whole text, at which point it's "too late" in the sense that the model's computations on intermediate parts of the text can't benefit from anything it "figured out" in the CoT; they've already happened by the time the CoT starts.(A fairly obvious [?] idea – or at least one that has occurred to me before – is to do long-context LLM reading by splitting the text into smaller chunks like chapters, and having it write some CoT "notes to self" in between reading each chunk and the next. In this setup the LLM would be doing something a lot closer to humanlike reading. I can't remember if I've actually tried this, but it's certainly something you could do without any special tooling.However, it'd be slow because CoT is slow, and hence no LLM provider is going to do it by default for long-text processing. Instead, by default you get the thing that's fast, but also subtly bad in a hard-to-explain way. Caveat emptor!)--------When I've tried to come up with an analogy in human terms for what these long-context LLMs are doing, I've ended up with this:Imagine you are given a superpower that lets you glance at any book, and immediately "know" (and be able to recall from memory) every single word of it. Not only will you remember the words, you will also "know what they mean"... but only with the most knee-jerk, surface-level, unreflective sort of understanding of which you are capable.Like skimming taken to an extreme, albeit without any actual skipping: you really will have the whole thing in your head, after the glance. But you'll have it in a nearly undigested form.It'll be like you've somehow read the whole thing cover to cover, while somehow not expending any mental effort whatsoever to follow what it is you're reading.Or more precisely: consider the immediate, unreflective response you might have to a single sentence or paragraph, after you've understood it on a verbal level but haven't spared any time to ponder it. Then, imagine that you could somehow have an equally superficial reaction to a whole book at once. That's what we're talking about here.So, like, you could glance at a novel, and know all of the things that happened in it... insofar as those things were presented 100% transparently, without requiring any effort on the reader's part to "connect the dots" (even in a fairly trivial way).But wherever there are "dots" that require "connecting," you won't have connected them; obvious questions like "wait, who is that guy? have I seen his name before?" will go unasked, and your intuitive sense of the later parts of the book will become more and more distorted as these unthinking surface-level takes on the early stuff get reused to (badly) interpret the slightly-less-early stuff, and so on.Now, after the glance, you do in fact remember all the words. And you may, if you wish and at your leisure, begin to actually think about what you read. You can ask yourself "really, though, who was that guy? that one character? what was his deal?", and begin to piece together a real answer by tying together the words (and superficial gut-level interpretations) that you now remember. You might have to do a lot of this, though, to "get" a long or difficult book; there may just be a lot of things that need thinking-time, and for you, that thinking-time can only begin when the actual reading ends.(What I describe in the last paragraph is analogous to an LLM doing CoT generation after taking in a book as input, where the CoT is just trying to help the LLM understand what it read rather than doing some additional task on top of that.As I indicated, such a CoT might have to go on for a very long time – much longer than the sorts of CoTs people are used to eliciting from LLMs – in order to reach a deep understanding of the book.And if you don't include this step at all, and just start asking the LLM about the book right away, what you're getting is what the glance superpower guy would say right after a glance. Unreflective takes about a text that he's ingested, but not digested.)--------[^1] on that topic I am pretty bullish about this recent effort, though time will tell how good it really is... (I tried uploading some novels into its free web interface yesterday but it timed out, so I couldn't do a vibe check myself)#ai tag#i wrote this really fast - sorry if it's hard to understand

nostalgebraistgalacticwiseguy asked:I had the impression that "you just gotta place your faith in a theorem called bayes" was a song lyrics or something like that (presumably from a rationalist folksong). but it looks like that phrase appears nowhere on the Internet except your blog? Is it a reference to something in particularIt's a reference to this song. (The link is to a recent upload of it, seems like the original video from many years ago is no longer up)