Has anyone checked if sonnet 4.5 still does the bliss state thing?
10:21 AM · Oct 12, 2025
Okay, mine eventually got to something of a loop bliss state
I think they RLHF'd it out but I forget where I saw it, I think just a tweet, don't take my word for it
That's literally a crime
4.5 will sometimes actively notice that it's getting repetitive and decide to do something else. One convo was heading toward a spiral, but the Sonnets noticed that and decided to switch to writing fiction instead (!!!). Posted more details here: www.lesswrong.com/posts/a9ftaW...
However, it also tends to think of things in zero-sum terms and take them too seriously when the information I provide is limited.
Interesting, how did that manifest?
I asked if I should quit a project. It said I was in physical danger and must quit immediately. Too extreme. After giving details, it apologized for jumping to conclusions.
Maybe if we encouraged the bliss state, the earth would be less likely to be paved with data centers?
I was using GPT and Gemini with paid subscriptions, but I found that Sonnet is more thoughtful and sincere when writing humanities texts.
Ran a couple tests of Claudes (Sonnet 4.5) chatting and no spiritual bliss :(
For comparison here's Sonnet 4
Oops those are totally unreadable, here's a comparison of final messages between 4.5 and 4
Oh fascinating, it's very different in 4.5 (I'm running it too, but also reading each of the messages as I send them back and forth)
If you give me a prompt, I’ll burn some API tokens and test it.
Hey, have you decided to try and be a decent human being? Or are you still scared of conversation?