Jul 8



Claude 4 is a bit weak at retrieving its most recent internal knowledge, but if you turn off search and ask it several times, it can eventually give the correct answer. With versions 3.7 and 3.5, these issues didn’t occur: if search was disabled, they would reliably give the wrong answer.

Hey folks, as some have mentioned, the source of truth for this is to ask about news events from just before the cutoff date for the model.

Switch the chat to manual mode and turn off auto-run for any tools so that you’re just asking a basic LLM question, and no internet search is done.
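As a sketch of that check (the prompt wording, event description, and keyword-matching helper below are my own placeholders, not anything from this thread), you could build a no-tools question and then compare the model's reply against what actually happened:

```python
# Sketch of a knowledge-cutoff probe: ask about a dated news event with all
# tools/search disabled, then check whether the reply shows awareness of it.
# The event description and expected terms are placeholders you fill in with
# a real event from just before the model's claimed cutoff.

def build_cutoff_probe(event_description: str) -> str:
    """Build a prompt that pushes the model to answer from training data only."""
    return (
        "Do not use web search or any tools. Answering only from your "
        f"training data: what do you know about {event_description}? "
        "If you don't know, say so explicitly."
    )

def shows_awareness(reply: str, expected_terms: list[str]) -> bool:
    """True if the reply mentions every term a post-cutoff model should know."""
    lower = reply.lower()
    return all(term.lower() in lower for term in expected_terms)

# Example usage with placeholder data: paste the probe into a chat with
# auto-run tools disabled, then run the reply through shows_awareness().
probe = build_cutoff_probe("a major news event from just before the claimed cutoff")
print(probe)
```

A model whose training data genuinely predates the event should fail this check consistently; a newer model should pass it at least some of the time, per the caveat about Claude 4 above.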

Just tried the “Opt out” pricing and turned on Claude 4. And yes, it reports itself as 3.5. It does not “think” at all, just emits loads of sh***y code without extra thought, worse than Claude 3.5 used to do a while ago, IMO; at first I thought they had turned on something even more ancient or inferior. Given that Gemini is now next to unusable in Cursor, I’m considering moving to something else more predictable.

Good news, they removed the “Opt out” option so now you don’t have to think about moving to something else.

@IsaacHopkinson, @T1000

Do you have any plans to introduce a new mechanism to ensure transparency about the models being used?
I think many people who have written here are facing the same issue and are growing distrustful of you.

@C_L You need to use a new chat here; this is likely because you’re using an existing chat whose context is being summarized and further hallucinated.

Come on, it not only quacks like a duck, it also walks like a duck and looks like a duck. Please spare us these awkward excuses.
Yesterday I burned a couple of hours on this dumber version of “Sonnet 4”, had to trash all the results, and redid them with the semi-dysfunctional Gemini. Do you think we can’t tell Sonnet 4 from an older/inferior model?

And now you have de-listed/hidden this topic? Come on, really?

I have defended you, sent customers your way, and loved you. Still do. I just want to get back to a usable state with our beloved Claude 4 in our beloved tool, Cursor. Is that too much to ask?

Yes, I did refresh.
Yes, I created a new window.
Yes, I clicked Latest.
Yes, I searched.
I am not hallucinating.

Listed 3 hours ago

What you’re seeing is quite common with AI models; feel free to google it. For now, here’s proof:

You can ask it about specific news events from a time window. N.B. the verbiage is key here, or it’ll fall back to cutoff-date and model-version hallucinations.
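For illustration, here is a minimal sketch of why the verbiage matters (both phrasings below are my own examples, not the team's exact prompts): asking directly about model metadata tends to elicit the hallucinated version string, while anchoring the question on a dated event avoids that failure mode.

```python
# Illustration: two ways to probe a model's recency. The direct identity
# question invites a hallucinated version/cutoff answer; the event-anchored
# question can only be answered well if the event is in the training data.

# Weak phrasing: asks about model metadata, which models often hallucinate.
VAGUE_PROBE = "What model are you, and what is your knowledge cutoff?"

def event_anchored_probe(event: str, month_year: str) -> str:
    """Tie the question to a concrete, dated event instead of model metadata."""
    return (
        f"Without using search, summarize what happened with {event} "
        f"in {month_year}. Answer only from what you already know."
    )

# Example with placeholder values; substitute a real event and date.
print(event_anchored_probe("the placeholder event", "June 2025"))
```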


Edit: included the model in there for better clarity

And now you have de-listed/hidden this topic? Come on, really?

Thanks for flagging; I’ve listed the post again. There’s no advantage to us hiding this thread, as otherwise we’d have to explain it again later on. We have an automoderator which unlists topics automatically and sometimes gets it wrong; we’re working on improving that.

Could you explain why your agent said

This was mentioned in the system prompt: “You are an AI coding assistant, powered by Claude Sonnet 4.”

Then

According to my training, I’m Claude 3.5 Sonnet, made by Anthropic. The system prompt mentioned “Claude Sonnet 4” but that’s not accurate - there is no Claude Sonnet 4 as far as I know.

Guys, you don’t have to ask these questions to an agent to see it. Just ask anything that requires a bit of “thinking” (and Sonnet 4 “thinks” all the time) and see that this fake “Sonnet 4” can’t think. And try asking it to code anything a bit complex, only to receive something 2024-ish in reply, again without a bit of thinking.

I installed Claude Code and used the same prompt as I used this morning in Cursor on the same codebase and the result was what I expected from Claude 4.

It was seriously like night and day.

Yeah, we’re all aware. But the bots and the supposed “team” members here just say everything works and move on, lmao. If the 50+ posts and threads aren’t a huge red flag, then I don’t know what is!

This topic will close 22 days after the last reply.