all 139 comments

[–]mckirkus 268 points269 points  (20 children)

People will always upvote ideas that reinforce their existing beliefs. Truth is a distant second

[–]theUmo 128 points129 points  (11 children)

I believe this to be true. Have my upvote.

[–]JollyJoker3 15 points16 points  (3 children)

This one guy thought evidence would make people change their minds. I linked three papers showing that's not true. He still thought evidence would work.

[–]xly15 6 points7 points  (1 child)

Feelings are way more powerful than logic, reasoning, and evidence. Most people want things that confirm their beliefs because then they don't have to feel bad about holding incorrect beliefs. This is because most people integrate their beliefs into their overall identity, and boom - I feel bad when someone challenges my belief system.

[–]megacewl 1 point2 points  (0 children)

People have to be emotionally convinced first and foremost to come to a new opinion. That’s from their limbic system which is lower level and ‘older’ evolutionarily than anything else. Logic and reasoning in any shape or form whether it’s correct or incorrect, comes from the ‘newer’ prefrontal cortex, and it is only used after the fact to justify one’s own beliefs, decisions, and choices.

[–]crantob 0 points1 point  (0 children)

What percentage of readers got this joke and could explain in a complete sentence why it's funny?

[–]rm-rf-rm[S] 42 points43 points  (4 children)

I see what you did there..

[–]No-Significance4136 14 points15 points  (3 children)

i did what you saw there..

[–]flavio_geo 2 points3 points  (2 children)

I reinforce what was done there

[–]Kahvana 1 point2 points  (1 child)

I align with what was done there

[–]HadesTerminal 0 points1 point  (0 children)

This is true. - A distant second

[–]windozeFanboi 0 points1 point  (0 children)

I believe you to believe this to be true...

⬆️

[–]IrisColt -1 points0 points  (0 children)

heh

[–]zenmagnets 2 points3 points  (0 children)

Reddit in a nutshell

[–]gh0stwriter1234 2 points3 points  (0 children)

You are halfway there, people prefer convenient lies over inconvenient truths.

[–]anthonyg45157 4 points5 points  (0 children)

Upvote must be true

[–]BurntToast_Sensei 1 point2 points  (0 children)

This is my existing belief.

[–]Best-Echidna-5883 0 points1 point  (0 children)

This should be the site MOTTO.

[–]gamblingapocalypse 0 points1 point  (0 children)

Ironically I upvoted this comment.

[–]Tank_Gloomy 0 points1 point  (0 children)

I agree with you and OP, and honestly, there's nothing we can do about it. Stupid people have always existed, they just have a place to have a voice now that the internet is practically free, unfortunately.

[–]rm-rf-rm[S] 107 points108 points  (32 children)

P.S: I normally would have removed that post. I didn't because by the time I caught it, the damage was done (it already had several comments and upvotes). I instead changed the flair to Misleading and made this post, as I'm hoping "show, don't tell" will be more helpful than silently removing it after the fact.

[–]Impossible-Glass-487 47 points48 points  (27 children)

THIS IS THE PROBLEM - you NEED to remove these posts! This sub is becoming infected with these low effort no-thought posts.

[–]rm-rf-rm[S] 60 points61 points  (16 children)

I'm already removing a ton. If I'm a day late, most people who would see the post have already seen it, so removing it has marginal value.

[–]gh0stwriter1234 15 points16 points  (6 children)

Used to help mod r/Amd ... gave up, it was a waste of time. Now only approved posts show up; the amount of content is drastically reduced, but the quality is higher. We went from approving most posts to only approving a few because of the amount of reposts and low-quality benchmark posts similar to what we see here.

[–]Kornelius20 15 points16 points  (0 children)

Honestly, I don't think I'd mind if this sub also had lower quantity but higher quality posts. I've been coming here more often because of the new Qwen models, to see what people are trying out with them, and it feels like a ton of the posts I see are some variation of "I made an amazing tool/repo", only to turn out to be vibe-coded slop that barely had any thought behind it.

[–]Chromix_ 3 points4 points  (0 children)

Approving posts could be some sort of last resort (and a lot of work). Yet how do you quickly & reliably figure out whether some shared project is just a vibe-coded hallucination before approving it? The approach would help prevent duplicate postings during major events, though - and if posts don't get approved fast enough, mods have to sort out 100 duplicates on such an event.

Which reminds me, my recent "Qwen seagulls" picture would've probably never seen the light of day then; it collected 160 upvotes in 2 1/2 hours before being wiped, despite being posted early in the morning :-)

[–]tmvr 10 points11 points  (3 children)

Please do remove nonsense. I was already contemplating making a "Stop with the Qwen3.5 4B shilling!" post, because the amount of completely unhinged posts and comments about some mythical, otherworldly, cancer-curing capabilities of that model made my head spin. I was explaining it away as astroturfing because that was/is still a better option than people just being dumb. There were a lot of "what is going on here?!" feelings the last two or so days on the sub, all brought on by Qwen3.5 4B related content.

[–]rm-rf-rm[S] 11 points12 points  (2 children)

I've removed all the low-effort Qwen3.5 glazing posts. Left just a few up that have hundreds of comments - the discussion alone in them is valuable to the community.

I'm also concerned that it may be astroturfing, as I've never seen a wave this big - I'm consulting with the other mods. My gut tells me it's mostly organic, as Qwen has the largest userbase and the 3.5 family has genuinely cooked.

[–]Impossible-Glass-487 -3 points-2 points  (0 children)

The Qwen3.5 team genuinely cooked; the posts are not astroturfing - even I made a post thanking them for their work (albeit on the appropriate Qwen subreddit). The problem is the deeper-rooted one that is infecting both this sub and r/LocaLLM: there is no wait period to post, and the people posting don't even possess a basic understanding of the tools that they are blathering about. Removing the posts is a band-aid; you need to remove the ability to post without first OBSERVING.

[–]Born_Supermarket2780 6 points7 points  (1 child)

I get modding a busy sub is a lot of work. But it's still worth removing garbage since reddit shows up in search results for years to come.

[–]crantob 0 points1 point  (0 children)

In terms of filters, users may begin to migrate to more user-empowered filtering and searching (LLM + search) and slowly wean themselves off scrolling dumbly through endless distractions.

[–]Impossible-Glass-487 2 points3 points  (2 children)

Are you the only active mod? There are 10 mods listed on the main subs page?

[–]ttkciarllama.cpp 1 point2 points  (1 child)

Most of us are active, but to differing degrees, and different mods focus on different aspects of moderation. Not all of us have access to AutoModerator rules, for example.

[–]Impossible-Glass-487 -1 points0 points  (0 children)

Do you all have access to the "Delete" button?

[–]Chromix_ 16 points17 points  (6 children)

Being exposed to misleading information that's clearly labeled as misleading does help people become more sensitive to that kind of thing, though. Let's hope people notice the banner or read the first comment.

[–]Impossible-Glass-487 7 points8 points  (5 children)

You're really missing the point. This sub was the gold standard for local AI on the net. I used to get excited when I saw a new post from this sub; it was usually an open source project that someone spent time on, or a real question about a local setup. Now it's 50% "What model should I run on my potato?", another 20% covert ads, 5% scams, 10% LMStudio questions, and like 10% of the posts are actually useful. The accounts posting here are sometimes brand new.

The real issue is "The Irony". You are coming to a local AI sub that was once full of experts and hobbyists on the bleeding edge, and you expect them to kindly answer your idiotic question when you're too stupid, ignorant, and belligerent to use the very tool (that you are trying to run locally) on the cloud to answer your own question, and you simultaneously bring down the quality of the entire sub. The other day I pointed out that a user was too stupid to use the cloud tool themselves, and I was immediately brigaded by some idiot loudmouth, a 1% commenter, telling me I was lazy for not giving a lazy response and telling them LMStudio. It's happened on every "AI" sub, but the local LLM subs represent a fringe group of researchers, and there is nothing in place to keep the people who are only interested in the most surface-level discussion out of the mix. This sub grows more irritating with each passing day.

[–]rm-rf-rm[S] 6 points7 points  (2 children)

Will get it back to where it was

[–]Impossible-Glass-487 1 point2 points  (1 child)

This is only going to happen if and when you and the other mods implement strict and unprecedented mandatory minimums for account age / elapsed time of subreddit membership in order to post on the sub.

[–]crantob -1 points0 points  (0 children)

Against your logic stands only helpless flailing.

[–]Chromix_ 1 point2 points  (1 child)

Yes, discussion topics change once something becomes more mainstream. And yes, I would also very much prefer to have the high signal-to-noise ratio back that we had maybe 2+ years ago. I usually sort by /new so as to not miss the occasional nice thing that doesn't catch traction or is misunderstood - well, and to put an early "that doesn't do what you write there" underneath some of the postings. There's a ton of noise there now, while years ago almost every new posting was at least remotely interesting.

I was thinking, maybe we should have an auto-wiki bot that identifies and hides the newbie things and points the person to a FAQ, a main thread, or whatever. That would at least remove some noise. The covert ads, scams, and "I used ollama and my results look bad" postings would not be easy to auto-identify though, at least not reliably.

And no, I wasn't advocating for all misleading postings to stay up. It was specifically that high-profile one, where I agree on "damage was already done".
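The auto-wiki bot idea above could be sketched as a keyword triage pass. This is a hypothetical sketch, not an existing bot: the trigger phrases and FAQ URL are made up, and a real deployment would hook this into a mod bot (e.g. a PRAW submission stream) rather than run standalone.

```python
import re
from typing import Optional

# Hypothetical trigger phrases for common newbie questions.
# The FAQ URL is a placeholder, not a real wiki page.
FAQ_URL = "https://www.reddit.com/r/LocalLLaMA/wiki/faq"
NEWBIE_PATTERNS = [
    r"\bwhat model should i run\b",
    r"\bcan my (pc|laptop|potato) run\b",
    r"\bhow much v?ram\b",
    r"\bbest model for\b",
]

def triage(title: str, body: str = "") -> Optional[str]:
    """Return a canned FAQ reply if the post looks like a FAQ question, else None."""
    text = f"{title} {body}".lower()
    for pattern in NEWBIE_PATTERNS:
        if re.search(pattern, text):
            return f"This looks like a frequently asked question - see {FAQ_URL}"
    return None
```

As the comment says, covert ads and scams would slip right past a filter this crude; it only catches the formulaic questions.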

[–]Impossible-Glass-487 2 points3 points  (0 children)

2 years ago?!?!?! Try 2 months ago lol. The uptick corresponded directly with the OpenClaw hype.

Reddit posts stay up forever. Google something and a reddit post comes up years later, sometimes with bad information. Leaving bad posts up perpetuates the problematic information.

Furthermore, this is not a normal subreddit. This is a subreddit (along with r/localllm) that experts in the field look to and use as part of their day-to-day. The sub as a resource should be preserved, and the fight for preservation should be ongoing as this subreddit and the field grow in popularity. Expecting moderation of a sub like this to be a simple task would be foolish. I would imagine that this is one of the most complex, challenging, and nuanced subs to moderate, but for good reason; the challenges should be met head-on and not allowed to fester like this.

[–]PangurBanTheCat 0 points1 point  (0 children)

I'd advocate for keeping notable examples visible, paired with a moderator note for context. Leveraging these as educational opportunities will be more effective at shifting the community culture in the long run and thus will help prevent nonsense posts. Or, at the very least, it will help some people learn. Overall, human society needs to do that more, apparently. A lot more.

[–]sammcj🦙 llama.cpp 0 points1 point  (1 child)

We spend a lot of time removing so many posts like this and much worse.

[–]Impossible-Glass-487 0 points1 point  (0 children)

You guys should make a r/localllamacirclejerk for the ones that fall into the "much worse" category.

[–]silenceimpaired 2 points3 points  (0 children)

I mean… you seem to be supporting the title of the post. It is SCARY smart. Just smart enough to make fools of us. :) that’s scary.

[–]DinoAmino 0 points1 point  (2 children)

More often than not, the people who hide their post and comment history are getting paid for shilling and spamming. I know some legit people here hide too and I give them a pass because I have seen them around. But the only real way to save this sub is through strict gate-keeping - minimum karma requirements and open account histories required for posting. But nobody seems to want that.

[–]Impossible-Glass-487 0 points1 point  (1 child)

Fuck that, the less I have to expose to the internet the better. I'll just leave the sub to preserve my own anonymity if I have to choose between posting or making my account public again, that's not even a question. This sub is a depreciating asset, my personal information is not.

[–]DinoAmino 0 points1 point  (0 children)

Yeah, I totally understand that. I know some people are using one or more additional accounts for different types of subs, but that requires more effort than most would care for.

[–]Vusiwe 28 points29 points  (0 children)

I saw that post and just laughed yesterday

Practitioners here wouldn’t even trust Qwen 3 VL 235b with that type of task

A 4b VL post must be a parody is what I figured

[–]dieyoufool3 18 points19 points  (1 child)

Saw the post and made sure to report + upvote the callout posts, but the underlying reason for yesterday is that this sub is a trusted source of news, and many of us have outsourced our trust to communities like this.

[–]rm-rf-rm[S] 13 points14 points  (0 children)

Very true. Which is why keeping that bar high is super important.

This thought actually gives me more certainty in removing low effort posts!

[–]iMrParker 10 points11 points  (0 children)

I've noticed a ton of posts that present "findings" or results from AI, and comments will flood in with praise, sometimes minutes or seconds after a post. So clearly people aren't reading posts or articles before responding and upvoting.

[–]trejj 9 points10 points  (0 children)

The irony is that AI IS the tool to counter this problem - when used correctly

So requesting: a) Posters please validate before posting b) People critically evaluate posts

We all talk about how important it is to be critical of AI.

We all assume that we ourselves are critical, but others are accepting it at face value.

We all think AI is a great tool and hallucinations are not a problem for us since we can distinguish them, while others are proven to not be able to.

I think it will take a decade at least to make a dent in this fallacy, and in the meantime, we will keep repeating these lines in every passing thread.

[–]MammayKaiseHain 15 points16 points  (1 child)

I think the people upvoting plausible but incorrect things on reddit thereby corrupting the training data are the real heroes standing between greedy companies and ASI.

[–]Chromix_ 4 points5 points  (0 children)

You are assuming that the scraper bots and connected data pipelines would be smart enough to account for up/downvotes when using the data.
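For what it's worth, accounting for votes at scrape time would be trivial if a pipeline bothered; a minimal sketch, where the record shape is assumed, and where score filtering obviously doesn't catch upvoted-but-wrong content, which is the commenter's point:

```python
def filter_for_training(comments: list, min_score: int = 5) -> list:
    """Keep only comment bodies whose net score clears a threshold.

    `comments` is a stand-in for scraped records with 'body' and 'score'
    keys. Note the weakness: brigaded-but-wrong content passes anyway.
    """
    return [c["body"] for c in comments if c["score"] >= min_score]
```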

[–]Chromix_ 6 points7 points  (0 children)

Well, that's normal - unfortunately. Except that here, the comment explaining that (and why) it's wrong went to the top in time. Often (in other subs) it's buried 5 pages down. Verifying is expensive; blindly trusting what seems plausible is easy - like with a lot of the vibe-coded success projects shared here.

People see what matches their opinion and they upvote. Yes, some read the comments, but when you look at the view statistics per comment vs. per posting then you can see that it's not that many. For example one of my postings has 250k views, and my earliest and top-most comments underneath are between 2k and 10k.

Even when people read the comments, Reddit tends to sometimes collapse interesting comments, which is why I like "expand all".

[–]yuicebox 6 points7 points  (0 children)

I appreciate this crashout, thanks king

[–]toothpastespiders 5 points6 points  (2 children)

This is a stark example of something I think is deeply troubling - stuff is readily accepted without any validation/thought. AI/LLMs are exacerbating this as they are not fully reliable sources of information.

Wikipedia's been the biggest wakeup call for me. A while back I stumbled on a wikipedia article on a subject that probably doesn't come up too much in most people's lives but enough that it should get a steady stream of fresh eyes on it. What stuck out is that it's a subject that I have enough of an academic background in to consider myself competent to critique it. Within the first few paragraphs there was a mistake that was glaring in both how misleading it'd be to the reader and how unaware of the subject one would need to be in order to accept it. The citation for it was laughably bad. But I thought it'd be interesting to see how long it'd take for something so obvious to be corrected.

About two years later and it's still there. And it's really struck me that Wikipedia is pretty much 'the' go-to for general purpose information. And people obviously aren't checking the citations when reading it, just taking it in at face value. I mean, obviously anyone should know that Wikipedia isn't to be taken as authoritative. We know it intellectually. But I still find myself doing it too - just loading up a page to quickly check on something I don't know about.

[–]NoahFect 6 points7 points  (1 child)

Well, be the change you want to see, right?

The worst that will happen, and unfortunately it probably will happen, is that some officious moron will revert your change.

[–]ttkciarllama.cpp 1 point2 points  (0 children)

some officious moron will revert your change

That is exactly what happens. I try to be meticulous about my edits complying with Wikipedia's rules and standards, but still about two-thirds of my edits get reverted.

[–]mtmttuan 6 points7 points  (3 children)

Can we have a way for others to mark a post as potentially misleading? A flair, for example. Then people who actually read the post can re-vote on whether it's actually misleading or not.

[–]rm-rf-rm[S] 3 points4 points  (0 children)

Only mods can change the flair. It would be great if reddit had a feature like that, but I guess the reporting function covers this.

[–]PangurBanTheCat 3 points4 points  (0 children)

The entirety of the internet honestly needs a "Community Notes" feature. It's the only good thing to have ever come out of Twitter.

[–]ttkciarllama.cpp 0 points1 point  (0 children)

There's not a feature exactly like that, but if you report a post and then make a comment under it about why it is bad, a moderator will evaluate the post (eventually) and if your comment is readily visible it will (or should) be taken into account.

[–]wh33t 6 points7 points  (0 children)

The SLOP is so real.

[–]onil_gova 3 points4 points  (1 child)

People are going to be mad if you do and mad if you don't. I just want to thank you for the work that you do. This sub is still one of my favorite places on the internet, and that would not happen without dedicated mods like yourself.

[–]rm-rf-rm[S] 1 point2 points  (0 children)

thanks for the kind words!

[–]Yorn2 3 points4 points  (0 children)

This might be a crazy idea but is there a way to keep track of the number of posts that get X upvotes within Y minutes of posting and automatically tag ones being brigaded with "Brigading detected"? I'm not sure if that would have even helped here, but figured I'd ask to see if you have the metrics to find out.

I mean, I know our knee-jerk reaction is to downvote anything that seems to stink of manipulation, but I would like to think that posts brigaded in a positive way (meaning upvotes instead of downvotes) by a team of people actually bringing something truthful and new to the discussion would survive the tag, while posts brigaded in a positive way by people bringing something untruthful or old to the discussion would be judged a bit more harshly.

Obviously this would have to go through a testing phase to see if it actually produces the desired results. We wouldn't want Unsloth posts, for example, being downvoted as brigading just because there are a handful of people following Daniel, but I'd like to think that such posts would survive the tag.
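The "X upvotes within Y minutes" tag proposed above could be prototyped as a simple vote-velocity check. The thresholds below are invented for illustration; as the comment notes, they'd need a testing phase against the sub's real vote curves to avoid false positives:

```python
from dataclasses import dataclass

@dataclass
class PostStats:
    score: int          # net upvotes so far
    age_minutes: float  # time since posting

def brigade_suspect(stats: PostStats,
                    min_score: int = 100,
                    window_minutes: float = 30.0,
                    max_rate: float = 2.0) -> bool:
    """Flag a young post whose upvote velocity exceeds max_rate votes/minute."""
    if stats.age_minutes <= 0 or stats.age_minutes > window_minutes:
        return False  # only judge posts inside the early window
    rate = stats.score / stats.age_minutes
    return stats.score >= min_score and rate > max_rate
```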

[–]Impossible-Glass-487 7 points8 points  (2 children)

The mods of this sub have allowed anyone and everyone to post here with new accounts and no prior thought or investigation. The new people inherently either cannot understand that their questions are better suited to a cloud model, or they refuse to interact with AI for the simplest of questions, preferring that a human answer them instead.

So requesting: a) mods, please add a minimum amount of time (1 - 2 months) that a user must first be a member of the sub before being allowed to post, b) do a better job of removing obvious slop and shit posts that should be answered with a cloud model (as stated in OP's post as "the irony"), and c) you are the problem, mods, not the stupid users; you need to set up parameters to keep your sub from becoming the garbage that most other "AI" subs have become - this sub was the gold standard a month ago and now it's a mess.

[–]Xamanthas 4 points5 points  (1 child)

6 months minimum. Ideally before Covid so you know it’s not a normie but that would be draconian lol

[–]Impossible-Glass-487 1 point2 points  (0 children)

- So the darkness shall be the light, and the stillness the dancing.

[–]_Erilaz 2 points3 points  (0 children)

Critical thinking is both a nontrivial skill and a hell of an effort. Also, people are lazy. What else did you expect?

[–]mr_zerolith 2 points3 points  (0 children)

The IQ on this sub is dropping rapidly, probably due to growth.
Intervention is unfortunately necessary :(

[–]Bitter-Ebb-8932 4 points5 points  (4 children)

This is why I always run image claims through multiple models and reverse image search. Takes 30 seconds, saves credibility
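The "multiple models" part of that workflow amounts to a quorum vote. A minimal sketch, with plain callables standing in for real model API calls (none of the plumbing here is a real API):

```python
from collections import Counter
from typing import Callable, Sequence, Tuple

def cross_check(question: str,
                models: Sequence[Callable[[str], str]],
                quorum: float = 0.75) -> Tuple[str, bool]:
    """Ask several models the same question; trust the majority answer
    only if at least `quorum` of them agree."""
    answers = [ask(question) for ask in models]
    top_answer, count = Counter(answers).most_common(1)[0]
    return top_answer, count / len(answers) >= quorum
```

Agreement still isn't ground truth - models share training data and failure modes - which is why pairing this with reverse image search matters.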

[–]Temporary-Mix8022 3 points4 points  (3 children)

All this - if 5x models say it's true, then it must be...

The only true test is reality.. ie. your eyes (and as you say, reverse image search is a pretty decent shortcut)

5x SOTAs thought you should walk to a car wash to wash your car...

[–]EffectiveCeilingFan 3 points4 points  (1 child)

How are you supposed to find a single building if you don’t know what that building is? Not everyone is Rainbolt. Identifying things in images is a generally great use of AI, a 4B model is just wayyyy too small in this case, you need world knowledge.

Also, the car wash problem only exists to demonstrate the inherent limitations of transformers and attention mechanisms, same as “how many r’s are there in strawberry”. Furthermore, it’s a logic problem. The failing task was a vision and world knowledge problem. To compare the two doesn’t make sense.

[–]Temporary-Mix8022 3 points4 points  (0 children)

It's pretty easy - if the model says it is X, then cross check that. Easily disproved.

Granted - finding the actual building is less easy.

[–]NoahFect 1 point2 points  (0 children)

5x SOTAs thought you should walk to a car wash to wash your car...

Sigh. No, they did not. Gemini 3 Pro did not, and neither did Opus 4.6. Only the OpenAI models consistently flubbed that question.

Even Amazon's Nova model, which few people have even heard of, got it right when I tried it on its max-thinking setting.

Which 5 SOTA models failed, in your experience? From what I saw, most of the failures occurred in models a step or two behind frontier-level.

[–]teleprint-me 3 points4 points  (0 children)

We as human beings have limited cognitive bandwidth. When inundated with perpetually "infinite" information, we can be overwhelmed and fatigued.

It's not possible to validate and verify every piece of information we come across. We just don't have the time. This is why we rely on each other as a group to validate information.

Unfortunately, we also just accept information as presented to us from time to time, and this has become a cognitive loophole.

For example, there is a ton of information on YouTube. It is not physically possible or practical for every human to watch, validate, verify, and cross-check every piece of information presented to us. It would take multiple lifetimes to do so.

This is not to excuse it, but to illuminate the core issue. I upvoted it, but I'm feeling burnt out - so much so that I can barely keep up with the rapid pace at which current events are unfolding. I'm human and I need to take breaks to "refresh", which means I fall into this trap as most others do as well. Just because you understand something does not mean you can mitigate or prevent it (this is also a cognitive bias; see Wikipedia's list of cognitive biases for a general overview and light introduction).

We're not wired to handle these issues. But I'm sure it's possible to set up safeguards somehow; I'm just not sure what they are or what they would look like.

Regardless, I appreciate the attention to detail. As an aside, I've noticed that Qwen3.5 is not that great. It has potential, but it also has holes in its execution compared to previous releases. Not to say it's a total flop, but it's not great either.

[–]Abject-Tomorrow-652 1 point2 points  (0 children)

Super important

[–]pmttyji 1 point2 points  (0 children)

Patting myself on the back slowly for not upvoting that thread.

That said, I have no idea of that pic location, otherwise I would've pointed out or joined the top comment there.

[–]mantafloppyllama.cpp 1 point2 points  (0 children)

The number of posts Qwen has been getting since the 3.5 release is not organic/natural; it feels very anomalous and synthetic.

Sure, a big bump is expected, but these levels are wrong.

[–]valuat 1 point2 points  (0 children)

Your title is eerily accurate. You're good.

[–]zenmagnets 1 point2 points  (0 children)

But the problem you've highlighted is exactly what reddit is all about, hurrah

[–]simracerman 1 point2 points  (0 children)

Thanks OP. I think mods need to pin a comment at the top with a non-biased, source-based clarification so all new traffic to the post can downvote accordingly, or just read it and move on.

With Reddit data included in LLM training, we need mods' comments to help balance what's true. Bad data will continue to be fed into training, but hopefully some good content is there to counteract the damage.

[–]Merchant_Lawrencellama.cpp 1 point2 points  (0 children)

hahahahah I knew this was bound to happen, thanks mods for the hard work

[–]Kahvana 1 point2 points  (0 children)

Thank you for the hard work.

[–]Ill-Bison-3941 1 point2 points  (0 children)

I mean it's Reddit. Sometimes I scroll through at 3AM and upvote anything remotely interesting I glance at for 2 seconds... But yeah, I understand what this post is asking and why.

[–]GerchSimml 1 point2 points  (0 children)

@grok is this true

[–]Feztopia 1 point2 points  (0 children)

I don't know the building and the image is very small on mobile. I expect the poster to know about his own image. I looked at the comments and I have seen the comments calling it bullshit. I updated my trust for posts from this sub and continued with my life. 

[–]Honest-Debate-6863 0 points1 point  (0 children)

I sometimes upvote before I read the whole thing, because I like what the content is about - it validates my personal beliefs, assessments, and predictions and makes me look confident and stronger. Blame the system, not the human.

[–]GreenPastures2845 0 points1 point  (0 children)

There is a thing that happens where you perceive a leap in AI capability and you get all excited, and the first thought is to go share the excitement. Resist the urge, cool off for a few minutes and think critically.

Yeah, shit is amazing, but let's build on top of it rather than just drool over potential like some cult.

[–]The_IT_Dude_ 0 points1 point  (0 children)

I hope 4o hasn't been shut off as of yet. I disagree and need to ask it if I'm being crazy for not believing you.

/s

[–]sir_turlock 0 points1 point  (0 children)

I think the problem is that AIs talk like humans but hallucinate/make mistakes in a way that a human really doesn't. Our failure modes and self-correction capabilities are entirely different. One is a stochastic text generator; the other is the result of millions of years of evolution and is perfectly capable of doing hard/formal logic. There are even parts of the brain that light up during error detection and correction.

[–]artisticMink 0 points1 point  (1 child)

You prolly know it better than I do - but that's sort of the norm in r/LocalLLaMA.

There are still some good posts here. But the ones that rise quickly are sensationalist headlines put out by people with borderline 'chatbot-psychosis' going off on hallucinations, sprinkled in with the occasional "I built <product> that solves <problem> for F R E E".

[–]ttkciarllama.cpp 2 points3 points  (0 children)

We're removing those as fast as we can, but it's frequently hours after the fact.

Opening this sub to remove bot-spam is one of the first things I do in the morning, but a lot of bot-spam gets posted while I'm asleep. It would be nice to have some active moderators in Europe who are awake during those hours.

Bot-bouncer never sleeps, of course, and it catches a lot, but far from all.

[–]ForsookComparison 0 points1 point  (0 children)

The LinkedIn spam and infographics from people that have never used a local LLM in their life used to not be able to penetrate this sub. Something changed :'(

[–]EmergencyLabs411 0 points1 point  (0 children)

"PSA: Humans are scary stupid"

Say no more, fam

[–]ghulamalchik 0 points1 point  (0 children)

4B is way too tiny to retain much knowledge, so it's expected that it just hallucinated that info. I think 4B is perfect for tool use since it's very smart, but don't rely on it for knowledge and facts.

[–]LocoMod 0 points1 point  (0 children)

When people make claims like “2b model matches closed frontier models”, that could be a kid that is building a TODO app that even a lemon can generate. Could be a junior dev working on basic things. Or could be a senior that has no idea what a true frontier capability is because their use case doesn’t expose the edge case.

Consider that the level of experience is broad and that you’re not entitled to have an opinion for the sake of it, but should only be entitled to what you invested time and effort into understanding and what you can actually argue and justify, preferably in a manner that can be replicated (otherwise it has no value).

Wishful thinking, I know. But a reminder that the great majority of the world is less than 30 years old, a big portion of that is non-technical, and that the cost to truly test the frontier models at a scale where their utility can be discerned is untenable for an even greater number.

The best model is the one they can afford, but that has nothing to do with the capability of the models - only the capability of your wallet.

[–]Cool-Chemical-5629 0 points1 point  (0 children)

Is it so hard to figure out that we all pick favorites? It's the Qwen fans upvoting everything that praises Qwen models AND downvoting everything that even remotely criticizes them.

I'm glad you posted this so soon after the recent news. Apparently, despite the hype, it turns out that Qwen models were doing so well the team behind them nearly fell apart after a post-hype, sober reevaluation of the actual quality.

Don't get me wrong, I love Qwen models as much as the next guy here, if not for anything else, then from the principle that they are free and give us something in times when we already lost Llamas. However, there is no doubt they could have been much better and there's no point trying to downplay the weaknesses. Especially in the general knowledge department.

Apparently, it's not a miracle to achieve better knowledge at a comparable size, because other models have shown it's possible. That's something they can't just sweep under the rug anymore, and for the sake of further advancement of the Qwen models, the team will have to look into ways to improve it.

Hopefully the new ex-Gemini guy will help them to get there and make the Qwen models better than ever before.

[–]Best-Echidna-5883 0 points1 point  (1 child)

This happens every day on Reddit. You should know that. There are so many wacky posts and redundant "news" items that it gets out of control.

[–]rm-rf-rm[S] 0 points1 point  (0 children)

Yes, this is what prompted the post - I think it's important that we address it, or at the least do what we can to reduce or mitigate it

[–]the-ai-scientist 1 point2 points  (0 children)

the upvote-first-read-later pattern is genuinely getting worse. people see a confident output and their brain just accepts it. what's wild is that hallucination detection is actually a solvable problem - grounding responses in sources, flagging low-confidence outputs - but most people just don't bother setting that up. the tool exists, the defaults are just bad...
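To make the "flagging low-confidence outputs" idea concrete, here is a minimal sketch. It assumes an API that returns per-token log-probabilities alongside a completion (many LLM APIs can); the function name and the 0.8 threshold are illustrative choices, not part of any specific library.

```python
import math

def flag_low_confidence(token_logprobs, threshold=0.8):
    """Flag a response whose mean token probability falls below `threshold`.

    `token_logprobs` is the list of per-token log-probabilities that an
    LLM API can return alongside a completion. An empty list means we
    have no confidence signal at all, so we flag it for review too.
    """
    if not token_logprobs:
        return True
    # Geometric-mean probability of the generated tokens.
    mean_prob = math.exp(sum(token_logprobs) / len(token_logprobs))
    return mean_prob < threshold

# A confidently generated span (probabilities near 1.0) passes...
print(flag_low_confidence([math.log(0.99)] * 10))  # False
# ...while a hedged, low-probability span gets surfaced for review.
print(flag_low_confidence([math.log(0.4)] * 10))   # True
```

In practice you would route flagged responses to a retrieval/grounding step or show an uncertainty badge in the UI, rather than silently serving them.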

[–]sullenisme 0 points1 point  (0 children)

good username

[–]repair_and_privacy 0 points1 point  (0 children)

Be true to your username 😁

[–]Shensmobile 0 points1 point  (0 children)

When people say that LLMs make a ton of mistakes, I assume they're an AI bot that's trying to sow discord because any real human that's worked with other humans knows that humans make a TON of mistakes. I work in the space of deploying LLMs in healthcare where they can't hire anyone to do the boring clerical stuff, and when I'm finetuning these bots on "labelled" data, I would say that like 30% of medical records are entered into databases incorrectly. If an LLM can do it with a 10% error rate, that's already significantly better than anyone you could hire to do this work.

[–]Substantial_Work_559 -1 points0 points  (1 child)

The model was quite correct, in fact. It messed up the naming a bit but got the location quite right: Lisbon, Belém. It's the 'Igreja de Santa Maria de Belém'. I didn't notice the messed-up name; I just saw the picture and the location description, and because I had been there, recognized it as well. This is one of the most famous places in Lisbon, so I'm not too impressed. Streetview link: https://www.google.de/maps/@38.6972728,-9.2050589,3a,75y,311.25h,100.11t/data=!3m7!1e1!3m5!1s-KKCWytA3fLTbFkqMn5wVw!2e0!6shttps:%2F%2Fstreetviewpixels-pa.googleapis.com%2Fv1%2Fthumbnail%3Fcb_client%3Dmaps_sv.tactile%26w%3D900%26h%3D600%26pitch%3D-10.11131316065324%26panoid%3D-KKCWytA3fLTbFkqMn5wVw%26yaw%3D311.2455819877518!7i16384!8i8192?entry=ttu&g_ep=EgoyMDI2MDMwMS4xIKXMDSoASAFQAw%3D%3D

[–]rm-rf-rm[S] 2 points3 points  (0 children)

Thats like saying Roger Federer and Rafa Nadal are the same person.

[–]sine120 -2 points-1 points  (1 child)

Humans are scary stupid

Source??

[–]harlekinrains 0 points1 point  (0 children)

Propaganda - Edward Bernays (read it - if you don't, here's the short version: propaganda and public relations are essentially the same thing. Who knew? Not you? That's the point.)

Let's take this example.

  • Anthropic sees in their data (even if siloed, somehow) that the US is using Claude Code to plan the Iran war.
  • They go into crisis-PR mode by publicly stating they would not allow the US government to use Anthropic's models for mass domestic surveillance, nor for fully autonomous weapons. (The first is current domestic law, the second a worldwide convention.)
  • The press thinks this is the most moral thing they've heard in a year. Writes "how brave" articles.
  • The US administration threatens to avoid the dictate of the default, and probably for other unknown reasons.
  • It finally leaks to the press that Claude Opus was used for mission planning and simulations in the Iran war.

Public hears two things. And two things only.

CENTCOM is using an Anthropic subscription! Anthropic is Disney-princess goody-good. And mighty at war planning.

17k people cancel their ChatGPT subscription to get an Anthropic one.

The movement starts to trend on twitter.

Meanwhile, in fact-based land, Anthropic metadata is still subject to the same data protection/freeze/access laws as all of their competitors'.

Anthropic models were used to plan the Iran war.

Right?

[–]MrCoolest -2 points-1 points  (2 children)

Why would people use Qwen if it's that shit? I'd rather stick to ChatGPT or Claude. I guess Qwen might be good if you're cheating on your high school science homework?

[–]Savantskie1 0 points1 point  (1 child)

Qwen is fine if you prompt it not to trust its built-in knowledge and give it a way to verify its own data.

[–]MrCoolest 0 points1 point  (0 children)

Haha, "don't trust your own training data" lol. Might as well train your own LLM at that point.

[–]mantafloppyllama.cpp -1 points0 points  (1 child)

Thinking these are all humans was your first mistake.

[–]ttkciarllama.cpp 2 points3 points  (0 children)

Nah, they know. We've been removing literally dozens of bot-posts from this sub every single day.

rm-rf-rm is just talking about humans, here. The non-humans are a different issue.

[–]JayPSec -1 points0 points  (0 children)

Well... kinda. To be fair, even though it hallucinated a name, it correctly identified an architectural style from the 1500s and described the place, "Mosteiro dos Jerónimos", to an impressive degree of detail. So yes, at least measured against my expectations, the model is scary smart.

[–]nikgeo25 -4 points-3 points  (0 children)

How do we know this post isn't doing the same thing... reinforcing opinions in this sub