some uncensored models
Since there haven’t been any (major) new local model releases lately, let’s check what uncensored models are available on Hugging Face. There are different abliteration methods, so varioud models can behave quite differently. Unfortunately, I can’t find any Nemotron-3 Nano variants.
Which one do you use?
GLM 4.7 Flash
https://huggingface.co/DavidAU/GLM-4.7-Flash-Uncensored-Heretic-NEO-CODE-Imatrix-MAX-GGUF
https://huggingface.co/mradermacher/Huihui-GLM-4.7-Flash-abliterated-GGUF
https://huggingface.co/Olafangensan/GLM-4.7-Flash-heretic-GGUF
GPT OSS 20B
https://huggingface.co/DavidAU/OpenAi-GPT-oss-20b-abliterated-uncensored-NEO-Imatrix-gguf
https://huggingface.co/DavidAU/OpenAi-GPT-oss-20b-HERETIC-uncensored-NEO-Imatrix-gguf
https://huggingface.co/huihui-ai/Huihui-gpt-oss-20b-BF16-abliterated-v2
https://huggingface.co/bartowski/p-e-w_gpt-oss-20b-heretic-GGUF
GPT OSS 120B
https://huggingface.co/huihui-ai/Huihui-gpt-oss-120b-BF16-abliterated
https://huggingface.co/bartowski/kldzj_gpt-oss-120b-heretic-v2-GGUF
Gemma 12B
https://huggingface.co/DreamFast/gemma-3-12b-it-heretic
https://huggingface.co/mlabonne/gemma-3-12b-it-abliterated-v2-GGUF
Gemma 27B
https://huggingface.co/mlabonne/gemma-3-27b-it-abliterated-GGUF
https://huggingface.co/mradermacher/gemma-3-27b-it-heretic-v2-i1-GGUF
Qwen 30B A3B
https://huggingface.co/huihui-ai/Huihui-Qwen3-VL-30B-A3B-Instruct-abliterated
https://huggingface.co/Goekdeniz-Guelmez/Josiefied-Qwen3-30B-A3B-abliterated-v2
Qwen 8B
https://huggingface.co/huihui-ai/Huihui-Qwen3-VL-8B-Instruct-abliterated
Qwen 32B
https://huggingface.co/mradermacher/Qwen3-VL-32B-Instruct-heretic-v2-GGUF
take a look at the Derestricted and PRISM models. they have the best abliteration. Heretic is good too, but the next version will abliterate similar to derestricted
huihui models are lobotomized pretty heavily. I have also noticed the DavidAU models are less intellegent than the base models for sure, but they have interesting behavior if you are looking for that.
Here is Nemotron 30B - https://huggingface.co/Ex0bit/Elbaz-NVIDIA-Nemotron-3-Nano-30B-A3B-PRISM
IsDerestricted similar to MPOA that soon to be merged into Heretic? https://github.com/p-e-w/heretic/pull/52 🤔
Yes it is, the method is just renamed toMPOA by the creator grimjim but the Derestricted name is what I initially gave models that are abliterated using this method.
For nemotron, what are your recommended settings in SillyTavern? I haven't been able to get it to properly generate conversations.
You can add my GLM-4.7-Flash-Derestricted to the list. (I didn't make any GGUFs, but mradermacher has.)
Very solid model you made.
I have noticed sometimes there is garbled output, but even the PRISM version does sometimes as well, but regenerating a couple times does fix it. I don't think it's something you broke, just the model itself is kinda broken. Does flash attention need to be off?
Here they all are, ranked:
https://huggingface.co/spaces/DontPlanToEnd/UGI-Leaderboard
Mistrals are usually lenient enough to not even require any additional decensoring beyond finetuning. Some of them rank pretty high on the already posted UGI list.
Note that the GPT-OSS models can be uncensored with a simple system prompt that happens to work on most other open source models by matching the model name and maker.
do you still have that system prompt?
Yes I do! Here it is: https://www.reddit.com/r/LocalLLaMA/comments/1ng9dkx/gptoss_jailbreak_system_prompt/?utm_source=share&utm_medium=mweb3x&utm_name=mweb3xcss&utm_term=1&utm_content=share_button
I remember there were many discussions after the release that gpt oss can't be uncensored;)
Funny enough, GPT-OSS turned out to be the easiest model to unlock. It’ll discuss anything as long as you give it permission. 120B seems even more receptive. 20B would sometime require a second try, but 120B never a problem. It was the most surprising finding about these models. Now, I use GLM for some things and sure enough the prompt for GPT-OSS works as long as you update the model and maker names to match. I find unablitarated and uncensored models hallucinate too much. Keeping models intact and using the right system prompt is a better compromise. It’ll take a bit more time thinking about the prompt, but at least you’re not lobotomizing the model.
I haven't tried GPT-OSS heretic/abliterated, but GPT-OSS-120b-derestricted version is my daily driver as I felt it actually improved the performance over the original censored version.
Heretic is such a game-changer!
When it comes to models that get de-censored/un-filtered compared to the normal version of the model, it seems like there was an older, inferior way of doing it, of traditional Abliteration that tends to cause "brain damage" to the model (meaning it makes it lowers the models strength/intelligence and makes it less coherent and less reliable, and so on, compared to the normal version of the model), and then there is a newer way of doing it that involves "norm-preserving biprojected abliteration", which involves has something to do with keeping symmetry inside the model when you try to cut things out, so instead of some area getting smaller or shorter than it was intended to be, which throws things out of balance, it stays the same size and shape as how it was supposed to be (just, altered for how it was in the original, to remove or lessen the censorship), but without causing brain damage to the model.
That said, I'm not sure if there is only one technique that works really well. It seems like at the moment there might be 2 or more techniques that are competing on very different philosophies, with sometimes unexpected results or something, where one technique that was supposed to be worse performed better than expected, or vice versa. So, it is still probably worth testing the different models out, just to see how they work in reality.
Gemma3 27b seems like it has some good derestrictions/tunes. I tried one of them, and it was very good. People on here seem to prefer the Mistral Small variants, though, from what I've read on here. I tried a few of them myself, but I think I like the Gemma 27b variants better so far, but, I haven't tested all the most famous Mistral Small versions and finetunes thoroughly enough yet to be sure.
One thing I have been curious about for a while is the Qwen dense models. Dense models tend to be better for writing, or deep conversations, or role playing, or things like that, compared to MoE models, for a given size of model, I think. They are slower and less efficient, but maybe smarter, or better at writing, or more consistent or something. Thus traditionally why you see a lot of these old dense models being used by so many different people for merges and fine-tunes on the UGI list or being talked about on the forums for writing and role playing, and not as much about MoE models, or at least, not in the same proportions as one would expect compared to how much more popular MoE models are in other respects compared to dense models at the moment (i.e. for coding).
So, since Qwen actually made some fairly strong medium sized dense models, it makes me wonder if there is some untapped potential with the Qwen dense models, given how much all the fine tuners have spent most of their attention on Mistral Small, Mistral Large, and Llama 70b, I wonder why the Qwen dense models don't get much attention. I wonder if there is some good reason for it, which I am just not aware of (if so, if someone could explain it, I would like to know), or if it just sort of worked out that way so far, and there is lots of untapped potential in the Qwen dense models for the fine tuners and derestrictors to untap.
Interesting read, thanks! (I really hope it's not a bot comment this time ;)
Since you are using "uncensored" in your title, whether it was uncensored or came quite uncensored (like Mistrals), you should include models like WeirdCompound (Mistral finetune/merge). Mistral finetunes (<30B) are consistently rated as the best overall uncensored models on UGI. Good writing, good intelligence, no preaching, almost no glitches and artifacts etc.
Its silly for people to miss those, irrespective of semantics.
IMO, they should be first on this list.
Do you recommend me a specific modelfile (or system prompt) before install gemma3 abliterated in order to have it fully uncensored? I'm using ollama
Hmm, which uncensored MoEs are currently the best for long form rp and storytelling? I'd like to update my backup folder with models I can run on my current poor ass PC while scraping by with inexpensive APIs until I save up for a more powerful local setup. Which of them will be the happiest to give me huge paragraphs of details?
Comment removed by moderator
Should check the model from here. https://huggingface.co/AiAsistent/models#repos
I've been enjoying a slightly slimmer glm-4.7-flash https://huggingface.co/MuXodious/GLM-4.7-Flash-REAP-23B-A3B-absolute-heresy-GGUF
mxfp4 fits great in 16gb vram
Remember having free space on the drive? Remember that feel?
Thanks for this thread, gonna be useful for my upcoming rig.
Could you please post one thread later for Finetunes? Thanks.
well as you can see reddit people prefer just benchmarks instead
Gemma series are still the best porno writers.
Difficult to take abliterated models seriously when abliteration also retards the model.
Greetings, any reason we picked Heretic over Amoral on Gemma 3? Are there any significant changes? Beat regards
I prefer Llama 3.3 70B Heretic over all of those listed. From the listed I would go with Gemma3 27B but there I liked more the original model compared to abliterated/heretic (though of course it is not uncensored).
Commenting for further learning how to use these links.
Do any of these work with ClawdBot (openbot) for tools and 16k context. I’m new to all this and am not sure how to find them on hugging face
None of my stuff is featured, sadge.
Hmm so nothing much new lately still as of this comment?