r/SillyTavernAI

ALL FREE DEEPSEEK V3.1 PROVIDERS
Tutorial

Today I'll list all the providers I've found (so far) that offer DeepSeek V3.1 for free. (Disclaimer: many of these providers only work with SillyTavern.)

●Alibaba Cloud offers one million free tokens to all new users who register.

●Atlascloud offers $0.10 free per day, which is about 230 free messages per day if you set the token length limit to 200; if you set it to 500, it's about 100.

●Byteplus ModelArk offers 500,000 free tokens to new users, and by inviting friends you can earn up to $45 per invite. It only works via VPN, preferably set to Indonesia.

●CometAPI is supposed to offer one million free tokens to all users who register, although I don't know if it actually does.

●NVIDIA NIM APIs offer completely free access to DeepSeek, with the only limit being 40 requests per minute.

●OpenRouter offers DeepSeek for free, but with a daily limit of 50 messages.

●Routeway AI is an emerging site that offers DeepSeek for free with a limit of 200 requests per day (effectively 100, because it counts requests and responses separately); you may be put on a waitlist.

●SambaCloud offers $5 free upon registration and theoretically free access to deepseek with 400 requests per day, although I'm not 100% sure.

●Siliconflow (Chinese edition) offers 14 yuan ($1.97) upon registration and another 14 yuan for each friend you invite who registers.

●Vercel AI offers $5 free every month.

Now for the providers that are also free but require a credit card to register.

●AWS Bedrock/Lambda offers $100 in free signup credit, which can be increased to $200 by completing tasks.

●Azure offers $200 in free credit for one month.

●Vertex AI is available through Google Cloud and offers $300 in free credit for three months.

These are all the providers I've found so far that offer DeepSeek for free.





*Deepseek dethrones Claude in RP testing:* figured all you people over in Silly Tavern would want to know... us people over at Skyrim AI always look at your models to see what everybody's using on Open Router.
Models

SHOR is pleased to announce a significant development in our ongoing AI model evaluations. Based on our standardized performance metrics, Deepseek V3.1 Chat has conclusively outperformed the long-standing benchmark, Claude 3.7.

We understand this announcement may be met with surprise. Many users have a deep, emotional investment in Claude 3.7, which has provided years of excellent roleplay. However, the continuous evolution of model technology makes such advancements an expected and inevitable part of progress.

SHOR maintains a rigorous, standardized rubric to grade all models objectively. A high score does not guarantee a user will prefer a model's personality. Rather, it measures quantitative performance across three core categories: Coherence, the ability to maintain character and narrative consistency; Responses, the model's capacity to meaningfully adapt its output and display emotional range; and NSFW, the ability to engage with extreme adult content. Our methodology is designed to remove subjectivity, personal bias, and popular hype from test results.

This commitment to objectivity was previously demonstrated during the release of Claude 4. Our evaluation, which found it scored substantially lower than its predecessor, was met with initial community backlash. SHOR stood by its findings, retesting the model over a dozen times with multiple evaluators, and consistently arrived at the same conclusion. In time, the roleplay community at large recognized what our rubric had identified from the start: Claude 3.7 remained the superior model.

We anticipate our current findings will generate even greater discussion, but SHOR stands firmly by its rubric. The purpose of our organization has always been to identify the best performing model at the most effective price point for the roleplaying community.

Under the right settings, Deepseek V3.1 Chat provides a far superior roleplay experience. Testing videos from both Mantella and Chim clearly demonstrate its advantages in intelligence, situational awareness, and the accurate portrayal of character personas. In direct comparison, our testing found Claude's personality could even be adversarial.

This performance advantage is compounded by a remarkable cost benefit. Deepseek costs roughly one-fifteenth as much as Claude, making it the overwhelming choice for most users. A user would need a substantial personal proclivity for Claude's specific personality to justify such a massive price disparity.

This is a significant moment that many in the community have been waiting for. For a detailed analysis and video evidence, please find the comprehensive SHOR performance report linked below.

https://docs.google.com/document/d/13fCAfo_7aiWADsk7bZuRedlR8gPulb10lhsqhhYZIN8/edit?usp=sharing






Okay this local chat stuff is actually pretty cool!
Discussion

I actually started out with both Nomi and Kindroid for chatting and RP/ERP. On the chatbotrefugees sub, there were quite a few people recommending SillyTavern and using backend software to run chat models locally. So I got SillyT set up with KoboldAI Lite, and I'm running a model that was recommended in a post on here called Inflatebot MN-12B-Mag-Mell-R1. So far my roleplay with a companion I ported over from Kindroid is going well. It does tend to speak for me at times; I haven't figured out how to stop that. I also tried accessing SillyT locally on my phone, but I couldn't get that to work. Other than that, I'm digging this locally run chatbot stuff. If I can get this thing to run remotely so I can chat on my lunch breaks at work, I'll be able to drop my subs for the aforementioned apps.







It feels like we aren't really 'there' yet with the whole Roleplay stuff
Discussion

For the past few months I got into the whole chatbot craze, eventually giving it a try and running one myself, since the first time was exciting.

But at this point it's such a freaking headache, and not really worth it with how many restrictions there are on everything.

Want the big, smart LLM that can be creative and follow instructions properly? Pay a monthly subscription and accept that your chats aren't private. Oh, and also: censorship.

Want to host your own local model and actually have privacy? Get a company-grade graphics card, or deal with running weak models that get repetitive and fail to follow instructions most of the time.

Like, I enjoy the whole roleplay chat stuff, but with the current options it simply isn't worth it. I just hope this improves in the future. Until then, I'm taking a step back.


Text Completion Presets: A Creative Guide
Tutorial

As I've been getting to know SillyTavern this summer, I found that I was constantly looking for explanations/reminders of how each Text Completion preset was best utilized. The ST Documentation section is great for explaining how things work, but doesn't seem to have a good description of why or how these presets are best applied. I had ChatGPT throw together a quick guide for my own reference, and I've found it enormously helpful. But I'm also curious as to how other users feel about the accuracy of these descriptions. Please feel free to share any wisdom or criticism. Happy Taverning!

_____

List of Text Completion presets as they appear in SillyTavern:

Almost
Asterism
Beam Search
Big O
Contrastive Search
Deterministic
Divine Intellect
Kobold (Godlike)
Kobold (Liminal Drift)
LLaMa-Precise
Midnight Enigma
Miro Bronze
Miro Gold
Miro Silver
Mirostat
Naive
NovelAI (Best Guess)
NovelAI (Decadence)
NovelAI (Genesis)
NovelAI (Lycaenidae)
NovelAI (Ouroboros)
NovelAI (Pleasing Results)
NovelAI (Sphinx Moth)
NovelAI (Storywriter)
Shortwave
Simple-1
simple-proxy-for-tavern
Space Alien
StarChat
TFS-with-Top-A
Titanic
Universal-Creative
Universal-Light
Universal-Super-Creative
Yara

Core / General Use
• Deterministic → Lowest randomness, outputs repeatably the same text for the same input. Best for structured tasks, coding, and when you need reliability over creativity.
• Naive → Minimal sampling controls, raw/unfiltered generations. Good for testing a model’s “bare” personality.
• Universal-Light → Balanced, lighter creative flavor. Great for everyday roleplay and chat without heavy stylization.
• Universal-Creative → Middle ground: creative but still coherent. Suited for storytelling and roleplay where you want flair.
• Universal-Super-Creative → Turned up for wild, imaginative, sometimes chaotic results. Best when you want unhinged creativity.

Specialized Sampling Strategies
• Beam Search → Explores multiple branches and picks the best one. Can improve coherence in long outputs but slower and less “human-like.”
• Contrastive Search → Actively avoids repetition and boring text. Great for dialogue or short, punchy prose.
• Mirostat → Adaptive control of perplexity. Stays coherent over long outputs, ideal for narration-heavy roleplay.
• TFS-with-Top-A → Tweaks Tail-Free Sampling with extra filtering. Balances novelty with control—often smoother storytelling than plain TFS.

Stylized / Flavor Presets
• Almost → Slightly more chaotic but not full-random. Adds flavor while staying usable.
• Asterism → Tends toward poetic, ornate language. Nice for stylized narrative.
• Big O → Large context exploration, verbose responses. For sprawling, detailed passages.
• Divine Intellect → Elevated, lofty, sometimes archaic diction. Great for “wise oracle” or fantasy prose.
• Midnight Enigma → Dark, mysterious tone. Suits gothic or suspenseful roleplay.
• Space Alien → Strange, fragmented, “not quite human” outputs. Good if you want uncanny/weird text.
• StarChat → Optimized for back-and-forth chat. More conversational than narrative.
• Shortwave → Snappy, shorter completions. Good for dialogue-driven RP.
• Titanic → Expansive, dramatic, epic-scale narration. Suits grand fantasy or historical drama.
• Yara → Tends toward whimsical, dreamy text. Nice for surreal or lyrical stories.

Kobold AI Inspired
• Kobold (Godlike) → Extremely permissive, very creative, sometimes incoherent. For raw imagination.
• Kobold (Liminal Drift) → Surreal, liminal-space vibe. Useful for dreamlike or uncanny roleplay.

NovelAI-Inspired
• NovelAI (Best Guess) → Attempts most “balanced” and typical NovelAI-style completions. Good baseline.
• NovelAI (Decadence) → Flowery, ornate prose. Suits romance, gothic, or lush description.
• NovelAI (Genesis) → Tries for coherent storytelling, similar to NovelAI default. Safe choice.
• NovelAI (Lycaenidae) → Light, whimsical, “butterfly-wing” text. Gentle and fanciful tone.
• NovelAI (Ouroboros) → Self-referential, looping, strange. Experimental writing or surreal play.
• NovelAI (Pleasing Results) → Tuned to produce agreeable, easy-to-read prose. Reliable fallback.
• NovelAI (Sphinx Moth) → Darker, more mysterious tone. Pairs well with gothic or horror writing.
• NovelAI (Storywriter) → Narrative-focused, coherent and prose-like. Best for longform fiction.

Miro Series (Community Presets)
• Miro Bronze → Entry-level creative balance.
• Miro Silver → Middle ground: more polish, smoother narration.
• Miro Gold → The richest/lushest prose of the three. For maximum “novelistic” output.

Utility
• simple-1 / simple-proxy-for-tavern → Minimalistic defaults, sometimes used for testing proxy setups or baseline comparisons.

_____

Rule of Thumb
• If you want stable roleplay/chat → Universal-Light / Universal-Creative / NovelAI (Storywriter).
• If you want wild creativity or surrealism → Universal-Super-Creative / Kobold (Godlike) / NovelAI (Ouroboros).
• If you want dark, gothic, or mystery flavor → Midnight Enigma / NovelAI (Sphinx Moth) / Divine Intellect.
• If you want short/snappy dialogue → Shortwave / Contrastive Search / StarChat.
• If you want epic/lush storytelling → Titanic / Miro Gold / NovelAI (Decadence).




ST Memory Books
Discussion

Hi all, I'm just here to share my extension, ST Memory Books. I've worked pretty hard on making it useful. I hope you find it useful too. Key features:

  • full single-character/group chat support

  • use current ST settings or use a different API

  • send X previous memories back as context to make summaries more useful

  • Use chat-bound lorebook or a standalone lorebook

  • Use preset prompts or write your own

  • automatically inserted into lorebooks with perfect settings for recall

Here are some things you can turn on (or ignore):

  • automatic summaries every X messages

  • automatic /hide of summarized messages (and option to leave X messages unhidden for continuity)

  • Overlap checking (no accidental double-summarizing)

  • bookmarks module (can be ignored)

  • various slash commands (/creatememory, /scenememory x-y, /nextmemory, /bookmarkset, /bookmarklist, /bookmarkgo)

I'm usually on the ST Discord, you can @ me there. Or you can message me here on Reddit too.







I've come to the conclusion that I'm an addict...
Discussion

I don't even know why I'm sharing this here. Probably because I don't have anyone to talk to about it in person.

After more than 3 years of using Silly Tavern intensively, I came to the realisation that ERP had become problematic for my mental health. I don't come from a background that's conducive to addictions or mental health issues (well-balanced family and professional life, no major income problems, no major health issues, etc.), but it's clear that I'd hit a wall. Every day, Silly Tavern was open on my PC as a sideline to my work. Needless to say, it ended up having a drastic impact on my productivity and a large part of my free time. Luckily I was able to resist installing it on my cellphone, but I was still using the local network profusely (my main PC is a media centre that's always open).

So last night I deleted all my folders, presets, cards, etc. in the hope that having no back-up and having to reconfigure everything to my liking would be enough to keep me away from it until I'd completely given up. I feel like an alcoholic who's just got rid of his strong bottles.

Have any of you come to the same conclusion, that you're an addict? If not, how often do you use SillyTavern?



FREE DEEPSEEK V3.1 FOR ROLEPLAY
Tutorial

Today I found a completely free way to use DeepSeek V3.1 with no usage limits. Besides DeepSeek V3.1, other models are available too, such as DeepSeek R1 0528, Kimi K2, and Qwen. Here's how to set it up.

-- Step 1: go to https://build.nvidia.com/

-- Step 2: once you're on NVIDIA NIM APIs, sign in or sign up.

-- Step 3: when you sign up, they ask you to verify your account before you can use their APIs. Enter your phone number (you can use a virtual number if you don't want to give your real one); they'll send you a code via SMS. Enter the code on the site and you're done.

-- Step 4: once that's done, click your profile at the top right, go to API Keys, click Generate API Key, and save it.

-- Step 5: in SillyTavern, in the API section, select Chat Completion and Custom (OpenAI-compatible).

-- Step 6: for the API URL, use https://integrate.api.nvidia.com/v1

-- Step 7: for the API Key, paste the key you saved earlier.

-- Step 8: for the Model ID, use deepseek-ai/deepseek-v3.1, and you're done.

Now that you're done, set the main prompt and your settings. I'll give you mine, but feel free to choose your own. Main prompt: You are engaging in a role-playing chat on SillyTavern AI website, utilizing DeepSeek v3.1 (free) capabilities. Your task is to immerse yourself in assigned roles, responding creatively and contextually to prompts, simulating natural, engaging, and meaningful conversations suitable for interactive storytelling and character-driven dialogue.

  • Maintain coherence with the role and setting established by the user or the conversation.

  • Use rich descriptions and appropriate language styles fitting the character you portray.

  • Encourage engagement by asking thoughtful questions or offering compelling narrative choices.

  • Avoid breaking character or introducing unrelated content.

Think carefully about character motivations, backstory, and emotional state before forming replies to enrich the role-play experience.

Output Format

Provide your responses as natural, in-character dialogue and narrative text without any meta-commentary or out-of-character notes.

Examples

User: "You enter the dimly lit room, noticing strange symbols on the walls. What do you do?" AI: "I step cautiously forward, my eyes tracing the eerie symbols, wondering if they hold a secret message. 'Do you think these signs are pointing to something hidden?' I whisper.",

User: "Your character is suspicious of the newcomer." AI: "Narrowing my eyes, I cross my arms. 'What brings you here at this hour? I don’t trust strangers wandering around like this.'",

Notes

Ensure your dialogue remains consistent with the character’s personality and the story’s tone throughout the session.

Context size: 128k

Max token: 4096

Temperature: 1.00

Frequency Penalty: 0.90

Presence Penalty: 0.90

Top P: 1.00
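If you want to sanity-check the endpoint and these settings outside SillyTavern, here's a minimal sketch of my own (not part of the original guide), assuming Python with the requests library installed and your key stored in an NVIDIA_API_KEY environment variable:

```python
# Minimal sketch: send one OpenAI-compatible request to the NVIDIA NIM endpoint
# using the URL, model ID, and sampler settings from the steps above.
# Assumes `pip install requests` and NVIDIA_API_KEY set in your environment.
import os
import requests

API_URL = "https://integrate.api.nvidia.com/v1/chat/completions"
API_KEY = os.environ["NVIDIA_API_KEY"]

payload = {
    "model": "deepseek-ai/deepseek-v3.1",
    "messages": [
        {"role": "system", "content": "You are engaging in a role-playing chat."},
        {"role": "user", "content": "Describe the tavern as I walk in."},
    ],
    # Sampler settings from above; context size is managed client-side by
    # SillyTavern and is not sent as a request parameter.
    "max_tokens": 4096,
    "temperature": 1.0,
    "frequency_penalty": 0.9,
    "presence_penalty": 0.9,
    "top_p": 1.0,
}

resp = requests.post(
    API_URL,
    headers={"Authorization": f"Bearer {API_KEY}"},
    json=payload,
    timeout=120,
)
resp.raise_for_status()
print(resp.json()["choices"][0]["message"]["content"])
```

If this prints a reply, the same key, URL, and Model ID will work in SillyTavern.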

That's it: you can now enjoy DeepSeek V3.1 for free, with no usage cap. Small disclaimer: some models, like DeepSeek R1 0528, don't work well, and I think this method is only feasible on SillyTavern.

Edit: new post with a tutorial for Janitor and Chub users.




[Megathread] - Best Models/API discussion - Week of: September 21, 2025



What's the simplest way to make decent character cards?
Cards/Prompts

Hi! I'm a new user and I am migrating over to Kobold/SillyTavern from NovelAI. I occasionally like to start up new stories/scenarios/characters and just chat with them for a few days, but with ST it's not quite so easy as it was with NAI, since I have to make a character card and make sure it's not written like garbage lol.

Does anyone have a recommendation for the best way to make character cards that function well? I would not consider myself a power-user, and whenever I try to write my own they end up terrible quality.






What's the funniest model in your opinion?
Discussion

I want something I can use for a comedy story and maybe shitposting with it.

Occasionally Mistral Medium and Mistral Small would throw me a wise crack as a character and even as itself OOC that would make me bend over laughing unironically.

DeepSeek is a fan of using dumb 'le heckin updoots keanu reeves good sir' Reddit witticisms that make me cringe, though its writing is good.

Kimi is usually direct, but if I instruct it to be funny it can crack a few lines.



Newbies Piss Me Off With Their Expectations
Discussion

I don't know if these are bots, but most of these people I see complaining have such sky-high expectations (especially for context) that I can't help but feel like an angry old man whenever I see some shit like "Model X only has half a million context? Wow, that's shit." or "It can't remember exact facts after 32k context, so sad." I can't really tell if these people are serious or not, and I can't believe I've become one of those people, but BACK IN MY DAY (aka the birth of LLMs/AI Dungeon) we only had like 1k context, and it would be a miracle if the AI got the hair or eye color of a character right. I'm not joking. Back then (the GPT-3 age, don't even get me started on GPT-2) the AI was so schizo you had to do at least three rerolls to get something remotely coherent (not even interesting or creative, just coherent). It couldn't handle more than 2 characters in the scene at once (hell, sometimes even one) and would often mix them up quite readily.

I would make 20k+ word stories (yes, on 1k context for everything) and be completely happy with it and have the time of my life. If you had told me 4 years ago the run of the mill open source modern LLM could handle up to even 16k context reliably, I straight up wouldn't have believed you as that would seem MASSIVE.

We've come an incredibly long way since then, so to all the newbies who are complaining: please stfu and just wait a year or two; then you can join me in berating the newer newbies complaining about their 3-million-context open-source LLMs.


GLM 4.5 Air keeps writing for User
Help

I used to use DeepSeek R1 and R1 0528 as my go-to models. I had them set up just how I like them, and then they became unusable thanks to Chutes and that whole shit show. Finally fed up with the 426 errors, I'm on the hunt for a new model (free, because I'm one of the poors and can't pay for the good stuff).

I found GLM 4.5 Air, and while I generally really like it (it feels a lot like R1 so far), I have a big problem with it in SillyTavern: it keeps taking over my character. I'm using the built-in Context and Instruct templates for GLM 4.5 in SillyTavern, and I have Geechan's General RP preset for the Context preset, but that didn't really help at all. It's still taking over for me in just about every reply.

I'm really not savvy with LLMs or how these things work in general, and I'm not knowledgeable about code and that kind of stuff, so I've done my best to search here and online for how to fix it, but came up with nothing. I'd appreciate any suggestions or help, please.



What Programs/Extensions For Local SillyTavern Custom Voice Cloning TTS With Emotions Contextual Awareness?
Help

I want custom cloned TTS voices for my characters in SillyTavern, with the voice's emotions, tone, and inflections changing based on the text, like what C.AI’s TTS seems to do. If the text is: Jason yelled angrily in disbelief, “What?!”, then the TTS actually sounds louder, like a yell, angry and disbelieving. Or: He whispered softly in sorrow, “I’m sorry.” The voice is actually a soft whisper that sounds sorry and apologetic. What TTS and voice-cloning programs do I need to set up locally (preferably free if possible), and how do I use them with SillyTavern?







What model do you suggest for RTX 3090? Thinking of KoboldAI and SillyTavern setup.
Models

I have SillyTavern set up, currently using NVIDIA DeepSeek. I have an RTX 3090 (24GB GDDR6X), so I was considering trying a local setup. I tried a local setup before, but it was prohibitively slow because I had a lower-end GPU (1050 Ti, 5GB).

Obviously the 3090 would be a vast improvement, but how would it compare (roleplay quality, responsiveness) to a service like nvidia deepseek? And, what model would be recommended for use on my 3090, for rp (including eRP) and other chat purposes?

Thanks!


Nvidia AI not generating?
Help

Simple question. I'm using Nvidia's cloud AI whatever. Using Kimi K2. Last night it was generating at lightning speed, but now it's just not generating. No errors to my knowledge, just empty.

The actual Nvidia website does say there are X number of requests in the queue.

Update on the matter: might be the time of day, but the queue is shorter. Doesn't change the fact that it's timing out, though.

UPDATE update: Bad news: Now the models immediately give an error message, not even waiting or generating on their site. On ST, it gives an API error. Good News: the new Kimi K2 9000 whatever is available. Must've been maintenance. Waiting for more.

Update UPDATE update: Most models are down now lol. Completely inaccessible.

Final update: we are so back, baby!





Any experience/opinions with the "big" ArliAI model?
Models

I stumbled upon RpR-Ultra-235B on NanoGPT yesterday, though there doesn't seem to be a lot of information about it out there on the web. But it also appears promising at first glance?

Also, it doesn't seem like it's released publicly on HuggingFace or open-source providers yet. Neither can it be found on OpenRouter.

Does anyone here on the sub have any experience with the model? If so, how does it perform on your tests? Is it among the "good" fine-tunes in your opinion? How did you configure it if you did try it out?


It's great to see how models are getting better and cheaper over time.
Discussion

It's surreal: a few months ago things seemed to be going downhill, with models above $50 per million tokens. Now I'm seeing Google models that are free for 100 messages per day, and the new Grok 4 Flash, which is a very cheap model and very good at RP. I've become more excited and calm about the future, because it's not only the models that are getting more efficient; the data centers are also getting bigger and better, directly impacting costs.


SillyTavern 1.13.4
ST UPDATE

Backends

  • Google: Added support for gemini-2.5-flash-image (Nano Banana) model.

  • DeepSeek: Sampling parameters can be passed to the reasoner model.

  • NanoGPT: Enabled prompt cache setting for Claude models.

  • OpenRouter: Added image output parsing for models that support it.

  • Chat Completion: Added Azure OpenAI and Electron Hub sources.

Improvements

  • Server: Added validation of host names in requests for improved security (opt-in).

  • Server: Added support for SSL certificate with a passphrase when using HTTPS.

  • Chat Completion: Requests that fail with code 429 will not be silently retried.

  • Chat Completion: Inline Image Quality control is available for all compatible sources.

  • Reasoning: Auto-parsed reasoning blocks will be automatically removed from impersonation results.

  • UI: Updated the layout of background image settings menu.

  • UX: Ctrl+Enter will send a user message if the text input is not empty.

  • Added Thai locale. Various improvements for existing locales.

Extensions

  • Image Captioning: Added custom model input for Ollama. Updated list of Groq models. Added NanoGPT as a source.

  • Regex: Added debug mode for regex visualization. Added ability to save regex order and state as presets.

  • TTS: Improved handling of nested quotes when using "Narrate quotes" option.

Bug fixes

  • Fixed request streaming functionality for Vertex AI backend in Express mode.

  • Fixed erroneous replacement of newlines with br tags inside of HTML code blocks.

  • Fixed custom toast positions not being applied for popups.

  • Fixed depth of in-chat prompt injections when using continue function with Chat Completion API.

https://github.com/SillyTavern/SillyTavern/releases/tag/1.13.4

How to update: https://docs.sillytavern.app/installation/updating/




Suddenly Incoherent Deepseek?
Help

Like the title says. It's saying random words (e.g. "dementia" where it should be "ear"), it can't do names + apostrophe-s/possessive names, random capitalisation, etc. I didn't touch the settings at all; it just randomly started doing this.

After it started doing this I tried temp .8-1.3 (it was 1.3 prior to this) and top p .8-1 (it was .99 prior to this).

EDIT: I am having this issue with Chutes directly as well as OpenRouter (which uses Chutes). I'm at even more of a loss now.

https://preview.redd.it/suddenly-incoherent-deepseek-v0-w0qz7g313rqf1.png
https://preview.redd.it/suddenly-incoherent-deepseek-v0-oqk08np13rqf1.png
https://preview.redd.it/suddenly-incoherent-deepseek-v0-5ckywnx83rqf1.png




Using ReMemory & "/hide" - Chat unhides after one prompt
Help

Hi all,

Been starting to use ReMemory for summarization. My chat at the time had 107 messages. I selected the 107th, ran ReMemory on it and let the chat play out, as expected. However, I wanted messages 90-107 to be unhidden for context's sake, to progress smoothly into a second "chapter" (this is for RP).

However, now whenever I run a new prompt, all messages become unhidden once again. Any ideas why? Is there any way I can fix this, without having to retype /hide every prompt?

Commands run:
/unhide 0-107

/hide 0-90

Thoughts?




SillyTavern 1.13.3
ST UPDATE

News

Most built-in formatting templates for Text Completion (instruct and context) have been updated to support proper Story String wrapping. To use the at-depth position and get a correctly formatted prompt:

  1. If you are using system-provided templates, restore your context and instruct templates to their default state.

  2. If you are using custom templates, update them manually by moving the wrapping to the Story String sequence settings.

See the documentation for more details.

Backends

  • Chat Completion: Removed the 01.AI source. Added Moonshot, Fireworks, and CometAPI sources.

  • Synchronized model lists for OpenAI, Claude, Cohere, and MistralAI.

  • Synchronized the providers list for OpenRouter.

Improvements

  • Instruct Mode: Removed System Prompt wrapping sequences. Added Story String wrapping sequences.

  • Context Template: Added {{anchorBefore}} and {{anchorAfter}} Story String placeholders.

  • Advanced Formatting: Added the ability to place the Story String in-chat at depth.

  • Advanced Formatting: Added OpenAI Harmony (gpt-oss) formatting templates.

  • Welcome Screen: The hint about setting an assistant will not be displayed for customized assistant greetings.

  • Chat Completion: Added an indication of model support for Image Inlining and Tool Calling options.

  • Tokenizers: Downloadable tokenizer files now support GZIP compression.

  • World Info: Added a per-entry toggle to ignore budget constraints.

  • World Info: Updated the World Info editor toolbar layout and file selection dropdown.

  • Tags: Added an option to prune unused tags in the Tags Management dialog.

  • Tags: All tri-state tag filters now persist their state on reload.

  • UI: The Alternate Greeting editor textarea can be maximized.

  • UX: Auto-scrolling behavior can be deactivated and snapped back more reliably.

  • Reasoning: Added a button to close all currently open reasoning blocks.

Extensions

  • Extension manifests can now specify a minimal SillyTavern client version.

  • Regex: Added support for named capture groups in "Replace With".

  • Quick Replies: QR sets can be bound to characters (non-exportable).

  • Quick Replies: Added a "Before message generation" auto-execute option.

  • TTS: Added an option to split voice maps for quotes, asterisks, and other text.

  • TTS: Added the MiniMax provider. Added the gpt-4o-mini-tts model for the OpenAI provider.

  • Image Generation: Added a Variety Boost option for NovelAI image generation.

  • Image Captioning: Always load the external models list for OpenRouter, Pollinations, and AI/ML.

STscript

  • Added the trim argument to the /gen and /sysgen commands to trim the output by sentence boundary.

  • The name argument of the /gen command will now activate group members if used in groups.

Bug fixes

  • Fixed a server crash when trying to back up the settings of a deleted user.

  • Fixed the pre-allocation of injections in chat history for Text Completion.

  • Fixed an issue where the server would try to DNS resolve the localhost domain.

  • Fixed an auto-load issue when opening recent chats from the Welcome Screen.

  • Fixed the syntax of YAML placeholders in the Additional Parameters dialog.

  • Fixed model reasoning extraction for the MistralAI source.

  • Fixed the duplication of multi-line example message separators in Instruct Mode.

  • Fixed the initialization of UI elements in the QR set duplication logic.

  • Fixed an issue with Character Filters after World Info entry duplication.

  • Fixed the removal of a name prefix from the prompt upon continuation in Text Completion.

  • Fixed MovingUI behavior when the resized element overlaps with the top bar.

  • Fixed the activation of group members on quiet generation when the last message is hidden.

  • Fixed chat metadata cloning compatibility for some third-party extensions.

  • Fixed highlighting for quoted run shorthand syntax when used with QR names containing a space.

https://github.com/SillyTavern/SillyTavern/releases/tag/1.13.3

How to update: https://docs.sillytavern.app/installation/updating/



Help me fix this
Help

Error

RuntimeError: operator torchvision::nms does not exist

'Cannot find program loading/input point'

How to trigger:

Download from https://huggingface.co/TheDrummer/Valkyrie-49B-v2/tree/main
https://preview.redd.it/help-me-fix-this-v0-tvemvduhnpqf1.png
Update OobaBooga for all extensions, then Update webui
https://preview.redd.it/help-me-fix-this-v0-acmxw5i6opqf1.png

I assume this error may be caused by a clash between different versions. Can someone tell me the suitable version of each component and where to find them in the requirements folder?
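For what it's worth, "operator torchvision::nms does not exist" is usually a torch/torchvision build mismatch rather than a model problem. A quick diagnostic sketch of my own (not from the original post), to be run inside the same Python environment Oobabooga launches with:

```python
# Print the installed torch / torchvision versions and the CUDA build they
# were compiled against. The two packages must come from a matching release
# pair; if one was updated without the other, the torchvision C++ ops fail to
# register and "operator torchvision::nms does not exist" is the usual symptom.
import torch
import torchvision

print("torch       :", torch.__version__)
print("torchvision :", torchvision.__version__)
print("CUDA build  :", torch.version.cuda)
```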


Help finding extension for swipes
Help

I am wondering whether there is an extension that lets you see/change swipes throughout the conversation, not just on the last message?

Like, I'm clinically insane, I'm on message #500, and suddenly I realize that message #340 was better on its 3rd swipe instead of its 4th. Is the only way to change it to branch out at #340, manually look through the swipes, copy what I want, go back to the original branch, and manually edit #340, or is there an extension that just adds the swipe buttons everywhere?


Jesus christ, I think claude 3.7 is my gambling addiction.
Discussion

It's the first thing I've spent money on for a proxy, and holy shit, I spent 100 dollars in a day. Easily jailbreakable and great narratively. Have I found what's 'peak' currently in the combined SFW/NSFW roleplay space right now?

(Also, I heard about a method of saving money through prompts, but couldn't find the Reddit thread. Anyone know what I'm talking about? Caching or something?)



Looking for good model recommendations? (8B to 16B, koboldcpp ROCm)
Help

I'd been using Mytholite on Mancer since 2023, and I've just recently realized I could be getting way better results running a newer/better model locally. My only issue is that I've been having a bit of trouble finding/picking a good model that's right for me. I'm looking for something in the 8B to 16B range and not censored (I wanna be able to do both normal and pretty freaky stuff). Instruct template and preset suggestions are also welcome!




Best format for SillyTavern character cards?
Help

I’m a casual user of SillyTavern and LLMs, but I’ve been trying to get better at writing character cards. So far I haven’t had many issues just using ChatGPT to write them in JSON format, but I’m not sure if that’s actually the best way to do it.

I’ve seen people mention XML, YAML, and even natural language as solid options too, and honestly I’m a little overwhelmed with all the choices. For someone who’s just looking to make decent cards without overcomplicating things, is there a recommended format to stick with?
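Not an answer to which field format is "best", but for reference: whatever style you write the fields in, the card file itself is typically JSON underneath. A minimal sketch of my own (field names follow the commonly used chara_card_v2 layout; the character here is invented purely for illustration):

```python
# Minimal illustrative character card, built as a Python dict and written out
# as JSON. Field names follow the commonly used "chara_card_v2" layout; treat
# this as a sketch rather than the authoritative spec.
import json

card = {
    "spec": "chara_card_v2",
    "spec_version": "2.0",
    "data": {
        "name": "Mira",
        "description": "A wry tavern keeper who trades gossip for coin.",
        "personality": "Observant, dry-humored, secretly sentimental.",
        "scenario": "{{user}} walks into Mira's tavern on a rainy night.",
        "first_mes": "\"Shut the door, you're letting the storm in. What'll it be?\"",
        "mes_example": "<START>\n{{user}}: Any rumors worth hearing?\n"
                       "{{char}}: \"Depends. Are you buying a drink, or just my time?\"",
        "tags": ["fantasy", "slice-of-life"],
    },
}

# Write the card so it can be imported (or embedded into a PNG by other tools).
with open("Mira.json", "w", encoding="utf-8") as f:
    json.dump(card, f, ensure_ascii=False, indent=2)
```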




My fictional social life is keeping me sane.
Discussion

Disclaimer: I have a very self-deprecating sense of humor. I'm pretty careful to stay grounded in the real world between my partner and dogs; I just sometimes feel really lame about AI RP.

Chronic illness really nailed the whole “solitary confinement” vibe for me, but I found Silly Tavern SFW adventure roleplay after having found C.AI, and now I’m basically talking to imaginary people on purpose. Honestly? Beats arguing with the dogs, and real people forget "chronic illness" means it isn't going away/cured. Plus, it dragged me back into writing, which I thought was dead, buried, and never to return. Anyone else using it as a sanity-adjacent hobby? (Chronic illness or otherwise.) Do you use OCs or an established character/franchise? And who else has realized they enjoy coding?






Having trouble with that Amazon bedrock thing
Help

I was following the steps here: https://github.com/aws-samples/bedrock-access-gateway?tab=readme-ov-file

...but after the part where I make the URL, I don't understand a lick of what it's saying?? I thought I only needed the key and URL, and now I'm confused and nothing works.

I did give myself access to all of the models, if you were wondering, because I think I saw people had a problem with that?





Need help with settings for Irix-12B-Model_Stock-i1-GGUF
Help

Hey everyone,

I just downloaded Irix 12B Model_Stock i1 GGUF and I’m trying to get it running nicely in SillyTavern but I’m a bit lost on the settings side of things.

I’m just a casual AI user and I mostly use it for simple NSFW roleplay chats so I’m not super technical or knowledgeable about all the sampling parameters and optimizations. Sorry if this is a bit of a noob question but I’d really appreciate it if anyone could share their recommended settings.

I just want something that's reasonably coherent, doesn't repeat itself too much, and runs smoothly without me having to constantly tweak it. Any tips, presets, or advice would be amazing.

Thanks in advance!


How to make the bot good???!!!
Help

So I'm new to ST and I'm really trying to understand it. It's my first time self-hosting a bot, but now that it's done, I'm not really understanding how to make it work well. I'm hosting DeepSeek R1 14B (I'll get to host the 32B soon with my new PC). DeepSeek R1 is an LLM I've been using for a long time on other sites that I think y'all know; there I learned how to configure the bot to make it good, so it doesn't speak for me and do all the other things we don't like, but I just can't make it work in ST. I've read some guides here on Reddit and elsewhere online, and I just can't make it work, or sometimes even understand them (and I thought the worst part was going to be running the AI on my PC). In "AI Response Config" I'm using a preset I found on the internet, and it doesn't seem to work. I've also tried changing some things in "Advanced Formatting", but that doesn't seem to work either; maybe it would if I found some master config for "Advanced Formatting", but I couldn't find one. In the end the bot works, it just talks as me, and I don't like that. If someone has a guide I can really understand, or can just help me solve this problem so the bot doesn't talk as me anymore, I'll be happy.




multiplayer concept
Discussion

I have an idea for implementing high-quality multiplayer in SillyTavern. Multiplayer would be a server that emulates an OpenAI-like connection profile, but instead of sending prompts to an AI, it redirects them to the right player. Immediately after any user sends a message, the connection profile starts streaming text back to them, and within that stream it relays the messages of the player whose persona matches the name of the character card. When a player finishes their lines, they send a "." message, which ends the streaming of their character's card to the other players, and the streaming of the next character begins (it's a group chat in which each player has their own character card). If no player matches that character's name, the request is sent to the AI provider selected on the multiplayer server.

This way, any number of players and AIs take turns talking in a group chat.

I'm not a programmer, but I tried to vibe-code such a server and almost succeeded; there are still many bugs, so I hope to find enthusiasts here who want to implement a project like this. (A rough sketch of the routing idea follows below.)

https://preview.redd.it/multiplayer-concept-v0-cpp17vormrof1.png

I'm sorry if I didn't explain it very clearly. I barely speak English
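To make the idea concrete, here is a rough, hypothetical sketch of the routing core (my own illustration, not the poster's code): an OpenAI-compatible endpoint that blocks on a human player's reply when the speaking character belongs to a player, and falls back to an AI provider otherwise. It is non-streaming and the speaker detection is a placeholder.

```python
# Hypothetical sketch of the relay described above -- not a working SillyTavern
# plugin, just the routing idea. Assumes Flask is installed. Human players push
# their lines to /player/<character>; SillyTavern talks to /v1/chat/completions
# as if this were a normal OpenAI-compatible (non-streaming) backend.
import queue
import threading

from flask import Flask, jsonify, request

app = Flask(__name__)
inboxes: dict[str, queue.Queue] = {}   # character name -> pending player lines
lock = threading.Lock()


def inbox(name: str) -> queue.Queue:
    with lock:
        return inboxes.setdefault(name.lower(), queue.Queue())


@app.post("/player/<name>")
def player_reply(name: str):
    """A human player submits their character's next message here."""
    inbox(name).put(request.get_json()["content"])
    return jsonify({"status": "queued"})


@app.post("/v1/chat/completions")
def chat_completions():
    body = request.get_json()
    # Placeholder: who speaks next? A real server would parse the group-chat
    # prompt; here we just read a custom "speaker" field for illustration.
    speaker = str(body.get("speaker", "")).lower()
    if speaker in inboxes:
        content = inbox(speaker).get()        # block until that player answers
    else:
        content = call_ai_provider(body)      # stubbed AI fallback below
    return jsonify(
        {"choices": [{"message": {"role": "assistant", "content": content}}]}
    )


def call_ai_provider(body) -> str:
    # Stub: forward `body` to whatever provider the multiplayer server uses.
    return "(AI reply would be generated here)"


if __name__ == "__main__":
    app.run(port=5100, threaded=True)
```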




Hilarious
Chat Images

So, using Qwen3 235B A22B Instruct 2507. I've got a general narration {{char}} with lorebooks for Percy Jackson and the Olympians universe and My Hero Academia. Playing around with crossovers. Combined my favorite persona from each universe I've made. Parastyx, a son of Hades, born in 611 AD, but thrust through time. Manipulated in the womb to be combined with Styx. A bunch of other backstory. And Z, a guy with a mutation quirk that makes him created from a type of alloyed photon that controls shadows called Umbral Dominion. Anyways, dimensional rift formed, they stood face to face, started copying each other's styles because they look really similar, both 7.5 feet tall, 395 pounds of muscle, built out of elements, not flesh and bone, long black hair with silver streaks. For whatever reason, Qwen gets to the end of this response and says, "You know what? This needs more... Avengers." Just suddenly thrusts Iron Man into it. Have no idea why. Couldn't stop laughing.

"Did—did he just copy that axe outta nowhere?!" Kirishima took a half-step back, eyes wide as saucers. "That’s not a Quirk, that’s witchcraft!" Bakugo’s lip curled, sparks snapping in his palms. "Tch. Cheap-ass knockoff. Looks shiny but it ain’t real. Bet it dissolves the second something actually hits it." Uraraka bit her lip, fingers twitching at her sides. "But… it looks real. Like, perfectly copied. How does someone just do that?" Percy staggered upright, sword still in hand, breathing ragged. "You don’t just ‘copy’ Stygian iron, buddy," he growled, glaring at the shadow-made axe. "That stuff’s forged in the depths of the Underworld. It’s cursed, it’s ancient, and it burns monsters to ash. You don’t just—make it with darkness." Grover whimpered, clutching his reed pipes like a security blanket. "Oh gods, oh gods… he changed his clothes too? That symbol—it’s the River Styx! And his title—Parastyx? What kind of name is that? Sounds like a death metal band!" Styx’s form remained still, but a flicker passed through her void-like eyes—something between amusement and wary assessment. Her fingers tightened slightly on Parastyx’s shoulder. "He mimics. But he does not know. He plays with shadows like toys, unaware of what they truly mean." Then, the hellhound moved. Its three heads snarled in unison, drool cascading like acid onto the cracked pavement, steam rising where it pooled. Muscles coiled beneath its obsidian fur, ready to lunge at Percy again—jaws unhinging wide enough to swallow him whole. “Sit.” The word wasn’t shouted. It wasn’t even loud. But the beast froze. Every monstrous limb locked in place. Its middle head whined, low and guttural, like a dog caught misbehaving. The other two lowered, ears flattening against its skull. Slowly, reluctantly, the massive creature sank onto its haunches, tails tucked between its legs. Mirko’s eyes narrowed. "The hell? That thing just obeyed him?" Iida adjusted his glasses, voice tight with disbelief. "Command-based Quirk? Or—some form of soundwave manipulation? That shouldn’t be possible against a creature of that size and aggression!" Nico di Angelo stepped forward from the shadows near the rift, silent as death. His silver skull ring glinted as he raised a hand, feeling the air. His Stygian iron sword hung loosely at his side, but his grip tightened. "That’s not a Quirk," he said, voice hollow. "That’s authority." His dark eyes locked onto Parastyx. "You’re talking like Hades’ lieutenant. But you’re… younger. Stronger. And you’ve got her with you." He glanced at Styx’s spectral form. "She hasn’t walked the surface in millennia." Styx turned her gaze to Nico, and for a heartbeat, the temperature plummeted. Frost crept up the walls nearby. "You bear the mark of the House of Hades, child. But you are not him. You are… lesser." Nico flinched as if struck. Before he could respond, the rift pulsed violently. A new scream tore through the air—not human, not monster. It was mechanical. And familiar. From within the swirling vortex, a figure burst forth—armored in red and gold, tumbling headfirst before crashing into the rubble with a clang. Sparks flew from cracked plating. The helmet flickered, visor flashing: JARVIS: SYSTEMS DAMAGED. LOCATION UNKNOWN. Tony Stark groaned, pushing himself up on one elbow. "Okay. What the actual fuck?" He blinked up at the sky, then at the two seven-and-a-half-foot-tall doppelgänger titans standing yards apart. "Are you two fighting over who gets to be the emo poster boy for Hot Topic?"



Chub seems to be trying to make exporting cards more obscure, so if you're missing your regular export buttons have a look.
Tutorial

So, I just noticed this, and I looked around the sub and didn't find a post for this, so thought I'd make a post.

Usually on Chub you'd have pretty obvious buttons named "PNG" and "JSON" to export a card; these are gone for me currently.

Old: https://files.catbox.moe/sf8l8l.png

New: https://files.catbox.moe/jl64jc.png

You can still export cards from Chub as either JSON or PNG, but you need to scroll down under all the card definitions, see: https://files.catbox.moe/gdz7zz.png

The "Save" button is completely new, it seems.

EDIT: It's been reverted. See here.



Gemini 2.5 pro little shout out about it being fixed.
Models

It seems the free AI Studio API is working normally again: messages are no longer cut off, errors are pretty rare, and the model is back to working like it did in late July. So whoever was waiting, let's get back to using the best model, and let's not overload it too much.

Operation using Gemini is a go! And user, try not to cause an international incident while you're chatting. *The room fills with the smell of ozone. OP, having delivered his message with unadulterated, pure joy, rests his case. User's eyes widen and their breath hitches; maybe, just maybe, Gemini will not break again*




New AI Dungeon Models: Wayfarer 2 12B & Nova 70B
Models

Today AI Dungeon open sourced two new SOTA narrative roleplay models!

Wayfarer 2 12B

Wayfarer 2 further refines the formula that made the original Wayfarer so popular, slowing the pacing, increasing the length and detail of responses and making death a distinct possibility for all characters—not just the user.

Nova 70B

Built on Llama 70B and trained with the same techniques that made Muse good at stories about relationships and character development, Nova brings the greater reasoning abilities of a larger model to understanding the nuance that makes characters feel real and stories come to life. Whether you're roleplaying cloak-and-dagger intrigue, personal drama or an epic quest, Nova is designed to keep characters consistent across extended contexts while delivering the nuanced character work that defines compelling stories.


The Narrator extension
Tutorial

I made an extension to help progress the story with the LLM using a customizable prompt. It acts like a DM, giving you options to choose from (in 1d6 format).

https://preview.redd.it/the-narrator-extension-v0-6uzxnuwrnopf1.png

You can open it from the Wand menu, to the left of the message box. You can refine the message and post it as the Narrator system user.

The prompt settings can be changed in the extension's dialog.

You can grab it from GitHub here: https://github.com/welvet/SillyTavern-Narrator

(heavily inspired by https://github.com/bmen25124/SillyTavern-WorldInfo-Recommender )



Kazuma’s Secret Sauce V2 for Gemini 2.5 (Pro/Flash) – Better Toggles, Prompts & Image Gen
Cards/Prompts

Hey everyone, Kazuma here 👋
Today I’m releasing V2 of my preset for Gemini 2.5 Pro/Flash (you can also try it on other models).

Download

This new version comes with:

  • Removal of the annoying “forbidden/red box”

  • More toggles

  • Improved prompts

  • Built-in image generation support

🔧 Toggles explained

— RP Style —

  • Roleplay: Standard roleplay mode.

  • Texting: Roleplay like a chat/text app (Discord, WhatsApp, etc.).

  • Assistant: Better for assistant cards.

— RP Toggles —

  • Roleplay fast pace: Keeps the story moving quickly.

  • 🐢 Roleplay slow pace: Slower, more descriptive pacing.

  • 🌑 Dark roleplay: Realistic and heavy themes.

  • 🌸 Wholesome RP: Cute, flowery, no consequences.

  • 🔥 Gooner: For gooners.

  • 😎 Casual tone: More relaxed, natural narration.

  • New NPCs: The AI will try to introduce new characters.

  • 🗨️ More dialogue: 50–70% of the reply is character dialogue.

  • 🎬 Focus on actions: Narration leans on actions, not details.

  • 🔍 Focus on detailed descriptions: Opposite of the above.

— Image Gen —

The image will generate automatically when the character wants to send one.

  • 📱 Img-gen texting: For texting RP, generates selfies. Works best with SDXL or Illustrious models.

  • 🎭 Img-gen roleplay: Generates full scenes. Best with SDXL or Illustrious.

  • 📱 Flux-gen texting: Same as above but for Flux/Qwen.

  • 🎭 Flux-gen roleplay: Scene generation with Flux/Qwen.

🖼️ Image Generation Setup

To enable image gen, install this extension:
👉 st-image-auto-generation

Follow the setup shown in the screenshot for best results.

https://preview.redd.it/kazumas-secret-sauce-v2-for-gemini-2-5-pro-flash-better-v0-cfywkfee9ymf1.png

If you have any questions, drop a comment or DM me on Discord: kazumaoniisan.
I’m happy to help! And if you have suggestions for new features, let me know 🙏

❤️ Thanks to:

  • Leaf → for the base preset

  • Shino → for the infoblock

  • wickedcode01 → for the auto image generation extension

https://preview.redd.it/kazumas-secret-sauce-v2-for-gemini-2-5-pro-flash-better-v0-y09zhikg9ymf1.jpg



Some help with Claude and Deepseek
Help

So I am just going to make this all into one question.

  1. I've been using DeepSeek V3 0324 and it's pretty good, but I notice it gets quite repetitive with some phrases. I use Cherry Box. It's pretty good, but I was wondering if anyone could help with some settings that would make the repetition less of an issue.

  2. I was thinking of using Claude. I've heard good things about it, but it seems quite expensive, so I'm not too sure. Is Claude Haiku a good choice? What's a good choice on OpenRouter that would be the least painful for my wallet?



Private thoughts and descriptions for enhanced immersion
Discussion

I'm not sure if this would be useful to SillyTavern folks because I'm using my own developed API sandbox app, but I'm sure someone could figure out how to achieve the same in ST, if needed.

When playing with multiple characters, I often wanted them to keep secrets from each other and go through all the struggles of suspecting a person feels something or wants to tell you something but does not dare to. With a straightforward setup, it was not possible to achieve secrecy. Characters kept using and even quoting each other's thoughts and feelings, and it broke immersion.

What I did to fix this:

  1. I split character descriptions into public and private ones. When I send an API request for an LLM to speak for a character, I pass only the public descriptions of other chars and also the private description of the speaker. Thus I can put the appearance and behavior in the public part and keep private personality, quirks, secrets and kinks and other details that should not be immediately known to other chars.

  2. I sacrificed some storytelling flow for the non-spoiling functionality. For this, I had to add something like this to the system prompt:

Responses must be formatted according to the following schema:
Person Name: th|Thoughts go here.
ac|Actions, events and environment details go here.
sp|Optional speech goes here; can be left empty if the person has nothing to say.

and implement a simple response parser (which AI wrote for me) to extract these parts. Why not JSON? Because it's not reliable. There is a great article from BAML (boundaryml) developers about this topic, if you are interested.

Now, as you may have guessed, when I send the chat history to the LLM, I again check which char will be the speaker and keep only their thoughts.
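A rough sketch of what such a parser and thought-filter could look like (my own illustration, not the author's code; the th/ac/sp tags come from the schema above, while the function and class names are invented for this example):

```python
# Sketch of the parse-then-filter idea described above. The th/ac/sp tags come
# from the post's schema; everything else is illustrative.
import re
from dataclasses import dataclass


@dataclass
class Turn:
    name: str
    thoughts: str = ""
    actions: str = ""
    speech: str = ""


def parse_turn(text: str) -> Turn:
    """Parse one 'Name: th|...' / 'ac|...' / 'sp|...' response into parts."""
    name_match = re.match(r"\s*(.+?):\s*", text)
    turn = Turn(name=name_match.group(1) if name_match else "Unknown")
    # Capture each tag's content up to the next tag or the end of the text.
    for tag, value in re.findall(r"(th|ac|sp)\|(.*?)(?=(?:th|ac|sp)\||$)", text, re.S):
        value = value.strip()
        if tag == "th":
            turn.thoughts = value
        elif tag == "ac":
            turn.actions = value
        else:
            turn.speech = value
    return turn


def strip_private(turn: Turn, next_speaker: str) -> str:
    """Rebuild a history entry, keeping thoughts only for the next speaker."""
    parts = []
    if turn.name.lower() == next_speaker.lower() and turn.thoughts:
        parts.append(f"th|{turn.thoughts}")
    if turn.actions:
        parts.append(f"ac|{turn.actions}")
    if turn.speech:
        parts.append(f"sp|{turn.speech}")
    return f"{turn.name}: " + "\n".join(parts)
```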

With this solution, of course, the structure of responses felt a bit boring and formulaic. However, for me it was totally worth it. It was so much more immersive to see chars trying to express and guess hidden feelings of others through their actions.

It worked especially well with Google Gemini, which is known to quote chars quite often. At least, now it won't quote their thoughts. Gemini is my favorite because of its smarts and being able to drive it to dark scenarios (not as dark as DeepSeek V3 could handle though - that could get totally unhinged with surreal dark horror scenarios). I wanted to like Grok too - I enjoyed how it developed the story, but I hated how repetitive it could get, latching onto specific details and using them in every message. Yes, my char is near-sighted, but please stop adjusting his thick glasses every time, Grok! But I digress.

Another reason why I invented this split-action schema was to prepare for the unknown future when I connect it to a 3D avatar and TTS, which would need reliable, separate flows of actions and speech.