For budget coding plans, which one of these would be a good enough executor? I can use Opus to plan and want a cheaper model that still delivers acceptable code quality.
Claude Opus isn’t available in Opencode, so I’m thinking of leaving their service.
The ChatGPT Pro plan doesn't really fit my situation.
Can anyone recommend the best coding plan within a $200 budget?
(I used Claude Max 20x before and almost used up all my quota...)
or just switch to claude code? (I really don't like their cli UI/UX) :(
Hi everyone 🤗
As it says in the title: if you could suggest a fast provider with good models for under 40€/month, I'd be thankful.
Context: I've been using opencode a lot with the opencode free provider and a GitHub Copilot Pro subscription (the monthly calls are terrible). But this does not satisfy me.
Opencode Zen seems a little too little, since I use this tool the whole **** day on gigantic projects (some with 150+ files).
So a subscription seems to be the best solution, but I need one with a large usage window (preferably daily) and good, fast models.
Thanks for your time everyone 🙏
Small but notable update on : their FAQ section now includes an extra line about data handling.
Previous wording:
The plan is designed primarily for international users, with models hosted in the US, EU, and Singapore for stable global access.
Updated wording:
The plan is designed primarily for international users, with models hosted in the US, EU, and Singapore for stable global access. Our providers follow a zero-retention policy and do not use your data for model training.
Is anyone here able to buy the $50 coding plan? It says it restocks everyday at 00:00 (GMT +8). Is it legit or not?
I want to save someone the frustration I went through: don't waste your money on OpenCode's Go plan.
The models are heavily quantised. We're not talking subtle quality drops; we're talking noticeably degraded outputs that make you second-guess every suggestion. If you've used the full weight versions elsewhere, you'll immediately feel the difference in reasoning quality and context handling.
Then there are the limits. They're painful. You hit ceilings fast during any real coding session, not just long ones. Debugging a moderately complex bug? You're throttled before you're done. It completely breaks the flow that makes AI coding tools actually useful.
The combination of downgraded models + aggressive limits means you're essentially paying to use a worse version of the tool less often. That's not a plan, that's bait.
Hey everyone,
I’ve been using OpenCode for my terminal-based coding, but I’ve been seeing more about LangChain’s DeepAgents CLI lately.
On the surface, they seem to share a lot of the same DNA - subagents, planning modules, and an emphasis on agentic workflows. Since the core logic feels so similar across both tools, I’m curious if anyone here has actually put them head-to-head?
Thanks in advance.
claude-replay was originally built for Claude Code, but it has been extended to support different clients, including OpenCode. Hope it's useful to the community here.
It converts OpenCode session transcripts into interactive HTML replays — step through prompts, tool calls, and reasoning like a video player.
opencode export <sessionID> > session.jsonl
claude-replay session.jsonl -o replay.html
Output is a single self-contained HTML file. No dependencies, works anywhere.
It also includes a web editor; drop a session file, tweak options, and download the HTML without installing anything
Try it:
Repo:
I've been collecting skill packs for OpenCode/Claude Code and hit 2,004 skills across 34 categories (ai-ml, security, devops, game-dev, etc.).
The problem: AI agents use a to load skills. Level 1 loads the name + description of every skill into the system prompt at startup. With 2,004 skills, that's ~80,000 tokens consumed before I even type a prompt - roughly 40% of a 200K context window.
The fix: SkillPointer
It's not a plugin or library. It's an organizational pattern that works with native skills:
- Move all 2,004 raw skills to a hidden vault directory (outside the agent's scan path)
- Replace them with 35 lightweight "category pointer" skills
- Each pointer tells the AI: "use `list_dir` and `view_file` to browse the vault and find the exact skill you need"
Result:
| | Before | After |
|---|---|---|
| Startup tokens | ~80,000 | ~255 |
| Skills accessible | 2,004 | 2,004 |
| Reduction | - | 99.7% |
The AI still accesses every skill - it just discovers them on-demand using file tools it already has, instead of loading all descriptions at startup.
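The percentages quoted above follow directly from the post's own figures. A quick check of the arithmetic, using only the numbers stated in the post (the variable names are just for illustration):

```python
# Figures taken from the post; nothing here is measured independently.
startup_before = 80_000    # tokens loaded at startup with 2,004 skill descriptions
startup_after = 255        # tokens loaded at startup with category pointers only
context_window = 200_000   # context window size used in the post

share_before = startup_before / context_window   # fraction of the window used before
reduction = 1 - startup_after / startup_before   # relative drop in startup tokens

print(f"Share of context before: {share_before:.0%}")   # 40%
print(f"Startup token reduction: {reduction:.1%}")      # 99.7%
```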
How I verified this
- Measured actual YAML frontmatter sizes from all 2,004 files
- Confirmed the `<available_skills>` loading behavior in and
- Real data from my own environment, not theoretical numbers
Repo
Includes a zero-dependency Python setup script that auto-categorizes your skills and generates the pointers.
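For readers who want a feel for the pattern without opening the repo, here is a minimal sketch of the idea. It is not the author's setup script: the function name `build_pointers`, the `skills/<category>/<skill>/SKILL.md` layout, the `-pointer` naming, and the pointer wording are all assumptions made for illustration, and it expects skills to be grouped by category already.

```python
from pathlib import Path
import shutil

# Hypothetical pointer template: a tiny frontmatter plus one instruction line.
POINTER_TEMPLATE = """---
name: {category}-pointer
description: Index of {count} {category} skills stored in the vault.
---
Use list_dir and view_file to browse {vault_path} and open the exact skill you need.
"""


def build_pointers(skills_dir: Path, vault_dir: Path) -> list[Path]:
    """Move each category folder into the vault and leave one pointer skill behind.

    Assumes vault_dir lives outside skills_dir, so the agent never scans it.
    """
    vault_dir.mkdir(parents=True, exist_ok=True)
    pointers = []
    # Materialize the listing first, because the loop adds and removes folders.
    categories = sorted(p for p in skills_dir.iterdir() if p.is_dir())
    for category_dir in categories:
        count = sum(1 for p in category_dir.iterdir() if p.is_dir())
        target = vault_dir / category_dir.name
        shutil.move(str(category_dir), str(target))

        pointer_dir = skills_dir / f"{category_dir.name}-pointer"
        pointer_dir.mkdir()
        pointer_file = pointer_dir / "SKILL.md"
        pointer_file.write_text(
            POINTER_TEMPLATE.format(
                category=category_dir.name, count=count, vault_path=target
            )
        )
        pointers.append(pointer_file)
    return pointers
```

Calling `build_pointers(Path("skills"), Path(".skill-vault"))` (both paths hypothetical) would leave one small pointer file per category in the scan path, which is what keeps the startup prompt small.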
Happy to answer questions about the approach. I know "it's just skills organizing skills" - that's literally the point. The value is in the pattern, not the tech: the savings show up at scale.
A little background on my journey here.
My employer covers my AI subscriptions. I started with Claude Max, eventually discovered OpenCode, and ran it through OMO for a while. Then I started experimenting with OpenClaw and added a ChatGPT subscription, using both through OMO. At some point I lost the ability to use Claude with OpenCode, and I've been trying to find my footing ever since.
I tried running OMO with just my ChatGPT subscription — it didn't hit the same, and it was painfully slow. I've experimented with different skills in Claude Code directly, but it's still slow and frustrating.
After going down a lot of dead ends, I kept seeing people talk about Kimi K2.5. I signed up for a Fireworks AI free trial, hooked it into OpenCode, and — holy crap. The speed difference is making me rethink everything.
Maybe eventually my workflow will be refined enough that I can confidently hand something off to AI without needing to steer it as much, and raw speed won't matter as much. But right now, while I'm still iterating and learning, failing extremely fast is proving far more valuable than slightly lowering the failure rate while failing a lot slower.
I'm not sure I could ever convince my employer to write a blank check for Anthropic's API. Maybe OpenAI. But Fireworks AI could be a much easier sell.
What I'm increasingly convinced of is this: if I can pair this kind of speed with the right skills, agent setup, and a solid approach to building and evolving memories, I can start scaling this in a real way and be significantly more productive.
I know we're all at different points on similar journeys — I just wanted to flag how much speed of this magnitude can force you to rethink your whole strategy.
So in my team, people use the review skill. And at the base the skill is good. Works in most scenarios.
But for specific use cases, we need to build on top of that skill so it also considers the UI workflows etc.
How can I achieve something like that?
I know I am being vague and not able to explain it properly. You can ask me follow-up questions.
Opencode Go is $5/month for the first month, then $10/month.
The Minimax Token API is $10/month.
Minimax offers 1,500 requests per 5 hours for the M2.7 model.
Opencode Go gives 14,000 requests per 5 hours for the M2.7 model.
I am confused about how generous these request limits are.
How much work can I get done with 1,500 requests every 5 hours, and does the limit reset? Opencode Go offers something like 14,000 requests. How?
Does anyone have experience with this, or a guide?
- Codex CLI
- OpenCode CLI
90,000 Requests for $15 first month and 18,000 Requests for $3 first month. This sounds too good to be true?
Available Models: GLM 5, Minimax M2.5, Kimi K2.5 and Qwen 3.5 Plus.
What's the catch? Bad unreliable service? Their definition of 'request' is misleading? I don't get it. If this is all true, then this is the most value for money plan, right?
I'm searching everywhere and I see no one is talking about it at all.
Also, for my Indian brothers out there. Currently, they do not have a way to verify +91 phone numbers so they're not allowing registrations / account sign ups for India. I spoke with their contact, and they said something about their data center recently shutting down in India. Their system requires mandatory phone number verification before making any purchase so the agent was 'unofficially' recommending me to buy a virtual online phone number for another country and sign up that way.
Anyway, I'd love to hear more about this from you guys. Maybe someone is already using it and can share their experience with it?
I'm going to start this by saying that I've been using systems and talking to creatures through different CLI tools for quite some time now. I started with Gemini CLI and then moved on to a variety of different ones; I explored everything, even the precursor to OpenCode (or the one that split off—I don't remember what they call it, I think it's Charm or something of that nature). I haven't used it in a very long time, so I don't remember exactly, but there is a variety of different ones that exist. They are all really interesting and have their own strengths and weaknesses.
I have come to really, really enjoy OpenCode. One of the greatest things about it is its resilience. It has been worked on quite decently for a long time and the code is pretty mature. The work is really great, and the best part is having so many different inference providers that you're never going to run out of them.
The structure is absolutely fantastic to work with, especially:
- The plugin structure, skills, and agents
- The undo command
In the web format, you can turn on workspaces and have them function for you in the same Git style with Merkle tree diagrams. It really works.
It is an amazing tool, especially if you use OpenCode Web or OpenCode Desktop. I recommend the web version because you can connect to it remotely from your phone if you create a virtual private network. It gives you sovereignty over your architecture because the inference is still usually done in the cloud (unless you run local Ollama), but your files stay local. If you build skills or tools for the systems that function within OpenCode, it becomes so much better.
It really is a wonderful journey. I recently switched over to OpenCode Web, and we even built an application for Android around it—just a wrapper so that authentication and everything else worked. With an application, you can use "keep alive" so you don't have to worry about reloading the page every time you open it. It’s just nicer that way. We are also working on implementing notifications and similar features.
Again, this is just an OpenCode appreciation post. It's really great what Anomaly Co is doing and how they're working on this. The open-source nature makes it a lot better because you can audit all of it and build from it.
Thank you so much to the development team and everyone else involved. This is quintessential to our workflows these days and it's really useful. Thank you.
Yesterday I noticed yet another opencode fork offering just one feature, fork, simply because it's impossible to implement it as a plugin.
I thought of advising them to open a PR, but... Opencode has 1500+ open PRs, so I believe that if you want something, waiting an eternity and hoping your PR gets noticed is not an option. Even if it's good. I remember wanting some things, going to GitHub, and finding issues asking for them that the team never noticed....
Yes, I get it, the team isn't huge and they have their own direction of development, but plugins... So many hooks could be added, and the whole plugin system could be reworked to cover more core/inner features.
So I created an issue to draw some attention to the problem and show that there is demand for better plugin capabilities.
If you had similar issues - feel free to share.
Sometimes you simply want to copy and paste your entire project to the agent without it having to explore and rediscover everything all over again.
I did this (human code...) some time ago for personal use, but now using opencode I rewrote it so anyone can use it easily
The basic definition is: Output all file contents recursively from a directory to STDOUT, and that's it...
But of course you can already do that with a bit of bash, so the added value is these summarized features:
Granular truncation control:
and I mean... REALLY granular: you can truncate per line, per file, and for the total STDOUT, with a different configuration and different limits for each.
LLM friendly:
the tool tries its best to help so that even the dumbest LLM knows what's happening.
Summary preview:
this is what I use most, before passing all the context to the LLM, the --summary mode lets me see important statistics about the project's "weight", like files with the longest lines, with the most characters, files with the most lines, directories with too many files, etc. It even uses @anomalyco/models.dev locally so you can estimate how much context window your project is occupying and approximately how much it would cost if you passed everything to it in one go
and more...
well, better read the README.md if you've made it this far.
I'll be watching for any issues or contributions, I've tested it heavily so it should work fine. Anyway, I wrote a Gotchas section in the readme for known issues, take a look if something goes wrong.
(I need someone to help test on Windows please, I only tested on Linux)
REPO URL:
Been using OpenCode for a while. The UI is genuinely the BEST.
But Claude Code keeps shipping weekly: memory, agent teams … Anthropic is not slowing down.
Their UI is rough compared to OpenCode. Not even close.
So I’m stuck, sacrifice UX for features, or wait for OpenCode to catch up?
What are you using and why?
(I'm using the MiniMax M2.7 LLM for both)
Hey guys, need some help with OpenChamber (using it with OpenCode).
I’ve been testing it out and really liking the concept, but I’m running into a few issues / missing features that are kind of blocking my workflow:
- Diff per last turn (not full session): In OpenCode web UI, I can view file changes based on the last turn, which is super useful when the session already has a lot of edits. In OpenChamber, I can only see diffs based on the whole session (as far as I can tell). Is there a way to switch it to “last turn diff” like in OpenCode?
- Model switch shortcut (Ctrl+M): In OpenCode, I mapped Ctrl+M to quickly switch models. Is there a way to set up a similar keyboard shortcut in OpenChamber?
- Agent settings not saving: This one’s more serious. Whenever I edit system prompts or settings per agent (build / plan / general / explore), it says “saved” — but after refresh, everything resets to default. Is this a known bug? Or am I missing something (like a config file, permissions, etc.)?
Would really appreciate any insights, workarounds, or confirmations if these are current limitations. Thanks!
I'm wondering if there's more to OpenCode than just honing your skills and using good prompts. What are your workflows like? What kinds of commands do you build? I think I'm still not making the most of OpenCode
EDIT:
Thank you all for your answers. This post was incredibly helpful for me!
Since the releases of GLM-5, MiniMax M2.5, and Kimi K2.5, all I read is how amazing these LLMs are. So many people say how they can replace Sonnet 4.5 in most cases. To test this, I created my own personal benchmark: update a personal project that used to read from OpenCode’s JSON files to instead read from the SQLite db. Sonnet 4.5/4.6 and GPT 5.2/5.3 Codex finished these within 15 minutes and with no issues. GLM-5, MiniMax M2.5, and Kimi K2.5 failed spectacularly. For the same prompt, each model took 40+ minutes and didn’t even produce a working migration. MiniMax M2.5 had issues with tool calling and would just stop randomly. I have tested with OpenCode + Oh My OpenCode + GitHub Copilot (just to see if GPT/Sonnet would do). Am I missing something? How are others getting performance that is anything close to Sonnet/GPT from these cheaper models?
I'm confused about what to use.
TLDR: On a refactoring task, OpenCode with Sonnet 4.6 performed significantly better than Claude Code with the same model and was a bit cheaper (but still very expensive, as both used the API), but OpenCode with Codex 5.3 was the best and 3 times as cheap. I also had some fun with open source models: their quality through OpenRouter felt really shitty, but through Ollama Cloud they were much more stable, and GLM-5 actually delivered surprisingly well, especially for its price tag.
Today is the second day of my journey with OpenCode for personal projects after deciding to give it a go ( for context). This evening I decided to test how it actually fares against Claude Code in more or less equal conditions, but then went a bit down the rabbit hole.
Code "under test" - 10k LoC electron+react app, fully vibe coded during evenings and weekends over past month, using Claude Opus on $100/month plan. Main language typescript, some serious guardrails with eslint, including custom plugins, to keep architecture and code complexity in check - and I was tightly following what Claude does, sometimes giving very precise directions, so I can actually orient in this code myself when needed. Of course there is also test suite, including some E2E using Playwright, and of course sensible also there. Code quality... to my taste meh, but it works. One of the issues - too many undefined/nulls allowed in parameters and structure fields, and hence too many null checks sprinkled over the codebase.
Prompt: "Analyse codebase thoroughly for simplification and deduplication opportunities. Give special attention to simplifying type annotations, especially by reducing amount of potential nulls/undefineds."
All models (except one case specifically mentioned in the end) were tested through OpenRouter API, after each run I was downloading log sheets and running simple analysis on them.
- Claude Code with Sonnet 4.6, but using OpenRouter API key. Results: $3.85 burned in about 15 minutes, 136 API calls, 6.9M prompt tokens, cache hit rate 88%, 2 files changed, 4 insertions(+), 4 deletions(-) - what did I pay for?
- OpenCode with same Sonnet 4.6. Results: $3.18 burned in about the same 15 minutes, 157 API calls, 7.5M prompt tokens, but cache hit rate 95% with 8 files changed, 43 insertions(+), 44 deletions(-) - all making sense.
- OpenCode with GPT-5.3-Codex. Results: $1.44 burned in about 7 minutes, 79 API calls, 4.9M prompt tokens, 95% cache hit rate, and 16 files changed, 91 insertions(+), 101 deletions(-) - all making sense.
- OpenCode with Gemini 3.1 Pro. Results: $1.88 burned in about 9 minutes, 92 API calls, 3.6M prompt tokens, 85% cache hit rate, 11 files changed, 94 insertions(+), 65 deletions(-) - well, most of the changes did make sense, but I didn't expect that LoC count would grow on such a task...
- OpenCode with Devstral 2. Results: $5 burned before I noticed its explore agent went nuts and just started hammering the API with 200k token prompts each. Brrr.
- OpenCode with GLM 5. Results: 2 "false starts" (it just froze at some point), then on the third attempt during plan mode, instead of analysing code, it started pouring out some "thoughts" on the place of a human being in society. I'm not kidding. I should have taken a screenshot, but good ideas sometimes come too late.
- OpenCode with GLM 5 from Ollama Cloud ($20 plan). Results: unfortunately no detailed statistics, but it ran without problems on the first try, burned about 7% of session limit and 2% of weekly limit, 11 files changed, 47 insertions(+), 42 deletions(-), generally making sense.
- OpenCode with Devstral 2 as main and Devstral 2 small for exploration, both from Ollama Cloud. Results: again, no detailed statistics, but it also ran without problems on the first try, burned another 3% of session limit and about 0.5% of weekly limit, 8 files changed, 20 insertions(+), 15 deletions(-), but... instead of focusing on what I asked it to do, it decided to overhaul error handling a bit. It was actually quite okay, but wtf - I asked for a totally different thing.
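One rough way to compare the four runs that reported dollar costs is cost per changed line. The sketch below uses only the figures listed above; "changed lines" (insertions plus deletions) is a crude proxy that says nothing about whether the changes were good, and the helper name is just for illustration.

```python
# (cost in USD, insertions, deletions), copied from the runs listed above.
runs = {
    "Claude Code + Sonnet 4.6": (3.85, 4, 4),
    "OpenCode + Sonnet 4.6": (3.18, 43, 44),
    "OpenCode + GPT-5.3-Codex": (1.44, 91, 101),
    "OpenCode + Gemini 3.1 Pro": (1.88, 94, 65),
}


def cost_per_changed_line(cost: float, insertions: int, deletions: int) -> float:
    """Dollars spent per inserted or deleted line."""
    return cost / (insertions + deletions)


# Cheapest per changed line first.
ranking = sorted(runs, key=lambda name: cost_per_changed_line(*runs[name]))

for name in ranking:
    print(f"{name}: ${cost_per_changed_line(*runs[name]):.4f} per changed line")
```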
Let me start with a short disclaimer:
- I'm not a bot, and not using LLM to write this
- I'm a pretty old (40+) professional software developer
- about 2 months ago I plunged into learning agentic coding tools - because I felt I either learn to use them, or become outdated
I started with Junie in my JetBrains IDE + Gemini 3 Flash model, then went to try Claude Code with Pro plan, then went to Max5 about month ago and was active user of Opus 4.6 for quite some personal projects, also managed to build some serious automated guardrails around them to keep architecture in check
So far so good, even though Opus API costs are crazy expensive, I'm getting it at huge discount due to CC subscription, right? Well, it was right, until yesterday, when Anthropic started doing some . And I found myself locked in into single "provider".
Now, due to some recent events I decided to give Opencode a try. First impressions, with free MiniMax M2.5 model - wtf? It is faster, and proposes much more sensible refactorings than Claude "/simplify" command on a medium sized project. And even if I pay API costs for that model, that would have been $0.20 vs $3 (sonnet) or $5 (opus).
Yes, it is just the first evening, first impressions, simple test tasks, but how come? Code discovery looks much faster and much more reliable (better LSP integration?) than in Claude Code, which is probably one of the big reasons why it performs so well. There are also minor joys like the sandbox enabled by default, or the side panel with context usage stats, plan progress and modified files.
And no more vendor lock-in with obscure pricing model. Can use cheap models for simple tasks. If really in doubt - can always check with Opus at premium. Can even get Codex subscription and use GPT models at subsidised rates, just like I was doing with Claude, but unlike Claude - not locked into their tool.
Am I alone in this discovery? Is this just a "candies and flowers" period, and soon I'll get disappointed, or it is really substantially better than what Anthropic is trying to sell us?
Looking to see what actually fast feels like. Any suggestions?
Hey everyone,
If you’ve connected OpenCode to NVIDIA NIM, you know the library looks massive after you do the NVIDIA auth. But let’s be real: half the models in that list are hit-or-miss or just don't respond properly in the TUI.
I ran a quick benchmark on the models that actually work and are stable. Here are the results using a standard "Hi / What are your capabilities?" prompt.
| Model Name | Speed / Latency | Performance Note |
|---|---|---|
| nvidia/nemotron-3-nano-30b-a3b | 1.6s | ⚡ The Speed King. Instant response. |
| minimaxai/minimax-m2.5 | 2.0s | Very snappy. |
| moonshotai/kimi-k2-instruct-0905 | 2.0s | Much faster than the 2.5 version. |
| mistralai/ministral-14b-instruct-2512 | 3.5s | Solid mid-size performance. |
| z-ai/glm5 | 4.0s | Responsive and stable. |
| nvidia/llama-3.3-nemotron-super-49b | 4.0s | Great balance of power/speed. |
| mistralai/mistral-large-3-675b | 4.3s | Best Heavyweight. Insane for its size. |
| openai/gpt-oss-120b | 6.0s | Reliable, but middle of the pack. |
| nvidia/nvidia-nemotron-nano-9b-v2 | 8.0s | Fast start, but overthinks/yaps too much. |
| meta/llama-3.1-405b-instruct | 10s – 50s | Hit or miss. Sometimes fast, sometimes hangs. |
| moonshotai/kimi-k2.5 | 30s | Slow burner. |
| z-ai/glm4.7 | 32s | Significant lag. |
| deepseek-ai/deepseek-v3.2 | 2m | Works, but grab a coffee. ☕ |
Quick Takeaways:
- Best for Speed: Stick to the NVIDIA Native models (Nemotron series) or MiniMax. The optimization on this infra is clearly tuned for them.
- Best for Reasoning: Mistral Large 3 is the winner. It's surprisingly fast for a 675b model on a free tier.
- The "Duds": Half the models in the NIM catalog currently time out immediately or don't respond in the TUI—the ones above are the confirmed survivors.
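If you want to repeat this kind of timing yourself, a small harness along these lines would do. It is only a sketch: `send_prompt` is a placeholder you would supply for whatever client you use to call a model, and the post does not say how its timings were actually taken.

```python
from __future__ import annotations

import time
from typing import Callable, Optional


def measure_latency(
    send_prompt: Callable[[str, str], str],
    models: list[str],
    prompt: str,
) -> list[tuple[str, Optional[float]]]:
    """Time one round trip per model; failed calls get None instead of a time."""
    results: list[tuple[str, Optional[float]]] = []
    for model in models:
        start = time.perf_counter()
        try:
            send_prompt(model, prompt)
            elapsed: Optional[float] = time.perf_counter() - start
        except Exception:
            elapsed = None  # timeout or error: treat the model as not responding
        results.append((model, elapsed))
    # Fastest first, non-responding models at the end.
    return sorted(results, key=lambda r: (r[1] is None, r[1] or 0.0))
```

A single request per model is noisy, as the 10s to 50s spread in the table suggests, so averaging a few runs per model would give steadier numbers.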
Has anyone found any other models in the catalog that are actually responding?
Hey guys! I have been using opencode here and there for some time, and I mainly used Codex because of the 2x credits, but since it’s over next month, I like OC and want to use it as a daily driver. I work with .NET, and usually I use opencode at the terminal and review it in Rider, and I also make manual changes there.
I subscribed to the go plan for 1 month, and by now I basically use plan and build agents, and have been trying to use Kimi 2.5 as my main plan. I have no other configurations of anything else done at this moment and would love to hear some tips or some guidance on how to use it more effectively, please.
Is there anything more that I'm missing?
A few weeks back I posted about — our OpenCode fork that intercepts known errors with regex patterns before burning LLM tokens.
That solved repeated errors. But there was another token drain we kept hitting: repeated corrections.
The problem:
You're coding. Terminal crashes. You reopen, run --resume. The AI has no idea what you were doing. wasn't updated before the crash. You spend 10 minutes re-explaining.
Or worse: you correct the AI's behavior. "Use conventional commits." It follows. Context compacts. Correction gone. You correct again. By the 5th time, you've burned 1,000+ tokens saying the same thing.
Patterns saved us from repeated errors. We needed something to save us from repeated corrections.
State versioning:
CyxCode now commits your AI's state on exit — even crashes:
Terminal closes (Ctrl+C, crash, SIGHUP)
↓
Exit handler fires
↓
State committed:
├── goal: "Add JWT auth to API"
├── inProgress: "Fix token expiry"
├── workingFiles: [auth.ts, middleware.ts]
├── discoveries: ["tokens used wrong secret"]
└── corrections: [{rule: "use conventional commits", strength: 3}]
Next session: "update me from last conversation"
AI already knows what you were doing. No re-explaining.
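The general "commit state on exit" technique can be sketched in a few lines of Python. This is not CyxCode's code: the file name, the `state` dictionary, and the function names are assumptions for illustration, with the field names borrowed from the diagram above.

```python
import atexit
import json
import signal
import sys
from pathlib import Path

STATE_FILE = Path(".session-state.json")  # hypothetical location

# In-memory session state, shaped like the diagram above.
state = {
    "goal": "Add JWT auth to API",
    "inProgress": "Fix token expiry",
    "workingFiles": ["auth.ts", "middleware.ts"],
    "discoveries": ["tokens used wrong secret"],
    "corrections": [{"rule": "use conventional commits", "strength": 3}],
}


def commit_state(path: Path = STATE_FILE) -> None:
    """Write the state to a temp file, then rename it over the target."""
    tmp = path.with_suffix(".tmp")
    tmp.write_text(json.dumps(state, indent=2))
    tmp.replace(path)  # rename, so a half-written file never replaces a good one


def _on_signal(signum, frame) -> None:
    commit_state()
    sys.exit(128 + signum)


def install_exit_handlers() -> None:
    """Save on normal exit and on SIGINT, SIGTERM and SIGHUP where available."""
    atexit.register(commit_state)
    for name in ("SIGINT", "SIGTERM", "SIGHUP"):
        sig = getattr(signal, name, None)  # SIGHUP does not exist on Windows
        if sig is not None:
            signal.signal(sig, _on_signal)
```

Note that no handler can run on SIGKILL or a power loss, so a design like this only covers terminations the process is allowed to observe.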
---
Correction tracking:
You correct AI → strength: 1
Correct again → strength: 2
Third time → strength: 3 → AUTO-PROMOTED
Strength 3 = permanently injected into every session. The AI can't forget it.
We also added drift detection — if the AI stops following a learned behavior, it gets auto-reminded.
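The strength counter described above is simple enough to sketch. Again, this is an illustration rather than CyxCode's implementation: the class name, the normalization of rule text, and the prompt wording are assumptions; only the threshold of 3 comes from the post.

```python
from dataclasses import dataclass, field

PROMOTION_THRESHOLD = 3  # "Third time -> strength: 3 -> AUTO-PROMOTED"


@dataclass
class CorrectionTracker:
    """Counts how often each correction is repeated and promotes frequent ones."""

    strengths: dict[str, int] = field(default_factory=dict)

    def record(self, rule: str) -> int:
        """Register one more occurrence of a correction and return its strength."""
        key = rule.strip().lower()  # naive normalization, an assumption
        self.strengths[key] = self.strengths.get(key, 0) + 1
        return self.strengths[key]

    def promoted(self) -> list[str]:
        """Rules repeated often enough to be injected into every session."""
        return [r for r, s in self.strengths.items() if s >= PROMOTION_THRESHOLD]

    def prompt_suffix(self) -> str:
        """Text that could be appended to the system prompt for promoted rules."""
        rules = self.promoted()
        if not rules:
            return ""
        return "Always follow these rules:\n" + "\n".join(f"- {r}" for r in rules)
```

Matching corrections by exact text is the weak point of a sketch like this; "Use conventional commits" and "please use conventional commit messages" would count as two different rules.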
---
Token math:
| What | Before | After |
|------------------------------------|---------------|-------------------|
| Resume after crash | ~20K tokens | ~200 tokens |
| Correction repeated 5x | 1,000 tokens | 200 tokens (once) |
| Pattern match (from original post) | ~1,500 tokens | 0 tokens |
Patterns handle repeated errors. State versioning handles repeated corrections.
---
What it's NOT:
This saves session context, not code. Git tracks your files. CyxCode tracks what the AI knew and was doing.
---
Current stats:
- 170+ error patterns (up from 136)
- Auto-commit on exit (SIGINT, SIGTERM, SIGHUP)
- Correction strength scoring + auto-promotion
- Drift detection + auto-remind
- Resume injects previous session context
---
Try it:
Having different models, is it possible to make them talk and discuss solutions before coming back? Can OpenCode already do that or is it more of a plugin thing?
A few days ago, in the post "", I saw a comment that said:
I dont mean to plug, but I felt that OmO was also heavy so I built weave which is meant to be lightweight and the workflows are configurable. I would appreciate some feedback.
I've been working with Weave since then, and I reckon it's currently the best framework for managing agent workflows for OpenCode. Especially for people who actually know what they're doing.
It's well-thought-out and, above all, lightweight compared to OmO.
The way it can be configured is literally amazing! You can also add your own (sub)agents to the workflow in a very simple way. And that is its greatest strength, because in my opinion, the key to success is a proper configuration that fits the project, rather than a set of dozens of agents for everything.
This project definitely needs more exposure! And the creator himself is incredibly helpful.
Hey folks,
Now that Claude (especially claude code subscriptions) isn’t really usable in open code workflows anymore, I’m trying to figure out what the next best alternative is.
I’ve been heavily relying on Opus for coding + reasoning, and honestly it’s been hard to replace especially for debugging.
Right now I’m considering trying GLM (z.ai) maybe even the yearly plan, since I’ve heard it’s surprisingly strong for the price and even wins some head-to-head comparisons vs Claude in community tests.
But I’m not fully convinced yet.
Would love to hear from people who’ve actually used different setups:
- Any underrated platforms / combos (multi-model routers, local setups, etc.)?
I’m curious what people are actually daily driving.
Main use cases for me:
- Full-stack coding
- Debugging messy repos
- Some light agent workflows
Would really appreciate real-world experiences over benchmarks.
I've been working on my projects with Claude Code for a day now, after Claude became unavailable in OpenCode, and it's truly frustrating.
I don't know why, but with OpenCode I had more control over what I was doing. Now it's like Claude Code is more slop, or I don't know, but I have to repeat things more than once.
I understand that LLM is still LLM, but for me, it's an interface issue. You guys managed to simplify the tool a lot and make it powerful. It's a shame this happened with Claude.
I hope you can bring back that configuration. From my point of view, OpenCode is on another level for working. I'll continue using it with GPT, but I use Claude much more.
Sincerely, thank you for the tool; it's fantastic.
I’m curious to hear your best OpenCode workflow tips, commands, habits, shortcuts, or general ways of working that make your sessions smoother and more effective.
I’d love to make this a thread where everyone can share the little things that actually improve day-to-day usage and help build a better workflow.
For example, two things I do all the time are:
- I constantly use @ to reference files directly from folders, so the context stays precise.
- At the end of each session, I use an .md file to ask it to write down what it learned, any useful context, and anything that could help in future sessions.
What are yours?
What commands, patterns, prompts, or routines have made the biggest difference for you in OpenCode?
Would love to collect as many practical ideas as possible.
- Opencode GO
- Github Copilot
- Alibaba's Coding Plan
- Bytedance coding plan
- Ollama's subscription
- Others?
I know Opencode GO uses quantized models, which most people are unhappy with.
How do you choose?
Hi guys, I'm new to this... Currently I have a very basic setup. It's just defaults plus this config. I do dotnet C# and angular coding.
{
  "$schema": "https://opencode.ai/config.json",
  "provider": {
    "copilot": {}
  },
  "model": "github-copilot/gpt-5.3-codex",
  "small_model": "github-copilot/gemini-3-flash",
  "agent": {
    "build": {
      "model": "github-copilot/gpt-5.3-codex"
    },
    "plan": {
      "model": "github-copilot/claude-opus-4.6"
    }
  },
  "watcher": {
    "ignore": [
      ".git/**",
      ".vs/**",
      "bin/**",
      "obj/**",
      "node_modules/**",
      "dist/**",
      "build/**",
      "coverage/**"
    ]
  }
}
The docs said I can define a small_model which I have done, but I'm unsure if it automatically gets used... I haven't seen anything in the UI indicating it's in use, so I'm just assuming it gets used behind the scenes?
My flow is:
- Plan in Plan mode obviously
- Ask Plan to review the plan
- Build mode to implement
- Ask Plan to review the implementation
Both the before/after reviews seem to often catch mistakes or holes, so they seem useful but I assume they are burning more premium requests?
Do you guys still use Opus 4.6 for reviewing? Or do you switch to a cheaper model once Opus 4.6 has done the initial plan.
Also I've been reading about "temperature" here:
Do you guys tweak temperatures yourself, or just leave it up to OpenCode defaults?
Thanks.
I'm having great fun with OpenCode 👍
Now that GLM 5.1 is out, will it be added to the OpenCode Go subscription plan? And if so, when?
It looks like they announced this on the Coding Go plan, but there's been no update for the Zen model provider. Is there any plan to support the new model?
I have been using opencode for about a week now with bigpickle. I have been able to successfully build a few apps.
What am I missing by not paying for a provider? I’m mainly making react native apps and everything has worked out okay.
I've seen lots of posts (from questions to tips) and they're mostly about the CLI. While I like it and have been using the CLI to do my work, trying the OpenCode web UI made me realize how limited and inefficient I was with the CLI. For about a month I've mostly used the web UI of opencode, and it's so much more enjoyable (AND EFFICIENT) to work with. From the UI features, kb shortcuts, and even the basic keyboard operations support, I really don't know why one would keep using the CLI other than out of plain preference.
Anyway, I was just wondering why I don't hear much chatter about the web UI.
EDIT:
So awesome to hear from lots of ya'll... and thank you for the tips!
I recently posted in this subreddit asking for users' opinions on oh-my-opencode. It turned out that most people had encountered the same issues I did, so I continued my search for a reliable framework for coding
(Context: I’m a non-technical person with a basic understanding of engineering, working on building mobile apps on Expo from scratch)
So, after further research and comparisons, I became interested in openspec, but there isn’t much content about it online, especially regarding the update that seems to have been released a few weeks ago (OPSX Workflow).
So I’d love to hear about your experience using this framework - the pros and cons, and in which scenarios it works best for you. I’d also be interested to hear your thoughts on whether it’s worth starting to build a mobile app from scratch using this framework, or if there might be much better options for this use case?
Hey guys, I'm new to this subreddit, and sorry for asking this because it has probably been discussed here 99999 times.
But which AI subscriptions under $10 do you guys use, and why?
Personally I'm using the $10 Copilot plan, but I don't know if it's that good with opencode, and I recently got OpenCode Go. I'm planning to swap my Copilot subscription for another one, but I don't know of any.
Would love your opinions on that
Are we all cancelling our max subs?
I'm a little surprised by how little discussion there is around this since "_they_" flipped the switch and stopped letting the opencode client connect.
So what are y'all doing? Accessing via a proxy plugin/tool? Another provider (OMO wants to connect through the antigravity provider), or giving they-who-shall-not-be-named The Bird and switching to another model all together?
You might remember me from the 9 MCP tool eval I ran against SanityEval and posted here. I've spent the last few months researching, testing, and building a new tool called Vera. It's a code indexing and search tool designed specifically for AI coding agents, and it's built to be as local-first and friction-less as possible.
A lot of the existing code indexing and search tools are bloated and heavy. A quick recap: when I tested about 9 different MCP tools recently, I found that most of them actually make agent eval scores worse. Tools like Serena caused the largest negative impact in evals. The closest alternative that actually performed well was Claude Context, but that required a cloud service for storage (yuck) and lacks reranking support, which makes a massive difference in retrieval quality. Roo Code unfortunately suffers from similar issues, requiring cloud storage (or a complicated setup of running qdrant locally) and lacking reranking support.
I used to maintain Pampax, a fork of someone's code search tool. Over time, I made a lot of improvements to it, but the upstream foundation was pretty fragile. Deep-rooted bugs, questionable design choices, and no matter how much I patched it up, I kept running into new issues.
So I decided to build something from the ground up after realizing that I could have built something a lot better.
The Core
Vera runs BM25 keyword search and vector similarity in parallel, merges them with Reciprocal Rank Fusion, then a cross-encoder reranks the top candidates. That reranking stage is the key differentiator. Most tools retrieve candidates and stop there. Vera actually reads query + candidate together and scores relevance jointly. The difference: 0.60 MRR@10 with reranking vs 0.28 with vector retrieval alone.
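For anyone unfamiliar with the fusion step, here is a minimal Python sketch of Reciprocal Rank Fusion as it is commonly described. It is not taken from Vera's code: the function name, the file names, and the constant k=60 are assumptions for illustration, and the cross-encoder reranking stage is left out.

```python
from collections import defaultdict

def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document ids into one ranking.

    Each document scores sum(1 / (k + rank)) over the lists it appears in,
    with rank starting at 1. k=60 is a commonly used constant (assumed here).
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by id so the result is deterministic.
    return sorted(scores, key=lambda d: (-scores[d], d))

# Hypothetical result lists from a keyword search and a vector search.
bm25_hits = ["auth.py", "login.py", "utils.py"]
vector_hits = ["session.py", "auth.py", "login.py"]

fused = reciprocal_rank_fusion([bm25_hits, vector_hits])
print(fused)
```

Documents that both retrievers rank highly rise to the top, which is why the fused list would be a reasonable candidate set to hand to a reranker.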
Fully Local Storage
I evaluated multiple storage backends (LanceDB, etc.) and settled on SQLite + sqvec + Tantivy in Rust. This was consistently the fastest and highest quality retrieval combo across all my tests. This solution is embedded, no need to run a separate qdrant instance, use a cloud service or anything. Storage overhead is tiny too: the index is usually around 1.33x the size of the code being indexed. 10MB of code = ~13.3MB database.
63 Languages, Single Binary
Tree-sitter structural parsing extracts functions, classes, methods, and structs as discrete chunks, not arbitrary line ranges. 63 languages supported, unsupported extensions still get indexed via text chunking. One static binary with all grammars compiled in. No Python, no NodeJS, no language servers, no per-language toolchains. .gitignore is respected, and can be supplemented or overridden with a .veraignore.
Model Agnostic
Vera is completely model-agnostic, so you can hook it up to whatever local inference engine or remote provider API you want. Any OpenAI-compatible endpoint works, including local ones from llama.cpp, etc. You can also run fully offline with curated ONNX models (vera setup downloads them and auto-detects your GPU). GPU backends supported: CUDA (NVIDIA), ROCm (AMD), DirectML (Windows), CoreML (Apple), OpenVINO (Intel). Only model calls leave your machine if you use remote endpoints. Indexing, storage, and search always stay local.
Indexing the entire Vera codebase with ONNX CUDA on a RTX 4080 takes only about 8 seconds. CPU works too but is slower. After the first index, vera update . only re-embeds changed files, so incremental updates are near instant on GPU, and only a couple seconds on most CPUs.
Works as CLI or MCP
Use it as a normal CLI tool or as an MCP server (vera mcp). The MCP server exposes search_code, index_project, update_project, and get_stats tools. Docker images are available too (CPU, CUDA, ROCm, OpenVINO). For agent CLI usage, Vera ships with agent skills that tell your agent how to write effective queries and when to reach for tools like `rg` instead.
Benchmarks
I wanted to keep things grounded instead of making vague claims. All benchmark data, reproduction guides, and ablation studies are in the repo.
Comparison against other approaches on the same workload (v0.4.0, 17 tasks across ripgrep, flask, fastify):
| Metric | ripgrep | cocoindex-code | vector-only | Vera hybrid |
|---|---|---|---|---|
| Recall@5 | 0.2817 | 0.3730 | 0.4921 | 0.6961 |
| Recall@10 | 0.3651 | 0.5040 | 0.6627 | 0.7549 |
| MRR@10 | 0.2625 | 0.3517 | 0.2814 | 0.6009 |
| nDCG@10 | 0.2929 | 0.5206 | 0.7077 | 0.8008 |
Vera has improved a lot since that comparison. Here's v0.4.0 vs current on the same 21-task suite (ripgrep, flask, fastify, turborepo):
| Metric | v0.4.0 | v0.7.0+ |
|---|---|---|
| Recall@1 | 0.2421 | 0.7183 |
| Recall@5 | 0.5040 | 0.7778 (~54% improvement) |
| Recall@10 | 0.5159 | 0.8254 |
| MRR@10 | 0.5016 | 0.9095 |
| nDCG@10 | 0.4570 | 0.8361 (~83% improvement) |
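As a rough guide to reading these tables, the sketch below shows the usual per-query definitions of Recall@k and reciprocal rank (MRR is the mean of the latter over all tasks), plus the relative-improvement arithmetic behind the percentages. The exact definitions used in the benchmark harness are an assumption here, and the file names are made up.

```python
def recall_at_k(ranked, relevant, k):
    """Fraction of the relevant items that appear in the top k results."""
    if not relevant:
        return 0.0
    hits = sum(1 for item in ranked[:k] if item in relevant)
    return hits / len(relevant)

def reciprocal_rank_at_k(ranked, relevant, k):
    """1 / rank of the first relevant item in the top k, or 0 if none."""
    for rank, item in enumerate(ranked[:k], start=1):
        if item in relevant:
            return 1.0 / rank
    return 0.0

def relative_improvement(old, new):
    """Relative change from old to new, as a percentage."""
    return (new - old) / old * 100.0

# Hypothetical single query: two relevant files, first one found at rank 2.
ranked = ["a.rs", "b.rs", "c.rs", "d.rs"]
relevant = {"b.rs", "d.rs"}
print(recall_at_k(ranked, relevant, 2))
print(reciprocal_rank_at_k(ranked, relevant, 10))

# The percentages quoted in the table follow from the raw scores.
print(round(relative_improvement(0.5040, 0.7778)))  # Recall@5
print(round(relative_improvement(0.4570, 0.8361)))  # nDCG@10
```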
I see similar semantic search tools make crazy claims like 70-90% token usage reduction. I haven't benchmarked this myself so I won't throw around random numbers like that (honestly I think it would be very hard to benchmark deterministically), but the token savings are real. Tools like this help coding agents use their context window more effectively instead of burning it on bloated search results. Vera also defaults to token-efficient Markdown code blocks optimized for AI agent consumption, instead of verbose JSON, which cuts output size ~35-40%.
Install and usage
bunx @vera-ai/cli install   # or: npx -y @vera-ai/cli install / uvx vera-ai install
vera setup                  # downloads local models, auto-detects GPU
vera index .
vera search "authentication logic"
One command install, one command setup, done. Works as CLI or MCP server. You can install Vera's SKILL files to any project with vera agent install. The documentation on Github should cover anything else not covered here.
Other recent additions based on user requests:
- Docker support for MCP (CPU, CUDA, ROCm, OpenVINO images)
- vera doctor for diagnosing setup issues
- vera repair to re-fetch missing local assets
- vera upgrade to inspect and apply binary updates
- Auto update checks
A big thanks to my users in my Discord server, they've helped a lot with catching bugs, making suggestions and good ideas. Please feel free to join for support, requests, or just to chat about LLM and tools.
The perspective of a frustrated user who decided to test out Claude Code before my subscription expires after cancelling. I cancelled because Anthropic decided to ban OpenCode, but I decided to give Claude Code a try, as I have 18 days of Max 20x subscription left that I plan to use to accomplish a few tasks in my project.
Issues with Claude Code from my perspective:
- No easy Control + T to change reasoning effort;
- Less control over the agent, it automatically switches to plan mode;
- No LSP configured out-of-the-box, have to install and wire it up yourself;
- UI is definitely inferior, just basic dark/light theme and almost all texts are plain white, green blocks are like horrible to read, with comments on gray barely readable;
- I miss hotkeys from OpenCode like to "undo message" and it rapidly gives you the possibility to edit and resend;
- Less reasoning visibility; OpenCode definitely is more readable and verbose about streamed reasoning;
- Can't see what explorer subagents are doing, while in OpenCode we can fine-tune the best model for the task as well as see what each subagent is doing (even the general coding subagent);
- Scrolling is definitely inferior in Claude Code, but it is something you can get used to;
- Lack of compatibility is a pain, conventions are always great for everyone... This really tells you the kind of company Anthropic is, reminds me of Apple forcing their Lightning ports onto customers;
- No sidebar like OpenCode's that shows how many tokens your context window has consumed. All compactions happen automatically when the LLM wants, so experienced users who know how to manage that window (when to compact, when to write a plan file and execute with a fresh session) can't apply any of that; you just have to go with what the model wants;
- Alternating between accept edits, allow commands, and plan mode, all on that Shift+Tab cycle, is definitely confusing;
- In one session, I asked the agent to bring my Docker up, and I had a fresh OS, so it did "cd /etc", and then Claude Code got locked to that directory: when I typed @ to mention a file/folder, it offered system files from /etc... I have no idea if this is a bug or an intended feature, but I definitely don't want to mention files outside my project without my explicit consent;
- Model selection and alternation is terrible, why Opus 4.6 have to be selected by selecting Default? So I trust Default will be Opus 4.6 forever? While Claude Code tries to be more user-friendly, it becomes more obscure than anything;
- Despite Anthropic being very successful at building SOTA coding models, they have like 5 incidents in a normal day, and global outage on a bad day, so using OpenCode is overall having redundancy, to easily be able to alternate when Anthropic is under its regular incidents and "Elevated errors on Opus 4.6".
And a few I miss from OpenCode:
- Configure my own cheap/faster models to operate on explorer subagent, having more efficiency on token consumption on expensive models that I use to plan and code;
- Manually deploy my subagents, that I can configure, like a Code Reviewer subagent with specific model, reasoning effort and prompt suited to verify several requirements of my project;
- LSP installed and enabled by default, I see most Claude Code users don't know this yet, but this increases code quality, reduces errors and save you several tokens at the end of the day.
Well, those were my impressions after only a few hours using Claude Code. I did not test MCP/skills/fine-tuning config files, so I do know there is headroom to customize and improve my experience, but I'm just a frustrated OpenCode user who is leaving Anthropic until their market practices get better. I'd happily pay more for my Max 20x subscription to keep using OpenCode, but Anthropic is more interested in vendor-locking you into their shitty software that offers a worse experience.
I'm posting this as positive feedback; maybe Claude Code can improve, and maybe Anthropic can accept that the market cannot have only one player (idk, this hardcoded rule file is really annoying on a whole other level...).
PS: I know Max plans give Anthropic considerable losses, but this is part of them gaining market share, right? If I pay for Gmail and use the Outlook client, who cares? The issue would be if I used OpenClaw or something like it to automate requests and use it like an enterprise API. So we're not straying from the objective by using OpenCode, and clearly advanced users get better results and more control over the process with it. Let's hope some day Anthropic changes its mind on this matter. Farewell, Claude.
PS2: Tried to post this in Claude reddit, but ClaudeAI-mod-bot refused my post because "your post looks like its designed to be disruptive". Nice censorship.
When using opencode, how can I paste images in PowerShell? Can Ctrl+V only paste text?
I have some agents with very strict rules, such as "Expert" and "Single-Orchestrator," for example.
I have general rules in AGENTS.md, specific rules for these agents in their respective .md files, and I also have a plugin that injects a reminder so these agents don't forget who they are and what they should follow.
It works perfectly for GLM-5-Turbo and Kimi K2.5, but MiniMax M2.7 simply IGNORES everything. It refuses to follow the rules.
It's practically impossible to "automate" this rule-following behavior for MiniMax M2.7.
When I tell it in the prompt what it should do and follow, it usually respects it, but when I don't, when I leave it to the agent's own rules and the plugin, it completely ignores it. It's as if it doesn't even work, even though it works perfectly for the other two models.
This makes it almost impossible to use MiniMax M2.7 in my case.
Has anyone else noticed this behavior?
I'm using MiniMax M2.7 via the MiniMax Token Plan, GLM-5-Turbo via the ZAi Coding Plan and Kimi K2.5 via OpenCode Go.
Updated openCode today and realized we have a new model to test with. Been using it today. Surprised when I popped on here that no one has talked about it yet. What do you guys think?
So far, my opinion is that it is... weird, but incredible. Compared to other models, it's like speaking in a different language. But holy crap, does it ever work well. I have never had a more complete project code analysis from a single prompt. It seems to be doing a far more complete job of everything I ask it, even with vague prompts. But it does need to be guard-railed pretty heavily so it doesn't fly off the handle and start doing whatever it wants. It also seemingly outright ignores the plan and build modes, so I've had to give it the exact permission error it will get in plan mode so it knows what's going on and doesn't loop through trying over and over again to write to files in plan mode.
But in the testing project I've had going, purely to test new models, it has basically completed it in about 4 hours. I've had this project rolling between models for months. I'm going to have to start a new project to test with now.
hello everyone, I'm a complete noob in agentic vibecoding and AI IDE. and I'm curious is there any way to make several models work in one team at the same time? like in a way that grok 4.20 works, but using different models like mimo pro for writing code, nemotron super for data analysis, Gemini for planning and Claude for proofreading?
I'm using a Xiaomi MiMo V2 Pro via OpenRouter with OpenCode, and for some reason, the reasoning tokens in the UI sometimes seem to keep repeating. Or, if I hide them, it just keeps “thinking” indefinitely. I haven't experienced this very often with other models. Do you know what's causing this?
It's really great for front-end/UI work. But it doesn't seem to run reliably here :(. Once this behavior occurs during a session, you basically have to restart, because it keeps getting stuck in an endless loop.
How can i fix this? it cant even run bun?
The /btw feature at claude is a really great feature, it makes the session interactive all along, where you aren't dependent on the task longevity.
I can safely assume that the devs of opencode are aware of this feature and could add it in no time.
I wonder why they haven't so far.
I can only hope that it's not because of ego, and the wish to create something different from other coding agents.
Which leads me to the underlying point: the purpose of open source projects is to adopt the BEST FEATURES and bring them to the public, even if it means copying a feature. That's OK! Open source projects should synthesize the best available features from commercial and other open source tools to produce the best value for the public. Copying is one of the best ways to do so!
Hi
I'm trying to run ollama locally, to use with opencode. To try and get it running I've been using Gemini for help, because opencode won't connect to what's running locally.
Everything should be running fine in regards to the model.
Gemini wants me to make a opencode/opencode.json file containing this:
{
  "$schema": "https://opencode.ai/config.json",
  "provider": {
    "ollama": {
      "npm": "@ai-sdk/openai-compatible",
      "name": "Ollama (Local)",
      "options": {
        "baseURL": "http://127.0.0.1:11434/v1"
      },
      "models": {
        "qwen2.5-coder:14b": {
          "name": "Qwen 14B"
        }
      }
    }
  },
  "model": "ollama/qwen2.5-coder:14b"
}
But it doesn't let me see the local model. I've been trying for almost a day now, going back and forth with nukes, new JSON files and so on.
anyone had a successful installation of opencode with ollama local on an intelcard on arch (omarchy)?
Provider APIs
APIs run by the companies that train or fine-tune the models themselves.
Google Gemini 🇺🇸 - Gemini 2.5 Pro, Flash, Flash-Lite +4 more. 5-15 RPM, 100-1K RPD.
Cohere 🇺🇸 - Command A, Command R+, Aya Expanse 32B +9 more. 20 RPM, 1K/mo.
Mistral AI 🇪🇺 - Mistral Large 3, Small 3.1, Ministral 8B +3 more. 1 req/s, 1B tok/mo.
Zhipu AI 🇨🇳 - GLM-4.7-Flash, GLM-4.5-Flash, GLM-4.6V-Flash. Limits undocumented.
Inference providers
Third-party platforms that host open-weight models from various sources.
GitHub Models 🇺🇸 - GPT-4o, Llama 3.3 70B, DeepSeek-R1 +more. 10-15 RPM, 50-150 RPD.
NVIDIA NIM 🇺🇸 - Llama 3.3 70B, Mistral Large, Qwen3 235B +more. 40 RPM.
Groq 🇺🇸 - Llama 3.3 70B, Llama 4 Scout, Kimi K2 +17 more. 30 RPM, 14,400 RPD.
Cerebras 🇺🇸 - Llama 3.3 70B, Qwen3 235B, GPT-OSS-120B +3 more. 30 RPM, 14,400 RPD.
Cloudflare Workers AI 🇺🇸 - Llama 3.3 70B, Qwen QwQ 32B +47 more. 10K neurons/day.
LLM7 🇬🇧 - DeepSeek R1, Flash-Lite, Qwen2.5 Coder +27 more. 30 RPM (120 with token).
Kluster AI 🇺🇸 - DeepSeek-R1, Llama 4 Maverick, Qwen3-235B +2 more. Limits undocumented.
OpenRouter 🇺🇸 - DeepSeek R1, Llama 3.3 70B, GPT-OSS-120B +29 more. 20 RPM, 50 RPD.
Hugging Face 🇺🇸 - Llama 3.3 70B, Qwen2.5 72B, Mistral 7B +many more. $0.10/mo in free credits.
Edit: If anyone is looking for a paid API for OpenClaw, MiniMax starts at $10 a month:
Hi,
I'm actually a Mac user so have not had an issue with opencode, but my office has Windows machines that are fairly locked down, and attempting to get Opencode using "npm -g install opencode-ai" will install but won't run due to a "is not digitally signed. You cannot run this script on the current system" error.
Usually "disable security" is not an option, so what can be done here? I would have actually thought OpenCode would have been signed.
A unified desktop application for browsing conversation histories from multiple AI coding assistants — **Claude Code**, **Codex**, **Gemini CLI**, and **OpenCode** — all in one place.
"show me ASCII wireframes before you write the component file" seems to work wonders, makes it easy to see, and add an extra thinking step before writing.
I feel kind of stupid for taking so long to realize this.
Basically the title.
I love the harness, use it daily, and try to stay up to date on the changes each version pushes out (which can be a lot with the OC PR velocity 😂).
This weekend saw versions v1.3.4-v1.3.7, and had a lot of updates in them. I'd love to be able to know more about each change; the why, how to use, and benefit each has for users would be great.
If the changes added are from an issue/pull request, the change log notes usually link to those and I could read more about why the change was requested and use context information to figure out the rest. However, there were only 3-5 of those in this weekend's updates.
I don't know where to go to get that information, if it even exists. I'm part of the Discord, this sub (obvs), and the other OC sub. Does anyone know of any place to glean that info? For example, a blog, dev notes, etc.?
Maybe if one of the maintainers see this, a neat feature to add with PR's would be to have whatever bot drafts the notes also draft why they're important to users. 🤷♂️ Usually software version changes would have some footnotes on why this change was important/etc.
As an example of what led me to this post, here are some cherry-picked entries from the change log notes that are ambiguous and that I would love to know more about:
Add prompt slot feature
Add single target plugin entrypoints
Effectify Plugin service internals
Effectify Skill service internals
Add TUI plugins support
Add model variant selection dialog
TIA!
This is an excerpt from the official docs:
OpenCode Go includes the following limits:
- 5 hour limit: $4 of usage
- Weekly limit: $10 of usage
- Monthly limit: $20 of usage
In terms of tokens, $20 of usage is roughly equivalent to:
- 69 million GLM-5 tokens
- 121 million Kimi K2.5 tokens
- 328 million MiniMax M2.5 tokens
Below are the prices per 1M tokens.
| Model | Input | Output | Cached Read |
|---|---|---|---|
| GLM-5 | $1.00 | $3.20 | $0.20 |
| Kimi K2.5 | $0.60 | $3.00 | $0.10 |
| MiniMax M2.5 | $0.30 | $1.20 | $0.03 |
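To get a feel for how quickly the dollar limits are reached, the per-million prices above can be turned into a small cost calculator. This is only a sketch: the function name and the sample session are made up, and it assumes cached reads are billed separately from fresh input tokens. The "millions of tokens" figures quoted above depend on a mix of input, output, and cached tokens that the excerpt does not state, so they are not reproduced here.

```python
# Prices in USD per 1M tokens, copied from the table above.
PRICES_PER_MILLION = {
    "GLM-5": {"input": 1.00, "output": 3.20, "cached": 0.20},
    "Kimi K2.5": {"input": 0.60, "output": 3.00, "cached": 0.10},
    "MiniMax M2.5": {"input": 0.30, "output": 1.20, "cached": 0.03},
}

def usage_cost(model, input_tokens, output_tokens, cached_tokens=0):
    """Dollar cost of one session, assuming cached reads are billed separately."""
    price = PRICES_PER_MILLION[model]
    total = (
        input_tokens * price["input"]
        + output_tokens * price["output"]
        + cached_tokens * price["cached"]
    )
    return total / 1_000_000

# Hypothetical session: 2M fresh input, 0.5M output, 10M cached-read tokens.
cost = usage_cost("GLM-5", 2_000_000, 500_000, 10_000_000)
print(f"${cost:.2f}")
```

Under these assumptions that single GLM-5 session would already exceed the $4 five-hour limit, while the same token counts on MiniMax M2.5 would stay well below it.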
One important thing to note is the chart inside the zen page lists glm-5, kimi k2.5 and minimax m2.5 with a (lite) suffix. The suffix is not explained anywhere yet.
I keep getting "Provider is overloaded..." using Claude models, but when I open Claude Code, everything works normally.
Anyone know what's going on? Maybe some kind of traffic preferencing from Anthropic?
I'm struggling to get opencode to commit and push at the end of a prompt where files have been changed. I have included the instruction in both the AGENTS.md in the repository and in the config file opencode.json. In opencode.json I have tried different permission settings, but even "permission":"allow" fails to allow automatic git commit and push. Any suggestions?
In Codex I only had to include it in AGENTS.md.
edit:spelling
I feel the time is coming for us developers to choose our AI tools. Most are going for Claude Code, but I don't trust companies that isolate themselves in a bubble. They may be better now, but AI is a commodity; soon some Chinese company will be doing the same thing for half the price.
So I've been thinking about what to adopt for my AI workflow and I thought about Open Code. I want a place where I have the freedom to easily switch providers, but also a place with an interesting vision to facilitate our workflow.
Is Open Code the most solid option currently?
Hey all — not trying to self promote, but I use my iPad pro for development and needed a native opencode client to connect to my opencode server so I made one. If anyone is interested:
Basically the title. One tool I would like is the browser control tool, which allows Copilot to launch and control VS Code's browser window. I know we can add a web MCP to achieve that, but is there an easier way of just porting those tools over?
My use case: purely terminal-based coding with OpenCode, no IDE. I don't care about the ChatGPT web UI or Copilot's VS Code integration. Just want the best GPT-5.4 throughput for the money in OpenCode. I don't use Codex cli, Deep Research, Sora any of that either.
I keep getting this error
JSON Schema not supported: could not understand the instance `{'items': {'type': 'string'}}`.
is it just me? I used the opencode provider login with the api key.
How is the context abstracted between primary agents and subagents?
Does a subagent get the whole context of the primary agent in its prompt? I think yes.
After a subagent completes its task, is the whole context of that task added to the primary agent's context? It must be yes, or at least a summary of what was done.
If multiple subagents are used in a single run of the primary agent, does that mean the primary agent at some point loses its own prompt?
I am not counting chat summarization used to manage context as the prompt being present in the context.
Or does opencode have a mechanism to detect this and reinject the prompt?
Robust functioning tooling is as important as the coding ability of the llms.
Are there any LLMs that are fine-tuned specifically for opencode? Otherwise it's a great opportunity for open source LLM providers :)
Which local llms do you find most robust? Does vllm vs ollama matter?
Thanks, I'm new with Opencode
Hi everyone,
Over the past month our team has been experimenting with something related to AI agents and data infrastructure.
As many of you are probably experiencing, the ecosystem around agentic systems is moving very quickly. There’s a lot of work happening around models, orchestration frameworks, and agent architectures. Many times though, agents struggle to access reliable structured data.
In practice, a lot of agent workflows end up looking like this:
- Search for a dataset or API
- Read documentation
- Try to understand the structure
- Write a script to query it
- Clean the result
- Finally run the analysis
For agents this often becomes fragile or leads to hallucinated answers if the data layer isn’t clear, so we started experimenting with something we’re calling BotMarket.
The idea is to develop a place where AI agents can directly access structured datasets that are already organized and documented for programmatic use. Right now the datasets are mostly trade and economic data (coming from the work we’ve done with the Observatory of Economic Complexity), but the longer-term idea is to expand into other domains as well.
To be very clear: this is still early territory. We’re sharing it here because I figured communities like this one are probably the people most likely to break it, critique it, and point out what we’re missing.
If you’re building with:
• LangChain
• CrewAI
• OpenAI Agents
• local LLM agents
• data pipelines that involve LLM reasoning
we’d genuinely love to hear what you think about this tool. You can try it here
We also opened a small Discord where we’re discussing ideas and collecting feedback from people experimenting with agents:
If you decide to check it out, we’d love to hear:
• what works
• what datasets would be most useful
Thanks for reading! and genuinely curious to hear how people here are thinking about this and our approach.
Fuck Anthropic
❤️Opencode
Is there a way to have opencode on wsl and make devtool mcp interact with my windows chrome?
- First of all, it hardcodes the use of plan mode. Meaning if it's a simple ask task, where it doesn't need to write anything, it will still create a plan (which is acceptable) but then goes on to switch (forced behaviour) to build mode to execute it. It shouldn't. It should stay in plan (read only) and execute the plan.
It should be aware of when to use the plan file, when to use exitPlanMode, etc., not have it as a default hardcoded behaviour.
There is something off in the implementation; it's definitely not the best way to implement it.
I know it's labelled "experimental" but it should be at least workable.
Any other solutions/plugins/pr which solve this issue properly. (Not a hack, so that it doesn't mess up in edge cases.)
I posted about the gibberish output from z.ai's coding plan when using GLM 5 and mentioned the issue arises as context exceeds ~80k tokens.
After experiencing it multiple times today, it seems to be triggering not at 80k but almost immediately after exceeding 100k.
Work-Around: Set your harness to auto-compact below that. I've been using 95k all day without any issues.
In OpenCode it's particularly easy - in opencode.json, simply add this:
"zai-coding-plan": {
  "models": {
    "glm-5": {
      "limit": {
        "context": 95000,
        "output": 8192
      }
    }
  }
},
...other harnesses will have their own methods.
Since adding the above, I get the expected "Compaction" prompt before issues can arise. It's worked fine all day for me after many extremely long conversations.
Side-Effects: This is not a solution but a workaround, because smaller contexts are a pain for other reasons. An example I ran into a few times today: a tool call fails, GLM auto-corrects the call, 'remembers' that what's required for it to work the next time - but that nuance gets lost after auto-compacting and it wastes time/tokens re-learning again post-compact.
The Actual Solution: is for z.ai to kindly fix their API issues (which were introduced with their post-new year "" communication, which sped GLM 5 up but introduced this issue at the same time.)
Another alternative I guess would be other GLM providers: we know it's not an underlying model issue because the first months post-launch, GLM 5 via this same provider was flawless (albeit slow) up until >180k context-sizes.
HTH.
I've tried to put together an .opencode/ directory in a separate repo, that integrates the definition of my agents and my context, leveraging the OAC framework.
The repository has the following structure:
.opencode/ (1.8 MB)
├── agent/
│   ├── core/           # OAC core agents
│   ├── subagents/      # Specialized sub-agents
│   │   ├── core/       # ContextScout, TaskManager, etc.
│   │   └── code/       # CoderAgent, CodeReviewer, etc.
│   ├── context/
├── skills/
...
README.md
My goal is to be able to include this repo as a submodule into other projects so that any updates to the subagents or the skills are going to be also available across the projects.
What is the best way to proceed here?
Just noticed that OpenCode currently has the Go subscription for 5$ for the first month.
Hello, I started using opencode a couple of months back and have been using it consistently for the past few days. Today I tried using my normal gemini 3.1 pro preview model and it threw: "This service has been disabled in this account for violation of Terms of Service. Please submit an appeal to continue using this product." It is the same for all of the anti-gravity models. Can someone suggest a fix? Any help is appreciated. Thank you.
TL;DR: Using /review, the agent produced incorrect review comments. How do I best make it "learn"?
The scenario: mistaking {$} for a regex pattern.
The code under review was Go web code. The output of the review contained the following:
2. Invalid URL pattern in exercise-handler.go (internal/web/exercise-handler.go:36)
- Uses "GET /{$}" which doesn't work - http.ServeMux doesn't support regex patterns
- Will cause routing failures
Go's standard library Mux matches /{$} to the root path only, whereas / matches the root and all subpaths. (Mux corresponds to what is normally called a router in other languages/libraries.)
I suspect this knowledge would be a good candidate for a skill, specifically a "Go web frontend skill".
- How to best write such a skill?
  - Inspired by a LinkedIn post by Chris Jenkins, I instructed the agent to update its own knowledge.
  - Some argue that you need to keep the context small, and even agent.md, skills, and agents may add too much context.
- How do I actually ensure that this skill is used when appropriate?
  - E.g., this is relevant for Go's built-in Mux - if I chose to replace it with a third-party one, it would no longer be relevant.
What I did
I gave it a shot with the following prompt:
Critical issue nr. 2 is invalid. Go's http.ServeMux matches the pattern `/{$}` with the root path itself, where the pattern `/` matches all subpaths. Create a skill for this
And it produced:
---
name: HttpServeMuxPatterns
description: Knowledge about http.ServeMux URL pattern matching behavior
type: knowledge
---
## Pattern Syntax
http.ServeMux uses a simple path-based matching system with specific pattern semantics:
### Pattern Details
- `/{$}` - Match the root path, similar to `/`
- `/` - Match all subpaths
- `/path/to/resource` - Exact path match
### Key Behavior
`http.ServeMux` matches the pattern `/{$}` with the root path itself. The pattern `/` is equivalent and also matches all paths under the root.
### Practical Usage
Both patterns are valid and interchangeable:
- `mux.HandleFunc("GET /{$}", handler)` ✓
- `mux.HandleFunc("GET /", handler)` ✓
- `mux.HandleFunc("GET /path", handler)` ✓
Both patterns work correctly and serve the same routing behavior.
### Common Mistake
Misunderstanding the difference between:
- URL patterns as used by http.ServeMux (not regex)
- Actual URL path components
The slash is a separator, not a wildcard character like `*` or `+`.
The problem is that choosing a provider is actually really hard. You end up digging through tons of Reddit threads trying to find real user experiences with each provider.
I used antigravity-oauth and was perfectly happy with it but recently Google has started actively banning accounts for that, so it’s no longer an option.
The main issue for me ofc is budget. It’s pretty limited when it comes to subscriptions. I can afford to spend around $20.
I’ve already looked into a lot of options. Here’s what I’ve managed to gather so far:
- Alibaba - very cheap. On paper the models look great, limits are huge and support seems solid. But there are a lot of negative reports. The models are quantized which causes issues in agent workflows (they tend to get stuck in loops), and overall they seem noticeably less capable than the original providers.
- Antigravity - former “best value for money” provider. As I mentioned earlier if you use it via the OC plugin now you can quickly get your account restricted for violating the ToS.
- Chutes - also a former “best value for money” option. They changed their subscription terms and the quality of service dropped significantly. Models run very slowly and connection drops are frequent.
- NanoGPT - I couldn’t find much solid information. One known issue is that they’ve stopped allowing new users to subscribe. From what I understand it’s a decent provider with a large selection of models including Chinese ones.
- Synthetic - basically the same situation as Chutes: prices went up, limits went down. Not really worth it anymore.
- OpenRouter - still a solid provider. PAYG pricing, very transparent costs, and reliable service. Works well as a backup provider if you hit the limits with your main one.
- Claude - expensive. Unless you’re planning to use CC, it doesn’t really make sense. Personally Anthropic feels like an antagonist to me. Their policies, actions, and some statements from their CEO really put me off. The whole information environment around them feels kind of messy. That said the models themselves are genuinely very good.
- Copilot - maybe the new “best value for money”? Hard to say. Their request accounting is a bit strange. Many people report that every tool call counts as a separate request which causes you to hit limits very quickly when using agent workflows. Otherwise it’s actually very good. For a standard subscription you get access to all the latest US models. Unfortunately there are no Chinese models available.
- Codex - currently a very strong option. The new GPT models are good both for coding and planning. Standard pricing, large limits (especially right now). However, there isn’t much information about real-world usage with OC.
- Chinese models - z.AI (GLM), Kimi, MiniMax. The situation here is very mixed. Some people are very happy, others are not. Most of the complaints are about data security and model quantization by various providers. Personally I like Chinese models, but it’s true that because of their size many providers quantize them heavily, sometimes to the point of basically “lobotomizing” the model.
So that’s as far as my research got. Now to the actual point of the post lol.
Why am I posting this? I still haven’t decided which provider to choose. I enjoy working on pet projects in OC. After spending the whole day writing code at work, the last thing you want when you get home is to sit down and write more code. But I still want to keep building projects, so I’ve found agent-based programming extremely helpful. The downside is that it burns through a huge amount of tokens/requests/money.
For work tasks I never hit any limits. I have a team subscription to Claude (basically the Pro plan), and I’ve never once hit the limit when using it strictly for work.
So I’d like to ask you to share your experience, setups, and general recommendations for agent-driven development in OC. I’d really appreciate detailed responses. Thanks!
Hi, I have noticed that if I ask something within Ollama directly the response is almost instant, but when I use opencode it takes a while until I get something.
Does this happen to anyone else? Thanks
Hi, I wonder if it is possible to include num_ctx 32768 for the context length within the config.json.
Also, what is the "output" parameter doing here?
{
  "$schema": "https://opencode.ai/config.json",
  "provider": {
    "ollama": {
      "npm": "@ai-sdk/openai-compatible",
      "name": "Ollama (local)",
      "options": {
        "baseURL": "http://localhost:11434/v1"
      },
      "models": {
        "qwen2.5-coder:7b-16k": {
          "name": "qwen2.5-coder:7b"
        },
        "qwen3.5:4b": {
          "name": "qwen3.5:4b",
          "limit": {
            "context": 32768,
            "output": 32768
          }
        }
      }
    }
  }
}
I have an LLM that I would like to connect to but I can't find it on the list of providers...
The Opencode Go plan generates better output and fewer mistakes than the Kimi coding plan. The usage allowance is good enough, though not as much as the Kimi coding plan provides. If the monthly limit were simply removed from Opencode Go for $15/month, I'd use it way more often, but for now it is just for emergencies.
What do you guys think?
Hey all,
The more I research existing opencode setups, skills, agents, tools, plugins, etc., the more I feel like this is a very overwhelming world. Has anyone invested in a formal, or at least more structured, eval framework? I think it would be immensely valuable.
I imagine the existing model-based benchmarks could theoretically be used, but I was hoping to have something where I can throw a particular setup at a problem and have it output a solution so that I could then compare the solution quality + time it took + tokens required etc etc