if you are running local ai or thinking about starting, the one piece of advice i would give you is this: choose your agentic harness carefully. it matters more than the model.

i have lost count of how many people have dm'd me saying their local model is "dumb" or "broken" or "not as good as the cloud one." then they switch from openclaw or some other bloated framework to hermes agent and the same model suddenly works. just clean tool calls and the agent doing the thing it was supposed to do.

hermes agent is the best general-purpose agent i have used in 2026. it drives my single 3090 with qwen 3.6 27b dense q4 and my dgx spark with nemotron omni q8, and the same harness handles coding, research, video editing, automation, anything you point it at. it comes packed with skills out of the box (browser tools, code, github, jupyter, multimodal, more than i have used yet), plus full tool calling that holds across long sessions, persistent memory, and sub-agents.

if you tried local ai once or twice and gave up because it felt half-baked, the issue might not have been the model. it might have been the harness wrapping it. swap the harness, run the same model again, and watch what changes.

hermes agent is the one i recommend to everyone running local, and especially to anyone who almost gave up on it.
Quote
Sudo su
@sudoingX
most of you don't know how big a deal it is that a single rtx 3090 from 2020 runs qwen 27b dense q4 with 256k context at 40 tok/s, full agentic loops on hermes agent, zero tool call failures. the more i build on this card the more i think nobody really knows how untapped it