Today we’re announcing Ternary Bonsai: top intelligence at 1.58 bits. Using ternary weights {-1, 0, +1}, we built a family of models that are 9x smaller than their 16-bit counterparts while outperforming most models in their respective parameter classes on standard benchmarks. For example, Ternary Bonsai 4B scores roughly 8 points higher on average across benchmarks than the 1-bit Bonsai 4B, with a memory footprint only about 300 MB larger. For many deployments, this offers a new balance of capability and deployment efficiency. We’re open-sourcing the models under the Apache 2.0 license in three sizes: 8B (1.75 GB), 4B (0.86 GB), and 1.7B (0.37 GB). Our earlier 1-bit Bonsai models established a new Pareto frontier. Ternary Bonsai pushes that frontier further.
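As a rough sanity check on the numbers above, the sketch below works out the storage per weight implied by the quoted file sizes and compares it with the theoretical cost of a ternary weight, log2(3) ≈ 1.585 bits. It assumes the model names (8B, 4B, 1.7B) are the true parameter counts and that 1 GB means 10^9 bytes; neither is stated in the post, and the function names are illustrative only.

```python
import math

# Information-theoretic cost of one weight drawn from {-1, 0, +1}.
BITS_PER_TERNARY_WEIGHT = math.log2(3)  # about 1.585 bits

# Sizes quoted in the announcement, keyed by nominal parameter count (billions).
# Assumption: nominal counts are exact and 1 GB = 1e9 bytes.
QUOTED_SIZES_GB = {8.0: 1.75, 4.0: 0.86, 1.7: 0.37}


def effective_bits_per_weight(params_billion: float, size_gb: float) -> float:
    """Average storage per parameter implied by a model's file size."""
    return size_gb * 8 / params_billion


def compression_vs_16bit(params_billion: float, size_gb: float) -> float:
    """Size ratio against the same model stored at 2 bytes per parameter."""
    size_16bit_gb = params_billion * 2
    return size_16bit_gb / size_gb


if __name__ == "__main__":
    print(f"ternary lower bound: {BITS_PER_TERNARY_WEIGHT:.3f} bits/weight")
    for params, size in QUOTED_SIZES_GB.items():
        bits = effective_bits_per_weight(params, size)
        ratio = compression_vs_16bit(params, size)
        print(f"{params}B: {bits:.2f} bits/weight, {ratio:.1f}x smaller than 16-bit")
```

Under these assumptions, each size works out to a little over 1.7 bits per weight and a bit more than 9x compression, in line with the "9x smaller" figure. The gap above 1.585 bits would be consistent with per-group scale factors or some layers kept at higher precision, but the post does not break this down.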
About us
Concentrating intelligence.
- Website
- https://prismml.com
- Industry
- Information Services
- Company size
- 11-50 employees
- Type
- Privately Held
Updates
- Running 1-bit Bonsai 8B locally on an M4 Pro alongside a standard 16-bit 8B model. Same class of model, but a very different deployment profile. What stands out isn’t just performance, but intelligence density. By compressing the model down to 1 bit, you unlock:
  1. Dramatically lower memory usage
  2. Much higher throughput
  3. Real viability for local, on-device use
  This is the shift that matters. The next leap in AI won’t just come from scaling parameters; it will come from making models smaller, faster, and more efficient. When that happens, entirely new use cases open up: real-time systems, edge deployment, and products that simply weren’t practical before.
- Today, PrismML is emerging from stealth. We are an AI lab with Caltech origins, centered on building the most concentrated form of intelligence. At PrismML, we believe that the next major leaps in AI will be driven by order-of-magnitude improvements in intelligence density, not just sheer parameter count. Our first proof point is 1-bit Bonsai 8B, a 1-bit model that fits into 1.15 GB of memory and delivers over 10x the intelligence density of its full-precision counterparts. It is 14x smaller, 8x faster, and 5x more energy efficient on edge hardware while remaining competitive with other models in its parameter class. We are open-sourcing the model under the Apache 2.0 license, along with 1-bit Bonsai 4B and 1.7B. When advanced models become small, fast, and efficient enough to run locally, the design space for AI changes immediately. We believe in a future of on-device agents, real-time robotics, offline intelligence, and entirely new products that were previously impossible. This is the future we’re building toward. We’re fortunate to be backed by Khosla Ventures, Cerberus Ventures, and Caltech, and supported by Google.