PrismML

82 posts
@PrismML
Centering AI research on efficiency. discord.gg/prismml

PrismML’s posts

Pinned
Today, we are emerging from stealth and launching PrismML, an AI lab with Caltech origins that is centered on building the most concentrated form of intelligence. At PrismML, we believe that the next major leaps in AI will be driven by order-of-magnitude improvements in
We’re looking for people who love building from scratch! DM if interested.
Quote
Omead Pooladzandi
@HessianFree
Hey! @PrismML is hiring! We're looking for LLM people who have trained models at scale - SFT/RL, data mixtures, evals, distillation, long context, distributed training, kernels, you name it! Especially interested in people who like owning the full stack from training dynamics
My job is literally to do interesting and crazy experiments. If you are into that sort of thing, DM Omead or me and come build with us!
We’re expanding our highly technical team with people who love pushing model quality end-to-end, from training dynamics to shipped models. If you’ve scaled LLM training, RL/SFT, evals, distillation, long context, kernels, or infra, we’d love to talk.
Hey! @PrismML is hiring! We're looking for LLM people who have trained models at scale - SFT/RL, data mixtures, evals, distillation, long context, distributed training, kernels, you name it! Especially interested in people who like owning the full stack from training dynamics
Training gets the headlines. Inference gets the bill. As agents move from novelty to default workload, the hard problem isn't the model anymore. It's every millisecond and every watt between a prompt and the next token. A coding agent running for six hours straight is a very
Huge shoutout to This shouldn’t be possible: a tiny model punching way above its weight. The largest version is just 1.14 GB, which means it’s small enough for a phone. Fast on a phone (spoiler: Pico for iOS is coming soon!). Insanely fast on a MacBook Pro M1 Max.
Quote
Pico AI Server and Pico AI Studio
@PicoGPT
Replying to @PicoGPT
How fast you ask? About 109 tokens per second on an M1 Max MacBook Pro fast.
Tiny local models like Bonsai are going to change things. For the last three years, the default way most people used AI was simple: frontier models lived in data centres, you reached them through an API, and anything local felt like a toy. That will probably stop being true in
Quote
PrismML
@PrismML
Today we’re announcing Ternary Bonsai: Top intelligence at 1.58 bits Using ternary weights {-1, 0, +1}, we built a family of models that are 9x smaller than their 16-bit counterparts while outperforming most models in their respective parameter classes on standard benchmarks.
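The "1.58 bits" figure in the announcement matches the information content of a three-valued weight, log2(3) ≈ 1.585. The sketch below works through that arithmetic and shows one generic way to store ternary values compactly (five weights per byte, since 3^5 = 243 fits in 256). The function names and the packing scheme are illustrative assumptions, not PrismML's actual storage format.

```python
import math

# Information content of one ternary weight {-1, 0, +1}: log2(3) bits.
BITS_PER_TERNARY_WEIGHT = math.log2(3)


def ideal_size_ratio(baseline_bits: float = 16.0) -> float:
    """Ideal shrink factor versus a dense format with `baseline_bits` per weight."""
    return baseline_bits / BITS_PER_TERNARY_WEIGHT


def pack_five(trits: list[int]) -> int:
    """Pack five ternary weights into one byte value (3**5 = 243 <= 256)."""
    if len(trits) != 5 or any(t not in (-1, 0, 1) for t in trits):
        raise ValueError("expected exactly five values from {-1, 0, +1}")
    # Shift each weight to {0, 1, 2} and treat the group as a base-3 number.
    return sum((t + 1) * 3**i for i, t in enumerate(trits))


def unpack_five(byte: int) -> list[int]:
    """Invert pack_five: recover the five ternary weights from one byte value."""
    trits = []
    for _ in range(5):
        trits.append(byte % 3 - 1)
        byte //= 3
    return trits


if __name__ == "__main__":
    print(f"bits per ternary weight: {BITS_PER_TERNARY_WEIGHT:.3f}")
    print(f"ideal ratio vs 16-bit:   {ideal_size_ratio():.2f}x")
    weights = [-1, 0, 1, 1, -1]
    assert unpack_five(pack_five(weights)) == weights
```

The ideal ratio against 16-bit weights is a little over 10x. Packing five weights per byte already costs 1.6 bits each, and practical formats usually carry extra data such as scale factors, which may explain why the announcement quotes roughly 9x and not 10x.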
Pico Local AI Server 1.4.21 is now available on the Mac App Store. This release adds support for Ternary Bonsai, a lightning-fast model that outperforms many much larger models
Ternary is actually surprisingly powerful. Validated by BitNet and now again here. In the new model training research/experimentation I've been working on, ternary weights (in some places) actually beat bf16 (by a not-insignificant amount), at least up to the 7B scale (and
Interesting work here 👇
Ternary Bonsai 8B is within 5% of Qwen 3 8B at 9x lower memory! Congratulations on yet another exciting release!
One of the things I tried researching but found really hard. 1.58 bpw is insane: 10x smaller than the original. I hope they push it to much larger models.
The models are getting smaller. Great for OpenClaws and Hermes. Gotta heat them up! Yesterday someone told me "phones are three to five years away." Oh, really?
Ternary Bonsai: state-of-the-art intelligence at 1.58 bits. The models are so small they can even run locally in your browser on WebGPU! ⚡️ Here's the 8B version (just ~2GB in size) running at 60 tokens per second on my M4 Max. Try the demo out yourself! 👇
People are the most valuable resource in action.
Quote
Omead Pooladzandi
@HessianFree
> > anon asked for one more state
 > > we added zero
 > > +600 MB
 > > +5 benchmark points
 > > 75.5 avg at 1.75 GB
 > > still ~1/9 the size of Qwen3 8B
 > > shout out brahmagupta
 > > zero mattered x.com/PrismML/status…
Thanks and the team!
Quote
merve
@mervenoyann
new open-source Bonsai models are out 🔥 > ternary weights in 8B (1.75 GB), 4B (0.86 GB), and 1.7B (0.37 GB) > comes in MLX, ONNX weights and WebGPU browser demo 😍 > a2.0 licensed 👏 x.com/PrismML/status…
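As a rough check on the sizes quoted above, the sketch below converts each reported file size into effective bits per weight and into a shrink factor against a 16-bit model. It assumes 1 GB = 10^9 bytes and treats the nominal parameter counts (8B, 4B, 1.7B) as exact, which the real checkpoints may not match. The helper names are hypothetical.

```python
# Sizes (GB) and nominal parameter counts (billions) as quoted in the post.
# Assumption: 1 GB = 1e9 bytes and the nominal counts are exact.
REPORTED = {
    "8B": (1.75, 8.0),
    "4B": (0.86, 4.0),
    "1.7B": (0.37, 1.7),
}


def effective_bits_per_weight(size_gb: float, params_billion: float) -> float:
    """Total bits in the file divided by the number of weights."""
    return size_gb * 8.0 / params_billion


def ratio_vs_16bit(size_gb: float, params_billion: float) -> float:
    """How many times smaller the file is than 16 bits per weight."""
    return 16.0 / effective_bits_per_weight(size_gb, params_billion)


if __name__ == "__main__":
    for name, (size_gb, params_b) in REPORTED.items():
        bpw = effective_bits_per_weight(size_gb, params_b)
        ratio = ratio_vs_16bit(size_gb, params_b)
        print(f"{name}: {bpw:.2f} bits/weight, {ratio:.1f}x smaller than 16-bit")
```

Under these assumptions all three models land slightly above the 1.58-bit ideal, which is in line with the "9x smaller" and "~1/9 the size" figures quoted in the feed.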