Alibaba’s Qwen3.5 Small Models: 0.8B Parameters Can Process Video, The Edge AI Era Has Arrived
Alibaba’s Tongyi Qianwen has released the Qwen3.5 series of four small models: 0.8B, 2B, 4B, and 9B. The core innovation of this model family lies in the Gated DeltaNet hybrid attention mechanism — a technology borrowed from their 397B large model.
This architecture interleaves three linear attention layers with every full attention layer. The linear layers handle routine computation with a fixed-size recurrent state, so their memory cost stays constant no matter how long the sequence grows, while the full attention layers, whose key-value cache grows with every token, are reserved for steps that need precise global recall. This 3:1 ratio lets the models maintain high quality while keeping memory growth in check, enabling even the 0.8B model to support a 262,000-token context window.
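To make the memory argument concrete, here is a minimal sketch of how a 3:1 linear-to-full layout bounds cache size. The layer schedule, the `state_size` value, and the "token slot" accounting are all illustrative assumptions, not Qwen3.5 internals:

```python
# Sketch: why a 3:1 linear-to-full attention layout keeps cache memory nearly flat.
# All names and numbers here are illustrative assumptions, not Qwen3.5 internals.

def layer_schedule(num_layers, ratio=3):
    """Interleave `ratio` linear-attention layers per full-attention layer."""
    return ["full" if (i + 1) % (ratio + 1) == 0 else "linear"
            for i in range(num_layers)]

def cache_slots(schedule, seq_len, state_size=64):
    """Per-model memory in 'token slots': full attention caches every past
    token (KV cache grows with seq_len), while linear attention keeps only
    a fixed-size recurrent state regardless of sequence length."""
    return sum(seq_len if kind == "full" else state_size for kind in schedule)

sched = layer_schedule(8)
print(sched)                                     # 6 linear layers, 2 full layers
print(cache_slots(sched, seq_len=262_000))       # grows only with the 2 full layers
print(cache_slots(["full"] * 8, seq_len=262_000))  # all-full-attention baseline
```

At a 262K context, the hybrid stack's cache in this toy accounting is roughly a quarter of the all-full-attention baseline, which is the effect the 3:1 ratio is after.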
Technical Breakthrough: Native Multimodal Design
Qwen3.5 was trained on multimodal tokens from the start (early fusion), rather than having a vision module grafted onto a finished text model. The visual encoder employs 3D convolution to capture motion information across video frames. The 4B and 9B models can understand user interfaces and count objects in videos, capabilities that previously required models with ten times the parameters.
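A tiny example shows why a convolution that spans the time axis can capture motion: a kernel covering two frames with weights -1 and +1 responds only where pixel values change between frames. This is a pure-Python illustration of the principle, not the actual Qwen3.5 encoder:

```python
# Sketch: a 3D convolution (time x height x width) responds to motion.
# Illustrative only; the real Qwen3.5 visual encoder is not public detail here.

def conv3d(video, kernel):
    """Valid-mode 3D convolution (cross-correlation) over nested lists,
    video[t][y][x] and kernel[dt][dy][dx]."""
    T, H, W = len(video), len(video[0]), len(video[0][0])
    kt, kh, kw = len(kernel), len(kernel[0]), len(kernel[0][0])
    out = []
    for t in range(T - kt + 1):
        frame = []
        for y in range(H - kh + 1):
            row = []
            for x in range(W - kw + 1):
                # Sum of elementwise products over the 3D kernel window.
                row.append(sum(video[t + dt][y + dy][x + dx] * kernel[dt][dy][dx]
                               for dt in range(kt)
                               for dy in range(kh)
                               for dx in range(kw)))
            frame.append(row)
        out.append(frame)
    return out

# Two 1x3 frames: a bright pixel moves one step to the right.
video = [[[0, 1, 0]],
         [[0, 0, 1]]]
temporal_diff = [[[-1]], [[1]]]  # 2x1x1 kernel: frame(t+1) minus frame(t)
print(conv3d(video, temporal_diff))  # nonzero exactly where the pixel moved
```

A purely 2D (per-frame) kernel would see two static images; only by spanning the time dimension does the filter produce a signal at the positions the pixel left and entered, which is the information a video encoder needs for counting and tracking.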