Model Inference Pricing Explanation
Concepts
Billing Unit
Token: A token represents a common sequence of characters. The number of tokens used for each English word may vary. For example, a single long word like "antidisestablishmentarianism" might be broken down into several tokens, while a short and common word like "word" might use just one token. Generally speaking, for typical English text, 1 token is roughly equivalent to 3-4 English characters. The exact number of tokens used by each call can be obtained through the Token Calculation API.
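As a rough illustration of the 3-4 characters per token rule of thumb, the sketch below turns a character count into a low/high token estimate. The function name `estimate_token_range` is hypothetical and the result is only a heuristic for English text; it does not replace the exact count returned by the Token Calculation API.

```python
import math


def estimate_token_range(text: str, chars_per_token=(3, 4)) -> tuple[int, int]:
    """Return a rough (low, high) token estimate for English text.

    Heuristic only: assumes 1 token is about 3-4 characters.
    """
    fewest_chars, most_chars = chars_per_token
    n_chars = len(text)
    # More characters per token means fewer tokens, so the larger
    # ratio produces the lower bound and the smaller ratio the upper bound.
    low = math.ceil(n_chars / most_chars)
    high = math.ceil(n_chars / fewest_chars)
    return low, high


low, high = estimate_token_range("Tokens are billed per call, for both input and output.")
print(f"Estimated tokens: between {low} and {high}")
```

Use an estimate like this only for quick budgeting; billing is based on the actual token count.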
Billing Logic
Chat Completion API charges: Both Input and Output are billed based on usage. If you upload a document, extract its content, and then pass the extracted content to the model as Input, that content is also billed based on usage. File-related interfaces (file content extraction/file storage) are temporarily free. In other words, if you only upload and extract a document, those calls will not incur any charges.
Product Pricing
Explanation: The prices listed below are all inclusive of tax.
Multi-modal Model kimi-k2.5
| Model | Unit | Input Price (Cache Hit) | Input Price (Cache Miss) | Output Price | Context Window |
|---|---|---|---|---|---|
| kimi-k2.5 | 1M tokens | $0.10 | $0.60 | $3.00 | 262,144 tokens |
- kimi-k2.5 is Kimi's most versatile model to date, featuring a native multimodal architecture that supports both visual and text input, thinking and non-thinking modes, and dialogue and agent tasks.
- Context length 256k, supports long thinking and deep reasoning.
- Supports automatic context caching functionality, ToolCalls, JSON Mode, Partial Mode, and internet search functionality.
Generation Model kimi-k2
| Model | Unit | Input Price (Cache Hit) | Input Price (Cache Miss) | Output Price | Context Window |
|---|---|---|---|---|---|
| kimi-k2-0905-preview | 1M tokens | $0.15 | $0.60 | $2.50 | 262,144 tokens |
| kimi-k2-0711-preview | 1M tokens | $0.15 | $0.60 | $2.50 | 131,072 tokens |
| kimi-k2-turbo-preview (Recommended) | 1M tokens | $0.15 | $1.15 | $8.00 | 262,144 tokens |
| kimi-k2-thinking | 1M tokens | $0.15 | $0.60 | $2.50 | 262,144 tokens |
| kimi-k2-thinking-turbo | 1M tokens | $0.15 | $1.15 | $8.00 | 262,144 tokens |
- kimi-k2 is a Mixture-of-Experts (MoE) foundation model with exceptional coding and agent capabilities, featuring 1 trillion total parameters and 32 billion activated parameters. In benchmark evaluations covering general knowledge reasoning, programming, mathematics, and agent-related tasks, the K2 model outperforms other leading open-source models
- kimi-k2-0905-preview: Context length 256k. Based on kimi-k2-0711-preview, with enhanced agentic coding abilities, improved frontend code quality and practicality, and better context understanding
- kimi-k2-turbo-preview: Context length 256k. High-speed version of kimi-k2, always aligned with the latest kimi-k2 (kimi-k2-0905-preview). Same model parameters as kimi-k2, with an output speed of 60 tokens/sec (max 100 tokens/sec)
- kimi-k2-0711-preview: Context length 128k
- kimi-k2-thinking: Context length 256k. A thinking model with general agentic and reasoning capabilities, specializing in deep reasoning tasks
- kimi-k2-thinking-turbo: Context length 256k. High-speed version of kimi-k2-thinking, suitable for scenarios requiring both deep reasoning and extremely fast responses
- Supports ToolCalls, JSON Mode, Partial Mode, and internet search functionality
- Does not support vision functionality
- Supports automatic context caching functionality. Cached tokens are charged at the input price (cache hit) rate. You can view "context caching" type cost details in the console
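To make the cache-hit and cache-miss rates concrete, here is a minimal sketch of a per-call cost estimate using the kimi-k2 prices from the table above. The function name `estimate_cost` and its parameters are hypothetical; the sketch assumes you already know how many input tokens were served from the cache and how many were not, and it is not the official billing implementation.

```python
from decimal import Decimal

# USD per 1M tokens, copied from the kimi-k2 pricing table.
KIMI_K2_PRICES = {
    "kimi-k2-0905-preview": ("0.15", "0.60", "2.50"),
    "kimi-k2-0711-preview": ("0.15", "0.60", "2.50"),
    "kimi-k2-turbo-preview": ("0.15", "1.15", "8.00"),
    "kimi-k2-thinking": ("0.15", "0.60", "2.50"),
    "kimi-k2-thinking-turbo": ("0.15", "1.15", "8.00"),
}
TOKENS_PER_UNIT = Decimal(1_000_000)


def estimate_cost(model: str, cached_input_tokens: int,
                  uncached_input_tokens: int, output_tokens: int) -> Decimal:
    """Estimate the USD cost of one call (illustrative, not official)."""
    hit, miss, out = (Decimal(p) for p in KIMI_K2_PRICES[model])
    total = (
        cached_input_tokens * hit        # input tokens served from cache
        + uncached_input_tokens * miss   # input tokens not in cache
        + output_tokens * out            # generated tokens
    )
    return total / TOKENS_PER_UNIT


cost = estimate_cost("kimi-k2-0905-preview",
                     cached_input_tokens=80_000,
                     uncached_input_tokens=20_000,
                     output_tokens=2_000)
print(f"Estimated cost: ${cost}")
```

`Decimal` is used instead of floats so that small per-token amounts add up without binary rounding noise.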
Generation Model Moonshot-v1
| Model | Unit | Input Price | Output Price | Context Window |
|---|---|---|---|---|
| moonshot-v1-8k | 1M tokens | $0.20 | $2.00 | 8,192 tokens |
| moonshot-v1-32k | 1M tokens | $1.00 | $3.00 | 32,768 tokens |
| moonshot-v1-128k | 1M tokens | $2.00 | $5.00 | 131,072 tokens |
| moonshot-v1-8k-vision-preview | 1M tokens | $0.20 | $2.00 | 8,192 tokens |
| moonshot-v1-32k-vision-preview | 1M tokens | $1.00 | $3.00 | 32,768 tokens |
| moonshot-v1-128k-vision-preview | 1M tokens | $2.00 | $5.00 | 131,072 tokens |
Here, 1M = 1,000,000. The prices in the table represent the cost per 1M tokens consumed.
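The per-1M-token pricing can be sketched as a small helper that picks the smallest moonshot-v1 tier able to hold a request and estimates its cost. The function name `pick_model_and_cost` is hypothetical, and the sketch assumes that input and output tokens together must fit within the context window; treat that as an assumption to verify rather than a documented rule.

```python
from decimal import Decimal

# (model, context window in tokens, input USD per 1M, output USD per 1M),
# copied from the Moonshot-v1 table and ordered from smallest to largest.
MOONSHOT_V1 = [
    ("moonshot-v1-8k", 8_192, Decimal("0.20"), Decimal("2.00")),
    ("moonshot-v1-32k", 32_768, Decimal("1.00"), Decimal("3.00")),
    ("moonshot-v1-128k", 131_072, Decimal("2.00"), Decimal("5.00")),
]
TOKENS_PER_UNIT = Decimal(1_000_000)  # 1M = 1,000,000


def pick_model_and_cost(input_tokens: int, output_tokens: int) -> tuple[str, Decimal]:
    """Return the smallest tier that fits the request and its estimated USD cost."""
    needed = input_tokens + output_tokens  # assumption: both count toward the window
    for name, window, in_price, out_price in MOONSHOT_V1:
        if needed <= window:
            cost = (input_tokens * in_price + output_tokens * out_price) / TOKENS_PER_UNIT
            return name, cost
    raise ValueError(f"{needed} tokens exceed the largest context window")


model, cost = pick_model_and_cost(input_tokens=6_000, output_tokens=1_000)
print(f"{model}: ${cost}")
```

Because prices rise with the context window in this table, the smallest tier that fits is also the cheapest one.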