The full model list and reference pricing have moved to the Supported Models page.
## Billing categories
| Category | Description | Cost |
|---|---|---|
| Input | Content you send to the model (prompt) | Base price |
| Output | Content the model generates | 3-5× input |
| Cache write | First-time prompt caching | 1.25× input |
| Cache read | Cache hit | 0.1× input |
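The multipliers in the table can be combined into a simple per-request cost estimate. A minimal sketch, assuming a hypothetical base price of $0.000002 per input token (real prices are on the Supported Models page) and the midpoint of the 3-5× output range:

```python
# Hypothetical base price in $ per input token; see the Supported Models
# page for actual per-model pricing.
BASE_PRICE = 0.000002

# Multipliers from the billing-categories table.
MULTIPLIERS = {
    "input": 1.0,         # base price
    "output": 4.0,        # assumption: midpoint of the 3-5x range
    "cache_write": 1.25,  # first-time prompt caching
    "cache_read": 0.1,    # cache hit
}

def estimate_cost(tokens_by_category: dict) -> float:
    """Sum the cost of one request across billing categories."""
    return sum(
        count * BASE_PRICE * MULTIPLIERS[category]
        for category, count in tokens_by_category.items()
    )

# Example: 1,000 input tokens, 500 output tokens, 2,000 cached tokens read.
cost = estimate_cost({"input": 1000, "output": 500, "cache_read": 2000})
```

Note how cache reads dominate the token count here but barely affect the total, since they bill at a tenth of the input price.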
## How to save money
- Choose the right model — use cheaper models (DeepSeek-V3.2, GLM-5.1) for simple tasks
- Keep prompts concise — shorter input = fewer tokens = lower cost
- Control output length — set `max_tokens` to limit output
- Leverage caching — fixed system prompts are cached automatically at 10% of the input price
- Reduce context — don’t include unnecessary conversation history
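The last three tips can all be applied when assembling a request. A minimal sketch, assuming an OpenAI-compatible request schema (`model`, `messages`, `max_tokens` are common field names, not confirmed by this page; the model name and helper are hypothetical):

```python
def build_request(system_prompt, history, user_msg,
                  max_output_tokens=256, keep_last=4):
    """Assemble a chat request that caps output and trims old context."""
    # Reduce context: keep only the most recent turns of history.
    trimmed = list(history)[-keep_last:]
    return {
        "model": "deepseek-v3.2",  # hypothetical cheaper-model choice
        "messages": [
            # Fixed system prompt first — a stable prefix is what the
            # automatic prompt cache can reuse at 10% of input price.
            {"role": "system", "content": system_prompt},
            *trimmed,
            {"role": "user", "content": user_msg},
        ],
        # Control output length: output tokens bill at 3-5x input.
        "max_tokens": max_output_tokens,
    }

req = build_request("You are a concise assistant.",
                    [{"role": "user", "content": "earlier turn"}] * 10,
                    "Summarize this in one line.")
```

Keeping the system prompt byte-identical across requests matters: any change to the cached prefix forces a fresh cache write at 1.25× instead of a 0.1× read.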

