Grok 4.1 Fast (Non-Reasoning)

A frontier multimodal model optimized specifically for high-performance agentic tool calling.

At a glance

Modalities

Context window

2,000,000 tokens

Pricing

Capabilities

Function calling

Connect the xAI model to external tools and systems.

Structured outputs

Return responses in specific, organized formats.

Reasoning

The model thinks before responding.
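As an illustration of the function calling capability listed above, the following is a minimal local sketch of the client side of a tool call: the application describes a function to the model, then runs it when the model asks for it. The `get_weather` tool, its schema, and the `dispatch_tool_call` helper are hypothetical names introduced here for illustration. The tool description uses the JSON Schema layout common to chat APIs, which is an assumption; the exact request format is defined by the xAI API reference. No network call is made.

```python
import json

# Hypothetical tool description (illustrative only). The layout follows the
# JSON Schema convention used by many chat APIs; this is an assumption, not
# a confirmed xAI request format.
GET_WEATHER_TOOL = {
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Return a forecast for a city.",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}


def get_weather(city: str) -> dict:
    """Stub implementation; a real tool would query an external system."""
    return {"city": city, "forecast": "sunny"}


# Maps tool names the model may request to local Python callables.
TOOL_REGISTRY = {"get_weather": get_weather}


def dispatch_tool_call(name: str, arguments_json: str) -> str:
    """Run the named tool with JSON-encoded arguments and return a JSON string.

    Errors are returned as JSON instead of raised, so they can be passed
    back to the model as the tool result.
    """
    if name not in TOOL_REGISTRY:
        return json.dumps({"error": f"unknown tool: {name}"})
    try:
        arguments = json.loads(arguments_json)
        result = TOOL_REGISTRY[name](**arguments)
    except (json.JSONDecodeError, TypeError) as exc:
        # Malformed JSON, non-object arguments, or wrong parameter names.
        return json.dumps({"error": str(exc)})
    return json.dumps(result)


if __name__ == "__main__":
    print(dispatch_tool_call("get_weather", '{"city": "Paris"}'))
```

Returning errors as data rather than raising keeps the loop running when the model requests an unknown tool or sends malformed arguments.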

Pricing

Input tokens: $0.20 / 1M tokens

Cached input tokens: $0.05 / 1M tokens

Output tokens: $0.50 / 1M tokens

You are charged for each token used when making calls to our API.

Using cached input tokens can significantly reduce your costs.
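To make the effect of caching concrete, here is a rough cost estimate using the per-token rates listed above. The `estimate_cost` helper is a hypothetical name, and it assumes that cached tokens are a subset of the input tokens and are billed at the cached rate instead of the standard input rate. It also assumes the request stays within the standard-rate context range, since requests over 128K context are charged differently and those rates are not listed here.

```python
from decimal import Decimal

# USD per 1M tokens, copied from the pricing list above.
INPUT_RATE = Decimal("0.20")
CACHED_INPUT_RATE = Decimal("0.05")
OUTPUT_RATE = Decimal("0.50")
TOKENS_PER_UNIT = Decimal(1_000_000)


def estimate_cost(input_tokens: int, cached_tokens: int, output_tokens: int) -> Decimal:
    """Estimate the USD cost of one request.

    Assumption: cached_tokens counts the portion of input_tokens served
    from cache, billed at the cached rate rather than the input rate.
    """
    if not 0 <= cached_tokens <= input_tokens:
        raise ValueError("cached_tokens must be between 0 and input_tokens")
    if output_tokens < 0:
        raise ValueError("output_tokens must be non-negative")
    uncached_tokens = input_tokens - cached_tokens
    total = (
        uncached_tokens * INPUT_RATE
        + cached_tokens * CACHED_INPUT_RATE
        + output_tokens * OUTPUT_RATE
    )
    return total / TOKENS_PER_UNIT


if __name__ == "__main__":
    # Same request without and with 80% of the prompt served from cache.
    print(estimate_cost(100_000, 0, 2_000))       # 0.021
    print(estimate_cost(100_000, 80_000, 2_000))  # 0.009
```

Under these assumptions, a 100,000-token prompt with 2,000 output tokens drops from about $0.021 to about $0.009 when 80% of the prompt is cached.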

This model is available on multiple clusters; full region-based pricing is listed below.

Details

Model name

Aliases

Region: us-east-1, eu-west-1
Pricing per million tokens *
Input: $0.20
Cached input: $0.05
Output: $0.50
Rate limits
Requests per minute: 1,800
Tokens per minute: 10,000,000
Batch pricing

* We charge different rates for requests that exceed the 128K context window.
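The two rate limits listed above (1,800 requests per minute and 10,000,000 tokens per minute) interact: depending on the average request size, one or the other becomes the binding constraint. The sketch below works out the sustainable request rate for a given average token count. The function names are hypothetical, and it assumes the token limit applies to the total tokens counted per request; how tokens are counted against the limit is not stated here.

```python
# Rate limits copied from the details listed above.
REQUESTS_PER_MINUTE = 1_800
TOKENS_PER_MINUTE = 10_000_000


def max_requests_per_minute(avg_tokens_per_request: int) -> int:
    """Return the highest request rate that satisfies both limits."""
    if avg_tokens_per_request <= 0:
        raise ValueError("avg_tokens_per_request must be positive")
    allowed_by_tokens = TOKENS_PER_MINUTE // avg_tokens_per_request
    return min(REQUESTS_PER_MINUTE, allowed_by_tokens)


def binding_limit(avg_tokens_per_request: int) -> str:
    """Name the limit that caps throughput at this average request size."""
    allowed_by_tokens = TOKENS_PER_MINUTE // avg_tokens_per_request
    return "requests" if allowed_by_tokens >= REQUESTS_PER_MINUTE else "tokens"


if __name__ == "__main__":
    for size in (1_000, 5_555, 5_556, 50_000):
        print(size, max_requests_per_minute(size), binding_limit(size))
```

Under these assumptions the crossover sits at roughly 5,555 tokens per request: smaller requests are capped by the request limit, larger ones by the token limit.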