Grok 3 Mini

A lightweight model that thinks before responding. Fast, smart, and great for logic-based tasks that do not require deep domain knowledge. The raw thinking traces are accessible.

At a glance

Context window: 131,072 tokens

Capabilities

Function calling: connect the model to external tools and systems.
Structured outputs: return responses in specific, organized formats.
Reasoning: the model can think before responding.

Pricing

Input tokens: $0.30 / 1M tokens
Cached input tokens: $0.075 / 1M tokens
Output tokens: $0.50 / 1M tokens

You are charged for each token used when making calls to our API. Using cached input tokens can significantly reduce your costs.

This model is available on multiple clusters; full region-based pricing is listed below.
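At these rates, the cost of a single request can be estimated from its token counts. A minimal sketch (the helper name and the example token counts are illustrative, not part of the SDK):

```python
def request_cost(input_tokens: int, cached_tokens: int, output_tokens: int) -> float:
    """Estimate the USD cost of one request at Grok 3 Mini's listed rates.

    Rates are per million tokens: $0.30 input, $0.075 cached input, $0.50 output.
    """
    return (
        input_tokens * 0.30
        + cached_tokens * 0.075
        + output_tokens * 0.50
    ) / 1_000_000

# Example: 100k fresh input tokens, 50k served from cache, 10k output tokens.
print(f"${request_cost(100_000, 50_000, 10_000):.5f}")  # $0.03875
```

Note that the same 50k tokens billed as fresh input would cost $0.015 instead of $0.00375, which is why routing repeated prompt prefixes through the cache pays off.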

Details

Model name: grok-3-mini
Aliases:
Region: us-east-1, eu-west-1

Pricing per million tokens*
Input: $0.30
Cached input: $0.075
Output: $0.50

Rate limits
Requests per minute: 1,400
Tokens per minute: 4,000,000
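To stay under the requests-per-minute cap above, you can pace calls on the client side. A minimal sketch of interval-based pacing (the `RateLimiter` class is illustrative, not part of the xAI SDK):

```python
import time


class RateLimiter:
    """Space out calls so no more than `rpm` happen per minute."""

    def __init__(self, rpm: int):
        self.min_interval = 60.0 / rpm  # seconds between consecutive requests
        self.last = 0.0

    def wait(self) -> None:
        """Block until at least min_interval has passed since the last call."""
        elapsed = time.monotonic() - self.last
        if elapsed < self.min_interval:
            time.sleep(self.min_interval - elapsed)
        self.last = time.monotonic()


# At 1,400 requests per minute, calls are spaced ~43 ms apart.
limiter = RateLimiter(rpm=1_400)
limiter.wait()  # call before each API request
```

This only smooths request rate; the 4,000,000 tokens-per-minute cap would need a separate budget tracked against each request's token usage.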

Quickstart

from xai_sdk import Client
from xai_sdk.chat import user, system

client = Client(api_key="<YOUR_XAI_API_KEY_HERE>")

chat = client.chat.create(model="grok-3-mini", temperature=0)
chat.append(system("You are a PhD-level mathematician."))
chat.append(user("What is 2 + 2?"))

response = chat.sample()
print(response.content)