Grok 4

Our latest flagship model, offering strong performance in natural language, math, and reasoning. It is a versatile general-purpose model.

At a glance

Modalities

Context window: 256,000 tokens

Pricing

Capabilities

Function calling

Connect the xAI model to external tools and systems.

Structured outputs

Return responses in specific, organized formats.

Reasoning

The model thinks before responding.
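
To make the function calling capability more concrete, here is a minimal sketch of the client-side half of the loop: describing a tool with a JSON Schema and running it when the model asks for it. It does not call the xAI API. The `get_temperature` tool, its schema, and the assumption that a tool call arrives as a tool name plus a JSON-encoded argument string are all illustrative and are not taken from this page.

```python
import json
from typing import Any, Callable, Dict


def get_temperature(city: str, unit: str = "celsius") -> Dict[str, Any]:
    """Hypothetical local tool; returns canned data instead of querying a real service."""
    return {"city": city, "unit": unit, "temperature": 21}


# JSON Schema describing the tool's parameters (illustrative, not an official definition).
GET_TEMPERATURE_SCHEMA = {
    "type": "object",
    "properties": {
        "city": {"type": "string", "description": "City name"},
        "unit": {"type": "string", "enum": ["celsius", "fahrenheit"]},
    },
    "required": ["city"],
}

# Registry mapping tool names to the local functions that implement them.
TOOLS: Dict[str, Callable[..., Any]] = {"get_temperature": get_temperature}


def dispatch_tool_call(name: str, arguments_json: str) -> str:
    """Run the named tool with JSON-encoded arguments and return a JSON result string."""
    if name not in TOOLS:
        return json.dumps({"error": f"unknown tool: {name}"})
    try:
        arguments = json.loads(arguments_json)
    except json.JSONDecodeError as exc:
        return json.dumps({"error": f"invalid arguments: {exc.msg}"})
    if not isinstance(arguments, dict):
        return json.dumps({"error": "arguments must be a JSON object"})
    return json.dumps(TOOLS[name](**arguments))
```

In a real integration, the string returned by `dispatch_tool_call` would be sent back to the model as the tool result so it can compose its final answer.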

Pricing

Input tokens: $3.00 / 1M tokens

Cached input tokens: $0.75 / 1M tokens

Output tokens: $15.00 / 1M tokens

You are charged for each token used when making calls to our API.

Using cached input tokens can significantly reduce your costs.
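
As a rough illustration of how these rates combine, the sketch below estimates the cost of a single request from its token counts. It assumes that cached tokens are a subset of the input tokens and that only that subset is billed at the cached rate; this page does not spell out that accounting. It also ignores the different rates that apply to requests exceeding the 128K context window. The function name `estimate_cost` is hypothetical.

```python
from decimal import Decimal

# Per-million-token prices quoted on this page (USD).
INPUT_PER_M = Decimal("3.00")
CACHED_INPUT_PER_M = Decimal("0.75")
OUTPUT_PER_M = Decimal("15.00")
MILLION = Decimal(1_000_000)


def estimate_cost(input_tokens: int, cached_tokens: int, output_tokens: int) -> Decimal:
    """Estimate the USD cost of one request.

    Assumption: cached_tokens is the part of input_tokens served from cache,
    and only that part is billed at the cached input rate.
    """
    if not 0 <= cached_tokens <= input_tokens:
        raise ValueError("cached_tokens must be between 0 and input_tokens")
    if output_tokens < 0:
        raise ValueError("output_tokens must be non-negative")
    uncached_tokens = input_tokens - cached_tokens
    total = (
        uncached_tokens * INPUT_PER_M
        + cached_tokens * CACHED_INPUT_PER_M
        + output_tokens * OUTPUT_PER_M
    )
    return total / MILLION
```

For example, under these assumptions a request with 100,000 input tokens, 80,000 of them cached, and 2,000 output tokens comes to $0.15, compared with $0.33 if none of the input were cached.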

This model is available on multiple clusters; full region-based pricing is listed below.

Details

Model name

Aliases

Region: us-east-1, eu-west-1

Pricing per million tokens *

Input: $3.00

Cached input: $0.75

Output: $15.00

Rate limits

Requests per minute: 500

Tokens per minute: 20,000,000
Batch pricing

* We charge different rates for requests that exceed the 128K context window.
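
One way to stay under the requests-per-minute limit listed above is to throttle on the client side. The following is a minimal sliding-window sketch, not part of the xAI SDK; the class name `RequestRateLimiter` and its methods are hypothetical. It tracks only request counts, so staying under the tokens-per-minute limit would need a similar, separate budget.

```python
import time
from collections import deque
from typing import Callable, Deque


class RequestRateLimiter:
    """Client-side sliding-window limiter (illustrative, not an official mechanism)."""

    def __init__(
        self,
        max_requests: int = 500,
        window_seconds: float = 60.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self._clock = clock  # injectable so the limiter can be tested without sleeping
        self._sent: Deque[float] = deque()  # timestamps of recent requests, oldest first

    def seconds_until_allowed(self) -> float:
        """Return how long to wait before the next request fits in the window."""
        now = self._clock()
        # Forget requests that have aged out of the window.
        while self._sent and now - self._sent[0] >= self.window_seconds:
            self._sent.popleft()
        if len(self._sent) < self.max_requests:
            return 0.0
        return self.window_seconds - (now - self._sent[0])

    def record(self) -> None:
        """Record that a request was just sent."""
        self._sent.append(self._clock())
```

A caller would sleep for `seconds_until_allowed()` before each API call and invoke `record()` right after sending it.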

Quickstart

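from xai_sdk import Client
from xai_sdk.chat import user, system

# Create a client; replace the placeholder with your xAI API key.
client = Client(api_key="<YOUR_XAI_API_KEY_HERE>")

# Start a chat with Grok 4; temperature=0 keeps sampling as deterministic as possible.
chat = client.chat.create(model="grok-4-0709", temperature=0)
chat.append(system("You are a PhD-level mathematician."))
chat.append(user("What is 2 + 2?"))

# Request a completion and print the reply text.
response = chat.sample()
print(response.content)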