Model Capabilities

Chat Completions

Chat Completions is offered as a legacy endpoint. Most of our new features will come to the Responses API first.

Looking to migrate? Check out our Migrating to Responses API guide for a detailed comparison and migration steps.

Text in, text out. Chat is the most popular feature on the xAI API. It can be used for anything from summarizing articles, generating creative writing, answering questions, and providing customer support to assisting with coding tasks.

Prerequisites

xAI Account: You need an xAI account to access the API.
API Key: Ensure that your API key has access to the Chat Completions endpoint and the model you want to use is enabled.

If you don't have these and are unsure how to create them, follow the Hitchhiker's Guide to Grok.

You can create an API key on the xAI Console API Keys Page.

Set your API key in your environment:

Bash

export XAI_API_KEY="your_api_key"
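
Application code can then read the key from the environment instead of hard-coding it. A minimal Python sketch, assuming a hypothetical helper named load_api_key (it is not part of any SDK) that fails early with a clear message when the variable is missing:

```python
import os


def load_api_key(env_var: str = "XAI_API_KEY") -> str:
    """Return the API key stored in the environment.

    `load_api_key` is a hypothetical helper for illustration only.
    Raises RuntimeError if the variable is unset or blank.
    """
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(f"{env_var} is not set; export it before running.")
    return key
```

The returned string can be passed as the api_key argument of whichever client you use.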

A basic chat completions example

The user sends a request to the xAI API endpoint. The API processes it and returns a complete response.

You can also stream the response, which is covered in Streaming Response.

import os

from xai_sdk import Client
from xai_sdk.chat import user, system

client = Client(
    api_key=os.getenv("XAI_API_KEY"),
    timeout=3600, # Override default timeout with longer timeout for reasoning models
)

chat = client.chat.create(model="grok-4-1-fast-reasoning")
chat.append(system("You are a PhD-level mathematician."))
chat.append(user("What is 2 + 2?"))

response = chat.sample()
print(response.content)

import os
import httpx
from openai import OpenAI

client = OpenAI(
    api_key=os.getenv("XAI_API_KEY"),
    base_url="https://api.x.ai/v1",
    timeout=httpx.Timeout(3600.0), # Override default timeout with longer timeout for reasoning models
)

completion = client.chat.completions.create(
    model="grok-4-1-fast-reasoning",
    messages=[
        {"role": "system", "content": "You are a PhD-level mathematician."},
        {"role": "user", "content": "What is 2 + 2?"},
    ],
)

print(completion.choices[0].message)

import OpenAI from "openai";

const client = new OpenAI({
    apiKey: process.env.XAI_API_KEY,
    baseURL: "https://api.x.ai/v1",
    timeout: 3600000, // In milliseconds (3600 s). Override default timeout with longer timeout for reasoning models
});

const completion = await client.chat.completions.create({
    model: "grok-4-1-fast-reasoning",
    messages: [
        {
            role: "system",
            content: "You are Grok, a chatbot inspired by the Hitchhiker's Guide to the Galaxy."
        },
        {
            role: "user",
            content: "What is the meaning of life, the universe, and everything?"
        },
    ],
});
console.log(completion.choices[0].message);

import { xai } from '@ai-sdk/xai';
import { generateText } from 'ai';

const result = await generateText({
  model: xai('grok-4-1-fast-reasoning'),
  system:
    "You are Grok, a chatbot inspired by the Hitchhiker's Guide to the Galaxy.",
  prompt: 'What is the meaning of life, the universe, and everything?',
});

console.log(result.text);

curl https://api.x.ai/v1/chat/completions \
-H "Content-Type: application/json" \
-H "Authorization: Bearer $XAI_API_KEY" \
-m 3600 \
-d '{
    "messages": [
        {
            "role": "system",
            "content": "You are Grok, a chatbot inspired by the Hitchhiker'\''s Guide to the Galaxy."
        },
        {
            "role": "user",
            "content": "What is the meaning of life, the universe, and everything?"
        }
    ],
    "model": "grok-4-1-fast-reasoning",
    "stream": false
}'

Response:

'2 + 2 equals 4.'

ChatCompletionMessage(
  content='2 + 2 equals 4.',
  refusal=None,
  role='assistant',
  audio=None,
  function_call=None,
  tool_calls=None
)

{
  role: 'assistant',
  content: `Ah, the ultimate question! According to Douglas Adams' "The Hitchhiker's Guide to the Galaxy," the answer to the ultimate question of life, the universe, and everything is **42**. However, the guide also notes that the actual question to which this is the answer is still unknown. Isn't that delightfully perplexing? Now, if you'll excuse me, I'll just go ponder the intricacies of existence.`,
  refusal: null
}

// result object structure
{
  text: "Ah, the ultimate question! As someone...",
  finishReason: "stop",
  usage: {
    inputTokens: 716,
    outputTokens: 126,
    totalTokens: 1009,
    reasoningTokens: 167
  },
  totalUsage: { /* same as usage */ }
}

{
  "id": "0daf962f-a275-4a3c-839a-047854645532",
  "object": "chat.completion",
  "created": 1739301120,
  "model": "grok-4-1-fast-reasoning",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "The meaning of life, the universe, and everything is a question that has puzzled philosophers, scientists, and hitchhikers alike. According to the Hitchhiker's Guide to the Galaxy, the answer to this ultimate question is simply \"42\". However, the exact nature of the question itself remains unknown. So, while we may have the answer, the true meaning behind it is still up for debate. In the meantime, perhaps we should all just enjoy the journey and have a good laugh along the way!",
        "refusal": null
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 41,
    "completion_tokens": 104,
    "total_tokens": 145,
    "prompt_tokens_details": {
      "text_tokens": 41,
      "audio_tokens": 0,
      "image_tokens": 0,
      "cached_tokens": 0
    }
  },
  "system_fingerprint": "fp_84ff176447"
}
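
When calling the REST endpoint directly, the fields you usually need are the assistant's text, the finish reason, and the token counts. The sketch below pulls them out of a payload shaped like the raw JSON response above. summarize_completion is a hypothetical helper, and the payload is an abbreviated copy of that example with shortened content, not live API output.

```python
import json

# Abbreviated copy of the raw JSON response shown above (content shortened).
RAW_RESPONSE = """
{
  "id": "0daf962f-a275-4a3c-839a-047854645532",
  "object": "chat.completion",
  "model": "grok-4-1-fast-reasoning",
  "choices": [
    {
      "index": 0,
      "message": {"role": "assistant", "content": "The answer is simply 42.", "refusal": null},
      "finish_reason": "stop"
    }
  ],
  "usage": {"prompt_tokens": 41, "completion_tokens": 104, "total_tokens": 145}
}
"""


def summarize_completion(payload: dict) -> dict:
    """Extract the first choice's text and the token usage (hypothetical helper)."""
    choice = payload["choices"][0]
    usage = payload.get("usage", {})
    return {
        "text": choice["message"]["content"],
        "finish_reason": choice.get("finish_reason"),
        "prompt_tokens": usage.get("prompt_tokens"),
        "completion_tokens": usage.get("completion_tokens"),
        "total_tokens": usage.get("total_tokens"),
    }


summary = summarize_completion(json.loads(RAW_RESPONSE))
print(summary["finish_reason"], summary["total_tokens"])  # prints: stop 145
```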

Conversations

The xAI API is stateless: it does not process a new request with the context of your previous requests.

However, you can include earlier prompts and responses in a new chat request so that the model processes the new request with that context in mind.

An example conversation, sent as a list of messages:

JSON

[
  {
    "role": "system",
    "content": [{ "type": "text", "text": "You are a helpful and funny assistant." }]
  },
  {
    "role": "user",
    "content": [{ "type": "text", "text": "Why don't eggs tell jokes?" }]
  },
  {
    "role": "assistant",
    "content": [{ "type": "text", "text": "They'd crack up!" }]
  },
  {
    "role": "user",
    "content": [{ "type": "text", "text": "Can you explain the joke?" }]
  }
]

By specifying roles, you can change how the model interprets the content. The system role content should define, in an instructive tone, how the model should respond to user requests. The user role content usually carries user requests or data sent to the model. The assistant role content is either the model's response or, when sent within the prompt, an earlier model response included as part of the conversation history.
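
Because the API is stateless, the client has to keep the history and resend it with each request. A minimal sketch of that bookkeeping, using plain string content. add_turn is a hypothetical helper, and the assistant text is hard-coded here where a real application would use the model's reply:

```python
def add_turn(history: list[dict], role: str, text: str) -> list[dict]:
    """Return a new history list with one message appended (hypothetical helper)."""
    return history + [{"role": role, "content": text}]


history: list[dict] = []
history = add_turn(history, "system", "You are a helpful and funny assistant.")
history = add_turn(history, "user", "Why don't eggs tell jokes?")
# In a real exchange this text would be taken from the model's response.
history = add_turn(history, "assistant", "They'd crack up!")
history = add_turn(history, "user", "Can you explain the joke?")

# `history` now holds the full conversation and can be sent as the
# `messages` field of the next request.
print(len(history))  # prints: 4
```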

The developer role is supported as an alias for system. Only a single system/developer message should be used, and it should always be the first message in your conversation.
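
This ordering rule can be checked on the client before a request is sent. A small sketch, assuming a hypothetical check_system_message helper that treats developer as equivalent to system:

```python
SYSTEM_ROLES = {"system", "developer"}  # developer is an alias for system


def check_system_message(messages: list[dict]) -> None:
    """Raise ValueError if the system/developer message rule is violated.

    Hypothetical client-side check: at most one system/developer message,
    and if present it must be the first message.
    """
    positions = [i for i, m in enumerate(messages) if m.get("role") in SYSTEM_ROLES]
    if len(positions) > 1:
        raise ValueError("Use at most one system/developer message.")
    if positions and positions[0] != 0:
        raise ValueError("The system/developer message must be the first message.")
```

Calling it just before building the request surfaces mistakes locally instead of relying on how the API responds to a malformed conversation.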

Image understanding

Some models accept images in the input. The model considers the image content when generating the response.

Constructing the message body - difference from text-only prompt

The request message for image understanding is similar to a text-only prompt. The main difference is that instead of passing the content as a text string:

JSON

[
  {
    "role": "user",
    "content": "What is in this image?"
  }
]

we send the content as a list of objects:

JSON

[
  {
    "role": "user",
    "content": [
      {
        "type": "image_url",
        "image_url": {
          "url": "data:image/jpeg;base64,<base64_image_string>",
          "detail": "high"
        }
      },
      {
        "type": "text",
        "text": "What is in this image?"
      }
    ]
  }
]

The image_url.url field can also be the image's URL on the Internet.
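
For a local file, the data URL form can be built with the Python standard library. A sketch, assuming hypothetical helpers to_data_url and image_file_to_data_url, and assuming the MIME type can be inferred from the file extension:

```python
import base64
from pathlib import Path

# Extension-to-MIME mapping for the file types this page lists as supported.
MIME_BY_SUFFIX = {".jpg": "image/jpeg", ".jpeg": "image/jpeg", ".png": "image/png"}


def to_data_url(data: bytes, mime: str) -> str:
    """Encode raw image bytes as a base64 data URL (hypothetical helper)."""
    encoded = base64.b64encode(data).decode("ascii")
    return f"data:{mime};base64,{encoded}"


def image_file_to_data_url(path: str) -> str:
    """Read a local jpg/jpeg/png file and return it as a data URL."""
    suffix = Path(path).suffix.lower()
    if suffix not in MIME_BY_SUFFIX:
        raise ValueError(f"Unsupported image type: {suffix or '(none)'}")
    return to_data_url(Path(path).read_bytes(), MIME_BY_SUFFIX[suffix])
```

The returned string goes in the url field of the image_url object.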

Image understanding example

import os

from xai_sdk import Client
from xai_sdk.chat import user, image

client = Client(api_key=os.getenv('XAI_API_KEY'))

image_url = "https://science.nasa.gov/wp-content/uploads/2023/09/web-first-images-release.png"

chat = client.chat.create(model="grok-4")
chat.append(
    user(
        "What's in this image?",
        image(image_url=image_url, detail="high"),
    )
)

response = chat.sample()
print(response.content)

import os
from openai import OpenAI

XAI_API_KEY = os.getenv("XAI_API_KEY")
image_url = (
    "https://science.nasa.gov/wp-content/uploads/2023/09/web-first-images-release.png"
)

client = OpenAI(
    api_key=XAI_API_KEY,
    base_url="https://api.x.ai/v1",
)

messages = [
    {
        "role": "user",
        "content": [
            {
                "type": "image_url",
                "image_url": {
                    "url": image_url,
                    "detail": "high",
                },
            },
            {
                "type": "text",
                "text": "What's in this image?",
            },
        ],
    },
]

completion = client.chat.completions.create(
    model="grok-4",
    messages=messages,
)

print(completion.choices[0].message.content)

import OpenAI from "openai";

const openai = new OpenAI({
    apiKey: process.env.XAI_API_KEY,
    baseURL: "https://api.x.ai/v1",
});
const image_url =
    "https://science.nasa.gov/wp-content/uploads/2023/09/web-first-images-release.png";

const completion = await openai.chat.completions.create({
    model: "grok-4",
    messages: [
        {
            role: "user",
            content: [
                {
                    type: "image_url",
                    image_url: {
                        url: image_url,
                        detail: "high",
                    },
                },
                {
                    type: "text",
                    text: "What's in this image?",
                },
            ],
        },
    ],
});

console.log(completion.choices[0].message.content);

import { xai } from '@ai-sdk/xai';
import { generateText } from 'ai';

const result = await generateText({
    model: xai('grok-4'),
    messages: [
        {
            role: 'user',
            content: [
                {
                    type: 'image',
                    image: new URL(
                        'https://science.nasa.gov/wp-content/uploads/2023/09/web-first-images-release.png',
                    ),
                },
                {
                    type: 'text',
                    text: "What's in this image?",
                },
            ],
        },
    ],
});

console.log(result.text);

Image input general limits

Maximum image size: 20 MiB
Maximum number of images: No limit
Supported image file types: jpg/jpeg or png
Any image/text input order is accepted (e.g. a text prompt can precede an image prompt)
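
These limits can be checked on the client before sending a request. The sketch below assumes that the 20 MiB limit applies to the image file's size in bytes (this section does not spell that out) and that the file type can be judged by extension. check_image is a hypothetical helper.

```python
from pathlib import Path

MAX_IMAGE_BYTES = 20 * 1024 * 1024  # 20 MiB, per the limits listed above
ALLOWED_SUFFIXES = {".jpg", ".jpeg", ".png"}


def check_image(name: str, size_bytes: int) -> list[str]:
    """Return a list of problems found; an empty list means the image passes.

    Hypothetical pre-flight check based only on file name and byte size.
    """
    problems = []
    suffix = Path(name).suffix.lower()
    if suffix not in ALLOWED_SUFFIXES:
        problems.append(f"unsupported file type: {suffix or '(none)'}")
    if size_bytes > MAX_IMAGE_BYTES:
        problems.append(f"file is {size_bytes} bytes; limit is {MAX_IMAGE_BYTES}")
    return problems
```

For a file on disk, the size can be obtained with Path(name).stat().st_size.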

Image detail levels

The "detail" field controls the level of pre-processing applied to the image that will be provided to the model. It is optional and determines the resolution at which the image is processed. The possible values for "detail" are:

"auto": The system will automatically determine the image resolution to use. This is the default setting, balancing speed and detail based on the model's assessment.
"low": The system will process a low-resolution version of the image. This option is faster and consumes fewer tokens, making it more cost-effective, though it may miss finer details.
"high": The system will process a high-resolution version of the image. This option is slower and more expensive in terms of token usage, but it allows the model to attend to more nuanced details in the image.