Model Capabilities

Comparison with Chat Completions API

The Responses API is the recommended way to interact with xAI models. Here's how it compares to the legacy Chat Completions API:

| Feature | Responses API | Chat Completions API (Deprecated) |
|---|---|---|
| Stateful conversations | Built-in support via previous_response_id | Stateless; must resend full history |
| Server-side storage | Responses stored for 30 days | No storage; manage history yourself |
| Reasoning models | Full support with encrypted reasoning content | Limited; only grok-3-mini returns reasoning_content |
| Agentic tools | Native support for tools (search, code execution, MCP) | Function calling only |
| Billing optimization | Automatic caching of conversation history | Full history billed on each request |
| Future features | All new capabilities delivered here first | Legacy endpoint with limited updates |

Key API Changes

Parameter Mapping

| Chat Completions | Responses API | Notes |
|---|---|---|
| messages | input | Array of message objects |
| max_tokens | max_output_tokens | Maximum tokens to generate |
| (none) | previous_response_id | Continue a stored conversation |
| (none) | store | Control server-side storage (default: true) |
| (none) | include | Request additional data such as reasoning.encrypted_content |
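To make the renames concrete, here is a minimal illustrative helper that converts a Chat Completions parameter dict to its Responses API shape. The function name and approach are this guide's own sketch, not part of any SDK:

```python
# Illustrative sketch only: rename Chat Completions parameters to their
# Responses API equivalents. Not part of the xAI or OpenAI SDKs.

def to_responses_params(params: dict) -> dict:
    renames = {"messages": "input", "max_tokens": "max_output_tokens"}
    return {renames.get(key, key): value for key, value in params.items()}

migrated = to_responses_params({
    "model": "grok-4",
    "messages": [{"role": "user", "content": "Hello!"}],
    "max_tokens": 100,
})
# migrated == {"model": "grok-4",
#              "input": [{"role": "user", "content": "Hello!"}],
#              "max_output_tokens": 100}
```

Parameters with no Chat Completions equivalent (previous_response_id, store, include) pass through unchanged.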

Response Structure

The response format differs between the two APIs:

Chat Completions returns content in choices[0].message.content:

JSON

{
  "id": "chatcmpl-123",
  "choices": [{
    "message": {
      "role": "assistant",
      "content": "Hello! How can I help you?"
    }
  }]
}

Responses API returns content in an output array with typed items:

JSON

{
  "id": "resp_123",
  "output": [{
    "type": "message",
    "role": "assistant",
    "content": [{
      "type": "output_text",
      "text": "Hello! How can I help you?"
    }]
  }]
}
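Because output is an array of typed items rather than a single message, client code typically filters for message items and output_text parts. A minimal sketch, treating the response as a plain dict as returned over HTTP:

```python
# Sketch: extracting assistant text from a Responses API payload.
# `response` is a plain dict mirroring the JSON shown above.
response = {
    "id": "resp_123",
    "output": [{
        "type": "message",
        "role": "assistant",
        "content": [{
            "type": "output_text",
            "text": "Hello! How can I help you?",
        }],
    }],
}

# Keep only message items, then only their output_text parts.
texts = [
    part["text"]
    for item in response["output"] if item["type"] == "message"
    for part in item["content"] if part["type"] == "output_text"
]
print("".join(texts))  # Hello! How can I help you?
```

This filtering matters because the output array can also carry non-message items (for example, tool-related entries) that have no text to display.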

Multi-turn Conversations

With Chat Completions, you must resend the entire conversation history with each request. With the Responses API, you can use previous_response_id to continue a conversation:

Python

# First request
response = client.responses.create(
    model="grok-4",
    input=[{"role": "user", "content": "What is 2+2?"}],
)

# Continue the conversation - no need to resend history
second_response = client.responses.create(
    model="grok-4",
    previous_response_id=response.id,
    input=[{"role": "user", "content": "Now multiply that by 10"}],
)
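For contrast, here is a sketch of the same exchange under the Chat Completions pattern, where your code owns the transcript and must resend all of it on every turn. The API call is commented out and the assistant reply text is a placeholder:

```python
# Chat Completions pattern: the client keeps the transcript and resends
# the whole thing (and is billed for it) on every request.
history = [{"role": "user", "content": "What is 2+2?"}]

# reply = client.chat.completions.create(model="grok-4", messages=history)
history.append({"role": "assistant", "content": "2 + 2 = 4."})  # placeholder reply

# The second request must carry all three messages.
history.append({"role": "user", "content": "Now multiply that by 10"})
```

With previous_response_id, the server reconstructs this history for you, and cached turns are not re-billed at the full input rate.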

Migration Path

Migrating from Chat Completions to Responses API is straightforward. Here's how to update your code for each SDK:

Vercel AI SDK

Switch from xai() to xai.responses():

JavaScript

// Before
model: xai('grok-4'),

// After
model: xai.responses('grok-4'),

OpenAI SDK (JavaScript)

Switch from client.chat.completions.create to client.responses.create, and rename messages to input:

JavaScript

// Before
const response = await client.chat.completions.create({
    messages: [
        { role: "user", content: "Hello!" }
    ],
});

// After
const response = await client.responses.create({
    input: [
        { role: "user", content: "Hello!" }
    ],
});

OpenAI SDK (Python)

Switch from client.chat.completions.create to client.responses.create, and rename messages to input:

Python

# Before
response = client.chat.completions.create(
    messages=[
        {"role": "user", "content": "Hello!"}
    ],
)

# After
response = client.responses.create(
    input=[
        {"role": "user", "content": "Hello!"}
    ],
)

cURL

Change the endpoint from /v1/chat/completions to /v1/responses, and rename messages to input:

Bash

# Before
curl https://api.x.ai/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $XAI_API_KEY" \
  -d '{ "model": "grok-4", "messages": [{"role": "user", "content": "Hello!"}] }'

# After
curl https://api.x.ai/v1/responses \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $XAI_API_KEY" \
  -d '{ "model": "grok-4", "input": [{"role": "user", "content": "Hello!"}] }'

These changes cover most use cases. For more complex integrations, refer to the Responses API documentation for detailed guidance.