Model Capabilities
Comparison with Chat Completions API
The Responses API is the recommended way to interact with xAI models. Here's how it compares to the legacy Chat Completions API:
| Feature | Responses API | Chat Completions API (Deprecated) |
|---|---|---|
| Stateful Conversations | Built-in support via previous_response_id | Stateless - must resend full history |
| Server-side Storage | Responses stored for 30 days | No storage - manage history yourself |
| Reasoning Models | Full support with encrypted reasoning content | Limited - only grok-3-mini returns reasoning_content |
| Agentic Tools | Native support for tools (search, code execution, MCP) | Function calling only |
| Billing Optimization | Automatic caching of conversation history | Full history billed on each request |
| Future Features | All new capabilities delivered here first | Legacy endpoint, limited updates |
Key API Changes
Parameter Mapping
| Chat Completions | Responses API | Notes |
|---|---|---|
| messages | input | Array of message objects |
| max_tokens | max_output_tokens | Maximum tokens to generate |
| — | previous_response_id | Continue a stored conversation |
| — | store | Control server-side storage (default: true) |
| — | include | Request additional data like reasoning.encrypted_content |
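As a sketch of the mapping above, a small helper can translate a Chat Completions-style request body into a Responses-style one. The helper name and structure are illustrative only, not part of either API; it covers just the renamed parameters from the table:

```python
def to_responses_request(chat_request: dict) -> dict:
    """Translate a Chat Completions request body to a Responses request body.

    Illustrative sketch: renames the parameters from the mapping table
    and passes everything else (e.g. model) through unchanged.
    """
    renames = {
        "messages": "input",               # same array of message objects
        "max_tokens": "max_output_tokens", # maximum tokens to generate
    }
    return {renames.get(key, key): value for key, value in chat_request.items()}


request = to_responses_request({
    "model": "grok-4",
    "messages": [{"role": "user", "content": "Hello!"}],
    "max_tokens": 256,
})
# request now carries "input" and "max_output_tokens" instead
```

Responses-only parameters such as previous_response_id, store, and include have no Chat Completions equivalent, so they are simply added to the new request body as needed.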
Response Structure
The response format differs between the two APIs:
Chat Completions returns content in choices[0].message.content:
```json
{
  "id": "chatcmpl-123",
  "choices": [{
    "message": {
      "role": "assistant",
      "content": "Hello! How can I help you?"
    }
  }]
}
```
Responses API returns content in an output array with typed items:
```json
{
  "id": "resp_123",
  "output": [{
    "type": "message",
    "role": "assistant",
    "content": [{
      "type": "output_text",
      "text": "Hello! How can I help you?"
    }]
  }]
}
```
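Because output is an array of typed items rather than a single string, extracting the assistant's text takes a short traversal. A minimal sketch, assuming the response has been parsed into a dict with the shape shown above:

```python
def extract_output_text(response: dict) -> str:
    """Concatenate all output_text parts from a Responses API payload."""
    parts = []
    for item in response.get("output", []):
        if item.get("type") != "message":
            continue  # skip non-message items (e.g. tool call entries)
        for part in item.get("content", []):
            if part.get("type") == "output_text":
                parts.append(part["text"])
    return "".join(parts)
```

Checking the item and part types keeps the traversal safe when the output array also contains non-text items.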
Multi-turn Conversations
With Chat Completions, you must resend the entire conversation history with each request. With the Responses API, you can use previous_response_id to continue a conversation:
```python
# First request
response = client.responses.create(
    model="grok-4",
    input=[{"role": "user", "content": "What is 2+2?"}],
)

# Continue the conversation - no need to resend history
second_response = client.responses.create(
    model="grok-4",
    previous_response_id=response.id,
    input=[{"role": "user", "content": "Now multiply that by 10"}],
)
```
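For comparison, the equivalent Chat Completions flow has to accumulate and resend the history by hand. A sketch of that bookkeeping (pure data manipulation, no API calls; the helper is illustrative):

```python
# Manual history management, as Chat Completions requires.
history = []

def add_turn(role: str, content: str) -> list:
    """Append a turn and return the full list that must be resent."""
    history.append({"role": role, "content": content})
    return history

add_turn("user", "What is 2+2?")
add_turn("assistant", "2+2 equals 4.")  # each reply must be copied back in
messages = add_turn("user", "Now multiply that by 10")
# Every request carries the whole list, so the tokens billed grow each turn.
```

With the Responses API, this entire accumulation step disappears: the server holds the history, and previous_response_id points at it.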
Migration Path
Migrating from Chat Completions to Responses API is straightforward. Here's how to update your code for each SDK:
Vercel AI SDK
Switch from xai() to xai.responses():
```diff
- model: xai('grok-4'),
+ model: xai.responses('grok-4'),
```
OpenAI SDK (JavaScript)
Switch from client.chat.completions.create to client.responses.create, and rename messages to input:
```diff
- const response = await client.chat.completions.create({
+ const response = await client.responses.create({
-   messages: [
+   input: [
      { role: "user", content: "Hello!" }
    ],
  });
```
OpenAI SDK (Python)
Switch from client.chat.completions.create to client.responses.create, and rename messages to input:
```diff
- response = client.chat.completions.create(
+ response = client.responses.create(
-     messages=[
+     input=[
          {"role": "user", "content": "Hello!"}
      ],
  )
```
cURL
Change the endpoint from /v1/chat/completions to /v1/responses, and rename messages to input:
```diff
- curl https://api.x.ai/v1/chat/completions \
+ curl https://api.x.ai/v1/responses \
    -H "Content-Type: application/json" \
    -H "Authorization: Bearer $XAI_API_KEY" \
-   -d '{ "model": "grok-4", "messages": [{"role": "user", "content": "Hello!"}] }'
+   -d '{ "model": "grok-4", "input": [{"role": "user", "content": "Hello!"}] }'
```
These changes cover most use cases. If your integration has more complex needs, refer to the Responses API documentation for detailed guidance.