Model Capabilities

Generate Text

The Responses API is the preferred way of interacting with our models via API. It allows optional stateful interactions with our models, where previous input prompts, reasoning content, and model responses are saved and stored on xAI's servers. You can continue the interaction by appending new prompt messages instead of resending the full conversation. This behavior is on by default. If you would like to store your request/response locally, please see Disable storing previous request/response on server.

The responses will be stored for 30 days, after which they will be removed. This means you can use the response ID to retrieve or continue a conversation within 30 days of sending the request. If you want to continue a conversation after 30 days, please store your responses history and the encrypted thinking content locally, and pass them in a new request body.

For Python, we also offer our xAI SDK which covers all of our features and uses gRPC for optimal performance. It's fine to mix both. The xAI SDK allows you to interact with all our products such as Collections, Voice API, API key management, and more, while the Responses API is more suited for chatbots and usage in RESTful APIs.


Prerequisites

Create an API key on the xAI Console API Keys Page. Set your API key in your environment:

Bash

export XAI_API_KEY="your_api_key"

Creating a new model response

Start by creating a response:

import os
from xai_sdk import Client
from xai_sdk.chat import user, system

client = Client(
    api_key=os.getenv("XAI_API_KEY"),
    management_api_key=os.getenv("XAI_MANAGEMENT_API_KEY"),
    timeout=3600,
)

chat = client.chat.create(model="grok-4.3")
chat.append(system("You are Grok, an AI agent built to answer helpful questions."))
chat.append(user("How big is the universe?"))
response = chat.sample()

print(response)

# The response ID that can be used to continue the conversation later

print(response.id)

Disable storing previous request/response on server

If you do not want to store your previous request/response on the server, you can set store: false on the request.

import os
from xai_sdk import Client
from xai_sdk.chat import user, system

client = Client(
    api_key=os.getenv("XAI_API_KEY"),
    management_api_key=os.getenv("XAI_MANAGEMENT_API_KEY"),
    timeout=3600,
)

chat = client.chat.create(model="grok-4.3", store_messages=False)
chat.append(system("You are Grok, an AI agent built to answer helpful questions."))
chat.append(user("How big is the universe?"))
response = chat.sample()

print(response)

Returning encrypted thinking content

If you want to return the encrypted thinking traces, you need to specify use_encrypted_content=True in xAI SDK or gRPC request message, or include: ["reasoning.encrypted_content"] in the request body.

Make sure to use a reasoning model when working with encrypted thinking content.

Modify the steps to create a chat client (xAI SDK) or change the request body as following:

chat = client.chat.create(model="grok-4.3",
        use_encrypted_content=True)

See Adding encrypted thinking content on how to use the returned encrypted thinking content when making a new request.


Chaining the conversation

We now have the id of the first response. With Chat Completions API, we typically send a stateless new request with all the previous messages.

With Responses API, we can send the id of the previous response, and the new messages to append to it.

import os
from xai_sdk import Client
from xai_sdk.chat import user, system

client = Client(
    api_key=os.getenv("XAI_API_KEY"),
    management_api_key=os.getenv("XAI_MANAGEMENT_API_KEY"),
    timeout=3600,
)

chat = client.chat.create(model="grok-4.3", store_messages=True)
chat.append(system("You are Grok, an AI agent built to answer helpful questions."))
chat.append(user("How big is the universe?"))
response = chat.sample()

print(response)

# The response ID that can be used to continue the conversation later

print(response.id)

# New steps

chat = client.chat.create(
    model="grok-4.3",
    previous_response_id=response.id,
    store_messages=True,
)
chat.append(user("How do stars form?"))
second_response = chat.sample()

print(second_response)

# The response ID that can be used to continue the conversation later

print(second_response.id)

Adding encrypted thinking content

After returning the encrypted thinking content, you can also add it to a new response's input.

Make sure to use a reasoning model when working with encrypted thinking content.

import os
from xai_sdk import Client
from xai_sdk.chat import user, system

client = Client(
    api_key=os.getenv("XAI_API_KEY"),
    management_api_key=os.getenv("XAI_MANAGEMENT_API_KEY"),
    timeout=3600,
)

chat = client.chat.create(model="grok-4.3", store_messages=True, use_encrypted_content=True)
chat.append(system("You are Grok, an AI agent built to answer helpful questions."))
chat.append(user("How big is the universe?"))
response = chat.sample()

print(response)

# The response ID that can be used to continue the conversation later

print(response.id)

# New steps

chat.append(response)  ## Append the response and the SDK will automatically add the outputs from response to message history

chat.append(user("How do stars form?"))
second_response = chat.sample()

print(second_response)

# The response ID that can be used to continue the conversation later

print(second_response.id)

Retrieving a previous model response

If you have a previous response's ID, you can retrieve the content of the response.

import os
from xai_sdk import Client
from xai_sdk.chat import user, system

client = Client(
    api_key=os.getenv("XAI_API_KEY"),
    management_api_key=os.getenv("XAI_MANAGEMENT_API_KEY"),
    timeout=3600,
)

response = client.chat.get_stored_completion("<The previous response's id>")

print(response)

Delete a model response

If you no longer want to store the previous model response, you can delete it.

import os
from xai_sdk import Client
from xai_sdk.chat import user, system

client = Client(
    api_key=os.getenv("XAI_API_KEY"),
    management_api_key=os.getenv("XAI_MANAGEMENT_API_KEY"),
    timeout=3600,
)

response = client.chat.delete_stored_completion("<The previous response's id>")
print(response)

Last updated: May 14, 2026