Function calling
Connect xAI models to external tools and systems to build AI assistants and integrations.
Introduction
Function calling enables language models to use external tools, connecting them to both digital and physical worlds. This is a powerful capability that enables a wide range of use cases:
- Calling public APIs for actions ranging from looking up football game results to getting real-time satellite positioning data
- Analyzing internal databases
- Browsing web pages
- Executing code
- Interacting with the physical world (e.g. booking a flight ticket, opening your Tesla's door, controlling robot arms)
Overview
Step 1: Design the function interface for the model
You can start by designing a function interface for the language model. Some questions you should aim to think about are:
- What do you want the assistant to be able to do?
- What functions should it be able to use?
This is very similar to designing helper functions when you write code.
Let's walk through a simple example together. Assume you want an assistant for navigating a website. You want the assistant to be able to do the following:
- Navigate to a specific page
- Click on a button
- Fill in a form
- Submit a form
Your straw-man implementation might look like this:
```python
def open_website(url):
    pass

def click(html, button):
    pass

def assistant():
    input = Input("Hello! I am here to help you navigate a website! What do you want to do?")
    url = ...     # parse out a URL from input magically
    html = open_website(url)
    button1 = ... # parse out a button from input magically
    html = click(html, button1)
    button2 = ... # parse out another button from input magically
    html = click(html, button2)
    input2 = Input("Voilà, we are done! Do you want to join xAI? :)")
    button3 = ... # parse out a button from input2 magically
    html = click(html, button3)
    ...
    input3 = Input("We will get back to you soon!")
```
Instead of hard-coding such Python methods for your assistant, you can prompt an xAI language model to carry out the task for you.
Some tips to keep in mind when designing functions:
- Factor out anything that is not text understanding into functions
- Functions need to be implementable and executable (a function can even call another assistant!)
- Select a proper level of abstraction: not so high-level that the assistant's job is trivial compared to implementing the function (e.g. `make_me_money`), and not so low-level that it takes the model a very long time to do anything (e.g. `move_cursor_left_by_one_pixel`)
Step 2: Describe the functions
To make this design concrete, you need to write down abstract function definitions similar to defining abstract helper functions via signatures:
```python
functions = [
    {
        "name": "open_website",
        "description": "Open a website and return the HTML as a string",
        "parameters": {
            "type": "object",
            "properties": {
                "url": {
                    "type": "string",
                    "description": "A URL",
                    "example_value": "https://x.ai/",
                },
            },
            "required": ["url"],
            "optional": [],
        },
    },
    {
        "name": "click",
        "description": "Click a button on a website and return the new HTML",
        "parameters": {
            "type": "object",
            "properties": {
                "html": {
                    "type": "string",
                    "description": "An HTML string",
                },
                "button": {
                    "type": "string",
                    "description": "A text description of a button on the HTML page",
                },
            },
            "required": ["html", "button"],
            "optional": [],
        },
    },
]
```
Write descriptions in enough detail that the LLM can understand what each function does and how to call it from the definition alone.
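Since the model returns arguments as a JSON string at runtime, it can be worth checking them against the definition before executing anything. Here is a minimal sketch of such a check, written against the schema format used in this guide (`check_arguments` is a hypothetical helper, not part of any SDK):

```python
import json

def check_arguments(function_def, arguments_json):
    """Parse the model's JSON argument string and verify it against a
    function definition in the format shown above. Returns the parsed
    arguments, or raises ValueError on a mismatch."""
    args = json.loads(arguments_json)
    params = function_def["parameters"]
    missing = [k for k in params["required"] if k not in args]
    if missing:
        raise ValueError(f"missing required parameters: {missing}")
    unknown = [k for k in args if k not in params["properties"]]
    if unknown:
        raise ValueError(f"unknown parameters: {unknown}")
    return args
```

A full JSON Schema validator would also check types; this sketch only guards against missing or unexpected keys.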
Step 3: Implement the functions
```python
import requests

def open_website(url):
    return requests.get(url).text

def click(html, button):
    ...
    return html
```
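The body of `click` is left open above. As one possible sketch using only Python's standard library, you could match a button or link by its visible text and return its target (a real implementation would need a browser engine; `LinkFinder` and `find_link` are hypothetical names for illustration):

```python
from html.parser import HTMLParser

class LinkFinder(HTMLParser):
    """Collect the visible text and href of every <a> tag on a page."""
    def __init__(self):
        super().__init__()
        self.links = []   # list of (text, href) pairs
        self._href = None
        self._text = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append(("".join(self._text).strip(), self._href))
            self._href = None

def find_link(html, button):
    """Return the href of the first link whose text matches `button`, else None."""
    finder = LinkFinder()
    finder.feed(html)
    for text, href in finder.links:
        if button.lower() in text.lower():
            return href
    return None
```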
Step 4 (optional): “Program” your assistant
Write a system prompt describing what your assistant should do. You can also skip the system prompt entirely and just leave it to Grok!
Step 5: Hook up your assistant with the functions
Let's start by getting a tool execution request from the assistant. To do this, we pass the function definitions, the system prompt, and any prior dialogue to the assistant.
```python
import os
import json

from openai import OpenAI

MODEL_NAME = "grok-beta"

XAI_API_KEY = os.getenv("XAI_API_KEY")
client = OpenAI(
    api_key=XAI_API_KEY,
    base_url="https://api.x.ai/v1",
)

tools = [{"type": "function", "function": f} for f in functions]

messages = [
    {"role": "system", "content": "You are a helpful webpage navigation assistant. Use the supplied tools to assist the user."},
    {"role": "user", "content": "Hi, can you go to the career page of xAI website?"},
]

response = client.chat.completions.create(
    model=MODEL_NAME,
    messages=messages,
    tools=tools,
)
```
If the assistant decides that no tool call is necessary, the response will contain a direct reply to the user, in the normal way that Chat Completions does. This typically happens when the assistant wants clarification from the user, or the user asks a question that does not require a function call.
Alternatively, if the assistant decides a tool call is necessary, the response will contain a function call request. You can see an example below:
```python
Choice(
    finish_reason='tool_calls',
    index=0,
    logprobs=None,
    message=chat.completionsMessage(
        content="I am opening the xAI website to navigate to the career page.",
        role='assistant',
        function_call=None,
        tool_calls=[
            chat.completionsMessageToolCall(
                id='call_1234',
                function=Function(
                    arguments='{"url":"https://x.ai/"}',
                    name='open_website'),
                type='function')
        ])
)
```
Next, we need to handle the tool execution request from the model: extract the arguments from the tool call and invoke the corresponding function.
```python
tool_call = response.choices[0].message.tool_calls[0]
arguments = json.loads(tool_call.function.arguments)
url = arguments.get('url')

# Call the open_website function with the extracted url
html = open_website(url)
```
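Once you support more than one tool, it is convenient to dispatch on the function name instead of hard-coding each case. A small sketch using the function names from this guide (`TOOL_IMPLS` and `execute_tool_call` are hypothetical helpers; the stub bodies stand in for the real implementations from Step 3):

```python
import json

def open_website(url):
    return f"<html>{url}</html>"  # stub; the real version uses requests.get(url).text

def click(html, button):
    return html  # stub

# Map the tool names from the function definitions to local implementations.
TOOL_IMPLS = {"open_website": open_website, "click": click}

def execute_tool_call(tool_call):
    """Look up the requested function by name, parse its JSON arguments,
    and call it with keyword arguments."""
    name = tool_call.function.name
    args = json.loads(tool_call.function.arguments)
    return TOOL_IMPLS[name](**args)
```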
Lastly, we need to provide the tool execution result to the model. The model may decide that more function calls are needed, in which case it will generate another function call request; otherwise it will generate a response to the user based on the function call results.
```python
# Create a message containing the result of the function call
function_call_result_message = {
    "role": "tool",
    "content": html,  # the HTML returned by open_website(url)
    "tool_call_id": tool_call.id,
}

# Prepare the chat completion call payload
messages = [
    {"role": "system", "content": "You are a helpful webpage navigation assistant. Use the supplied tools to assist the user."},
    {"role": "user", "content": "Hi, can you go to the career page of xAI website?"},
    response.choices[0].message,
    function_call_result_message,
]

response = client.chat.completions.create(
    model=MODEL_NAME,
    messages=messages,
    tools=tools,
)
```
Function calling modes
By default, the model automatically decides whether a function call is necessary and selects which functions to call, as determined by the `tool_choice: "auto"` setting.
We offer three ways to customize the default behavior:
- To force the model to always call one or more functions, set `tool_choice: "required"`. The model will then always call at least one function. Note that this can force the model to hallucinate parameters.
- To force the model to call a specific function, set `tool_choice: {"type": "function", "function": {"name": "my_function"}}`.
- To disable function calling and force the model to only generate a user-facing message, either provide no tools or set `tool_choice: "none"`.
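For reference, the four `tool_choice` settings can be written out as request payloads. This is a sketch of the payload shapes only (`my_function` is a placeholder name, and `messages`/`tools` are elided):

```python
base_request = dict(model="grok-beta", messages=[], tools=[])

# "auto" is the default: the model decides whether and which functions to call.
auto_request = {**base_request, "tool_choice": "auto"}

# "required": the model must call at least one function
# (note: this can force it to hallucinate parameters).
required_request = {**base_request, "tool_choice": "required"}

# Force one specific function by name:
forced_request = {**base_request,
                  "tool_choice": {"type": "function", "function": {"name": "my_function"}}}

# Disable function calling; equivalently, omit the tools field entirely.
none_request = {**base_request, "tool_choice": "none"}
```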