Batching Asynchronous Requests
Every query must travel to the xAI servers and back to your client, so each request carries a network round trip on top of inference time. A common use case is to send hundreds, even thousands, of requests; issuing them synchronously, one after another, multiplies that latency by the number of requests. Instead, we construct the requests, dispatch them asynchronously, and gather the results once the batch has finished processing.

Because the requests overlap in flight, 100 concurrent requests can complete in roughly the time required to send just one.
To batch a group of asynchronous requests using AsyncOpenAI, you can leverage Python's asyncio library for concurrent execution:
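Below is a minimal sketch. It assumes the `openai` Python package (v1+), an API key stored in an `XAI_API_KEY` environment variable, and a placeholder model name `grok-3-mini`; substitute whichever model and key variable you actually use.

```python
import asyncio
import os

from openai import AsyncOpenAI

# Assumptions for this sketch: XAI_API_KEY holds your key, and
# "grok-3-mini" is a placeholder model name.
client = AsyncOpenAI(
    api_key=os.environ["XAI_API_KEY"],
    base_url="https://api.x.ai/v1",
)

# Cap the number of in-flight requests so a large batch
# doesn't overwhelm rate limits.
semaphore = asyncio.Semaphore(10)

async def send_request(prompt: str) -> str:
    async with semaphore:
        response = await client.chat.completions.create(
            model="grok-3-mini",
            messages=[{"role": "user", "content": prompt}],
        )
    return response.choices[0].message.content

async def main() -> None:
    prompts = [f"State one fun fact about the number {i}." for i in range(100)]
    # asyncio.gather schedules all coroutines concurrently and
    # returns their results in the original order.
    results = await asyncio.gather(*(send_request(p) for p in prompts))
    for i, result in enumerate(results):
        print(f"Response {i}: {result}")

asyncio.run(main())
```

The semaphore is optional but recommended for large batches: `asyncio.gather` alone would fire all requests at once, while the semaphore keeps at most 10 in flight at a time.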
Your output should look something like the following (the printed text is illustrative only; actual model completions will differ):
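```text
Response 0: Zero is the only integer that is neither positive nor negative.
Response 1: One is the only positive integer that equals its own factorial and its own square.
Response 2: Two is the only even prime number.
...
Response 99: ...
```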