# Speech to Text

The Speech to Text API transcribes audio into text. Use the REST endpoint for batch transcription of audio files, or the streaming endpoint for real-time, low-latency transcription.

## At a glance

| | Details |
|---|---|
| **Modalities** | Audio → Text |
| **REST pricing** | $0.10 / hr |
| **Streaming pricing** | $0.20 / hr |
| **Region** | us-east-1 |

## Pricing

| | Details |
|---|---|
| **REST (per hour)** | $0.10 / hr |
| **Streaming (per hour)** | $0.20 / hr |
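
To make the hourly rates concrete, here is a minimal cost-estimation sketch. It assumes usage is billed linearly in proportion to audio duration, with no minimum charge or rounding increment; this page does not state the billing granularity, so treat that as an assumption. The `estimate_cost` helper and the `RATES_PER_HOUR` table are illustrative names, not part of the API.

```python
from decimal import Decimal

# Hourly rates in USD, copied from the pricing table on this page.
RATES_PER_HOUR = {
    "rest": Decimal("0.10"),
    "streaming": Decimal("0.20"),
}


def estimate_cost(audio_seconds: int, mode: str) -> Decimal:
    """Estimate the cost in USD for a given amount of audio.

    Assumption: billing is pro-rated linearly by duration. The real
    billing increment and rounding rules are not documented here.
    """
    if mode not in RATES_PER_HOUR:
        raise ValueError(f"unknown mode: {mode!r}")
    if audio_seconds < 0:
        raise ValueError("audio_seconds must be non-negative")
    hours = Decimal(audio_seconds) / Decimal(3600)
    return RATES_PER_HOUR[mode] * hours
```

Under that assumption, 90 minutes of audio through the REST endpoint comes to about $0.15, and the same audio through the streaming endpoint to about $0.30.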

## Rate Limits

| | REST | Streaming |
|---|---|---|
| **RPM** (Requests per minute) | 600 | 600 |
| **RPS** (Requests per second) | 10 | 10 |
| **Concurrent sessions** | — | 100 per team |
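
One way a client might stay under the request limits above is a sliding-window limiter that enforces both 10 requests per second and 600 requests per minute. The sketch below is client-side only and makes assumptions: the class name `SlidingWindowLimiter` is hypothetical, and how the service itself measures its windows or signals a rate-limit violation is not described on this page.

```python
import time
from collections import deque


class SlidingWindowLimiter:
    """Client-side limiter for several (max_requests, window_seconds) pairs."""

    def __init__(self, limits=((10, 1.0), (600, 60.0)), clock=time.monotonic):
        self.limits = tuple(limits)   # defaults mirror the RPS and RPM table rows
        self.clock = clock            # injectable so the limiter can be tested
        self.sent = deque()           # timestamps of recorded requests, oldest first

    def wait_time(self) -> float:
        """Return how many seconds to wait before the next request is allowed."""
        now = self.clock()
        longest = max(window for _, window in self.limits)
        # Forget timestamps that no longer matter to any window.
        while self.sent and now - self.sent[0] >= longest:
            self.sent.popleft()
        delay = 0.0
        for max_requests, window in self.limits:
            recent = [t for t in self.sent if now - t < window]
            if len(recent) >= max_requests:
                # The request that must age out before another one fits.
                blocking = recent[-max_requests]
                delay = max(delay, blocking + window - now)
        return delay

    def record(self) -> None:
        """Record that a request is being sent now."""
        self.sent.append(self.clock())

    def acquire(self) -> None:
        """Block until a request is allowed, then record it."""
        while (delay := self.wait_time()) > 0:
            time.sleep(delay)
        self.record()
```

Calling `limiter.acquire()` immediately before each request keeps a single process within both windows. It does not coordinate across processes or machines, and it does not cover the streaming limit of 100 concurrent sessions per team, which would need separate tracking.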

## Capabilities

* REST and streaming transcription
* Multiple audio formats (WAV, MP3, WebM, OGG, M4A)
* Multiple languages
* Real-time interim results (streaming)

## Availability

| | Details |
|---|---|
| **Cluster** | us-east-1 |

## Documentation

* [Speech to Text Guide](/developers/model-capabilities/audio/speech-to-text) — Getting started with speech to text
* [Voice APIs Guide](/developers/model-capabilities/audio/voice) — Overview of all voice capabilities
* [Models and Pricing](/developers/models#voice-api-pricing) — Full pricing overview
