GPT-4o mini Realtime Preview

Input: $75/M

Output: $300/M

GPT-4o mini Realtime Preview is a real-time multimodal model for interactive voice and visual experiences. It handles speech, text, and images with streaming input and output, plus tool/function calling for grounded actions. Typical uses include voice assistants, live call handling, real-time captioning, and visual question answering over camera or screen content. Technical highlights include bidirectional audio, vision understanding, streaming responses, and structured outputs via functions.
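As a rough illustration of the tool/function calling mentioned above, the sketch below builds the JSON events a client might send over a realtime connection. It assumes the event shapes of OpenAI's Realtime API beta (`session.update`, `response.create`); this page does not document the wire format, so the provider's endpoint may differ. The `get_weather` tool and the helper names are hypothetical, and no network connection is opened.

```python
import json

# Model ID as listed on this page.
MODEL_ID = "gpt-4o-mini-realtime-preview"

# Hypothetical tool definition, used only for illustration.
GET_WEATHER_TOOL = {
    "type": "function",
    "name": "get_weather",
    "description": "Look up the current weather for a city.",
    "parameters": {
        "type": "object",
        "properties": {"city": {"type": "string"}},
        "required": ["city"],
    },
}


def build_session_update(tools: list) -> dict:
    """Build a session.update event that registers tools for the session.

    Assumption: the event layout follows OpenAI's Realtime API beta.
    """
    return {
        "type": "session.update",
        "session": {
            "modalities": ["text", "audio"],
            "tools": tools,
            "tool_choice": "auto",
        },
    }


def build_response_create() -> dict:
    """Build the event that asks the model to start streaming a response."""
    return {"type": "response.create"}


if __name__ == "__main__":
    # Each event would be sent as one JSON text frame over the connection.
    for event in (build_session_update([GET_WEATHER_TOOL]), build_response_create()):
        print(json.dumps(event))
```

In a real client these payloads would be serialized and sent as text frames, and the streamed server events would be read back from the same connection.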

Commercial Use

Pricing for GPT-4o mini Realtime Preview

GPT-4o mini Realtime Preview is billed per million tokens, with separate rates for input and output. You pay only for the tokens you use, so costs scale with your usage. The table below compares the Comet price with the official price.

Token type	Comet Price (USD / M Tokens)	Official Price (USD / M Tokens)	Discount
Input	$75	$93.75	-20%
Output	$300	$375	-20%
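To make the per-million-token rates concrete, here is a minimal sketch that estimates the cost of a single request and checks the listed discount. The prices are copied from the table above; the function names and the example token counts are hypothetical, and the sketch assumes one flat rate per direction, since the table does not distinguish text from audio tokens.

```python
from decimal import Decimal

# Prices in USD per million tokens, copied from the pricing table.
COMET_PRICE = {"input": Decimal("75"), "output": Decimal("300")}
OFFICIAL_PRICE = {"input": Decimal("93.75"), "output": Decimal("375")}
TOKENS_PER_MILLION = Decimal(1_000_000)


def estimate_cost(input_tokens: int, output_tokens: int, price: dict) -> Decimal:
    """Return the USD cost of one request at the given per-million-token rates."""
    total = (
        Decimal(input_tokens) * price["input"]
        + Decimal(output_tokens) * price["output"]
    )
    return total / TOKENS_PER_MILLION


def discount(comet: Decimal, official: Decimal) -> Decimal:
    """Return the relative price change, negative when Comet is cheaper."""
    return (comet - official) / official


if __name__ == "__main__":
    # Hypothetical request: 2,000 input tokens and 500 output tokens.
    print(estimate_cost(2_000, 500, COMET_PRICE))
    print(estimate_cost(2_000, 500, OFFICIAL_PRICE))
    print(discount(COMET_PRICE["input"], OFFICIAL_PRICE["input"]))
```

`Decimal` is used instead of `float` so that currency arithmetic stays exact.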

Versions of GPT-4o mini Realtime Preview

GPT-4o mini Realtime Preview may have multiple snapshots for several reasons: outputs can change after an update, so older snapshots are kept for consistency; developers get a transition period to adapt and migrate; and different snapshots may correspond to global or regional endpoints to optimize user experience. For detailed differences between versions, refer to the official documentation.

Version
gpt-4o-mini-realtime-preview
gpt-4o-mini-realtime-preview-2024-12-17
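When output consistency matters, it can help to tell a dated snapshot from an undated model ID in code. The sketch below assumes that a trailing `-YYYY-MM-DD` suffix marks a dated snapshot, as the second ID above suggests, and that an ID without it is undated; the helper name is hypothetical.

```python
import re
from datetime import date
from typing import Optional

# Model IDs listed in the version table.
MODEL_IDS = [
    "gpt-4o-mini-realtime-preview",
    "gpt-4o-mini-realtime-preview-2024-12-17",
]

# Assumption: dated snapshots end in -YYYY-MM-DD.
_SNAPSHOT_SUFFIX = re.compile(r"-(\d{4})-(\d{2})-(\d{2})$")


def snapshot_date(model_id: str) -> Optional[date]:
    """Return the date encoded in a dated snapshot ID, or None for an undated ID."""
    match = _SNAPSHOT_SUFFIX.search(model_id)
    if match is None:
        return None
    year, month, day = (int(part) for part in match.groups())
    return date(year, month, day)


if __name__ == "__main__":
    for model_id in MODEL_IDS:
        print(model_id, snapshot_date(model_id))
```

Requesting the dated ID explicitly is one way to keep behavior stable across updates, assuming the provider keeps serving that snapshot.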