A cost-efficient version of GPT Realtime, capable of responding to audio and text inputs in real time over WebRTC, WebSocket, or SIP connections.
Specifications
Context: 32,000
Max Output: 4,096
Input: text, audio, image
Output: text, audio
Performance (7-day Average)
Uptime
TPS
RURT
API Paths
/v1/realtime
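As a rough illustration of how the path above might be turned into a connection URL, the sketch below joins it to a base address. The host `example.invalid`, the model name `example-model`, and the `model` query parameter are all placeholders and assumptions, not values taken from this page; check the provider's documentation for the real ones.

```python
from urllib.parse import urlencode, urlsplit, urlunsplit

# The only value taken from the listing above.
REALTIME_PATH = "/v1/realtime"


def build_realtime_url(base_url: str, model: str) -> str:
    """Join a base URL (scheme + host) with the realtime path.

    Assumption: the model is selected with a `model` query parameter.
    """
    parts = urlsplit(base_url)
    query = urlencode({"model": model})
    return urlunsplit((parts.scheme, parts.netloc, REALTIME_PATH, query, ""))


# Hypothetical host and model name, for illustration only.
url = build_realtime_url("wss://example.invalid", "example-model")
print(url)
```

A WebSocket client would then open this URL with whatever authentication headers the provider requires.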
Pricing
Input: $0.60 × 1.1 / MTokens
Output: $2.40 × 1.1 / MTokens
Cached Input: $0.06 × 1.1 / MTokens
Input Audio: $10.00 × 1.1 / MTokens
Output Audio: $20.00 × 1.1 / MTokens
Input Image: $0.80 × 1.1 / MTokens
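To make the rates above concrete, here is a minimal cost-estimation sketch. It assumes that "× 1.1" is a multiplier applied to each base rate and that "MTokens" means one million tokens; the dictionary keys and the `estimate_cost` helper are hypothetical names introduced for this example.

```python
from decimal import Decimal

# Assumption: the "× 1.1" in the price list is a multiplier on the base rate.
MULTIPLIER = Decimal("1.1")
TOKENS_PER_MTOKEN = Decimal(1_000_000)

# Base rates in USD per million tokens, copied from the list above.
BASE_PRICES = {
    "input": Decimal("0.60"),
    "output": Decimal("2.40"),
    "cached_input": Decimal("0.06"),
    "input_audio": Decimal("10.00"),
    "output_audio": Decimal("20.00"),
    "input_image": Decimal("0.80"),
}


def estimate_cost(usage: dict[str, int]) -> Decimal:
    """Return the estimated USD cost for a mapping of token type to token count."""
    total = Decimal(0)
    for kind, tokens in usage.items():
        rate = BASE_PRICES[kind] * MULTIPLIER  # effective USD per MToken
        total += rate * Decimal(tokens) / TOKENS_PER_MTOKEN
    return total


# Example: 10,000 text input tokens and 2,000 output audio tokens.
cost = estimate_cost({"input": 10_000, "output_audio": 2_000})
print(f"Estimated cost: ${cost:.4f}")
```

Under these assumptions, the example request comes to $0.0066 for the text input plus $0.0440 for the audio output. `Decimal` is used rather than floats so that small per-token amounts add up without rounding drift.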