Model Comparison
Compare two models side by side to make informed decisions based on pricing, specifications, and performance.
Supports JSON output
The best model in the world for multimodal understanding, and our most powerful agentic and vibe-coding model yet.
Pricing
Input
$1.40/M
Output
$8.40/M
Cached Input
—
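To make the rates above concrete, here is a minimal sketch of the cost arithmetic for a single request. It assumes that "/M" means per 1,000,000 tokens and that prices are in US dollars; neither is stated explicitly on the page. The function name `estimate_cost_usd` is hypothetical, and cached-input pricing is ignored because no rate is listed.

```python
from decimal import Decimal

# Rates copied from the listing above.
# Assumption: "/M" means per 1,000,000 tokens, priced in USD.
INPUT_USD_PER_M = Decimal("1.40")
OUTPUT_USD_PER_M = Decimal("8.40")
TOKENS_PER_M = Decimal(1_000_000)


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> Decimal:
    """Hypothetical helper: estimated USD cost of one request, ignoring caching."""
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    total = INPUT_USD_PER_M * input_tokens + OUTPUT_USD_PER_M * output_tokens
    return total / TOKENS_PER_M


if __name__ == "__main__":
    # Illustrative request: 200,000 input tokens and 10,000 output tokens.
    print(estimate_cost_usd(200_000, 10_000))
```

Under these assumptions, a request with 200,000 input tokens and 10,000 output tokens comes to about $0.364. Decimal is used rather than float so that the listed prices are represented exactly.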
Specifications
Context
1,000,000
Maximum Output
64,000
Input
text, image, audio, video
Output
text, json