Text Model Cost Calculator
Batch: ~50% off, higher latency. Flex: balanced. Standard: real-time. Priority: up to 2× cost, fastest.
⚠️ For reasoning models (o-series), reasoning tokens are billed as output tokens but are not visible via the API.
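The underlying text-model cost formula is simple: input and output tokens are each billed at per-million-token rates, and reasoning tokens (where present) count as output. A minimal sketch; the $2.00/$8.00 rates below are hypothetical placeholders, not a specific model's price:

```python
def text_cost(input_tokens, output_tokens, in_rate, out_rate):
    """Cost in dollars; rates are $ per 1M tokens."""
    return input_tokens / 1e6 * in_rate + output_tokens / 1e6 * out_rate

# Hypothetical rates: $2.00/1M input, $8.00/1M output.
cost = text_cost(50_000, 10_000, 2.00, 8.00)
# 50k input -> $0.10, 10k output -> $0.08, total $0.18
```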
Embedding Models Cost Calculator
Fine-Tuning Models Cost Calculator
Audio & Speech Models Cost Calculator
Note: GPT-4o Mini TTS is also billed on audio output tokens at $12.00/1M tokens. Transcription models are billed on audio input tokens. Rates above use the simplified per-minute estimate from OpenAI's pricing page.
Image Generation Cost Calculator
Note: For GPT Image models, text output tokens (including reasoning) are billed separately at the model's per-token rates. The prices above are per-image generation costs only.
Video Generation Cost Calculator (Sora)
Resolution: Portrait 720×1280 / Landscape 1280×720
Built-in Tools Pricing
| Tool | Cost | Notes |
|---|---|---|
| Container Usage (Code Interpreter / Hosted Shell), current billing | 1 GB (default): $0.03/container; 4 GB: $0.12/container; 16 GB: $0.48/container; 64 GB: $1.92/container | Billed per container invocation |
| Container Usage (Code Interpreter / Hosted Shell), starting March 31, 2026 | 1 GB (default): $0.03 per 20 min; 4 GB: $0.12 per 20 min; 16 GB: $0.48 per 20 min; 64 GB: $1.92 per 20 min | Billed per 20-minute session |
| File Search Storage | $0.10 / GB / day (1 GB free) | Storage for file search vector index |
| File Search Tool Call (Responses API only) | $2.50 / 1,000 calls | Per tool invocation |
| Web Search (all models, non-preview) | $10.00 / 1,000 calls + search content tokens at model rate | For gpt-4o-mini and gpt-4.1-mini: search content tokens charged as a fixed 8,000 input tokens/call |
| Web Search Preview (reasoning models: o-series, gpt-5) | $10.00 / 1,000 calls + search content tokens at model rate | Same call rate as standard web search |
| Web Search Preview (non-reasoning models) | $25.00 / 1,000 calls; search content tokens free | Higher per-call rate but no additional token charges |
| Moderation (omni-moderation) | Free ✅ | No charge for content moderation |
Frequently Asked Questions
1. What's new in GPT Image 1.5 & GPT Image Latest?
These are OpenAI's newest image generation models. GPT Image 1.5 and GPT Image Latest share identical pricing:
Low: $0.009 (1024×1024) / $0.013 (rectangular)
Medium: $0.034 / $0.05
High: $0.133 / $0.20
Text output tokens (including model reasoning) are billed separately at model rates.
2. What are the different pricing tiers?
Batch: ~50% off, for non-time-sensitive bulk tasks with higher latency.
Flex: Balanced pricing with moderate latency.
Standard: Regular real-time pricing.
Priority: Up to 2× standard cost for fastest processing.
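The tiers above can be modeled as multipliers on the standard rate. A sketch under stated assumptions: Batch 0.5× and Priority 2× follow from the figures above, while the Flex multiplier varies by model, so the 0.75 below is purely illustrative:

```python
# Batch and Priority multipliers from the tier descriptions above;
# the Flex value (0.75) is an illustrative assumption, not an official rate.
TIER_MULTIPLIER = {"batch": 0.5, "flex": 0.75, "standard": 1.0, "priority": 2.0}

def tier_rate(standard_rate, tier):
    """Per-1M-token rate for a given pricing tier."""
    return standard_rate * TIER_MULTIPLIER[tier]

# With a $2.00/1M standard rate: batch -> $1.00, priority -> $4.00
```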
3. How do o-series reasoning models bill tokens?
Reasoning tokens (used internally by o-series models to "think") are not visible via the API but are still counted in the context window and billed as output tokens. This means your actual bill may be higher than expected based on visible output alone. Deep Research variants (o3, o4-mini) are available at higher rates.
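Because reasoning tokens are billed as output, the billable output count is the visible completion tokens plus the hidden reasoning tokens. A sketch; the $1.00/$4.00 rates are placeholders, not a specific o-series price:

```python
def o_series_cost(prompt_toks, completion_toks, reasoning_toks, in_rate, out_rate):
    """Reasoning tokens are invisible in the response but billed as output."""
    billable_output = completion_toks + reasoning_toks
    return prompt_toks / 1e6 * in_rate + billable_output / 1e6 * out_rate

# 1k prompt, 500 visible completion, 3k hidden reasoning tokens,
# at hypothetical $1.00/1M input and $4.00/1M output:
cost = o_series_cost(1_000, 500, 3_000, 1.00, 4.00)
# Billable output is 3,500 tokens, so most of the bill is hidden reasoning.
```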
4. Container billing change โ March 31, 2026
Starting March 31, 2026, container usage (Code Interpreter and Hosted Shell) switches from per-container billing to per 20-minute session billing. The per-session rates are identical to current per-container rates. Long-running tasks may cost more under the new model.
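The change only bites for tasks that run longer than one 20-minute session. A sketch comparing the two schemes at the 1 GB rate from the table above:

```python
import math

def container_cost_current(invocations, rate=0.03):
    """Current billing: flat rate per container invocation (1 GB default)."""
    return invocations * rate

def container_cost_2026(minutes, rate=0.03):
    """From March 31, 2026: same rate, but billed per 20-minute session."""
    sessions = math.ceil(minutes / 20)
    return sessions * rate

# A single 55-minute task: $0.03 today, but 3 sessions = $0.09 after the change.
```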
5. How does web search pricing work?
Two cost components apply: (1) Tool call fee per 1,000 calls and (2) Search content tokens retrieved from the index.
Reasoning & standard models: $10/1k calls + tokens at model rate.
Non-reasoning preview models: $25/1k calls, but search content tokens are free.
For gpt-4o-mini / gpt-4.1-mini: tokens charged as a fixed block of 8,000 input tokens/call.
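The cases above can be combined into one estimator. The $10/$25 call fees and the fixed 8,000-token block are from the table; the per-token input rate passed in is an assumption that depends on the model:

```python
def web_search_cost(calls, model_kind, content_tokens_per_call=0, in_rate=0.0):
    """model_kind: 'standard' ($10/1k calls + tokens at model rate),
    'preview_nonreasoning' ($25/1k calls, free tokens),
    'mini' ($10/1k calls + fixed 8,000-token block per call)."""
    if model_kind == "preview_nonreasoning":
        return calls * 25.00 / 1000
    if model_kind == "mini":
        content_tokens_per_call = 8_000
    call_fee = calls * 10.00 / 1000
    token_fee = calls * content_tokens_per_call / 1e6 * in_rate
    return call_fee + token_fee

# 100 calls with gpt-4o-mini at a hypothetical $0.15/1M input rate:
# $1.00 in call fees + 800k tokens * $0.15/1M = $1.12
```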
6. What's cached input pricing?
Cached pricing offers 75-90% discounts when reusing previously processed prompt prefixes. For example, GPT-4.1 drops from $2.00 to $0.50 per 1M input tokens when cached. Not all models or tiers support caching; the calculator will indicate N/A where unavailable.
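Savings scale with the fraction of the prompt that hits the cache. A sketch using the GPT-4.1 figures quoted above:

```python
def cached_input_cost(total_tokens, cached_fraction, rate=2.00, cached_rate=0.50):
    """Blend cached and uncached portions of the prompt (GPT-4.1 rates, $/1M)."""
    cached = total_tokens * cached_fraction
    uncached = total_tokens - cached
    return cached / 1e6 * cached_rate + uncached / 1e6 * rate

# 1M input tokens at an 80% cache hit rate:
# 800k * $0.50/1M + 200k * $2.00/1M = $0.40 + $0.40 = $0.80 (vs $2.00 uncached)
```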
7. What sizes and prices do DALL·E 2 & DALL·E 3 support?
DALL·E 2: 256×256 ($0.016), 512×512 ($0.018), 1024×1024 ($0.020).
DALL·E 3 Standard: 1024×1024 ($0.04), 1024×1792 or 1792×1024 ($0.08).
DALL·E 3 HD: 1024×1024 ($0.08), 1024×1792 or 1792×1024 ($0.12).
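These per-image prices map naturally onto a lookup table keyed by model and size:

```python
# Prices taken from the DALL-E figures listed above.
DALLE_PRICES = {
    ("dall-e-2", "256x256"): 0.016,
    ("dall-e-2", "512x512"): 0.018,
    ("dall-e-2", "1024x1024"): 0.020,
    ("dall-e-3-standard", "1024x1024"): 0.04,
    ("dall-e-3-standard", "1024x1792"): 0.08,
    ("dall-e-3-standard", "1792x1024"): 0.08,
    ("dall-e-3-hd", "1024x1024"): 0.08,
    ("dall-e-3-hd", "1024x1792"): 0.12,
    ("dall-e-3-hd", "1792x1024"): 0.12,
}

def dalle_cost(model, size, n_images=1):
    return DALLE_PRICES[(model, size)] * n_images

# 5 DALL-E 3 HD landscape images: 5 * $0.12 = $0.60
```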
8. Embedding model costs
text-embedding-3-small: $0.020/1M (standard), $0.010/1M (batch)
text-embedding-3-large: $0.130/1M (standard), $0.065/1M (batch)
text-embedding-ada-002: $0.100/1M (standard), $0.050/1M (batch)
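Embedding spend is a single multiplication once the rate is chosen. A sketch using the standard and batch rates listed above:

```python
EMBEDDING_RATES = {  # $ per 1M tokens: (standard, batch)
    "text-embedding-3-small": (0.020, 0.010),
    "text-embedding-3-large": (0.130, 0.065),
    "text-embedding-ada-002": (0.100, 0.050),
}

def embedding_cost(model, tokens, batch=False):
    standard, batched = EMBEDDING_RATES[model]
    rate = batched if batch else standard
    return tokens / 1e6 * rate

# Embedding a 10M-token corpus with 3-large in batch mode: $0.65
```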
9. Sora video generation pricing
Sora-2: $0.10/second. Portrait 720×1280 or Landscape 1280×720.
Sora-2 Pro: $0.30/second. Same resolutions as Sora-2.
Sora-2 Pro (HD): $0.50/second. Portrait 1024×1792 or Landscape 1792×1024.
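Video cost is linear in clip length, so the per-second rates above translate directly:

```python
SORA_RATE_PER_SECOND = {
    "sora-2": 0.10,
    "sora-2-pro": 0.30,
    "sora-2-pro-hd": 0.50,
}

def sora_cost(model, seconds):
    return SORA_RATE_PER_SECOND[model] * seconds

# A 12-second Sora-2 Pro clip: 12 * $0.30 = $3.60
```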
10. Fine-tuning o4-mini
o4-mini fine-tuning is billed at $100/training hour (not per token). Inference after fine-tuning is $4.00 input / $16.00 output per 1M tokens (standard). Enabling data sharing halves inference costs to $2.00 input / $8.00 output. For comparison, GPT-4.1 fine-tuning is billed at $25.00/1M training tokens.
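Combining the training and inference figures above into one estimate (data sharing halves the inference rates):

```python
def o4_mini_ft_cost(training_hours, in_tokens, out_tokens, data_sharing=False):
    """o4-mini: $100/training hour; inference $4/$16 per 1M tokens,
    halved to $2/$8 when data sharing is enabled."""
    in_rate, out_rate = (2.00, 8.00) if data_sharing else (4.00, 16.00)
    training = training_hours * 100.00
    inference = in_tokens / 1e6 * in_rate + out_tokens / 1e6 * out_rate
    return training + inference

# 2 training hours, then 1M input / 0.5M output tokens with data sharing:
# $200 + $2.00 + $4.00 = $206.00
```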