OpenAI: GPT-5.4 Pro

openai/gpt-5.4-pro

Released Mar 5, 2026 · 1,050,000 context
$30/M input tokens · $180/M output tokens · $10/K web search

GPT-5.4 Pro is OpenAI's most advanced model, building on GPT-5.4's unified architecture with enhanced reasoning capabilities for complex, high-stakes tasks. It features a 1M+ token context window (922K input, 128K output) with support for text and image inputs. Optimized for step-by-step reasoning, instruction following, and accuracy, GPT-5.4 Pro excels at agentic coding, long-context workflows, and multi-step problem solving.

Providers for GPT-5.4 Pro

OpenRouter routes requests to the best providers that are able to handle your prompt size and parameters, with fallbacks to maximize uptime.

Latency: 108.3s
Throughput: 13 tps
Uptime: 100.0%
Total Context: 1.05M
Max Output: 128K
Input Price: $30/M (prompts ≤272K tokens), $60/M (prompts >272K tokens)
Output Price: $180/M (prompts ≤272K tokens), $270/M (prompts >272K tokens)
Cache Read: --
Cache Write: --
Input Audio: --
Input Audio Cache: --
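
The tiered pricing above can be sketched as a small cost estimator. This is an illustrative helper, not an official API: the threshold (272K prompt tokens) and the per-million rates come straight from the table, and `estimateCostUSD` is a hypothetical name.

```typescript
// Illustrative cost estimator for GPT-5.4 Pro's tiered pricing.
// Prompts of at most 272K tokens bill input at $30/M and output at $180/M;
// larger prompts bill at $60/M input and $270/M output.
const TIER_THRESHOLD = 272_000;

function estimateCostUSD(promptTokens: number, completionTokens: number): number {
  const overTier = promptTokens > TIER_THRESHOLD;
  const inputRate = overTier ? 60 : 30;    // $ per 1M input tokens
  const outputRate = overTier ? 270 : 180; // $ per 1M output tokens
  return (promptTokens / 1e6) * inputRate + (completionTokens / 1e6) * outputRate;
}

// A 100K-token prompt with a 2K-token completion stays in the base tier:
// 0.1 * $30 + 0.002 * $180 ≈ $3.36
console.log(estimateCostUSD(100_000, 2_000).toFixed(2));
```

Note that the tier is decided by prompt size alone; a short prompt with a long completion still bills output at the base rate.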

Performance for GPT-5.4 Pro

Compare different providers across OpenRouter

Throughput: OpenAI, avg 7 tok/s

Latency: OpenAI, avg 98.83 s

E2E Latency: OpenAI, avg 209.93 s

Tool Call Error Rate: OpenAI, avg 0.67%

Effective Pricing for GPT-5.4 Pro

Actual cost per million tokens across providers over the past hour

Weighted Avg Input Price: $30.00 per 1M tokens
Weighted Avg Output Price: $180.00 per 1M tokens

Provider: OpenAI. Input $30.00/1M, Output $180.00/1M, Cache Hit Rate 0.0%

Input Price / 1M tokens (7 days): chart covering Mar 6 to Mar 12

Output Price / 1M tokens (7 days): chart covering Mar 6 to Mar 12

Benchmarks for GPT-5.4 Pro

Performance metrics and benchmarks

Reasoning: CritPt (research-level physics reasoning): 30.0%

Metrics sourced from Artificial Analysis

Apps using GPT-5.4 Pro

Top public apps this month

1. OpenClaw: The AI that actually does things (295M tokens)
2. Cline: Autonomous coding agent right in your IDE (176M tokens)
3. Claude Code: The AI for problem solvers (74.6M tokens)
4. Kilo Code: AI coding agent for VS Code (74.3M tokens)
5. galaxy.ai (58.9M tokens)

Recent activity on GPT-5.4 Pro

Total usage per day on OpenRouter

Prompt: 316M
Reasoning: 15.7M
Completion: 5.42M

Prompt tokens measure input size. Reasoning tokens show internal thinking before a response. Completion tokens reflect total output length.
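
Given those three counts, a rough spend figure can be derived from the effective pricing above. This sketch assumes base-tier rates throughout ($30/M input, $180/M output) and that reasoning tokens are billed at the output rate, which is typical for reasoning models but an assumption here; `approxDailySpendUSD` is a hypothetical helper name.

```typescript
// Sketch: turning a day's token mix into an approximate spend figure.
// Assumes base-tier pricing and that reasoning tokens bill at the
// output rate -- both simplifications, stated in the lead-in above.
interface DailyUsage {
  promptTokens: number;
  reasoningTokens: number;
  completionTokens: number;
}

function approxDailySpendUSD(u: DailyUsage): number {
  const input = (u.promptTokens / 1e6) * 30;
  const output = ((u.reasoningTokens + u.completionTokens) / 1e6) * 180;
  return input + output;
}

// Figures from the activity stats above:
// 316M prompt, 15.7M reasoning, 5.42M completion.
console.log(approxDailySpendUSD({
  promptTokens: 316e6,
  reasoningTokens: 15.7e6,
  completionTokens: 5.42e6,
}).toFixed(2));
```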

Uptime stats for GPT-5.4 Pro

Uptime stats for GPT-5.4 Pro on its only provider, OpenAI

When an error occurs in an upstream provider, we can recover by routing to another healthy provider, if your request filters allow it. You can access uptime data programmatically through the Endpoints API.

Learn more about our load balancing and customization options.

Sample code and API for GPT-5.4 Pro

OpenRouter normalizes requests and responses across providers for you.

OpenRouter supports reasoning-enabled models that can show their step-by-step thinking process. Use the reasoning parameter in your request to enable reasoning, and access the reasoning_details array in the response to see the model's internal reasoning before the final answer. When continuing a conversation, preserve the complete reasoning_details when passing messages back to the model so it can continue reasoning from where it left off. Learn more about reasoning tokens.
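
The continuation step above can be sketched as a small helper. The message and detail shapes here are assumptions for illustration, not exact SDK types, and `appendAssistantTurn` is a hypothetical name; the point is that the assistant turn is appended with its `reasoning_details` array intact so the model can continue reasoning from where it left off.

```typescript
// Sketch: append an assistant turn to the history while preserving
// its reasoning_details verbatim, per the guidance above.
interface ChatMessage {
  role: "system" | "user" | "assistant";
  content: string;
  reasoning_details?: unknown[];
}

function appendAssistantTurn(
  history: ChatMessage[],
  content: string,
  reasoningDetails?: unknown[]
): ChatMessage[] {
  const turn: ChatMessage = { role: "assistant", content };
  if (reasoningDetails) {
    // Pass the complete array back unmodified; do not truncate,
    // filter, or re-order it.
    turn.reasoning_details = reasoningDetails;
  }
  return [...history, turn];
}
```

On the next request, send the returned history as `messages` so the preserved `reasoning_details` travel with the assistant turn.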

In the examples below, the OpenRouter-specific headers are optional. Setting them allows your app to appear on the OpenRouter leaderboards.

import { OpenRouter } from "@openrouter/sdk";

const openrouter = new OpenRouter({
  apiKey: "<OPENROUTER_API_KEY>"
});

// Stream the response to get reasoning tokens in usage
const stream = await openrouter.chat.send({
  model: "openai/gpt-5.4-pro",
  messages: [
    {
      role: "user",
      content: "How many r's are in the word 'strawberry'?"
    }
  ],
  // Enable reasoning, as described above
  reasoning: { enabled: true },
  stream: true
});

let response = "";
for await (const chunk of stream) {
  const content = chunk.choices[0]?.delta?.content;
  if (content) {
    response += content;
    process.stdout.write(content);
  }

  // Usage information comes in the final chunk
  if (chunk.usage) {
    console.log("\nReasoning tokens:", chunk.usage.reasoningTokens);
  }
}

Using third-party SDKs

For information about using third-party SDKs and frameworks with OpenRouter, please see our frameworks documentation.

See the Request docs for all possible fields, and Parameters for explanations of specific sampling parameters.

More models from OpenAI