POST /responses
Create a response
curl --request POST \
  --url https://openrouter.ai/api/v1/responses \
  --header 'Authorization: Bearer <token>' \
  --header 'Content-Type: application/json' \
  --data '
{
  "input": "Tell me a joke",
  "model": "openai/gpt-4o"
}
'
{
  "created_at": 1700000000,
  "id": "resp_abc123",
  "model": "openai/gpt-4o",
  "object": "response",
  "output": [
    {
      "content": [
        {
          "text": "Why did the chicken cross the road? To get to the other side!",
          "type": "output_text"
        }
      ],
      "role": "assistant",
      "type": "message"
    }
  ],
  "status": "completed",
  "usage": {
    "completion_tokens": 20,
    "prompt_tokens": 10,
    "total_tokens": 30
  }
}
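
As a rough Python counterpart to the curl call above, the sketch below builds the same request with the standard library and pulls the assistant text out of a response shaped like the sample. The helper names `build_request` and `extract_text` are illustrative and not part of any SDK, and the network call is left commented out.

```python
import json
import urllib.request

API_URL = "https://openrouter.ai/api/v1/responses"


def build_request(token: str, model: str, user_input: str) -> urllib.request.Request:
    """Build (but do not send) a POST request mirroring the curl example."""
    body = json.dumps({"input": user_input, "model": model}).encode("utf-8")
    return urllib.request.Request(
        API_URL,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )


def extract_text(response: dict) -> str:
    """Concatenate every output_text part found in message output items."""
    parts = []
    for item in response.get("output", []):
        if item.get("type") != "message":
            continue
        for part in item.get("content", []):
            if part.get("type") == "output_text":
                parts.append(part.get("text", ""))
    return "".join(parts)


# Sample response copied from the example above (trimmed to the fields used here).
sample = {
    "id": "resp_abc123",
    "object": "response",
    "status": "completed",
    "output": [
        {
            "type": "message",
            "role": "assistant",
            "content": [
                {
                    "type": "output_text",
                    "text": "Why did the chicken cross the road? To get to the other side!",
                }
            ],
        }
    ],
}

print(extract_text(sample))

# To send the request for real (requires a valid API key):
# request = build_request("<token>", "openai/gpt-4o", "Tell me a joke")
# with urllib.request.urlopen(request) as resp:
#     print(extract_text(json.load(resp)))
```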

Authorizations

Authorization
string
header
required

API key as bearer token in Authorization header

Headers

X-OpenRouter-Metadata
enum<string>

Opt in to surfacing routing metadata on the response under openrouter_metadata. Defaults to disabled. The legacy header X-OpenRouter-Experimental-Metadata is also accepted for backward compatibility.

Available options:
disabled,
enabled
Example:

"enabled"

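A small sketch of how a client might assemble the headers described above, including the optional X-OpenRouter-Metadata opt-in. The function name `build_headers` is hypothetical. Omitting the header when the value is "disabled" is a choice made here, relying on the documented default.

```python
METADATA_OPTIONS = ("disabled", "enabled")


def build_headers(token: str, metadata: str = "disabled") -> dict:
    """Return request headers; add the metadata opt-in only when enabled."""
    if metadata not in METADATA_OPTIONS:
        raise ValueError(f"metadata must be one of {METADATA_OPTIONS}, got {metadata!r}")
    headers = {
        "Authorization": f"Bearer {token}",
        "Content-Type": "application/json",
    }
    # "disabled" is the documented default, so the header is simply left out.
    if metadata == "enabled":
        headers["X-OpenRouter-Metadata"] = metadata
    return headers
```
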
Body

application/json

Request schema for Responses endpoint

background
boolean | null
cache_control
object

Enable automatic prompt caching. When set at the top level, the system automatically applies cache breakpoints to the last cacheable block in the request. Currently supported for Anthropic Claude models.

Example:
{ "type": "ephemeral" }
debug
object

Debug options for inspecting request transformations (streaming only)

Example:
{ "echo_upstream_body": true }
frequency_penalty
number<double> | null
image_config
object

Provider-specific image configuration options. Keys and values vary by model/provider. See https://openrouter.ai/docs/guides/overview/multimodal/image-generation for more details.

Example:
{ "aspect_ratio": "16:9", "quality": "high" }
include
enum<string>[] | null
Available options:
file_search_call.results,
message.input_image.image_url,
computer_call_output.output.image_url,
reasoning.encrypted_content,
code_interpreter_call.outputs
input

Input for a response request: either a string or an array of items.

Example:
[
{
"content": "What is the weather today?",
"role": "user"
}
]
instructions
string | null
max_output_tokens
integer | null
max_tool_calls
integer | null
metadata
object | null

Metadata key-value pairs for the request. Keys must be ≤64 characters and cannot contain brackets. Values must be ≤512 characters. Maximum 16 pairs allowed.

Example:
{
"session_id": "abc-def-ghi",
"user_id": "123"
}
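
The limits above can be checked client-side before sending a request. The sketch below is one possible validator, not the server's implementation. It assumes that values are strings and that "brackets" means the characters `[` and `]`. The name `validate_metadata` is hypothetical.

```python
MAX_PAIRS = 16
MAX_KEY_LENGTH = 64
MAX_VALUE_LENGTH = 512


def validate_metadata(metadata: dict) -> list:
    """Return a list of problems; an empty list means the metadata passes."""
    problems = []
    if len(metadata) > MAX_PAIRS:
        problems.append(f"too many pairs: {len(metadata)} > {MAX_PAIRS}")
    for key, value in metadata.items():
        if len(key) > MAX_KEY_LENGTH:
            problems.append(f"key too long: {key[:20]!r}...")
        # Assumption: "brackets" refers to square brackets only.
        if "[" in key or "]" in key:
            problems.append(f"key contains brackets: {key!r}")
        # Assumption: values are strings.
        if not isinstance(value, str):
            problems.append(f"value for {key!r} is not a string")
        elif len(value) > MAX_VALUE_LENGTH:
            problems.append(f"value for {key!r} too long: {len(value)} > {MAX_VALUE_LENGTH}")
    return problems


print(validate_metadata({"session_id": "abc-def-ghi", "user_id": "123"}))
```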
modalities
enum<string>[]

Output modalities for the response. Supported values are "text" and "image".

Available options:
text,
image
Example:
["text", "image"]
model
string
models
string[]
parallel_tool_calls
boolean | null
plugins
object[]

Plugins you want to enable for this request, including their settings.

Example:
{
"allowed_models": ["anthropic/*", "openai/gpt-4o"],
"cost_quality_tradeoff": 7,
"enabled": true,
"id": "auto-router"
}
presence_penalty
number<double> | null
previous_response_id
string | null
prompt
object | null
Example:
{
"id": "prompt-abc123",
"variables": { "name": "John" }
}
prompt_cache_key
string | null
provider
object | null

When multiple model providers are available, optionally indicate your routing preference.

Example:
{ "allow_fallbacks": true }
reasoning
object | null

Configuration for reasoning mode in the response

Example:
{
"effort": "medium",
"summary": "auto",
"enabled": true
}
route
enum<string> | null
deprecated

Deprecated. Use providers.sort.partition instead. This field is a backwards-compatible alias for providers.sort.partition and accepts the legacy values "fallback" (maps to "model") and "sort" (maps to "none").

Available options:
fallback,
sort,
null
Example:

"fallback"

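For callers migrating away from route, the legacy-to-partition mapping described above can be written as a small lookup. This is a sketch only: the function name `partition_for_route` is hypothetical, and where the resulting value is placed in the request body is left to the caller.

```python
from typing import Optional

# Mapping taken from the field description: legacy route value -> partition value.
LEGACY_ROUTE_TO_PARTITION = {"fallback": "model", "sort": "none"}


def partition_for_route(route: Optional[str]) -> Optional[str]:
    """Translate a legacy `route` value into its partition equivalent."""
    if route is None:
        return None
    if route not in LEGACY_ROUTE_TO_PARTITION:
        raise ValueError(f"unknown legacy route value: {route!r}")
    return LEGACY_ROUTE_TO_PARTITION[route]
```
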
safety_identifier
string | null
service_tier
enum<string> | null
default:auto
Available options:
auto,
default,
flex,
priority,
scale,
null
session_id
string

A unique identifier for grouping related requests (e.g., a conversation or agent workflow). When provided, OpenRouter uses it as the sticky routing key, routing all requests in the session to the same provider to maximize prompt cache hits. Also used for observability grouping. If provided in both the request body and the x-session-id header, the body value takes precedence. Maximum of 256 characters.

Maximum string length: 256
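
The precedence rule above (body value over x-session-id header) can be mirrored on the client to predict which key will be used for sticky routing. The sketch below is illustrative. The name `effective_session_id` is hypothetical, and treating an empty string as "not provided" is an assumption.

```python
from typing import Mapping, Optional

MAX_SESSION_ID_LENGTH = 256


def effective_session_id(body: Mapping, headers: Mapping) -> Optional[str]:
    """Return the session id that takes effect, preferring the body value."""
    header_value = None
    for name, value in headers.items():
        # HTTP header names are case-insensitive.
        if name.lower() == "x-session-id":
            header_value = value
            break
    body_value = body.get("session_id")
    # Assumption: an empty string counts as "not provided".
    session_id = body_value if body_value else header_value
    if not session_id:
        return None
    if len(session_id) > MAX_SESSION_ID_LENGTH:
        raise ValueError(f"session_id exceeds {MAX_SESSION_ID_LENGTH} characters")
    return session_id
```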
stop_server_tools_when
object[]

Stop conditions for the server-tool agent loop. Any condition firing halts the loop (OR logic). When set, this overrides max_tool_calls.

Minimum array length: 1

A single condition that, when met, halts the server-tool agent loop.

Example:
{ "step_count": 5, "type": "step_count_is" }
Example:
[
{ "step_count": 5, "type": "step_count_is" },
{
"max_cost_in_dollars": 0.5,
"type": "max_cost"
}
]
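
To illustrate the OR logic, the sketch below evaluates a list of stop conditions against the current loop state. The exact firing rules are assumptions, not documented behavior: "step_count_is" is taken to fire when the number of completed steps equals step_count, and "max_cost" when the accumulated cost reaches max_cost_in_dollars. The function names are hypothetical.

```python
def condition_fires(condition: dict, steps_done: int, cost_so_far: float) -> bool:
    """Evaluate a single stop condition (firing rules are assumed, see above)."""
    kind = condition["type"]
    if kind == "step_count_is":
        return steps_done == condition["step_count"]
    if kind == "max_cost":
        return cost_so_far >= condition["max_cost_in_dollars"]
    raise ValueError(f"unsupported condition type in this sketch: {kind!r}")


def should_stop(conditions: list, steps_done: int, cost_so_far: float) -> bool:
    """Return True if any condition fires (OR logic)."""
    if not conditions:
        raise ValueError("at least one condition is required")
    return any(condition_fires(c, steps_done, cost_so_far) for c in conditions)


conditions = [
    {"step_count": 5, "type": "step_count_is"},
    {"max_cost_in_dollars": 0.5, "type": "max_cost"},
]
print(should_stop(conditions, steps_done=3, cost_so_far=0.6))
```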
store
boolean
default:false
stream
boolean
default:false
temperature
number<double> | null
text
object

Text output configuration including format and verbosity

Example:
{
"format": { "type": "text" },
"verbosity": "medium"
}
tool_choice
Available options:
auto
Example:

"auto"

tools
object[]

Function tool definition

Example:
{
"description": "Get the current weather in a location",
"name": "get_weather",
"parameters": {
"properties": {
"location": {
"description": "The city and state",
"type": "string"
},
"unit": {
"enum": ["celsius", "fahrenheit"],
"type": "string"
}
},
"required": ["location"],
"type": "object"
},
"type": "function"
}
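
When a model calls a function tool, the arguments it supplies can be checked against the tool's parameters schema before the function runs. The sketch below covers only the "required", "enum", and string "type" keywords used in the example above. It is not a full JSON Schema validator, and the name `check_arguments` is hypothetical.

```python
def check_arguments(tool: dict, arguments: dict) -> list:
    """Return a list of problems with `arguments`; empty means they pass."""
    params = tool.get("parameters", {})
    properties = params.get("properties", {})
    problems = []
    for name in params.get("required", []):
        if name not in arguments:
            problems.append(f"missing required argument: {name}")
    for name, value in arguments.items():
        spec = properties.get(name)
        if spec is None:
            continue  # arguments not described by the schema are ignored here
        if spec.get("type") == "string" and not isinstance(value, str):
            problems.append(f"{name} must be a string")
        elif "enum" in spec and value not in spec["enum"]:
            problems.append(f"{name} must be one of {spec['enum']}")
    return problems


get_weather = {
    "name": "get_weather",
    "type": "function",
    "parameters": {
        "type": "object",
        "properties": {
            "location": {"type": "string"},
            "unit": {"type": "string", "enum": ["celsius", "fahrenheit"]},
        },
        "required": ["location"],
    },
}
print(check_arguments(get_weather, {"location": "Paris, France", "unit": "celsius"}))
```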
top_k
integer
top_logprobs
integer | null
top_p
number<double> | null
trace
object

Metadata for observability and tracing. Known keys (trace_id, trace_name, span_name, generation_name, parent_span_id) have special handling. Additional keys are passed through as custom metadata to configured broadcast destinations.

Example:
{
"trace_id": "trace-abc123",
"trace_name": "my-app-trace"
}
truncation
enum<string> | null
Available options:
auto,
disabled,
null
Example:

"auto"

user
string

A unique identifier representing your end-user, which helps distinguish between different users of your app. This allows your app to identify specific users in case of abuse reports, preventing your entire app from being affected by the actions of individual users. Maximum of 256 characters.

Maximum string length: 256

Response

Successful response

Complete non-streaming response from the Responses API

completed_at
integer | null
required
created_at
integer
required
error
object | null
required

Error information returned from the API

Example:
{
"code": "rate_limit_exceeded",
"message": "Rate limit exceeded. Please try again later."
}
frequency_penalty
number<double> | null
required
id
string
required
incomplete_details
object | null
required
Example:
{ "reason": "max_output_tokens" }
instructions
required
Example:
[
{
"content": "What is the weather today?",
"role": "user"
}
]
metadata
object | null
required

Metadata key-value pairs for the request. Keys must be ≤64 characters and cannot contain brackets. Values must be ≤512 characters. Maximum 16 pairs allowed.

Example:
{
"session_id": "abc-def-ghi",
"user_id": "123"
}
model
string
required
object
enum<string>
required
Available options:
response
output
object[]
required

An output item from the response

Example:
{
"content": [
{
"text": "Hello! How can I help you today?",
"type": "output_text"
}
],
"id": "msg-abc123",
"role": "assistant",
"status": "completed",
"type": "message"
}
parallel_tool_calls
boolean
required
presence_penalty
number<double> | null
required
status
enum<string>
required
Available options:
completed,
incomplete,
in_progress,
failed,
cancelled,
queued
Example:

"completed"

temperature
number<double> | null
required
tool_choice
required
Available options:
auto
Example:

"auto"

tools
object[]
required

Function tool definition

Example:
{
"description": "Get the current weather in a location",
"name": "get_weather",
"parameters": {
"properties": {
"location": {
"description": "The city and state",
"type": "string"
},
"unit": {
"enum": ["celsius", "fahrenheit"],
"type": "string"
}
},
"required": ["location"],
"type": "object"
},
"type": "function"
}
top_p
number<double> | null
required
background
boolean | null
max_output_tokens
integer | null
max_tool_calls
integer | null
output_text
string
previous_response_id
string | null
prompt
object | null
Example:
{
"id": "prompt-abc123",
"variables": { "name": "John" }
}
prompt_cache_key
string | null
reasoning
object | null
Example:
{ "effort": "medium", "summary": "auto" }
safety_identifier
string | null
service_tier
enum<string> | null
Available options:
auto,
default,
flex,
priority,
scale,
null
Example:

"default"

store
boolean
text
object

Text output configuration including format and verbosity

Example:
{
"format": { "type": "text" },
"verbosity": "medium"
}
top_logprobs
integer
truncation
enum<string> | null
Available options:
auto,
disabled,
null
Example:

"auto"

usage
object

Token usage information for the response

Example:
{
"input_tokens": 10,
"input_tokens_details": { "cached_tokens": 0 },
"output_tokens": 25,
"output_tokens_details": { "reasoning_tokens": 0 },
"total_tokens": 35,
"cost": 0.0012,
"cost_details": {
"upstream_inference_cost": null,
"upstream_inference_input_cost": 0.0008,
"upstream_inference_output_cost": 0.0004
}
}
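
A short sketch of reading the usage object. In the example above, input and output tokens add up to total_tokens and the two upstream cost parts add up to cost. Whether that holds for every response is an assumption, so the helper reports the values instead of enforcing them. The name `summarize_usage` is hypothetical.

```python
import math


def summarize_usage(usage: dict) -> dict:
    """Derive a few convenience figures from a usage object."""
    input_tokens = usage["input_tokens"]
    output_tokens = usage["output_tokens"]
    cached = usage.get("input_tokens_details", {}).get("cached_tokens", 0)
    details = usage.get("cost_details") or {}
    parts = [
        details.get("upstream_inference_input_cost"),
        details.get("upstream_inference_output_cost"),
    ]
    known = [p for p in parts if p is not None]
    return {
        "tokens_consistent": input_tokens + output_tokens == usage["total_tokens"],
        "cached_fraction": cached / input_tokens if input_tokens else 0.0,
        "upstream_split_total": sum(known) if known else None,
    }


usage = {
    "input_tokens": 10,
    "input_tokens_details": {"cached_tokens": 0},
    "output_tokens": 25,
    "output_tokens_details": {"reasoning_tokens": 0},
    "total_tokens": 35,
    "cost": 0.0012,
    "cost_details": {
        "upstream_inference_cost": None,
        "upstream_inference_input_cost": 0.0008,
        "upstream_inference_output_cost": 0.0004,
    },
}
summary = summarize_usage(usage)
# Compare floats with a tolerance rather than ==.
print(summary["tokens_consistent"], math.isclose(summary["upstream_split_total"], usage["cost"]))
```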
user
string | null
error_type
enum<string>

Canonical OpenRouter error type, stable across all API formats

Available options:
context_length_exceeded,
max_tokens_exceeded,
token_limit_exceeded,
string_too_long,
authentication,
permission_denied,
payment_required,
rate_limit_exceeded,
provider_overloaded,
provider_unavailable,
invalid_request,
invalid_prompt,
not_found,
precondition_failed,
payload_too_large,
unprocessable,
content_policy_violation,
refusal,
invalid_image,
image_too_large,
image_too_small,
unsupported_image_format,
image_not_found,
image_download_failed,
server,
timeout,
unmapped
Example:

"rate_limit_exceeded"

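Because error_type is stable across API formats, a client can branch on it directly. The sketch below groups a few values as retryable and produces capped exponential backoff delays. Which types are worth retrying is an assumption made for illustration, not something this reference states, and the names `is_retryable` and `backoff_delays` are hypothetical.

```python
# Assumption: these error types are transient and worth retrying.
RETRYABLE_ERROR_TYPES = frozenset(
    {
        "rate_limit_exceeded",
        "provider_overloaded",
        "provider_unavailable",
        "server",
        "timeout",
    }
)


def is_retryable(error_type: str) -> bool:
    """Return True if the error type is in the assumed transient set."""
    return error_type in RETRYABLE_ERROR_TYPES


def backoff_delays(max_attempts: int, base: float = 0.5, cap: float = 8.0) -> list:
    """Return one delay in seconds per attempt, doubling each time up to `cap`."""
    return [min(cap, base * 2**attempt) for attempt in range(max_attempts)]
```
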
openrouter_metadata
object
Example:
{
"attempt": 1,
"endpoints": {
"available": [
{
"model": "openai/gpt-4o",
"provider": "OpenAI",
"selected": true
}
],
"total": 1
},
"is_byok": false,
"region": "iad",
"requested": "openai/gpt-4o",
"strategy": "direct",
"summary": "available=1, selected=OpenAI"
}
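
When the metadata opt-in is enabled, the routing decision can be read from openrouter_metadata. The sketch below finds the endpoint marked as selected in an object shaped like the example above. The name `selected_endpoint` is hypothetical, and fields beyond those shown in the example are not assumed.

```python
from typing import Optional


def selected_endpoint(metadata: dict) -> Optional[dict]:
    """Return the first available endpoint flagged as selected, if any."""
    for endpoint in metadata.get("endpoints", {}).get("available", []):
        if endpoint.get("selected"):
            return endpoint
    return None


openrouter_metadata = {
    "attempt": 1,
    "endpoints": {
        "available": [
            {"model": "openai/gpt-4o", "provider": "OpenAI", "selected": True}
        ],
        "total": 1,
    },
    "is_byok": False,
    "region": "iad",
    "requested": "openai/gpt-4o",
    "strategy": "direct",
    "summary": "available=1, selected=OpenAI",
}

chosen = selected_endpoint(openrouter_metadata)
print(chosen["provider"] if chosen else "no endpoint selected")
```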