Endpoint: POST https://fal.run/openrouter/router/vision
Endpoint ID: openrouter/router/vision


Quick Start

import fal_client

def on_queue_update(update):
    if isinstance(update, fal_client.InProgress):
        for log in update.logs:
            print(log["message"])

result = fal_client.subscribe(
    "openrouter/router/vision",
    arguments={
        "image_urls": [
            "https://fal.media/files/tiger/4Ew1xYW6oZCs6STQVC7V8_86440216d0fe42e4b826d03a2121468e.jpg"
        ],
        "prompt": "Caption this image for a text-to-image model with as much detail as possible.",
        "model": "google/gemini-2.5-flash"
    },
    with_logs=True,
    on_queue_update=on_queue_update,
)
print(result)
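The value returned by subscribe is a plain dict shaped like the Output Schema below; a minimal sketch of reading it, where the result literal is a stand-in copied from the Output Example rather than a live API call:

```python
# Stand-in for the dict returned by fal_client.subscribe(),
# shaped like the Output Schema (output + usage).
result = {
    "output": "A close-up of a tiger's face ...",
    "usage": {
        "completion_tokens": 63,
        "cost": 0.0005595,
        "prompt_tokens": 1340,
        "total_tokens": 1403,
    },
}

caption = result["output"]
tokens_used = result["usage"]["total_tokens"]
print(f"{tokens_used} tokens -> {caption!r}")
```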

Input Schema

• image_urls (list<string>, required): List of image URLs to be processed.
• prompt (string, required): Prompt to be used for the image.
• system_prompt (string, optional): System prompt to provide context or instructions to the model.
• model (string, required): Name of the model to use. Charged based on actual token usage.
• reasoning (boolean, default: false): Whether the model's reasoning should be included in the final answer.
• temperature (float, default: 1, range: 0 to 2): Influences the variety in the model's responses. Lower values lead to more predictable and typical responses, while higher values encourage more diverse and less common responses. At 0, the model always gives the same response for a given input.
• max_tokens (integer, optional): Upper limit for the number of tokens the model can generate in response; it won't produce more than this. The maximum value is the context length minus the prompt length.
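Only image_urls, prompt, and model are required. A small helper (hypothetical, not part of fal_client) can assemble the arguments dict and omit optional fields that were left unset:

```python
def build_payload(image_urls, prompt, model, system_prompt=None,
                  reasoning=False, temperature=1.0, max_tokens=None):
    """Assemble request arguments per the Input Schema, dropping
    optional fields that were not set. Hypothetical convenience helper."""
    if not 0 <= temperature <= 2:
        raise ValueError("temperature must be within 0 to 2")
    payload = {
        "image_urls": list(image_urls),
        "prompt": prompt,
        "model": model,
        "reasoning": reasoning,
        "temperature": temperature,
    }
    if system_prompt is not None:
        payload["system_prompt"] = system_prompt
    if max_tokens is not None:
        payload["max_tokens"] = max_tokens
    return payload
```

The returned dict can be passed directly as the arguments parameter of fal_client.subscribe.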

Output Schema

• output (string, required): Generated output.
• usage (UsageInfo, required): Token usage information.

Input Example

{
  "image_urls": [
    "https://fal.media/files/tiger/4Ew1xYW6oZCs6STQVC7V8_86440216d0fe42e4b826d03a2121468e.jpg"
  ],
  "prompt": "Caption this image for a text-to-image model with as much detail as possible.",
  "system_prompt": "Only answer the question, do not provide any additional information or add any prefix/suffix other than the answer of the original question. Don't use markdown.",
  "model": "google/gemini-2.5-flash",
  "reasoning": false,
  "temperature": 1
}

Output Example

{
  "output": "A close-up of a tiger's face focusing on its bright orange iris and the area around its eye, with white fur eyebrows and a contrasting black and rich orange striped fur pattern. The word \"FLUX\" is overlaid in bold, white, brush-stroke styled text across the tiger's face.",
  "usage": {
    "completion_tokens": 63,
    "cost": 0.0005595,
    "prompt_tokens": 1340,
    "total_tokens": 1403
  }
}
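The usage block is internally consistent: prompt_tokens plus completion_tokens equals total_tokens, and cost is the charge for this request. A quick sanity check and a derived per-1K-token rate (handy for cost tracking) on the example above:

```python
usage = {
    "completion_tokens": 63,
    "cost": 0.0005595,
    "prompt_tokens": 1340,
    "total_tokens": 1403,
}

# Token counts add up.
assert usage["prompt_tokens"] + usage["completion_tokens"] == usage["total_tokens"]

# Implied blended USD rate per 1K tokens for this request.
cost_per_1k = usage["cost"] / usage["total_tokens"] * 1000
print(f"{cost_per_1k:.6f} USD per 1K tokens")
```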

🚀 Usage with OpenAI Client

from openai import OpenAI
import os

client = OpenAI(
    base_url="https://fal.run/openrouter/router/openai/v1",
    api_key="not-needed",
    default_headers={
        "Authorization": f"Key {os.environ['FAL_KEY']}",
    },
)

response = client.chat.completions.create(
    model="google/gemini-2.5-flash",
    messages=[
        {
            "role": "system",
            "content": "Only answer the question, do not provide any additional information or add any prefix/suffix other than the answer of the original question. Don't use markdown.",
        },
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "Caption this image for a text-to-image model with as much detail as possible."},
                {
                    "type": "image_url",
                    "image_url": {
                        "url": "https://fal.media/files/tiger/4Ew1xYW6oZCs6STQVC7V8_86440216d0fe42e4b826d03a2121468e.jpg"
                    },
                },
            ],
        },
    ],
    temperature=1,
)

print(response.choices[0].message.content)

🚿 Streaming Example

from openai import OpenAI
import os

client = OpenAI(
    base_url="https://fal.run/openrouter/router/openai/v1",
    api_key="not-needed",
    default_headers={
        "Authorization": f"Key {os.environ['FAL_KEY']}",
    },
)

stream = client.chat.completions.create(
    model="google/gemini-2.5-flash",
    messages=[
        {
            "role": "system",
            "content": "Only answer the question, do not provide any additional information or add any prefix/suffix other than the answer of the original question. Don't use markdown.",
        },
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "Caption this image for a text-to-image model with as much detail as possible."},
                {
                    "type": "image_url",
                    "image_url": {
                        "url": "https://fal.media/files/tiger/4Ew1xYW6oZCs6STQVC7V8_86440216d0fe42e4b826d03a2121468e.jpg"
                    },
                },
            ],
        },
    ],
    temperature=1,
    stream=True,
)

for chunk in stream:
    if chunk.choices and chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="", flush=True)
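To reassemble the full response text from the stream, concatenate the delta contents while skipping chunks whose content is None; a minimal sketch, where the fake chunks only mimic the shape of the objects the client yields:

```python
from types import SimpleNamespace

def collect_stream(stream):
    """Join delta.content across chunks, ignoring chunks without content."""
    parts = []
    for chunk in stream:
        if chunk.choices and chunk.choices[0].delta.content:
            parts.append(chunk.choices[0].delta.content)
    return "".join(parts)

def fake_chunk(content):
    # Mimics the chunk objects yielded by the OpenAI client's stream.
    delta = SimpleNamespace(content=content)
    return SimpleNamespace(choices=[SimpleNamespace(delta=delta)])

chunks = [fake_chunk("A close-up "), fake_chunk(None), fake_chunk("of a tiger.")]
print(collect_stream(chunks))
```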


Limitations

  • temperature range: 0 to 2
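A client-side guard for that range (a convenience sketch, not a feature of the API itself):

```python
def clamp_temperature(value, low=0.0, high=2.0):
    """Clamp a temperature into the supported 0 to 2 range."""
    return max(low, min(high, float(value)))
```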