OpenRouter Chat Completions [OpenAI Compatible] | Large Language Models

openrouter/router/openai/v1/chat/completions

Run any LLM (Large Language Model) with fal, powered by OpenRouter. This endpoint is compatible with the OpenAI API.

You will be charged based on the number of input and output tokens.

🚀 Usage with OpenAI Client

python
from openai import OpenAI
import os

client = OpenAI(
    base_url="https://fal.run/openrouter/router/openai/v1",
    api_key="not-needed",
    default_headers={
        "Authorization": f"Key {os.environ['FAL_KEY']}",
    },
)

response = client.chat.completions.create(
    model="google/gemini-2.5-flash",
    messages=[
        {"role": "user", "content": "Write a short story (under 200 words) about an AI that learns to dream. Use vivid sensory details and end with a surprising twist that makes the reader feel both awe and melancholy."},
    ],
)

print(response.choices[0].message.content)
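Because billing is based on input and output tokens, it can be useful to record the token counts of each call. The sketch below assumes the response carries the standard OpenAI-style `usage` object (`prompt_tokens`, `completion_tokens`, `total_tokens`); whether this endpoint populates it for every model is an assumption, so the helper falls back to zeros. `summarize_usage` is a hypothetical helper name, not part of the OpenAI client or fal, and the sample values are made up for illustration.

```python
from types import SimpleNamespace


def summarize_usage(usage) -> dict:
    """Return input/output/total token counts from an OpenAI-style usage object.

    Falls back to zeros when the usage object or a field is missing
    (assumption: not every model or provider reports usage).
    """
    if usage is None:
        return {"input_tokens": 0, "output_tokens": 0, "total_tokens": 0}
    prompt = getattr(usage, "prompt_tokens", 0) or 0
    completion = getattr(usage, "completion_tokens", 0) or 0
    # Prefer the reported total; otherwise derive it from the two parts.
    total = getattr(usage, "total_tokens", None) or (prompt + completion)
    return {
        "input_tokens": prompt,
        "output_tokens": completion,
        "total_tokens": total,
    }


# Stand-in for `response.usage`, with made-up numbers, so the sketch runs offline.
fake_usage = SimpleNamespace(prompt_tokens=42, completion_tokens=180, total_tokens=222)
usage_summary = summarize_usage(fake_usage)
print(usage_summary)
```

With a real call, pass `response.usage` from the example above: `summarize_usage(response.usage)`.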

🚿 Streaming Example

python
from openai import OpenAI
import os

client = OpenAI(
    base_url="https://fal.run/openrouter/router/openai/v1",
    api_key="not-needed",
    default_headers={
        "Authorization": f"Key {os.environ['FAL_KEY']}",
    },
)

stream = client.chat.completions.create(
    model="google/gemini-2.5-flash",
    messages=[
        {"role": "user", "content": "Explain quantum computing in simple terms."},
    ],
    stream=True,
)

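for chunk in stream:
    # Skip chunks with no choices or no text (e.g. role-only or final chunks);
    # otherwise a missing content would be printed literally as "None".
    delta = chunk.choices[0].delta if chunk.choices else None
    if delta and delta.content:
        print(delta.content, end="", flush=True)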
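If the complete text is needed after streaming (for logging or post-processing), the deltas can be accumulated rather than printed. This is a minimal sketch, assuming chunks shaped like those in the streaming loop (`chunk.choices[0].delta.content`); `collect_stream` and `_fake_chunk` are hypothetical names introduced for illustration, and the fake chunks stand in for a live stream so the example runs without network access.

```python
from types import SimpleNamespace
from typing import Iterable


def collect_stream(stream: Iterable) -> str:
    """Concatenate the text deltas of a chat-completions stream into one string."""
    parts = []
    for chunk in stream:
        # Some chunks carry no choices at all; ignore them.
        if not chunk.choices:
            continue
        delta = chunk.choices[0].delta
        # content may be None (e.g. role-only or final chunks).
        content = getattr(delta, "content", None)
        if content:
            parts.append(content)
    return "".join(parts)


def _fake_chunk(text):
    """Build an object with the same attribute shape as a streamed chunk."""
    return SimpleNamespace(
        choices=[SimpleNamespace(delta=SimpleNamespace(content=text))]
    )


fake_stream = [
    _fake_chunk("Hello"),
    _fake_chunk(None),            # chunk without text
    SimpleNamespace(choices=[]),  # chunk without choices
    _fake_chunk(", world"),
]
demo_text = collect_stream(fake_stream)
print(demo_text)
```

With the real client, pass the `stream` object returned by `client.chat.completions.create(..., stream=True)` in place of `fake_stream`.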

📚 Documentation

For more details, visit the official docs:

🔗 OpenRouter API Docs
⚡ fal.ai API Docs