Generative media platform for developers.

Build the next generation of creativity with fal. Lightning-fast inference.


Peak performance, no compromise on quality.

Access the highest quality generative media models.

Optimized by the fal Inference Engine™.
[Chart: FLUX [dev] inference speed, fal compared with two alternatives]

fal Inference Engine™ is the fastest way to run diffusion models

Run diffusion models up to 4x faster. Enable new user experiences with our real-time infrastructure.

Features

Where developer experience meets the fastest AI.

Inference for Your Private Diffusion Model

If you are training your own diffusion transformer model, we would like to partner with you to run inference on it. Fal's inference engine can run your model up to 50% faster and more cost-effectively. Scale to thousands of GPUs when needed and pay only for what you use.

Blazing Fast Inference Engine for Diffusion Models

We have built the world's fastest inference engine for diffusion models. We can run the FLUX models up to 400% faster than the alternatives.

Best LoRA Trainer in the Industry for FLUX

Fal's head of AI research, Simo Ryu, was the first to implement LoRAs for diffusion models. We now bring you the best LoRA trainer for FLUX. You can personalize or train a new style in less than 5 minutes.

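import { fal } from "@fal-ai/client";

// Assumes credentials are already configured
// (for example via the FAL_KEY environment variable).

// Submit a request to the fast-sdxl model and wait for the result.
const result = await fal.subscribe("fal-ai/fast-sdxl", {
  input: {
    prompt: "photo of a cat wearing a kimono"
  },
  // Ask for log messages to be included in queue updates.
  logs: true,
  // Called whenever the status of the queued request changes.
  onQueueUpdate: (update) => {
    if (update.status === "IN_PROGRESS") {
      update.logs.map((log) => log.message).forEach(console.log);
    }
  },
});

// The model output is available on the resolved result.
console.log(result.data);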

World-class developer experience

Use one of our client libraries to integrate fal directly into your applications.

Pricing

Fast, reliable, and cost-efficient.

fal.ai adapts to your usage, ensuring you only pay for the computing power you consume. It's cost-effective scalability at its best.

Some models are billed by model output. Please check the model playground page for the latest pricing information.

GPU      VRAM    CPUs    CPU Memory    Price per second
H100     80GB    12      112GB         $0.00125/s
A100     40GB    10      60GB          $0.00111/s
A6000    48GB    8       48GB          $0.000575/s
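
As a rough illustration of per-second billing, the sketch below works out how much compute time a fixed budget covers at the per-second rates listed above, and what a single run of a given length would cost. The $20 budget, the 30-second run, and the helper names `secondsForBudget` and `costForRun` are assumptions made for this example and are not part of fal's API. Models billed by output are not covered.

```javascript
// Per-second GPU rates (USD), copied from the pricing list above.
const RATES_PER_SECOND = {
  H100: 0.00125,
  A100: 0.00111,
  A6000: 0.000575,
};

// Hypothetical helper: seconds of compute a budget covers on a given GPU.
function secondsForBudget(budgetUsd, gpu) {
  const rate = RATES_PER_SECOND[gpu];
  if (rate === undefined) throw new Error(`Unknown GPU: ${gpu}`);
  return budgetUsd / rate;
}

// Hypothetical helper: cost of one run lasting `seconds` on a given GPU.
function costForRun(seconds, gpu) {
  const rate = RATES_PER_SECOND[gpu];
  if (rate === undefined) throw new Error(`Unknown GPU: ${gpu}`);
  return seconds * rate;
}

const budgetUsd = 20; // assumed example budget
for (const gpu of Object.keys(RATES_PER_SECOND)) {
  const hours = secondsForBudget(budgetUsd, gpu) / 3600;
  console.log(`${gpu}: about ${hours.toFixed(2)} hours for $${budgetUsd}`);
}
console.log(`30 s on H100: $${costForRun(30, "H100").toFixed(4)}`);
```

At these rates, a $20 budget covers roughly 4.4 hours of H100 time. Actual bills depend on how long each request really runs.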

Billing Based on Model Output

The models below are billed by model output, instead of compute seconds.
Model Name    Unit Price (USD)
FLUX.1 [dev]
FLUX.1 [schnell]
FLUX.1 [pro]
Stable Diffusion 3 - Medium
Stable Video