Generative
media platform
for developers.

Build the next generation of creativity
with fal. Lightning fast inference.


Peak performance,
no compromise on quality.

Access the highest quality
generative media models.

Optimized by the fal Inference Engine™.

Model Gallery

Kling 1.6: image-to-video
MiniMax (Hailuo AI) Video 01 Live: image-to-video, motion, transformation
FLUX1.1 [pro] ultra: text-to-image, high-res, realism
FLUX.1 [dev]: text-to-image
Recraft V3: text-to-image, vector, typography, style
Train Flux LoRA: training, lora, personalization
MiniMax (Hailuo AI) Video 01: image-to-video, motion, transformation
Train Flux LoRAs For Portraits: training, lora, personalization
AuraFlow: text-to-image, typography, style

Explore more models

[Chart: FLUX [dev] inference speed, fal compared with two alternatives]

fal Inference Engine™ is
the fastest way to run
diffusion models

Run diffusion models up to 4x faster. Enable new user experiences with our real-time infrastructure.

Features

Where developer experience meets the fastest AI.

Inference for Your Private Diffusion Model

If you are training your own diffusion transformer model, we would like to partner with you to run inference on it. Fal's inference engine can run your model up to 50% faster and more cost-effectively. Scale to thousands of GPUs when needed and pay only for what you use.

Blazing Fast Inference Engine for Diffusion Models

We have built the world's fastest inference engine for diffusion models. We can run the FLUX models up to 400% faster than the alternatives.

Fine-tune your own models

Best LoRA Trainer In the Industry for Flux

Fal's head of AI research, Simo Ryu, was the first to implement LoRAs for diffusion models. We now bring you the best LoRA trainer for FLUX. You can personalize or train a new style in less than 5 minutes.

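import { fal } from "@fal-ai/client";

// Submit a request and wait for the result, streaming logs while it runs.
const result = await fal.subscribe("fal-ai/fast-sdxl", {
  input: {
    prompt: "photo of a cat wearing a kimono"
  },
  logs: true, // include log entries in queue updates
  onQueueUpdate: (update) => {
    // Print each log message while the request is being processed.
    if (update.status === "IN_PROGRESS") {
      // Wrap console.log so forEach's index and array arguments are not printed.
      update.logs.forEach((log) => console.log(log.message));
    }
  },
});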

World-class developer experience

Use one of our client libraries to integrate fal directly into your applications.
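
The log handling in the snippet above can also be pulled out into a small helper that is easy to test without making a request. This is a minimal sketch, not part of the client library: the helper name `extractLogMessages` is hypothetical, the update shape (`status`, plus `logs` entries with a `message` field) is inferred from the snippet above, and the sample updates are made up for illustration.

```javascript
// Sketch only: collect log messages from a queue update.
// The update shape is an assumption inferred from the snippet above,
// not the client's full type definition.
function extractLogMessages(update) {
  // Only in-progress updates are expected to carry useful logs.
  if (update.status !== "IN_PROGRESS") return [];
  // Guard against updates that arrive without a logs array.
  return (update.logs ?? []).map((log) => log.message);
}

// Hypothetical sample updates, for illustration only.
const sampleUpdates = [
  { status: "IN_QUEUE" },
  {
    status: "IN_PROGRESS",
    logs: [{ message: "loading model" }, { message: "step 1/4" }],
  },
  { status: "IN_PROGRESS" },
];

// Flatten the messages from every update into a single list.
const messages = sampleUpdates.flatMap((update) => extractLogMessages(update));
console.log(messages);
```

A helper like this could be called from `onQueueUpdate` in place of the inline mapping, which keeps the callback short and the filtering logic in one place.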

Pricing

Fast, reliable, and cost-efficient.

fal.ai adapts to your usage, ensuring you only pay for the computing power you consume. It's cost-effective scalability at its best.

For private serverless model pricing, please see our enterprise pricing page.

Billing Based on Model Output

The models below are billed by model output, instead of compute seconds.

Model Name	Unit Price (USD)
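
To make the difference between the two billing modes concrete, here is a rough sketch of the arithmetic. Every number in it is a hypothetical placeholder chosen for illustration, not one of fal's actual rates, and the function names are made up for this example.

```javascript
// Hypothetical rates for illustration only; these are NOT fal's actual prices.
const HYPOTHETICAL_PRICE_PER_IMAGE_USD = 0.05;
const HYPOTHETICAL_PRICE_PER_GPU_SECOND_USD = 0.001;

// Output-based billing: cost depends only on how many units were produced.
function costByOutput(units, unitPriceUsd) {
  return units * unitPriceUsd;
}

// Compute-based billing: cost depends on how long the hardware was busy.
function costByCompute(seconds, pricePerSecondUsd) {
  return seconds * pricePerSecondUsd;
}

// Assumed workload: 200 images, each taking about 3 seconds of compute.
const imageCount = 200;
const assumedSecondsPerImage = 3;

const outputCost = costByOutput(imageCount, HYPOTHETICAL_PRICE_PER_IMAGE_USD);
const computeCost = costByCompute(
  imageCount * assumedSecondsPerImage,
  HYPOTHETICAL_PRICE_PER_GPU_SECOND_USD
);

console.log(`billed by output:  $${outputCost.toFixed(2)}`);
console.log(`billed by compute: $${computeCost.toFixed(2)}`);
```

With output-based billing the cost is predictable from the number of results alone, while compute-based billing varies with how long each request runs.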
