Build the next generation of creativity
with fal. Lightning fast inference.
Access the highest quality
generative media models.
fal Inference Engine™ is
the fastest way to run
diffusion models
Run diffusion models up to 4x faster and enable new user experiences with our real-time infrastructure.
If you are training your own diffusion transformer model, we would like to partner with you to run inference on it. fal's Inference Engine can run your model up to 50% faster and more cost-effectively. Scale to thousands of GPUs when needed and pay only for what you use.
We have built the world's fastest inference engine for diffusion models. It can run the FLUX models up to 400% faster than the alternatives.
fal's head of AI research, Simo Ryu, was the first to implement LoRAs for diffusion models. We now bring you the best LoRA trainer for FLUX: personalize a model or train a new style in less than 5 minutes.
import * as fal from "@fal-ai/serverless-client";

const result = await fal.subscribe("fal-ai/fast-sdxl", {
  input: {
    prompt: "photo of a cat wearing a kimono",
  },
  logs: true,
  onQueueUpdate: (update) => {
    if (update.status === "IN_PROGRESS") {
      update.logs.map((log) => log.message).forEach(console.log);
    }
  },
});
Use one of our client libraries to integrate fal directly into your applications.
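The `onQueueUpdate` callback in the example above can be factored into a small helper and exercised without a network call. This is a minimal sketch; the shape of the update object (`{ status, logs: [{ message }] }`) is assumed from the snippet, so check the client library's types for the authoritative definition.

```javascript
// Collect log messages from a queue update, mirroring the callback above.
// The update shape is an assumption based on the example snippet.
function progressMessages(update) {
  if (update.status !== "IN_PROGRESS") return [];
  return update.logs.map((log) => log.message);
}

// Only in-progress updates yield log messages.
console.log(progressMessages({
  status: "IN_PROGRESS",
  logs: [{ message: "step 1/25" }, { message: "step 2/25" }],
})); // → ["step 1/25", "step 2/25"]
console.log(progressMessages({ status: "COMPLETED", logs: [] })); // → []
```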
fal.ai adapts to your usage, ensuring you only pay for the computing power you consume. It's cost-effective scalability at its best.
Some models are billed by model output. Please check the model's playground page for the latest pricing information.
Model Name | Unit Price (USD)
---|---
FLUX.1 [dev] |
FLUX.1 [schnell] |
FLUX.1 [pro] |
Stable Diffusion 3 - Medium |
Stable Video |