fal Assets is now live!

Real-time generative media inference

Build the next generation of creativity with fal. Lightning fast inference.

Realtime Models

Segment Anything Model 3
SAM 3 is a unified foundation model for promptable segmentation in images and videos. It can detect, segment, and track objects using text or visual prompts such as points, boxes, and masks.

Segment Anything Model 2
SAM 2 is a model for segmenting images and videos in real time.

SAM 3.1
SAM 3.1 comes with Object Multiplex, a shared-memory approach for joint multi-object tracking that delivers faster speeds with a larger number of objects tracked.

Stable Diffusion XL Lightning
Run SDXL at the speed of light.

SAM 3
SAM 3 is a unified foundation model for promptable segmentation in images and videos. It can detect, segment, and track objects using text or visual prompts such as points, boxes, and masks.

MuseTalk
MuseTalk is a real-time, high-quality, audio-driven lip-syncing model. Use MuseTalk to animate a face with your own audio.

Latent Consistency Models (v1.5/XL)
Run SDXL at the speed of light.

Optimized Latent Consistency (SDv1.5)
Produce high-quality images with minimal inference steps. Optimized for 512x512 input image size.

FlashHead
SoulX-FlashHead is a unified 1.3B-parameter framework designed for high-fidelity, infinite-length, real-time streaming portrait video generation.

Latent Consistency (SDXL & SDv1.5)
Produce high-quality images with minimal inference steps.

SDXL Realtime

Fast inference opens up application types that were previously not feasible, such as real-time creativity tools and using the camera as a real-time model input.

# of steps: 2

Inference time: 0.214s

fal Demos for Realtime