Step-Video Text to Video

fal-ai/stepfun-video
Step-Video is a state-of-the-art (SoTA) text-to-video pre-trained model with 30 billion parameters and the capability to generate videos up to 204 frames.
Inference
Commercial use

Input

Additional Settings

Customize your input with more control.

Result

Idle
This generation takes approximately 20m.

Loading pricing info...

Logs