Seedance 2.0 Fast (Text to Video) API on fal

ByteDance's most advanced text-to-video model, available on fal as `bytedance/seedance-2.0/fast/text-to-video`.

Overview

The simplest Seedance 2.0 endpoint: provide a text prompt, get a fully produced cinematic video. No image or reference input required. The model handles camera direction, physics, multi-shot editing, and synchronized audio all in a single pass.

Key capabilities:

Native audio generation: music, SFX, and lip-synced dialogue, all included at no extra cost
Director-level camera control: dolly zooms, rack focuses, tracking shots, POV switches, handheld movement
Realistic physics: collisions, fabric, character motion, vehicle chases, fight scenes
Multi-shot editing: natural cuts within a single generation, up to 15 seconds
Cinematic 720p output

Parameters

Parameter	Type	Default	Description
`prompt`	string	—	Text description of the video to generate
`resolution`	enum	`720p`	`480p` (faster/cheaper) or `720p`
`duration`	enum	`auto`	`auto`, or a whole number of seconds from `4` to `15`, passed as a string (for example `"10"`)
`aspect_ratio`	enum	`auto`	`auto`, `21:9`, `16:9`, `4:3`, `1:1`, `3:4`, `9:16`
`generate_audio`	boolean	`true`	Synchronized audio: SFX, ambient sound, lip-synced speech. Same price either way.
`seed`	integer	—	Set a fixed value for reproducible results (minor variation may still occur)
`end_user_id`	string	—	Optional identifier for the end user
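
Checking values locally before submitting can save a failed request. The sketch below validates arguments against the table above. `build_arguments` and the constant names are hypothetical helpers, not part of `fal-client`, and sending `duration` as a string is an assumption taken from the Quick Start examples rather than a documented requirement.

```python
ALLOWED_RESOLUTIONS = {"480p", "720p"}
ALLOWED_ASPECT_RATIOS = {"auto", "21:9", "16:9", "4:3", "1:1", "3:4", "9:16"}


def build_arguments(prompt, resolution="720p", duration="auto",
                    aspect_ratio="auto", generate_audio=True, seed=None):
    """Validate values against the parameter table and return an arguments dict.

    Hypothetical helper: the checks mirror the table, not the server's validation.
    """
    if not isinstance(prompt, str) or not prompt.strip():
        raise ValueError("prompt must be a non-empty string")
    if resolution not in ALLOWED_RESOLUTIONS:
        raise ValueError(f"unsupported resolution: {resolution!r}")
    if aspect_ratio not in ALLOWED_ASPECT_RATIOS:
        raise ValueError(f"unsupported aspect_ratio: {aspect_ratio!r}")

    # duration is "auto" or a whole number of seconds from 4 to 15.
    # Assumption: it is sent as a string, as in the Quick Start examples.
    duration = str(duration)
    if duration != "auto":
        if not duration.isdigit() or not 4 <= int(duration) <= 15:
            raise ValueError(f"duration must be 'auto' or 4..15, got {duration!r}")

    arguments = {
        "prompt": prompt,
        "resolution": resolution,
        "duration": duration,
        "aspect_ratio": aspect_ratio,
        "generate_audio": bool(generate_audio),
    }
    # Only include the seed when the caller sets one.
    if seed is not None:
        arguments["seed"] = int(seed)
    return arguments
```

The returned dict can be passed as `arguments=` in the Python examples, or as `input` in the JavaScript ones.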

Pricing

Billed per second of generated 720p output:

Tier	Rate	10-sec clip
Fast tier	$0.2419 / sec	~$2.42
Standard tier	$0.3034 / sec	~$3.03
Token-based billing	$0.014 / 1,000 tokens	—

Token formula: `tokens = (height × width × duration × 24) / 1024`

Audio generation is included at no extra cost regardless of the `generate_audio` setting.
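
To make the arithmetic concrete, here is a rough cost estimator based on the rates and the token formula above. The function names are illustrative, and the 1280 × 720 frame size used for 720p is an assumption (a 16:9 frame); this page does not state the pixel dimensions for each resolution and aspect ratio.

```python
FAST_RATE_PER_SEC = 0.2419       # USD per second, fast tier (from the table)
STANDARD_RATE_PER_SEC = 0.3034   # USD per second, standard tier (from the table)
PRICE_PER_1K_TOKENS = 0.014      # USD per 1,000 tokens (from the table)


def per_second_cost(duration_sec, rate_per_sec):
    """Cost of a clip billed at a flat per-second rate."""
    return duration_sec * rate_per_sec


def token_count(height, width, duration_sec):
    """tokens = (height x width x duration x 24) / 1024, as given above."""
    return (height * width * duration_sec * 24) / 1024


def token_cost(height, width, duration_sec):
    """Cost of a clip under token-based billing."""
    return token_count(height, width, duration_sec) / 1000 * PRICE_PER_1K_TOKENS


# Assumed frame size for 720p at 16:9; not stated in the documentation.
HEIGHT, WIDTH = 720, 1280

print(f"fast tier, 10 s:     ${per_second_cost(10, FAST_RATE_PER_SEC):.2f}")
print(f"standard tier, 10 s: ${per_second_cost(10, STANDARD_RATE_PER_SEC):.2f}")
print(f"tokens, 10 s:        {token_count(HEIGHT, WIDTH, 10):.0f}")
print(f"token-based, 10 s:   ${token_cost(HEIGHT, WIDTH, 10):.2f}")
```

Under that frame-size assumption, a 10-second clip comes to 216,000 tokens, or about $3.02 with token-based billing.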

Quick Start

Python

bash
pip install fal-client
export FAL_KEY="YOUR_API_KEY"

python
import fal_client


def on_queue_update(update):
    # Print log messages while the request is being processed.
    if isinstance(update, fal_client.InProgress):
        for log in update.logs:
            print(log["message"])


result = fal_client.subscribe(
    "bytedance/seedance-2.0/fast/text-to-video",
    arguments={
        "prompt": "A lone astronaut walks across a rust-colored Martian plain at dusk, dust devils swirling in the distance, handheld camera, cinematic.",
        "resolution": "720p",
        "duration": "10",
        "aspect_ratio": "16:9",
        "generate_audio": True,
    },
    with_logs=True,
    on_queue_update=on_queue_update,
)

print(result["video"]["url"])

JavaScript / Node.js

bash
npm install @fal-ai/client
export FAL_KEY="YOUR_API_KEY"

js
import { fal } from "@fal-ai/client";

const result = await fal.subscribe("bytedance/seedance-2.0/fast/text-to-video", {
  input: {
    prompt: "A lone astronaut walks across a rust-colored Martian plain at dusk, dust devils swirling in the distance, handheld camera, cinematic.",
    resolution: "720p",
    duration: "10",
    aspect_ratio: "16:9",
    generate_audio: true,
  },
  logs: true,
  onQueueUpdate: (update) => {
    if (update.status === "IN_PROGRESS") {
      update.logs.forEach((log) => console.log(log.message));
    }
  },
});

console.log(result.data.video.url);

Output

json
{
  "video": {
    "url": "https://...",
    "content_type": "video/mp4",
    "file_name": "output.mp4",
    "file_size": 4823041
  },
  "seed": 42
}
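
As a small illustration of reading this response, the sketch below pulls out the fields a caller typically needs. `summarize_output` is a hypothetical helper, and treating `file_size` as a byte count is an assumption; the response shape is taken from the sample above.

```python
import json

# Sample response copied from the Output section above.
SAMPLE_RESPONSE = """
{
  "video": {
    "url": "https://...",
    "content_type": "video/mp4",
    "file_name": "output.mp4",
    "file_size": 4823041
  },
  "seed": 42
}
"""


def summarize_output(result):
    """Return the key fields of a result dict in a flat form."""
    video = result["video"]
    return {
        "url": video["url"],
        "file_name": video["file_name"],
        "content_type": video["content_type"],
        # Assumption: file_size is reported in bytes.
        "size_mib": round(video["file_size"] / (1024 * 1024), 2),
        # The seed can be reused in a later request for reproducibility.
        "seed": result.get("seed"),
    }


summary = summarize_output(json.loads(SAMPLE_RESPONSE))
print(summary)
```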

Async / Queue Usage

python
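import fal_client

# Submit the request without waiting for it to finish.
handler = fal_client.submit(
    "bytedance/seedance-2.0/fast/text-to-video",
    arguments={...},
    webhook_url="https://your-server.com/webhook",
)

# Keep the request ID so the job can be checked later.
request_id = handler.request_id
# Check the current status, including logs.
status = fal_client.status("bytedance/seedance-2.0/fast/text-to-video", request_id, with_logs=True)
# Fetch the output once the request has completed.
result = fal_client.result("bytedance/seedance-2.0/fast/text-to-video", request_id)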

Choosing the Right Endpoint

All three fast-tier endpoints share the same schema and pricing. Choose based on your inputs:

Endpoint	When to use
`bytedance/seedance-2.0/fast/text-to-video`	Text prompt only, no reference media
`bytedance/seedance-2.0/fast/image-to-video`	Animate a single starting image, optionally with an ending frame
`bytedance/seedance-2.0/fast/reference-to-video`	Multi-modal: up to 9 images, 3 videos, and 3 audio clips as references
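
The table above can be expressed as a small selection function. This is only a sketch: `choose_endpoint` and its parameter names are hypothetical. The limits (9 images, 3 videos, 3 audio clips) come from the table, and the rule that an ending frame needs a starting image is inferred from its wording.

```python
BASE = "bytedance/seedance-2.0/fast"


def choose_endpoint(start_image=False, end_image=False,
                    ref_images=0, ref_videos=0, ref_audio=0):
    """Pick a fast-tier endpoint ID based on which inputs are available."""
    # Limits taken from the reference-to-video row of the table.
    if ref_images > 9 or ref_videos > 3 or ref_audio > 3:
        raise ValueError("at most 9 images, 3 videos, and 3 audio clips")

    # Any reference media means the multi-modal endpoint.
    if ref_images or ref_videos or ref_audio:
        return f"{BASE}/reference-to-video"

    # Inferred rule: the ending frame is optional, the starting image is not.
    if end_image and not start_image:
        raise ValueError("an ending frame requires a starting image")
    if start_image:
        return f"{BASE}/image-to-video"

    # No media at all: prompt-only generation.
    return f"{BASE}/text-to-video"


print(choose_endpoint())
print(choose_endpoint(start_image=True, end_image=True))
print(choose_endpoint(ref_images=4, ref_audio=1))
```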
