bytedance/seedance-2.0/fast/reference-to-video
Run Seedance 2.0 AI Fast Reference To Video API on fal
ByteDance's most advanced video generation model, available on fal as `bytedance/seedance-2.0/fast/reference-to-video`.
Overview
Seedance 2.0 is a multi-modal video generation model: it accepts reference images, videos, and audio alongside a text prompt, then generates cinematic 720p video with synchronized audio.
Key capabilities:
- Native audio generation: music, dialogue, and sound effects rendered alongside the video
- Director-level camera control: dolly zooms, rack focuses, tracking shots, POV switches
- Realistic physics: weight, collisions, fabric, and character motion
- Multi-shot editing: a single generation of up to 15 seconds can include natural cuts
- Cinematic output at 720p
Inputs
| Modality | Limit | Formats | Notes |
|---|---|---|---|
| Text prompt | 1 | — | Reference uploaded assets as `@Image1`, `@Video1`, `@Audio1`, etc. |
| Images | Up to 9 | JPEG, PNG, WebP | Max 30 MB each |
| Videos | Up to 3 | MP4, MOV | Combined duration 2–15 s, total under 50 MB, 480p–720p resolution |
| Audio | Up to 3 | MP3, WAV | Combined duration ≤ 15 s, max 15 MB each; requires at least one image or video |
Total files across all modalities must not exceed 12.
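The count limits above can be checked on the client before a request is submitted. The following is a minimal sketch, not part of the fal client: the function name `check_reference_counts` and the constants are hypothetical, and it only checks counts and the audio dependency, not file sizes, formats, or durations.

```python
from typing import List, Sequence

# Limits copied from the Inputs table; the names are illustrative only.
MAX_IMAGES = 9
MAX_VIDEOS = 3
MAX_AUDIO = 3
MAX_TOTAL_FILES = 12


def check_reference_counts(
    image_urls: Sequence[str] = (),
    video_urls: Sequence[str] = (),
    audio_urls: Sequence[str] = (),
) -> List[str]:
    """Return a list of problems; an empty list means the counts are within limits."""
    problems: List[str] = []
    if len(image_urls) > MAX_IMAGES:
        problems.append(f"too many images: {len(image_urls)} > {MAX_IMAGES}")
    if len(video_urls) > MAX_VIDEOS:
        problems.append(f"too many videos: {len(video_urls)} > {MAX_VIDEOS}")
    if len(audio_urls) > MAX_AUDIO:
        problems.append(f"too many audio files: {len(audio_urls)} > {MAX_AUDIO}")
    # Audio references are only accepted alongside at least one image or video.
    if audio_urls and not (image_urls or video_urls):
        problems.append("audio requires at least one image or video")
    total = len(image_urls) + len(video_urls) + len(audio_urls)
    if total > MAX_TOTAL_FILES:
        problems.append(f"too many files in total: {total} > {MAX_TOTAL_FILES}")
    return problems
```

Returning a list of messages rather than raising on the first failure lets a caller report every problem at once.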
Pricing
Billed per second of generated output:
| Condition | Rate |
|---|---|
| Standard (720p, fast tier) | $0.2419 / sec |
| With video input provided | ~$0.1452 / sec (0.6× multiplier) |
| Token-based billing | $0.0112 / 1,000 tokens |
Token formula: `tokens = (height of output video * width of output video * (input video duration + output video duration) * 24) / 1024`
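As a worked check of the token formula, the sketch below computes the token count and cost for one second of output. It assumes that 720p means 1280×720 pixels (the page does not state the exact dimensions) and takes the per-token rate as an explicit argument. The value $0.0112 per 1,000 tokens is used because it reproduces the $0.2419 per-second figure; treat it as an assumption and confirm it against current pricing. The function names are hypothetical.

```python
def estimate_tokens(
    width: int,
    height: int,
    output_seconds: float,
    input_video_seconds: float = 0.0,
) -> float:
    """Token count from the formula above (dimensions in pixels, durations in seconds)."""
    return (height * width * (input_video_seconds + output_seconds) * 24) / 1024


def estimate_cost_usd(
    tokens: float,
    usd_per_1k_tokens: float,
    has_video_input: bool = False,
) -> float:
    """Cost in USD; applies the 0.6 multiplier when video inputs are provided."""
    cost = tokens / 1000 * usd_per_1k_tokens
    return cost * 0.6 if has_video_input else cost


# One second of 1280x720 output with no video input.
tokens = estimate_tokens(1280, 720, output_seconds=1)
print(tokens)  # 21600.0
print(round(estimate_cost_usd(tokens, usd_per_1k_tokens=0.0112), 4))  # 0.2419
```

Note that the formula counts input video duration as well as output duration, so a request with reference videos is billed on more tokens before the 0.6 multiplier is applied.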
Parameters
| Parameter | Type | Default | Description |
|---|---|---|---|
| `prompt` | string | — | Text description of the video to generate |
| `image_urls` | list<string> | — | Reference image URLs |
| `video_urls` | list<string> | — | Reference video URLs |
| `audio_urls` | list<string> | — | Reference audio URLs |
| `resolution` | enum | `720p` | `480p` (faster/cheaper) or `720p` |
| `duration` | enum | `auto` | `auto` or any integer from `4` to `15` seconds |
| `aspect_ratio` | enum | `auto` | `auto`, `21:9`, `16:9`, `4:3`, `1:1`, `3:4`, `9:16` |
| `generate_audio` | boolean | `true` | Generate synchronized audio (SFX, ambient, lip-sync) |
| `seed` | integer | — | Fix seed for reproducibility (minor variation may still occur) |
| `end_user_id` | string | — | Optional identifier for the end user |
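The enum-valued parameters can be validated locally so that a typo fails fast instead of after a queued request. This is a rough sketch under assumptions: `validate_options` is a hypothetical helper, and because the table does not say whether `duration` is sent as a number or a string, the check accepts both `5` and `"5"` and does not convert the value.

```python
from typing import Union

# Allowed values copied from the Parameters table.
RESOLUTIONS = {"480p", "720p"}
ASPECT_RATIOS = {"auto", "21:9", "16:9", "4:3", "1:1", "3:4", "9:16"}


def validate_options(
    resolution: str = "720p",
    duration: Union[int, str] = "auto",
    aspect_ratio: str = "auto",
) -> None:
    """Raise ValueError if an option is outside the documented values."""
    if resolution not in RESOLUTIONS:
        raise ValueError(f"resolution must be one of {sorted(RESOLUTIONS)}")
    if duration != "auto":
        # Accept whole seconds given as an int or a digit string, from 4 to 15.
        text = str(duration)
        if not text.isdecimal() or not 4 <= int(text) <= 15:
            raise ValueError("duration must be 'auto' or an integer from 4 to 15")
    if aspect_ratio not in ASPECT_RATIOS:
        raise ValueError(f"aspect_ratio must be one of {sorted(ASPECT_RATIOS)}")
```

Calling `validate_options(resolution="720p", duration=8, aspect_ratio="16:9")` returns silently, while `duration=3` or `aspect_ratio="2:1"` raises `ValueError`.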
Quick Start
Python
```bash
pip install fal-client
export FAL_KEY="YOUR_API_KEY"
```

```python
import fal_client


def on_queue_update(update):
    # Print log messages while the request is in progress.
    if isinstance(update, fal_client.InProgress):
        for log in update.logs:
            print(log["message"])


result = fal_client.subscribe(
    "bytedance/seedance-2.0/fast/reference-to-video",
    arguments={
        "prompt": "A surfer rides a massive wave at golden hour. @Image1 sets the scene.",
        "image_urls": ["https://your-host.com/beach.jpg"],
        "resolution": "720p",
        "duration": "auto",
        "aspect_ratio": "16:9",
        "generate_audio": True,
    },
    with_logs=True,
    on_queue_update=on_queue_update,
)
print(result["video"]["url"])
```
JavaScript / Node.js
```bash
npm install @fal-ai/client
export FAL_KEY="YOUR_API_KEY"
```

```js
import { fal } from "@fal-ai/client";

const result = await fal.subscribe("bytedance/seedance-2.0/fast/reference-to-video", {
  input: {
    prompt: "A surfer rides a massive wave at golden hour. @Image1 sets the scene.",
    image_urls: ["https://your-host.com/beach.jpg"],
    resolution: "720p",
    duration: "auto",
    aspect_ratio: "16:9",
    generate_audio: true,
  },
  logs: true,
  onQueueUpdate: (update) => {
    // Print log messages while the request is in progress.
    if (update.status === "IN_PROGRESS") {
      for (const log of update.logs) {
        console.log(log.message);
      }
    }
  },
});
console.log(result.data.video.url);
```
Output
```json
{
  "video": {
    "url": "https://...",
    "content_type": "video/mp4",
    "file_name": "output.mp4",
    "file_size": 4823041
  },
  "seed": 42
}
```
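Once a result arrives, the video can be saved locally from the returned URL. The sketch below uses only the Python standard library and assumes the result dictionary has the shape shown above; `save_video` is a hypothetical helper, and it assumes the URL can be fetched without extra authentication headers.

```python
import shutil
import urllib.request
from pathlib import Path


def save_video(result: dict, dest_dir: str = ".") -> Path:
    """Download result["video"]["url"] into dest_dir and return the saved path."""
    video = result["video"]
    # Keep only the final path component so a response value cannot escape dest_dir.
    name = Path(video.get("file_name") or "output.mp4").name
    target = Path(dest_dir) / name
    # Stream the response to disk instead of holding the whole file in memory.
    with urllib.request.urlopen(video["url"]) as response, open(target, "wb") as out:
        shutil.copyfileobj(response, out)
    return target
```

For example, `save_video(result, "downloads")` would write `downloads/output.mp4` for the response above, provided the `downloads` directory already exists.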
Async / Queue Usage
For longer generations, submit to the queue, then either poll for the result or supply a webhook URL:
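```python
import fal_client

# Submit without blocking; the optional webhook_url is called when the request finishes.
handler = fal_client.submit(
    "bytedance/seedance-2.0/fast/reference-to-video",
    arguments={...},
    webhook_url="https://your-server.com/webhook",
)
request_id = handler.request_id

# Check progress at any time using the request ID.
status = fal_client.status(
    "bytedance/seedance-2.0/fast/reference-to-video", request_id, with_logs=True
)

# Fetch the final output for the request.
result = fal_client.result("bytedance/seedance-2.0/fast/reference-to-video", request_id)
```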
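Polling usually means repeating the status call until the request completes or a time limit is reached. The helper below is a generic sketch of that loop and is not part of `fal_client`: `poll_until` and its parameters are hypothetical, and the status check is passed in as a callable so the loop can be tried without network access.

```python
import time
from typing import Any, Callable


def poll_until(
    fetch_status: Callable[[], Any],
    is_complete: Callable[[Any], bool],
    poll_seconds: float = 2.0,
    timeout_seconds: float = 600.0,
) -> Any:
    """Call fetch_status until is_complete(status) is true, then return that status."""
    deadline = time.monotonic() + timeout_seconds
    while True:
        status = fetch_status()
        if is_complete(status):
            return status
        # Stop if the next sleep would run past the deadline.
        if time.monotonic() + poll_seconds > deadline:
            raise TimeoutError(f"request did not complete within {timeout_seconds} s")
        time.sleep(poll_seconds)


# Possible use with fal_client (APP_ID is a placeholder for the endpoint ID):
#   poll_until(
#       lambda: fal_client.status(APP_ID, request_id, with_logs=True),
#       lambda s: isinstance(s, fal_client.Completed),
#   )
#   result = fal_client.result(APP_ID, request_id)
```

A fixed interval keeps the sketch short; a production client might prefer a webhook or a backoff schedule to reduce the number of status calls.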
Fast vs. Standard Tier
The fast tier uses the same schema and parameters as the standard endpoint, with lower latency, lower cost, and the same capabilities.
| | Fast | Standard |
|---|---|---|
| Endpoint suffix | `.../fast/reference-to-video` | `.../reference-to-video` |
| Latency | Lower | Higher |
| Cost | Lower | Higher |
| Output quality | Same | Same |
