bytedance/seedance-2.0/fast/reference-to-video

ByteDance's most advanced reference-to-video model, fast tier. Lower latency and cost with up to 9 images, 3 videos, and 3 audio clips as inputs.
Inference
Commercial use
Partner


Each second of generated 720p video costs $0.2419. Billing is token-based at $0.0112 per 1,000 tokens, where tokens = (height of output video * width of output video * (input video duration + output video duration) * 24) / 1024. If video inputs are provided, the price is multiplied by 0.6, which gives $0.14515 per second at 720p.


Run Seedance 2.0 AI Fast Reference To Video API on fal

ByteDance's most advanced video generation model, available on fal as `bytedance/seedance-2.0/fast/reference-to-video`.


Overview

Seedance 2.0 is a multi-modal video model: it accepts a combination of image, video, and audio references alongside a text prompt, then generates video at up to 720p with synchronized audio.

Key capabilities:

  • Native audio generation: music, dialogue, and sound effects rendered alongside the video
  • Director-level camera control: dolly zooms, rack focuses, tracking shots, POV switches
  • Realistic physics: weight, collisions, fabric, and character motion
  • Multi-shot editing: a single generation can include natural cuts, up to 15 seconds
  • Cinematic output at 720p

Inputs

| Modality | Limit | Formats | Notes |
|---|---|---|---|
| Text prompt | 1 | | Reference uploaded assets as `@Image1`, `@Video1`, `@Audio1`, etc. |
| Images | Up to 9 | JPEG, PNG, WebP | Max 30 MB each |
| Videos | Up to 3 | MP4, MOV | Combined duration 2–15 s, total under 50 MB, 480p–720p resolution |
| Audio | Up to 3 | MP3, WAV | Combined duration ≤ 15 s, max 15 MB each; requires at least one image or video |
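
The `@Image1`-style tags can be checked against the uploaded lists before a request is sent. The sketch below is illustrative only: the helper name `unresolved_references` is hypothetical, and it assumes that `@Image1` refers to the first entry of `image_urls`, `@Image2` to the second, and likewise for videos and audio, which the page implies but does not state.

```python
import re

# Matches tags such as @Image1, @Video2, @Audio3 inside a prompt.
_TAG = re.compile(r"@(Image|Video|Audio)(\d+)")


def unresolved_references(prompt, image_urls=(), video_urls=(), audio_urls=()):
    """Return the tags in `prompt` that have no matching uploaded asset.

    Assumption: tags are 1-based and follow list order, so @Image1 is
    image_urls[0], @Video2 is video_urls[1], and so on.
    """
    available = {
        "Image": len(image_urls),
        "Video": len(video_urls),
        "Audio": len(audio_urls),
    }
    missing = []
    for kind, index in _TAG.findall(prompt):
        n = int(index)
        if n < 1 or n > available[kind]:
            missing.append(f"@{kind}{n}")
    return missing
```

An empty list means every tag in the prompt points at an asset that was actually supplied.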

Total files across all modalities must not exceed 12.
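
The count limits above can be checked client-side before uploading anything. This is a minimal sketch with a hypothetical helper name, `validate_reference_counts`; it covers only the counts and the audio dependency, not file sizes, durations, or formats.

```python
def validate_reference_counts(image_urls=(), video_urls=(), audio_urls=()):
    """Return a list of limit violations; an empty list means the counts are fine."""
    problems = []
    if len(image_urls) > 9:
        problems.append("at most 9 images are allowed")
    if len(video_urls) > 3:
        problems.append("at most 3 videos are allowed")
    if len(audio_urls) > 3:
        problems.append("at most 3 audio clips are allowed")
    total = len(image_urls) + len(video_urls) + len(audio_urls)
    if total > 12:
        problems.append(f"total files must not exceed 12 (got {total})")
    # Audio references are only accepted together with an image or a video.
    if audio_urls and not (image_urls or video_urls):
        problems.append("audio requires at least one image or video")
    return problems
```

Note that the per-modality maxima add up to 15, so a request can satisfy each individual limit and still fail the total of 12.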


Pricing

Billed per second of generated output:

| Condition | Rate |
|---|---|
| Base rate (720p, fast tier) | $0.2419 / sec |
| With video input provided | ~$0.1452 / sec (0.6× multiplier) |
| Token-based billing | $0.0112 / 1,000 tokens |

Token formula: `tokens = (height of output video * width of output video * (input video duration + output video duration) * 24) / 1024`
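
As a worked check of the token formula, the sketch below estimates tokens and cost for a request. It rests on several assumptions: 720p is taken to mean 1280×720 pixels, the rate used is $0.0112 per 1,000 tokens (the value that reproduces the $0.2419 per-second figure), and the 0.6 multiplier is applied to the whole request whenever any video input is present. The function names are hypothetical.

```python
TOKEN_PRICE_PER_1000 = 0.0112   # USD; reproduces $0.2419 / sec at 1280x720
VIDEO_INPUT_MULTIPLIER = 0.6    # applied when video inputs are provided


def estimate_tokens(width, height, output_seconds, input_video_seconds=0.0):
    """Token count from the formula above (durations in seconds)."""
    return height * width * (input_video_seconds + output_seconds) * 24 / 1024


def estimate_cost(width, height, output_seconds, input_video_seconds=0.0):
    """Estimated cost in USD; an approximation, not an official quote."""
    tokens = estimate_tokens(width, height, output_seconds, input_video_seconds)
    cost = tokens / 1000 * TOKEN_PRICE_PER_1000
    if input_video_seconds > 0:
        cost *= VIDEO_INPUT_MULTIPLIER
    return cost
```

For one second of 1280×720 output with no video input, the formula gives 21,600 tokens, or about $0.2419, which matches the per-second rate quoted on this page.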


Parameters

| Parameter | Type | Default | Description |
|---|---|---|---|
| `prompt` | string | | Text description of the video to generate |
| `image_urls` | list<string> | | Reference image URLs |
| `video_urls` | list<string> | | Reference video URLs |
| `audio_urls` | list<string> | | Reference audio URLs |
| `resolution` | enum | `720p` | `480p` (faster/cheaper) or `720p` |
| `duration` | enum | `auto` | `auto` or any integer from `4` to `15` seconds |
| `aspect_ratio` | enum | `auto` | `auto`, `21:9`, `16:9`, `4:3`, `1:1`, `3:4`, `9:16` |
| `generate_audio` | boolean | `true` | Generate synchronized audio (SFX, ambient, lip-sync) |
| `seed` | integer | | Fix seed for reproducibility (minor variation may still occur) |
| `end_user_id` | string | | Optional identifier for the end user |
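
The enum constraints in the parameter list can be enforced locally before a request is made. The helper below, `build_arguments`, is a hypothetical sketch rather than part of any client library. It passes `duration` through unchanged, since the page does not say whether the API expects the number of seconds as an integer or as a string.

```python
RESOLUTIONS = {"480p", "720p"}
ASPECT_RATIOS = {"auto", "21:9", "16:9", "4:3", "1:1", "3:4", "9:16"}


def build_arguments(prompt, resolution="720p", duration="auto",
                    aspect_ratio="auto", generate_audio=True, seed=None):
    """Validate the documented enums and return an arguments dict."""
    if resolution not in RESOLUTIONS:
        raise ValueError(f"resolution must be one of {sorted(RESOLUTIONS)}")
    # bool is a subclass of int in Python, so it is excluded explicitly.
    is_seconds = isinstance(duration, int) and not isinstance(duration, bool)
    if duration != "auto" and not (is_seconds and 4 <= duration <= 15):
        raise ValueError("duration must be 'auto' or an integer from 4 to 15")
    if aspect_ratio not in ASPECT_RATIOS:
        raise ValueError(f"aspect_ratio must be one of {sorted(ASPECT_RATIOS)}")
    arguments = {
        "prompt": prompt,
        "resolution": resolution,
        "duration": duration,
        "aspect_ratio": aspect_ratio,
        "generate_audio": generate_audio,
    }
    if seed is not None:
        arguments["seed"] = seed  # only sent when the caller fixes a seed
    return arguments
```

Reference URL lists (`image_urls`, `video_urls`, `audio_urls`) can be added to the returned dict afterwards.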

Quick Start

Python

```bash
pip install fal-client
export FAL_KEY="YOUR_API_KEY"
```

```python
import fal_client


def on_queue_update(update):
    # Print log messages while the request is being processed.
    if isinstance(update, fal_client.InProgress):
        for log in update.logs:
            print(log["message"])


result = fal_client.subscribe(
    "bytedance/seedance-2.0/fast/reference-to-video",
    arguments={
        "prompt": "A surfer rides a massive wave at golden hour. @Image1 sets the scene.",
        "image_urls": ["https://your-host.com/beach.jpg"],
        "resolution": "720p",
        "duration": "auto",
        "aspect_ratio": "16:9",
        "generate_audio": True,
    },
    with_logs=True,
    on_queue_update=on_queue_update,
)

# URL of the generated video file.
print(result["video"]["url"])
```

JavaScript / Node.js

```bash
npm install @fal-ai/client
export FAL_KEY="YOUR_API_KEY"
```

```js
import { fal } from "@fal-ai/client";

const result = await fal.subscribe("bytedance/seedance-2.0/fast/reference-to-video", {
  input: {
    prompt: "A surfer rides a massive wave at golden hour. @Image1 sets the scene.",
    image_urls: ["https://your-host.com/beach.jpg"],
    resolution: "720p",
    duration: "auto",
    aspect_ratio: "16:9",
    generate_audio: true,
  },
  logs: true,
  onQueueUpdate: (update) => {
    // Print log messages while the request is being processed.
    if (update.status === "IN_PROGRESS") {
      // Wrap console.log so forEach's index and array arguments are not printed.
      update.logs.forEach((log) => console.log(log.message));
    }
  },
});

// URL of the generated video file.
console.log(result.data.video.url);
```

Output

```json
{
  "video": {
    "url": "https://...",
    "content_type": "video/mp4",
    "file_name": "output.mp4",
    "file_size": 4823041
  },
  "seed": 42
}
```
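
A response shaped like the example above can be unpacked with a small helper. This is a sketch only: `summarize_output` is a hypothetical name, `file_size` is assumed to be in bytes, and `seed` is read defensively in case it is absent.

```python
def summarize_output(result):
    """Return (url, size_in_mib, seed) from a response dict shaped like the example."""
    video = result["video"]
    # Assumption: file_size is reported in bytes.
    size_mib = video["file_size"] / (1024 * 1024)
    return video["url"], size_mib, result.get("seed")
```

With the Python client from the Quick Start, the dict returned by `fal_client.subscribe` could be passed in directly.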

Async / Queue Usage

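For longer generations, submit to the queue, then check the status and fetch the result by request ID. A webhook URL can also be passed so that your server is notified when the request finishes:

```python
import fal_client

ENDPOINT = "bytedance/seedance-2.0/fast/reference-to-video"

# Submit the request without waiting for it to complete.
handler = fal_client.submit(
    ENDPOINT,
    arguments={...},  # same arguments as in the Quick Start
    webhook_url="https://your-server.com/webhook",
)

request_id = handler.request_id

# Check the current queue status (with logs), then fetch the result once it is ready.
status = fal_client.status(ENDPOINT, request_id, with_logs=True)
result = fal_client.result(ENDPOINT, request_id)
```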
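
If no webhook is used, the status check is typically repeated until the request finishes. The generic loop below is a minimal sketch of that pattern; `wait_until` is a hypothetical helper and makes no calls to fal itself, so the completion check is supplied by the caller.

```python
import time


def wait_until(is_done, interval=2.0, timeout=600.0,
               sleep=time.sleep, clock=time.monotonic):
    """Call is_done() every `interval` seconds until it returns True.

    Returns True on completion, or False if `timeout` seconds pass first.
    `sleep` and `clock` are injectable so the loop can be tested without waiting.
    """
    deadline = clock() + timeout
    while True:
        if is_done():
            return True
        if clock() >= deadline:
            return False
        sleep(interval)
```

With the Python client, `is_done` could wrap `fal_client.status(...)` and test whether the returned status is the completed one, for example with `isinstance(status, fal_client.Completed)`, assuming the client exposes a `Completed` class alongside the `InProgress` class used in the Quick Start.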

Fast vs. Standard Tier

The fast tier uses the same schema and parameters as the standard endpoints: lower latency and lower cost, same capabilities.

| | Fast | Standard |
|---|---|---|
| Endpoint suffix | `.../fast/reference-to-video` | `.../reference-to-video` |
| Latency | Lower | Higher |
| Cost | Lower | Higher |
| Output quality | Same | Same |