bytedance/seedance-2.0/fast/image-to-video

ByteDance's most advanced image-to-video model, fast tier. Lower latency and cost with synchronized audio, start and end frame control, and motion prompts.
Inference · Commercial use · Partner

Seedance 2.0: Fast Tier Image-to-Video API

ByteDance's most advanced image-to-video model, available on fal as `bytedance/seedance-2.0/fast/image-to-video`.


Overview

Provide a starting image URL and a text prompt describing the desired motion. The model preserves the visual content of your image and animates it — with cinematic camera control, realistic physics, and synchronized audio all included.

Key capabilities:

  • Native audio generation: music, SFX, and dialogue at no extra cost
  • Director-level camera control: dolly zooms, tracking shots, POV switches, rack focuses
  • Realistic physics: weight, collisions, fabric, and character motion
  • Multi-shot cuts possible within a single generation, up to 15 seconds
  • Start-and-end frame control: provide both a starting and ending image and the model transitions between them
  • Cinematic output at 720p

Inputs

| Parameter | Required | Type | Description |
|---|---|---|---|
| `prompt` | Yes | string | Text describing the desired motion and action |
| `image_url` | Yes | string | Starting frame to animate. JPEG, PNG, WebP, max 30 MB |
| `end_image_url` | No | string | Optional ending frame. When provided, the video transitions from the start image to the end image |
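Before uploading, you can pre-check a start frame against the format and size constraints above. A minimal sketch — the helper name and magic-byte checks are our own, not part of the fal SDK:

```python
MAX_BYTES = 30 * 1024 * 1024  # 30 MB upload limit

def validate_start_frame(data: bytes) -> str:
    """Return the MIME type if data looks like a valid start frame."""
    if len(data) > MAX_BYTES:
        raise ValueError("image exceeds the 30 MB limit")
    if data.startswith(b"\xff\xd8\xff"):               # JPEG magic bytes
        return "image/jpeg"
    if data.startswith(b"\x89PNG\r\n\x1a\n"):          # PNG magic bytes
        return "image/png"
    if data[:4] == b"RIFF" and data[8:12] == b"WEBP":  # WebP RIFF container
        return "image/webp"
    raise ValueError("unsupported format: expected JPEG, PNG, or WebP")
```

Catching these locally avoids a round trip that would fail server-side anyway.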

Parameters

| Parameter | Type | Default | Description |
|---|---|---|---|
| `resolution` | enum | `720p` | `480p` (faster/cheaper) or `720p` |
| `duration` | enum | `auto` | `auto` or any integer from `4` to `15` seconds |
| `aspect_ratio` | enum | `auto` | `auto` (inferred from input image), `21:9`, `16:9`, `4:3`, `1:1`, `3:4`, `9:16` |
| `generate_audio` | boolean | `true` | Synchronized audio: SFX, ambient sound, lip-synced speech |
| `seed` | integer | — | Fix for reproducibility (minor variation may still occur) |
| `end_user_id` | string | — | Optional identifier for the end user |
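The enum constraints above are easy to trip over (for example, `duration` takes `"auto"` or an integer from 4 to 15). A small client-side guard, sketched here as a hypothetical helper rather than anything shipped in the fal SDK:

```python
VALID_ASPECT_RATIOS = {"auto", "21:9", "16:9", "4:3", "1:1", "3:4", "9:16"}

def build_arguments(prompt, image_url, duration="auto", aspect_ratio="auto",
                    resolution="720p", generate_audio=True):
    """Assemble a request payload, rejecting out-of-range values early."""
    if duration != "auto" and not (4 <= int(duration) <= 15):
        raise ValueError("duration must be 'auto' or an integer from 4 to 15")
    if aspect_ratio not in VALID_ASPECT_RATIOS:
        raise ValueError(f"invalid aspect_ratio: {aspect_ratio}")
    if resolution not in {"480p", "720p"}:
        raise ValueError(f"invalid resolution: {resolution}")
    return {
        "prompt": prompt,
        "image_url": image_url,
        "duration": str(duration),
        "aspect_ratio": aspect_ratio,
        "resolution": resolution,
        "generate_audio": generate_audio,
    }
```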

Pricing

Billed per token of generated output. At 720p the token rates work out to the per-second prices below:

| Tier | Per 1,000 tokens | Per second of 720p output |
|---|---|---|
| Fast | $0.0112 | $0.2419 |
| Standard | $0.014 | $0.3024 |

Token formula: `tokens = (height × width × duration × 24) / 1024`

At 720p (1280 × 720) and 24 fps, that comes to 21,600 tokens per second. A 10-second fast clip costs approximately $2.42, versus ~$3.03 on standard.
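The token formula makes cost easy to estimate up front. A quick sketch, assuming 720p means 1280 × 720 (which matches the published per-second rates):

```python
RATE_PER_1K = {"fast": 0.0112, "standard": 0.014}  # USD per 1,000 tokens

def estimate_tokens(width: int, height: int, duration_sec: float, fps: int = 24) -> float:
    """tokens = (height * width * duration * fps) / 1024"""
    return height * width * duration_sec * fps / 1024

def estimate_cost(width: int, height: int, duration_sec: float, tier: str = "fast") -> float:
    """Estimated USD cost for a clip of the given size and length."""
    return estimate_tokens(width, height, duration_sec) / 1000 * RATE_PER_1K[tier]

# 10 seconds of 720p on the fast tier
print(round(estimate_cost(1280, 720, 10), 2))  # → 2.42
```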


Quick Start

Python

```bash
pip install fal-client
export FAL_KEY="YOUR_API_KEY"
```

```python
import fal_client

def on_queue_update(update):
    # Stream progress logs while the request runs
    if isinstance(update, fal_client.InProgress):
        for log in update.logs:
            print(log["message"])

result = fal_client.subscribe(
    "bytedance/seedance-2.0/fast/image-to-video",
    arguments={
        "prompt": "The cat slowly turns its head and blinks, fur ruffling in a gentle breeze.",
        "image_url": "https://your-host.com/cat.jpg",
        "resolution": "720p",
        "duration": "auto",
        "aspect_ratio": "auto",
        "generate_audio": True,
    },
    with_logs=True,
    on_queue_update=on_queue_update,
)

print(result["video"]["url"])
```

With start and end frames:

```python
result = fal_client.subscribe(
    "bytedance/seedance-2.0/fast/image-to-video",
    arguments={
        "prompt": "The sun sets behind the mountains, sky shifting from gold to deep purple.",
        "image_url": "https://your-host.com/golden-hour.jpg",
        "end_image_url": "https://your-host.com/twilight.jpg",
        "duration": "8",
        "aspect_ratio": "16:9",
    },
)
```
JavaScript / Node.js

```bash
npm install @fal-ai/client
export FAL_KEY="YOUR_API_KEY"
```

```js
import { fal } from "@fal-ai/client";

const result = await fal.subscribe("bytedance/seedance-2.0/fast/image-to-video", {
  input: {
    prompt: "The cat slowly turns its head and blinks, fur ruffling in a gentle breeze.",
    image_url: "https://your-host.com/cat.jpg",
    resolution: "720p",
    duration: "auto",
    aspect_ratio: "auto",
    generate_audio: true,
  },
  logs: true,
  onQueueUpdate: (update) => {
    if (update.status === "IN_PROGRESS") {
      update.logs.map((log) => log.message).forEach(console.log);
    }
  },
});

console.log(result.data.video.url);
```

Output

```json
{
  "video": {
    "url": "https://...",
    "content_type": "video/mp4",
    "file_name": "output.mp4",
    "file_size": 4823041
  },
  "seed": 42
}
```

Async / Queue Usage

For longer generations, submit to the queue and poll:

```python
handler = fal_client.submit(
    "bytedance/seedance-2.0/fast/image-to-video",
    arguments={...},
    webhook_url="https://your-server.com/webhook",
)

request_id = handler.request_id
status = fal_client.status("bytedance/seedance-2.0/fast/image-to-video", request_id, with_logs=True)
result = fal_client.result("bytedance/seedance-2.0/fast/image-to-video", request_id)
```
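If you poll instead of relying on the webhook, wrap the status check in a loop with a timeout. A generic sketch — this helper is ours, not part of the SDK; you would plug `fal_client.status(...)` in as `get_status` and check for `fal_client.Completed` in `is_done`:

```python
import time

def poll_until_done(get_status, is_done, interval=2.0, timeout=600.0):
    """Call get_status() every `interval` seconds until is_done(status) is true."""
    deadline = time.monotonic() + timeout
    while True:
        status = get_status()
        if is_done(status):
            return status
        if time.monotonic() >= deadline:
            raise TimeoutError("generation did not finish within the timeout")
        time.sleep(interval)
```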

Fast vs. Standard Tier

The fast tier uses the exact same schema and parameters as `bytedance/seedance-2.0/image-to-video`.

| | Fast | Standard |
|---|---|---|
| Endpoint | `bytedance/seedance-2.0/fast/image-to-video` | `bytedance/seedance-2.0/image-to-video` |
| Cost (10 sec) | ~$2.42 | ~$3.03 |
| Latency | Lower | Higher |
| Output quality | Same | Same |

Compared to Reference-to-Video

| | Image to Video | Reference to Video |
|---|---|---|
| Starting image | 1 (required) | Up to 9 (optional) |
| Ending image | 1 (optional) | Not supported |
| Reference videos | Not supported | Up to 3 |
| Reference audio | Not supported | Up to 3 |
| Use case | Animate a single image | Multi-reference, multi-modal generation |

Use image-to-video when you have one image to animate. Use reference-to-video when you need multi-modal inputs or want to reference multiple visual assets in a single prompt.


Availability

  • April 2, 2026: Launched with geographic and enterprise-only restrictions
  • April 9, 2026: All restrictions lifted, fully open with no geographic or use-case limitations