Endpoint:
POST https://fal.run/fal-ai/bytedance/seedance/v1.5/pro/image-to-video
Endpoint ID: fal-ai/bytedance/seedance/v1.5/pro/image-to-video
Quick Start
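A minimal sketch of calling the endpoint over plain HTTP with Python's standard library. The `Authorization: Key ...` header scheme and the `FAL_KEY` environment variable are assumptions that this page does not state, and the image URL is a placeholder. The request is only built here; the commented lines at the end show how it could be sent.

```python
import json
import os
import urllib.request

ENDPOINT = "https://fal.run/fal-ai/bytedance/seedance/v1.5/pro/image-to-video"


def build_request(payload: dict, api_key: str) -> urllib.request.Request:
    """Build (but do not send) a JSON POST request for the endpoint."""
    return urllib.request.Request(
        ENDPOINT,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            # Assumption: key-based auth header; confirm against your account docs.
            "Authorization": f"Key {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


payload = {
    "prompt": "She turns to face the camera and smiles",
    "image_url": "https://example.com/start-frame.png",  # placeholder URL
}
request = build_request(payload, os.environ.get("FAL_KEY", "YOUR_API_KEY"))

# To actually send the request (network call, generation may take a while):
# with urllib.request.urlopen(request) as response:
#     result = json.loads(response.read())
```

Only prompt and image_url are required; every other input falls back to its default.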
Input Schema
- prompt: The text prompt used to generate the video.
- aspect_ratio: The aspect ratio of the generated video. Default value: "16:9". Possible values: 21:9, 16:9, 4:3, 1:1, 3:4, 9:16, auto.
- resolution: Video resolution. 480p for faster generation, 720p for balance, 1080p for higher quality. Default value: "720p". Possible values: 480p, 720p, 1080p.
- duration: Duration of the video in seconds. Default value: "5". Possible values: 4, 5, 6, 7, 8, 9, 10, 11, 12.
- camera_fixed: Whether to fix the camera position.
- seed: Random seed to control video generation. Use -1 for random.
- Safety checker: If set to true, the safety checker will be enabled. Default value: true.
- generate_audio: Whether to generate audio for the video. Default value: true.
- image_url: The URL of the image used to generate the video.
- end_image_url: The URL of the image the video ends with. Defaults to None.
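The allowed values above can be checked on the client before a request is sent. The following is a sketch only: `build_input` is a hypothetical helper, and passing duration as a string is an assumption based on the quoted default "5" in the schema.

```python
ASPECT_RATIOS = {"21:9", "16:9", "4:3", "1:1", "3:4", "9:16", "auto"}
RESOLUTIONS = {"480p", "720p", "1080p"}
DURATIONS = {str(n) for n in range(4, 13)}  # "4" through "12"


def build_input(prompt, image_url, *, aspect_ratio="16:9", resolution="720p",
                duration="5", generate_audio=True, end_image_url=None):
    """Validate options against the documented values and return a payload dict."""
    if not prompt or not image_url:
        raise ValueError("prompt and image_url are required")
    if aspect_ratio not in ASPECT_RATIOS:
        raise ValueError(f"unsupported aspect_ratio: {aspect_ratio!r}")
    if resolution not in RESOLUTIONS:
        raise ValueError(f"unsupported resolution: {resolution!r}")
    duration = str(duration)  # accept 5 or "5"; assumption: the API expects a string
    if duration not in DURATIONS:
        raise ValueError(f"duration must be 4 to 12 seconds, got {duration!r}")

    payload = {
        "prompt": prompt,
        "image_url": image_url,
        "aspect_ratio": aspect_ratio,
        "resolution": resolution,
        "duration": duration,
        "generate_audio": generate_audio,
    }
    # The end frame is optional, so only include it when one was given.
    if end_image_url is not None:
        payload["end_image_url"] = end_image_url
    return payload
```

Failing fast on an unsupported value avoids spending a round trip on a request the service would reject.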
Output Schema
Generated video file
Seed used for generation
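A sketch of reading these two fields from a decoded JSON response. The field names `video`, `url`, and `seed` are assumptions, since this page lists only the descriptions, and the sample response is made up for illustration.

```python
from typing import Optional, Tuple


def extract_result(response: dict) -> Tuple[str, Optional[int]]:
    """Return (video_url, seed) from a response dict.

    Assumption: the video file is nested as response["video"]["url"]
    and the seed is a top-level "seed" field.
    """
    video = response.get("video") or {}
    url = video.get("url")
    if not url:
        raise KeyError("response has no video URL")
    return url, response.get("seed")


# Hypothetical response, not real output from the service.
sample = {"video": {"url": "https://example.com/output.mp4"}, "seed": 42}
video_url, seed = extract_result(sample)
```

Storing the returned seed alongside the prompt makes it possible to rerun the same generation later.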
Use Cases
| Use Case | Why Seedance 1.5 Pro fits |
|---|---|
| Photo animation | Breathe life into a still portrait or product shot with realistic motion and ambient sound. |
| Character animation | Turn concept art or a single character frame into a speaking, emoting performance with lip-sync. |
| Product reveals | Start on a hero shot, end on packaging — the model animates the transition with cinematic flair. |
| Scene transitions | Define start and end compositions for precise A-to-B shots — useful for ads, trailers, or music videos. |
| Storyboard-to-video | Convert illustrated storyboard frames into rough-cut motion tests with matching audio. |
| Social content | Animate memes, portraits, or fan art into shareable clips with sound. |
| Virtual avatars | Animate a single headshot into a talking-head video with natural speech and lip-sync. |
Key Features
| Feature | Description |
|---|---|
| Start frame conditioning | Upload an image to set the opening composition, lighting, subject, and style. |
| End frame conditioning | Optionally upload a second image to define where the shot lands — the model generates the motion path between them. |
| Native audio generation | Dialogue, sound effects, and ambient audio rendered alongside the video. Lip movements stay locked to speech. |
| Cinematic camera work | Pan, tilt, zoom, dolly, orbit, tracking shots — describe the move in your prompt. |
| Character consistency | The subject from your start frame stays stable throughout — face, clothing, and expression. |
| High resolution | Output up to 1080p with smooth temporal consistency. |
Controls
| Parameter | Options | Notes |
|---|---|---|
| prompt | Text (required) | Describe action, dialogue, camera, and sound |
| image_url | URL (required) | Start frame: sets the opening composition |
| end_image_url | URL (optional) | End frame: defines the closing composition |
| aspect_ratio | 21:9 · 16:9 · 4:3 · 1:1 · 3:4 · 9:16 · auto | Default: 16:9 |
| resolution | 480p · 720p · 1080p | Default: 720p. 480p for faster iteration, 720p for balance, 1080p for higher quality |
| duration | 4–12 seconds | Default: 5 |
| generate_audio | true / false | Default: true. Set false for silent video |
| camera_fixed | true / false | Lock the camera in place (tripod shot) |
| seed | Integer | Set a value for reproducibility; use -1 for random |
Start Frame / End Frame
This is the core differentiator from text-to-video. You control the opening and closing compositions directly.

| Frame | What it does |
|---|---|
| Start frame (image_url) | Required. Sets the initial subject, pose, lighting, color grade, and environment. The model animates forward from here. |
| End frame (end_image_url) | Optional. Defines the final composition. The model generates a motion path that lands precisely on this frame. |
- Use the same subject in both frames for smooth transitions.
- Match aspect ratio and style between start and end frames.
- Motion is generated in latent space — not interpolated — so physics and camera movement feel natural.
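One way to check the aspect-ratio tip before uploading is to map each frame's pixel size to the nearest supported ratio and compare the results. This is a sketch under stated assumptions: `closest_aspect_ratio` and `frames_match` are hypothetical helpers, and the pixel dimensions are expected to come from whatever image library you already use.

```python
import math

# Supported ratios from the input schema (excluding "auto"), as width / height.
SUPPORTED_RATIOS = {
    "21:9": 21 / 9,
    "16:9": 16 / 9,
    "4:3": 4 / 3,
    "1:1": 1.0,
    "3:4": 3 / 4,
    "9:16": 9 / 16,
}


def closest_aspect_ratio(width: int, height: int) -> str:
    """Return the supported ratio nearest to width/height.

    Distance is measured on a log scale so that being too wide and
    being too tall by the same factor count equally.
    """
    actual = width / height
    return min(
        SUPPORTED_RATIOS,
        key=lambda name: abs(math.log(actual / SUPPORTED_RATIOS[name])),
    )


def frames_match(start_size, end_size) -> bool:
    """True when both (width, height) pairs map to the same supported ratio."""
    return closest_aspect_ratio(*start_size) == closest_aspect_ratio(*end_size)
```

For instance, a 1920x1080 start frame and a 1280x720 end frame both map to 16:9, while pairing a landscape start frame with a square end frame would be flagged as a mismatch.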
Prompting Tips
Your prompt guides what happens between the frames:

| Element | Example |
|---|---|
| Action | "She turns to face the camera and smiles" |
| Dialogue | "I've been waiting for this moment." (use quotes) |
| Camera | "Slow push-in ending on a close-up" |
| Audio/Foley | "Soft piano, room reverb, fabric rustling" |
- The start frame already defines the scene — focus your prompt on motion and sound.
- For talking heads, put the dialogue in quotes and describe the emotion: "I can't believe it," voice breaking with emotion.
- Use camera_fixed: true if you want the subject to move but the frame to stay locked.
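The elements above can be assembled into a single prompt string programmatically. The helper below is a hypothetical sketch: the "Camera:" and "Audio:" labels are a formatting choice made here, not something the model is documented to require.

```python
def _sentence(text: str) -> str:
    """Trim whitespace and make sure the text ends with exactly one period."""
    return text.strip().rstrip(".") + "."


def compose_prompt(action, dialogue=None, emotion=None, camera=None, audio=None):
    """Join action, quoted dialogue, camera move, and audio cues into one prompt."""
    parts = [_sentence(action)]
    if dialogue:
        spoken = dialogue.strip()
        if emotion:
            # Dialogue stays in quotes, followed by the emotion description.
            parts.append(f'"{spoken.rstrip(".,")}," {_sentence(emotion)}')
        else:
            parts.append(f'"{spoken}"')
    if camera:
        parts.append(_sentence(f"Camera: {camera}"))
    if audio:
        parts.append(_sentence(f"Audio: {audio}"))
    return " ".join(parts)


prompt = compose_prompt(
    "She turns to face the camera and smiles",
    dialogue="I can't believe it",
    emotion="voice breaking with emotion",
    camera="Slow push-in ending on a close-up",
    audio="Soft piano, room reverb, fabric rustling",
)
```

Because the start frame already defines the scene, the helper takes no scene-description argument and covers only motion, speech, camera, and sound.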
Specs
| Spec | Value |
|---|---|
| Max duration | 12 seconds |
| Max resolution | 1080p |
| Audio | Mixed dialogue + foley + score, 48 kHz AAC |
| Output format | MP4 (H.264) |
API
fal.ai → Seedance 1.5 Pro image-to-video
Related
- Bytedance — Video Generation
Limitations
- aspect_ratio restricted to: 21:9, 16:9, 4:3, 1:1, 3:4, 9:16, auto
- resolution restricted to: 480p, 720p, 1080p
- Content moderation via safety checker