Build production-ready video generation using ByteDance's Seedance 1.5 Pro. This guide covers Python and JavaScript implementations, parameter configuration for synchronized audio-visual output, and error handling with exponential backoff.
Building with Seedance 1.5 Pro
ByteDance's Seedance 1.5 Pro introduces a dual-branch diffusion transformer architecture that generates video and audio simultaneously within a shared latent space [1]. This architectural approach addresses the synchronization challenges that have historically plagued text-to-video systems, where audio is typically added as a post-processing step. The model produces tight lip-sync and natural foley without requiring separate audio generation or manual alignment.
This guide walks through integrating Seedance 1.5 Pro via fal's serverless infrastructure. The model accepts text prompts and returns video files with synchronized audio, supporting resolutions up to 720p and durations from 4 to 12 seconds. For workflows requiring image input, fal also offers an image-to-video endpoint that animates start and end frames with generated motion and audio.
API Architecture
Seedance 1.5 Pro operates as a serverless inference endpoint. You send requests and receive generated videos without managing infrastructure. The fal endpoint (`fal-ai/bytedance/seedance/v1.5/pro/text-to-video`) handles the computational complexity behind the scenes.
The dual-branch architecture represents a departure from traditional video generation pipelines. Research on unified audio-video generation has demonstrated that processing both modalities in a shared latent space produces stronger cross-modal alignment than sequential approaches [2]. Seedance 1.5 Pro applies this principle at production scale, enabling synchronized dialogue, spatial sound effects, and coordinated ambient audio.
Development Environment Setup
The Seedance 1.5 API uses API key authentication through fal's platform. Obtain your API key from the fal dashboard following the quickstart guide.
For Python developers, install the fal client library:
pip install fal-client
JavaScript developers using Node.js should install:
npm install @fal-ai/client
Store your API key as an environment variable rather than hardcoding it. This prevents accidental exposure in version control and simplifies credential rotation.
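A fail-fast check at startup surfaces a missing key immediately instead of as an opaque authentication error mid-request (a minimal sketch; the error message is illustrative):

```python
import os

# Abort early if the key was never exported into the environment.
if not os.getenv("FAL_KEY"):
    raise RuntimeError("FAL_KEY is not set; export it before running this app.")
```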
Python Integration
Here is a complete Python implementation that generates a video from a text prompt:
import fal_client
import os

# Read the API key from the environment (see setup above).
fal_client.api_key = os.getenv("FAL_KEY")

def generate_video(prompt, duration="5", resolution="720p"):
    # subscribe() blocks until generation completes and returns the result.
    result = fal_client.subscribe(
        "fal-ai/bytedance/seedance/v1.5/pro/text-to-video",
        arguments={
            "prompt": prompt,
            "duration": duration,
            "resolution": resolution,
            "aspect_ratio": "16:9",
            "generate_audio": True,
            "enable_safety_checker": True
        }
    )
    return result

video = generate_video(
    prompt="A golden retriever playing fetch in a park at sunset, slow motion",
    duration="8"
)
print(f"Video URL: {video['video']['url']}")
The subscribe method handles the asynchronous nature of video generation automatically, waiting for completion and returning the final result. This approach simplifies your code compared to managing polling loops manually through the Queue API.
JavaScript Implementation
For web developers building interactive applications:
import { fal } from "@fal-ai/client";

fal.config({ credentials: process.env.FAL_KEY });

const result = await fal.subscribe(
  "fal-ai/bytedance/seedance/v1.5/pro/text-to-video",
  {
    input: {
      prompt:
        "Chef tossing vegetables in a wok, flames rising, restaurant kitchen",
      duration: "6",
      resolution: "720p",
      generate_audio: true,
    },
    logs: true,
    // Fires on queue status changes; stream log messages while in progress.
    onQueueUpdate: (update) => {
      if (update.status === "IN_PROGRESS") {
        console.log(update.logs.map((log) => log.message).join("\n"));
      }
    },
  }
);

console.log(result.data.video.url);
The onQueueUpdate callback provides real-time feedback during generation, which you can use to display progress indicators in your user interface.
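The Python client exposes an equivalent hook. A minimal sketch, assuming the `with_logs` and `on_queue_update` options of `fal_client.subscribe` behave as shown in fal's client docs:

```python
import fal_client

def on_queue_update(update):
    # InProgress updates carry the log entries emitted during generation.
    if isinstance(update, fal_client.InProgress):
        for log in update.logs:
            print(log["message"])

result = fal_client.subscribe(
    "fal-ai/bytedance/seedance/v1.5/pro/text-to-video",
    arguments={"prompt": "Chef tossing vegetables in a wok", "duration": "6"},
    with_logs=True,
    on_queue_update=on_queue_update,
)
```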
Response Schema
The API returns a JSON object with the following structure:
{
  "video": {
    "url": "https://v3b.fal.media/files/.../video.mp4",
    "content_type": "video/mp4"
  },
  "seed": 42
}
The `video.url` field contains the download URL for the generated MP4 file. The `seed` value enables reproducibility when passed back to subsequent requests.
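Because the URL is a plain HTTPS link, saving the clip locally needs nothing model-specific; here is a short sketch using the requests library (`download_video` is an illustrative helper, not part of the SDK):

```python
import requests

def download_video(result, path="output.mp4"):
    # Fetch the MP4 from the returned URL and write it to disk.
    response = requests.get(result["video"]["url"], timeout=120)
    response.raise_for_status()
    with open(path, "wb") as f:
        f.write(response.content)
    return path
```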
Request Parameters
| Parameter | Options | Description |
|---|---|---|
| `prompt` | Text (required) | Scene description including action, dialogue, camera movement, and sound |
| `aspect_ratio` | `21:9`, `16:9`, `4:3`, `1:1`, `3:4`, `9:16` | Match to distribution platform. Default: `16:9` |
| `resolution` | `480p`, `720p` | Use `480p` for iteration, `720p` for final output |
| `duration` | 4 to 12 seconds | Longer videos require more processing. Default: 5 |
| `generate_audio` | `true`, `false` | Enable synchronized audio. Default: `true` |
| `camera_fixed` | `true`, `false` | Lock camera position for stability |
| `seed` | Integer or -1 | Set specific value for reproducibility, -1 for random |
Prompt engineering matters significantly with Seedance 1.5 Pro. Instead of generic descriptions like "a person walking," construct detailed scene compositions: "middle-aged woman in business attire walking through a modern office lobby, morning sunlight streaming through glass windows, her heels clicking on marble floors." Include camera movements, lighting conditions, and audio elements for precise control.
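To keep user input structured while still producing detailed prompts, a template can assemble the pieces; this builder is hypothetical, not part of the fal SDK:

```python
# Hypothetical prompt template: structured fields nudge users toward the
# detailed scene compositions the model rewards.
PROMPT_TEMPLATE = (
    "{subject} {action} in {setting}, {lighting}, "
    "camera: {camera}, audio: {audio}"
)

prompt = PROMPT_TEMPLATE.format(
    subject="middle-aged woman in business attire",
    action="walking",
    setting="a modern office lobby",
    lighting="morning sunlight streaming through glass windows",
    camera="slow tracking shot",
    audio="heels clicking on marble floors",
)
```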
When selecting aspect ratios, consider your distribution channel (a lookup sketch follows this list):

- TikTok and Instagram Stories perform best with `9:16`
- YouTube prefers `16:9`
- Cinematic content benefits from `21:9`
- Social media feed posts work well with `1:1`
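These recommendations can live in a small lookup table keyed by channel; the dictionary below is an illustrative sketch, and the key names are not fal API values:

```python
# Illustrative mapping from distribution channel to the aspect_ratio
# parameter; keys are hypothetical labels, values come from the list above.
ASPECT_RATIOS = {
    "tiktok": "9:16",
    "instagram_story": "9:16",
    "youtube": "16:9",
    "cinematic": "21:9",
    "feed_post": "1:1",
}

aspect_ratio = ASPECT_RATIOS["youtube"]  # -> "16:9"
```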
Performance Characteristics
Expect generation times of 30-45 seconds for a typical 5-second 720p clip. First requests after idle periods may experience additional cold start latency. Plan your UX accordingly: display progress indicators and avoid blocking user interfaces.
For high-volume applications, implement request queuing rather than sequential processing. The Queue API allows you to submit requests and poll for completion or receive results via webhooks. When using webhooks, your endpoint receives a POST request containing the same response schema documented above, allowing you to process completed videos asynchronously.
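The submit-then-collect flow looks like this with the Python client; a minimal sketch, assuming `fal_client.submit` and its request handle behave as described in fal's Queue API documentation:

```python
import fal_client

# Enqueue the request without blocking the caller.
handler = fal_client.submit(
    "fal-ai/bytedance/seedance/v1.5/pro/text-to-video",
    arguments={
        "prompt": "Waves crashing on a rocky shore at dawn",
        "duration": "5",
        "resolution": "720p",
    },
)

# Do other work here, then block only when the result is actually needed.
result = handler.get()
print(result["video"]["url"])
```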
Error Handling
Production applications require robust error handling with retry logic for transient failures:
import time

def generate_with_retry(prompt, max_retries=3, backoff_factor=2):
    for attempt in range(max_retries):
        try:
            return fal_client.subscribe(
                "fal-ai/bytedance/seedance/v1.5/pro/text-to-video",
                arguments={"prompt": prompt, "duration": "5", "resolution": "720p"}
            )
        except fal_client.exceptions.RateLimitError:
            if attempt < max_retries - 1:
                # Exponential backoff: wait 1s, 2s, 4s, ... between attempts.
                time.sleep(backoff_factor ** attempt)
            else:
                raise
        except fal_client.exceptions.ValidationError:
            raise  # Don't retry invalid input
This pattern handles rate limiting gracefully while avoiding infinite retry loops on permanent failures like invalid prompts.
Pricing
Each 720p 5-second video with audio costs approximately $0.26. The pricing formula:
tokens = (height × width × FPS × duration) / 1024
cost = tokens × rate
For a 720p video (1280×720) at 24fps for 5 seconds:
tokens = (1280 × 720 × 24 × 5) / 1024 = 108,000 tokens
cost with audio = 108,000 × ($2.4 / 1,000,000) = $0.26
Without audio, the rate drops to $1.2 per million tokens. The 480p resolution generates faster and consumes fewer tokens, making it suitable for previews before committing to full-quality renders. Monitor usage patterns and implement budget alerts to prevent unexpected charges.
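If you want to budget programmatically, the formula above translates directly into a helper function; `estimate_cost` is an illustrative name, not part of the fal SDK:

```python
def estimate_cost(width=1280, height=720, fps=24, duration_s=5, with_audio=True):
    """Estimate USD cost from the token formula above (illustrative helper)."""
    tokens = (width * height * fps * duration_s) / 1024
    rate_per_million = 2.4 if with_audio else 1.2  # USD per 1M tokens
    return tokens * rate_per_million / 1_000_000

# 720p at 24fps for 5 seconds with audio -> roughly $0.26
print(f"${estimate_cost():.2f}")
```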
Production Deployment Checklist
Before launching your Seedance 1.5 integration, verify these elements:
- Environment variables: API keys stored securely, never in source code
- Error monitoring: Logging integrated with services like Sentry or DataDog
- Rate limiting: Client-side throttling implemented and tested
- Webhook handlers: If using async workflows, endpoints secured and validated
- Content moderation: Safety checker enabled unless you have specific reasons otherwise
- Cost alerts: Budget monitoring configured
- Timeout configuration: Account for 30-45 second generation times plus potential cold starts
- User feedback: Progress indicators and clear error messages in your UI
Next Steps
Once you have mastered basic integration, explore advanced patterns. Experiment with seed-based variation generation, where you create multiple versions of the same concept by varying seeds while keeping prompts constant. Build prompt templates that users can customize with structured inputs rather than free-text. For image-driven workflows, the image-to-video endpoint accepts start and end frames, generating motion and audio between them.
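A minimal sketch of that seed-sweep pattern, using 480p for cheap iteration (the seed values are arbitrary):

```python
import fal_client

prompt = "A golden retriever playing fetch in a park at sunset, slow motion"

# Hold the prompt constant and vary only the seed to get distinct takes.
for seed in (101, 202, 303):
    result = fal_client.subscribe(
        "fal-ai/bytedance/seedance/v1.5/pro/text-to-video",
        arguments={"prompt": prompt, "resolution": "480p", "seed": seed},
    )
    print(seed, result["video"]["url"])
```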
References
1. ByteDance Seed. "Seedance 1.5 Pro." seed.bytedance.com, 2025. https://seed.bytedance.com/en/seedance1_5_pro
2. Zhao, L., et al. "UniForm: A Unified Multi-Task Diffusion Transformer for Audio-Video Generation." arXiv preprint arXiv:2502.03897, 2025. https://arxiv.org/abs/2502.03897
