SkyReels V1 (Image-to-Video)

fal-ai/skyreels-i2v
SkyReels V1 is the first and most advanced open-source human-centric video foundation model. By fine-tuning HunyuanVideo on O(10M) high-quality film and television clips
Inference
Commercial use

SkyReels V1 | [image-to-video]

SkyReels V1 generates human-centric video from a single static image at $0.30 per video. The model is HunyuanVideo fine-tuned on roughly 10 million film and television clips, with an emphasis on human motion quality: facial expressions, body language, and natural movement learned from cinematic footage. It is built for creators who need believable human subjects without the uncanny-valley effect common in general-purpose video models.

Use Cases: Character Animation for Film | Social Media Content Creation | Marketing Video Production


Performance

At $0.30 per video generation, SkyReels V1 is positioned as a specialized alternative to general-purpose video models, trading broader scene coverage for higher human-motion fidelity.

Metric | Result | Context
Output Format | MP4 video | Single image input to video output
Inference Steps | 1-50, configurable | Default of 30 steps balances quality and speed
Cost per Video | $0.30 | About 3.3 generations per $1.00 on fal
Aspect Ratios | 16:9, 9:16 | Optimized for social and widescreen formats
Guidance Scale | 1.0-20.0 | Default of 6.0 for balanced prompt adherence
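
The per-dollar figure in the table is simple division, and the same arithmetic extends to batch budgeting. Here is a minimal Python sketch, assuming the flat $0.30 price quoted on this page applies to every generation regardless of settings. The function names are illustrative and not part of any fal SDK.

```python
from decimal import Decimal, ROUND_DOWN

# Price quoted on this page; assumed flat per generation.
PRICE_PER_VIDEO = Decimal("0.30")


def videos_for_budget(budget_usd: str) -> int:
    """Return how many whole videos fit in the given budget."""
    ratio = Decimal(budget_usd) / PRICE_PER_VIDEO
    return int(ratio.to_integral_value(rounding=ROUND_DOWN))


def cost_for_videos(count: int) -> Decimal:
    """Return the total cost in USD for a number of generations."""
    return PRICE_PER_VIDEO * count


if __name__ == "__main__":
    print(videos_for_budget("1.00"))  # 3 whole videos (1.00 / 0.30 is about 3.3)
    print(cost_for_videos(25))        # 7.50
```

`Decimal` is used instead of floats so that currency totals do not pick up binary rounding error.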

Human-Centric Motion Architecture

SkyReels V1 builds on HunyuanVideo with targeted fine-tuning on roughly 10 million film and television clips, a dataset curated for human subjects rather than general scenes. Where broad video models attempt universal scene generation, this training approach prioritizes natural human movement, facial micro-expressions, and coherent body language.

What this means for you:

  • Cinematic Human Motion: Training on professional film footage is intended to produce movement closer to filmed performance than to synthetic-looking motion

  • Prompt-Driven Control: Text descriptions guide video generation, with a configurable guidance scale (1.0-20.0) and negative prompts to suppress unwanted attributes

  • Flexible Aspect Ratios: Native support for 16:9 widescreen and 9:16 vertical formats eliminates post-production cropping for platform-specific content

  • Deterministic Generation: Optional seed parameter enables reproducible results for iterative refinement workflows
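
The controls listed above map naturally onto a request payload. The Python sketch below validates the documented ranges before anything is sent. The field names (`prompt`, `image_url`, `guidance_scale`, `num_inference_steps`, `aspect_ratio`, `negative_prompt`, `seed`) and the 16:9 default are assumptions for illustration and should be checked against the API documentation. The `generate` helper further assumes the `fal-client` package is installed and a `FAL_KEY` is configured.

```python
from typing import Any, Dict, Optional

MODEL_ID = "fal-ai/skyreels-i2v"     # endpoint id shown on this page
ASPECT_RATIOS = ("16:9", "9:16")     # ratios listed on this page


def build_arguments(
    prompt: str,
    image_url: str,
    *,
    guidance_scale: float = 6.0,      # documented default
    num_inference_steps: int = 30,    # documented default
    aspect_ratio: str = "16:9",       # default chosen for this sketch
    negative_prompt: Optional[str] = None,
    seed: Optional[int] = None,
) -> Dict[str, Any]:
    """Build a request payload, rejecting values outside the documented ranges."""
    if not 1.0 <= guidance_scale <= 20.0:
        raise ValueError("guidance_scale must be within 1.0-20.0")
    if not 1 <= num_inference_steps <= 50:
        raise ValueError("num_inference_steps must be within 1-50")
    if aspect_ratio not in ASPECT_RATIOS:
        raise ValueError(f"aspect_ratio must be one of {ASPECT_RATIOS}")

    # Field names below are assumed, not confirmed by this page.
    arguments: Dict[str, Any] = {
        "prompt": prompt,
        "image_url": image_url,
        "guidance_scale": guidance_scale,
        "num_inference_steps": num_inference_steps,
        "aspect_ratio": aspect_ratio,
    }
    if negative_prompt:
        arguments["negative_prompt"] = negative_prompt
    if seed is not None:
        # Reusing the same seed is what makes a run reproducible.
        arguments["seed"] = seed
    return arguments


def generate(arguments: Dict[str, Any]) -> Any:
    """Submit the request and wait for the result (network call)."""
    import fal_client  # third-party: pip install fal-client

    return fal_client.subscribe(MODEL_ID, arguments=arguments)
```

For iterative refinement, one approach is to keep `seed` fixed and change a single setting at a time, so that differences between outputs can be attributed to that setting.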


Technical Specifications

Spec | Details
Architecture | HunyuanVideo (fine-tuned)
Input Formats | Single image URL (JPG, PNG, WebP, GIF, AVIF) + text prompt
Output Formats | MP4 video
Inference Steps | 1-50, configurable (default: 30)
License | Commercial use permitted
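
Because the input is an image URL in one of the listed formats, a cheap client-side check can catch obvious mistakes before a paid request is made. The sketch below is a heuristic only: it inspects the URL's file extension, treats `.jpeg` as equivalent to JPG (an assumption), and cannot detect valid images served from extensionless URLs. The helper name is hypothetical.

```python
from pathlib import PurePosixPath
from urllib.parse import urlparse

# Extensions for the formats listed in the spec table (.jpeg assumed as a JPG alias).
SUPPORTED_EXTENSIONS = {".jpg", ".jpeg", ".png", ".webp", ".gif", ".avif"}


def looks_like_supported_image(image_url: str) -> bool:
    """Return True if the URL is http(s) and ends in a listed image extension."""
    parsed = urlparse(image_url)
    if parsed.scheme not in ("http", "https"):
        return False
    # Use only the path so query strings do not hide the extension.
    suffix = PurePosixPath(parsed.path).suffix.lower()
    return suffix in SUPPORTED_EXTENSIONS
```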


How It Stacks Up

MuseTalk Image to Video – SkyReels V1 generates full-body human motion from single images with text prompt control. MuseTalk focuses specifically on audio-driven facial animation and lip-sync for talking head videos.

Kling Video v2.6 Pro Image to Video – SkyReels V1 trades general-purpose video generation for specialized human motion quality at $0.30 per video. Kling v2.6 Pro handles broader scene types with advanced prompt interpretation for multi-purpose video generation needs.