Pixverse Image to Video

fal-ai/pixverse/swap
Generate high-quality video clips by swapping people, objects, and backgrounds using Pixverse Swap.

For a 5-second video, a request costs $0.15 at 360p or 540p, $0.20 at 720p, and $0.40 at 1080p. If the input video is longer than 5 seconds, the cost doubles. For $1 you can run this model approximately 2 times at the 1080p rate.

Pixverse Swap | [image/video-to-video]

Pixverse's Swap technology delivers targeted video manipulation at $0.15-$0.40 per 5-second clip, trading broad generation capabilities for surgical precision in person, object, and background replacement. This image-to-video approach bypasses prompt engineering entirely. You provide the video and reference image, and Pixverse handles the rest through keyframe-based swapping across three distinct modes.

Use Cases: Content Personalization | Product Placement | Background Replacement


Performance

At $0.15 per 5-second video (360p/540p), $0.20 at 720p, or $0.40 at 1080p, Pixverse Swap positions itself as a specialized editing tool rather than a generation engine. Costs double for videos exceeding 5 seconds, making it most economical for short-form content manipulation.

Metric | Result | Context
Processing Cost | $0.15-$0.40 per 5s | Resolution-dependent: 360p/540p ($0.15), 720p ($0.20), 1080p ($0.40); doubles for >5s videos
Swap Modes | 3 distinct modes | Person, object, and background targeting via keyframe selection
Resolution Support | Up to 720p standard | 1080p listed but not supported in the current implementation
Audio Handling | Original audio preserved | Optional toggle to maintain source video soundtrack
Related Endpoints | Pixverse v5.5 Effects, Pixverse v3.5 Transition | Effects-based and transition-focused variants for different creative workflows
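
The pricing rule is simple enough to sketch in a few lines. The helper below is an illustrative estimate, not an official calculator: the function names are hypothetical, prices are held in integer cents to avoid floating-point rounding, and it assumes that "doubles" applies to any input strictly longer than 5 seconds, as the text above states.

```python
# Per-request price in cents for a clip of up to 5 seconds, from the pricing above.
BASE_PRICE_CENTS = {"360p": 15, "540p": 15, "720p": 20, "1080p": 40}


def estimate_cost_cents(resolution: str, duration_seconds: float) -> int:
    """Estimate the cost of one request in cents (hypothetical helper)."""
    if resolution not in BASE_PRICE_CENTS:
        raise ValueError(f"unknown resolution: {resolution!r}")
    if duration_seconds <= 0:
        raise ValueError("duration must be positive")
    # The cost doubles when the input video is longer than 5 seconds.
    multiplier = 2 if duration_seconds > 5 else 1
    return BASE_PRICE_CENTS[resolution] * multiplier


def runs_for_budget(budget_cents: int, resolution: str, duration_seconds: float) -> int:
    """Number of whole requests a budget covers at the given settings."""
    return budget_cents // estimate_cost_cents(resolution, duration_seconds)
```

For example, `runs_for_budget(100, "1080p", 5)` gives 2 whole runs for $1, while the same budget covers 6 runs at 360p.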

Surgical Precision Over Broad Generation

Pixverse Swap diverges from traditional text-to-video models by operating on existing footage rather than generating from scratch. You select a keyframe position (frame 1 through the last frame), choose your swap mode, and provide a reference image; the system then handles semantic matching and temporal consistency automatically.

What this means for you:

  • Mode-Specific Targeting: Separate person, object, and background modes ensure the model focuses swap operations on semantically appropriate elements rather than applying broad transformations

  • Keyframe Control: Frame-level precision (keyframe_id parameter) lets you anchor swaps to specific moments, critical when timing matters for narrative or product placement

  • Audio Preservation: Original soundtrack retention (original_sound_switch) maintains audio-visual sync without re-processing audio tracks

  • Resolution Flexibility: 360p through 720p output options balance quality against cost; the per-clip price ranges from $0.15 (360p/540p) to $0.20 (720p), with 1080p listed at $0.40, so you can optimize the budget per project
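
A request built from the options above might look roughly like the following sketch. Only `keyframe_id`, `original_sound_switch`, and the endpoint ID `fal-ai/pixverse/swap` come from this page; the field names `video_url`, `image_url`, `mode`, and `resolution`, and the mode strings themselves, are assumptions, so check the endpoint's schema before relying on them. The call goes through the `fal_client` Python package, which is assumed to be installed and configured with an API key.

```python
from typing import Any

ENDPOINT = "fal-ai/pixverse/swap"  # endpoint ID shown on this page
SWAP_MODES = ("person", "object", "background")  # assumed mode strings
SUPPORTED_RESOLUTIONS = ("360p", "540p", "720p")  # 1080p is listed but unsupported


def build_swap_arguments(
    video_url: str,
    image_url: str,
    mode: str,
    keyframe_id: int = 1,
    resolution: str = "720p",
    original_sound_switch: bool = True,
) -> dict[str, Any]:
    """Assemble a request payload (hypothetical helper).

    Field names other than keyframe_id and original_sound_switch are assumed.
    """
    if mode not in SWAP_MODES:
        raise ValueError(f"mode must be one of {SWAP_MODES}")
    if resolution not in SUPPORTED_RESOLUTIONS:
        raise ValueError(f"resolution must be one of {SUPPORTED_RESOLUTIONS}")
    if keyframe_id < 1:
        raise ValueError("keyframe_id starts at frame 1")
    return {
        "video_url": video_url,      # assumed field name
        "image_url": image_url,      # assumed field name
        "mode": mode,                # assumed field name
        "resolution": resolution,    # assumed field name
        "keyframe_id": keyframe_id,
        "original_sound_switch": original_sound_switch,
    }


def run_swap(arguments: dict[str, Any]) -> Any:
    """Submit the request and wait for the result."""
    import fal_client  # third-party package; imported lazily so the builder works without it

    return fal_client.subscribe(ENDPOINT, arguments=arguments)
```

Validating the mode and resolution locally avoids paying for a request that the service would reject; the shape of the returned result is not documented here, so the sketch returns it unchanged.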


Technical Specifications

Spec | Details
Architecture | Pixverse Swap
Input Formats | Video (MP4, MOV, WebM, M4V, GIF) + Image (JPG, JPEG, PNG, WebP, GIF, AVIF)
Output Formats | MP4 video with optional original audio
Resolution Options | 360p, 540p, 720p (1080p listed but unsupported)
License | Commercial use permitted
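
Before uploading, it can help to check that both inputs use one of the listed formats. The sketch below is a hypothetical pre-flight check that looks only at the file extension of a local path or URL; it does not inspect the actual container or codec, and the function names are illustrative.

```python
from pathlib import PurePosixPath
from urllib.parse import urlparse

# Extensions taken from the input formats listed above.
VIDEO_EXTENSIONS = {".mp4", ".mov", ".webm", ".m4v", ".gif"}
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png", ".webp", ".gif", ".avif"}


def extension_of(path_or_url: str) -> str:
    """Return the lowercase file extension, ignoring any URL query string."""
    path = urlparse(path_or_url).path
    return PurePosixPath(path).suffix.lower()


def check_inputs(video: str, image: str) -> list[str]:
    """Return a list of problems; an empty list means both extensions look fine."""
    problems = []
    if extension_of(video) not in VIDEO_EXTENSIONS:
        problems.append(f"unsupported video format: {video}")
    if extension_of(image) not in IMAGE_EXTENSIONS:
        problems.append(f"unsupported image format: {image}")
    return problems
```

Returning a list rather than raising on the first failure lets a caller report both problems at once.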


How It Stacks Up

Pixverse v5.5 Effects ($0.15): Pixverse Swap ($0.15-$0.40) specializes in targeted element replacement through keyframe-based swapping, and its base price for a 5-second clip matches that of Effects. Effects prioritizes stylistic transformations and motion effects for creative workflows where artistic control matters more than surgical precision.

Pixverse v3.5 Transition ($0.15): Pixverse Swap ($0.15-$0.40) focuses on in-video element replacement at the same base price, while Transition handles image-to-image morphing for smooth scene changes. Transition excels at bridging static frames; Swap handles manipulation of dynamic footage where the existing motion needs to be preserved.