fal-ai/pika/v2.2/text-to-video

Start with a simple text prompt to generate dynamic video at up to 1080p, with better image clarity and sharper visuals.

Example prompt

Large elegant white poodle standing proudly on the deck of a white yacht, wearing oversized glamorous sunglasses and a luxurious silk Gucci-style scarf tied around its neck, layered pearl necklaces draped across its chest, photographed from outside the yacht at a low upward angle, clear blue sky background, strong midday sunlight, washed-out faded tones, slightly overexposed 2000s fashion editorial aesthetic, cinematic analog film texture, playful luxury mood, glossy magazine style, bright harsh light and soft shadows, stylish and extravagant atmosphere. camera slow orbit and dolly in

A 5-second video costs $0.20 at 720p and $0.45 at 1080p.

Pika v2.2 | [text-to-video]

Pika Labs' Text to Video v2.2 generates video at up to 1080p resolution, priced at $0.20 per 5-second clip (720p) or $0.45 per 5-second clip (1080p). The model trades inference speed for visual clarity, prioritizing sharp output over generation time. It is built for creators who need production-ready video content from text prompts without compromising on resolution.

Use Cases: Marketing Content Creation | Social Media Video Production | Concept Visualization

Performance

At $0.20-$0.45 per 5-second video depending on resolution, Pika v2.2 is a premium option that accepts higher cost in exchange for quality: approximately 2-4x the price of standard text-to-video alternatives, with 1080p output available.

Metric	Result	Context
Maximum Resolution	Up to 1080p	720p ($0.20) or 1080p ($0.45) per 5-second video
Video Duration	5-10 seconds	Configurable via API parameter
Cost per Video (720p)	$0.20	5 generations per $1.00 on fal
Cost per Video (1080p)	$0.45	About 2.2 generations per $1.00 on fal
Aspect Ratios	7 formats	16:9, 9:16, 1:1, 4:5, 5:4, 3:2, 2:3
Related Endpoints	Pika v2.1 Text to Video, Pika Effects v1.5, Pika Scenes v2.2	Text-to-video vs image-to-video variants
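
The per-clip prices in the table translate directly into budget estimates. The sketch below is illustrative only: it uses the two listed 5-second prices, and the helper names `clips_per_budget` and `batch_cost` are hypothetical, not part of any fal SDK. Pricing for 10-second clips is not stated on this page, so it is deliberately left out.

```python
from decimal import Decimal

# Prices per 5-second clip, taken from the table above.
PRICE_PER_CLIP = {"720p": Decimal("0.20"), "1080p": Decimal("0.45")}


def clips_per_budget(budget: str, resolution: str) -> int:
    """Number of whole 5-second clips that fit within a budget (in USD)."""
    return int(Decimal(budget) // PRICE_PER_CLIP[resolution])


def batch_cost(drafts_720p: int, finals_1080p: int) -> Decimal:
    """Total cost of a mix of 720p drafts and 1080p final renders."""
    return (drafts_720p * PRICE_PER_CLIP["720p"]
            + finals_1080p * PRICE_PER_CLIP["1080p"])


if __name__ == "__main__":
    print(clips_per_budget("1.00", "720p"))   # whole 720p clips per dollar
    print(clips_per_budget("1.00", "1080p"))  # whole 1080p clips per dollar
    print(batch_cost(10, 2))                  # ten drafts plus two finals
```

Note that the table's "2.2 generations per $1.00" is a fractional average; only two whole 1080p clips fit in a single dollar.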

Resolution-First Video Generation

Pika v2.2 prioritizes output quality over generation speed, delivering what the platform describes as better image clarity and sharper visuals than previous iterations. The endpoint pairs a choice of seven aspect ratios with two resolution tiers.

What this means for you:

1080p Production Output: Generate videos at full HD resolution suitable for professional content workflows, not just social media previews
Dual Resolution Pricing: Choose 720p ($0.20) for rapid iteration or 1080p ($0.45) when final quality matters, so you pay only for the resolution you need
Seven Aspect Ratio Options: Cover vertical (9:16), horizontal (16:9), square (1:1), and portrait/landscape variations (4:5, 5:4, 3:2, 2:3) without post-production cropping
10-Second Extended Duration: Double your video length to 10 seconds when narrative pacing requires more than quick 5-second clips

Technical Specifications

Spec	Details
Architecture	Pika Text to Video v2.2
Input Formats	Text prompts with optional negative prompts
Output Formats	MP4 video (720p or 1080p)
Duration Range	5-10 seconds configurable
License	Commercial use permitted (Partner)
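
The specifications above can be turned into a request payload. The following is a minimal sketch, not the official integration: the field names `prompt`, `negative_prompt`, `resolution`, `aspect_ratio`, and `duration` are assumptions inferred from this page and should be checked against the endpoint schema. The helpers `build_arguments` and `generate` are hypothetical. The only library call used is `fal_client.subscribe`, which requires the `fal-client` package and a `FAL_KEY` environment variable.

```python
from typing import Any, Dict, Optional

MODEL_ID = "fal-ai/pika/v2.2/text-to-video"

# Allowed values as listed on this page.
ASPECT_RATIOS = {"16:9", "9:16", "1:1", "4:5", "5:4", "3:2", "2:3"}
RESOLUTIONS = {"720p", "1080p"}


def build_arguments(
    prompt: str,
    resolution: str = "720p",
    aspect_ratio: str = "16:9",
    duration: int = 5,
    negative_prompt: Optional[str] = None,
) -> Dict[str, Any]:
    """Validate inputs and build a request payload (field names are assumed)."""
    if not prompt.strip():
        raise ValueError("prompt must not be empty")
    if resolution not in RESOLUTIONS:
        raise ValueError(f"resolution must be one of {sorted(RESOLUTIONS)}")
    if aspect_ratio not in ASPECT_RATIOS:
        raise ValueError(f"aspect_ratio must be one of {sorted(ASPECT_RATIOS)}")
    # The page states a 5-10 second range; the exact accepted values
    # should be confirmed in the schema.
    if not 5 <= duration <= 10:
        raise ValueError("duration must be between 5 and 10 seconds")

    arguments: Dict[str, Any] = {
        "prompt": prompt,
        "resolution": resolution,
        "aspect_ratio": aspect_ratio,
        "duration": duration,
    }
    if negative_prompt:
        arguments["negative_prompt"] = negative_prompt
    return arguments


def generate(arguments: Dict[str, Any]) -> Any:
    """Submit the request and wait for the result (needs network and FAL_KEY)."""
    import fal_client  # imported lazily so validation works without the SDK

    return fal_client.subscribe(MODEL_ID, arguments=arguments)
```

Validating locally before submitting avoids paying for a request that the endpoint would reject, and keeps the 720p default as the cheaper tier for iteration.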

How It Stacks Up

Pika Text to Video (v2.1) ($0.20 for 720p) – Pika v2.2 ($0.20-$0.45) adds 1080p resolution capability at 2.25x the cost of the 720p tier, while maintaining identical 720p pricing. Pika v2.1 remains ideal for workflows where 720p resolution meets requirements and budget constraints matter.

Pika Effects (v1.5) ($0.039 per image-to-video) – Pika v2.2 handles pure text-to-video generation at roughly 5x (720p) to 11.5x (1080p) the cost. Effects v1.5 specializes in image-to-video transformations with stylization controls for creators starting with existing visual assets rather than text prompts.

Pika Scenes (v2.2) – Pika Text to Video v2.2 focuses on text-driven generation with resolution control, while Scenes v2.2 prioritizes image-to-video workflows. Choose text-to-video for concept-first creation, Scenes for asset-based animation.

Hunyuan Video V1.5 – Pika v2.2 trades extended duration capability for resolution options, offering dual-tier pricing at 720p/1080p. Hunyuan emphasizes different architectural priorities for text-to-video generation workflows.
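
The cost multiples quoted in the Pika comparisons above can be reproduced from the listed prices. This is a small arithmetic check using only figures stated on this page; the helper name `cost_multiple` is hypothetical.

```python
# Prices per generation as listed on this page (USD).
PIKA_V22 = {"720p": 0.20, "1080p": 0.45}
PIKA_EFFECTS_V15 = 0.039


def cost_multiple(price: float, baseline: float) -> float:
    """How many times more expensive `price` is than `baseline`, to 2 decimals."""
    return round(price / baseline, 2)


if __name__ == "__main__":
    # 1080p tier relative to the 720p tier of the same model.
    print(cost_multiple(PIKA_V22["1080p"], PIKA_V22["720p"]))
    # Each v2.2 tier relative to Pika Effects v1.5.
    for tier, price in PIKA_V22.items():
        print(tier, cost_multiple(price, PIKA_EFFECTS_V15))
```

Keeping the baseline explicit makes it easy to rerun the comparison if any of the listed prices change.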