New Release

Unleash Your Creativity with the Kling O1 Video Generator

Experience the world's first unified multi-modal video foundation model. Generate videos from text, images, and subjects with unprecedented consistency and control.

No software downloads
Unified workflow

Kling O1: The Future of Video Generation Is Here

Experience the power of a unified multi-modal video engine that streamlines your creative process.

Unified Multi-Modal Engine

Kling O1 merges text-to-video, image-to-video, subject-to-video, and more into a single model within a single semantic space. No more switching between different tools.

Consistent Characters with Universal Reference

Maintain character faces, clothing details, and props consistently across multiple shots, even with changing camera angles and shot types, using up to 5 reference images.

Conversational Video Editing

Edit your videos using natural language commands. Remove distractions, change styles, and reconstruct scenes at the pixel level without manual masking or keyframing.

Easy Video Creation with Kling O1

Transform your ideas into stunning videos in three simple steps.

1. Input Your Media

Upload images, videos, or text prompts to define your desired video content.

2. Customize and Refine

Use natural language commands to edit, style, and extend your video until it's perfect.

3. Generate and Share

Produce high-quality videos ready for sharing on social media or integrating into your projects.

Frequently Asked Questions

Learn more about the Kling O1 video generator and its powerful capabilities.

Kling O1, or Omni One, is Kuaishou's unified multi-modal video foundation model. It combines text-to-video, image-to-video, video editing, and shot extension into a single model within a shared semantic space.

Kling O1 uses a subject-based reference system with up to 5 reference images to maintain consistent character faces, clothing, and props across multiple shots, even with changing camera angles.

Kling O1 supports conversational post-production: users can perform pixel-level semantic reconstruction with natural language instructions, without manual masking or keyframing.

Kling O1 generates clips of 3 to 10 seconds each and supports up to 2 minutes of continuous video with synchronized audio.

Ready to create stunning videos with the power of AI?

Experience the future of video generation with Kling O1.