
# Minimalist AI Video Translation and Dubbing Tool

**[English](/README.md)|[简体中文](/docs/zh/README.md)|[日本語](/docs/jp/README.md)|[한국어](/docs/kr/README.md)|[Tiếng Việt](/docs/vi/README.md)|[Français](/docs/fr/README.md)|[Deutsch](/docs/de/README.md)|[Español](/docs/es/README.md)|[Português](/docs/pt/README.md)|[Русский](/docs/rus/README.md)|[اللغة العربية](/docs/ar/README.md)**
[X (Twitter)](https://x.com/KrillinAI)
[QQ](https://jq.qq.com/?_wv=1027&k=754069680)
[Bilibili](https://space.bilibili.com/242124650)
### 📢 New Release: Desktop Version for Windows & macOS. Testing and Feedback Welcome! [Documentation is slightly outdated and is being updated]
## Project Introduction
Krillin AI is a versatile audio and video localization and enhancement solution. This minimalist yet powerful tool integrates video translation, dubbing, and voice cloning, and supports both landscape and portrait formats so that content displays well on all major platforms (Bilibili, Xiaohongshu, Douyin, WeChat Video, Kuaishou, YouTube, TikTok, etc.). With its end-to-end workflow, Krillin AI can turn raw material into polished, ready-to-publish cross-platform content in just a few clicks.
## Key Features and Functions
- 🎯 **One-Click Start**: No complex environment setup required. Dependencies are installed automatically, so the tool is ready to use immediately, and a new desktop version makes it even easier!
- 📥 **Video Acquisition**: Supports yt-dlp downloads or local file uploads
- 📜 **Accurate Recognition**: High-accuracy speech recognition based on Whisper
- 🧠 **Intelligent Segmentation**: Subtitle segmentation and alignment using an LLM
- 🔄 **Terminology Replacement**: One-click replacement of specialized vocabulary
- 🌍 **Professional Translation**: LLM-based paragraph-level translation that maintains semantic coherence
- 🎙️ **Voice Cloning**: Choose from selected CosyVoice voices or clone a custom voice
- 🎬 **Video Composition**: Automatically handles video and subtitle layout for both landscape and portrait formats
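
To make the terminology replacement idea concrete, here is a minimal Python sketch of glossary-based substitution on subtitle text. It is not Krillin AI's actual implementation: the `replace_terms` function, the glossary format, and the sample entries are all hypothetical and serve only to illustrate the concept.

```python
import re


def replace_terms(text: str, glossary: dict[str, str]) -> str:
    """Replace every glossary key found in `text` with its preferred form.

    Hypothetical helper for illustration; matching is case-insensitive.
    """
    if not glossary:
        return text
    # Try longer keys first so a multi-word term wins over a shorter one.
    keys = sorted(glossary, key=len, reverse=True)
    pattern = re.compile("|".join(re.escape(k) for k in keys), re.IGNORECASE)
    lookup = {k.lower(): v for k, v in glossary.items()}
    return pattern.sub(lambda m: lookup[m.group(0).lower()], text)


# Assumed example data: misrecognized terms mapped to the desired spelling.
glossary = {"krilin": "Krillin", "cozy voice": "CosyVoice"}
line = "Krilin AI uses Cozy Voice for dubbing."
print(replace_terms(line, glossary))  # Krillin AI uses CosyVoice for dubbing.
```

This sketch matches substrings without word boundaries, so a real glossary pass would likely need stricter matching rules for short terms.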
## Effect Demonstration
The image below shows the subtitle file generated from a 46-minute local video with a single click and no manual adjustments. There are no omissions or overlaps, the segmentation is natural, and the translation quality is high.
