Global Leading OCR Toolkit & Document AI Engine
English | [简体中文](./readme/README_cn.md) | [繁體中文](./readme/README_tcn.md) | [日本語](./readme/README_ja.md) | [한국어](./readme/README_ko.md) | [Français](./readme/README_fr.md) | [Русский](./readme/README_ru.md) | [Español](./readme/README_es.md) | [العربية](./readme/README_ar.md)
[](https://pepy.tech/projects/paddleocr)
[](https://github.com/PaddlePaddle/PaddleOCR/network/dependents)



[](https://www.paddleocr.com)
[](https://deepwiki.com/PaddlePaddle/PaddleOCR)
[](../LICENSE)
**PaddleOCR converts PDF documents and images into structured, LLM-ready data (JSON/Markdown) with industry-leading accuracy. With 70k+ Stars and trusted by top-tier projects like Dify, RAGFlow, and Cherry Studio, PaddleOCR is the bedrock for building intelligent RAG and Agentic applications.**
## 🚀 Key Features
### 📄 Intelligent Document Parsing (LLM-Ready)
> *Transforming messy visuals into structured data for the LLM era.*
* **SOTA Document VLM**: Featuring **PaddleOCR-VL-1.6 (0.9B)**, the industry's leading lightweight vision-language model for document parsing. It achieves 96.3% accuracy on OmniDocBench v1.6, leads in text, formula, and table recognition, and shows significantly enhanced capabilities in ancient documents, rare characters, seals, and charts, with structured outputs in **Markdown** and **JSON** formats.
* **Structure-Aware Conversion**: Powered by **PP-StructureV3**, seamlessly convert complex PDFs and images into **Markdown** or **JSON**. Unlike the PaddleOCR-VL series models, it provides more fine-grained coordinate information, including table cell coordinates, text coordinates, and more.
* **Production-Ready Efficiency**: Achieve commercial-grade accuracy with an ultra-small footprint. Outperforms numerous closed-source solutions in public benchmarks while remaining resource-efficient for edge/cloud deployment.
### 🔍 Universal Text Recognition (Scene OCR)
> *The global gold standard for high-speed, multilingual text spotting.*
* **100+ Languages Supported**: Native recognition for a vast global library. **PP-OCRv6** supports 50 languages with a single unified model (Chinese, English, Japanese, and 46 Latin-script languages) — no model switching needed for multilingual documents.
* **Complex Element Mastery**: Beyond standard text recognition, we support **natural scene text spotting** across a wide range of environments, including IDs, street views, books, and industrial components
* **Performance Leap**: PP-OCRv6 achieves **+4.6% detection** and **+5.1% recognition** accuracy over PP-OCRv5, surpassing mainstream Vision-Language Models. 5.2× CPU inference speedup end-to-end.
| Project Name | Description |
| ------------ | ----------- |
| [Dify](https://github.com/langgenius/dify)

|Production-ready platform for agentic workflow development.|
| [RAGFlow](https://github.com/infiniflow/ragflow)

|RAG engine based on deep document understanding.|
| [pathway](https://github.com/pathwaycom/pathway)

|Python ETL framework for stream processing, real-time analytics, LLM pipelines, and RAG.|
| [MinerU](https://github.com/opendatalab/MinerU)

|Multi-type Document to Markdown Conversion Tool|
| [Umi-OCR](https://github.com/hiroi-sora/Umi-OCR)

|Free, Open-source, Batch Offline OCR Software.|
| [cherry-studio](https://github.com/CherryHQ/cherry-studio)

|A desktop client that supports for multiple LLM providers.|
| [haystack](https://github.com/deepset-ai/haystack)

|AI orchestration framework to build customizable, production-ready LLM applications.|
| [OmniParser](https://github.com/microsoft/OmniParser)

|OmniParser: Screen Parsing tool for Pure Vision Based GUI Agent.|
| [QAnything](https://github.com/netease-youdao/QAnything)

|Question and Answer based on Anything.|
| [Learn more projects](./awesome_projects.md) | [More projects based on PaddleOCR](./awesome_projects.md)|
## 👩👩👧👦 Contributors