## Introduction

English | [简体中文](./README.zh-CN.md)

[![](https://trendshift.io/api/badge/repositories/13584)](https://trendshift.io/repositories/13584)

TARS\* is a Multimodal AI Agent stack, currently shipping two projects: [Agent TARS](#agent-tars) and [UI-TARS-desktop](#ui-tars-desktop):

| Agent TARS | UI-TARS-desktop |
| :--- | :--- |
| Agent TARS is a general multimodal AI Agent stack that brings the power of GUI Agent and Vision into your terminal, computer, browser, and product. It primarily ships with a CLI and Web UI. It aims to provide a workflow that is closer to human-like task completion through cutting-edge multimodal LLMs and seamless integration with various real-world MCP tools. | UI-TARS Desktop is a desktop application that provides a native GUI Agent based on the UI-TARS model. It primarily ships local and remote computer operators as well as browser operators. |

## Table of Contents

- [News](#news)
- [Agent TARS](#agent-tars)
  - [Showcase](#showcase)
  - [Core Features](#core-features)
  - [Quick Start](#quick-start)
  - [Documentation](#documentation)
- [UI-TARS Desktop](#ui-tars-desktop)
  - [Showcase](#showcase-1)
  - [Features](#features)
  - [Quick Start](#quick-start-1)
- [Contributing](#contributing)
- [License](#license)
- [Citation](#citation)

## News

- **\[2025-11-05\]** 🎉 We're excited to announce the release of [Agent TARS CLI v0.3.0](https://github.com/bytedance/UI-TARS-desktop/releases/tag/v0.3.0)! This version brings streaming support for multiple tools (shell commands, multi-file structured display), runtime settings with timing statistics for tool calls and deep thinking, and an Event Stream Viewer for data flow tracking and debugging. Additionally, it features exclusive support for [AIO agent Sandbox](https://github.com/agent-infra/sandbox) as an isolated all-in-one tool execution environment.
- **\[2025-06-25\]** We released Agent TARS Beta and Agent TARS CLI: [Introducing Agent TARS Beta](https://agent-tars.com/blog/2025-06-25-introducing-agent-tars-beta.html), a multimodal AI agent that aims to explore a way of working that is closer to human-like task completion through rich multimodal capabilities (such as GUI Agent, Vision) and seamless integration with various real-world tools.
- **\[2025-06-12\]** - 🎁 We are thrilled to announce the release of UI-TARS Desktop v0.2.0! This update introduces two powerful new features: **Remote Computer Operator** and **Remote Browser Operator**, both completely free. No configuration required: simply click to remotely control any computer or browser, and experience a new level of convenience and intelligence.
- **\[2025-04-17\]** - 🎉 We're thrilled to announce the release of the new UI-TARS Desktop application v0.1.0, featuring a redesigned Agent UI. The application enhances the computer use experience, introduces new browser operation features, and supports [the advanced UI-TARS-1.5 model](https://seed-tars.com/1.5) for improved performance and precise control.
- **\[2025-02-20\]** - 📦 Introduced [UI TARS SDK](./docs/sdk.md), a powerful cross-platform toolkit for building GUI automation agents.
- **\[2025-01-23\]** - 🚀 We updated the **[Cloud Deployment](./docs/deployment.md#cloud-deployment)** section in the Chinese version, [GUI Model Deployment Tutorial](https://bytedance.sg.larkoffice.com/docx/TCcudYwyIox5vyxiSDLlgIsTgWf#U94rdCxzBoJMLex38NPlHL21gNb), with new information related to the ModelScope platform. You can now use the ModelScope platform for deployment.

## Agent TARS


Agent TARS is a general multimodal AI Agent stack that brings the power of GUI Agent and Vision into your terminal, computer, browser, and product.

It primarily ships with a CLI and Web UI. It aims to provide a workflow that is closer to human-like task completion through cutting-edge multimodal LLMs and seamless integration with various real-world MCP tools.

### Showcase

```
Please help me book the earliest flight from San Jose to New York on September 1st and the last return flight on September 6th on Priceline
```

https://github.com/user-attachments/assets/772b0eef-aef7-4ab9-8cb0-9611820539d8

| Booking Hotel | Generate Chart with extra MCP Servers |
| :---: | :---: |
| Instruction: I am in Los Angeles from September 1st to September 6th, with a budget of $5,000. Please help me book a Ritz-Carlton hotel closest to the airport on booking.com and compile a transportation guide for me | Instruction: Draw me a chart of Hangzhou's weather for one month |

For more use cases, please check out [#842](https://github.com/bytedance/UI-TARS-desktop/issues/842).

### Core Features

- 🖱️ **One-Click Out-of-the-box CLI** - Supports both **headful** [Web UI](https://agent-tars.com/guide/basic/web-ui.html) and **headless** [server](https://agent-tars.com/guide/advanced/server.html) [execution](https://agent-tars.com/guide/basic/cli.html).
- 🌐 **Hybrid Browser Agent** - Control browsers using [GUI Agent](https://agent-tars.com/guide/basic/browser.html#visual-grounding), [DOM](https://agent-tars.com/guide/basic/browser.html#dom), or a hybrid strategy.
- 🔄 **Event Stream** - Protocol-driven Event Stream drives [Context Engineering](https://agent-tars.com/beta#context-engineering) and [Agent UI](https://agent-tars.com/blog/2025-06-25-introducing-agent-tars-beta.html#easy-to-build-applications).
- 🧰 **MCP Integration** - The kernel is built on MCP and also supports mounting [MCP Servers](https://agent-tars.com/guide/basic/mcp.html) to connect to real-world tools.

### Quick Start

Agent TARS CLI

```bash
# Launch with `npx`.
npx @agent-tars/cli@latest

# Install globally (requires Node.js >= 22).
npm install @agent-tars/cli@latest -g

# Run with your preferred model provider.
agent-tars --provider volcengine --model doubao-1-5-thinking-vision-pro-250428 --apiKey your-api-key
agent-tars --provider anthropic --model claude-3-7-sonnet-latest --apiKey your-api-key
```

Visit the comprehensive [Quick Start](https://agent-tars.com/guide/get-started/quick-start.html) guide for detailed setup instructions.

### Documentation

> 🌟 **Explore Agent TARS Universe** 🌟

| Category | Resource Link | Description |
| :---: | :---: | :---: |
| 🏠 Central Hub | Website | Your gateway to the Agent TARS ecosystem |
| 📚 Quick Start | Quick Start | Zero to hero in 5 minutes |
| 🚀 What's New | Blog | Discover cutting-edge features & vision |
| 🛠️ Developer Zone | Docs | Master every command and feature |
| 🎯 Showcase | Examples | View use cases built by the official team and the community |
| 🔧 Reference | API | Complete technical reference |

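The Quick Start commands for Agent TARS CLI can be wrapped in a small preflight script. The sketch below checks the Node.js >= 22 requirement and reads the API key from an environment variable instead of typing it inline. The function names `node_major_ok` and `run_agent_tars` and the variable `AGENT_TARS_API_KEY` are hypothetical names chosen for this example; only the `--provider`, `--model`, and `--apiKey` flags come from the Quick Start itself.

```bash
#!/usr/bin/env bash
# Sketch only: the helper names and AGENT_TARS_API_KEY are assumptions for
# this example, not part of the Agent TARS CLI.

# Succeeds when a Node.js version string such as "v22.3.0" has major >= 22.
node_major_ok() {
  local version major
  version="${1#v}"          # drop a leading "v" if present
  major="${version%%.*}"    # keep the text before the first dot
  case "$major" in
    ''|*[!0-9]*) return 1 ;;  # empty or non-numeric major version
  esac
  [ "$major" -ge 22 ]
}

# Runs agent-tars with the given provider and model, taking the API key
# from the (hypothetical) AGENT_TARS_API_KEY environment variable.
run_agent_tars() {
  local provider model
  provider="$1"
  model="$2"
  if ! node_major_ok "$(node --version 2>/dev/null)"; then
    echo "Node.js >= 22 is required" >&2
    return 1
  fi
  if [ -z "${AGENT_TARS_API_KEY:-}" ]; then
    echo "AGENT_TARS_API_KEY is not set" >&2
    return 1
  fi
  agent-tars --provider "$provider" --model "$model" --apiKey "$AGENT_TARS_API_KEY"
}
```

After sourcing the script and exporting `AGENT_TARS_API_KEY`, a call such as `run_agent_tars anthropic claude-3-7-sonnet-latest` mirrors the second Quick Start command. Note that the key still reaches `agent-tars` as a command-line argument, so this only keeps it out of what you type in the shell.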


## UI-TARS Desktop


UI-TARS Desktop is a native GUI agent for your local computer, driven by [UI-TARS](https://github.com/bytedance/UI-TARS) and Seed-1.5-VL/1.6 series models.

📑 Paper | 🤗 Hugging Face Models | 🫨 Discord | 🤖 ModelScope

🖥️ Desktop Application | 👓 Midscene (use in browser)

### Showcase

| Instruction | Local Operator | Remote Operator |
| :---: | :---: | :---: |
| Please help me open the autosave feature of VS Code and delay AutoSave operations for 500 milliseconds in the VS Code setting. |