---
title: "EVA-Bench Data 2.0: 3 Domains, 121 Tools, 213 Scenarios"
source_url: "https://huggingface.co/blog/ServiceNow-AI/eva-bench-data"
author: ServiceNow AI
publish_date: 2026-06-04
ingested: 2026-06-07
sha256: pending
tags: [servicenow, eva-bench, voice-agent, benchmark, agent-evaluation]
source: huggingface
review_value: 7
review_confidence: 6
review_stars: 4
---

# EVA-Bench Data 2.0: 3 Domains, 121 Tools, 213 Scenarios

> Archived original: [[raw/articles/eva-bench-data-2-voice-agent-evaluation|Archived original]] ^[raw/articles/eva-bench-data-2-voice-agent-evaluation.md]

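## Key Content

EVA-Bench Data 2.0 is a voice agent evaluation benchmark released by ServiceNow AI:

- **3 domains**: vertical domains such as HR, airline ticket rebooking, and customer support
- **121 tools**: tool calls covering real business scenarios
- **213 scenarios**: evaluation of complex multi-step conversations
- **Goal**: close the evaluation gap for voice agents in vertical domains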

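## Scoring Rationale

- v=7: Official ServiceNow release; a practical benchmark for evaluating voice agents in vertical domains
- c=6: The source is credible (ServiceNow AI), but the content was truncated, so the methodology cannot be fully assessed
- stars=4: Unique technical insight (vertical-domain voice agent evaluation at a scale of 121 tools and 213 scenarios)
- v×c=42 < 49, but stars≥4 → ingested (unique technical insight rule)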
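
The ingestion decision above can be written as a small check. This is a minimal sketch, not the knowledge base's actual implementation: the function name `should_ingest` and its parameter names are hypothetical, and it assumes that a product of v×c ≥ 49 would also qualify on its own, which the note implies but does not state outright.

```python
def should_ingest(value: int, confidence: int, stars: int,
                  score_threshold: int = 49, star_threshold: int = 4) -> bool:
    """Decide whether a note is ingested (hypothetical helper).

    Assumed rule, inferred from the note:
      - ingest if value * confidence reaches the score threshold, or
      - ingest if stars reaches the star threshold
        (the "unique technical insight" override).
    """
    score = value * confidence
    return score >= score_threshold or stars >= star_threshold


# Values from this note: v=7, c=6, stars=4, so 7 * 6 = 42 < 49,
# but the star override applies.
print(should_ingest(value=7, confidence=6, stars=4))  # → True
```

With the same v and c but only 3 stars, the sketch returns `False`, which matches the note's reasoning that the star rating is what carries this entry over the bar.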

## Release Status

- Official link: https://huggingface.co/blog/ServiceNow-AI/eva-bench-data
- Deployment: Hugging Face Datasets