
APIEval-20 is a black-box, open-source benchmark designed for API testing agents. From a JSON schema and a single sample payload, agents generate test suites that are objectively scored on vulnerability detection, coverage, and efficiency.

APIEval-20 is the first benchmark dedicated to evaluating AI agents' ability to generate API test suites. Given only a JSON schema and an example payload, with no access to source code or documentation, it measures an agent's ability to uncover real defects. It spans 20 realistic scenarios covering e-commerce, payments, authentication, and more, quantifying the practical engineering value of black-box testing.
APIEval-20 is aimed at AI researchers building testing agents, engineering teams evaluating automation tools, and QA leads seeking objective metrics for comparing agent performance against human testing standards.
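To make the black-box setting concrete, here is a minimal sketch of what a benchmark-style task input and a naive test-generation step could look like. The field names (`scenario`, `schema`, `example_payload`) and the mutation strategy are illustrative assumptions, not APIEval-20's actual format:

```python
import json

# Hypothetical APIEval-20-style task input: the agent sees only a JSON
# schema and one example payload -- no source code, no documentation.
# (Field names here are illustrative, not the benchmark's actual format.)
task = {
    "scenario": "e-commerce/checkout",
    "schema": {
        "type": "object",
        "required": ["item_id", "quantity"],
        "properties": {
            "item_id": {"type": "string"},
            "quantity": {"type": "integer", "minimum": 1},
        },
    },
    "example_payload": {"item_id": "sku-123", "quantity": 2},
}

def generate_boundary_cases(task):
    """Sketch of black-box test generation: derive edge-case payloads
    by mutating the example payload against the schema's constraints."""
    base = dict(task["example_payload"])
    cases = []
    for field, spec in task["schema"]["properties"].items():
        if spec.get("type") == "integer" and "minimum" in spec:
            below_min = dict(base)
            below_min[field] = spec["minimum"] - 1  # should be rejected
            cases.append(below_min)
        missing = dict(base)
        missing.pop(field, None)  # probe required-field handling
        cases.append(missing)
    return cases

cases = generate_boundary_cases(task)
print(json.dumps(cases, indent=2))
```

A real agent would go far beyond this (type confusion, injection payloads, stateful sequences), but the sketch shows the core constraint: every test must be derived from the schema and the single example alone.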