# EvalScope Benchmark MCP

A paid remote MCP server for AI SDK benchmark dashboards. It returns verdicts, receipts, usage logs, and audit-ready JSON for agent and CI workflows.

Website: https://evalscopebench.clauxel.com/
Documentation: https://github.com/clauxel/evalscope-benchmark-mcp-mcp
Pricing: https://evalscopebench.clauxel.com/pricing/
Checkout: https://evalscopebench.clauxel.com/checkout/
Primary topic: AI SDK benchmark dashboard
Secondary topics: EvalScope MCP server, LLM benchmark MCP, model evaluation report, hosted EvalScope runs, hosted EvalScope benchmark MCP server
Support: support@aigeamy.com
Region: Global, with English customer-facing copy.

Core workflow:
- Submit AI SDK benchmark dashboard context that is safe to share publicly, along with owner and policy details.
- Run the remote MCP gate, which evaluates the submitted workflow against product-specific rules.
- Return structured JSON suitable for agents, CI, IDEs, and reviewers.
- Archive the receipt, report, or review history for audit and follow-up.
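The archive step above can be sketched as an append-only JSON Lines log kept on the caller's side. This is a minimal illustration, not part of the EvalScope service: the helper names `archive_result` and `read_archive`, the `archived_at` field, and the shape of the `result` dictionary are all assumptions made for the example.

```python
import json
import time
from pathlib import Path
from typing import Dict, List


def archive_result(result: Dict, archive_path: Path) -> Dict:
    """Append one gate result to a JSON Lines file and return the stored record.

    `result` stands in for whatever JSON the gate returned; its shape is
    not defined here and is treated as opaque.
    """
    record = {
        # UTC timestamp so records from different CI runners sort consistently.
        "archived_at": time.strftime("%Y-%m-%dT%H:%M:%SZ", time.gmtime()),
        "result": result,
    }
    # Append mode keeps earlier records intact, which suits an audit trail.
    with archive_path.open("a", encoding="utf-8") as handle:
        handle.write(json.dumps(record, sort_keys=True) + "\n")
    return record


def read_archive(archive_path: Path) -> List[Dict]:
    """Load every archived record, oldest first."""
    with archive_path.open(encoding="utf-8") as handle:
        return [json.loads(line) for line in handle if line.strip()]
```

One record per line keeps the file easy to grep and to diff in review, and appending never rewrites earlier entries.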

Outputs:
- Structured verdict JSON
- Risk reasons and next actions
- Receipt and usage log
- Audit dashboard export
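To make the outputs above concrete, here is a minimal sketch of how a CI job might parse a verdict and turn it into an exit code. The field names (`verdict`, `risk_reasons`, `next_actions`, `receipt_id`) and the `"pass"` value are assumptions chosen for illustration; the actual response schema is not specified here and should be taken from the server card or documentation.

```python
import json
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class GateVerdict:
    """Hypothetical shape for a structured verdict; field names are assumed."""
    verdict: str
    risk_reasons: List[str] = field(default_factory=list)
    next_actions: List[str] = field(default_factory=list)
    receipt_id: Optional[str] = None


def parse_verdict(raw: str) -> GateVerdict:
    """Parse a JSON string into a GateVerdict, tolerating missing optional keys."""
    data = json.loads(raw)
    return GateVerdict(
        verdict=data["verdict"],
        risk_reasons=list(data.get("risk_reasons", [])),
        next_actions=list(data.get("next_actions", [])),
        receipt_id=data.get("receipt_id"),
    )


def ci_exit_code(verdict: GateVerdict) -> int:
    """Return 0 only for an assumed "pass" verdict so anything else fails the job."""
    return 0 if verdict.verdict == "pass" else 1
```

Treating every non-pass value as a failure is a deliberately conservative choice: an unexpected or new verdict value blocks the pipeline instead of slipping through.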

Remote MCP endpoint: https://evalscopebench.clauxel.com/mcp
Server card: https://evalscopebench.clauxel.com/.well-known/mcp/server-card.json
Agent checkout API: https://evalscopebench.clauxel.com/api/agent-checkout
Tools: run_benchmark_gate, compare_model_scores, read_benchmark_report, issue_benchmark_receipt
Authentication: a bearer token from a paid plan is required for production calls.
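As a rough sketch of what a call to one of these tools could look like, the snippet below builds a JSON-RPC 2.0 `tools/call` request (the method MCP uses for tool invocation) with a bearer token attached. It only constructs the request and does not send it. The `arguments` payload is hypothetical because the tool input schemas are not listed here, and a real client would normally complete the MCP `initialize` handshake first, which this example skips.

```python
import json
import urllib.request
from typing import Any, Dict

# Endpoint taken from the listing above.
MCP_ENDPOINT = "https://evalscopebench.clauxel.com/mcp"


def build_tool_call(
    tool: str,
    arguments: Dict[str, Any],
    token: str,
    request_id: int = 1,
) -> urllib.request.Request:
    """Build (but do not send) a JSON-RPC `tools/call` request for the endpoint."""
    body = {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        # `arguments` is passed through as-is; its keys are an assumption.
        "params": {"name": tool, "arguments": arguments},
    }
    return urllib.request.Request(
        MCP_ENDPOINT,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
            # HTTP-based MCP servers may answer with JSON or an event stream.
            "Accept": "application/json, text/event-stream",
        },
        method="POST",
    )
```

A caller would pass the result to `urllib.request.urlopen`, for example `build_tool_call("run_benchmark_gate", {"context": "..."}, token)`, where the `context` key is a placeholder and not a documented parameter.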