eval-sys

@eval-sys
2 tools, Stars: 551, Forks: 43
mcpmark
eval-sys
MCPMark is a comprehensive, stress-testing MCP benchmark designed to evaluate model and agent capabilities in real-world MCP use.
Python, Database & Storage, +12
432
2026-06-12
mcpmark
EVAL SYS
MCP Servers are shaping the future of software. MCPMark is a comprehensive, stress-testing benchmark and a collection of diverse, verifiable tasks designed to evaluate model capabilities in real-world MCP use.
Python, Testing & Quality Assurance, +7
119
2025-09-06