eval-sys

@eval-sys
2 tools · Stars: 508 · Forks: 37
mcpmark
eval-sys
MCPMark is a comprehensive, stress-testing MCP benchmark designed to evaluate model and agent capabilities in real-world MCP use.
Python · Database & Storage +12
Stars: 389
2026-01-27
mcpmark
EVAL SYS
MCP Servers are shaping the future of software. MCPMark is a comprehensive, stress-testing benchmark and a collection of diverse, verifiable tasks designed to evaluate model capabilities in real-world MCP use.
Python · Testing & Quality Assurance +7
Stars: 119
2025-09-06