eval-sys

@eval-sys
2 tools · Stars: 261 · Forks: 10
mcpmark
eval-sys
MCP Servers are shaping the future of software. MCPMark is a comprehensive, stress-testing benchmark designed to evaluate model and agent capabilities in real-world MCP use.
Python · Databases & Storage · +12
Stars: 142
2025-09-10
mcpmark
EVAL SYS
MCP Servers are shaping the future of software. MCPMark is a comprehensive, stress-testing benchmark and a collection of diverse, verifiable tasks designed to evaluate model capabilities in real-world MCP use.
Python · Testing & Quality Assurance · +7
Stars: 119
2025-09-06