browser-x-mcp
AI-Powered Browser Automation with Advanced Form Testing - A Model Context Protocol (MCP) server that enables intelligent browser automation with form testing, element extraction, and comprehensive logging
AI-Powered Browser Automation with Advanced Form Testing
Browser[X]MCP is a Model Context Protocol (MCP) server that enables AI-driven browser automation with advanced form testing capabilities, intelligent element extraction, and comprehensive interaction logging.
Connect your AI apps to browser automation - Works seamlessly with Cursor, Claude Desktop, VS Code, and other MCP-compatible applications.
✨ Features
🤖 AI-Driven Testing
- Smart Form Filling: AI automatically fills forms with realistic test data
- Batch Actions: Efficient bulk operations for multiple elements (up to 5 actions per batch)
- Context Awareness: AI understands page state and avoids redundant actions
- Loop Detection: Prevents infinite testing cycles
⚡ Batch Operations System
- Multi-Element Processing: Execute up to 5 actions simultaneously
- Intelligent Grouping: AI automatically groups similar elements for batch processing
- Performance Optimization: Reduce API calls and execution time by 3-5x
- Error Isolation: Individual action failures don't stop the entire batch
- Smart Prioritization: Batch similar input types (text fields, checkboxes, etc.)
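The batching behaviour described above (at most 5 actions per batch, with one failure not stopping the rest) can be sketched roughly as follows. This is a minimal illustration, not the server's actual code: `chunkActions`, `runBatch`, and the `execute` callback are hypothetical names introduced here.

```javascript
// Hypothetical helper: split a list of actions into batches of at most `size`.
function chunkActions(actions, size = 5) {
  if (!Number.isInteger(size) || size < 1) {
    throw new RangeError('size must be a positive integer');
  }
  const batches = [];
  for (let i = 0; i < actions.length; i += size) {
    batches.push(actions.slice(i, i + size));
  }
  return batches;
}

// Hypothetical runner: `execute` is any function that performs one action.
// Promise.allSettled waits for every action, so one rejection does not
// abort the others in the same batch.
async function runBatch(batch, execute) {
  const settled = await Promise.allSettled(
    batch.map(async (action) => execute(action))
  );
  return settled.map((outcome, index) => ({
    action: batch[index],
    ok: outcome.status === 'fulfilled',
    result: outcome.status === 'fulfilled' ? outcome.value : undefined,
    error: outcome.status === 'rejected' ? String(outcome.reason) : undefined,
  }));
}
```

Each entry in the returned array reports its own `ok` flag, so a caller can retry only the failed actions.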
🎯 Advanced Element Extraction
- XML Canvas Format: Compact, efficient page representation (up to 800x smaller than a screenshot, depending on the page)
- ID-Based Targeting: Reliable element identification
- Coordinate Mapping: Precise click positioning
- Real-time Updates: Dynamic page state tracking
💰 Token Economics & Cost Efficiency
- Massive Token Savings: Up to 800x data compression vs screenshots, depending on the page
- AI Cost Reduction: ~90% lower AI API costs compared to vision models
- Text vs Vision Models: Use cheaper text models instead of expensive vision APIs
- Scalable Operations: Process thousands of pages at fraction of screenshot costs
- Performance Boost: 10x faster processing with compact data format
📊 Comprehensive Logging
- Action History: Detailed logs of all AI decisions and actions
- Form Data Capture: Real-time extraction of filled form data
- Performance Metrics: Success rates, timing, and efficiency stats
- Test Reports: JSON and console output formats
🛡️ Robust Automation
- Field Clearing: Advanced input field cleaning before entry
- File Upload Handling: Programmatic file upload without OS dialogs
- Error Recovery: Graceful handling of failed operations
- Stealth Mode: Reduced bot detection signatures
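Since the project builds on Playwright, field clearing and dialog-free file upload could look roughly like the sketch below. It assumes `page` is a Playwright `Page`; `clearAndType` and `uploadFile` are hypothetical helper names, and the project's own implementation may differ.

```javascript
// Sketch only: `page` is assumed to be a Playwright Page object.
async function clearAndType(page, selector, text) {
  const field = page.locator(selector);
  // fill() replaces the current value, so the explicit empty fill is a
  // defensive extra step for fields that misbehave.
  await field.fill('');
  await field.fill(text);
}

async function uploadFile(page, selector, filePath) {
  // setInputFiles sets the file on an <input type="file"> directly,
  // so no operating-system file dialog is opened.
  await page.locator(selector).setInputFiles(filePath);
}
```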
🚀 Quick Start
Installation
# Clone the repository
git clone https://github.com/rnd-pro/browser-x-mcp.git
cd browser-x-mcp
# Install dependencies
npm install
# Configure environment
cp .env.example .env
# Edit .env with your API keys
# Start the MCP server
npm start
Basic Usage
# Run AI-powered form testing
npm test
# Run with mock AI (faster testing)
npm run test:mock
# Generate test reports
npm run test:report
🏗️ Architecture
┌─────────────────┐    ┌──────────────────┐    ┌─────────────────┐
│   AI Test       │───▶│   MCP Server     │───▶│    Browser      │
│   Agent         │    │   (BrowserX)     │    │  (Playwright)   │
└─────────────────┘    └──────────────────┘    └─────────────────┘
         │                       │                      │
         ▼                       ▼                      ▼
┌─────────────────┐    ┌──────────────────┐    ┌─────────────────┐
│  Test Reports   │    │   Action Logs    │    │  Screenshots    │
│  & Metrics      │    │   & Form Data    │    │  & Canvas       │
└─────────────────┘    └──────────────────┘    └─────────────────┘
📁 Project Structure
browser-x-mcp/
├── src/
│ ├── server/ # MCP Server implementation
│ │ ├── index.js # Main server with browser automation
│ │ ├── atomic-navigation.js # Navigation utilities
│ │ └── daemon.js # Server daemon
│ └── extractor/ # Page analysis tools
│ └── VirtualCanvasExtractor.js # XML canvas extraction
├── test/
│ ├── ai-mcp-interaction-test.js # AI-powered testing
│ ├── real-websites-test.js # Real website validation
│ └── input-types-test-page.html # Test page
├── tools/ # Development utilities
│ └── screenshot-analyzer/ # Screenshot analysis tools (planned)
├── examples/ # Usage examples
├── docs/ # Documentation
└── config/ # Configuration files
💰 Cost Efficiency Analysis
Token Usage Comparison
| Approach | Data Size | Tokens | Cost/Request |
|---|---|---|---|
| Screenshots | 200KB | ~400,000 | $0.0048 |
| XML Canvas | 0.25KB | ~500 | $0.0001 |
| Savings | 800x smaller | 800x fewer | 48x cheaper |
Real-World Performance
- Google Search: 276KB screenshot → 3KB canvas = 92x compression
- GitHub Pages: 166KB screenshot → 121KB canvas = 1.4x compression
- Average Savings: ~90% cost reduction on AI API calls
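The ratios quoted above follow directly from the listed sizes and prices. The short script below recomputes them; the input numbers are copied from this section, and the variable and function names are illustrative only.

```javascript
// Ratio of "before" to "after" for any pair of measurements.
function ratio(before, after) {
  return before / after;
}

// Figures from the Token Usage Comparison table.
const tableRow = {
  sizeRatio: ratio(200, 0.25),      // KB, screenshot vs XML canvas
  tokenRatio: ratio(400000, 500),   // tokens
  costRatio: ratio(0.0048, 0.0001), // USD per request
};

// Figures from the Real-World Performance list.
const realWorld = {
  googleSearch: ratio(276, 3),      // KB
  githubPages: ratio(166, 121),     // KB
};

console.log(Math.round(tableRow.sizeRatio));   // 800
console.log(Math.round(tableRow.tokenRatio));  // 800
console.log(Math.round(tableRow.costRatio));   // 48
console.log(Math.round(realWorld.googleSearch)); // 92
console.log(realWorld.githubPages.toFixed(1));   // 1.4
```

The spread between 1.4x and 800x shows that the achievable compression depends heavily on the page: a content-heavy page such as GitHub yields a much larger canvas than a sparse one.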
🎮 Usage Examples
AI-Powered Form Testing
import { MCPAIInteractionAgent } from './test/ai-mcp-interaction-test.js';

const agent = new MCPAIInteractionAgent({
  maxIterations: 20,
  useMockAI: false,
  stopOnFailure: true
});

await agent.init();
await agent.runInteractionTest();
const report = await agent.generateReport();
Batch Operations Example
// Execute multiple actions in one batch
const batchResponse = await fetch('http://localhost:3001', {
  method: 'POST',
  headers: { 'Content-Type': 'application/json' },
  body: JSON.stringify({
    jsonrpc: '2.0',
    method: 'batch_actions',
    params: {
      actions: [
        { action: 'input_text', element_id: 'email', text: 'user@example.com' },
        { action: 'input_text', element_id: 'password', text: 'SecurePass123' },
        { action: 'click_element_by_id', element_id: 'submit-btn' }
      ]
    },
    id: 1
  })
});
Custom MCP Operations
// Connect to MCP server
const response = await fetch('http://localhost:3001', {
  method: 'POST',
  headers: { 'Content-Type': 'application/json' },
  body: JSON.stringify({
    jsonrpc: '2.0',
    method: 'extract_xml_canvas',
    params: {},
    id: 1
  })
});
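Both requests above share the same JSON-RPC 2.0 envelope, so a small wrapper can remove the repetition. The sketch below assumes the server answers with a standard JSON-RPC 2.0 response (an object carrying either `result` or `error`); that response shape is an assumption, and `buildRpcRequest`, `unwrapRpcResponse`, and `callMcp` are hypothetical names, not part of the project.

```javascript
// Builds a JSON-RPC 2.0 request body like the ones shown above.
function buildRpcRequest(method, params = {}, id = 1) {
  return { jsonrpc: '2.0', method, params, id };
}

// Assumption: the reply is a JSON-RPC 2.0 response object.
function unwrapRpcResponse(payload) {
  if (payload && payload.error) {
    const { code, message } = payload.error;
    throw new Error(`MCP call failed (${code}): ${message}`);
  }
  return payload ? payload.result : undefined;
}

// Hypothetical wrapper around fetch (global in Node 18+).
// The default URL matches the examples above.
async function callMcp(method, params = {}, url = 'http://localhost:3001') {
  const response = await fetch(url, {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify(buildRpcRequest(method, params)),
  });
  if (!response.ok) {
    throw new Error(`HTTP ${response.status} from MCP server`);
  }
  return unwrapRpcResponse(await response.json());
}
```

With this in place, the canvas request becomes `await callMcp('extract_xml_canvas')`, and a failed call surfaces as a thrown error instead of a silently ignored response.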
🤖 AI Editor Integration
Works with Popular AI Applications
Browser[X]MCP integrates seamlessly with MCP-compatible AI applications:
| Application | Support | Setup |
|---|---|---|
| Cursor | ✅ Full | Add to .cursor/mcp.json |
| Claude Desktop | ✅ Full | Add to MCP configuration |
| VS Code | ✅ Full | Use MCP extension |
| Windsurf | ✅ Full | MCP server integration |
Cursor Integration
To use Browser[X]MCP with Cursor, add this to your .cursor/mcp.json:
{
  "mcpServers": {
    "browser-x-mcp": {
      "command": "node",
      "args": ["./src/server/daemon.js"],
      "env": {
        "BROWSER_X_MCP_DEBUG": "true",
        "NODE_ENV": "development"
      }
    }
  }
}
Then restart Cursor and start automating your browser with AI! 🚀
🔧 Configuration
Environment Variables
Create a .env file based on .env.example:
# Copy the example file
cp .env.example .env
# Edit with your settings
nano .env
Required environment variables:
# AI Configuration (required for AI testing)
OPENROUTER_API_KEY=your_openrouter_api_key_here
OPENROUTER_MODEL=deepseek/deepseek-r1:free
# Server Configuration
MCP_PORT=3001
BROWSER_HEADLESS=false
Note: Get your OpenRouter API key from openrouter.ai
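Environment variables arrive as strings, so the port and the headless flag need parsing before use. The loader below is a minimal sketch of that step: `loadConfig` is a hypothetical name, and the fallback values simply mirror the example values above rather than documented defaults.

```javascript
// Hypothetical loader: reads the variables listed above from an env-like object.
function loadConfig(env = process.env) {
  const port = Number.parseInt(env.MCP_PORT ?? '3001', 10);
  if (!Number.isInteger(port) || port < 1 || port > 65535) {
    throw new Error(`Invalid MCP_PORT: ${env.MCP_PORT}`);
  }
  return {
    apiKey: env.OPENROUTER_API_KEY ?? '',
    // Assumed fallback: the model named in the example above.
    model: env.OPENROUTER_MODEL ?? 'deepseek/deepseek-r1:free',
    port,
    // Only the string "true" (any casing) enables headless mode.
    headless: String(env.BROWSER_HEADLESS ?? 'false').toLowerCase() === 'true',
  };
}
```

Passing the environment as an argument keeps the function easy to test, for example `loadConfig({ MCP_PORT: '4000' })`.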
Test Configuration
const config = {
  maxIterations: 30,
  stopOnFailure: true,
  useMockAI: false,
  headless: false,
  loopThreshold: 2
};
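One plausible reading of `loopThreshold` is "how many times the same action may repeat in a row before the run is treated as stuck". The sketch below implements that reading only; how the project actually interprets the setting is not stated here, and `createLoopDetector` is a hypothetical name.

```javascript
// Hypothetical detector: reports a loop once the same action has been seen
// more than `loopThreshold` times consecutively.
function createLoopDetector(loopThreshold = 2) {
  let lastKey = null;
  let repeats = 0;
  return function isLooping(action) {
    // Serialise the action so that equal actions compare equal.
    // Note: this is sensitive to property order.
    const key = JSON.stringify(action);
    repeats = key === lastKey ? repeats + 1 : 1;
    lastKey = key;
    return repeats > loopThreshold;
  };
}
```

With the threshold of 2 from the configuration above, the third identical action in a row is flagged, and any different action resets the counter.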
📊 Test Reports
Browser[X]MCP generates comprehensive test reports:
{
  "testMetadata": {
    "testType": "MCP AI-Powered Form Interaction Test",
    "timestamp": "2025-01-20T19:30:22.508Z",
    "duration": "45.2 seconds",
    "model": "deepseek/deepseek-r1:free"
  },
  "results": {
    "totalActions": 12,
    "successfulActions": 12,
    "failedActions": 0,
    "successRate": "100.00%",
    "aiDecisions": [...]
  }
}
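The `results` counters in a report like the one above can be derived from a list of per-action outcomes. The following is a sketch under the assumption that each logged action carries a boolean `success` field; that field name and `summarizeActions` are hypothetical.

```javascript
// Hypothetical summary: derives the report counters from logged actions.
function summarizeActions(actions) {
  const totalActions = actions.length;
  const successfulActions = actions.filter((a) => a.success === true).length;
  const failedActions = totalActions - successfulActions;
  // Guard against dividing by zero when nothing was executed.
  const rate = totalActions === 0 ? 0 : (successfulActions / totalActions) * 100;
  return {
    totalActions,
    successfulActions,
    failedActions,
    successRate: `${rate.toFixed(2)}%`,
  };
}
```

Twelve successful actions out of twelve give a `successRate` of `"100.00%"`, matching the sample report.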
🛠️ Development
Running Tests
# AI-powered form testing
npm test
# Alternative AI test command
npm run test:ai
# Mock AI testing (faster, no API required)
npm run test:mock
# View test page manually
npm run test:page
Adding New Features
- Server Extensions: Add new MCP methods in src/server/index.js
- AI Capabilities: Enhance AI logic in test/ai-mcp-interaction-test.js
- Extractors: Create new page analyzers in src/extractor/
🗺️ Roadmap
🎯 Planned Features
🖼️ Screenshot Analysis Tools
- Visual element detection and coordinate mapping
- Cropped screenshot analysis for targeted interactions
- AI-powered click coordinate determination
- Visual regression testing capabilities
🧠 Enhanced AI Integration
- Multi-model AI support (GPT-4, Claude, Local models)
- Custom AI prompt templates
- Learning from user interactions
- Adaptive testing strategies
🌐 Extended Browser Support
- Multi-browser testing (Chrome, Firefox, Safari)
- Browser profile management
- Existing browser connection support
- Extension-based automation
🔍 Advanced Analysis
- Performance monitoring and optimization
- Accessibility testing integration
- SEO analysis capabilities
- Security vulnerability scanning
📱 Cross-Platform Support
- Mobile browser automation
- Responsive design testing
- Touch interaction simulation
- Device emulation
🚀 Priority Features
- Screenshot analyzer tool implementation
- Enhanced error handling and recovery
- Performance optimization
- Comprehensive documentation
🎨 Future Vision
- Visual testing framework
- Multi-browser orchestration
- Cloud deployment options
- Enterprise features
🤝 Contributing
We welcome contributions! Please see our Contributing Guide for details.
Development Setup
git clone https://github.com/rnd-pro/browser-x-mcp.git
cd browser-x-mcp
npm install
npm run dev
Submitting Changes
- Fork the repository
- Create a feature branch: git checkout -b feature/amazing-feature
- Commit changes: git commit -m 'Add amazing feature'
- Push to branch: git push origin feature/amazing-feature
- Open a Pull Request
📝 License
This project is licensed under the MIT License - see the LICENSE file for details.
👥 Development Team
Developed by RND-PRO Team
- 🌐 Website: rnd-pro.com
- 💼 Professional development team specializing in innovative automation solutions
- 🤖 Experts in AI integration and browser automation technologies
🙏 Acknowledgments
- Built on top of Playwright for reliable browser automation
- Inspired by the MCP (Model Context Protocol) specification
- AI integration powered by OpenRouter and various LLM providers
- Similar to Browser MCP but with advanced AI testing capabilities
📞 Support
- 📧 Issues: GitHub Issues
- 💬 Discussions: GitHub Discussions
- 📖 Documentation: Wiki
Made with ❤️ by RND-PRO Team for the AI automation community