mcp-macos-companion

Author: MR M

MCP macOS Companion - Backend Services: Intelligent workflow orchestration, Claude AI integration, and voice command processing for macOS automation

🏗️ MCP macOS Companion - Backend Services

A comprehensive backend architecture for macOS automation built with Model Context Protocol (MCP) integration, providing intelligent workflow orchestration, Claude AI processing, and seamless voice command execution.

🎯 Overview

This system serves as the backend infrastructure for the MCP Companion Voice App, handling AI processing, workflow automation, and system integration through a distributed microservices architecture.

Core Architecture

┌─────────────────┐    HTTP/REST    ┌──────────────────┐
│   SwiftUI App   │ ◄─────────────► │  Flask Services  │
│   (Frontend)    │                 │   (Backend)      │
│                 │                 │                  │
│ • Voice UI      │                 │ • Claude AI      │
│ • Speech Rec    │                 │ • Workflows      │
│ • TTS Output    │                 │ • Orchestration  │
│ • Menu Bar      │                 │ • Service Reg    │
└─────────────────┘                 └──────────────────┘

🏗️ Service Architecture

Core Services

Service Registry (Port 8081)

Service Discovery: Automatic detection and health monitoring
Health Checks: Real-time service status tracking
Load Balancing: Request distribution across service instances
Configuration Management: Centralized service configuration
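The health-monitoring idea above can be illustrated with a small in-memory sketch. This is not the project's registry code: the class names, the heartbeat approach, and the 30-second staleness window are all assumptions chosen for illustration.

```python
import time
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class ServiceRecord:
    """One registered service (field names are illustrative assumptions)."""
    name: str
    url: str
    health_endpoint: str = "/health"
    last_seen: float = field(default_factory=time.time)


class InMemoryRegistry:
    """Toy registry: a service counts as healthy if it reported in recently."""

    def __init__(self, ttl_seconds: float = 30.0):
        self.ttl_seconds = ttl_seconds
        self._services: Dict[str, ServiceRecord] = {}

    def register(self, name: str, url: str) -> None:
        self._services[name] = ServiceRecord(name=name, url=url)

    def heartbeat(self, name: str, now: Optional[float] = None) -> None:
        # Record the time of the latest successful health report.
        self._services[name].last_seen = time.time() if now is None else now

    def status(self, now: Optional[float] = None) -> Dict[str, str]:
        # Compare each service's last report against the staleness window.
        now = time.time() if now is None else now
        return {
            name: "healthy" if now - rec.last_seen <= self.ttl_seconds else "stale"
            for name, rec in self._services.items()
        }
```

Passing `now` explicitly keeps the staleness logic easy to test without waiting on a real clock.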

Workflow Orchestrator (Port 8084)

Template Engine: Reusable automation patterns
Parameter Interpolation: Dynamic value substitution
Cross-Service Communication: HTTP-based inter-service messaging
Error Recovery: Robust retry logic and failure handling
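A rough sketch of how parameter interpolation and retry logic could fit together is shown here. The `$name` / `${name}` placeholder syntax, the function names, and the exponential backoff policy are assumptions; the orchestrator's actual template syntax is not documented in this README.

```python
import time
from string import Template
from typing import Any, Callable, Dict


def interpolate(value: Any, params: Dict[str, str]) -> Any:
    """Recursively substitute $name / ${name} placeholders in strings."""
    if isinstance(value, str):
        # safe_substitute leaves unknown placeholders untouched instead of raising.
        return Template(value).safe_substitute(params)
    if isinstance(value, dict):
        return {k: interpolate(v, params) for k, v in value.items()}
    if isinstance(value, list):
        return [interpolate(v, params) for v in value]
    return value


def with_retries(call: Callable[[], Any], attempts: int = 3, base_delay: float = 0.5) -> Any:
    """Run call(), retrying with exponential backoff; re-raise the last failure."""
    for attempt in range(attempts):
        try:
            return call()
        except Exception:
            if attempt == attempts - 1:
                raise
            time.sleep(base_delay * (2 ** attempt))
```

For example, `interpolate({"path": "/Volumes/${volume}"}, {"volume": "NAS RAID"})` yields a step with the path filled in, and wrapping an inter-service HTTP call in `with_retries` covers transient failures.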

Claude Service Bridge (Port 8092)

Natural Language Processing: Commands parsed through Claude AI
Context Awareness: Maintains conversation state and history
Action Mapping: Intelligent translation of voice to system actions
Fallback Responses: Graceful handling when the AI is unavailable

Memory Engine (Port 8082)

Context Storage: Persistent conversation and system state
Retrieval System: Fast access to historical interactions
Data Indexing: Efficient search across stored contexts
Session Management: User and workflow state tracking

Voice Processing Service (Port 8085)

Speech-to-Text: Real-time audio transcription
Audio Processing: Noise reduction and level detection
Session Management: Recording state and result storage
Integration APIs: Seamless frontend communication

🎤 Voice Command Processing

Intelligent Recognition Pipeline

Voice Input → Speech Recognition → Claude Processing → Action Mapping → System Execution

Supported Commands

"Open NAS RAID" - NAS storage access (also accepts common misrecognitions such as "NAS equality" and "NAS rate")
"Open 4TB SSD" - External storage access
"Open MCP stuff folder" - Project directory navigation
"Take a screenshot" - Screen capture with notifications
File search operations - Intelligent file system queries
System automation tasks - Various macOS integrations

Error-Tolerant Processing

Speech Recognition Errors: Handles common misinterpretations
Context Disambiguation: Uses conversation history for clarity
Graceful Fallbacks: Provides helpful responses when commands fail
Multi-Pattern Matching: Supports various phrasings for the same action
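One way to approximate multi-pattern matching without calling the AI is fuzzy string matching from the standard library. The phrase table, the action names, and the 0.75 similarity cutoff below are hypothetical; only the "NAS equality" and "NAS rate" variants come from the command list above.

```python
import difflib
from typing import Dict, List, Optional

# Hypothetical phrase table: canonical action -> accepted phrasings.
COMMAND_PATTERNS: Dict[str, List[str]] = {
    "open_nas_raid": ["open nas raid", "open nas equality", "open nas rate"],
    "open_4tb_ssd": ["open 4tb ssd"],
    "take_screenshot": ["take a screenshot", "take screenshot"],
}


def match_command(transcript: str, cutoff: float = 0.75) -> Optional[str]:
    """Return the action whose phrasing is closest to the transcript, or None."""
    # Normalize case and whitespace before comparing.
    text = " ".join(transcript.lower().split())
    phrase_to_action = {p: a for a, ps in COMMAND_PATTERNS.items() for p in ps}
    best = difflib.get_close_matches(text, list(phrase_to_action), n=1, cutoff=cutoff)
    return phrase_to_action[best[0]] if best else None
```

A matcher like this could run before the Claude bridge and hand off to it only when no phrase clears the cutoff.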

🔧 Technical Implementation

Flask Microservices Framework

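# Example service structure
class WorkflowOrchestrator:
    def __init__(self):
        # ServiceRegistry and WorkflowTemplates are project classes defined elsewhere
        self.service_registry = ServiceRegistry()
        self.templates = WorkflowTemplates()

    def execute_workflow(self, trigger_type, trigger_data):
        # Pick the template for this trigger; `find_template` is a
        # hypothetical lookup, named here only to make the flow explicit
        workflow = self.templates.find_template(trigger_type, trigger_data)
        # Parameter interpolation and execution
        return self.orchestrate_services(workflow)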

Service Communication

HTTP REST APIs: Standardized inter-service communication
JSON Message Format: Structured data exchange
Async Processing: Non-blocking operation handling
Error Propagation: Comprehensive error reporting

AppleScript Integration

-- Enhanced system control
tell application "Finder"
    open folder "NAS RAID"
    activate
end tell

display notification "Folder opened successfully" with title "MCP Companion"

🚀 Quick Start

Prerequisites

Python 3.8+
Flask and dependencies
macOS 13.0+ (for AppleScript integration)
Optional: Claude API access for enhanced AI features

Installation

Clone the repository:

git clone https://github.com/TIMBOTGPT/mcp-macos-companion.git
cd mcp-macos-companion

Install dependencies:

pip3 install flask flask-cors requests --break-system-packages

Start core services:

# Start service registry (required first)
python3 services/service_registry.py &

# Start workflow orchestrator
python3 services/workflow_orchestrator.py &

# Start Claude bridge
python3 services/simple_claude_bridge.py &

# Optional: Start additional services
python3 services/memory_engine.py &
python3 services/voice_processing.py &

Verify Installation

# Check service status
curl http://localhost:8081/services  # Service registry
curl http://localhost:8084/health    # Workflow orchestrator  
curl http://localhost:8092/health    # Claude bridge

# Test workflow execution
curl -X POST http://localhost:8084/trigger \
  -H "Content-Type: application/json" \
  -d '{"trigger_type": "voice_command", "trigger_data": {"command": "take a screenshot"}}'
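The same status checks can be scripted in Python when curl output is awkward to parse. This sketch only assumes the three URLs from the curl commands above; the function names are illustrative.

```python
import http.client
import urllib.error
import urllib.request
from typing import Callable, Dict

# Endpoints taken from the curl commands above.
ENDPOINTS: Dict[str, str] = {
    "service_registry": "http://localhost:8081/services",
    "workflow_orchestrator": "http://localhost:8084/health",
    "claude_bridge": "http://localhost:8092/health",
}


def http_ok(url: str, timeout: float = 2.0) -> bool:
    """True if the URL answers with a 2xx status within the timeout."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return 200 <= resp.status < 300
    except (urllib.error.URLError, http.client.HTTPException, OSError):
        # Connection refused, timeout, or an HTTP error status.
        return False


def check_all(probe: Callable[[str], bool] = http_ok) -> Dict[str, bool]:
    """Probe every endpoint and report which services responded."""
    return {name: probe(url) for name, url in ENDPOINTS.items()}
```

Calling `check_all()` with the services running should report `True` for each entry; the `probe` parameter exists so the function can be exercised without a live backend.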

📁 Project Structure

mcp-macos-companion/
├── services/                    # Core backend services
│   ├── service_registry.py      # Service discovery and health
│   ├── workflow_orchestrator.py # Main automation engine
│   ├── simple_claude_bridge.py  # Claude AI integration
│   ├── memory_engine.py         # Context and state storage
│   ├── voice_processing.py      # Speech processing service
│   └── ...                      # Additional utility services
├── workflows/                   # Workflow templates and configs
│   ├── templates/               # Reusable workflow patterns
│   └── configs/                 # Service configurations
├── tests/                       # Service and integration tests
├── docs/                        # Architecture documentation
├── requirements.txt             # Python dependencies
└── README.md                    # This file

🔧 Service Configuration

Environment Variables

# Service ports (defaults)
export SERVICE_REGISTRY_PORT=8081
export WORKFLOW_ORCHESTRATOR_PORT=8084
export CLAUDE_BRIDGE_PORT=8092
export MEMORY_ENGINE_PORT=8082
export VOICE_PROCESSING_PORT=8085

# Optional: Claude API
export CLAUDE_API_KEY="your_key_here"
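A service could resolve these variables along the following lines. The variable names and defaults are the ones listed above; the `load_ports` helper and its range check are assumptions, not code from the repository.

```python
import os
from typing import Dict, Mapping

DEFAULT_PORTS: Dict[str, int] = {
    "SERVICE_REGISTRY_PORT": 8081,
    "WORKFLOW_ORCHESTRATOR_PORT": 8084,
    "CLAUDE_BRIDGE_PORT": 8092,
    "MEMORY_ENGINE_PORT": 8082,
    "VOICE_PROCESSING_PORT": 8085,
}


def load_ports(env: Mapping[str, str] = os.environ) -> Dict[str, int]:
    """Resolve each port from the environment, falling back to the default."""
    ports = {}
    for name, default in DEFAULT_PORTS.items():
        raw = env.get(name, "")
        # An unset or blank variable means "use the default".
        port = int(raw) if raw.strip() else default
        if not 1 <= port <= 65535:
            raise ValueError(f"{name} must be in 1-65535, got {port}")
        ports[name] = port
    return ports
```

Failing fast on an out-of-range port tends to be easier to diagnose than a bind error later in startup.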

Service Registry Configuration

{
  "services": {
    "workflow_orchestrator": {
      "url": "http://localhost:8084",
      "health_endpoint": "/health",
      "required": true
    },
    "claude_bridge": {
      "url": "http://localhost:8092", 
      "health_endpoint": "/health",
      "required": false
    }
  }
}
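To show how the `required` flag might be used, the sketch below derives health-check URLs from the configuration and lists required services that are not healthy. The helper names are hypothetical, and how the real registry treats `required` is not documented here.

```python
import json
from typing import Dict, List

# Same shape as the JSON above, inlined so the sketch runs on its own.
REGISTRY_CONFIG = json.loads("""
{
  "services": {
    "workflow_orchestrator": {"url": "http://localhost:8084", "health_endpoint": "/health", "required": true},
    "claude_bridge": {"url": "http://localhost:8092", "health_endpoint": "/health", "required": false}
  }
}
""")


def health_urls(config: dict) -> Dict[str, str]:
    """Map each service name to its full health-check URL."""
    return {
        name: spec["url"].rstrip("/") + spec["health_endpoint"]
        for name, spec in config["services"].items()
    }


def missing_required(config: dict, healthy: Dict[str, bool]) -> List[str]:
    """Names of required services that are absent or failing."""
    return sorted(
        name
        for name, spec in config["services"].items()
        if spec.get("required", False) and not healthy.get(name, False)
    )
```

With this configuration, an unavailable Claude bridge would not block startup, while a missing workflow orchestrator would be reported.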

🎯 Workflow Templates

Example: Screenshot Workflow

{
  "name": "capture_screen",
  "description": "Take screenshot with notification",
  "steps": [
    {
      "action": "execute_applescript",
      "script": "do shell script \"screencapture ~/Desktop/screenshot.png\""
    },
    {
      "action": "show_notification", 
      "title": "Screenshot Taken",
      "message": "Saved to Desktop"
    }
  ]
}
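A template like this could be turned into `osascript` invocations roughly as follows. This is a dry-run sketch: it only builds the command lines, and the mapping of `show_notification` to AppleScript's `display notification` is an assumption about how the orchestrator handles that action.

```python
from typing import Dict, List


def applescript_quote(text: str) -> str:
    """Wrap text in double quotes, escaping backslashes and quotes for AppleScript."""
    return '"' + text.replace("\\", "\\\\").replace('"', '\\"') + '"'


def step_to_command(step: Dict[str, str]) -> List[str]:
    """Translate one template step into an osascript argument list."""
    action = step["action"]
    if action == "execute_applescript":
        return ["osascript", "-e", step["script"]]
    if action == "show_notification":
        script = (
            f"display notification {applescript_quote(step['message'])} "
            f"with title {applescript_quote(step['title'])}"
        )
        return ["osascript", "-e", script]
    raise ValueError(f"unsupported action: {action}")


def plan_workflow(template: dict) -> List[List[str]]:
    """Return the commands a template would run, in order, without running them."""
    return [step_to_command(step) for step in template["steps"]]
```

On macOS, each planned command could then be passed to `subprocess.run(cmd, check=True)`; keeping planning separate from execution makes templates easy to inspect first.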

Example: Voice Command Processing

def process_voice_command(command_text):
    # Parse through Claude AI
    response = claude_bridge.process_command(command_text)
    
    # Execute mapped actions
    if response.action == "open_folder":
        return execute_folder_open(response.target)
    elif response.action == "take_screenshot":
        return execute_screenshot()
    
    return {"status": "unknown_command"}
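The snippet above depends on helpers defined elsewhere in the project. A self-contained variant, with the parser and handlers injected, might look like this; `ParsedCommand`, the handler table, and the `bridge_unavailable` status are hypothetical names used only for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional


@dataclass
class ParsedCommand:
    """Result of parsing a voice command (shape assumed from the snippet above)."""
    action: str
    target: Optional[str] = None


Handler = Callable[[ParsedCommand], dict]


def process_voice_command(
    command_text: str,
    parse: Callable[[str], ParsedCommand],
    handlers: Dict[str, Handler],
) -> dict:
    """Parse the command, then dispatch to the handler registered for its action."""
    try:
        parsed = parse(command_text)
    except Exception as exc:
        # Fallback response when the parsing backend cannot be reached.
        return {"status": "bridge_unavailable", "detail": str(exc)}
    handler = handlers.get(parsed.action)
    if handler is None:
        return {"status": "unknown_command"}
    return handler(parsed)
```

A dictionary of handlers avoids a growing `if`/`elif` chain as more actions are added.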

🔍 API Documentation

Workflow Orchestrator API

`POST /trigger`

Execute a workflow based on trigger type and data.

curl -X POST http://localhost:8084/trigger \
  -H "Content-Type: application/json" \
  -d '{
    "trigger_type": "voice_command",
    "trigger_data": {
      "command": "open nas raid",
      "confidence": 0.95
    }
  }'
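The same request can be issued from Python with only the standard library. The payload mirrors the curl example; the helper names are illustrative, and treating the response body as JSON is an assumption based on the JSON message format mentioned under Service Communication.

```python
import json
import urllib.request


def build_trigger_request(
    command: str,
    confidence: float = 1.0,
    base_url: str = "http://localhost:8084",
) -> urllib.request.Request:
    """Build the POST /trigger request without sending it."""
    payload = {
        "trigger_type": "voice_command",
        "trigger_data": {"command": command, "confidence": confidence},
    }
    return urllib.request.Request(
        f"{base_url}/trigger",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send_trigger(command: str, confidence: float = 1.0, timeout: float = 5.0) -> dict:
    """Send the trigger and decode the response (assumed to be JSON)."""
    req = build_trigger_request(command, confidence)
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

With the orchestrator running, `send_trigger("open nas raid", 0.95)` would issue the same call as the curl command.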

`GET /workflows`

List available workflow templates.

Claude Bridge API

`POST /process_command`

Process natural language command through Claude AI.

curl -X POST http://localhost:8092/process_command \
  -H "Content-Type: application/json" \
  -d '{"command": "take a screenshot"}'

`GET /health`

Check Claude service availability.

Service Registry API

`GET /services`

List all registered services and their status.

`POST /register`

Register a service with the registry.

🛠️ Development

Adding New Services

Create the service in the services/ directory
Implement a health check endpoint (/health)
Register with the service registry on startup
Add configuration to workflows as needed
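As a rough illustration of these steps, here is a minimal service skeleton. It uses the standard library's `http.server` as a dependency-free stand-in for a Flask route, and the service name, port 8099, and registration field names are all hypothetical; the registry's actual `/register` payload is not documented in this README.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

SERVICE_NAME = "example_service"  # hypothetical service name
SERVICE_PORT = 8099               # hypothetical port, not used by the core services


def health_payload() -> dict:
    """Body returned by GET /health."""
    return {"service": SERVICE_NAME, "status": "healthy"}


def registration_payload() -> dict:
    """Body for POST /register; field names mirror the registry config and are assumptions."""
    return {
        "name": SERVICE_NAME,
        "url": f"http://localhost:{SERVICE_PORT}",
        "health_endpoint": "/health",
    }


class HealthHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # Only the health endpoint is implemented in this skeleton.
        if self.path != "/health":
            self.send_error(404)
            return
        body = json.dumps(health_payload()).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def serve() -> None:
    """Block and serve health checks on the loopback interface."""
    HTTPServer(("127.0.0.1", SERVICE_PORT), HealthHandler).serve_forever()
```

A real service would post `registration_payload()` to the registry's `/register` endpoint before calling `serve()`.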

Creating Workflow Templates

# services/workflow_templates.py
def create_custom_workflow():
    return {
        "name": "my_workflow",
        "steps": [
            {"action": "custom_action", "params": {...}}
        ]
    }

Testing Services

# Run service tests
python3 -m pytest tests/

# Test individual service
curl http://localhost:8084/health

🐛 Troubleshooting

Common Issues

Services Not Starting

Check port availability: lsof -i :8081-8092
Verify Python dependencies: pip3 list | grep flask
Review service logs for specific errors

Voice Commands Not Working

Ensure Claude bridge is running: curl http://localhost:8092/health
Check workflow orchestrator status
Verify AppleScript permissions in System Settings

Service Discovery Failing

Confirm service registry is accessible: curl http://localhost:8081/services
Check network connectivity between services
Review service registration logs

Debug Mode

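# Enable debug logging
export FLASK_DEBUG=1            # Flask 2.3+ (FLASK_ENV was removed there)
export FLASK_ENV=development    # only read by Flask versions before 2.3
export DEBUG=true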

# Run with enhanced logging
python3 services/workflow_orchestrator.py --debug

🔗 Integration

Frontend Integration

The backend services are designed to work with the MCP Companion Voice App:

// SwiftUI integration example
func executeVoiceCommand(_ command: String) async {
    let url = URL(string: "http://localhost:8084/trigger")!
    // ... HTTP request to backend
}

Third-Party Integration

Claude AI API: Natural language processing
AppleScript: macOS system integration
HTTP REST: Service communication
JSON: Data exchange format

📊 Performance

Service Metrics

Response Time: < 200ms for most operations
Concurrent Requests: Supports 50+ simultaneous requests
Memory Usage: ~50MB per service
CPU Usage: < 10% under normal load

Scaling Considerations

Horizontal scaling via service registry
Load balancing across service instances
Database integration for persistent storage
Container deployment with Docker

🔮 Future Enhancements

Docker containerization for easy deployment
Database integration for persistent workflows
Web dashboard for service monitoring
Plugin system for custom integrations
Distributed deployment across multiple machines

📄 License

🤝 Contributing

This is currently a private project. For collaboration opportunities, please contact the repository owner.

Built with ❤️ for intelligent macOS automation

Repository: https://github.com/TIMBOTGPT/mcp-macos-companion

Author Information

MR M

Businessperson and Engineer tinkering with AI down under

QLD - Australia