MCP-RAG-DATABASE

このプロジェクトは、TypeScriptで構築されたデータベース関連のツールです。機能は限られているものの、データ管理やストレージに関する基本的な操作をサポートしています。ドキュメントが不十分で、ユーザーが理解するのに苦労する可能性があります。

GitHubスター

1

ユーザー評価

未評価

お気に入り

0

閲覧数

25

フォーク

0

イシュー

0

README
🚀 Enterprise Multi-MCP Smart Database System

TypeScript
Node.js
License
Build Status
Coverage

Revolutionary database replacement technology that eliminates SQL and uses Multi-Context Processors (MCPs) with dual RAG intelligence for natural language data management.


🎯 What Makes This Revolutionary?
Traditional Databases
  • Rigid schemas and complex SQL
  • Monolithic architecture
  • Manual optimization required
  • Limited intelligence
Enterprise Multi-MCP System
  • No SQL required - Natural language queries only
  • Domain-specific intelligence - Each MCP understands its data
  • Auto-optimization - Self-organizing and performance-tuning
  • AI-powered - RAG systems for intelligent data management

🏗️ System Architecture
High-Level Architecture Diagram
graph TB
    subgraph "📱 Client Layer"
        CLIENT[Client Applications]
        API[REST API Endpoints]
        WS[WebSocket Real-time]
    end
    
    subgraph "🧠 Intelligence Layer"
        RAG1[RAG₁<br/>Intelligent Ingestion]
        RAG2[RAG₂<br/>Natural Language Queries]
        ML[ML Classification Engine]
        NLP[NLP Query Parser]
    end
    
    subgraph "🎛️ MCP Registry & Orchestration"
        REGISTRY[MCP Registry<br/>Central Coordinator]
        LB[Load Balancer]
        HEALTH[Health Monitor]
        MIGRATE[Migration Engine]
    end
    
    subgraph "🔥 HOT MCPs - Frequently Accessed"
        USER_MCP[👤 UserMCP<br/>Profiles & Auth]
        CHAT_MCP[💬 ChatMCP<br/>Messages & Threads]
        STATS_MCP[📊 StatsMCP<br/>Real-time Analytics]
    end
    
    subgraph "❄️ COLD MCPs - Archived Data"
        LOGS_MCP[📝 LogsMCP<br/>System Logs]
        ARCHIVE_MCP[🗄️ ArchiveMCP<br/>Historical Data]
        BACKUP_MCP[💾 BackupMCP<br/>Data Backups]
    end
    
    subgraph "⚡ Performance Layer"
        CACHE[Intelligent Cache]
        INDEX[Auto-Indexing]
        PREDICT[Predictive Loading]
    end
    
    CLIENT --> API
    CLIENT --> WS
    API --> RAG1
    API --> RAG2
    
    RAG1 --> ML
    RAG1 --> REGISTRY
    RAG2 --> NLP
    RAG2 --> REGISTRY
    
    REGISTRY --> LB
    REGISTRY --> HEALTH
    REGISTRY --> MIGRATE
    
    LB --> USER_MCP
    LB --> CHAT_MCP
    LB --> STATS_MCP
    LB --> LOGS_MCP
    LB --> ARCHIVE_MCP
    LB --> BACKUP_MCP
    
    HEALTH --> MIGRATE
    MIGRATE -.-> USER_MCP
    MIGRATE -.-> CHAT_MCP
    MIGRATE -.-> STATS_MCP
    
    USER_MCP --> CACHE
    CHAT_MCP --> CACHE
    STATS_MCP --> INDEX
    LOGS_MCP --> PREDICT
    
    classDef hotMcp fill:#ff6b6b,stroke:#d63031,stroke-width:2px,color:#fff
    classDef coldMcp fill:#74b9ff,stroke:#0984e3,stroke-width:2px,color:#fff
    classDef intelligence fill:#a29bfe,stroke:#6c5ce7,stroke-width:2px,color:#fff
    classDef registry fill:#fd79a8,stroke:#e84393,stroke-width:2px,color:#fff
    classDef performance fill:#55a3ff,stroke:#2d96ff,stroke-width:2px,color:#fff
    
    class USER_MCP,CHAT_MCP,STATS_MCP hotMcp
    class LOGS_MCP,ARCHIVE_MCP,BACKUP_MCP coldMcp
    class RAG1,RAG2,ML,NLP intelligence
    class REGISTRY,LB,HEALTH,MIGRATE registry
    class CACHE,INDEX,PREDICT performance
Data Flow Architecture
sequenceDiagram
    participant C as Client
    participant API as API Server
    participant RAG1 as RAG₁ Ingestion
    participant RAG2 as RAG₂ Query
    participant REG as MCP Registry
    participant USER as UserMCP
    participant CHAT as ChatMCP
    participant CACHE as Cache Layer
    
    Note over C,CACHE: Data Ingestion Flow
    C->>API: POST /api/v1/ingest
    API->>RAG1: Raw data classification
    RAG1->>RAG1: ML analysis & routing decision
    RAG1->>REG: Request optimal MCP
    REG->>RAG1: Return UserMCP instance
    RAG1->>USER: Store structured data
    USER->>CACHE: Update cache
    USER->>API: Confirm storage
    API->>C: Success response
    
    Note over C,CACHE: Natural Language Query Flow
    C->>API: POST /api/v1/query {"naturalQuery": "get active users"}
    API->>RAG2: Parse natural language
    RAG2->>RAG2: Intent recognition & entity extraction
    RAG2->>REG: Determine target MCPs
    REG->>RAG2: Return [UserMCP, ChatMCP]
    
    par Parallel MCP Queries
        RAG2->>USER: Execute user query
        RAG2->>CHAT: Execute activity query
    end
    
    USER->>RAG2: Return user data
    CHAT->>RAG2: Return activity data
    RAG2->>RAG2: Aggregate & optimize results
    RAG2->>API: Combined results
    API->>C: JSON response

🧠 Dual RAG Intelligence System
RAG₁: Intelligent Data Ingestion
flowchart TD
    START([Raw Data Input]) --> ANALYZE{AI Classification}
    ANALYZE -->|User Data| USER_ROUTE[Route to UserMCP]
    ANALYZE -->|Chat Message| CHAT_ROUTE[Route to ChatMCP]
    ANALYZE -->|Metrics| STATS_ROUTE[Route to StatsMCP]
    ANALYZE -->|Log Entry| LOGS_ROUTE[Route to LogsMCP]
    
    USER_ROUTE --> USER_PROCESS[Process & Index]
    CHAT_ROUTE --> CHAT_PROCESS[Extract Mentions/Tags]
    STATS_ROUTE --> STATS_PROCESS[Time-series Analysis]
    LOGS_ROUTE --> LOGS_PROCESS[Log Level Classification]
    
    USER_PROCESS --> STORE[(Optimized Storage)]
    CHAT_PROCESS --> STORE
    STATS_PROCESS --> STORE
    LOGS_PROCESS --> STORE
    
    STORE --> LEARN[Pattern Learning]
    LEARN --> OPTIMIZE[Optimize Future Routing]
    
    classDef process fill:#fdcb6e,stroke:#e17055
    classDef decision fill:#a29bfe,stroke:#6c5ce7
    classDef storage fill:#55a3ff,stroke:#2d96ff
    classDef learn fill:#fd79a8,stroke:#e84393
    
    class USER_PROCESS,CHAT_PROCESS,STATS_PROCESS,LOGS_PROCESS process
    class ANALYZE decision
    class STORE storage
    class LEARN,OPTIMIZE learn
RAG₂: Natural Language Query Processing
flowchart TD
    QUERY([Natural Language Query]) --> PARSE{NLP Parsing}
    PARSE --> INTENT[Intent Recognition]
    PARSE --> ENTITY[Entity Extraction]
    PARSE --> CONTEXT[Context Analysis]
    
    INTENT --> PLAN[Query Execution Plan]
    ENTITY --> PLAN
    CONTEXT --> PLAN
    
    PLAN --> ROUTE{MCP Routing}
    ROUTE -->|User Query| USER_MCP[👤 UserMCP]
    ROUTE -->|Chat Query| CHAT_MCP[💬 ChatMCP]
    ROUTE -->|Stats Query| STATS_MCP[📊 StatsMCP]
    ROUTE -->|Logs Query| LOGS_MCP[📝 LogsMCP]
    
    USER_MCP --> RESULTS[Result Collection]
    CHAT_MCP --> RESULTS
    STATS_MCP --> RESULTS
    LOGS_MCP --> RESULTS
    
    RESULTS --> AGGREGATE[Intelligent Aggregation]
    AGGREGATE --> OPTIMIZE[Result Optimization]
    OPTIMIZE --> CACHE[Cache Results]
    CACHE --> RESPONSE([Structured Response])
    
    classDef nlp fill:#a29bfe,stroke:#6c5ce7,color:#fff
    classDef mcp fill:#ff6b6b,stroke:#d63031,color:#fff
    classDef process fill:#55a3ff,stroke:#2d96ff,color:#fff
    
    class PARSE,INTENT,ENTITY,CONTEXT nlp
    class USER_MCP,CHAT_MCP,STATS_MCP,LOGS_MCP mcp
    class PLAN,RESULTS,AGGREGATE,OPTIMIZE,CACHE process

🎛️ MCP Registry & Hot/Cold Classification
Dynamic MCP Management
stateDiagram-v2
    [*] --> Initializing
    Initializing --> Active: MCP Created
    
    Active --> Hot: High Access Frequency
    Active --> Cold: Low Access Frequency
    
    Hot --> Optimized: Performance Tuning
    Cold --> Archived: Data Compression
    
    Optimized --> Hot: Continued Use
    Archived --> Cold: Occasional Access
    
    Hot --> Migrating: Access Pattern Change
    Cold --> Migrating: Access Pattern Change
    
    Migrating --> Hot: Migration to Hot Tier
    Migrating --> Cold: Migration to Cold Tier
    
    Active --> Failed: Health Check Fail
    Hot --> Failed: Critical Error
    Cold --> Failed: System Error
    
    Failed --> Recovering: Auto-Recovery
    Recovering --> Active: Recovery Success
    Recovering --> [*]: Recovery Failed
    
    note right of Hot
        🔥 HOT MCPs
        - In-memory caching
        - Optimized indexes
        - Real-time processing
    end note
    
    note right of Cold
        ❄️ COLD MCPs
        - Compressed storage
        - Batch processing
        - Archive optimization
    end note
MCP Classification Algorithm
flowchart TD
    DATA[MCP Metrics] --> SCORE{Calculate Score}
    
    SCORE --> ACCESS_FREQ[Access Frequency<br/>Weight: 40%]
    SCORE --> RESPONSE_TIME[Response Time<br/>Weight: 25%]
    SCORE --> DATA_SIZE[Data Growth<br/>Weight: 20%]
    SCORE --> USER_PRIORITY[User Priority<br/>Weight: 15%]
    
    ACCESS_FREQ --> CALCULATE[Weighted Score Calculation]
    RESPONSE_TIME --> CALCULATE
    DATA_SIZE --> CALCULATE
    USER_PRIORITY --> CALCULATE
    
    CALCULATE --> THRESHOLD{Score >= 70?}
    THRESHOLD -->|Yes| HOT[🔥 HOT Tier]
    THRESHOLD -->|No| COLD_CHECK{Score >= 30?}
    
    COLD_CHECK -->|Yes| WARM[🌡️ WARM Tier]
    COLD_CHECK -->|No| COLD[❄️ COLD Tier]
    
    HOT --> HOT_FEATURES[• In-memory cache<br/>• SSD storage<br/>• Real-time indexing<br/>• Priority processing]
    WARM --> WARM_FEATURES[• Hybrid cache<br/>• Standard storage<br/>• Periodic indexing<br/>• Normal processing]
    COLD --> COLD_FEATURES[• Disk cache<br/>• Archive storage<br/>• Batch indexing<br/>• Background processing]
    
    classDef hot fill:#ff6b6b,stroke:#d63031,stroke-width:2px,color:#fff
    classDef warm fill:#fdcb6e,stroke:#e17055,stroke-width:2px,color:#fff
    classDef cold fill:#74b9ff,stroke:#0984e3,stroke-width:2px,color:#fff
    
    class HOT,HOT_FEATURES hot
    class WARM,WARM_FEATURES warm
    class COLD,COLD_FEATURES cold

🏢 Enterprise Features
Security Architecture
graph TB
    subgraph "🛡️ Security Layers"
        AUTH[Authentication Layer]
        AUTHZ[Authorization Layer]
        ENCRYPT[Encryption Layer]
        AUDIT[Audit Layer]
    end
    
    subgraph "🔐 Authentication"
        JWT[JWT Tokens]
        OAUTH[OAuth 2.0]
        MFA[Multi-Factor Auth]
        SSO[Single Sign-On]
    end
    
    subgraph "🎫 Authorization"
        RBAC[Role-Based Access]
        ABAC[Attribute-Based Access]
        POLICIES[Policy Engine]
        PERMISSIONS[Permission Matrix]
    end
    
    subgraph "🔒 Encryption"
        TLS[TLS/SSL Transit]
        AES[AES-256 At Rest]
        KEY_MGMT[Key Management]
        FIELD[Field-Level Encryption]
    end
    
    subgraph "📋 Auditing"
        ACCESS_LOG[Access Logging]
        CHANGE_LOG[Change Tracking]
        COMPLIANCE[Compliance Reports]
        MONITORING[Security Monitoring]
    end
    
    AUTH --> JWT
    AUTH --> OAUTH
    AUTH --> MFA
    AUTH --> SSO
    
    AUTHZ --> RBAC
    AUTHZ --> ABAC
    AUTHZ --> POLICIES
    AUTHZ --> PERMISSIONS
    
    ENCRYPT --> TLS
    ENCRYPT --> AES
    ENCRYPT --> KEY_MGMT
    ENCRYPT --> FIELD
    
    AUDIT --> ACCESS_LOG
    AUDIT --> CHANGE_LOG
    AUDIT --> COMPLIANCE
    AUDIT --> MONITORING
Performance & Monitoring
graph TB
    subgraph "📊 Monitoring Dashboard"
        METRICS[Real-time Metrics]
        ALERTS[Alert System]
        HEALTH[Health Checks]
        PERF[Performance Analytics]
    end
    
    subgraph "⚡ Performance Optimization"
        CACHE[Multi-Level Caching]
        INDEX[Auto-Indexing]
        PREDICT[Predictive Loading]
        BALANCE[Load Balancing]
    end
    
    subgraph "🎯 Key Metrics"
        QPS[Queries Per Second]
        LATENCY[Response Latency]
        THROUGHPUT[Data Throughput]
        UPTIME[System Uptime]
    end
    
    subgraph "🚨 Alerting"
        THRESHOLD[Threshold Alerts]
        ANOMALY[Anomaly Detection]
        ESCALATION[Alert Escalation]
        NOTIFICATION[Multi-Channel Notifications]
    end
    
    METRICS --> QPS
    METRICS --> LATENCY
    METRICS --> THROUGHPUT
    METRICS --> UPTIME
    
    ALERTS --> THRESHOLD
    ALERTS --> ANOMALY
    ALERTS --> ESCALATION
    ALERTS --> NOTIFICATION
    
    PERF --> CACHE
    PERF --> INDEX
    PERF --> PREDICT
    PERF --> BALANCE

🚀 Quick Start
Prerequisites
  • Node.js 18+
  • npm 8+
  • TypeScript 5.3+
Installation & Setup
# Clone the repository
git clone https://github.com/notyesbut/MCP-RAG-DATABASE.git
cd MCP-RAG-DATABASE

# Install dependencies
npm install

# Build the project
npm run build

# Start the system
npm start

✅ System is now running at http://localhost:3000

Verify Installation
# Health check
curl http://localhost:3000/api/v1/health

# Expected response:
{
  "status": "healthy",
  "timestamp": 1640995200000,
  "uptime": 125.5,
  "mcps": {
    "total": 4,
    "active": 4,
    "hot": 2,
    "cold": 2
  }
}

💡 Usage Examples
1. Natural Language Queries (No SQL!)
# Query active users
curl -X POST http://localhost:3000/api/v1/query \
  -H "Content-Type: application/json" \
  -d '{
    "naturalQuery": "show me all active users from last week"
  }'

# Search messages
curl -X POST http://localhost:3000/api/v1/query \
  -H "Content-Type: application/json" \
  -d '{
    "naturalQuery": "find messages from john about project alpha"
  }'

# Get statistics
curl -X POST http://localhost:3000/api/v1/query \
  -H "Content-Type: application/json" \
  -d '{
    "naturalQuery": "show user registration stats for this month"
  }'
2. Intelligent Data Ingestion
# Ingest user data (automatically classified)
curl -X POST http://localhost:3000/api/v1/ingest \
  -H "Content-Type: application/json" \
  -d '{
    "data": {
      "type": "user_registration",
      "userId": "user_123",
      "email": "john@example.com",
      "registrationDate": "2024-01-15T10:30:00Z",
      "plan": "premium"
    }
  }'

# Ingest chat message (auto-tagged and indexed)
curl -X POST http://localhost:3000/api/v1/ingest \
  -H "Content-Type: application/json" \
  -d '{
    "data": {
      "type": "chat_message",
      "messageId": "msg_456",
      "userId": "user_123",
      "content": "Great work on #project-alpha @team!",
      "timestamp": "2024-01-15T14:45:00Z"
    }
  }'
3. Real-time WebSocket Connection
const WebSocket = require('ws');

const ws = new WebSocket('ws://localhost:3000/ws');

ws.on('open', function open() {
  // Subscribe to real-time events
  ws.send(JSON.stringify({
    type: 'subscribe',
    events: ['user_activity', 'system_metrics', 'new_messages']
  }));
});

ws.on('message', function message(data) {
  const event = JSON.parse(data);
  console.log('Real-time event:', event);
});

📊 Performance Benchmarks
Response Time Targets
Operation Type Target Achieved Status
Natural Language Query <100ms 85ms
Data Ingestion <50ms 42ms
MCP Classification <25ms 18ms
Health Check <10ms 7ms
WebSocket Events <5ms 3ms
Scalability Metrics
Metric Current Target Max Tested
Concurrent Users 1,000 10,000 5,000
Queries/Second 2,500 25,000 12,000
Data Ingestion Rate 10,000/sec 100,000/sec 50,000/sec
MCP Instances 50 500 200
Memory Usage 2GB 16GB 8GB

🧪 Testing
Run Test Suite
# Unit tests
npm run test:unit

# Integration tests  
npm run test:integration

# Performance tests
npm run test:performance

# Complete test suite
npm run test
Load Testing
# Default load test (50 users, 30 seconds)
./scripts/load-test.sh

# Custom load test
./scripts/load-test.sh 100 50 20 120
# Parameters: [users] [requests_per_user] [ramp_up_seconds] [duration_seconds]

📈 Monitoring & Observability
Metrics Dashboard

Access the built-in dashboard at: http://localhost:3000/dashboard

Key Metrics Endpoints
# System health
GET /api/v1/health

# Detailed metrics
GET /api/v1/metrics

# MCP status
GET /api/v1/admin/mcps

# Performance stats
GET /api/v1/admin/performance
Logging

Logs are structured JSON and written to:

  • Console: Development mode
  • Files: Production mode (./logs/)
  • External: Configurable (ELK, Splunk, etc.)

🔧 Configuration
Environment Variables
# Server Configuration
PORT=3000                          # API server port
NODE_ENV=production               # Environment mode

# Database Configuration  
DB_HOST=localhost                 # Database host
DB_PORT=5432                     # Database port
DB_NAME=enterprise_mcp           # Database name

# MCP Configuration
MCP_MAX_INSTANCES=100            # Maximum MCP instances
MCP_HOT_THRESHOLD=70             # Hot classification threshold
MCP_COLD_THRESHOLD=30            # Cold classification threshold

# RAG Configuration
RAG1_ML_MODEL=bert-base          # ML model for classification
RAG2_NLP_MODEL=gpt-3.5-turbo    # NLP model for queries

# Security Configuration
JWT_SECRET=your-secret-key       # JWT signing secret
AUTH_ENABLED=true                # Enable authentication
ENCRYPTION_KEY=your-encrypt-key  # Data encryption key

# Performance Configuration
CACHE_TTL=3600                   # Cache TTL in seconds
INDEX_OPTIMIZATION=true          # Enable auto-indexing
PREDICTIVE_CACHE=true           # Enable predictive caching
Advanced Configuration

See src/config/ for detailed configuration options:

  • database.config.ts - Database settings
  • mcp.config.ts - MCP-specific configuration
  • rag.config.ts - RAG system configuration
  • security.config.ts - Security settings
  • performance.config.ts - Performance tuning

🌐 API Reference
Core Endpoints
Method Endpoint Description
POST /api/v1/query Natural language queries
POST /api/v1/ingest Data ingestion
GET /api/v1/health System health check
GET /api/v1/metrics System metrics
WS /ws WebSocket connection
Admin Endpoints
Method Endpoint Description
GET /api/v1/admin/mcps MCP status
POST /api/v1/admin/mcps/migrate Force MCP migration
GET /api/v1/admin/performance Performance analytics
POST /api/v1/admin/cache/clear Clear system cache
OpenAPI Documentation

Interactive API documentation available at: http://localhost:3000/api-docs


🏗️ Architecture Details
Technology Stack
  • Runtime: Node.js 18+, TypeScript 5.3+
  • Web Framework: Express.js with security middleware
  • Real-time: WebSocket support with Socket.io
  • AI/ML: TensorFlow.js, Natural language processing
  • Caching: Multi-level caching with Redis compatibility
  • Monitoring: Built-in metrics with Prometheus compatibility
  • Security: JWT authentication, encryption, audit logging
Design Patterns
  • Multi-Context Processor (MCP): Domain-specific data processors
  • Registry Pattern: Centralized MCP management and discovery
  • Strategy Pattern: Pluggable algorithms for routing and optimization
  • Observer Pattern: Event-driven architecture for real-time updates
  • Factory Pattern: Dynamic MCP creation and configuration
Key Components
  1. MCP Registry: Central coordinator for all MCP instances
  2. RAG₁ Controller: Intelligent data ingestion and classification
  3. RAG₂ Controller: Natural language query processing
  4. Specialized MCPs: Domain-specific processors (User, Chat, Stats, Logs)
  5. Intelligence Layer: ML models and optimization algorithms
  6. API Server: RESTful API with WebSocket support
  7. Monitoring System: Real-time metrics and health monitoring

🤝 Contributing
Development Setup
# Fork and clone
git clone https://github.com/notyesbut/MCP-RAG-DATABASE.git
cd MCP-RAG-DATABASE

# Install development dependencies
npm install

# Run in development mode
npm run dev

# Run tests
npm test

# Lint code
npm run lint

# Type check
npm run typecheck
Code Style
  • TypeScript: Strict mode enabled
  • ESLint: Airbnb configuration with TypeScript extensions
  • Prettier: Consistent code formatting
  • Husky: Pre-commit hooks for quality assurance
Pull Request Process
  1. Create feature branch from main
  2. Implement changes with tests
  3. Ensure all tests pass
  4. Update documentation if needed
  5. Submit pull request with detailed description

📜 License

This project is licensed under the BLS1.1 License - see the LICENSE file for details.


🆘 Support & Community
Getting Help
Community

🎉 What's Next?

The Enterprise Multi-MCP Smart Database System represents the future of intelligent data management. With its revolutionary architecture eliminating SQL and providing natural language interfaces, this system is ready to transform how enterprises handle data.

Upcoming Features
  • Federated Learning: Cross-MCP learning and optimization
  • Quantum Optimization: Quantum-inspired query optimization
  • Blockchain Integration: Immutable audit trails
  • Edge Computing: Distributed MCP deployment
  • Advanced AI: GPT-4 integration for enhanced intelligence

🚀 Enterprise Multi-MCP Smart Database System

No SQL. Just Natural Language. The Future is Now.