# MCP AI Agents RAG - Production-Ready Implementation
A sophisticated Retrieval-Augmented Generation (RAG) system using Model Context Protocol (MCP) with Firecrawl web search and Qdrant vector database. This production-ready implementation features modular architecture, comprehensive error handling, and configurable environments.
## 🏗️ Architecture Overview

The application follows a clean, modular architecture:

```
mcp_ai_agents_rag/
├── config.py                  # Centralized configuration management
├── server.py                  # MCP server implementation
├── setup_database.py          # Database initialization script
├── requirements.txt           # Python dependencies
├── .env.example               # Environment variables template
│
├── core/                      # Core business logic
│   ├── __init__.py
│   ├── embedding_service.py   # Text embedding generation
│   ├── vector_database.py     # Qdrant vector database operations
│   └── retrieval_service.py   # High-level document retrieval
│
├── services/                  # External service integrations
│   ├── __init__.py
│   └── web_search_service.py  # Firecrawl web search integration
│
├── utils/                     # Utility functions
│   ├── __init__.py
│   └── batch_processing.py    # Efficient batch processing utilities
│
└── data/                      # Data management
    ├── __init__.py
    └── ml_faq_data.py         # Machine learning FAQ dataset
```
## ✨ Key Features
### 🔧 Production-Ready Architecture
- Modular Design: Clean separation of concerns with dedicated modules
- Comprehensive Error Handling: Custom exceptions and proper error propagation
- Configurable Environment: All settings managed through environment variables
- Type Safety: Full type hints throughout the codebase
- Logging: Structured logging for monitoring and debugging
### 🚀 Performance Optimizations
- Batch Processing: Efficient handling of large datasets
- Memory Management: Optimized vector storage and retrieval
- Connection Pooling: Efficient database connections
- Async-Ready: Architecture prepared for async operations
### 📚 Comprehensive Documentation

- Google-Style Docstrings: Complete API documentation (see the example below)
- Type Annotations: Clear parameter and return types
- Usage Examples: Practical implementation examples
- Error Documentation: Detailed exception handling guide
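
For instance, a Google-style docstring with type annotations might look like this (an illustrative example, not code taken from the repository):

```python
def embed_texts(texts: list[str], batch_size: int = 32) -> list[list[float]]:
    """Generate embeddings for a batch of input texts.

    Args:
        texts: Input strings to embed.
        batch_size: Number of texts encoded per forward pass.

    Returns:
        One embedding vector per input text.

    Raises:
        EmbeddingServiceError: If the underlying model fails to encode.
    """
    ...  # hypothetical signature, for illustration only
```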
## 🚀 Quick Start

### 1. Environment Setup

```bash
# Clone or navigate to the project directory
cd mcp_ai_agents_rag

# Install dependencies
pip install -r requirements.txt

# Set up environment variables
cp .env.example .env
# Edit .env with your configuration values
```
### 2. Configure Environment Variables

Edit your `.env` file with the required configuration:

```bash
# API Keys
FIRECRAWL_API_KEY=your_firecrawl_api_key_here

# Vector Database Configuration
QDRANT_HOST=localhost
QDRANT_PORT=6333
QDRANT_COLLECTION_NAME=ml_faq_collection

# Server Configuration
MCP_SERVER_HOST=127.0.0.1
MCP_SERVER_PORT=8080
```
### 3. Start Qdrant Database

```bash
docker run -p 6333:6333 -p 6334:6334 -v "$(pwd)/qdrant_storage:/qdrant/storage:z" qdrant/qdrant
```
### 4. Initialize Database

```bash
# Set up the vector database with FAQ data
python setup_database.py

# Or with custom options
python setup_database.py --collection-name custom_collection --recreate --verbose
```
## 🔧 Configuration

### Environment Variables

| Variable | Description | Default |
|---|---|---|
| `FIRECRAWL_API_KEY` | Firecrawl API key (required) | - |
| `QDRANT_HOST` | Qdrant server host | `localhost` |
| `QDRANT_PORT` | Qdrant server port | `6333` |
| `QDRANT_COLLECTION_NAME` | Vector database collection name | `ml_faq_collection` |
| `EMBEDDING_MODEL_NAME` | HuggingFace embedding model | `nomic-ai/nomic-embed-text-v1.5` |
| `MCP_SERVER_HOST` | MCP server host | `127.0.0.1` |
| `MCP_SERVER_PORT` | MCP server port | `8080` |
| `SEARCH_RESULTS_LIMIT` | Max search results | `3` |
| `WEB_SEARCH_LIMIT` | Max web search results | `10` |
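
As a rough sketch of how `config.py` might centralize these variables (the `Settings` class and `load_settings` function below are illustrative assumptions, not the module's actual API):

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class Settings:
    """Illustrative settings container; the real config.py may differ."""
    firecrawl_api_key: str
    qdrant_host: str
    qdrant_port: int
    qdrant_collection_name: str


def load_settings() -> Settings:
    # Fail fast if the one required variable is missing.
    api_key = os.environ.get("FIRECRAWL_API_KEY")
    if not api_key:
        raise RuntimeError("FIRECRAWL_API_KEY must be set (see .env.example)")
    return Settings(
        firecrawl_api_key=api_key,
        qdrant_host=os.environ.get("QDRANT_HOST", "localhost"),
        qdrant_port=int(os.environ.get("QDRANT_PORT", "6333")),
        qdrant_collection_name=os.environ.get(
            "QDRANT_COLLECTION_NAME", "ml_faq_collection"
        ),
    )
```

Loading everything in one place keeps validation failures near startup rather than deep inside a request.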
### MCP Client Configuration

Add to your Cursor IDE MCP configuration:

```json
{
  "mcpServers": {
    "mcp-rag-app": {
      "command": "python",
      "args": ["/absolute/path/to/server.py"],
      "host": "127.0.0.1",
      "port": 8080,
      "timeout": 30000
    }
  }
}
```
### MCP Tools Available

- `retrieve_ml_faq_documents(query: str)` - Retrieves relevant ML FAQ documents from the vector database. Use for machine learning questions (see the registration sketch below).
- `search_web_content(query: str)` - Searches external web sources using Firecrawl. Use for current events or topics not covered by the FAQ.
- `get_service_health()` - Returns the health status and configuration of all services. Use for monitoring and debugging.
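
For illustration, tools like these are typically registered with FastMCP roughly as follows (a minimal sketch assuming the official `mcp` Python SDK; the project's actual `server.py` may be organized differently, and the tool bodies here are placeholders):

```python
from mcp.server.fastmcp import FastMCP

mcp = FastMCP("mcp-rag-app")


@mcp.tool()
def retrieve_ml_faq_documents(query: str) -> str:
    """Retrieve relevant ML FAQ documents from the vector database."""
    # Placeholder body: the real tool would call core/retrieval_service.py.
    return f"FAQ results for: {query}"


@mcp.tool()
def search_web_content(query: str) -> str:
    """Search external web sources using Firecrawl."""
    # Placeholder body: the real tool would call services/web_search_service.py.
    return f"Web results for: {query}"


if __name__ == "__main__":
    mcp.run()
```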
## 🧪 Testing and Development

### Running the Setup Script

```bash
# Basic setup
python setup_database.py

# Recreate the collection with verbose logging
python setup_database.py --recreate --verbose

# Custom collection name
python setup_database.py --collection-name my_custom_collection
```
### Service Health Check

```python
from server import document_retrieval_service

# Get service information
health_info = document_retrieval_service.get_service_info()
print(health_info)
```
## 📊 Performance Considerations

### Batch Processing

- Documents are processed in configurable batches (default: 32 for embeddings, 512 for database operations); see the sketch below
- Memory-efficient handling of large datasets
- Progress tracking for long-running operations
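
A minimal sketch of the kind of helper `utils/batch_processing.py` might provide (the `batched` name and signature are assumptions for illustration):

```python
from typing import Iterable, Iterator, List, TypeVar

T = TypeVar("T")


def batched(items: Iterable[T], batch_size: int = 32) -> Iterator[List[T]]:
    """Yield fixed-size batches so large datasets never sit in memory at once."""
    batch: List[T] = []
    for item in items:
        batch.append(item)
        if len(batch) == batch_size:
            yield batch
            batch = []
    if batch:  # flush the final partial batch
        yield batch


# Usage: embed documents 32 at a time instead of all at once.
# for chunk in batched(documents, batch_size=32):
#     embeddings = model.encode(chunk)
```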
### Vector Database Optimization

- On-disk storage for memory efficiency
- Optimized indexing after ingestion
- Quantization support for faster search (see the sketch below)
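
For example, a collection with on-disk vectors and scalar quantization can be created with the Qdrant Python client roughly like this (a sketch under assumptions: the 768-dimension size matches nomic-embed-text-v1.5's default output, and the collection name mirrors the configuration above):

```python
from qdrant_client import QdrantClient
from qdrant_client.models import (
    Distance,
    ScalarQuantization,
    ScalarQuantizationConfig,
    ScalarType,
    VectorParams,
)

client = QdrantClient(host="localhost", port=6333)

client.create_collection(
    collection_name="ml_faq_collection",
    # Keep full-precision vectors on disk to reduce memory use.
    vectors_config=VectorParams(size=768, distance=Distance.COSINE, on_disk=True),
    # int8 scalar quantization trades a little accuracy for faster search.
    quantization_config=ScalarQuantization(
        scalar=ScalarQuantizationConfig(type=ScalarType.INT8, always_ram=True)
    ),
)
```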
### Error Handling

- Custom exceptions for different error types (sketched below)
- Graceful degradation when services are unavailable
- Comprehensive logging for debugging
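
The exception hierarchy might look something like the following (class names are illustrative assumptions, not the project's actual API):

```python
class RAGServiceError(Exception):
    """Base class for all service-level errors in the application."""


class EmbeddingServiceError(RAGServiceError):
    """Embedding generation failed."""


class VectorDatabaseError(RAGServiceError):
    """A Qdrant operation failed."""


class WebSearchError(RAGServiceError):
    """A Firecrawl web search failed."""
```

A shared base class lets callers catch, say, `VectorDatabaseError` and degrade gracefully to web search instead of failing the whole request.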
## 🔍 Monitoring and Logging

The application provides structured logging at multiple levels:

```python
import logging

# Configure the logging level for the application namespace
logging.getLogger('mcp_ai_agents_rag').setLevel(logging.DEBUG)

# Key loggers:
# - config: configuration loading and validation
# - core.embedding_service: embedding generation operations
# - core.vector_database: database operations
# - services.web_search_service: web search operations
# - server: MCP server operations
```
## 🤝 Contributing
This refactored codebase follows production-ready standards:
- PEP 8 compliance for code style
- Type hints throughout the codebase
- Comprehensive docstrings using Google style
- Error handling with custom exceptions
- Modular architecture for maintainability
- Configuration management through environment variables
## 🔒 Security Considerations
- API keys stored in environment variables
- Input validation on all user inputs
- Error messages that don't expose internal details
- Secure configuration management
## 📈 Scalability Features
- Configurable batch sizes for different workloads
- Memory-efficient processing for large datasets
- Modular services that can be deployed independently
- Database optimization for large-scale deployments
Ready for production deployment with monitoring, error handling, and scalable architecture.