# MCP AI Agents RAG - Production-Ready Implementation
A sophisticated Retrieval-Augmented Generation (RAG) system using Model Context Protocol (MCP) with Firecrawl web search and Qdrant vector database. This production-ready implementation features modular architecture, comprehensive error handling, and configurable environments.
## 🏗️ Architecture Overview

The application follows a clean, modular architecture:

```
mcp_ai_agents_rag/
├── config.py                  # Centralized configuration management
├── server.py                  # MCP server implementation
├── setup_database.py          # Database initialization script
├── requirements.txt           # Python dependencies
├── .env.example               # Environment variables template
│
├── core/                      # Core business logic
│   ├── __init__.py
│   ├── embedding_service.py   # Text embedding generation
│   ├── vector_database.py     # Qdrant vector database operations
│   └── retrieval_service.py   # High-level document retrieval
│
├── services/                  # External service integrations
│   ├── __init__.py
│   └── web_search_service.py  # Firecrawl web search integration
│
├── utils/                     # Utility functions
│   ├── __init__.py
│   └── batch_processing.py    # Efficient batch processing utilities
│
└── data/                      # Data management
    ├── __init__.py
    └── ml_faq_data.py         # Machine learning FAQ dataset
```
## ✨ Key Features
### 🔧 Production-Ready Architecture
- Modular Design: Clean separation of concerns with dedicated modules
- Comprehensive Error Handling: Custom exceptions and proper error propagation
- Configurable Environment: All settings managed through environment variables
- Type Safety: Full type hints throughout the codebase
- Logging: Structured logging for monitoring and debugging
### 🚀 Performance Optimizations
- Batch Processing: Efficient handling of large datasets
- Memory Management: Optimized vector storage and retrieval
- Connection Pooling: Efficient database connections
- Async-Ready: Architecture prepared for async operations
### 📚 Comprehensive Documentation

- Google-Style Docstrings: Complete API documentation (see the example below)
- Type Annotations: Clear parameter and return types
- Usage Examples: Practical implementation examples
- Error Documentation: Detailed exception handling guide
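
For instance, a Google-style docstring with type annotations might look like this (an illustrative example, not code taken from the repository):

```python
def embed_texts(texts: list[str], batch_size: int = 32) -> list[list[float]]:
    """Generate embeddings for a batch of input texts.

    Args:
        texts: Input strings to embed.
        batch_size: Number of texts encoded per forward pass.

    Returns:
        One embedding vector per input text.

    Raises:
        EmbeddingServiceError: If the underlying model fails to encode.
    """
    ...  # hypothetical signature, for illustration only
```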
## 🚀 Quick Start

### 1. Environment Setup

```bash
# Clone or navigate to the project directory
cd mcp_ai_agents_rag

# Install dependencies
pip install -r requirements.txt

# Set up environment variables
cp .env.example .env
# Edit .env with your configuration values
```
### 2. Configure Environment Variables

Edit your `.env` file with the required configuration:

```bash
# API Keys
FIRECRAWL_API_KEY=your_firecrawl_api_key_here

# Vector Database Configuration
QDRANT_HOST=localhost
QDRANT_PORT=6333
QDRANT_COLLECTION_NAME=ml_faq_collection

# Server Configuration
MCP_SERVER_HOST=127.0.0.1
MCP_SERVER_PORT=8080
```
### 3. Start Qdrant Database

```bash
docker run -p 6333:6333 -p 6334:6334 -v "$(pwd)/qdrant_storage:/qdrant/storage:z" qdrant/qdrant
```
### 4. Initialize Database

```bash
# Set up the vector database with FAQ data
python setup_database.py

# Or with custom options
python setup_database.py --collection-name custom_collection --recreate --verbose
```
## 🔧 Configuration

### Environment Variables

| Variable | Description | Default |
|---|---|---|
| `FIRECRAWL_API_KEY` | Firecrawl API key (required) | - |
| `QDRANT_HOST` | Qdrant server host | `localhost` |
| `QDRANT_PORT` | Qdrant server port | `6333` |
| `QDRANT_COLLECTION_NAME` | Vector database collection name | `ml_faq_collection` |
| `EMBEDDING_MODEL_NAME` | HuggingFace embedding model | `nomic-ai/nomic-embed-text-v1.5` |
| `MCP_SERVER_HOST` | MCP server host | `127.0.0.1` |
| `MCP_SERVER_PORT` | MCP server port | `8080` |
| `SEARCH_RESULTS_LIMIT` | Max search results | `3` |
| `WEB_SEARCH_LIMIT` | Max web search results | `10` |
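
As a rough sketch of how `config.py` might centralize these variables (the `Settings` class and `load_settings` function below are illustrative assumptions, not the module's actual API):

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class Settings:
    """Illustrative settings container; the real config.py may differ."""
    firecrawl_api_key: str
    qdrant_host: str
    qdrant_port: int
    qdrant_collection_name: str


def load_settings() -> Settings:
    # Fail fast if the one required variable is missing.
    api_key = os.environ.get("FIRECRAWL_API_KEY")
    if not api_key:
        raise RuntimeError("FIRECRAWL_API_KEY must be set (see .env.example)")
    return Settings(
        firecrawl_api_key=api_key,
        qdrant_host=os.environ.get("QDRANT_HOST", "localhost"),
        qdrant_port=int(os.environ.get("QDRANT_PORT", "6333")),
        qdrant_collection_name=os.environ.get(
            "QDRANT_COLLECTION_NAME", "ml_faq_collection"
        ),
    )
```

Loading everything in one place keeps validation failures near startup rather than deep inside a request.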
### MCP Client Configuration

Add to your Cursor IDE MCP configuration:

```json
{
  "mcpServers": {
    "mcp-rag-app": {
      "command": "python",
      "args": ["/absolute/path/to/server.py"],
      "host": "127.0.0.1",
      "port": 8080,
      "timeout": 30000
    }
  }
}
```
### MCP Tools Available

- `retrieve_ml_faq_documents(query: str)` - Retrieves relevant ML FAQ documents from the vector database. Use for machine learning questions (see the registration sketch below).
- `search_web_content(query: str)` - Searches external web sources using Firecrawl. Use for current events or topics not covered by the FAQ.
- `get_service_health()` - Returns the health status and configuration of all services. Use for monitoring and debugging.
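
For illustration, tools like these are typically registered with FastMCP roughly as follows (a minimal sketch assuming the official `mcp` Python SDK; the project's actual `server.py` may be organized differently, and the tool bodies here are placeholders):

```python
from mcp.server.fastmcp import FastMCP

mcp = FastMCP("mcp-rag-app")


@mcp.tool()
def retrieve_ml_faq_documents(query: str) -> str:
    """Retrieve relevant ML FAQ documents from the vector database."""
    # Placeholder body: the real tool would call core/retrieval_service.py.
    return f"FAQ results for: {query}"


@mcp.tool()
def search_web_content(query: str) -> str:
    """Search external web sources using Firecrawl."""
    # Placeholder body: the real tool would call services/web_search_service.py.
    return f"Web results for: {query}"


if __name__ == "__main__":
    mcp.run()
```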
## 🧪 Testing and Development

### Running the Setup Script

```bash
# Basic setup
python setup_database.py

# Recreate the collection with verbose logging
python setup_database.py --recreate --verbose

# Custom collection name
python setup_database.py --collection-name my_custom_collection
```
### Service Health Check

```python
from server import document_retrieval_service

# Get service information
health_info = document_retrieval_service.get_service_info()
print(health_info)
```
## 📊 Performance Considerations

### Batch Processing

- Documents are processed in configurable batches (default: 32 for embeddings, 512 for database operations); see the sketch below
- Memory-efficient handling of large datasets
- Progress tracking for long-running operations
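
A minimal sketch of the kind of helper `utils/batch_processing.py` might provide (the `batched` name and signature are assumptions for illustration):

```python
from typing import Iterable, Iterator, List, TypeVar

T = TypeVar("T")


def batched(items: Iterable[T], batch_size: int = 32) -> Iterator[List[T]]:
    """Yield fixed-size batches so large datasets never sit in memory at once."""
    batch: List[T] = []
    for item in items:
        batch.append(item)
        if len(batch) == batch_size:
            yield batch
            batch = []
    if batch:  # flush the final partial batch
        yield batch


# Usage: embed documents 32 at a time instead of all at once.
# for chunk in batched(documents, batch_size=32):
#     embeddings = model.encode(chunk)
```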
### Vector Database Optimization

- On-disk storage for memory efficiency
- Optimized indexing after ingestion
- Quantization support for faster search (see the sketch below)
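
For example, a collection with on-disk vectors and scalar quantization can be created with the Qdrant Python client roughly like this (a sketch under assumptions: the 768-dimension size matches nomic-embed-text-v1.5's default output, and the collection name mirrors the configuration above):

```python
from qdrant_client import QdrantClient
from qdrant_client.models import (
    Distance,
    ScalarQuantization,
    ScalarQuantizationConfig,
    ScalarType,
    VectorParams,
)

client = QdrantClient(host="localhost", port=6333)

client.create_collection(
    collection_name="ml_faq_collection",
    # Keep full-precision vectors on disk to reduce memory use.
    vectors_config=VectorParams(size=768, distance=Distance.COSINE, on_disk=True),
    # int8 scalar quantization trades a little accuracy for faster search.
    quantization_config=ScalarQuantization(
        scalar=ScalarQuantizationConfig(type=ScalarType.INT8, always_ram=True)
    ),
)
```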
### Error Handling

- Custom exceptions for different error types (sketched below)
- Graceful degradation when services are unavailable
- Comprehensive logging for debugging
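
The exception hierarchy might look something like the following (class names are illustrative assumptions, not the project's actual API):

```python
class RAGServiceError(Exception):
    """Base class for all service-level errors in the application."""


class EmbeddingServiceError(RAGServiceError):
    """Embedding generation failed."""


class VectorDatabaseError(RAGServiceError):
    """A Qdrant operation failed."""


class WebSearchError(RAGServiceError):
    """A Firecrawl web search failed."""
```

A shared base class lets callers catch, say, `VectorDatabaseError` and degrade gracefully to web search instead of failing the whole request.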
## 🔍 Monitoring and Logging

The application provides structured logging at multiple levels:

```python
import logging

# Configure the logging level for the application namespace
logging.getLogger('mcp_ai_agents_rag').setLevel(logging.DEBUG)

# Key loggers:
# - config: configuration loading and validation
# - core.embedding_service: embedding generation operations
# - core.vector_database: database operations
# - services.web_search_service: web search operations
# - server: MCP server operations
```
## 🤝 Contributing
This refactored codebase follows production-ready standards:
- PEP 8 compliance for code style
- Type hints throughout the codebase
- Comprehensive docstrings using Google style
- Error handling with custom exceptions
- Modular architecture for maintainability
- Configuration management through environment variables
## 🔒 Security Considerations
- API keys stored in environment variables
- Input validation on all user inputs
- Error messages that don't expose internal details
- Secure configuration management
## 📈 Scalability Features
- Configurable batch sizes for different workloads
- Memory-efficient processing for large datasets
- Modular services that can be deployed independently
- Database optimization for large-scale deployments
Ready for production deployment with monitoring, error handling, and scalable architecture.