qdrant-mcp-server

MCP server for semantic search using local Qdrant vector database and OpenAI embeddings

GitHub Stars

10

User Rating

Not Rated

Favorites

0

Views

97

Forks

10

Issues

7

README
Qdrant MCP Server

CI
codecov

A Model Context Protocol (MCP) server providing semantic search capabilities using Qdrant vector database with multiple embedding providers.

Features
  • Zero Setup: Works out of the box with Ollama - no API keys required
  • Privacy-First: Local embeddings and vector storage - data never leaves your machine
  • Code Vectorization: Intelligent codebase indexing with AST-aware chunking and semantic code search
  • Multiple Providers: Ollama (default), OpenAI, Cohere, and Voyage AI
  • Hybrid Search: Combine semantic and keyword search for better results
  • Semantic Search: Natural language search with metadata filtering
  • Incremental Indexing: Efficient updates - only re-index changed files
  • Configurable Prompts: Create custom prompts for guided workflows without code changes
  • Rate Limiting: Intelligent throttling with exponential backoff
  • Full CRUD: Create, search, and manage collections and documents
  • Flexible Deployment: Run locally (stdio) or as a remote HTTP server
  • API Key Authentication: Connect to secured Qdrant instances (Qdrant Cloud, self-hosted with API keys)
Quick Start
Prerequisites
  • Node.js 22+
  • Podman or Docker with Compose support
Installation
# Clone and install
git clone https://github.com/mhalder/qdrant-mcp-server.git
cd qdrant-mcp-server
npm install

# Start services (choose one)
podman compose up -d   # Using Podman
docker compose up -d   # Using Docker

# Pull the embedding model
podman exec ollama ollama pull nomic-embed-text  # Podman
docker exec ollama ollama pull nomic-embed-text  # Docker

# Build
npm run build
Configuration
Local Setup (stdio transport)

Add to ~/.claude/claude_code_config.json:

{
  "mcpServers": {
    "qdrant": {
      "command": "node",
      "args": ["/path/to/qdrant-mcp-server/build/index.js"],
      "env": {
        "QDRANT_URL": "http://localhost:6333",
        "EMBEDDING_BASE_URL": "http://localhost:11434"
      }
    }
  }
}
Connecting to Secured Qdrant Instances

For Qdrant Cloud or self-hosted instances with API key authentication:

{
  "mcpServers": {
    "qdrant": {
      "command": "node",
      "args": ["/path/to/qdrant-mcp-server/build/index.js"],
      "env": {
        "QDRANT_URL": "https://your-cluster.qdrant.io:6333",
        "QDRANT_API_KEY": "your-api-key-here",
        "EMBEDDING_BASE_URL": "http://localhost:11434"
      }
    }
  }
}
Remote Setup (HTTP transport)

⚠️ Security Warning: When deploying the HTTP transport in production:

  • Always run behind a reverse proxy (nginx, Caddy) with HTTPS
  • Implement authentication/authorization at the proxy level
  • Use firewalls to restrict access to trusted networks
  • Never expose directly to the public internet without protection
  • Consider implementing rate limiting at the proxy level
  • Monitor server logs for suspicious activity

Start the server:

TRANSPORT_MODE=http HTTP_PORT=3000 node build/index.js

Configure client:

{
  "mcpServers": {
    "qdrant": {
      "url": "http://your-server:3000/mcp"
    }
  }
}

Using a different provider:

"env": {
  "EMBEDDING_PROVIDER": "openai",  // or "cohere", "voyage"
  "OPENAI_API_KEY": "sk-...",      // provider-specific API key
  "QDRANT_URL": "http://localhost:6333"
}

Restart after making changes.

See Advanced Configuration section below for all options.

Tools
Collection Management
Tool Description
create_collection Create collection with specified distance metric (Cosine/Euclid/Dot)
list_collections List all collections
get_collection_info Get collection details and statistics
delete_collection Delete collection and all documents
Document Operations
Tool Description
add_documents Add documents with automatic embedding (supports string/number IDs, metadata)
semantic_search Natural language search with optional metadata filtering
hybrid_search Hybrid search combining semantic and keyword (BM25) search with RRF
delete_documents Delete specific documents by ID
Code Vectorization
Tool Description
index_codebase Index a codebase for semantic code search with AST-aware chunking
search_code Search indexed codebase using natural language queries
reindex_changes Incrementally re-index only changed files (detects added/modified/deleted)
get_index_status Get indexing status and statistics for a codebase
clear_index Delete all indexed data for a codebase
Resources
  • qdrant://collections - List all collections
  • qdrant://collection/{name} - Collection details
Configurable Prompts

Create custom prompts tailored to your specific use cases without modifying code. Prompts provide guided workflows for common tasks.

Note: By default, the server looks for prompts.json in the project root directory. If the file exists, prompts are automatically loaded. You can specify a custom path using the PROMPTS_CONFIG_FILE environment variable.

Setup
  1. Create a prompts configuration file (e.g., prompts.json in the project root):

    See prompts.example.json for example configurations you can copy and customize.

  2. Configure the server (optional - only needed for custom path):

If you place prompts.json in the project root, no additional configuration is needed. To use a custom path:

{
  "mcpServers": {
    "qdrant": {
      "command": "node",
      "args": ["/path/to/qdrant-mcp-server/build/index.js"],
      "env": {
        "QDRANT_URL": "http://localhost:6333",
        "PROMPTS_CONFIG_FILE": "/custom/path/to/prompts.json"
      }
    }
  }
}
  1. Use prompts in your AI assistant:

Claude Code:

/mcp__qdrant__find_similar_docs papers "neural networks" 10

VSCode:

/mcp.qdrant.find_similar_docs papers "neural networks" 10
Example Prompts

See prompts.example.json for ready-to-use prompts including:

  • find_similar_docs - Semantic search with result explanation
  • setup_rag_collection - Create RAG-optimized collections
  • analyze_collection - Collection insights and recommendations
  • bulk_add_documents - Guided bulk document insertion
  • search_with_filter - Metadata filtering assistance
  • compare_search_methods - Semantic vs hybrid search comparison
  • collection_maintenance - Maintenance and cleanup workflows
  • migrate_to_hybrid - Collection migration guide
Template Syntax

Templates use {{variable}} placeholders:

  • Required arguments must be provided
  • Optional arguments use defaults if not specified
  • Unknown variables are left as-is in the output
Code Vectorization

Intelligently index and search your codebase using semantic code search. Perfect for AI-assisted development, code exploration, and understanding large codebases.

Features
  • AST-Aware Chunking: Intelligent code splitting at function/class boundaries using tree-sitter
  • Multi-Language Support: 35+ file types including TypeScript, Python, Java, Go, Rust, C++, and more
  • Incremental Updates: Only re-index changed files for fast updates
  • Smart Ignore Patterns: Respects .gitignore, .dockerignore, and custom .contextignore files
  • Semantic Search: Natural language queries to find relevant code
  • Metadata Filtering: Filter by file type, path patterns, or language
  • Local-First: All processing happens locally - your code never leaves your machine
Quick Start

1. Index your codebase:

# Via Claude Code MCP tool
/mcp__qdrant__index_codebase /path/to/your/project

2. Search your code:

# Natural language search
/mcp__qdrant__search_code /path/to/your/project "authentication middleware"

# Filter by file type
/mcp__qdrant__search_code /path/to/your/project "database schema" --fileTypes .ts,.js

# Filter by path pattern
/mcp__qdrant__search_code /path/to/your/project "API endpoints" --pathPattern src/api/**

3. Update after changes:

# Incrementally re-index only changed files
/mcp__qdrant__reindex_changes /path/to/your/project
Usage Examples
Index a TypeScript Project
// The MCP tool automatically:
// 1. Scans all .ts, .tsx, .js, .jsx files
// 2. Respects .gitignore patterns (skips node_modules, dist, etc.)
// 3. Chunks code at function/class boundaries
// 4. Generates embeddings using your configured provider
// 5. Stores in Qdrant with metadata (file path, line numbers, language)

index_codebase({
  path: "/workspace/my-app",
  forceReindex: false, // Set to true to re-index from scratch
});

// Output:
// ✓ Indexed 247 files (1,823 chunks) in 45.2s
Search for Authentication Code
search_code({
  path: "/workspace/my-app",
  query: "how does user authentication work?",
  limit: 5,
});

// Results include file path, line numbers, and code snippets:
// [
//   {
//     filePath: "src/auth/middleware.ts",
//     startLine: 15,
//     endLine: 42,
//     content: "export async function authenticateUser(req: Request) { ... }",
//     score: 0.89,
//     language: "typescript"
//   },
//   ...
// ]
Search with Filters
// Only search TypeScript files
search_code({
  path: "/workspace/my-app",
  query: "error handling patterns",
  fileTypes: [".ts", ".tsx"],
  limit: 10,
});

// Only search in specific directories
search_code({
  path: "/workspace/my-app",
  query: "API route handlers",
  pathPattern: "src/api/**",
  limit: 10,
});
Incremental Re-indexing
// After making changes to your codebase
reindex_changes({
  path: "/workspace/my-app",
});

// Output:
// ✓ Updated: +3 files added, ~5 files modified, -1 files deleted
// ✓ Chunks: +47 added, -23 deleted in 8.3s
Check Indexing Status
get_index_status({
  path: "/workspace/my-app",
});

// Output:
// {
//   status: "indexed",      // "not_indexed" | "indexing" | "indexed"
//   isIndexed: true,        // deprecated: use status instead
//   collectionName: "code_a3f8d2e1",
//   chunksCount: 1823,
//   filesCount: 247,
//   lastUpdated: "2025-01-30T10:15:00Z",
//   languages: ["typescript", "javascript", "json"]
// }
Supported Languages

Programming Languages (35+ file types):

  • Web: TypeScript, JavaScript, Vue, Svelte
  • Backend: Python, Java, Go, Rust, Ruby, PHP
  • Systems: C, C++, C#
  • Mobile: Swift, Kotlin, Dart
  • Functional: Scala, Clojure, Haskell, OCaml
  • Scripting: Bash, Shell, Fish
  • Data: SQL, GraphQL, Protocol Buffers
  • Config: JSON, YAML, TOML, XML, Markdown

See configuration for full list and customization options.

Custom Ignore Patterns

Create a .contextignore file in your project root to specify additional patterns to ignore:

# .contextignore
**/test/**
**/*.test.ts
**/*.spec.ts
**/fixtures/**
**/mocks/**
**/__tests__/**
Best Practices
  1. Index Once, Update Incrementally: Use index_codebase for initial indexing, then reindex_changes for updates
  2. Use Filters: Narrow search scope with fileTypes and pathPattern for better results
  3. Meaningful Queries: Use natural language that describes what you're looking for (e.g., "database connection pooling" instead of "db")
  4. Check Status First: Use get_index_status to verify a codebase is indexed before searching
  5. Local Embedding: Use Ollama (default) to keep everything local and private
Performance

Typical performance on a modern laptop (Apple M1/M2 or similar):

Codebase Size Files Indexing Time Search Latency
Small (10k LOC) 50 ~10s <100ms
Medium (100k LOC) 500 ~2min <200ms
Large (500k LOC) 2,500 ~10min <500ms

Note: Indexing time varies based on embedding provider. Ollama (local) is fastest for initial indexing.

Examples

See examples/ directory for detailed guides:

Advanced Configuration
Environment Variables
Core Configuration
Variable Description Default
TRANSPORT_MODE "stdio" or "http" stdio
HTTP_PORT Port for HTTP transport 3000
HTTP_REQUEST_TIMEOUT_MS Request timeout for HTTP transport (ms) 300000
EMBEDDING_PROVIDER "ollama", "openai", "cohere", "voyage" ollama
QDRANT_URL Qdrant server URL http://localhost:6333
QDRANT_API_KEY API key for Qdrant authentication -
PROMPTS_CONFIG_FILE Path to prompts configuration JSON prompts.json
Embedding Configuration
Variable Description Default
EMBEDDING_MODEL Model name Provider-specific
EMBEDDING_BASE_URL Custom API URL Provider-specific
EMBEDDING_MAX_REQUESTS_PER_MINUTE Rate limit Provider-specific
EMBEDDING_RETRY_ATTEMPTS Retry count 3
EMBEDDING_RETRY_DELAY Initial retry delay (ms) 1000
OPENAI_API_KEY OpenAI API key -
COHERE_API_KEY Cohere API key -
VOYAGE_API_KEY Voyage AI API key -
Code Vectorization Configuration
Variable Description Default
CODE_CHUNK_SIZE Maximum chunk size in characters 2500
CODE_CHUNK_OVERLAP Overlap between chunks in characters 300
CODE_ENABLE_AST Enable AST-aware chunking (tree-sitter) true
CODE_BATCH_SIZE Number of chunks to embed in one batch 100
CODE_CUSTOM_EXTENSIONS Additional file extensions (comma-separated) -
CODE_CUSTOM_IGNORE Additional ignore patterns (comma-separated) -
CODE_DEFAULT_LIMIT Default search result limit 5
Provider Comparison
Provider Models Dimensions Rate Limit Notes
Ollama nomic-embed-text (default), mxbai-embed-large, all-minilm 768, 1024, 384 None Local, no API key
OpenAI text-embedding-3-small (default), text-embedding-3-large 1536, 3072 3500/min Cloud API
Cohere embed-english-v3.0 (default), embed-multilingual-v3.0 1024 100/min Multilingual support
Voyage voyage-2 (default), voyage-large-2, voyage-code-2 1024, 1536 300/min Code-specialized

Note: Ollama models require pulling before use:

  • Podman: podman exec ollama ollama pull <model-name>
  • Docker: docker exec ollama ollama pull <model-name>
Troubleshooting
Issue Solution
Qdrant not running podman compose up -d or docker compose up -d
Collection missing Create collection first before adding documents
Ollama not running Verify with curl http://localhost:11434, start with podman compose up -d
Model missing podman exec ollama ollama pull nomic-embed-text or docker exec ollama ollama pull ...
Rate limit errors Adjust EMBEDDING_MAX_REQUESTS_PER_MINUTE to match your provider tier
API key errors Verify correct API key in environment configuration
Qdrant unauthorized Set QDRANT_API_KEY environment variable for secured instances
Filter errors Ensure Qdrant filter format, check field names match metadata
Codebase not indexed Run index_codebase before search_code
Slow indexing Use Ollama (local) for faster indexing, or increase CODE_BATCH_SIZE
Files not found Check .gitignore and .contextignore patterns
Search returns no results Try broader queries, check if codebase is indexed with get_index_status
Out of memory during index Reduce CODE_CHUNK_SIZE or CODE_BATCH_SIZE
Development
npm run dev          # Development with auto-reload
npm run build        # Production build
npm run type-check   # TypeScript validation
npm test             # Run test suite
npm run test:coverage # Coverage report
Testing

563 tests across 21 test files with 98%+ coverage:

  • Unit Tests: QdrantManager (54), Ollama (41), OpenAI (25), Cohere (29), Voyage (31), Factory (43), Prompts (50), Transport (15), MCP Server (19)
  • Integration Tests: Code indexer, scanner, chunker, synchronizer, merkle tree

CI/CD: GitHub Actions runs build, type-check, and tests on Node.js 22 LTS for every push/PR.

Contributing

Contributions welcome! See CONTRIBUTING.md for:

  • Development workflow
  • Conventional commit format (feat:, fix:, BREAKING CHANGE:)
  • Testing requirements (run npm test, npm run type-check, npm run build)

Automated releases: Semantic versioning via conventional commits - feat: → minor, fix: → patch, BREAKING CHANGE: → major.

Acknowledgments

The code vectorization feature is inspired by and builds upon concepts from the excellent claude-context project (MIT License, Copyright 2025 Zilliz).

License

MIT - see LICENSE file.