Qdrant MCP Server
A Model Context Protocol (MCP) server providing semantic search capabilities using Qdrant vector database with multiple embedding providers.
Features
- Zero Setup: Works out of the box with Ollama - no API keys required
- Privacy-First: Local embeddings and vector storage - data never leaves your machine
- Code Vectorization: Intelligent codebase indexing with AST-aware chunking and semantic code search
- Multiple Providers: Ollama (default), OpenAI, Cohere, and Voyage AI
- Hybrid Search: Combine semantic and keyword search for better results
- Semantic Search: Natural language search with metadata filtering
- Incremental Indexing: Efficient updates - only re-index changed files
- Configurable Prompts: Create custom prompts for guided workflows without code changes
- Rate Limiting: Intelligent throttling with exponential backoff
- Full CRUD: Create, search, and manage collections and documents
- Flexible Deployment: Run locally (stdio) or as a remote HTTP server
- API Key Authentication: Connect to secured Qdrant instances (Qdrant Cloud, self-hosted with API keys)
Quick Start
Prerequisites
- Node.js 22+
- Podman or Docker with Compose support
Installation
```bash
# Clone and install
git clone https://github.com/mhalder/qdrant-mcp-server.git
cd qdrant-mcp-server
npm install

# Start services (choose one)
podman compose up -d   # Using Podman
docker compose up -d   # Using Docker

# Pull the embedding model
podman exec ollama ollama pull nomic-embed-text   # Podman
docker exec ollama ollama pull nomic-embed-text   # Docker

# Build
npm run build
```
Configuration
Local Setup (stdio transport)
Add to ~/.claude/claude_code_config.json:
```json
{
  "mcpServers": {
    "qdrant": {
      "command": "node",
      "args": ["/path/to/qdrant-mcp-server/build/index.js"],
      "env": {
        "QDRANT_URL": "http://localhost:6333",
        "EMBEDDING_BASE_URL": "http://localhost:11434"
      }
    }
  }
}
```
Connecting to Secured Qdrant Instances
For Qdrant Cloud or self-hosted instances with API key authentication:
```json
{
  "mcpServers": {
    "qdrant": {
      "command": "node",
      "args": ["/path/to/qdrant-mcp-server/build/index.js"],
      "env": {
        "QDRANT_URL": "https://your-cluster.qdrant.io:6333",
        "QDRANT_API_KEY": "your-api-key-here",
        "EMBEDDING_BASE_URL": "http://localhost:11434"
      }
    }
  }
}
```
Remote Setup (HTTP transport)
⚠️ Security Warning: When deploying the HTTP transport in production:
- Always run behind a reverse proxy (nginx, Caddy) with HTTPS
- Implement authentication/authorization at the proxy level
- Use firewalls to restrict access to trusted networks
- Never expose directly to the public internet without protection
- Consider implementing rate limiting at the proxy level
- Monitor server logs for suspicious activity
Start the server:
```bash
TRANSPORT_MODE=http HTTP_PORT=3000 node build/index.js
```
Configure client:
```json
{
  "mcpServers": {
    "qdrant": {
      "url": "http://your-server:3000/mcp"
    }
  }
}
```
Using a different provider:
```json
"env": {
  "EMBEDDING_PROVIDER": "openai",  // or "cohere", "voyage"
  "OPENAI_API_KEY": "sk-...",      // provider-specific API key
  "QDRANT_URL": "http://localhost:6333"
}
```
Restart after making changes.
See Advanced Configuration section below for all options.
Tools
Collection Management
| Tool | Description |
|---|---|
| `create_collection` | Create a collection with the specified distance metric (Cosine/Euclid/Dot) |
| `list_collections` | List all collections |
| `get_collection_info` | Get collection details and statistics |
| `delete_collection` | Delete a collection and all its documents |
Document Operations
| Tool | Description |
|---|---|
| `add_documents` | Add documents with automatic embedding (supports string/number IDs, metadata) |
| `semantic_search` | Natural language search with optional metadata filtering |
| `hybrid_search` | Hybrid search combining semantic and keyword (BM25) search with RRF |
| `delete_documents` | Delete specific documents by ID |
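The fusion step behind `hybrid_search` can be illustrated with a minimal Reciprocal Rank Fusion (RRF) sketch. This is the textbook formulation, not this server's actual code, and `k = 60` is the constant commonly used in the RRF literature rather than a documented setting here:

```typescript
// Reciprocal Rank Fusion: a document's fused score is the sum of
// 1 / (k + rank) over every ranked list it appears in.
function rrfFuse(rankings: string[][], k = 60): string[] {
  const scores = new Map<string, number>();
  for (const ranking of rankings) {
    ranking.forEach((id, idx) => {
      scores.set(id, (scores.get(id) ?? 0) + 1 / (k + idx + 1));
    });
  }
  return [...scores.entries()]
    .sort((a, b) => b[1] - a[1])
    .map(([id]) => id);
}

// "doc2" ranks high in both lists, so it tops the fused ranking.
const semanticRanking = ["doc1", "doc2", "doc3"];
const keywordRanking = ["doc2", "doc4", "doc1"];
console.log(rrfFuse([semanticRanking, keywordRanking]));
// → ["doc2", "doc1", "doc4", "doc3"]
```

RRF only needs ranks, not raw scores, which is why it works well for combining cosine-similarity results with BM25 results whose score scales are incomparable.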
Code Vectorization
| Tool | Description |
|---|---|
| `index_codebase` | Index a codebase for semantic code search with AST-aware chunking |
| `search_code` | Search an indexed codebase using natural language queries |
| `reindex_changes` | Incrementally re-index only changed files (detects added/modified/deleted) |
| `get_index_status` | Get indexing status and statistics for a codebase |
| `clear_index` | Delete all indexed data for a codebase |
Resources
- `qdrant://collections` - List all collections
- `qdrant://collection/{name}` - Collection details
Configurable Prompts
Create custom prompts tailored to your specific use cases without modifying code. Prompts provide guided workflows for common tasks.
Note: By default, the server looks for prompts.json in the project root directory. If the file exists, prompts are automatically loaded. You can specify a custom path using the PROMPTS_CONFIG_FILE environment variable.
Setup

1. Create a prompts configuration file (e.g., prompts.json in the project root). See prompts.example.json for example configurations you can copy and customize.

2. Configure the server (optional - only needed for a custom path). If you place prompts.json in the project root, no additional configuration is needed. To use a custom path:
```json
{
  "mcpServers": {
    "qdrant": {
      "command": "node",
      "args": ["/path/to/qdrant-mcp-server/build/index.js"],
      "env": {
        "QDRANT_URL": "http://localhost:6333",
        "PROMPTS_CONFIG_FILE": "/custom/path/to/prompts.json"
      }
    }
  }
}
```
3. Use prompts in your AI assistant:

Claude Code:

```
/mcp__qdrant__find_similar_docs papers "neural networks" 10
```

VSCode:

```
/mcp.qdrant.find_similar_docs papers "neural networks" 10
```
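For orientation, a prompt definition might look roughly like the following. The field names and structure here are illustrative assumptions, not the documented schema - copy the real shape from prompts.example.json:

```json
{
  "prompts": [
    {
      "name": "find_similar_docs",
      "description": "Semantic search with result explanation",
      "arguments": [
        { "name": "collection", "required": true },
        { "name": "query", "required": true },
        { "name": "limit", "required": false, "default": "5" }
      ],
      "template": "Search the {{collection}} collection for \"{{query}}\" and return the top {{limit}} results, explaining why each one matches."
    }
  ]
}
```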
Example Prompts
See prompts.example.json for ready-to-use prompts including:
- `find_similar_docs` - Semantic search with result explanation
- `setup_rag_collection` - Create RAG-optimized collections
- `analyze_collection` - Collection insights and recommendations
- `bulk_add_documents` - Guided bulk document insertion
- `search_with_filter` - Metadata filtering assistance
- `compare_search_methods` - Semantic vs hybrid search comparison
- `collection_maintenance` - Maintenance and cleanup workflows
- `migrate_to_hybrid` - Collection migration guide
Template Syntax
Templates use {{variable}} placeholders:
- Required arguments must be provided
- Optional arguments use defaults if not specified
- Unknown variables are left as-is in the output
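The substitution rules above can be sketched in a few lines of TypeScript. This is a simplified illustration of the described behavior, not the server's actual implementation:

```typescript
// Substitute {{variable}} placeholders: provided arguments win,
// optional arguments fall back to defaults, and unknown variables
// are left as-is in the output.
function renderTemplate(
  template: string,
  args: Record<string, string>,
  defaults: Record<string, string> = {}
): string {
  return template.replace(/\{\{(\w+)\}\}/g, (match: string, name: string) => {
    if (name in args) return args[name];
    if (name in defaults) return defaults[name];
    return match; // unknown variable: keep the placeholder
  });
}

console.log(
  renderTemplate(
    "Search {{collection}} for {{query}} (limit {{limit}}, {{unknown}})",
    { collection: "papers", query: "neural networks" },
    { limit: "10" }
  )
);
// → Search papers for neural networks (limit 10, {{unknown}})
```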
Code Vectorization
Intelligently index and search your codebase using semantic code search. Perfect for AI-assisted development, code exploration, and understanding large codebases.
Features
- AST-Aware Chunking: Intelligent code splitting at function/class boundaries using tree-sitter
- Multi-Language Support: 35+ file types including TypeScript, Python, Java, Go, Rust, C++, and more
- Incremental Updates: Only re-index changed files for fast updates
- Smart Ignore Patterns: Respects .gitignore, .dockerignore, and custom .contextignore files
- Semantic Search: Natural language queries to find relevant code
- Metadata Filtering: Filter by file type, path patterns, or language
- Local-First: All processing happens locally - your code never leaves your machine
Quick Start
1. Index your codebase:

```
# Via Claude Code MCP tool
/mcp__qdrant__index_codebase /path/to/your/project
```

2. Search your code:

```
# Natural language search
/mcp__qdrant__search_code /path/to/your/project "authentication middleware"

# Filter by file type
/mcp__qdrant__search_code /path/to/your/project "database schema" --fileTypes .ts,.js

# Filter by path pattern
/mcp__qdrant__search_code /path/to/your/project "API endpoints" --pathPattern src/api/**
```

3. Update after changes:

```
# Incrementally re-index only changed files
/mcp__qdrant__reindex_changes /path/to/your/project
```
Usage Examples
Index a TypeScript Project
```javascript
// The MCP tool automatically:
// 1. Scans all .ts, .tsx, .js, .jsx files
// 2. Respects .gitignore patterns (skips node_modules, dist, etc.)
// 3. Chunks code at function/class boundaries
// 4. Generates embeddings using your configured provider
// 5. Stores in Qdrant with metadata (file path, line numbers, language)
index_codebase({
  path: "/workspace/my-app",
  forceReindex: false, // Set to true to re-index from scratch
});

// Output:
// ✓ Indexed 247 files (1,823 chunks) in 45.2s
```
Search for Authentication Code
```javascript
search_code({
  path: "/workspace/my-app",
  query: "how does user authentication work?",
  limit: 5,
});

// Results include file path, line numbers, and code snippets:
// [
//   {
//     filePath: "src/auth/middleware.ts",
//     startLine: 15,
//     endLine: 42,
//     content: "export async function authenticateUser(req: Request) { ... }",
//     score: 0.89,
//     language: "typescript"
//   },
//   ...
// ]
```
Search with Filters
```javascript
// Only search TypeScript files
search_code({
  path: "/workspace/my-app",
  query: "error handling patterns",
  fileTypes: [".ts", ".tsx"],
  limit: 10,
});

// Only search in specific directories
search_code({
  path: "/workspace/my-app",
  query: "API route handlers",
  pathPattern: "src/api/**",
  limit: 10,
});
```
Incremental Re-indexing
```javascript
// After making changes to your codebase
reindex_changes({
  path: "/workspace/my-app",
});

// Output:
// ✓ Updated: +3 files added, ~5 files modified, -1 files deleted
// ✓ Chunks: +47 added, -23 deleted in 8.3s
```
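Under the hood, incremental re-indexing needs a way to detect which files changed since the last run. The test suite mentions a merkle tree; a flat content-hash diff (an illustrative sketch, not the project's actual code) captures the core idea:

```typescript
import { createHash } from "node:crypto";

type Snapshot = Record<string, string>; // file path -> content hash

function hashContent(content: string): string {
  return createHash("sha256").update(content).digest("hex");
}

// Diff two snapshots into the added/modified/deleted sets that
// drive an incremental re-index.
function diffSnapshots(prev: Snapshot, next: Snapshot) {
  return {
    added: Object.keys(next).filter((f) => !(f in prev)),
    modified: Object.keys(next).filter((f) => f in prev && prev[f] !== next[f]),
    deleted: Object.keys(prev).filter((f) => !(f in next)),
  };
}

const before: Snapshot = { "a.ts": hashContent("old"), "b.ts": hashContent("same") };
const after: Snapshot = {
  "a.ts": hashContent("new"),
  "b.ts": hashContent("same"),
  "c.ts": hashContent("x"),
};
console.log(diffSnapshots(before, after));
// → added: ["c.ts"], modified: ["a.ts"], deleted: []
```

Only files in the `added` and `modified` sets need re-chunking and re-embedding; `deleted` files just have their chunks removed from the collection.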
Check Indexing Status
```javascript
get_index_status({
  path: "/workspace/my-app",
});

// Output:
// {
//   status: "indexed", // "not_indexed" | "indexing" | "indexed"
//   isIndexed: true, // deprecated: use status instead
//   collectionName: "code_a3f8d2e1",
//   chunksCount: 1823,
//   filesCount: 247,
//   lastUpdated: "2025-01-30T10:15:00Z",
//   languages: ["typescript", "javascript", "json"]
// }
```
Supported Languages
Programming Languages (35+ file types):
- Web: TypeScript, JavaScript, Vue, Svelte
- Backend: Python, Java, Go, Rust, Ruby, PHP
- Systems: C, C++, C#
- Mobile: Swift, Kotlin, Dart
- Functional: Scala, Clojure, Haskell, OCaml
- Scripting: Bash, Shell, Fish
- Data: SQL, GraphQL, Protocol Buffers
- Config: JSON, YAML, TOML, XML, Markdown
See configuration for full list and customization options.
Custom Ignore Patterns
Create a .contextignore file in your project root to specify additional patterns to ignore:
```
# .contextignore
**/test/**
**/*.test.ts
**/*.spec.ts
**/fixtures/**
**/mocks/**
**/__tests__/**
```
Best Practices
- Index Once, Update Incrementally: Use `index_codebase` for initial indexing, then `reindex_changes` for updates
- Use Filters: Narrow search scope with `fileTypes` and `pathPattern` for better results
- Meaningful Queries: Use natural language that describes what you're looking for (e.g., "database connection pooling" instead of "db")
- Check Status First: Use `get_index_status` to verify a codebase is indexed before searching
- Local Embeddings: Use Ollama (default) to keep everything local and private
Performance
Typical performance on a modern laptop (Apple M1/M2 or similar):
| Codebase Size | Files | Indexing Time | Search Latency |
|---|---|---|---|
| Small (10k LOC) | 50 | ~10s | <100ms |
| Medium (100k LOC) | 500 | ~2min | <200ms |
| Large (500k LOC) | 2,500 | ~10min | <500ms |
Note: Indexing time varies based on embedding provider. Ollama (local) is fastest for initial indexing.
Examples
See examples/ directory for detailed guides:
- Basic Usage - Create collections, add documents, search
- Hybrid Search - Combine semantic and keyword search
- Knowledge Base - Structured documentation with metadata
- Advanced Filtering - Complex boolean filters
- Rate Limiting - Batch processing with cloud providers
- Code Search - Index codebases and semantic code search
Advanced Configuration
Environment Variables
Core Configuration
| Variable | Description | Default |
|---|---|---|
| `TRANSPORT_MODE` | "stdio" or "http" | stdio |
| `HTTP_PORT` | Port for HTTP transport | 3000 |
| `HTTP_REQUEST_TIMEOUT_MS` | Request timeout for HTTP transport (ms) | 300000 |
| `EMBEDDING_PROVIDER` | "ollama", "openai", "cohere", "voyage" | ollama |
| `QDRANT_URL` | Qdrant server URL | http://localhost:6333 |
| `QDRANT_API_KEY` | API key for Qdrant authentication | - |
| `PROMPTS_CONFIG_FILE` | Path to prompts configuration JSON | prompts.json |
Embedding Configuration
| Variable | Description | Default |
|---|---|---|
| `EMBEDDING_MODEL` | Model name | Provider-specific |
| `EMBEDDING_BASE_URL` | Custom API URL | Provider-specific |
| `EMBEDDING_MAX_REQUESTS_PER_MINUTE` | Rate limit | Provider-specific |
| `EMBEDDING_RETRY_ATTEMPTS` | Retry count | 3 |
| `EMBEDDING_RETRY_DELAY` | Initial retry delay (ms) | 1000 |
| `OPENAI_API_KEY` | OpenAI API key | - |
| `COHERE_API_KEY` | Cohere API key | - |
| `VOYAGE_API_KEY` | Voyage AI API key | - |
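With the defaults above, a failed embedding request is retried up to 3 times starting from a 1000 ms delay, and exponential backoff doubles the wait between attempts. A generic sketch of that pattern (the exact doubling schedule is an assumption; the server's retry logic may differ):

```typescript
// Retry an async operation, doubling the delay after each failure.
// With attempts = 3 and initialDelayMs = 1000, the waits are
// 1000 ms and 2000 ms before the final attempt.
async function withRetry<T>(
  fn: () => Promise<T>,
  attempts = 3,
  initialDelayMs = 1000
): Promise<T> {
  let delay = initialDelayMs;
  for (let attempt = 1; ; attempt++) {
    try {
      return await fn();
    } catch (err) {
      if (attempt >= attempts) throw err; // retries exhausted
      await new Promise((resolve) => setTimeout(resolve, delay));
      delay *= 2;
    }
  }
}
```

An embedding call would then be wrapped as something like `withRetry(() => embed(texts))`, where `embed` stands in for whatever provider call you are rate-limiting.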
Code Vectorization Configuration
| Variable | Description | Default |
|---|---|---|
| `CODE_CHUNK_SIZE` | Maximum chunk size in characters | 2500 |
| `CODE_CHUNK_OVERLAP` | Overlap between chunks in characters | 300 |
| `CODE_ENABLE_AST` | Enable AST-aware chunking (tree-sitter) | true |
| `CODE_BATCH_SIZE` | Number of chunks to embed in one batch | 100 |
| `CODE_CUSTOM_EXTENSIONS` | Additional file extensions (comma-separated) | - |
| `CODE_CUSTOM_IGNORE` | Additional ignore patterns (comma-separated) | - |
| `CODE_DEFAULT_LIMIT` | Default search result limit | 5 |
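To see how `CODE_CHUNK_SIZE` and `CODE_CHUNK_OVERLAP` interact, here is a plain sliding-window chunker. The server's AST-aware chunker splits at function/class boundaries instead; this sketch only illustrates the two knobs, and treating it as the fallback behavior is an assumption:

```typescript
// Sliding-window chunking: windows of `size` characters, each
// overlapping the previous window by `overlap` characters so that
// context spanning a boundary is not lost.
function chunkText(text: string, size = 2500, overlap = 300): string[] {
  if (overlap >= size) throw new Error("overlap must be smaller than size");
  const chunks: string[] = [];
  const step = size - overlap;
  for (let start = 0; start < text.length; start += step) {
    chunks.push(text.slice(start, start + size));
    if (start + size >= text.length) break; // last window reached the end
  }
  return chunks;
}

// 6000 characters with the default knobs → 3 chunks covering
// [0, 2500), [2200, 4700), [4400, 6000)
console.log(chunkText("x".repeat(6000)).length); // → 3
```

Larger overlap improves recall for code that straddles chunk boundaries at the cost of more chunks to embed and store.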
Provider Comparison
| Provider | Models | Dimensions | Rate Limit | Notes |
|---|---|---|---|---|
| Ollama | nomic-embed-text (default), mxbai-embed-large, all-minilm | 768, 1024, 384 | None | Local, no API key |
| OpenAI | text-embedding-3-small (default), text-embedding-3-large | 1536, 3072 | 3500/min | Cloud API |
| Cohere | embed-english-v3.0 (default), embed-multilingual-v3.0 | 1024 | 100/min | Multilingual support |
| Voyage | voyage-2 (default), voyage-large-2, voyage-code-2 | 1024, 1536 | 300/min | Code-specialized |

Note: Ollama models require pulling before use:

- Podman: `podman exec ollama ollama pull <model-name>`
- Docker: `docker exec ollama ollama pull <model-name>`
Troubleshooting
| Issue | Solution |
|---|---|
| Qdrant not running | podman compose up -d or docker compose up -d |
| Collection missing | Create collection first before adding documents |
| Ollama not running | Verify with curl http://localhost:11434, start with podman compose up -d |
| Model missing | podman exec ollama ollama pull nomic-embed-text or docker exec ollama ollama pull ... |
| Rate limit errors | Adjust EMBEDDING_MAX_REQUESTS_PER_MINUTE to match your provider tier |
| API key errors | Verify correct API key in environment configuration |
| Qdrant unauthorized | Set QDRANT_API_KEY environment variable for secured instances |
| Filter errors | Ensure Qdrant filter format, check field names match metadata |
| Codebase not indexed | Run index_codebase before search_code |
| Slow indexing | Use Ollama (local) for faster indexing, or increase CODE_BATCH_SIZE |
| Files not found | Check .gitignore and .contextignore patterns |
| Search returns no results | Try broader queries, check if codebase is indexed with get_index_status |
| Out of memory during index | Reduce CODE_CHUNK_SIZE or CODE_BATCH_SIZE |
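For the "Filter errors" row: metadata filters passed to `semantic_search` follow Qdrant's native filter format, with conditions grouped under clauses like `must`. A small example (the field names `category` and `year` are illustrative; they must match keys in your documents' metadata):

```json
{
  "must": [
    { "key": "category", "match": { "value": "documentation" } },
    { "key": "year", "range": { "gte": 2023 } }
  ]
}
```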
Development
```bash
npm run dev            # Development with auto-reload
npm run build          # Production build
npm run type-check     # TypeScript validation
npm test               # Run test suite
npm run test:coverage  # Coverage report
```
Testing
563 tests across 21 test files with 98%+ coverage:
- Unit Tests: QdrantManager (54), Ollama (41), OpenAI (25), Cohere (29), Voyage (31), Factory (43), Prompts (50), Transport (15), MCP Server (19)
- Integration Tests: Code indexer, scanner, chunker, synchronizer, merkle tree
CI/CD: GitHub Actions runs build, type-check, and tests on Node.js 22 LTS for every push/PR.
Contributing
Contributions welcome! See CONTRIBUTING.md for:
- Development workflow
- Conventional commit format (`feat:`, `fix:`, `BREAKING CHANGE:`)
- Testing requirements (run `npm test`, `npm run type-check`, `npm run build`)
Automated releases: Semantic versioning via conventional commits - feat: → minor, fix: → patch, BREAKING CHANGE: → major.
Acknowledgments
The code vectorization feature is inspired by and builds upon concepts from the excellent claude-context project (MIT License, Copyright 2025 Zilliz).
License
MIT - see LICENSE file.