VecMed-MCP
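VecMed-MCP is a Jupyter Notebook project for building a Milvus vector database. It includes experiments for configuring the database schema, downloading PubMed data, testing search functionality, and summarizing search results. The environment is built with Docker, which keeps setup simple and data management efficient.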
VecMed-MCP: Milvus Vector Database for Medical Data
Project Overview
This repository provides tools to establish and manage a Milvus vector database for medical data, specifically designed for rare disease research. It includes scripts for database initialization, data ingestion, search functionality, LLM-based result summarization, and scheduled updates.
File Descriptions
| File Name | Purpose |
|---|---|
| docker-compose.yml | Used to initialize MilvusDB |
| setup_milvus.py | Formulates the database schema |
| download_pubmed_tomilvusdb.py | Handles scheduled updates (currently set to 30 days) |
| search_milvusdb.py | Tests search functionality in MilvusDB |
| llm_process_search_result.py | Experiments with summarizing search results using LLM |
| updating_milvusdb.log | Records database updating operations |
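The README does not show the schema that setup_milvus.py creates. As a rough illustration only, a collection for PubMed abstracts might be laid out as in the sketch below. The field names, string lengths, and the 768-dimension embedding are all assumptions, not the project's actual schema. The `pymilvus` classes used (`FieldSchema`, `CollectionSchema`, `DataType`) are imported lazily so the plain-dict specs can be inspected without a Milvus installation.

```python
# Hypothetical field layout; the real setup_milvus.py schema is not shown in this README.
FIELD_SPECS = [
    {"name": "pmid", "dtype": "INT64", "is_primary": True},
    {"name": "title", "dtype": "VARCHAR", "max_length": 1024},
    {"name": "abstract", "dtype": "VARCHAR", "max_length": 8192},
    {"name": "embedding", "dtype": "FLOAT_VECTOR", "dim": 768},  # assumed dimension
]


def vector_dim(specs=FIELD_SPECS):
    """Return the dimension of the single vector field, or raise if there is not exactly one."""
    dims = [s["dim"] for s in specs if s["dtype"] == "FLOAT_VECTOR"]
    if len(dims) != 1:
        raise ValueError("expected exactly one FLOAT_VECTOR field")
    return dims[0]


def build_schema(specs=FIELD_SPECS):
    """Turn the plain-dict specs into a pymilvus CollectionSchema."""
    # Imported lazily so the specs above can be checked without pymilvus installed.
    from pymilvus import CollectionSchema, DataType, FieldSchema

    fields = []
    for spec in specs:
        # Everything except name and dtype is passed through as a field option.
        options = {k: v for k, v in spec.items() if k not in ("name", "dtype")}
        fields.append(
            FieldSchema(name=spec["name"], dtype=getattr(DataType, spec["dtype"]), **options)
        )
    return CollectionSchema(fields=fields, description="PubMed rare disease records")
```

Keeping the specs as plain data makes it easy to check that the embedding dimension matches whichever embedding model is used before any collection is created.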
ATTU WebUI
The ATTU WebUI provides a visual interface to:
- View all database records
- Manage collections and schemas
No authentication is required to access it.
The current collection, pubmed_rare_disease_db, contains over 160,000 PubMed records related to rare diseases.
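Besides browsing in the WebUI, the collection can be queried from Python. The sketch below is a minimal example of a vector search against pubmed_rare_disease_db using `MilvusClient.search` from `pymilvus`. It is not the contents of search_milvusdb.py: the `title` output field is a hypothetical name, and the query vector is assumed to come from the same embedding model that was used when the records were ingested.

```python
def search_abstracts(client, query_vector, limit=5, collection_name="pubmed_rare_disease_db"):
    """Run one vector search and return a list of (id, distance, entity) tuples.

    `client` is expected to be a pymilvus.MilvusClient instance.
    """
    results = client.search(
        collection_name=collection_name,
        data=[query_vector],      # a single query vector yields a single result list
        limit=limit,
        output_fields=["title"],  # hypothetical field name
    )
    # MilvusClient.search returns one list of hits per query vector.
    return [(hit["id"], hit["distance"], hit["entity"]) for hit in results[0]]
```

A client for a local instance could be created with `MilvusClient(uri="http://localhost:19530")`, where `localhost` is an assumption and 19530 is the Milvus port used elsewhere in this README.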
Environment Configuration
Launch Milvus with Docker Compose
docker compose up -d
Install Required Dependencies
pip install marshmallow==3.20.1 -i https://pypi.tuna.tsinghua.edu.cn/simple
pip install Flask -i https://pypi.tuna.tsinghua.edu.cn/simple
pip install pymilvus -U -i https://pypi.tuna.tsinghua.edu.cn/simple
pip install "mcp[cli]" -i https://pypi.tuna.tsinghua.edu.cn/simple
Set Up ATTU WebUI
docker pull zilliz/attu:v2.4.4
docker run -d --name attu -p 8000:3000 -e MILVUS_URL=192.168.10.199:19530 zilliz/attu:v2.4.4
Workflow
The general workflow can be adapted for various database types beyond medical articles:
- Build your custom Milvus database with a designed collection schema using the main folder code (steps 1-5)
- Download the required data and store it in your vector database (see the `download_pubmed_2015-2025` subfolder)
- Launch the MCP server with an HTTP API service, or modify it to use the stdio transport (see the `pubmed-mcp-server` subfolder)
- Set up timer-based database updates using `download_pubmed_to_milvusdb_2.py`
- Integrate the MCP server into your agent/LLM/workflow (an example integration with a Dify workflow is provided)
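To make the MCP server step more concrete, here is a minimal sketch built on `FastMCP` from the `mcp` package installed earlier. It is not the code in the pubmed-mcp-server subfolder: the server name, the `search_pubmed` tool, and the `search_fn` callback (which is assumed to return `(pmid, score, title)` tuples) are all hypothetical.

```python
def format_hits(hits, max_chars=200):
    """Render (pmid, score, title) tuples as numbered plain-text lines for an LLM."""
    lines = []
    for rank, (pmid, score, title) in enumerate(hits, start=1):
        lines.append(f"{rank}. PMID {pmid} (score {score:.3f}): {title[:max_chars]}")
    return "\n".join(lines) if lines else "No matching records."


def build_server(search_fn):
    """Create an MCP server exposing one search tool backed by `search_fn`."""
    # Imported lazily; requires the "mcp[cli]" package from the install step.
    from mcp.server.fastmcp import FastMCP

    server = FastMCP("pubmed-rare-disease")  # hypothetical server name

    @server.tool()
    def search_pubmed(query: str, limit: int = 5) -> str:
        """Search the rare disease collection and return a short text summary."""
        return format_hits(search_fn(query, limit))

    return server
```

Calling `build_server(my_search).run(transport="stdio")` would serve the tool over stdio, where `my_search` stands for whatever function embeds the query and searches Milvus.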
Scheduled Database Updates
Add a New Cron Job
crontab -e
View Existing Cron Jobs
crontab -l
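Standard cron syntax has no direct way to say "every 30 days", because the day-of-month field restarts each month. One possible workaround, offered here as an assumption rather than the project's actual approach, is to schedule the update script daily and let it decide whether the interval has elapsed. The sketch below shows such a guard; the state file holding the last run time in ISO format is hypothetical.

```python
from datetime import datetime, timedelta


def read_last_run(path):
    """Read an ISO-format timestamp from a state file; return None if missing or invalid."""
    try:
        with open(path, encoding="utf-8") as fh:
            return datetime.fromisoformat(fh.read().strip())
    except (FileNotFoundError, ValueError):
        return None


def update_due(last_run, now, interval_days=30):
    """Return True when no update has run yet or the interval has fully elapsed."""
    if last_run is None:
        return True
    return now - last_run >= timedelta(days=interval_days)
```

With a guard like this, a daily crontab entry such as `0 3 * * * python /path/to/download_pubmed_to_milvusdb_2.py` (the path is a placeholder) would only trigger a real update once the 30 days have passed.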
Author
David Qu
Undergraduate Researcher | AI Algorithm Engineer
University of Toronto Scarborough - Department of Computer Science
📧 davidsz.qu@mail.utoronto.ca