VecMed-MCP

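VecMed-MCP is a Jupyter Notebook project for building a Milvus vector database. It includes experiments for configuring the database schema, downloading PubMed data, testing search functionality, and summarizing search results. Because the environment can be set up with Docker, data management stays simple.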

README
VecMed-MCP: Milvus Vector Database for Medical Data
Project Overview

This repository provides tools to establish and manage a Milvus vector database for medical data, specifically designed for rare disease research. It includes scripts for database initialization, data ingestion, search functionality, LLM-based result summarization, and scheduled updates.

File Descriptions
| File Name | Purpose |
| --- | --- |
| docker-compose.yml | Used to initialize MilvusDB |
| setup_milvus.py | Formulates the database schema |
| download_pubmed_tomilvusdb.py | Handles periodic updates (currently set to 30 days) |
| search_milvusdb.py | Tests search functionality in MilvusDB |
| llm_process_search_result.py | Experiments with summarizing search results using an LLM |
| updating_milvusdb.log | Records database update operations |

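To make the role of setup_milvus.py more concrete, here is a minimal sketch of how a collection schema for PubMed records might be declared with pymilvus. The field names (`pmid`, `title`, `abstract`, `embedding`), the string lengths, and the embedding dimension of 768 are assumptions for illustration only; the actual schema is whatever setup_milvus.py defines.

```python
from dataclasses import dataclass
from typing import Optional

COLLECTION_NAME = "pubmed_rare_disease_db"  # collection name from this README
EMBEDDING_DIM = 768  # assumption: depends on the embedding model actually used


@dataclass(frozen=True)
class FieldSpec:
    """Plain description of one collection field (hypothetical layout)."""
    name: str
    dtype: str                        # name of a pymilvus DataType member
    is_primary: bool = False
    max_length: Optional[int] = None  # only meaningful for VARCHAR fields
    dim: Optional[int] = None         # only meaningful for FLOAT_VECTOR fields


# Hypothetical fields; the real ones live in setup_milvus.py.
FIELDS = [
    FieldSpec("pmid", "INT64", is_primary=True),
    FieldSpec("title", "VARCHAR", max_length=1024),
    FieldSpec("abstract", "VARCHAR", max_length=65535),
    FieldSpec("embedding", "FLOAT_VECTOR", dim=EMBEDDING_DIM),
]


def build_schema():
    """Translate FIELDS into a pymilvus CollectionSchema."""
    # Imported lazily so the field layout can be inspected without pymilvus.
    from pymilvus import CollectionSchema, DataType, FieldSchema

    fields = []
    for spec in FIELDS:
        kwargs = {"is_primary": spec.is_primary}
        if spec.max_length is not None:
            kwargs["max_length"] = spec.max_length
        if spec.dim is not None:
            kwargs["dim"] = spec.dim
        dtype = getattr(DataType, spec.dtype)
        fields.append(FieldSchema(name=spec.name, dtype=dtype, **kwargs))
    return CollectionSchema(fields, description="PubMed rare disease records")


def create_collection(host: str = "localhost", port: str = "19530"):
    """Connect to Milvus and create the collection from the schema above."""
    from pymilvus import Collection, connections

    connections.connect(alias="default", host=host, port=port)
    return Collection(name=COLLECTION_NAME, schema=build_schema())
```

Keeping the field layout as plain data makes it easy to review or adapt for other document types before any Milvus connection is opened.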
ATTU WebUI

The ATTU WebUI provides a visual interface to:

  • View all database records
  • Manage collections and schemas

No authentication is required.

Current Collection: pubmed_rare_disease_db contains over 160,000 PubMed records related to rare diseases.


Environment Configuration
Launch Milvus with Docker Compose
docker compose up -d
Install Required Dependencies
pip install marshmallow==3.20.1 -i https://pypi.tuna.tsinghua.edu.cn/simple
pip install Flask -i https://pypi.tuna.tsinghua.edu.cn/simple
pip install pymilvus -U -i https://pypi.tuna.tsinghua.edu.cn/simple
pip install "mcp[cli]" -i https://pypi.tuna.tsinghua.edu.cn/simple
Set Up ATTU WebUI
docker pull zilliz/attu:v2.4.4
docker run -d --name attu -p 8000:3000 -e MILVUS_URL=192.168.10.199:19530 zilliz/attu:v2.4.4
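Before opening ATTU or running the Python scripts, it can help to confirm that the Milvus port is actually reachable. The following is a small sketch using only the standard library. The function name `milvus_port_open` is hypothetical. It only checks that a TCP connection can be opened, not that Milvus is healthy.

```python
import socket


def milvus_port_open(host: str, port: int = 19530, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, unreachable hosts, and timeouts.
        return False
```

For the setup above, `milvus_port_open("192.168.10.199")` would be the call to try, using the host from the MILVUS_URL value passed to the ATTU container.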
Workflow

The general workflow can be adapted for various database types beyond medical articles:

  1. Build your custom Milvus database with a designed collection schema using the main folder code (steps 1-5)
  2. Download required data and store it in your vector database (see download_pubmed_2015-2025 subfolder)
  3. Launch the MCP server with the HTTP API service, or modify it to use the stdio transport (see pubmed-mcp-server subfolder)
  4. Set up timer-based database updates using download_pubmed_to_milvusdb_2.py
  5. Integrate the MCP server into your agent/LLM/workflow (example integration with Dify workflow provided)
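As a rough illustration of steps 3 and 5, the sketch below wraps a search function in an MCP tool using `FastMCP` from the official `mcp` Python SDK. It is not the code in the pubmed-mcp-server subfolder. The server name, the tool name `search_pubmed`, the record keys (`pmid`, `title`, `abstract`), and the `search_fn` callback are all assumptions made for the example.

```python
from typing import Callable

# A search function takes a query string and a result limit and returns a
# list of records; each record is a dict with "pmid", "title", "abstract".
SearchFn = Callable[[str, int], list]


def format_hits(hits: list) -> str:
    """Render search hits as plain text that an LLM can read."""
    if not hits:
        return "No matching articles found."
    lines = []
    for rank, hit in enumerate(hits, start=1):
        lines.append(f"{rank}. PMID {hit['pmid']}: {hit['title']}")
        lines.append(f"   {hit['abstract']}")
    return "\n".join(lines)


def build_server(search_fn: SearchFn):
    """Create an MCP server exposing one search tool backed by search_fn."""
    # Imported lazily so format_hits can be used without the mcp package.
    from mcp.server.fastmcp import FastMCP

    server = FastMCP("pubmed-search")  # hypothetical server name

    @server.tool()
    def search_pubmed(query: str, top_k: int = 5) -> str:
        """Search the rare disease collection and return formatted hits."""
        return format_hits(search_fn(query, top_k))

    return server
```

With a real Milvus-backed `search_fn`, calling `build_server(search_fn).run(transport="stdio")` would serve the tool over stdio. An agent or workflow tool that supports MCP could then call `search_pubmed` like any other tool.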
Scheduled Database Updates
Add a New Cron Job
crontab -e
View Existing Cron Jobs
crontab -l
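The script that cron runs needs to know which records are new and should leave a trace of each run. The sketch below shows one way this could look. It assumes the data is fetched through the NCBI E-utilities ESearch endpoint, which this README does not confirm. It also assumes the 30-day setting is a look-back window. The helper names `build_esearch_url` and `get_logger` are hypothetical.

```python
import logging
from urllib.parse import urlencode

ESEARCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"
UPDATE_WINDOW_DAYS = 30  # assumption: the 30-day setting is a look-back window
LOG_FILE = "updating_milvusdb.log"  # log file named in this README


def build_esearch_url(term: str, days: int = UPDATE_WINDOW_DAYS, retmax: int = 200) -> str:
    """Build an ESearch URL for PubMed records published in the last `days` days."""
    params = {
        "db": "pubmed",
        "term": term,
        "reldate": days,     # restrict results to the last N days
        "datetype": "pdat",  # apply the restriction to the publication date
        "retmax": retmax,
        "retmode": "json",
    }
    return f"{ESEARCH_URL}?{urlencode(params)}"


def get_logger(path: str = LOG_FILE) -> logging.Logger:
    """Return a logger that appends timestamped lines to the update log."""
    logger = logging.getLogger("milvus_update")
    if not logger.handlers:  # avoid attaching duplicate handlers on repeated calls
        handler = logging.FileHandler(path)
        handler.setFormatter(logging.Formatter("%(asctime)s %(levelname)s %(message)s"))
        logger.addHandler(handler)
        logger.setLevel(logging.INFO)
    return logger
```

An update run would then log its start, request the URL from `build_esearch_url("rare disease")`, embed and insert the returned records, and log how many were added.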
Author

David Qu
Undergraduate Researcher | AI Algorithm Engineer
University of Toronto Scarborough - Department of Computer Science
📧 davidsz.qu@mail.utoronto.ca
