arxiv-mcp

Name: arxiv-mcp
Author: Tejas242

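arxiv-mcp is a tool that retrieves paper data from arXiv and makes it accessible through an API. Users can easily fetch paper information and use it for analysis and research. Data retrieval is automated, which enables an efficient workflow.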

README

arXiv MCP Server

Access the world's largest repository of academic papers through the Model Context Protocol

A streamlined Model Context Protocol server that connects AI assistants to arXiv's vast collection of academic papers. Search, analyze, and download research papers directly from your AI workflow.

🚀 Quick Start

Prerequisites

Python 3.12+
uv package manager

Installation

Option 1: Docker (Recommended)

# Pull and run the Docker image
docker run --rm -it ghcr.io/tejas242/arxiv-mcp:latest

# Or using docker-compose
git clone https://github.com/tejas242/arxiv-mcp.git
cd arxiv-mcp
docker compose up

Option 2: Local Development

# Clone and setup
git clone https://github.com/tejas242/arxiv-mcp.git
cd arxiv-mcp
uv sync

# Test the server
uv run main.py

🛠️ Available Functions

Function	Status	Description	Parameters
`search_papers`	✅ Working	Search arXiv papers with flexible query syntax	`query`, `max_results`, `sort_by`, `sort_order`
`get_paper_details`	✅ Working	Retrieve complete metadata for any arXiv paper	`arxiv_id`
`build_advanced_query`	✅ Working	Construct complex search queries with multiple fields	`title_keywords`, `author_name`, `category`, `abstract_keywords`
`get_arxiv_categories`	✅ Working	List all available arXiv subject categories	None
`search_by_author`	⚠️ Limited	Find papers by specific author (use search_papers instead)	`author_name`, `max_results`
`search_by_category`	⚠️ Limited	Browse papers by category (use search_papers instead)	`category`, `max_results`
`download_paper_pdf`	🔧 Needs Fix	Download paper PDFs (redirect handling issue)	`arxiv_id`, `save_path`

Function Details

✅ Fully Working Functions

search_papers - The primary search function

Supports full arXiv query syntax
Handles keywords, authors, categories, titles
Configurable sorting and pagination
Returns formatted results with abstracts and links
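
As a rough illustration of what the query syntax, sorting, and pagination options translate to, the sketch below builds a request URL for the public arXiv API endpoint. The helper name `build_search_url` and its defaults are assumptions made for this example; the server's own implementation may differ.

```python
from urllib.parse import urlencode

ARXIV_API = "http://export.arxiv.org/api/query"

# Sorting values accepted by the arXiv API.
SORT_BY = {"relevance", "lastUpdatedDate", "submittedDate"}
SORT_ORDER = {"ascending", "descending"}


def build_search_url(query, start=0, max_results=10,
                     sort_by="relevance", sort_order="descending"):
    """Hypothetical helper: turn search options into an arXiv API URL."""
    if sort_by not in SORT_BY:
        raise ValueError(f"unsupported sort_by: {sort_by!r}")
    if sort_order not in SORT_ORDER:
        raise ValueError(f"unsupported sort_order: {sort_order!r}")
    params = {
        "search_query": query,       # e.g. 'au:"Name" AND cat:cs.LG'
        "start": start,              # offset of the first result (pagination)
        "max_results": max_results,  # page size
        "sortBy": sort_by,
        "sortOrder": sort_order,
    }
    # urlencode percent-escapes quotes, colons, and spaces in the query.
    return f"{ARXIV_API}?{urlencode(params)}"


if __name__ == "__main__":
    print(build_search_url('ti:"attention mechanism" AND cat:cs.LG',
                           max_results=5, sort_by="submittedDate"))
```

Paging through results then amounts to calling the helper again with `start` increased by `max_results`.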

get_paper_details - Detailed paper information

Complete metadata extraction
Author information with affiliations
Category classifications and links
Publication dates and updates
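
To make the metadata fields above concrete, here is a minimal sketch of extracting them from one Atom `<entry>` of the kind the arXiv API returns. The function name `parse_entry` and the sample entry are placeholders invented for this example (the sample is not real arXiv data), and the server's actual parsing code may look different.

```python
import xml.etree.ElementTree as ET

# XML namespaces used in arXiv API responses.
NS = {
    "atom": "http://www.w3.org/2005/Atom",
    "arxiv": "http://arxiv.org/schemas/atom",
}

# Hand-written placeholder entry for illustration; not real arXiv data.
SAMPLE_ENTRY = """\
<entry xmlns="http://www.w3.org/2005/Atom"
       xmlns:arxiv="http://arxiv.org/schemas/atom">
  <id>http://arxiv.org/abs/0000.00000v1</id>
  <published>2000-01-01T00:00:00Z</published>
  <updated>2000-01-02T00:00:00Z</updated>
  <title>Example Title</title>
  <summary>Example abstract.</summary>
  <author>
    <name>A. Author</name>
    <arxiv:affiliation>Example University</arxiv:affiliation>
  </author>
  <arxiv:primary_category term="cs.LG"/>
  <category term="cs.LG"/>
  <category term="stat.ML"/>
  <link title="pdf" href="http://arxiv.org/pdf/0000.00000v1" type="application/pdf"/>
</entry>
"""


def parse_entry(xml_text):
    """Hypothetical helper: pull the main metadata fields out of one entry."""
    entry = ET.fromstring(xml_text)
    authors = [
        {
            "name": a.findtext("atom:name", default="", namespaces=NS),
            "affiliations": [e.text for e in a.findall("arxiv:affiliation", NS)],
        }
        for a in entry.findall("atom:author", NS)
    ]
    primary = entry.find("arxiv:primary_category", NS)
    pdf_links = [link.get("href") for link in entry.findall("atom:link", NS)
                 if link.get("title") == "pdf"]
    return {
        "id": entry.findtext("atom:id", default="", namespaces=NS),
        "title": entry.findtext("atom:title", default="", namespaces=NS).strip(),
        "published": entry.findtext("atom:published", default="", namespaces=NS),
        "updated": entry.findtext("atom:updated", default="", namespaces=NS),
        "authors": authors,
        "primary_category": primary.get("term") if primary is not None else None,
        "categories": [c.get("term") for c in entry.findall("atom:category", NS)],
        "pdf_url": pdf_links[0] if pdf_links else None,
    }


if __name__ == "__main__":
    print(parse_entry(SAMPLE_ENTRY))
```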

build_advanced_query - Query construction helper

Combines multiple search criteria
Supports title, author, category, and abstract searches
Returns properly formatted query strings
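
The idea behind a helper like this can be sketched in a few lines: prefix each criterion with its arXiv field code (`ti`, `au`, `cat`, `abs`) and join the parts with `AND`. The function below, `build_query_sketch`, is a hypothetical stand-in written for illustration; the exact quoting and operators used by `build_advanced_query` are not documented here and may differ.

```python
def build_query_sketch(title_keywords=None, author_name=None,
                       category=None, abstract_keywords=None):
    """Hypothetical sketch: AND together whichever criteria were provided."""

    def phrase(prefix, text):
        # Quote the value so multi-word input is matched as a phrase.
        return f'{prefix}:"{text}"'

    parts = []
    if title_keywords:
        parts.append(phrase("ti", title_keywords))
    if author_name:
        parts.append(phrase("au", author_name))
    if category:
        parts.append(f"cat:{category}")  # category codes need no quoting
    if abstract_keywords:
        parts.append(phrase("abs", abstract_keywords))
    if not parts:
        raise ValueError("at least one search criterion is required")
    return " AND ".join(parts)


if __name__ == "__main__":
    print(build_query_sketch(title_keywords="few-shot learning",
                             author_name="Tom Brown",
                             category="cs.LG"))
    # prints: ti:"few-shot learning" AND au:"Tom Brown" AND cat:cs.LG
```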

get_arxiv_categories - Category reference

Complete list of arXiv subject categories
Descriptions for each category
Helpful for constructing targeted searches

⚠️ Limited Functions (Workarounds Available)

search_by_author - Use search_papers('au:"Author Name"') instead
search_by_category - Use search_papers('cat:category_code') instead

🔧 Functions Needing Fixes

download_paper_pdf - HTTP redirect handling needs improvement

Currently fails due to HTTPS/HTTP redirect issues
PDFs can be accessed directly via the links provided in search results
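
Until the tool is fixed, a PDF can be fetched outside the server. The sketch below is one possible workaround using only the Python standard library, whose `urllib.request.urlopen` follows HTTP redirects by default. It is not the project's fix: the helper names, the `User-Agent` string, and the restriction to new-style identifiers (such as `1706.03762`) are assumptions made for this example.

```python
import re
import shutil
import urllib.request

# New-style arXiv identifiers, e.g. 1706.03762 or 1706.03762v5.
NEW_STYLE_ID = re.compile(r"^\d{4}\.\d{4,5}(v\d+)?$")


def pdf_url(arxiv_id):
    """Build the direct HTTPS PDF link for a new-style arXiv identifier."""
    if not NEW_STYLE_ID.match(arxiv_id):
        raise ValueError(f"not a new-style arXiv id: {arxiv_id!r}")
    return f"https://arxiv.org/pdf/{arxiv_id}"


def download_pdf(arxiv_id, save_path, timeout=30):
    """Hypothetical helper: stream the PDF to save_path and return the path."""
    request = urllib.request.Request(
        pdf_url(arxiv_id),
        headers={"User-Agent": "arxiv-pdf-sketch/0.1"},  # placeholder value
    )
    # urlopen follows redirects automatically via its default handlers.
    with urllib.request.urlopen(request, timeout=timeout) as response, \
            open(save_path, "wb") as out:
        shutil.copyfileobj(response, out)
    return save_path


if __name__ == "__main__":
    # Only prints the URL; call download_pdf(...) explicitly to hit the network.
    print(pdf_url("1706.03762"))
```

Requesting the HTTPS URL directly avoids relying on an HTTP-to-HTTPS redirect in the first place.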

⚙️ Configuration

Claude Desktop Setup

Configuration Instructions

For Local Installation:

Add to your Claude Desktop config file:

macOS: ~/Library/Application Support/Claude/claude_desktop_config.json
Windows: %APPDATA%\Claude\claude_desktop_config.json

{
  "mcpServers": {
    "arxiv-mcp": {
      "command": "uv",
      "args": [
        "--directory",
        "/absolute/path/to/arxiv-mcp",
        "run",
        "main.py"
      ]
    }
  }
}

For Docker Installation:

{
  "mcpServers": {
    "arxiv-mcp": {
      "command": "docker",
      "args": [
        "run",
        "--rm",
        "-i",
        "ghcr.io/tejas242/arxiv-mcp:latest"
      ]
    }
  }
}
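
If the config file already lists other MCP servers, editing it by hand risks overwriting them. The following is a small sketch of merging one entry into an existing file while leaving the rest untouched. The function name `add_server` is made up for this example, and the path passed to it should be whichever config location applies on your system.

```python
import json
from pathlib import Path


def add_server(config_path, name, command, args):
    """Insert or replace one entry under "mcpServers", keeping other entries."""
    path = Path(config_path)
    text = path.read_text(encoding="utf-8") if path.exists() else ""
    config = json.loads(text) if text.strip() else {}
    servers = config.setdefault("mcpServers", {})
    servers[name] = {"command": command, "args": list(args)}
    path.write_text(json.dumps(config, indent=2) + "\n", encoding="utf-8")
    return config


if __name__ == "__main__":
    import tempfile

    # Demonstrate on a throwaway file rather than a real config.
    with tempfile.TemporaryDirectory() as tmp:
        demo = Path(tmp) / "claude_desktop_config.json"
        add_server(demo, "arxiv-mcp", "docker",
                   ["run", "--rm", "-i", "ghcr.io/tejas242/arxiv-mcp:latest"])
        print(demo.read_text(encoding="utf-8"))
```

Restart Claude Desktop after changing the file so the new server entry is picked up.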

VS Code MCP Extension

VS Code Configuration

{
  "mcp": {
    "servers": {
      "arxiv-mcp": {
        "command": "uv",
        "args": ["--directory", "/path/to/arxiv-mcp", "run", "main.py"]
      }
    }
  }
}

💡 Usage Examples

Core Search Operations

# Search for papers about transformers
search_papers("transformer architecture")

# Advanced query with specific fields
search_papers('ti:"attention mechanism" AND cat:cs.LG')

# Author-specific search (recommended approach)
search_papers('au:"Geoffrey Hinton"')

# Category browsing (recommended approach)
search_papers('cat:cs.AI')

Research Workflow

# 1. Find the famous "Attention" paper
search_papers('ti:"Attention Is All You Need"')
get_paper_details("1706.03762")

# 2. Explore related work
search_papers("transformer neural networks")

# 3. Build complex queries
query = build_advanced_query(
    title_keywords="few-shot learning",
    author_name="Tom Brown",
    category="cs.LG"
)
search_papers(query)

📊 arXiv Categories Reference

Popular Categories

Code	Description	Example Topics
`cs.AI`	Artificial Intelligence	Machine learning, neural networks, AI theory
`cs.LG`	Machine Learning	Deep learning, reinforcement learning, statistical learning
`cs.CV`	Computer Vision	Image processing, object detection, visual recognition
`cs.CL`	Computation and Language	NLP, language models, text processing
`cs.CR`	Cryptography and Security	Security protocols, encryption, privacy
`stat.ML`	Machine Learning (Statistics)	Statistical learning theory, Bayesian methods
`physics.gen-ph`	General Physics	Theoretical physics, quantum mechanics
`math.NA`	Numerical Analysis	Computational mathematics, algorithms
`q-bio.NC`	Neurons and Cognition (Quantitative Biology)	Neuroscience, computational biology

Use get_arxiv_categories() for the complete list of available categories.
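
Category codes follow an `archive.subject` pattern (for example, `cs.LG` is the `LG` subject class within the `cs` archive). As a small illustration, the sketch below splits codes into those two parts and groups the codes from the table by archive, then prints a `cat:` query for each group. The helper names are invented for this example.

```python
from collections import defaultdict


def split_category(code):
    """Split 'cs.LG' into ('cs', 'LG'); codes without a subject class give None."""
    archive, sep, subject = code.partition(".")
    if not archive or (sep and not subject):
        raise ValueError(f"malformed category code: {code!r}")
    return archive, subject or None


def group_by_archive(codes):
    """Group category codes under their archive, preserving input order."""
    groups = defaultdict(list)
    for code in codes:
        archive, _ = split_category(code)
        groups[archive].append(code)
    return dict(groups)


if __name__ == "__main__":
    listed = ["cs.AI", "cs.LG", "cs.CV", "cs.CL", "cs.CR",
              "stat.ML", "physics.gen-ph", "math.NA", "q-bio.NC"]
    for archive, codes in group_by_archive(listed).items():
        # Join codes with OR to search several categories in one query.
        print(archive, "->", " OR ".join(f"cat:{c}" for c in codes))
```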

🧪 Testing Results

Based on comprehensive testing of all functions:

✅ Reliable Functions

Paper search with keywords, authors, categories: 100% success rate
Paper detail retrieval: Complete metadata extraction working
Query construction: All syntax combinations supported
Category listing: All arXiv categories accessible

⚠️ Alternative Approaches Recommended

Author search: Use search_papers('au:"Author Name"') instead of search_by_author()
Category browsing: Use search_papers('cat:category') instead of search_by_category()

🔧 Known Issues

PDF downloads: Redirect handling needs improvement (PDFs accessible via direct links)

🔧 Development

Project Structure

arxiv-mcp/
├── src/arxiv_mcp/          # Main package
│   ├── server.py           # MCP server implementation
│   ├── arxiv_client.py     # arXiv API wrapper
│   ├── models.py           # Pydantic data models
│   └── utils.py            # Helper functions
├── tests/                  # Test suite
├── main.py                 # Entry point
└── pyproject.toml          # Project config

Running Tests

uv run pytest tests/ -v

Debug Mode

# Enable detailed logging
PYTHONPATH=src uv run python -c "
import logging
logging.basicConfig(level=logging.DEBUG)
from arxiv_mcp.server import main
main()
"