arxiv-mcp
arxiv-mcp retrieves paper data from arXiv and exposes it to AI assistants through a Model Context Protocol (MCP) server, so paper search and metadata lookup can be automated as part of an analysis or research workflow.
Access arXiv's open-access archive of research papers through the Model Context Protocol
A streamlined Model Context Protocol server that connects AI assistants to arXiv's vast collection of academic papers. Search, analyze, and download research papers directly from your AI workflow.
Quick Start
Prerequisites
- Python 3.12+
- uv package manager
Installation
Option 1: Docker (Recommended)
# Pull and run the Docker image
docker run --rm -it ghcr.io/tejas242/arxiv-mcp:latest
# Or using docker-compose
git clone https://github.com/tejas242/arxiv-mcp.git
cd arxiv-mcp
docker compose up
Option 2: Local Development
# Clone and setup
git clone https://github.com/tejas242/arxiv-mcp.git
cd arxiv-mcp
uv sync
# Test the server
uv run main.py
Available Functions
| Function | Status | Description | Parameters |
|---|---|---|---|
| search_papers | Working | Search arXiv papers with flexible query syntax | query, max_results, sort_by, sort_order |
| get_paper_details | Working | Retrieve complete metadata for any arXiv paper | arxiv_id |
| build_advanced_query | Working | Construct complex search queries with multiple fields | title_keywords, author_name, category, abstract_keywords |
| get_arxiv_categories | Working | List all available arXiv subject categories | None |
| search_by_author | Limited | Find papers by specific author (use search_papers instead) | author_name, max_results |
| search_by_category | Limited | Browse papers by category (use search_papers instead) | category, max_results |
| download_paper_pdf | Needs Fix | Download paper PDFs (redirect handling issue) | arxiv_id, save_path |
Function Details
Fully Working Functions
search_papers - The primary search function
- Supports full arXiv query syntax
- Handles keywords, authors, categories, titles
- Configurable sorting and pagination
- Returns formatted results with abstracts and links
get_paper_details - Detailed paper information
- Complete metadata extraction
- Author information with affiliations
- Category classifications and links
- Publication dates and updates
build_advanced_query - Query construction helper
- Combines multiple search criteria
- Supports title, author, category, and abstract searches
- Returns properly formatted query strings
get_arxiv_categories - Category reference
- Complete list of arXiv subject categories
- Descriptions for each category
- Helpful for constructing targeted searches
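The way build_advanced_query combines criteria can be sketched roughly as follows. This is a minimal illustration, not the project's actual implementation: the function name `build_query_sketch` is hypothetical, and the only assumption is that the fields map to arXiv's documented query prefixes (`ti:`, `au:`, `cat:`, `abs:`) joined with `AND`.

```python
from typing import Optional


def build_query_sketch(
    title_keywords: Optional[str] = None,
    author_name: Optional[str] = None,
    category: Optional[str] = None,
    abstract_keywords: Optional[str] = None,
) -> str:
    """Combine optional criteria into one arXiv query string (illustrative only)."""
    parts = []
    if title_keywords:
        parts.append(f'ti:"{title_keywords}"')
    if author_name:
        parts.append(f'au:"{author_name}"')
    if category:
        # Category codes such as cs.LG contain no spaces, so they are not quoted.
        parts.append(f"cat:{category}")
    if abstract_keywords:
        parts.append(f'abs:"{abstract_keywords}"')
    if not parts:
        raise ValueError("at least one search criterion is required")
    return " AND ".join(parts)


if __name__ == "__main__":
    print(build_query_sketch(title_keywords="few-shot learning", category="cs.LG"))
```

The resulting string uses the same syntax that search_papers accepts, so it could be passed to it directly.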
Limited Functions (Workarounds Available)
search_by_author - Use search_papers('au:"Author Name"') instead
search_by_category - Use search_papers('cat:category_code') instead
Functions Needing Fixes
download_paper_pdf - HTTP redirect handling needs improvement
- Currently fails due to HTTPS/HTTP redirect issues
- PDFs can be accessed directly via the links provided in search results
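Until download_paper_pdf is fixed, a PDF can be fetched outside the server with the Python standard library. The sketch below is not the project's fix: the names `pdf_url` and `download_pdf` and the User-Agent string are hypothetical. It relies on `urllib.request.urlopen` following HTTP redirects by default, and on arXiv serving PDFs at `https://arxiv.org/pdf/<id>`.

```python
import shutil
import urllib.request


def pdf_url(arxiv_id: str) -> str:
    """Return the direct PDF URL for an arXiv ID such as "1706.03762"."""
    return f"https://arxiv.org/pdf/{arxiv_id.strip()}"


def download_pdf(arxiv_id: str, save_path: str, timeout: float = 30.0) -> str:
    """Fetch the PDF and write it to save_path; urlopen follows redirects itself."""
    request = urllib.request.Request(
        pdf_url(arxiv_id),
        headers={"User-Agent": "arxiv-mcp-example/0.1"},  # hypothetical identifier
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        with open(save_path, "wb") as out_file:
            # Stream the body to disk instead of loading it all into memory.
            shutil.copyfileobj(response, out_file)
    return save_path
```

Calling `download_pdf("1706.03762", "attention.pdf")` requires network access and writes the file to the current directory.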
Configuration
Claude Desktop Setup
Configuration Instructions
For Local Installation:
Add to your Claude Desktop config file:
macOS: ~/Library/Application Support/Claude/claude_desktop_config.json
Windows: %APPDATA%\Claude\claude_desktop_config.json
{
  "mcpServers": {
    "arxiv-mcp": {
      "command": "uv",
      "args": [
        "--directory",
        "/absolute/path/to/arxiv-mcp",
        "run",
        "main.py"
      ]
    }
  }
}
For Docker Installation:
{
  "mcpServers": {
    "arxiv-mcp": {
      "command": "docker",
      "args": [
        "run",
        "--rm",
        "-i",
        "ghcr.io/tejas242/arxiv-mcp:latest"
      ]
    }
  }
}
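If the Claude Desktop config file already lists other servers, the local-installation entry has to be merged into the existing "mcpServers" object rather than replacing it. A minimal sketch of that merge, assuming the config has already been loaded into a dict; `add_arxiv_server` is a hypothetical helper, not part of arxiv-mcp:

```python
import json
from pathlib import Path


def add_arxiv_server(config: dict, project_dir: str) -> dict:
    """Return a copy of config with an "arxiv-mcp" entry under "mcpServers"."""
    if not Path(project_dir).is_absolute():
        raise ValueError("project_dir must be an absolute path")
    updated = dict(config)
    # Copy the nested mapping so the caller's config is left untouched.
    servers = dict(updated.get("mcpServers", {}))
    servers["arxiv-mcp"] = {
        "command": "uv",
        "args": ["--directory", project_dir, "run", "main.py"],
    }
    updated["mcpServers"] = servers
    return updated


if __name__ == "__main__":
    existing = {"mcpServers": {"other-server": {"command": "node", "args": []}}}
    project = str(Path.home() / "arxiv-mcp")  # example location only
    print(json.dumps(add_arxiv_server(existing, project), indent=2))
```

The absolute-path check mirrors the troubleshooting advice in this README: a relative path is a common reason the server is not detected.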
VS Code MCP Extension
VS Code Configuration
{
  "mcp": {
    "servers": {
      "arxiv-mcp": {
        "command": "uv",
        "args": ["--directory", "/path/to/arxiv-mcp", "run", "main.py"]
      }
    }
  }
}
Usage Examples
Core Search Operations
# Search for papers about transformers
search_papers("transformer architecture")
# Advanced query with specific fields
search_papers('ti:"attention mechanism" AND cat:cs.LG')
# Author-specific search (recommended approach)
search_papers('au:"Geoffrey Hinton"')
# Category browsing (recommended approach)
search_papers('cat:cs.AI')
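For reference, the query strings above use the syntax of arXiv's public export API. The sketch below shows how such a query could be turned into a request URL. It is an illustration under assumptions, not the server's code: `build_search_url` is a hypothetical name, and how the search_papers arguments `sort_by` and `sort_order` map onto the API parameters `sortBy` and `sortOrder` is assumed.

```python
from urllib.parse import urlencode

ARXIV_API = "http://export.arxiv.org/api/query"
SORT_FIELDS = {"relevance", "lastUpdatedDate", "submittedDate"}
SORT_ORDERS = {"ascending", "descending"}


def build_search_url(
    query: str,
    max_results: int = 10,
    sort_by: str = "relevance",
    sort_order: str = "descending",
    start: int = 0,
) -> str:
    """Build an arXiv API URL for a query string such as 'cat:cs.AI'."""
    if sort_by not in SORT_FIELDS:
        raise ValueError(f"sort_by must be one of {sorted(SORT_FIELDS)}")
    if sort_order not in SORT_ORDERS:
        raise ValueError(f"sort_order must be one of {sorted(SORT_ORDERS)}")
    params = {
        "search_query": query,
        "start": start,  # zero-based offset, used for pagination
        "max_results": max_results,
        "sortBy": sort_by,
        "sortOrder": sort_order,
    }
    # urlencode percent-encodes the colon and quotes and turns spaces into "+".
    return f"{ARXIV_API}?{urlencode(params)}"


if __name__ == "__main__":
    print(build_search_url('ti:"attention mechanism" AND cat:cs.LG', max_results=5))
```

The API responds with an Atom feed; fetching and parsing it is left out here.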
Research Workflow
# 1. Find the famous "Attention" paper
search_papers('ti:"Attention Is All You Need"')
get_paper_details("1706.03762")
# 2. Explore related work
search_papers("transformer neural networks")
# 3. Build complex queries
query = build_advanced_query(
    title_keywords="few-shot learning",
    author_name="Tom Brown",
    category="cs.LG"
)
search_papers(query)
arXiv Categories Reference
Popular Categories
| Code | Description | Example Topics |
|---|---|---|
| cs.AI | Artificial Intelligence | Machine learning, neural networks, AI theory |
| cs.LG | Machine Learning | Deep learning, reinforcement learning, statistical learning |
| cs.CV | Computer Vision | Image processing, object detection, visual recognition |
| cs.CL | Computation and Language | NLP, language models, text processing |
| cs.CR | Cryptography and Security | Security protocols, encryption, privacy |
| stat.ML | Machine Learning (Statistics) | Statistical learning theory, Bayesian methods |
| physics.gen-ph | General Physics | Theoretical physics, quantum mechanics |
| math.NA | Numerical Analysis | Computational mathematics, algorithms |
| q-bio.NC | Neurons and Cognition (Quantitative Biology) | Neuroscience, computational biology |
Use get_arxiv_categories() for the complete list of available categories.
Testing Results
Based on comprehensive testing of all functions:
Reliable Functions
- Paper search with keywords, authors, categories: 100% success rate
- Paper detail retrieval: Complete metadata extraction working
- Query construction: All syntax combinations supported
- Category listing: All arXiv categories accessible
Alternative Approaches Recommended
- Author search: Use search_papers('au:"Author Name"') instead of search_by_author()
- Category browsing: Use search_papers('cat:category') instead of search_by_category()
Known Issues
- PDF downloads: Redirect handling needs improvement (PDFs accessible via direct links)
Development
Project Structure
arxiv-mcp/
├── src/arxiv_mcp/          # Main package
│   ├── server.py           # MCP server implementation
│   ├── arxiv_client.py     # arXiv API wrapper
│   ├── models.py           # Pydantic data models
│   └── utils.py            # Helper functions
├── tests/                  # Test suite
├── main.py                 # Entry point
└── pyproject.toml          # Project config
Running Tests
uv run pytest tests/ -v
Debug Mode
# Enable detailed logging
PYTHONPATH=src uv run python -c "
import logging
logging.basicConfig(level=logging.DEBUG)
from arxiv_mcp.server import main
main()
"
Troubleshooting
Common Issues & Solutions
Server Not Detected
- Verify absolute paths in MCP config
- Test that the server runs: uv run main.py
- Restart Claude Desktop after config changes
Search Issues
- Use arXiv query syntax (see examples above)
- Check category names: get_arxiv_categories()
- Try broader search terms
- Use search_papers() instead of specific search functions
PDF Download Failures
- Access PDFs via links in search results
- Check internet connection
- Verify arXiv ID format (e.g., "1706.03762")
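Checking the ID format can be automated with a regular expression. The sketch below is a hypothetical helper, not part of arxiv-mcp, and whether the server also accepts old-style or versioned IDs is an assumption. It covers new-style identifiers such as "1706.03762" (four digits, a dot, then four or five digits) and old-style identifiers such as "hep-th/9901001", each with an optional version suffix like "v2".

```python
import re

# New-style IDs: YYMM.NNNN or YYMM.NNNNN, optionally followed by a version.
NEW_STYLE = re.compile(r"\d{4}\.\d{4,5}(v\d+)?")
# Old-style IDs: archive[.SUBJECT]/YYMMNNN, optionally followed by a version.
OLD_STYLE = re.compile(r"[a-z][a-z\-]*(\.[A-Z]{2})?/\d{7}(v\d+)?")


def is_valid_arxiv_id(arxiv_id: str) -> bool:
    """Return True if the string looks like a bare arXiv identifier."""
    return bool(NEW_STYLE.fullmatch(arxiv_id) or OLD_STYLE.fullmatch(arxiv_id))


if __name__ == "__main__":
    for candidate in ["1706.03762", "1706.03762v5", "hep-th/9901001", "arXiv:1706.03762"]:
        print(candidate, is_valid_arxiv_id(candidate))
```

The check is purely syntactic: it rejects strings with an "arXiv:" prefix or a full URL, but it cannot tell whether a well-formed ID refers to an existing paper.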
Acknowledgments