Model Context Protocol Server with Llama Integration
This repository contains a Model Context Protocol (MCP) server implementation that integrates with a locally running Llama model. The MCP server provides a standardized interface for context retrieval, enhancing AI applications with relevant information from a local LLM.
Overview
The project consists of two main components:
- MCP Server - A FastAPI-based server that implements the Model Context Protocol and forwards queries to a local Llama model
- Python Client - A sample client application that demonstrates how to interact with the MCP server
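To make the flow concrete, here is a minimal sketch of how such a server can forward a query to a locally running Ollama instance. This is an illustration only, not the actual contents of llama_mcp_server.py: the model name, timeout values, and error-case strings are assumptions, while the endpoint shapes follow the API documented below and query_llama mirrors the function named in the Customization section.
```
import time
from typing import Optional

import requests
import uvicorn
from fastapi import FastAPI
from pydantic import BaseModel

OLLAMA_URL = "http://localhost:11434/api/generate"  # local Llama server (Ollama)

app = FastAPI()


class ContextRequest(BaseModel):
    query_text: str
    user_id: Optional[str] = None
    session_id: Optional[str] = None
    additional_context: dict = {}


def query_llama(text: str) -> str:
    """Forward the prompt to the local Llama model and return its reply."""
    payload = {"model": "llama3.2", "prompt": text, "stream": False}
    response = requests.post(OLLAMA_URL, json=payload, timeout=120)
    response.raise_for_status()
    return response.json().get("response", "")


@app.post("/context")
def get_context(request: ContextRequest):
    started = time.time()
    answer = query_llama(request.query_text)
    return {
        "context_elements": [
            {"content": answer, "source": "llama_model", "relevance_score": 0.9}
        ],
        "metadata": {
            "processing_time_ms": int((time.time() - started) * 1000),
            "model": "llama3.2",
            "query": request.query_text,
        },
    }


@app.get("/health")
def health():
    # Report whether the Llama backend is reachable.
    # The failure-case strings below are assumptions for this sketch.
    try:
        requests.get("http://localhost:11434/api/tags", timeout=5).raise_for_status()
        return {"status": "healthy", "llama_status": "connected"}
    except requests.RequestException:
        return {"status": "unhealthy", "llama_status": "disconnected"}


if __name__ == "__main__":
    uvicorn.run(app, host="127.0.0.1", port=8000)
```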
Prerequisites
- Python 3.7 or higher
- A running Llama model server (e.g., Ollama) at http://localhost:11434/
- Git installed on your machine
- GitHub account
Installation
Clone the Repository
git clone https://github.com/EXPESRaza/mcp-llama-integration.git
cd mcp-llama-integration
Install Dependencies
pip install -r requirements.txt
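The dependency list is defined by requirements.txt in the repository. Based on the FastAPI server and HTTP-based client described below, it will contain roughly the following packages (this is an assumption, not the pinned file from the repo):
```
fastapi
uvicorn
pydantic
requests
```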
File Structure
mcp-llama-integration/
├── llama_mcp_server.py # MCP server with Llama integration
├── llama_client_app.py # Sample client application
└── README.md # Project documentation
Setting Up the Llama Model
- If you haven't already, install Ollama
- Pull the Llama model:
ollama pull llama3.2
- Verify the model is running:
curl http://localhost:11434/api/tags
You can also open http://localhost:11434 or http://localhost:11434/api/tags in a browser.
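If the model has been pulled, the response is a JSON document listing the locally installed models; abbreviated, it looks roughly like this (extra fields such as size and digest are omitted here):
```
{
  "models": [
    { "name": "llama3.2:latest" }
  ]
}
```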
Running the MCP Server
Start the server:
python llama_mcp_server.py
The server will start running on http://localhost:8000
You can verify the server is running by checking the health endpoint:
curl http://localhost:8000/health
Using the Client Application
In a separate terminal, start the client application:
python llama_client_app.py
- The application will prompt you for input
- Type your queries and receive responses from the Llama model
- Type 'exit' to quit the application
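If you prefer to call the server from your own code instead of the bundled client, a minimal interaction looks like the following sketch. It only approximates what llama_client_app.py does; the exact behaviour of the bundled client may differ.
```
import requests

MCP_SERVER_URL = "http://localhost:8000/context"  # MCP server from this repo


def ask(query: str) -> str:
    """Send a query to the MCP server and return the first context element."""
    response = requests.post(MCP_SERVER_URL, json={"query_text": query}, timeout=120)
    response.raise_for_status()
    elements = response.json()["context_elements"]
    return elements[0]["content"] if elements else ""


if __name__ == "__main__":
    while True:
        query = input("Query (or 'exit' to quit): ").strip()
        if query.lower() == "exit":
            break
        print(ask(query))
```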
API Documentation
MCP Server Endpoints
POST /context
Retrieve context for a given query.
Request Body:
{
"query_text": "Your query here",
"user_id": "optional-user-id",
"session_id": "optional-session-id",
"additional_context": {}
}
Response:
{
"context_elements": [
{
"content": "Response from Llama model",
"source": "llama_model",
"relevance_score": 0.9
}
],
"metadata": {
"processing_time_ms": 150,
"model": "llama3",
"query": "Your query here"
}
}
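For a quick test of this endpoint without the Python client, you can post a query directly with curl (the query text here is just an example):
```
curl -X POST http://localhost:8000/context \
  -H "Content-Type: application/json" \
  -d '{"query_text": "What is the Model Context Protocol?"}'
```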
GET /health
Check the health status of the MCP server and its connection to the Llama model.
Response:
{
"status": "healthy",
"llama_status": "connected"
}
Customization
Changing the Llama Model
If you want to use a different Llama model, modify the model parameter in the query_llama function in llama_mcp_server.py:
payload = {
"model": "your-model-name", # Change this to your model name
"prompt": text,
"stream": False
}
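Whichever name you put here must correspond to a model that Ollama has already pulled. For example, to try a smaller variant (shown purely as an illustration):
```
ollama pull llama3.2:1b
```
Then set "model": "llama3.2:1b" in the payload above and restart the server.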
Modifying the Prompt Template
To change how queries are formatted before they are sent to Llama, update the prompt template in the get_context function:
prompt = f"""Please provide relevant information for the following query:
{request.query_text}
Respond with factual, helpful information."""
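For example, to push the model toward shorter answers, the template could be tightened like this (the wording is just one possibility, not part of the repository):
```
prompt = f"""Answer the following query in two or three sentences,
using only factual, verifiable information:
{request.query_text}"""
```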
Troubleshooting
Common Issues
Connection Refused Error
- Make sure the Llama model is running at http://localhost:11434/
- Verify Ollama is properly installed and running
Model Not Found Error
- Ensure you've pulled the correct model with Ollama
- Check available models with
ollama list
Slow Responses
- Llama model inference can be resource-intensive
- Consider using a smaller model if performance is an issue
Contributing
Contributions are welcome! Please feel free to submit a Pull Request.
License
This project is licensed under the MIT License - see the LICENSE file for details.