Llama-Server MCP Proxy
This Node.js application acts as a proxy server for `llama-server`. Its primary purpose is to intercept chat completion requests, enhance them with tool capabilities via the Model Context Protocol (MCP), and allow the Language Model (LLM) to use these tools iteratively.

It proxies standard `llama-server` GUI requests (like fetching the main page) directly to your local `llama-server` instance and specifically processes `/v1/chat/completions` requests to enable tool interactions.
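The routing split can be pictured with the minimal sketch below. This is an illustration only, not the proxy's actual code: the real `llama-server-mcp-proxy.js` additionally injects the tool system prompt, detects and executes tool calls via MCP, and streams SSE back to the client. The helper `enhanceChatRequest` here is a hypothetical stand-in for that logic.

```js
// Minimal sketch of the routing idea (assumes Node 18+ for the global fetch).
const http = require("node:http");

const LLAMA_SERVER_URL = process.env.LLAMA_SERVER_URL || "http://localhost:8080";

// Hypothetical stand-in: the real proxy builds a system prompt from the tools
// discovered on the connected MCP servers and handles tool calls iteratively.
function enhanceChatRequest(payload) {
  const toolPrompt = { role: "system", content: "You may call the available tools." };
  return { ...payload, messages: [toolPrompt, ...(payload.messages || [])] };
}

http.createServer((req, res) => {
  const chunks = [];
  req.on("data", (c) => chunks.push(c));
  req.on("end", async () => {
    let body = Buffer.concat(chunks);
    if (req.method === "POST" && req.url === "/v1/chat/completions") {
      // Chat completion requests are rewritten before being forwarded.
      body = Buffer.from(JSON.stringify(enhanceChatRequest(JSON.parse(body.toString() || "{}"))));
    }
    // Everything else (GUI pages, other API routes) is passed through untouched.
    const upstream = await fetch(LLAMA_SERVER_URL + req.url, {
      method: req.method,
      headers: { "content-type": req.headers["content-type"] || "application/json" },
      body: body.length ? body : undefined,
    });
    res.writeHead(upstream.status, { "content-type": upstream.headers.get("content-type") || "text/plain" });
    res.end(Buffer.from(await upstream.arrayBuffer()));
  });
}).listen(process.env.PORT || 9090);
```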
Current Status
- Brave Search works
- Search1API works
- filesystem access is problematic, depending on the model used
- puppeteer
Key Features
- MCP Tool Integration: Connects to MCP-compatible tool servers defined in `mcp-config.json`.
- Dynamic System Prompt: Automatically generates a system prompt listing available tools and their descriptions, injecting it into the LLM's context.
- Iterative Tool Use: Allows the LLM to make multiple tool calls in a single user turn. The proxy handles the request-response flow (a conceptual sketch of this loop follows the list):
  - LLM indicates a tool call.
  - Proxy detects and parses the tool call.
  - Proxy executes the tool via the appropriate MCP client.
  - Proxy sends the tool's result back to the LLM.
  - LLM uses the result to continue its response or call another tool.
- Streaming Support: Maintains streaming of LLM responses (Server-Sent Events) to the client, including intermediate messages like "[Executing tool...]" and tool results.
- Error Handling: Forwards errors from `llama-server` and provides error messages for failed tool executions back to the LLM and client. The LLM will analyze the error and attempt to find a workable approach.
- Configurable: Uses environment variables for the port, the `llama-server` URL, and the MCP configuration path.
- Debug Logging: Optional verbose debug logging via an environment variable.
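Conceptually, the iterative tool-use flow boils down to a loop like the one below. All helper names (`queryLlm`, `parseToolCall`, `executeTool`) are hypothetical stand-ins supplied by the caller, and the exact message roles used to feed results back are an assumption, not a description of the proxy's internals.

```js
// Conceptual sketch of the iterative tool-use loop (not the proxy's actual code).
async function handleUserTurn(messages, { queryLlm, parseToolCall, executeTool }) {
  for (;;) {
    const assistantText = await queryLlm(messages);         // one round trip to llama-server
    const call = parseToolCall(assistantText);              // e.g. news(query="...", max_results=3)
    if (!call) return assistantText;                        // no tool call -> final answer for the client

    const result = await executeTool(call.name, call.args); // dispatched to the matching MCP client
    messages.push({ role: "assistant", content: assistantText });
    // The role used for the tool result is an assumption; the proxy labels it "[Tool Result for ...]".
    messages.push({ role: "user", content: `[Tool Result for ${call.name}]: ${result}` });
  }
}
```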
Prerequisites
- Node.js: Version 18 or higher (as indicated by the `@modelcontextprotocol/sdk` dependencies).
- llama-server: A running instance of `llama-server` (or a compatible OpenAI-like API endpoint).
- MCP Tool Servers: One or more MCP-compatible tool servers that you want the LLM to access.
Setup
Clone the Repository (or create the project directory):

```bash
# If you have a git repository:
# git clone <your-repo-url>
# cd llama-server_mcp_proxy

# If starting from scratch, create a directory and navigate into it:
mkdir llama-server_mcp_proxy
cd llama-server_mcp_proxy
```

Create `package.json`: If you don't have one, create a `package.json` file with the following content:

```json
{
  "name": "llama-server-mcp-proxy",
  "version": "1.0.0",
  "description": "A proxy for llama-server to enable MCP tool usage.",
  "main": "llama-server-mcp-proxy.js",
  "scripts": {
    "start": "node llama-server-mcp-proxy.js"
  },
  "dependencies": {
    "@modelcontextprotocol/sdk": "^1.8.0"
  }
}
```

Install Dependencies: Run the following command in your project directory to install the necessary packages:

```bash
npm install
```

This will use your `package.json` (and `package-lock.json`, if present and consistent) to install the `@modelcontextprotocol/sdk`.
Create `mcp-config.json`: This file tells the proxy how to connect to your MCP tool servers. It should be in the same directory as your proxy script (e.g., `llama-server-mcp-proxy.js`) by default, or you can specify its path via the `MCP_CONFIG_PATH` environment variable. As a ready quick-start example, rename `mcp-config.json.example` to `mcp-config.json`.

`mcp-config.json.example`:

```json
{
  "mcpServers": {
    "search1api": {
      "command": "npx",
      "args": [
        "search1api-mcp",
        "--port",
        "0",
        "--api-key",
        "YOUR_SEARCH1API_KEY_PLACEHOLDER"
      ],
      "env": {
        "DEBUG": "true"
      }
    },
    "another_tool_server": {
      "command": "path/to/your/tool/server/executable",
      "args": [
        "--some-config-for-tool",
        "value",
        "--api-key",
        "ANOTHER_API_KEY_PLACEHOLDER"
      ],
      "env": {}
    }
  }
}
```

- `mcpServers`: An object where each key is a unique name for your tool server (e.g., "search1api", "my_custom_tools").
- `command`: The command to execute to start the MCP tool server. This could be `npx` for npx-runnable packages, a direct path to an executable, or a script.
- `args`: An array of arguments to pass to the command. `--port 0` is often used to let the MCP server pick an available port for stdio communication.
- Replace `YOUR_SEARCH1API_KEY_PLACEHOLDER` and other placeholders with your actual API keys or necessary configuration.
- `env`: An optional object of environment variables to set for the MCP server process.

Create your actual `mcp-config.json` by copying the example and filling in your real API keys and paths.
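For context, each entry under `mcpServers` describes one stdio-based MCP server that the proxy spawns and connects to at startup. A rough sketch of that step using the `@modelcontextprotocol/sdk` client API is shown below; it is an assumption about how such a config entry maps to a client connection, not the proxy's actual startup code.

```js
// Sketch: connect to each server from mcp-config.json and list its tools.
// Run as an ES module (e.g. save as connect-sketch.mjs); assumes @modelcontextprotocol/sdk ^1.8.0.
import { readFileSync } from "node:fs";
import { Client } from "@modelcontextprotocol/sdk/client/index.js";
import { StdioClientTransport } from "@modelcontextprotocol/sdk/client/stdio.js";

const configPath = process.env.MCP_CONFIG_PATH || "./mcp-config.json";
const { mcpServers } = JSON.parse(readFileSync(configPath, "utf8"));

for (const [name, server] of Object.entries(mcpServers)) {
  // Spawn the tool server as a child process and speak MCP over stdio.
  const transport = new StdioClientTransport({
    command: server.command,
    args: server.args,
    env: { ...process.env, ...server.env }, // inherit the parent env plus per-server overrides
  });
  const client = new Client({ name: `llama-proxy-${name}`, version: "1.0.0" }, { capabilities: {} });
  await client.connect(transport);

  // These names and descriptions are what end up in the dynamic system prompt.
  const { tools } = await client.listTools();
  console.log(`[${name}]`, tools.map((t) => t.name).join(", "));
}
```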
Add `mcp-config.json` to `.gitignore`: To prevent accidentally committing your sensitive API keys, create or update your `.gitignore` file in the project root:

```
node_modules/
mcp-config.json
*.log
# Add other files/directories you want to ignore
```

Save the Proxy Code: Save the JavaScript proxy code provided in the previous steps into a file, for example, `llama-server-mcp-proxy.js` (as referenced in `package.json`).
Running the Proxy
Ensure `llama-server` is running. The proxy needs to connect to it. By default, it assumes `llama-server` is at `http://localhost:8080`.

Start the Proxy Server: Open your terminal in the project directory and run:

```bash
npm start
```

Or directly using Node:

```bash
node llama-server-mcp-proxy.js
```
You should see console output indicating the proxy has started, which MCP servers it connected to, and the tools available.
Configure Environment Variables (Optional): You can customize the proxy's behavior using environment variables:

- `PORT`: The port the proxy server will listen on (default: `9090`).
- `LLAMA_SERVER_URL`: The URL of your running `llama-server` instance (default: `http://localhost:8080`).
- `MCP_CONFIG_PATH`: The full path to your `mcp-config.json` file (default: `./mcp-config.json`, relative to the script).
- `MCP_PROXY_DEBUG`: Set to `true` for verbose debug logging (default: `false`).
- `LLAMA_SERVER_TIMEOUT`: Timeout in milliseconds for requests to `llama-server` (default: `60000`).
Example (Linux/macOS):

```bash
PORT=9000 LLAMA_SERVER_URL=http://127.0.0.1:8081 MCP_PROXY_DEBUG=true node llama-server-mcp-proxy.js
```

Example (Windows PowerShell):

```powershell
$env:PORT="9000"; $env:LLAMA_SERVER_URL="http://127.0.0.1:8081"; $env:MCP_PROXY_DEBUG="true"; node llama-server-mcp-proxy.js
```
Usage
- Once the proxy server is running, point your LLM GUI client (e.g., your browser accessing a web UI that normally talks to `llama-server`) to the proxy's address and port (e.g., `http://localhost:9090` if the proxy is running on port 9090).
- Interact with the LLM as usual.
- When you ask the LLM to perform a task that could benefit from one of the configured tools, it should:
  - Indicate its intention to use a tool.
  - Output a tool call in the format `TOOL_NAME(ARG_NAME="ARG_VALUE", ...)` or potentially an XML format.
- The proxy will:
  - Detect this tool call.
  - Show an "[Executing tool: TOOL_NAME...]" message in the stream.
  - Execute the tool.
  - Show a "[Tool Result for TOOL_NAME]: ...RAW_RESULT..." message in the stream.
  - Feed the result back to the LLM.
- The LLM will then use the tool's result to formulate its final response or decide to call another tool.
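For a quick check without a GUI, you can also send a chat completion request straight to the proxy. The snippet below is a minimal example (run as an ES module on Node 18+); it assumes the proxy is listening on `http://localhost:9090` and simply prints the raw SSE stream, in which the tool execution notices and results appear inline.

```js
// Minimal end-to-end test against the proxy (Node 18+, run as an ES module, e.g. test.mjs).
const response = await fetch("http://localhost:9090/v1/chat/completions", {
  method: "POST",
  headers: { "Content-Type": "application/json" },
  body: JSON.stringify({
    messages: [{ role: "user", content: "Find the latest AI research news." }],
    stream: true, // keep streaming so the "[Executing tool: ...]" notices are visible
  }),
});

// Print the raw Server-Sent Events stream as it arrives.
const reader = response.body.getReader();
const decoder = new TextDecoder();
for (;;) {
  const { value, done } = await reader.read();
  if (done) break;
  process.stdout.write(decoder.decode(value, { stream: true }));
}
```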
Tool Call Formats Recognized
The proxy attempts to recognize tool calls in the LLM's output in two primary formats after the LLM finishes its current response segment (indicated by `[DONE]`):

Simple Function Style:

```
news(query="latest AI research", max_results=3)
```

XML Style:

```xml
<tool_call>
  <tool>tool_name</tool>
  <params>
    <param1>value1</param1>
    <param2>value2</param2>
  </params>
</tool_call>
```
The proxy will parse the arguments accordingly for each format.
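To illustrate the first format, a small regular-expression pass along the lines of the sketch below is enough to pull out the tool name and its arguments. This is only an illustration of the format; the proxy's actual parser may behave differently (and also handles the XML style).

```js
// Sketch of parsing a simple function-style tool call such as:
//   news(query="latest AI research", max_results=3)
// Illustration of the format only, not the proxy's actual parsing code.
function parseSimpleToolCall(text) {
  const callMatch = text.match(/([A-Za-z_][\w-]*)\(([^)]*)\)\s*$/);
  if (!callMatch) return null;

  const [, name, argString] = callMatch;
  const args = {};
  // Arguments are NAME="VALUE" or NAME=NUMBER pairs separated by commas.
  for (const m of argString.matchAll(/(\w+)\s*=\s*(?:"([^"]*)"|([\d.]+))/g)) {
    args[m[1]] = m[2] !== undefined ? m[2] : Number(m[3]);
  }
  return { name, args };
}

console.log(parseSimpleToolCall('news(query="latest AI research", max_results=3)'));
// -> { name: 'news', args: { query: 'latest AI research', max_results: 3 } }
```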
Troubleshooting
- "Tool ... not found in any connected MCP server":
- Ensure the tool name output by the LLM exactly matches a tool name listed when the proxy starts (which comes from your MCP servers).
- Check your
mcp-config.json
for typos in tool server commands or configurations. - Verify that your MCP tool servers are starting correctly and successfully registering their tools. Check the console output of the proxy when it starts for connection messages.
- Proxy errors /
llama-server
errors (500, 502, etc.):- Ensure
llama-server
is running and accessible at theLLAMA_SERVER_URL
. - Check
llama-server
's own logs for any issues. - Enable
MCP_PROXY_DEBUG=true
for more detailed logs from the proxy.
- Ensure
- Tool execution errors (e.g., "402 Payment Required"): These errors come from the MCP tool server itself. The proxy will report them to the LLM, which should then ideally try a different approach or inform you.
- "Socket hang up" /
ECONNRESET
: This can occur ifllama-server
closes the connection unexpectedly. This might happen if the proxy sends a malformed request (e.g., incorrect JSON structure, problematic headers) or ifllama-server
itself encounters an internal error. Debug logs from both the proxy andllama-server
are crucial.
Dependencies
- `@modelcontextprotocol/sdk`: Version `^1.8.0` (and its transitive dependencies as listed in `package-lock.json`).