iceberg-mcp-server

The Cloudera Iceberg MCP Server is a Model Context Protocol (MCP) server that provides read-only access to Iceberg tables via Apache Impala. It lets large language models (LLMs) inspect database schemas and execute read-only queries. It is aimed at database, file system, AI/LLM, and cloud use cases.

GitHub Stars: 7

Forks: 5
Issues: 0

Installation
Difficulty: Intermediate
Estimated Time: 10-20 minutes
Requirements:
Python 3.6 or higher
Apache Impala (latest version)

Prerequisites

Required software and versions:
Python: 3.6 or higher
Apache Impala: Latest version

Installation Steps

1. Clone Repository

bash
git clone https://github.com/cloudera/iceberg-mcp-server.git
cd iceberg-mcp-server

2. Install Dependencies

bash
pip install -r requirements.txt

3. Configure Claude Desktop

Edit claude_desktop_config.json to add the MCP server:
json
{
  "mcpServers": {
    "iceberg-mcp-server": {
      "command": "uvx",
      "args": [
        "--from",
        "git+https://github.com/cloudera/iceberg-mcp-server@main",
        "run-server"
      ],
      "env": {
        "IMPALA_HOST": "coordinator-default-impala.example.com",
        "IMPALA_PORT": "443",
        "IMPALA_USER": "username",
        "IMPALA_PASSWORD": "password",
        "IMPALA_DATABASE": "default"
      }
    }
  }
}
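If claude_desktop_config.json already lists other servers, the new entry has to be merged in rather than pasted over them. The following is a minimal sketch of that merge, assuming the file holds a single JSON object; the helper name add_server_entry and the "other-server" entry are hypothetical and not part of the project.

```python
import copy
import json


# Hypothetical helper: not provided by iceberg-mcp-server.
def add_server_entry(config, name, entry):
    """Return a copy of `config` with `entry` registered under mcpServers[name]."""
    merged = copy.deepcopy(config)  # leave the caller's dict untouched
    merged.setdefault("mcpServers", {})[name] = entry
    return merged


# The entry shown in this step, expressed as a Python dict.
ICEBERG_ENTRY = {
    "command": "uvx",
    "args": [
        "--from",
        "git+https://github.com/cloudera/iceberg-mcp-server@main",
        "run-server",
    ],
    "env": {
        "IMPALA_HOST": "coordinator-default-impala.example.com",
        "IMPALA_PORT": "443",
        "IMPALA_USER": "username",
        "IMPALA_PASSWORD": "password",
        "IMPALA_DATABASE": "default",
    },
}

if __name__ == "__main__":
    # Illustrative existing config with one unrelated (made-up) server.
    existing = {"mcpServers": {"other-server": {"command": "npx", "args": []}}}
    merged = add_server_entry(existing, "iceberg-mcp-server", ICEBERG_ENTRY)
    print(json.dumps(merged, indent=2))
```

Working on a deep copy keeps the original configuration available in case the merged result needs to be discarded.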

4. Start Server

bash
uvx --from git+https://github.com/cloudera/iceberg-mcp-server@main run-server

Troubleshooting

Common Issues

Issue: Server won't start
Solution: Check Python version and dependencies.

Issue: Not recognized by Claude Desktop
Solution: Verify configuration file path and syntax.
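Both checks above can be automated. The sketch below is one possible preflight script, not a tool shipped with the server: check_environment and check_config_text are hypothetical names, and the assumption that the entry is called "iceberg-mcp-server" comes from the configuration shown in this document.

```python
import json
import shutil
import sys


def check_environment(min_python=(3, 6)):
    """Return a list of problems with the local Python and uvx installation."""
    problems = []
    if sys.version_info < min_python:
        problems.append("Python %d.%d or higher is required" % min_python)
    if shutil.which("uvx") is None:
        problems.append("uvx was not found on PATH")
    return problems


def check_config_text(text, server_name="iceberg-mcp-server"):
    """Return a list of problems found in the text of a Claude Desktop config."""
    try:
        config = json.loads(text)
    except json.JSONDecodeError as exc:
        return [f"invalid JSON at line {exc.lineno}, column {exc.colno}: {exc.msg}"]
    if not isinstance(config, dict):
        return ["top-level JSON value must be an object"]
    servers = config.get("mcpServers")
    if not isinstance(servers, dict) or server_name not in servers:
        return [f"no mcpServers entry named {server_name!r}"]
    entry = servers[server_name]
    if not isinstance(entry, dict) or "command" not in entry:
        return [f"entry {server_name!r} has no 'command' field"]
    return []


if __name__ == "__main__":
    for problem in check_environment():
        print("environment:", problem)
    # Pass the path to claude_desktop_config.json as the first argument.
    if len(sys.argv) > 1:
        with open(sys.argv[1], encoding="utf-8") as handle:
            for problem in check_config_text(handle.read()):
                print("config:", problem)
```

An empty result from both functions means neither of the two common issues was detected; it does not prove that the Impala connection itself works.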

Configuration

Basic Configuration

Claude Desktop Setup

Edit ~/Library/Application Support/Claude/claude_desktop_config.json (macOS) or %APPDATA%\Claude\claude_desktop_config.json (Windows):
json
{
  "mcpServers": {
    "iceberg-mcp-server": {
      "command": "uvx",
      "args": [
        "--from",
        "git+https://github.com/cloudera/iceberg-mcp-server@main",
        "run-server"
      ],
      "env": {
        "IMPALA_HOST": "coordinator-default-impala.example.com",
        "IMPALA_PORT": "443",
        "IMPALA_USER": "username",
        "IMPALA_PASSWORD": "password",
        "IMPALA_DATABASE": "default"
      }
    }
  }
}

Environment Variables

Set the following environment variables as needed:
bash
export IMPALA_HOST="coordinator-default-impala.example.com"
export IMPALA_PORT="443"
export IMPALA_USER="username"
export IMPALA_PASSWORD="password"
export IMPALA_DATABASE="default"
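To illustrate how a Python process might consume these variables, here is a minimal sketch that reads them into a typed settings object. ImpalaSettings and load_settings are hypothetical names, and treating all five variables as required is an assumption; the server's own handling may differ.

```python
import os
from typing import NamedTuple

# Assumption: all five variables are mandatory.
REQUIRED_VARS = (
    "IMPALA_HOST",
    "IMPALA_PORT",
    "IMPALA_USER",
    "IMPALA_PASSWORD",
    "IMPALA_DATABASE",
)


class ImpalaSettings(NamedTuple):
    host: str
    port: int
    user: str
    password: str
    database: str


def load_settings(environ=None):
    """Build ImpalaSettings from a mapping (defaults to os.environ)."""
    environ = os.environ if environ is None else environ
    missing = [name for name in REQUIRED_VARS if not environ.get(name)]
    if missing:
        raise ValueError("missing environment variables: " + ", ".join(missing))
    port = int(environ["IMPALA_PORT"])  # raises ValueError on non-numeric input
    if not 0 < port < 65536:
        raise ValueError("IMPALA_PORT must be between 1 and 65535")
    return ImpalaSettings(
        host=environ["IMPALA_HOST"],
        port=port,
        user=environ["IMPALA_USER"],
        password=environ["IMPALA_PASSWORD"],
        database=environ["IMPALA_DATABASE"],
    )
```

Accepting the mapping as a parameter makes the function easy to exercise with a plain dict instead of the real process environment.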

Advanced Configuration

Security Settings

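Store sensitive values such as IMPALA_PASSWORD in environment variables rather than in files kept under version control.
Grant the Impala account only the privileges it needs, and restrict read access to any configuration file that contains credentials.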
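Two of these precautions are easy to script. The sketch below masks the password before a server entry is logged and checks that a configuration file is readable only by its owner. Both helper names are hypothetical, and the permission check assumes POSIX file modes, so it is not meaningful on Windows.

```python
import copy
import os
import stat

# Keys in the "env" block whose values should never be printed.
SECRET_KEYS = {"IMPALA_PASSWORD"}


def redact_entry(entry):
    """Return a copy of a server entry with secret env values masked."""
    safe = copy.deepcopy(entry)
    for key in SECRET_KEYS & set(safe.get("env", {})):
        safe["env"][key] = "***"
    return safe


def is_owner_only(path):
    """True if group and others have no permissions on `path` (POSIX modes)."""
    mode = stat.S_IMODE(os.stat(path).st_mode)
    return mode & 0o077 == 0
```

For example, a file with mode 600 passes is_owner_only, while one with mode 644 does not.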

Performance Tuning

Configure connection and query timeout values to match your workload.

Examples

Basic Usage

Here are basic usage examples for the MCP server:

Using with Claude Desktop

1. Verify MCP Server Startup
Open Claude Desktop and confirm that the configuration has been loaded correctly.
2. Execute Basic Commands
The server exposes tools for inspecting database schemas and for running read-only queries. The exact tool names appear in Claude Desktop's tool list once the server is connected.
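Because the server is described as read-only, a client may want to reject obviously mutating SQL before sending it. The following is a conservative heuristic sketch of such a check, not the server's actual enforcement mechanism; the function name and keyword lists are assumptions, and real protection should come from the privileges of the Impala account.

```python
import re

# Assumed allowlist of leading keywords and blocklist of mutating keywords.
READ_ONLY_START = {"SELECT", "WITH", "SHOW", "DESCRIBE", "EXPLAIN"}
WRITE_KEYWORDS = {
    "INSERT", "UPSERT", "UPDATE", "DELETE", "MERGE", "CREATE", "DROP",
    "ALTER", "TRUNCATE", "GRANT", "REVOKE", "LOAD", "REFRESH",
    "INVALIDATE", "COMPUTE",
}


def is_read_only(sql):
    """Heuristically decide whether `sql` is a single read-only statement."""
    # Drop line comments and block comments before inspecting keywords.
    stripped = re.sub(r"--[^\n]*|/\*.*?\*/", " ", sql, flags=re.DOTALL)
    statements = [part.strip() for part in stripped.split(";") if part.strip()]
    if len(statements) != 1:
        return False  # empty input or several statements: reject
    words = re.findall(r"[A-Za-z_]+", statements[0].upper())
    if not words or words[0] not in READ_ONLY_START:
        return False
    # Reject if a mutating keyword appears anywhere, e.g. WITH ... INSERT.
    return not WRITE_KEYWORDS.intersection(words)
```

The check errs on the side of rejection: a query that merely mentions a word such as "update" as a column name or inside a string literal is refused as well.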
   

Programmatic Usage

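python
# Python example
# The URL and the tool name below are placeholders: this sketch assumes an HTTP
# gateway in front of the MCP server, which the configuration above does not set up.
import requests


def call_mcp_tool(tool_name, params):
    """POST a tool invocation to the assumed HTTP endpoint and return the JSON reply."""
    response = requests.post(
        'http://localhost:3000/mcp/call',
        json={'tool': tool_name, 'parameters': params},
        timeout=30,
    )
    response.raise_for_status()  # surface HTTP errors instead of parsing an error page
    return response.json()


# Usage example
result = call_mcp_tool('analyze', {'input': 'sample data', 'options': {'format': 'json'}})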
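At the protocol level, MCP tool invocations are JSON-RPC 2.0 requests with the method "tools/call". The sketch below builds such a request and frames it as a newline-delimited message, as used by the stdio transport. The tool name in the usage lines is a placeholder, and a real client must complete the MCP initialize handshake before sending this message.

```python
import itertools
import json

_request_ids = itertools.count(1)  # simple source of unique request ids


def build_tool_call(tool_name, arguments, request_id=None):
    """Return a JSON-RPC 2.0 request dict for the MCP 'tools/call' method."""
    return {
        "jsonrpc": "2.0",
        "id": next(_request_ids) if request_id is None else request_id,
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }


def encode_for_stdio(message):
    """Serialize a message as one line of UTF-8 JSON terminated by a newline."""
    return (json.dumps(message) + "\n").encode("utf-8")


if __name__ == "__main__":
    # "example_query_tool" is a placeholder, not a tool name from this server.
    request = build_tool_call("example_query_tool", {"query": "SELECT 1"})
    print(encode_for_stdio(request).decode("utf-8"), end="")
```

In practice an MCP client library handles this framing; the sketch only shows what travels over the wire.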

Use Cases

Building applications that dynamically inspect database schemas using LLMs.
Executing queries on Iceberg tables within data analytics tools.
Providing read-only data access as part of data pipelines.
Integrating with AI frameworks (like LangChain) for data processing.

Additional Resources

Author Information
Cloudera
California, USA
