mcp-realtime-voice

Author: Helw150

mcp-realtime-voice is a Python library for real-time audio processing. It provides speech recognition and speech synthesis, and other applications can integrate with it through an API. It is particularly suited to building voice chat applications and voice assistants.

README

MCP Realtime Voice

Turn Claude into a voice assistant that listens and speaks. This MCP server handles speech recognition and text-to-speech for a natural voice interface.

⚠️ IMPORTANT:

This has only been tested on macOS.

Launch Claude from the terminal, using the command listed under "Connecting to Claude", instead of clicking the app icon. This fixes microphone permission issues.

Features

Speech recognition with silence detection
Text-to-speech for AI responses
Voice Activity Detection using Silero
Intended to work on Windows, macOS, and Linux (tested only on macOS so far)
Audio device management
Simple voice conversation interface

Prerequisites

Python 3.8+
A microphone and speakers
MCP client (Claude)

Installation

Clone this repo:

```
git clone https://github.com/yourusername/mcp-realtime-voice.git
cd mcp-realtime-voice
```

Set up a virtual environment:

```
python -m venv venv

# On Windows
venv\Scripts\activate

# On macOS/Linux
source venv/bin/activate
```

Install dependencies:
```
pip install -r requirements.txt
```
System dependencies:
- Ubuntu/Debian: sudo apt-get install portaudio19-dev
- macOS: brew install portaudio
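
Before installing the Python packages, it can help to confirm that the prerequisites are visible to Python. The snippet below is a minimal sketch, not part of this repo: `check_environment` is a hypothetical helper, and `ctypes.util.find_library` may return nothing on some systems (Windows in particular) even when PortAudio is installed, so treat a "no" as a hint rather than proof.

```python
import ctypes.util
import sys


def check_environment(min_version=(3, 8)):
    """Return a dict describing whether the basic prerequisites look satisfied."""
    return {
        # The README asks for Python 3.8 or newer.
        "python_ok": sys.version_info >= min_version,
        # find_library returns a name or path, or None when nothing is found.
        "portaudio_found": ctypes.util.find_library("portaudio") is not None,
    }


if __name__ == "__main__":
    for name, ok in check_environment().items():
        print(f"{name}: {'yes' if ok else 'no'}")
```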

Usage

Connecting to Claude

Launch Claude from the terminal:

```
# On macOS
/Applications/Claude.app/Contents/MacOS/Claude

# On Windows
# Use the path to your Claude executable
start "" "C:\Path\to\Claude.exe"
```

Install the MCP server:

```
mcp install voice_server.py --name "Realtime Voice"
```

Or test with the MCP Inspector:

```
mcp dev voice_server.py
```

Available Tools

list_audio_devices: Shows all audio input/output devices
listen_for_speech: Records and transcribes speech
speak_text: Converts text to spoken audio
voice_mode: Starts interactive voice conversation
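
For reference, an MCP client invokes any of these tools by sending a JSON-RPC 2.0 request with the method `tools/call`. The sketch below builds such a message with the standard library only. `build_tool_call` is a hypothetical helper, and the `text` argument name for `speak_text` is an assumption; the real parameter names come from the tool schema that the server reports.

```python
import json
from itertools import count

_request_ids = count(1)  # simple increasing JSON-RPC request ids


def build_tool_call(tool_name, arguments=None):
    """Build an MCP `tools/call` request as a JSON-RPC 2.0 message."""
    return {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments or {}},
    }


# "text" is an assumed argument name, used here for illustration only.
request = build_tool_call("speak_text", {"text": "Hello from Claude"})
print(json.dumps(request, indent=2))
```

In normal use Claude builds and sends these requests itself; writing one by hand is mainly useful when debugging the server.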

Voice Conversation Mode

To start:

Connect the MCP server to Claude
Ask Claude to "enter voice mode"
Start talking. The system will:
- Listen for your speech
- Detect when you finish speaking
- Send transcribed text to Claude
- Speak Claude's response

To exit, just say "exit voice mode" or "stop voice mode".
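
The turn-taking described above can be sketched as a small loop. This is an illustration under stated assumptions, not the code in voice_server.py: `listen`, `respond`, and `speak` are hypothetical callables standing in for `listen_for_speech`, Claude's reply, and `speak_text`, and the exit check is a simple substring match.

```python
EXIT_PHRASES = ("exit voice mode", "stop voice mode")


def wants_exit(transcript):
    """Return True when the transcript contains one of the exit phrases."""
    normalized = " ".join(transcript.lower().split())
    return any(phrase in normalized for phrase in EXIT_PHRASES)


def run_voice_loop(listen, respond, speak, max_attempts=100):
    """Alternate listening and speaking until the user asks to exit.

    listen() -> str, respond(str) -> str, and speak(str) -> None are
    hypothetical stand-ins for the real tools. Returns the number of
    completed turns.
    """
    turns = 0
    for _ in range(max_attempts):
        transcript = listen()
        if not transcript.strip():
            continue  # nothing was heard, so listen again
        if wants_exit(transcript):
            break
        speak(respond(transcript))
        turns += 1
    return turns


# Usage with stubs in place of a microphone, Claude, and a speaker.
heard = iter(["What time is it?", "", "Exit voice mode."])
spoken = []
turns = run_voice_loop(lambda: next(heard), lambda t: f"You said: {t}", spoken.append)
```

Passing the three steps in as callables keeps the loop testable without audio hardware.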

Configuration

Edit these values in voice_server.py if needed:

VAD_THRESHOLD: Voice detection sensitivity (default: 0.2)
SILENCE_DURATION: Seconds of silence before recording stops (default: 3)
Audio sample rate and format settings
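
To show how the two settings interact, here is a minimal sketch of silence-based end-of-speech detection over per-frame speech probabilities, such as those a VAD model produces. It assumes that recording stops only after speech has started and the probability then stays below `VAD_THRESHOLD` for `SILENCE_DURATION` seconds; the actual logic in voice_server.py may differ. `find_end_of_speech` and `frame_seconds` are hypothetical names.

```python
VAD_THRESHOLD = 0.2     # default from the Configuration section
SILENCE_DURATION = 3.0  # seconds, default from the Configuration section


def find_end_of_speech(probabilities, frame_seconds,
                       threshold=VAD_THRESHOLD,
                       silence_duration=SILENCE_DURATION):
    """Return the index of the frame where recording would stop, or None.

    probabilities: per-frame speech probabilities in the range [0, 1].
    frame_seconds: duration of one frame in seconds (assumed parameter).
    """
    speech_started = False
    silent_frames = 0
    for index, probability in enumerate(probabilities):
        if probability >= threshold:
            # Speech detected: start (or continue) the utterance, reset silence.
            speech_started = True
            silent_frames = 0
        elif speech_started:
            silent_frames += 1
            if silent_frames * frame_seconds >= silence_duration:
                return index
    return None


# Two speech frames followed by six silent half-second frames (3 s of silence).
stop_index = find_end_of_speech([0.05, 0.9, 0.8] + [0.1] * 6, frame_seconds=0.5)
```

Lowering `VAD_THRESHOLD` makes more frames count as speech, and lowering `SILENCE_DURATION` ends the recording sooner after you stop talking.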

Related MCPs

LangBot

13398

🤩 Easy-to-use global IM bot platform designed for the LLM era ⚡️ Bots for QQ / QQ Channel / Discord / WeChat / Telegram / Feishu / DingTalk / Slack 🧩 Integrated with ChatGPT (GPT), DeepSeek, Dify, n8n, Claude, Google Gemini, xAI, PPIO, Ollama, Alibaba Cloud Bailian, SiliconFlow, Qwen, Moonshot (Kimi K2), SillyTavern, MCP, WeClone, etc. LLM & Agent & RAG

Python

mcpo

3259

A simple, secure MCP-to-OpenAPI proxy server

Python
