MCP-Powered YouTube Video Analysis Toolkit
A comprehensive toolkit for analyzing YouTube videos using AI-powered analysis capabilities. This project provides transcript extraction, knowledge graph generation, sentiment analysis, topic modeling, and note generation for educational and research purposes.
Features
YouTube Video Analysis - Chat Interface
Chat interface of the MCP-powered toolkit for extracting and analyzing transcripts from YouTube videos.
Knowledge Graph Generation
Interactive knowledge graphs from video content.
High Quality Note Generation
Automated markdown note generation from transcripts.
Sentiment Analysis
Comprehensive sentiment analysis with emotional breakdowns.
Topic Modeling
Content classification and structure analysis.
MCP Integration: Model Context Protocol for seamless AI agent integration
- Standardized tool access across multiple AI providers
- Multi-server support for distributed analysis capabilities
- Asynchronous streaming for real-time agent interactions
Architecture
The project consists of two main components:
Server (/server)
- FastMCP Server: Built with FastMCP 2.0 framework for Model Context Protocol implementation
- Automatic tool discovery and async support for high-performance operations
- Pythonic decorators for simple function-to-tool conversion
- Built-in authentication and proxying capabilities
- Comprehensive protocol handling with minimal boilerplate code
- YouTube Tools: Video transcript extraction and search capabilities via MCP tools
- Analysis Tools: AI-powered prompts exposed as MCP resources for knowledge graphs, sentiment analysis, and topic modeling
- Data Storage: Persistent storage management for transcripts, notes, and analysis results
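A minimal sketch of the decorator-based tool registration described above (illustrative only; the actual tool code lives in server/main.py and server/tools/, and transcript fetching is omitted here):

```python
# Illustrative sketch of a FastMCP tool definition, not the repo's actual code.
from fastmcp import FastMCP

mcp = FastMCP("video-analysis-kit")

@mcp.tool()
def get_youtube_transcript(url: str) -> str:
    """Extract the transcript for a YouTube video URL."""
    # The repo's implementation fetches the transcript and persists it under
    # data/transcripts/; that logic is omitted from this sketch.
    raise NotImplementedError("transcript fetching omitted in this sketch")

if __name__ == "__main__":
    # Serve over Streamable HTTP on port 8000 to match the Docker setup.
    mcp.run(transport="streamable-http", host="0.0.0.0", port=8000)
```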
Client (/client)
- MCP Client Integration: Built with mcp-use library for seamless MCP server communication
- Multi-server support enabling connection to multiple MCP servers simultaneously
- Asynchronous streaming of agent outputs for real-time interactions
- Dynamic server selection for optimized task execution
- Configurable via JSON for flexible server management
- Streamlit Dashboard: Interactive web interface for visualization
- Multi-tab Interface: Separate views for different analysis types
- Real-time Chat: MCP-powered chatbot for video analysis using intelligent agents
- Data Visualization: Interactive charts and graphs for analysis results
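A hedged sketch of how this wiring can look with mcp-use (the model name and config path are assumptions; the actual setup lives in client/combined_app.py):

```python
# Illustrative sketch of the mcp-use agent setup, not the repo's actual code.
import asyncio

from langchain_openai import ChatOpenAI
from mcp_use import MCPAgent, MCPClient

async def main() -> None:
    # Load MCP server definitions from the client's JSON configuration.
    client = MCPClient.from_config_file("config/mcpServers.json")
    llm = ChatOpenAI(model="gpt-4o")  # assumed model; any supported LLM works
    agent = MCPAgent(llm=llm, client=client, max_steps=20)

    result = await agent.run(
        "Please analyze this YouTube video: https://youtube.com/watch?v=..."
    )
    print(result)

if __name__ == "__main__":
    asyncio.run(main())
```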
Project Structure
youtube-video-analysis-kit/
├── README.md
├── data/ # Shared data directory
│ ├── transcripts/ # YouTube video transcripts
│ ├── knowledge_graphs/ # Generated knowledge graphs
│ ├── notes_md/ # Generated markdown notes
│ ├── sentiment_analysis/ # Sentiment analysis results
│ └── topic_modeling/ # Topic modeling analysis
├── server/ # Backend MCP server
│ ├── docker-compose.yml
│ ├── Dockerfile
│ ├── main.py # Main server application
│ ├── requirements.txt
│ ├── tools/
│ │ ├── prompt_tools.py # AI analysis prompts
│ │ └── youtube_tools.py # YouTube integration
│ └── prompts/
│ └── prompts.py # Prompt templates
├── client/ # Frontend Streamlit app
│ ├── docker-compose.yml
│ ├── Dockerfile
│ ├── combined_app.py # Main Streamlit application
│ ├── requirements.txt
│ ├── config/
│ │ └── mcpServers.json # MCP server configuration
│ └── proxy.py # Development proxy
└── screenshots/ # Documentation screenshots
MCP Integration
Model Context Protocol (MCP)
This project leverages the Model Context Protocol to provide standardized communication between AI applications and external data sources. MCP enables:
- Standardized Tool Access: Consistent interface for AI agents to access various tools and resources
- Scalable Architecture: Support for multiple concurrent connections and server instances
- Protocol Flexibility: JSON-RPC based communication with support for various transports
FastMCP Server Implementation
The server component uses FastMCP 2.0, a comprehensive Python framework that:
- Reduces boilerplate code by 90% compared to manual MCP implementation
- Provides automatic tool discovery through Python decorators
- Supports async operations for high-performance data processing
- Includes built-in authentication and server proxying capabilities
- Offers production-ready deployment patterns
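As an example of the async support, an I/O-bound operation can be registered as an async tool; a hedged sketch with a hypothetical function name:

```python
# Illustrative sketch of an async FastMCP tool; the function name is hypothetical.
import asyncio

from fastmcp import FastMCP

mcp = FastMCP("video-analysis-kit")

@mcp.tool()
async def fetch_video_metadata(url: str) -> dict:
    """Fetch lightweight video metadata without blocking the event loop."""
    await asyncio.sleep(0)  # placeholder for an awaited HTTP call
    return {"url": url, "note": "metadata fetching omitted in this sketch"}
```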
MCP-Use Client Library
The client integrates with mcp-use library to:
- Connect to multiple MCP servers simultaneously
- Enable dynamic server selection based on task requirements
- Provide asynchronous streaming for real-time agent interactions
- Support various LLM providers (OpenAI, Anthropic, Groq)
- Offer flexible configuration management via JSON files
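A hedged sketch of the multi-server style mcp-use supports (the extra server slot and the server-manager flag are illustrative, not part of this project's shipped config):

```python
# Illustrative sketch of one agent connected to multiple MCP servers.
from langchain_openai import ChatOpenAI
from mcp_use import MCPAgent, MCPClient

config = {
    "mcpServers": {
        "video-analysis-kit": {
            "url": "http://localhost:8000/mcp",
            "transport": "streamable-http",
        },
        # Additional MCP servers could be registered here.
    }
}

client = MCPClient.from_dict(config)
agent = MCPAgent(
    llm=ChatOpenAI(model="gpt-4o"),
    client=client,
    use_server_manager=True,  # let the agent pick a server per task
)
```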
Getting Started
Prerequisites
- Docker and Docker Compose
- OpenAI API key
- Python 3.12+ (for local development)
Clone the repository
git clone https://github.com/di37/youtube-video-analysis-toolkit.git
cd youtube-video-analysis-toolkit
Docker Deployment
Server only:
cd server
docker-compose up --build
# Server will be available at http://localhost:8000
Client only:
cd client
docker-compose up --build
# Client will be available at http://localhost:8501
Usage
Web Interface
- Access the Streamlit dashboard at http://localhost:8501
- Configure your OpenAI API key in the sidebar
- Select the appropriate tab for your analysis needs:
- YouTube Video Chat: Interactive chatbot for video analysis
- Saved Transcripts: View and manage video transcripts
- Knowledge Graph: Visualize entity relationships
- Generated Notes: View markdown notes
- Sentiment Analysis: Emotional and sentiment insights
- Topic Modeling: Educational content analysis
API Usage
The server exposes MCP tools at http://localhost:8000. Available tools include:
- get_youtube_transcript(url): Extract transcript from YouTube video
- save_transcript(content, source): Save transcript data
- generate_knowledge_graph_prompt(): Get knowledge graph generation prompt
- generate_notes_prompt(): Get note generation prompt
- generate_sentiment_analysis_prompt(text): Get sentiment analysis prompt
- generate_topic_modeling_prompt(text): Get topic modeling prompt
- save_analysis(analysis_json, analysis_type): Save analysis results
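For programmatic access outside the dashboard, one option is to call these tools with the FastMCP client (a sketch; the URL assumes the local Docker setup and the tool name comes from the list above):

```python
# Illustrative sketch: calling a server tool directly over Streamable HTTP.
import asyncio

from fastmcp import Client

async def main() -> None:
    async with Client("http://localhost:8000/mcp") as client:
        tools = await client.list_tools()
        print("Available tools:", [tool.name for tool in tools])

        result = await client.call_tool(
            "get_youtube_transcript",
            {"url": "https://youtube.com/watch?v=..."},
        )
        print(result)

if __name__ == "__main__":
    asyncio.run(main())
```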
Workflow Example
Analyze a YouTube Video:
Chat: "Please analyze this YouTube video: https://youtube.com/watch?v=..."
Generate Knowledge Graph:
Chat: "Create a knowledge graph from the latest transcript"
Perform Sentiment Analysis:
Chat: "Analyze the sentiment of the video content"
Create Notes:
Chat: "Generate comprehensive notes from the transcript"
Topic Modeling:
Chat: "Classify the educational content and identify topics"
Data Persistence
All analysis results are stored in the shared data/ directory:
- Transcripts: JSON format with metadata
- Knowledge Graphs: Neo4j-compatible JSON structure
- Notes: Markdown files with YAML front matter
- Sentiment Analysis: Comprehensive JSON analysis
- Topic Modeling: Educational content classification
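As a hedged sketch, saved results can be inspected with the standard library alone (the exact fields inside each file are not documented here, so only top-level keys are shown):

```python
# Illustrative sketch: list saved transcripts in the shared data directory.
import json
from pathlib import Path

DATA_DIR = Path("data")

for transcript_path in sorted((DATA_DIR / "transcripts").glob("*.json")):
    record = json.loads(transcript_path.read_text())
    summary = sorted(record) if isinstance(record, dict) else f"{len(record)} items"
    print(transcript_path.name, "->", summary)
```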
Configuration
MCP Server Configuration
Edit client/config/mcpServers.json to configure the MCP connection:
```json
{
  "mcpServers": {
    "video-analysis-kit": {
      "url": "http://{your_ip_address}:8000/mcp",
      "transport": "streamable-http"
    }
  }
}
```
Use ifconfig | grep inet to find your IP address.
Docker Configuration
Both services use Docker Compose with volume mounts for data persistence:
- Server: Mounts shared data directory and exposes port 8000
- Client: Mounts shared data directory and exposes port 8501
Development
Adding New Analysis Types
- Create a prompt template in server/prompts/prompts.py
- Add a tool function in server/main.py (see the sketch below)
- Update the client interface in client/combined_app.py
- Add a visualization tab for results display
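A hedged sketch of the first two steps using a hypothetical "summary" analysis type (the prompt text and function name below are illustrative, not part of the repo):

```python
# Illustrative sketch of registering a new analysis prompt as an MCP tool.
# The "summary" analysis type, prompt text, and function name are hypothetical.
from fastmcp import FastMCP

mcp = FastMCP("video-analysis-kit")

SUMMARY_PROMPT = (
    "Summarize the following transcript in five bullet points, "
    "preserving key terminology:\n\n{text}"
)

@mcp.tool()
def generate_summary_prompt(text: str) -> str:
    """Return a summarization prompt for the given transcript text."""
    return SUMMARY_PROMPT.format(text=text)
```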
Extending the Dashboard
The Streamlit dashboard is modular with separate render functions for each tab:
- render_chat_tab(): MCP chatbot interface
- render_knowledge_graph_tab(): Interactive graph visualization
- render_notes_tab(): Markdown note display
- render_sentiment_analysis_tab(): Sentiment visualization
- render_topic_modeling_tab(): Educational content analysis
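A new tab can follow the same pattern; a hypothetical sketch (the tab name, data path, and file format are assumptions):

```python
# Illustrative sketch of an additional Streamlit tab; names and paths are hypothetical.
import json
from pathlib import Path

import streamlit as st

def render_summary_tab() -> None:
    """Display saved summary JSON files from the shared data directory."""
    st.header("Summaries")
    summary_dir = Path("data/summaries")  # assumed location
    if not summary_dir.exists():
        st.info("No summaries generated yet.")
        return
    for path in sorted(summary_dir.glob("*.json")):
        with st.expander(path.stem):
            st.json(json.loads(path.read_text()))
```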
Troubleshooting
Common Issues
- Port conflicts: Ensure ports 8000 and 8501 are available
- API key errors: Verify OpenAI API key is correctly set
- Volume mount issues: Check Docker volume permissions
- MCP connection errors: Verify server is running and accessible
Logs
View container logs:
docker logs -f <name_of_the_container>
Contributing
- Fork the repository
- Create a feature branch
- Make your changes
- Add tests if applicable
- Submit a pull request
License
This project is licensed under the MIT License - see the LICENSE file for details.
Acknowledgments
- Built with FastMCP for MCP server implementation - A comprehensive Python framework for building Model Context Protocol servers with automatic tool discovery, async support, authentication, and production-ready deployment patterns
- Client integration powered by mcp-use - Open-source library for connecting any LLM to any MCP server with multi-server support, asynchronous streaming, and flexible agent configuration
- Uses Streamlit for web interface
- Powered by OpenAI GPT models for AI analysis