ai-webscraper-agent
AI-powered web-scraping agent built with the Brightdata MCP Server, FastAPI, Python, a Streamlit UI, and an Anthropic LLM (Claude)
🕸️ AI Webscraper Agent
An AI-powered webscraping agent that uses the Brightdata MCP server to extract and summarize content from the web. Built with a modular architecture combining LLM reasoning, robust scraping, and a simple web interface.
🔧 Tech Stack
- Frontend: Streamlit
- Backend: FastAPI
- Language: Python
- Scraping: Brightdata MCP Server
- AI Model: Anthropic LLM (Claude)
🚀 Features
- Natural language interface to extract data from websites
- Uses Brightdata MCP for reliable web scraping
- LLM-powered summarization and reasoning
- Streamlit-based interactive frontend
- Async FastAPI backend integration
Environment Variables
Create a .env file and configure the following:
# .env
# Environment Variables for AI Webscraper Agent
# Replace 'your_key_here' with your actual API keys
# Bright Data
API_TOKEN=your_key_here
WEB_UNLOCKER_ZONE=your_key_here
BROWSER_AUTH="your_browser_auth_token"
# Anthropic API key
ANTHROPIC_API_KEY=your_key_here
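Before starting the app, it can help to check that every key above has been filled in. The following is a minimal sketch using only the standard library. The helper names `parse_env` and `missing_keys` are hypothetical and not part of this repository, and the project itself may load the file differently (for example with a dotenv library).

```python
from pathlib import Path

# Keys listed in the .env template above.
REQUIRED_KEYS = ("API_TOKEN", "WEB_UNLOCKER_ZONE", "BROWSER_AUTH", "ANTHROPIC_API_KEY")

# Placeholder values from the template that must be replaced.
PLACEHOLDERS = {"your_key_here", "your_browser_auth_token"}


def parse_env(text: str) -> dict[str, str]:
    """Parse KEY=value lines, skipping blank lines and # comments."""
    env: dict[str, str] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop optional surrounding quotes, as in BROWSER_AUTH="...".
        env[key.strip()] = value.strip().strip('"').strip("'")
    return env


def missing_keys(env: dict[str, str]) -> list[str]:
    """Return required keys that are absent, empty, or still placeholders."""
    return [k for k in REQUIRED_KEYS if not env.get(k) or env[k] in PLACEHOLDERS]


if __name__ == "__main__":
    path = Path(".env")
    env = parse_env(path.read_text()) if path.exists() else {}
    problems = missing_keys(env)
    if problems:
        print("Missing or placeholder values:", ", ".join(problems))
    else:
        print("All required keys are set.")
```

Running this script from the project root reports any key that still holds a template placeholder.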
📦 Installation
git clone https://github.com/yourusername/ai-webscraper-agent.git
cd ai-webscraper-agent
uv pip install -r requirements.txt
Run the App
Start the FastAPI backend server, then the Streamlit app.
Start the FastAPI backend server:
uv run backend.py
Start the Streamlit frontend:
streamlit run frontend.py
Example Usage
Ask:
Scrape the top 5 news headlines from https://bbc.com and summarize them.
Get Response:
1. Headline A - Summary
2. Headline B - Summary
3. Headline C - Summary
4. Headline D - Summary
...
Agent Flow
[User Prompt] ➡ [Streamlit UI] ➡ [FastAPI Router] ➡ [LLM Agent] ➡ [Brightdata Tool via MCP] ➡ [LLM Summarization] ➡ [UI Response]
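The flow above can be sketched as a small async pipeline. This is an illustration only: `scrape`, `summarize`, and `handle_prompt` are hypothetical stand-ins, not functions from this repository, and they return canned strings instead of calling Brightdata or Anthropic. In the real agent the LLM decides when to call the scraping tool; here the URL is passed in explicitly to keep the example self-contained.

```python
import asyncio


async def scrape(url: str) -> str:
    """Stand-in for the Brightdata tool call made via MCP (no network access)."""
    await asyncio.sleep(0)  # yield control, as a real I/O call would
    return f"<raw content from {url}>"


async def summarize(prompt: str, content: str) -> str:
    """Stand-in for the LLM summarization step (no API call)."""
    await asyncio.sleep(0)
    return f"Summary for '{prompt}' based on {len(content)} characters"


async def handle_prompt(prompt: str, url: str) -> dict[str, str]:
    """Roughly what a backend route might do: scrape, then summarize."""
    content = await scrape(url)
    summary = await summarize(prompt, content)
    return {"prompt": prompt, "source": url, "summary": summary}


if __name__ == "__main__":
    result = asyncio.run(handle_prompt("top 5 news headlines", "https://bbc.com"))
    print(result["summary"])
```

Keeping each stage as its own coroutine mirrors the boxes in the diagram, so a stand-in can later be swapped for a real client without changing the surrounding flow.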