ai-webscraper-agent
AI-powered web-scraping agent built with the Brightdata MCP Server, FastAPI, Python, a Streamlit UI, and an Anthropic LLM.
🕸️ AI Webscraper Agent
An AI-powered webscraping agent that uses the Brightdata MCP server to extract and summarize content from the web. Built with a modular architecture combining LLM reasoning, robust scraping, and a simple web interface.
Tech Stack
- Frontend: Streamlit
- Backend: FastAPI
- Language: Python
- Scraping: Brightdata MCP Server
- AI Model: Anthropic LLM (Claude)
Features
- Natural language interface to extract data from websites
- Uses Brightdata MCP for reliable web scraping
- LLM-powered summarization and reasoning
- Streamlit-based interactive frontend
- Async FastAPI backend integration
Environment Variables
Create a .env file and configure the following:
# .env
# Environment Variables for AI Webscraper Agent
# Replace 'your_key_here' with your actual API keys
# Bright Data
API_TOKEN=your_key_here
WEB_UNLOCKER_ZONE=your_key_here
BROWSER_AUTH="your_browser_auth_token"
# Anthropic API key
ANTHROPIC_API_KEY=your_key_here
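Before starting the app, it can help to confirm that none of these values is missing or still set to a placeholder. The following is a minimal sketch using only the standard library. The helper name `missing_env_vars` is hypothetical and not part of this repository, and the check assumes the variables have already been loaded into the process environment (for example by the shell or a loader such as python-dotenv), since a `.env` file is not read automatically.

```python
import os

# Variable names taken from the .env listing above.
REQUIRED_VARS = ("API_TOKEN", "WEB_UNLOCKER_ZONE", "BROWSER_AUTH", "ANTHROPIC_API_KEY")

# Placeholder values used in the .env template above.
PLACEHOLDERS = {"", "your_key_here", "your_browser_auth_token"}


def missing_env_vars(env=None):
    """Return required variables that are unset or still hold a placeholder.

    Hypothetical helper for illustration; `env` defaults to os.environ.
    """
    env = os.environ if env is None else env
    return [
        name
        for name in REQUIRED_VARS
        # Strip whitespace and surrounding double quotes before comparing.
        if env.get(name, "").strip().strip('"') in PLACEHOLDERS
    ]
```

Calling `missing_env_vars()` at startup and stopping when the returned list is non-empty gives a clear error early, instead of a failed API call later in the flow.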
📦 Installation
git clone https://github.com/yourusername/ai-webscraper-agent.git
cd ai-webscraper-agent
uv pip install -r requirements.txt
Run the App
Start the FastAPI backend server and the Streamlit app.
Start the backend FastAPI server
uv run backend.py
Start the frontend Streamlit app
streamlit run frontend.py
Example Usage
Ask:
Scrape the top 5 news headlines from https://bbc.com and summarize them.
Get Response:
1. Headline A - Summary
2. Headline B - Summary
3. Headline C - Summary
4. Headline D - Summary
...
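The same prompt could also be sent to the backend without the Streamlit UI. The sketch below only builds the HTTP request. The URL, the `/query` route, and the `prompt` JSON field are assumptions for illustration, because this README does not document the backend's API; check `backend.py` for the actual route and payload.

```python
import json
import urllib.request

# Hypothetical endpoint: the README does not state the backend's URL or route.
BACKEND_URL = "http://localhost:8000/query"


def build_query_request(prompt, url=BACKEND_URL):
    """Build (but do not send) a JSON POST request carrying the user's prompt."""
    body = json.dumps({"prompt": prompt}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_query_request(
    "Scrape the top 5 news headlines from https://bbc.com and summarize them."
)

# Sending the request requires the backend to be running:
# with urllib.request.urlopen(request) as response:
#     print(response.read().decode("utf-8"))
```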
Agent Flow
[User Prompt] ➡ [Streamlit UI] ➡ [FastAPI Router] ➡ [LLM Agent] ➡ [Brightdata Tool via MCP] ➡ [LLM Summarization] ➡ [UI Response]
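The flow above can be sketched as a small async pipeline. This is a simplified illustration, not the repository's implementation: `scrape_tool` and `llm_summarize` are hypothetical stubs standing in for the Brightdata MCP tool call and the Claude summarization step, and a real agent would let the LLM decide when to call the tool rather than calling it unconditionally.

```python
import asyncio


async def scrape_tool(url):
    """Stub for the Brightdata tool reached via MCP; returns canned text."""
    return f"<raw page content from {url}>"


async def llm_summarize(prompt, scraped):
    """Stub for the LLM summarization step; a real agent would call the LLM here."""
    return f"Summary for '{prompt}' based on {len(scraped)} characters of content"


async def run_agent(prompt, url, scrape=scrape_tool, summarize=llm_summarize):
    """Chain the stages: prompt -> scrape via tool -> summarize -> response."""
    scraped = await scrape(url)
    return await summarize(prompt, scraped)


result = asyncio.run(run_agent("Summarize the headlines", "https://bbc.com"))
print(result)
```

Passing the scrape and summarize steps as parameters keeps the stages swappable, so each one can be replaced with a fake while testing the rest of the chain.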
Author Information
Full-Stack Architect | AWS-Certified Cloud Expert | AI Innovator 🧩 Mentor | Strategist | Builder