ai-webscraper-agent

Name: ai-webscraper-agent
Author: @Appugouda.Architect

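AI Webscraper Agent is an AI-powered web scraping agent that uses the Brightdata MCP server to extract and summarize content from the web. Built on a modular architecture, it retrieves data through a natural language interface and provides an interactive frontend.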

README

🕸️ AI Webscraper Agent

An AI-powered webscraping agent that uses the Brightdata MCP server to extract and summarize content from the web. Built with a modular architecture combining LLM reasoning, robust scraping, and a simple web interface.

🔧 Tech Stack

Frontend: Streamlit
Backend: FastAPI
Language: Python
Scraping: Brightdata MCP Server
AI Model: Anthropic LLM (Claude)

🚀 Features

Natural language interface to extract data from websites
Uses Brightdata MCP for reliable web scraping
LLM-powered summarization and reasoning
Streamlit-based interactive frontend
Async FastAPI backend integration

Environment Variables

Create a .env file and configure the following:

# .env
# Environment Variables for AI Webscraper Agent
# Replace 'your_key_here' with your actual API keys

# Bright Data
API_TOKEN=your_key_here
WEB_UNLOCKER_ZONE=your_key_here
BROWSER_AUTH="your_browser_auth_token"

# Anthropic API key
ANTHROPIC_API_KEY=your_key_here
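
A small startup check can catch a .env file that still contains the template placeholders. The sketch below is not part of the repository: the function name missing_settings and the placeholder set are assumptions for illustration, and only the four variable names come from the template above.

```python
from typing import Mapping

# Variable names taken from the .env template above.
REQUIRED_VARS = ("API_TOKEN", "WEB_UNLOCKER_ZONE", "BROWSER_AUTH", "ANTHROPIC_API_KEY")

# Values that mean "not configured yet" (assumed from the template's placeholders).
PLACEHOLDERS = {"", "your_key_here", "your_browser_auth_token"}


def missing_settings(env: Mapping[str, str]) -> list[str]:
    """Return the required variable names that are unset or still placeholders."""
    missing = []
    for name in REQUIRED_VARS:
        # Strip whitespace and surrounding double quotes, as used for BROWSER_AUTH.
        value = env.get(name, "").strip().strip('"')
        if value in PLACEHOLDERS:
            missing.append(name)
    return missing
```

Calling missing_settings(os.environ) before the backend starts would return an empty list once every value has been replaced with a real key.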

📦 Installation

git clone https://github.com/yourusername/ai-webscraper-agent.git
cd ai-webscraper-agent

uv pip install -r requirements.txt

Run the App

Start the FastAPI backend server and the Streamlit app.

Start the backend FastAPI server

uv run backend.py

Start frontend Streamlit app

streamlit run frontend.py

Example Usage

Ask:

Scrape the top 5 news headlines from https://bbc.com and summarize them.

Get Response:

1. Headline A - Summary
2. Headline B - Summary
3. Headline C - Summary
4. Headline D - Summary
...
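
The same prompt could in principle be sent to the backend without the Streamlit UI. The README does not document the backend's address, route, or payload shape, so everything in this sketch is hypothetical: the base URL http://localhost:8000, the /query path, and the "prompt" JSON field are placeholders to be replaced with whatever backend.py actually exposes.

```python
import json
import urllib.request

# Hypothetical values: the README does not document the backend's address or routes.
BACKEND_URL = "http://localhost:8000"
QUERY_PATH = "/query"


def build_query_request(prompt: str, base_url: str = BACKEND_URL) -> urllib.request.Request:
    """Build (but do not send) a JSON POST request carrying the user's prompt."""
    # The "prompt" field name is an assumption, not the project's documented schema.
    body = json.dumps({"prompt": prompt}).encode("utf-8")
    return urllib.request.Request(
        url=base_url.rstrip("/") + QUERY_PATH,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

Once the real route is known and the backend is running, the request can be sent with urllib.request.urlopen(request).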

Agent Flow

[User Prompt] ➡ [Streamlit UI] ➡ [FastAPI Router] ➡ [LLM Agent]
➡ [Brightdata Tool via MCP] ➡ [LLM Summarization] ➡ [UI Response]
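
The flow above can be sketched as a small pipeline with the scraping and summarization steps passed in as functions. This is a simplification under stated assumptions, not the project's code: in the real agent the LLM decides when to call the Brightdata tool, whereas here the URL is pulled from the prompt with a regular expression. The names extract_url, run_agent, scrape, and summarize are all hypothetical.

```python
import re
from typing import Callable


def extract_url(prompt: str) -> str:
    """Return the first http(s) URL in the prompt, without trailing punctuation."""
    match = re.search(r"https?://\S+", prompt)
    if match is None:
        raise ValueError("prompt does not contain a URL")
    return match.group(0).rstrip(".,;)")


def run_agent(
    prompt: str,
    scrape: Callable[[str], str],          # stands in for the Brightdata tool via MCP
    summarize: Callable[[str, str], str],  # stands in for the LLM summarization step
) -> str:
    """Mirror the flow: prompt -> scraping tool -> summarization -> response text."""
    url = extract_url(prompt)
    page_text = scrape(url)
    return summarize(prompt, page_text)
```

Passing the two steps as callables keeps the sketch testable with stubs and leaves the MCP client and Anthropic client details to the actual implementation.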

Author information

@Appugouda.Architect

🚀 Full-Stack Architect | AWS-Certified Cloud Expert | AI Innovator 🧩 Mentor | Strategist | Builder

Tags

anthropic fastapi mcp python streamlit

Related MCP servers

mcp-domain-availability

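mcp-domain-availability is a Python library for checking whether a given domain name is available. Users can check a domain's status easily and integrate the check into automated workflows through its API. The tool streamlines the domain acquisition process and saves time.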

Python

network-automation-system

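This hybrid network automation system integrates workflow orchestration with LangGraph, natural language processing with LangChain, and intelligent network device management via MCP. Users can issue instructions to network devices conversationally, which simplifies complex multi-device operations. It also provides automatic network mapping and topology discovery.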

Python

fastapi_mcp

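FastAPI-MCP is a library for exposing FastAPI endpoints as Model Context Protocol (MCP) tools. It includes authentication support, which helps in building secure APIs. This lets developers manage their APIs efficiently and integrate more easily with other systems.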

Python