---
title: BirdScope AI - MCP Multi-Agent System
emoji: 🐦
colorFrom: green
colorTo: blue
sdk: gradio
sdk_version: 6.0.1
python_version: "3.11"
app_file: app.py
pinned: false
license: mit
short_description: AI-powered bird identification with MCP multi-agent system
tags:
  - building-mcp-track-enterprise
  - building-mcp-track-consumer
  - building-mcp-track-creative
  - mcp-in-action-track-enterprise
  - mcp-in-action-track-consumer
  - mcp-in-action-track-creative
---
# 🐦 BirdScope AI - MCP Multi-Agent System

**AI-powered bird identification with specialized MCP agents**

Built for the MCP 1st Birthday Hackathon
## 📢 Hackathon Submission

**Social Media:** Twitter/X Post
**Demo Video:** Watch on YouTube/Loom

**Track Submissions:**
- 🔧 **Track 1 (Building MCP):** Two custom MCP servers
  - **Nuthatch MCP Server** - 7 tools for the bird species database (search, species info, images, audio, family search, conservation filtering)
  - **Modal Bird Classifier MCP** - 2 Modal-hosted, GPU-powered image classification tools (base64 & URL inputs)
  - Categories: Enterprise (wildlife conservation) | Consumer (bird enthusiasts and education) | Creative (multimedia exploration)
- 🤖 **Track 2 (MCP in Action):** Full multi-agent system with supervisor routing
  - LangGraph-based supervisor orchestrating 3 specialized subagents
  - Integrates both MCP servers with intelligent tool routing
  - Categories: Enterprise (conservation orgs) | Consumer (bird watchers) | Creative (educational multimedia)
**Author:** @facemelter
**Built with:** Gradio 6 | LangGraph | FastMCP | Modal (GPU) | OpenAI/Anthropic/HuggingFace LLMs
## 🌍 Project Overview

BirdScope AI showcases an advanced multi-agent system powered by Gradio 6 and LangGraph, designed to identify bird species, explore multimedia content, and provide educational information about birds worldwide.

**Our innovation:** we built two complete systems in one:

- 🔧 **Two Custom MCP Servers (Track 1):** Nuthatch species database (7 tools) + Modal GPU classifier (2 tools)
- 🤖 **Multi-Agent Application (Track 2):** supervisor-orchestrated specialist agents

This dual approach demonstrates both building MCP infrastructure and leveraging MCP for autonomous agents.
## ✨ Key Features

### 🤖 Multi-Agent Orchestration

- **LangGraph Supervisor Pattern** with intelligent LLM-based routing
- **3 Specialized Subagents** (Image Identifier, Species Explorer, Taxonomy Specialist)
- **Session-based Agent Caching** - agents are reused within a user session for 10x faster responses
- **Provider-Specific Prompts** - system prompts optimized for OpenAI, Anthropic, and HuggingFace
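The session-based caching described above can be sketched as a small keyed cache; the function and variable names here are illustrative, not the app's actual API:

```python
# Hypothetical sketch of session-scoped agent caching: building a LangGraph
# supervisor (LLM + MCP tool wiring) is expensive, so reuse it per session.
from typing import Callable, Dict, Tuple

_agent_cache: Dict[Tuple[str, str], object] = {}

def get_agent(session_id: str, provider: str, build: Callable[[], object]) -> object:
    """Return a cached agent for this (session, provider), building it once."""
    key = (session_id, provider)
    if key not in _agent_cache:
        _agent_cache[key] = build()  # expensive construction happens only here
    return _agent_cache[key]

# First call builds; later calls in the same session reuse the same instance.
a1 = get_agent("sess-1", "openai", lambda: object())
a2 = get_agent("sess-1", "openai", lambda: object())
assert a1 is a2
```

Keying on both session and provider means switching the LLM provider mid-session rebuilds the agent rather than reusing one wired to the wrong model.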
### 🔧 Dual MCP Server Architecture

- **Modal Bird Classifier** (modal.com)
  - prithivMLmods/Bird-Species-Classifier-526 from HuggingFace
  - 526-species bird classification on a Modal T4 GPU
  - Serverless GPU deployment for on-demand classification
  - Streamable HTTP transport with base64 and URL input support
- **Nuthatch MCP Server** (custom built - Track 1)
  - FastMCP framework with 7 specialized tools
  - Integrates the Nuthatch API (1000+ species)
  - **Dual transport support:** STDIO (subprocess) for HF Spaces + HTTP for local debugging
  - Data sources: Nuthatch DB, Unsplash (images), xeno-canto (audio)
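The classifier's base64 input path boils down to standard base64 round-tripping; this is a generic sketch of that step, not the server's actual tool signature:

```python
# Illustrative helpers for the base64 input path: the client encodes image
# bytes to text for transport, the server decodes before classification.
import base64

def image_to_base64(data: bytes) -> str:
    """Encode raw image bytes to an ASCII-safe base64 string."""
    return base64.b64encode(data).decode("ascii")

def base64_to_image(payload: str) -> bytes:
    """Decode the payload back to raw bytes on the server side."""
    return base64.b64decode(payload)

fake_jpeg = b"\xff\xd8\xff\xe0fake-bytes"  # stand-in for real image data
payload = image_to_base64(fake_jpeg)
assert base64_to_image(payload) == fake_jpeg
```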
### 📡 Dual Streaming Output

- **Chat Response Stream** - real-time markdown rendering with embedded media
- **Tool Execution Log Stream** - parallel visibility into MCP tool calls (inputs/outputs)
- **Async Progress Indicators** - immediate user feedback before processing begins
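The dual-stream pattern above can be sketched as one async generator that yields paired updates for the chat panel and the tool log; the `respond` function and tool name are illustrative:

```python
# Sketch of dual streaming: each yield updates both output panels at once,
# so tool activity is visible while the answer is still being produced.
import asyncio
from typing import AsyncIterator, Tuple

async def respond(query: str) -> AsyncIterator[Tuple[str, str]]:
    chat, log = "⏳ Starting...", ""
    yield chat, log                                   # immediate feedback
    log += f"[tool] search_birds(query={query!r})\n"  # hypothetical tool call
    yield chat, log                                   # log updates first
    chat = f"The **{query}** is a small songbird."
    yield chat, log                                   # final markdown answer

async def main():
    return [pair async for pair in respond("nuthatch")]

updates = asyncio.run(main())
assert updates[0][0].startswith("⏳")
assert "search_birds" in updates[-1][1]
```

In Gradio, a generator like this can drive two output components from a single event handler, which is what makes the parallel chat/log view possible.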
### 🎨 Structured Output Parsing

- **LlamaIndex Pydantic Models** - type-safe response formatting
- **Regex URL Extraction** - automatic detection of image and audio URLs
- **Smart Audio Normalization** - xeno-canto links converted to a browser-friendly format (`/download` → playable)
- **Markdown Media Embedding** - images and audio automatically formatted
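A minimal sketch of the URL-extraction and normalization idea, assuming the `/download` path on xeno-canto is the directly playable file; the regexes are simplified stand-ins for the real parser:

```python
# Sketch: pull media URLs out of LLM output, then normalize xeno-canto
# links so they point at the downloadable (browser-playable) audio file.
import re

IMG_RE = re.compile(r"https?://\S+\.(?:jpg|jpeg|png|gif)", re.IGNORECASE)
AUDIO_RE = re.compile(r"https?://(?:www\.)?xeno-canto\.org/\d+(?:/download)?")

def normalize_audio(url: str) -> str:
    """Ensure a xeno-canto link ends in /download so <audio> can play it."""
    return url if url.endswith("/download") else url.rstrip("/") + "/download"

text = "See https://images.example/robin.jpg and https://xeno-canto.org/12345"
images = IMG_RE.findall(text)
audio = [normalize_audio(u) for u in AUDIO_RE.findall(text)]
assert images == ["https://images.example/robin.jpg"]
assert audio == ["https://xeno-canto.org/12345/download"]
```

Extracted URLs can then be embedded as markdown images or HTML audio tags in the chat stream.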
### 🔑 Multi-Provider LLM Support

- **OpenAI (GPT-4o-mini)** - recommended for reliability
- **Anthropic (Claude Sonnet 4)** - best for complex reasoning
- **HuggingFace Inference API** - open-source models (limited tool calling)
- **User-Provided Keys** - no backend API key required; users supply their own
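Provider selection like this usually reduces to a small lookup table; the config shape and `supports_tools` flag below are a simplification (the README says HuggingFace tool calling is limited, not absent):

```python
# Illustrative provider registry: maps the user's dropdown choice to a
# model config. Model names mirror the README; the helper is an assumption.
from dataclasses import dataclass

@dataclass(frozen=True)
class LLMConfig:
    provider: str
    model: str
    supports_tools: bool

PROVIDERS = {
    "openai": LLMConfig("openai", "gpt-4o-mini", True),
    "anthropic": LLMConfig("anthropic", "claude-sonnet-4", True),
    "huggingface": LLMConfig("huggingface", "open-source (varies)", False),
}

def pick_provider(name: str) -> LLMConfig:
    try:
        return PROVIDERS[name.lower()]
    except KeyError:
        raise ValueError(f"Unknown provider: {name!r}") from None

assert pick_provider("OpenAI").model == "gpt-4o-mini"
assert not pick_provider("huggingface").supports_tools
```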
### 💻 Production UI/UX

- **Gradio 6.0 SSR** - server-side rendering for enhanced performance
- **Custom Cloud Theme** - sky-inspired CSS with a mobile-responsive design
- **Dynamic Examples** - example queries adapt to the selected agent mode
- **Instant Feedback** - a "⏳ Starting..." indicator appears immediately on submit
## 🗄️ Data Sources & MCP Servers

We built two custom MCP servers that integrate bird data APIs with GPU-powered classification.

**Data Sources:**

- **Nuthatch API** (nuthatch.lastelm.software) - 1000+ bird species database by Last Elm Software
- **Unsplash** - high-quality reference images for visual identification
- **xeno-canto.org** - community-contributed bird audio recordings worldwide
- **HuggingFace Model** - prithivMLmods/Bird-Species-Classifier-526 for GPU classification

**MCP Servers:**

**Nuthatch MCP Server** (Track 1 - Building MCP)
- 7 specialized tools: search, species info, images, audio, family search, conservation filtering
- STDIO transport for HF Spaces, HTTP option for local debugging
- FastMCP framework with async API integration

**Modal Bird Classifier** (GPU-powered)
- Image classification tools with URL and base64 input support
- Serverless GPU deployment via Modal
- Streamable HTTP transport
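The STDIO-on-Spaces vs. HTTP-locally split can be expressed as a small environment check; the `SPACE_ID` detection and config dict shape are assumptions about how such a switch might look, not the project's actual code:

```python
# Sketch of dual-transport selection for the Nuthatch MCP server:
# spawn it as a subprocess (STDIO) on HF Spaces, connect over HTTP locally.
def choose_transport(env: dict) -> dict:
    if "SPACE_ID" in env:  # assumed marker that we're running inside a Space
        return {"transport": "stdio", "command": "python",
                "args": ["nuthatch_mcp_server.py"]}
    # local debugging: talk to a separately launched HTTP server
    return {"transport": "http", "url": "http://localhost:8000/mcp"}

assert choose_transport({"SPACE_ID": "user/birdscope"})["transport"] == "stdio"
assert choose_transport({})["transport"] == "http"
```

Passing the environment in as a dict (rather than reading `os.environ` directly) keeps the selection logic trivially testable.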
## 🧩 Core Components

**Multi-Agent Orchestration:**

- **LangGraph Supervisor Pattern** - LLM-based routing between specialist agents
- **3 Specialized Subagents** - each with a focused tool subset (image ID, species exploration, taxonomy)
- **Session-based Caching** - agent instances reused within user sessions for performance
- **Dual Streaming** - parallel chat response + tool execution log streams

**Agent Architecture:**

- `subagent_supervisor.py` - creates the supervisor workflow with LangGraph
- `subagent_factory.py` - builds specialists with filtered tool access
- `subagent_config.py` - defines agent modes and tool allocations
- `prompts.py` - provider-specific system prompts (OpenAI, Anthropic, HuggingFace)
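The per-agent tool allocation could be encoded as mode-to-toolset maps; the tool names and allocations below are plausible guesses from the feature list, not the contents of `subagent_config.py`:

```python
# Sketch of filtered tool access: each agent mode sees only its own subset
# of the shared MCP tool registry (names here are illustrative).
ALL_TOOLS = {"classify_image", "search_birds", "get_species_info",
             "get_images", "get_audio", "search_by_family",
             "filter_conservation"}

MODE_TOOLS = {  # hypothetical allocations per specialist
    "image_identifier": {"classify_image", "get_species_info"},
    "species_explorer": {"search_birds", "get_species_info",
                         "get_images", "get_audio"},
    "taxonomy_specialist": {"search_by_family", "filter_conservation"},
}

def tools_for(mode: str) -> set:
    """Intersect with the registry so no agent gets a nonexistent tool."""
    return MODE_TOOLS.get(mode, set()) & ALL_TOOLS

assert tools_for("image_identifier") == {"classify_image", "get_species_info"}
assert tools_for("unknown") == set()
```

Keeping each specialist's tool list small is what lets the supervisor route by intent rather than exposing all nine tools to every agent.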
**UI & UX:**

- Gradio 6.0 with SSR for enhanced performance
- Custom cloud-themed CSS with a mobile-responsive design
- Dynamic examples that adapt to the selected agent mode
- Immediate processing feedback with async streaming updates
## 🚀 Quick Start

**Try the Live Demo:** just provide your LLM API key (OpenAI, Anthropic, or HuggingFace) in the sidebar and start exploring!

**For Developers:**

```bash
# Clone and install
git clone <repo-url>
cd hackathon_draft
pip install -r requirements.txt

# Configure environment
cp .env.example .env
# Edit .env with your API keys

# Run locally
python app.py
```

**Deploy to HuggingFace Spaces:**

```bash
python upload_to_space.py
# Configure Secrets in Space Settings (see docs/dev/main-README.md)
```

**Full Setup Guide:** see docs/dev/main-README.md for comprehensive deployment instructions.
## 📜 Credits & License

Built for the **HuggingFace MCP 1st Birthday Hackathon**

**Data Sources:** Nuthatch API (Last Elm Software) | xeno-canto.org | Unsplash
**Technology:** Model Context Protocol | LangGraph | Gradio 6 | Modal

**MIT License** - educational and research purposes