---
title: BirdScope AI - MCP Multi-Agent System
emoji: 🐦
colorFrom: green
colorTo: blue
sdk: gradio
python_version: 3.11
app_file: app.py
pinned: false
---
# 🐦 BirdScope AI - Multi-Agent Bird Identification System

**AI-powered bird identification with specialized MCP agents**

Built for the [MCP 1st Birthday Hackathon](https://huggingface.co/MCP-1st-Birthday)

---

## 🎯 Overview

BirdScope AI is a production-ready multi-agent system that combines **Modal GPU classification** with the **Nuthatch species database** to provide comprehensive bird identification and exploration. Users can upload photos, search species, explore taxonomic families, and access rich multimedia content (images, audio recordings, conservation data).

**Two Agent Modes:**

1. **Specialized Subagents (3 Specialists)** - A router orchestrates an image identifier, a species explorer, and a taxonomy specialist
2. **Audio Finder Agent** - A specialized agent for discovering bird audio recordings

---
## ✨ Features

- 🔍 **Image Classification**: Upload bird photos for instant GPU-powered identification
- 📸 **Reference Images**: High-quality Unsplash photos for each species
- 🎵 **Audio Recordings**: Bird calls and songs from xeno-canto.org
- 🌍 **Conservation Data**: IUCN status and taxonomic information
- 🧠 **Multi-Agent Architecture**: Specialized agents with focused tool subsets
- 📊 **Dual Streaming**: Separate outputs for chat responses and tool execution logs
- 🤖 **Multi-Provider**: OpenAI (GPT-4), Anthropic (Claude), HuggingFace (Qwen)

---
## 🚀 Quick Start (For Users)

### Option 1: OpenAI (Recommended)

1. Get your OpenAI API key from [platform.openai.com/api-keys](https://platform.openai.com/api-keys)
2. Select **OpenAI** as provider in the sidebar
3. Enter your API key
4. Model used: `gpt-4o-mini`

### Option 2: Anthropic (Claude)

1. Get your Anthropic API key from [console.anthropic.com/settings/keys](https://console.anthropic.com/settings/keys)
2. Select **Anthropic** as provider
3. Enter your API key
4. Model used: `claude-sonnet-4-5`

### Option 3: HuggingFace

⚠️ **Note**: The HuggingFace Inference API has limited function-calling support. OpenAI or Anthropic is recommended for full functionality.

---
## 🛠️ Environment Setup (For Developers)

### Prerequisites

- Python 3.11+
- Modal account (for the GPU classifier)
- Nuthatch API key
- LLM API key (OpenAI, Anthropic, or HuggingFace)

---
### 📍 Local Development Setup

#### Step 1: Clone and Install

```bash
cd ~/Desktop/hackathon/hackathon_draft

# Create virtual environment
python3.11 -m venv .venv
source .venv/bin/activate  # On Windows: .venv\Scripts\activate

# Install dependencies
pip install -r requirements.txt
```
#### Step 2: Configure Environment Variables

Create a `.env` file from the example:

```bash
cp .env.example .env
```

Edit `.env` with your API keys:

```bash
# ================================================
# REQUIRED: Modal Bird Classifier (GPU)
# ================================================
MODAL_MCP_URL=https://your-modal-app--mcp-server.modal.run/mcp
BIRD_CLASSIFIER_API_KEY=your-modal-api-key-here

# ================================================
# REQUIRED: Nuthatch Species Database
# ================================================
NUTHATCH_API_KEY=your-nuthatch-api-key-here
NUTHATCH_BASE_URL=https://nuthatch.lastelm.software/v2  # Default, can omit

# Nuthatch transport mode (STDIO or HTTP)
NUTHATCH_USE_STDIO=true  # Recommended for local development

# Only needed if NUTHATCH_USE_STDIO=false:
# NUTHATCH_MCP_URL=http://localhost:8001/mcp
# NUTHATCH_MCP_AUTH_KEY=your-auth-key-here

# ================================================
# LLM Provider (Choose ONE)
# ================================================

# OpenAI (Recommended)
OPENAI_API_KEY=sk-your-openai-key-here
DEFAULT_OPENAI_MODEL=gpt-4o-mini
OPENAI_TEMPERATURE=0.0

# OR Anthropic
# ANTHROPIC_API_KEY=sk-ant-your-anthropic-key-here
# DEFAULT_ANTHROPIC_MODEL=claude-sonnet-4-5-20250929
# ANTHROPIC_TEMPERATURE=0.0

# OR HuggingFace (limited function-calling support)
# HF_API_KEY=hf_your-huggingface-token-here
# DEFAULT_HF_MODEL=Qwen/Qwen2.5-Coder-32B-Instruct
# HF_TEMPERATURE=0.1
```
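As a hedged illustration, provider selection from these variables might look like the sketch below. The function name and fallback defaults are assumptions mirroring the values above; the app's actual logic lives in `config.py` and may differ.

```python
# Hypothetical provider resolution mirroring the .env layout above;
# the function name and fallbacks are illustrative, not the app's API.
def resolve_llm_provider(env: dict) -> tuple[str, str]:
    """Return (provider, model) for the first configured LLM key."""
    if env.get("OPENAI_API_KEY"):
        return "openai", env.get("DEFAULT_OPENAI_MODEL", "gpt-4o-mini")
    if env.get("ANTHROPIC_API_KEY"):
        return "anthropic", env.get("DEFAULT_ANTHROPIC_MODEL", "claude-sonnet-4-5-20250929")
    if env.get("HF_API_KEY"):
        return "huggingface", env.get("DEFAULT_HF_MODEL", "Qwen/Qwen2.5-Coder-32B-Instruct")
    raise RuntimeError("No LLM provider configured - set one key in .env")
```

Setting exactly one key keeps the choice unambiguous, which is why the template comments out all but one provider.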
#### Step 3: Understanding Nuthatch Transport Modes

**STDIO Mode (Recommended for Local):**

- Nuthatch MCP server runs as a subprocess
- Automatically started by the app
- No separate server process needed
- Set `NUTHATCH_USE_STDIO=true`

**HTTP Mode (Alternative for Local):**

- Nuthatch MCP server runs as a separate HTTP server
- Useful for debugging or multiple clients
- Requires running the server in a separate terminal

To use HTTP mode:

```bash
# Terminal 1: Run the Nuthatch MCP server
python nuthatch_tools.py --http --port 8001

# Terminal 2: Run the app
# Set in .env:
#   NUTHATCH_USE_STDIO=false
#   NUTHATCH_MCP_URL=http://localhost:8001/mcp
python app.py
```
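The mode switch can be pictured as a small helper like the one below. This is a sketch only; the helper name and the returned parameter dictionary are assumptions, and the real selection in `mcp_clients.py` may be structured differently.

```python
# Illustrative STDIO-vs-HTTP selection based on the env vars above.
def nuthatch_connection(env: dict) -> dict:
    """Pick STDIO (subprocess) or HTTP connection parameters."""
    if env.get("NUTHATCH_USE_STDIO", "true").lower() == "true":
        # STDIO: the app spawns nuthatch_tools.py itself; no server to run.
        return {"transport": "stdio", "command": "python", "args": ["nuthatch_tools.py"]}
    # HTTP: connect to the server started in Terminal 1.
    return {"transport": "http", "url": env.get("NUTHATCH_MCP_URL", "http://localhost:8001/mcp")}
```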
#### Step 4: Run the App

```bash
# With STDIO mode (default, easiest):
python app.py

# Or using the Gradio CLI:
gradio app.py
```

The app will be available at `http://127.0.0.1:7860`.

---
### ☁️ HuggingFace Spaces Deployment

#### Step 1: Create a New Space

1. Go to [huggingface.co/new-space](https://huggingface.co/new-space)
2. Choose:
   - **SDK**: Gradio
   - **Hardware**: CPU Basic (free) or CPU Upgrade (faster)
   - **Visibility**: Public or Private

#### Step 2: Upload Your Code

**Option A: Using `upload_to_space.py` (Recommended)**

```bash
# 1. Install the HuggingFace Hub client
pip install huggingface_hub

# 2. Log in
huggingface-cli login

# 3. Update upload_to_space.py with your Space name
#    Edit the line with repo_id:
#    repo_id="YOUR-USERNAME/YOUR-SPACE-NAME"

# 4. Upload
python upload_to_space.py
```
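For reference, `upload_to_space.py` most likely wraps `huggingface_hub`'s `upload_folder` API. The sketch below is hedged: the `repo_id` is a placeholder and the ignore patterns are assumptions, though `HfApi.upload_folder` itself (with `folder_path`, `repo_id`, `repo_type`, `ignore_patterns`) is the real library call.

```python
def upload_kwargs(repo_id: str) -> dict:
    """Arguments for HfApi.upload_folder; ignore_patterns keeps local
    secrets and caches out of the Space."""
    return {
        "folder_path": ".",
        "repo_id": repo_id,  # e.g. "YOUR-USERNAME/YOUR-SPACE-NAME"
        "repo_type": "space",
        "ignore_patterns": [".venv/*", ".env", "__pycache__/*"],
    }

# Usage (after `huggingface-cli login`):
#   from huggingface_hub import HfApi
#   HfApi().upload_folder(**upload_kwargs("YOUR-USERNAME/YOUR-SPACE-NAME"))
```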
**Option B: Using Git**

```bash
git remote add hf-space https://huggingface.co/spaces/YOUR-USERNAME/YOUR-SPACE-NAME
git push hf-space main
```
#### Step 3: Configure Secrets in HuggingFace Spaces

⚠️ **CRITICAL**: Spaces use **Secrets**, not `.env` files!

Go to your Space → **Settings** → **Variables and secrets**

**Add these secrets:**

```bash
# REQUIRED: Modal Bird Classifier
MODAL_MCP_URL = https://your-modal-app--mcp-server.modal.run/mcp
BIRD_CLASSIFIER_API_KEY = your-modal-api-key-here

# REQUIRED: Nuthatch Species Database
NUTHATCH_API_KEY = your-nuthatch-api-key-here
NUTHATCH_BASE_URL = https://nuthatch.lastelm.software/v2  # Optional
NUTHATCH_USE_STDIO = true  # MUST be "true" for Spaces

# OPTIONAL: Backend-provided LLM keys (users can provide their own)
# Only add if you want to provide default keys:
# OPENAI_API_KEY = sk-your-key-here
# ANTHROPIC_API_KEY = sk-ant-your-key-here
```

**Important Notes:**

- ✅ **ALWAYS** use `NUTHATCH_USE_STDIO=true` on Spaces (subprocess mode)
- ❌ HTTP mode is not supported on Spaces (port-binding restrictions)
- ✅ Users can provide their own LLM keys via the UI
- ⚠️ Environment variables from Spaces **do not** auto-inherit to subprocesses
  - The app explicitly passes `NUTHATCH_API_KEY` and `NUTHATCH_BASE_URL` to the subprocess (see `mcp_clients.py`)
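A minimal sketch of that explicit forwarding, assuming a helper that builds the subprocess environment from scratch (the helper name is hypothetical; the real code in `mcp_clients.py` may differ):

```python
import os

def nuthatch_subprocess_env() -> dict:
    """Build the environment for the STDIO subprocess, explicitly
    forwarding the secrets Spaces injects into the parent process."""
    env = {"PATH": os.environ.get("PATH", "")}
    for key in ("NUTHATCH_API_KEY", "NUTHATCH_BASE_URL"):
        if os.environ.get(key):
            env[key] = os.environ[key]
    return env
```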
#### Step 4: Verify Deployment

1. Wait for the Space to build (2-5 minutes)
2. Check the **Logs** tab for errors
3. Try the app - upload a bird photo or ask about a species

---
## 📁 Project Structure

```
hackathon_draft/
├── app.py                   # Main Gradio app
├── upload_to_space.py       # HF Spaces upload script
├── requirements.txt         # Python dependencies
├── .env.example             # Environment template
├── langgraph_agent/
│   ├── __init__.py
│   ├── agents.py            # Agent factory (single/multi-agent)
│   ├── config.py            # Configuration loader
│   ├── mcp_clients.py       # MCP client setup
│   ├── subagent_config.py   # Agent mode definitions
│   ├── prompts.py           # System prompts
│   └── structured_output.py # Response formatting
├── nuthatch_tools.py        # Nuthatch MCP server
└── agent_cache.py           # Session-based agent caching
```

---
| --- | |
| ## ποΈ Architecture | |
| ### MCP Servers | |
| **1. Modal Bird Classifier (GPU)** | |
| - Hosted on Modal (serverless GPU) | |
| - ResNet50 trained on 555 bird species | |
| - Tools: `classify_from_url`, `classify_from_base64` | |
| - Transport: Streamable HTTP | |
| **2. Nuthatch Species Database** | |
| - Species reference API (1000+ birds) | |
| - Tools: `search_birds`, `get_bird_info`, `get_bird_images`, `get_bird_audio`, `search_by_family`, `filter_by_status`, `get_all_families` | |
| - Transport: **STDIO** (subprocess on Spaces), STDIO or HTTP (local) | |
| - Data sources: Unsplash (images), xeno-canto (audio) | |
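Calling either server with the FastMCP client looks roughly like this. `Client` and `call_tool` are real `fastmcp` APIs; the URL is the placeholder from the config above, and the import is guarded so the sketch can be read without the library installed.

```python
import asyncio

try:
    from fastmcp import Client  # real library: pip install fastmcp
except ImportError:
    Client = None  # sketch remains importable without fastmcp

MODAL_MCP_URL = "https://your-modal-app--mcp-server.modal.run/mcp"  # placeholder

async def classify_bird(image_url: str):
    """Call the GPU classifier tool over streamable HTTP."""
    async with Client(MODAL_MCP_URL) as client:
        return await client.call_tool("classify_from_url", {"url": image_url})
```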
### Agent Modes

**Mode 1: Specialized Subagents (3 Specialists)**

- A **Router** orchestrates 3 specialized agents:
  1. **Image Identifier**: classify images, show reference photos
  2. **Species Explorer**: search by name, provide multimedia
  3. **Taxonomy Specialist**: conservation status, family search
- Each specialist has a focused tool subset

**Mode 2: Audio Finder Agent**

- Single specialized agent for finding bird audio
- Tools: `search_birds`, `get_bird_info`, `get_bird_audio`
- Optimized workflow for xeno-canto recordings
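The tool subsets above can be pictured as a simple mapping. This is illustrative only: `subagent_config.py` defines the real structure, and the image identifier's exact tool list is an assumption based on its description.

```python
# Hypothetical mapping of specialists to tool subsets (Mode 1).
SUBAGENT_TOOLS = {
    "image_identifier": ["classify_from_url", "classify_from_base64", "get_bird_images"],
    "species_explorer": ["search_birds", "get_bird_info", "get_bird_images", "get_bird_audio"],
    "taxonomy_specialist": ["filter_by_status", "search_by_family", "get_all_families"],
}

# Mode 2 uses a single agent with the audio-focused subset:
AUDIO_FINDER_TOOLS = ["search_birds", "get_bird_info", "get_bird_audio"]

def tools_for(agent: str) -> list[str]:
    """Tools exposed to one specialist; the router sees none directly."""
    return SUBAGENT_TOOLS.get(agent, [])
```

Narrow tool subsets keep each specialist's prompt small and reduce the chance of the LLM picking an irrelevant tool.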
### Tech Stack

- **Frontend**: Gradio 6.0 with custom CSS (cloud/sky theme)
- **Agent Framework**: LangGraph with streaming
- **MCP Integration**: FastMCP client library
- **LLM Support**: OpenAI, Anthropic, HuggingFace
- **Session Management**: In-memory agent caching
- **Output Parsing**: LlamaIndex Pydantic + regex (optimized)

---
## 🎨 Special Features

### Dual Streaming Output

- **Chat Panel**: LLM responses with markdown rendering
- **Tool Log Panel**: Real-time tool execution traces (inputs/outputs)

### Dynamic Examples

- Examples change based on the selected agent mode
- Photo examples are always visible
- Text examples adapt to Audio Finder vs. Multi-Agent mode

### Structured Output

- Automatic image/audio URL extraction
- Markdown formatting for media
- xeno-canto audio links (browser-friendly)
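The URL-extraction step might look like the regex sketch below. The real parsing in `structured_output.py` may differ; the pattern here deliberately accepts xeno-canto's `/download` links, which play directly in the browser.

```python
import re

# Direct image/audio files, plus xeno-canto's /download links.
MEDIA_URL = re.compile(
    r"https?://\S+?(?:\.(?:jpg|jpeg|png|mp3|wav)|/download)\b",
    re.IGNORECASE,
)

def extract_media_urls(text: str) -> list[str]:
    """Pull media URLs out of an LLM response for markdown embedding."""
    return MEDIA_URL.findall(text)
```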
---
## 🔑 API Key Sources

| Service | Get Key From | Purpose |
|---------|--------------|---------|
| **Modal** | [modal.com](https://modal.com) | GPU bird classifier |
| **Nuthatch** | [nuthatch.lastelm.software](https://nuthatch.lastelm.software) | Species database |
| **OpenAI** | [platform.openai.com/api-keys](https://platform.openai.com/api-keys) | LLM (recommended) |
| **Anthropic** | [console.anthropic.com/settings/keys](https://console.anthropic.com/settings/keys) | LLM (Claude) |
| **HuggingFace** | [huggingface.co/settings/tokens](https://huggingface.co/settings/tokens) | LLM (limited support) |

---
## 🐛 Troubleshooting

### Space stuck on "Building"

- Check the **Logs** tab for errors
- Verify all required secrets are set
- Try a Factory Reboot (Settings → Factory Reboot)

### "Invalid API key" errors

- Ensure secrets are set correctly (no quotes needed)
- Check that secret names match exactly (case-sensitive)

### HuggingFace provider fails with "function calling not supported"

- The HuggingFace Inference API has limited tool-calling support
- Use OpenAI or Anthropic instead

### Nuthatch server not starting (local)

- Check that `NUTHATCH_API_KEY` is set in `.env`
- Verify the API key is valid
- Try STDIO mode: `NUTHATCH_USE_STDIO=true`

### Audio links broken

- Check that `AUDIO_FINDER_PROMPT` is working
- Verify xeno-canto URLs include `/download`
- Check the structured output parsing logs

---
## 📚 Documentation

For detailed implementation docs, see:

- `project_docs/implementation/phase_5_final.md` - Complete agent architecture
- `project_docs/commands_guide/git_spaces_cheatsheet.md` - Deployment guide

---

## 🙏 Credits

- **Bird Species Data**: [Nuthatch API](https://nuthatch.lastelm.software) by Last Elm Software
- **Bird Audio**: [xeno-canto.org](https://xeno-canto.org) - Community bird recordings
- **Reference Images**: [Unsplash](https://unsplash.com) + curated collections
- **MCP Protocol**: [Anthropic Model Context Protocol](https://github.com/anthropics/mcp)
- **Hackathon**: [HuggingFace MCP-1st-Birthday](https://huggingface.co/MCP-1st-Birthday)

---

## 📄 License

MIT License - Built for educational and research purposes