---
title: BirdScope AI - MCP Multi-Agent System
emoji: 🦅
colorFrom: green
colorTo: blue
sdk: gradio
python_version: 3.11
app_file: app.py
pinned: false
---

# 🦅 BirdScope AI - Multi-Agent Bird Identification System

**AI-powered bird identification with specialized MCP agents**

Built for the [MCP 1st Birthday Hackathon](https://huggingface.co/MCP-1st-Birthday)

---

## 🎯 Overview

BirdScope AI is a production-ready multi-agent system that combines **Modal GPU classification** with the **Nuthatch species database** to provide comprehensive bird identification and exploration. Users can upload photos, search species, explore taxonomic families, and access rich multimedia content (images, audio recordings, conservation data).

**Two Agent Modes:**

1. **Specialized Subagents (3 Specialists)** - A router orchestrates an image identifier, a species explorer, and a taxonomy specialist
2. **Audio Finder Agent** - A specialized agent for discovering bird audio recordings

---

## ✨ Features

- 🔍 **Image Classification**: Upload bird photos for instant GPU-powered identification
- 📸 **Reference Images**: High-quality Unsplash photos for each species
- 🎵 **Audio Recordings**: Bird calls and songs from xeno-canto.org
- 🌍 **Conservation Data**: IUCN status and taxonomic information
- 🧠 **Multi-Agent Architecture**: Specialized agents with focused tool subsets
- 🔄 **Dual Streaming**: Separate outputs for chat responses and tool execution logs
- 🤖 **Multi-Provider**: OpenAI (GPT-4), Anthropic (Claude), HuggingFace (Qwen)

---

## 🚀 Quick Start (For Users)

### Option 1: OpenAI (Recommended)

1. Get your OpenAI API key from [platform.openai.com/api-keys](https://platform.openai.com/api-keys)
2. Select **OpenAI** as the provider in the sidebar
3. Enter your API key
4. Model used: `gpt-4o-mini`

### Option 2: Anthropic (Claude)

1. Get your Anthropic API key from [console.anthropic.com/settings/keys](https://console.anthropic.com/settings/keys)
2. Select **Anthropic** as the provider
3. Enter your API key
4. Model used: `claude-sonnet-4-5`

### Option 3: HuggingFace

⚠️ **Note**: The HuggingFace Inference API has limited function calling support; OpenAI or Anthropic is recommended for full functionality.
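For developers curious how the sidebar choice maps to a model, provider selection is handled by the agent factory (`langgraph_agent/agents.py`). Below is a minimal, hypothetical sketch, assuming the `langchain-openai` and `langchain-anthropic` packages and the default models listed above; it is not the app's actual code:

```python
from langchain_anthropic import ChatAnthropic
from langchain_openai import ChatOpenAI


def make_llm(provider: str, api_key: str):
    """Return a chat model for the selected provider (illustrative only)."""
    if provider == "openai":
        return ChatOpenAI(model="gpt-4o-mini", temperature=0.0, api_key=api_key)
    if provider == "anthropic":
        return ChatAnthropic(model="claude-sonnet-4-5", temperature=0.0, api_key=api_key)
    # HuggingFace is intentionally omitted here: its Inference API has limited
    # function calling support, as noted above.
    raise ValueError(f"Unsupported provider: {provider}")
```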
---

## 🛠️ Environment Setup (For Developers)

### Prerequisites

- Python 3.11+
- Modal account (for the GPU classifier)
- Nuthatch API key
- LLM API key (OpenAI, Anthropic, or HuggingFace)

---

### 🏠 Local Development Setup

#### Step 1: Clone and Install

```bash
cd ~/Desktop/hackathon/hackathon_draft

# Create virtual environment
python3.11 -m venv .venv
source .venv/bin/activate  # On Windows: .venv\Scripts\activate

# Install dependencies
pip install -r requirements.txt
```

#### Step 2: Configure Environment Variables

Create a `.env` file from the example:

```bash
cp .env.example .env
```

Edit `.env` with your API keys:

```bash
# ================================================
# REQUIRED: Modal Bird Classifier (GPU)
# ================================================
MODAL_MCP_URL=https://your-modal-app--mcp-server.modal.run/mcp
BIRD_CLASSIFIER_API_KEY=your-modal-api-key-here

# ================================================
# REQUIRED: Nuthatch Species Database
# ================================================
NUTHATCH_API_KEY=your-nuthatch-api-key-here
NUTHATCH_BASE_URL=https://nuthatch.lastelm.software/v2  # Default, can omit

# Nuthatch Transport Mode (STDIO or HTTP)
NUTHATCH_USE_STDIO=true  # Recommended for local development

# Only needed if NUTHATCH_USE_STDIO=false:
# NUTHATCH_MCP_URL=http://localhost:8001/mcp
# NUTHATCH_MCP_AUTH_KEY=your-auth-key-here

# ================================================
# LLM Provider (Choose ONE)
# ================================================

# OpenAI (Recommended)
OPENAI_API_KEY=sk-your-openai-key-here
DEFAULT_OPENAI_MODEL=gpt-4o-mini
OPENAI_TEMPERATURE=0.0

# OR Anthropic
# ANTHROPIC_API_KEY=sk-ant-your-anthropic-key-here
# DEFAULT_ANTHROPIC_MODEL=claude-sonnet-4-5-20250929
# ANTHROPIC_TEMPERATURE=0.0

# OR HuggingFace (Limited function calling support)
# HF_API_KEY=hf_your-huggingface-token-here
# DEFAULT_HF_MODEL=Qwen/Qwen2.5-Coder-32B-Instruct
# HF_TEMPERATURE=0.1
```

#### Step 3: Understanding Nuthatch Transport Modes

**STDIO Mode (Recommended for Local):**

- The Nuthatch MCP server runs as a subprocess
- Automatically started by the app
- No separate server process needed
- Set `NUTHATCH_USE_STDIO=true`

**HTTP Mode (Alternative for Local):**

- The Nuthatch MCP server runs as a separate HTTP server
- Useful for debugging or multiple clients
- Requires running the server in a separate terminal

To use HTTP mode:

```bash
# Terminal 1: Run Nuthatch MCP server
python nuthatch_tools.py --http --port 8001

# Terminal 2: Run the app
# Set in .env:
#   NUTHATCH_USE_STDIO=false
#   NUTHATCH_MCP_URL=http://localhost:8001/mcp
python app.py
```

A sketch showing how these variables drive transport selection appears at the end of this section.

#### Step 4: Run the App

```bash
# With STDIO mode (default, easiest):
python app.py

# Or using the Gradio CLI:
gradio app.py
```

The app will be available at `http://127.0.0.1:7860`.
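To make Step 3 concrete, here is a minimal sketch of how the transport-mode variables might be read and turned into a server configuration. The `server_config` shape is made up for illustration; the actual loading lives in `langgraph_agent/config.py` and `langgraph_agent/mcp_clients.py`. It assumes `python-dotenv` is installed:

```python
import os

from dotenv import load_dotenv

load_dotenv()  # reads the .env file created in Step 2

# Hypothetical transport selection, mirroring the NUTHATCH_USE_STDIO flag.
use_stdio = os.getenv("NUTHATCH_USE_STDIO", "true").lower() == "true"

if use_stdio:
    # STDIO: the app spawns nuthatch_tools.py as a subprocess. Env vars must
    # be forwarded explicitly, because subprocesses do not inherit Spaces secrets.
    server_config = {
        "command": "python",
        "args": ["nuthatch_tools.py"],
        "env": {
            "NUTHATCH_API_KEY": os.environ["NUTHATCH_API_KEY"],
            "NUTHATCH_BASE_URL": os.getenv(
                "NUTHATCH_BASE_URL", "https://nuthatch.lastelm.software/v2"
            ),
        },
    }
else:
    # HTTP: connect to the already-running server from Terminal 1 in Step 3.
    server_config = {"url": os.environ["NUTHATCH_MCP_URL"]}

print(server_config)
```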
---

### ☁️ HuggingFace Spaces Deployment

#### Step 1: Create a New Space

1. Go to [huggingface.co/new-space](https://huggingface.co/new-space)
2. Choose:
   - **SDK**: Gradio
   - **Hardware**: CPU Basic (free) or CPU Upgrade (faster)
   - **Visibility**: Public or Private

#### Step 2: Upload Your Code

**Option A: Using `upload_to_space.py` (Recommended)**

```bash
# 1. Install the HuggingFace Hub client
pip install huggingface_hub

# 2. Login
huggingface-cli login

# 3. Update upload_to_space.py with your Space name
#    Edit the line with repo_id:
#    repo_id="YOUR-USERNAME/YOUR-SPACE-NAME"

# 4. Upload
python upload_to_space.py
```

**Option B: Using Git**

```bash
git remote add hf-space https://huggingface.co/spaces/YOUR-USERNAME/YOUR-SPACE-NAME
git push hf-space main
```

#### Step 3: Configure Secrets in HuggingFace Spaces

⚠️ **CRITICAL**: Spaces use **Secrets**, not `.env` files!

Go to your Space → **Settings** → **Variables and secrets**

**Add these secrets:**

```bash
# REQUIRED: Modal Bird Classifier
MODAL_MCP_URL = https://your-modal-app--mcp-server.modal.run/mcp
BIRD_CLASSIFIER_API_KEY = your-modal-api-key-here

# REQUIRED: Nuthatch Species Database
NUTHATCH_API_KEY = your-nuthatch-api-key-here
NUTHATCH_BASE_URL = https://nuthatch.lastelm.software/v2  # Optional
NUTHATCH_USE_STDIO = true  # MUST be "true" for Spaces

# OPTIONAL: Backend-provided LLM keys (users can provide their own)
# Only add if you want to provide default keys:
# OPENAI_API_KEY = sk-your-key-here
# ANTHROPIC_API_KEY = sk-ant-your-key-here
```

**Important Notes:**

- ✅ **ALWAYS** use `NUTHATCH_USE_STDIO=true` on Spaces (subprocess mode)
- ✅ HTTP mode is not supported on Spaces (port binding restrictions)
- ✅ Users can provide their own LLM keys via the UI
- ✅ Environment variables from Spaces **do not** auto-inherit to subprocesses - the app explicitly passes `NUTHATCH_API_KEY` and `NUTHATCH_BASE_URL` to the subprocess (see `mcp_clients.py`)

#### Step 4: Verify Deployment

1. Wait for the Space to build (2-5 minutes)
2. Check the **Logs** tab for errors
3. Try the app - upload a bird photo or ask about species

---

## 📁 Project Structure

```
hackathon_draft/
├── app.py                     # Main Gradio app
├── upload_to_space.py         # HF Spaces upload script
├── requirements.txt           # Python dependencies
├── .env.example               # Environment template
├── langgraph_agent/
│   ├── __init__.py
│   ├── agents.py              # Agent factory (single/multi-agent)
│   ├── config.py              # Configuration loader
│   ├── mcp_clients.py         # MCP client setup
│   ├── subagent_config.py     # Agent mode definitions
│   ├── prompts.py             # System prompts
│   └── structured_output.py   # Response formatting
├── nuthatch_tools.py          # Nuthatch MCP server
└── agent_cache.py             # Session-based agent caching
```

---

## 🏗️ Architecture

### MCP Servers

**1. Modal Bird Classifier (GPU)**

- Hosted on Modal (serverless GPU)
- ResNet50 trained on 555 bird species
- Tools: `classify_from_url`, `classify_from_base64`
- Transport: Streamable HTTP

**2. Nuthatch Species Database**

- Species reference API (1000+ birds)
- Tools: `search_birds`, `get_bird_info`, `get_bird_images`, `get_bird_audio`, `search_by_family`, `filter_by_status`, `get_all_families`
- Transport: **STDIO** (subprocess on Spaces), STDIO or HTTP (local)
- Data sources: Unsplash (images), xeno-canto (audio)
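Since the app talks to both servers through the FastMCP client library (see the Tech Stack below), a one-off tool call against the Modal classifier might look roughly like the sketch below. The argument name `url` is an assumption (check the server's tool schema), and passing `BIRD_CLASSIFIER_API_KEY` as a bearer token is also an assumption; see `mcp_clients.py` for the real wiring:

```python
import asyncio
import os

from fastmcp import Client


async def main() -> None:
    # Hypothetical smoke test against the Modal classifier MCP server.
    # Assumes the server accepts BIRD_CLASSIFIER_API_KEY as a bearer token.
    async with Client(
        os.environ["MODAL_MCP_URL"],
        auth=os.environ["BIRD_CLASSIFIER_API_KEY"],
    ) as client:
        tools = await client.list_tools()
        print("Available tools:", [tool.name for tool in tools])

        # "url" is an assumed parameter name for classify_from_url.
        result = await client.call_tool(
            "classify_from_url", {"url": "https://example.com/bird.jpg"}
        )
        print(result)


asyncio.run(main())
```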
### Agent Modes

**Mode 1: Specialized Subagents (3 Specialists)**

- **Router** orchestrates 3 specialized agents:
  1. **Image Identifier**: classify images, show reference photos
  2. **Species Explorer**: search by name, provide multimedia
  3. **Taxonomy Specialist**: conservation status, family search
- Each specialist has a focused tool subset

**Mode 2: Audio Finder Agent**

- Single specialized agent for finding bird audio
- Tools: `search_birds`, `get_bird_info`, `get_bird_audio`
- Optimized workflow for xeno-canto recordings

### Tech Stack

- **Frontend**: Gradio 6.0 with custom CSS (cloud/sky theme)
- **Agent Framework**: LangGraph with streaming
- **MCP Integration**: FastMCP client library
- **LLM Support**: OpenAI, Anthropic, HuggingFace
- **Session Management**: In-memory agent caching
- **Output Parsing**: LlamaIndex Pydantic + regex (optimized)

---

## 🎨 Special Features

### Dual Streaming Output

- **Chat Panel**: LLM responses with markdown rendering
- **Tool Log Panel**: Real-time tool execution traces (inputs/outputs)

### Dynamic Examples

- Examples change based on the selected agent mode
- Photo examples always visible
- Text examples adapt to Audio Finder vs. Multi-Agent

### Structured Output

- Automatic image/audio URL extraction
- Markdown formatting for media
- xeno-canto audio links (browser-friendly)

---

## 📝 API Key Sources

| Service | Get Key From | Purpose |
|---------|-------------|---------|
| **Modal** | [modal.com](https://modal.com) | GPU bird classifier |
| **Nuthatch** | [nuthatch.lastelm.software](https://nuthatch.lastelm.software) | Species database |
| **OpenAI** | [platform.openai.com/api-keys](https://platform.openai.com/api-keys) | LLM (recommended) |
| **Anthropic** | [console.anthropic.com/settings/keys](https://console.anthropic.com/settings/keys) | LLM (Claude) |
| **HuggingFace** | [huggingface.co/settings/tokens](https://huggingface.co/settings/tokens) | LLM (limited support) |

---

## 🐛 Troubleshooting

### Space stuck on "Building"

- Check the **Logs** tab for errors
- Verify all required secrets are set
- Try a Factory Reboot (Settings → Factory Reboot)

### "Invalid API key" errors

- Ensure secrets are set correctly (no quotes needed)
- Check that secret names match exactly (case-sensitive)

### HuggingFace provider fails with "function calling not support"

- The HuggingFace Inference API has limited tool calling support
- Use OpenAI or Anthropic instead

### Nuthatch server not starting (local)

- Check that `NUTHATCH_API_KEY` is set in `.env`
- Verify the API key is valid
- Try STDIO mode: `NUTHATCH_USE_STDIO=true`

### Audio links broken

- Check that `AUDIO_FINDER_PROMPT` is working
- Verify xeno-canto URLs include `/download`
- Check the structured output parsing logs

---

## 📚 Documentation

For detailed implementation docs, see:

- `project_docs/implementation/phase_5_final.md` - Complete agent architecture
- `project_docs/commands_guide/git_spaces_cheatsheet.md` - Deployment guide

---

## 🏆 Credits

- **Bird Species Data**: [Nuthatch API](https://nuthatch.lastelm.software) by Last Elm Software
- **Bird Audio**: [xeno-canto.org](https://xeno-canto.org) - Community bird recordings
- **Reference Images**: [Unsplash](https://unsplash.com) + curated collections
- **MCP Protocol**: [Anthropic Model Context Protocol](https://github.com/anthropics/mcp)
- **Hackathon**: [HuggingFace MCP-1st-Birthday](https://huggingface.co/MCP-1st-Birthday)

---

## 📄 License

MIT License - Built for educational and research purposes