---
title: BirdScope AI - MCP Multi-Agent System
emoji: 🦅
colorFrom: green
colorTo: blue
sdk: gradio
python_version: 3.11
app_file: app.py
pinned: false
---

# 🦅 BirdScope AI - Multi-Agent Bird Identification System

**AI-powered bird identification with specialized MCP agents**

Built for the [MCP 1st Birthday Hackathon](https://huggingface.co/MCP-1st-Birthday)

---

## 🎯 Overview

BirdScope AI is a production-ready multi-agent system that combines **GPU image classification on Modal** with the **Nuthatch species database** to identify birds and explore species data. Users can upload photos, search species, browse taxonomic families, and access multimedia content (images, audio recordings, conservation data).

**Two Agent Modes:**
1. **Specialized Subagents (3 Specialists)** - Router orchestrates image identifier, species explorer, and taxonomy specialist
2. **Audio Finder Agent** - Specialized agent for discovering bird audio recordings

---

## ✨ Features

- 🔍 **Image Classification**: Upload bird photos for instant GPU-powered identification
- 📸 **Reference Images**: High-quality Unsplash photos for each species
- 🎵 **Audio Recordings**: Bird calls and songs from xeno-canto.org
- 🌍 **Conservation Data**: IUCN status and taxonomic information
- 🧠 **Multi-Agent Architecture**: Specialized agents with focused tool subsets
- 🔄 **Dual Streaming**: Separate outputs for chat responses and tool execution logs
- 🤖 **Multi-Provider**: OpenAI (GPT-4o mini), Anthropic (Claude Sonnet 4.5), HuggingFace (Qwen)

---

## 🚀 Quick Start (For Users)

### Option 1: OpenAI (Recommended)
1. Get your OpenAI API key from [platform.openai.com/api-keys](https://platform.openai.com/api-keys)
2. Select **OpenAI** as provider in the sidebar
3. Enter your API key
4. Model used: `gpt-4o-mini`

### Option 2: Anthropic (Claude)
1. Get your Anthropic API key from [console.anthropic.com/settings/keys](https://console.anthropic.com/settings/keys)
2. Select **Anthropic** as provider
3. Enter your API key
4. Model used: `claude-sonnet-4-5`

### Option 3: HuggingFace
⚠️ **Note**: The HuggingFace Inference API has limited function-calling support. OpenAI or Anthropic is recommended for full functionality.

---

## 🛠️ Environment Setup (For Developers)

### Prerequisites

- Python 3.11+
- Modal account (for GPU classifier)
- Nuthatch API key
- LLM API key (OpenAI, Anthropic, or HuggingFace)

---

### 🏠 Local Development Setup

#### Step 1: Clone and Install

```bash
# Change into your local copy of the project (example path; adjust to your checkout)
cd ~/Desktop/hackathon/hackathon_draft

# Create virtual environment
python3.11 -m venv .venv
source .venv/bin/activate  # On Windows: .venv\Scripts\activate

# Install dependencies
pip install -r requirements.txt
```

#### Step 2: Configure Environment Variables

Create a `.env` file from the example:

```bash
cp .env.example .env
```

Edit `.env` with your API keys:

```bash
# ================================================
# REQUIRED: Modal Bird Classifier (GPU)
# ================================================
MODAL_MCP_URL=https://your-modal-app--mcp-server.modal.run/mcp
BIRD_CLASSIFIER_API_KEY=your-modal-api-key-here

# ================================================
# REQUIRED: Nuthatch Species Database
# ================================================
NUTHATCH_API_KEY=your-nuthatch-api-key-here
# Optional: this is the default value, so the line can be omitted
NUTHATCH_BASE_URL=https://nuthatch.lastelm.software/v2

# Nuthatch Transport Mode (STDIO or HTTP)
# "true" is recommended for local development
NUTHATCH_USE_STDIO=true

# Only needed if NUTHATCH_USE_STDIO=false:
# NUTHATCH_MCP_URL=http://localhost:8001/mcp
# NUTHATCH_MCP_AUTH_KEY=your-auth-key-here

# ================================================
# LLM Provider (Choose ONE)
# ================================================
# OpenAI (Recommended)
OPENAI_API_KEY=sk-your-openai-key-here
DEFAULT_OPENAI_MODEL=gpt-4o-mini
OPENAI_TEMPERATURE=0.0

# OR Anthropic
# ANTHROPIC_API_KEY=sk-ant-your-anthropic-key-here
# DEFAULT_ANTHROPIC_MODEL=claude-sonnet-4-5-20250929
# ANTHROPIC_TEMPERATURE=0.0

# OR HuggingFace (Limited function calling support)
# HF_API_KEY=hf_your-huggingface-token-here
# DEFAULT_HF_MODEL=Qwen/Qwen2.5-Coder-32B-Instruct
# HF_TEMPERATURE=0.1
```
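
As a rough illustration of the "Choose ONE" rule above, the sketch below picks the first provider whose API key is present. The variable names come from the `.env` template; the function name `pick_llm_provider`, the lookup order, and the returned dictionary shape are assumptions for illustration and may differ from what `langgraph_agent/config.py` actually does.

```python
import os
from typing import Mapping, Optional

# (API key var, model var, temperature var) per provider, as named in the
# .env template. The order of preference here is an assumption.
PROVIDERS = {
    "openai": ("OPENAI_API_KEY", "DEFAULT_OPENAI_MODEL", "OPENAI_TEMPERATURE"),
    "anthropic": ("ANTHROPIC_API_KEY", "DEFAULT_ANTHROPIC_MODEL", "ANTHROPIC_TEMPERATURE"),
    "huggingface": ("HF_API_KEY", "DEFAULT_HF_MODEL", "HF_TEMPERATURE"),
}


def pick_llm_provider(env: Optional[Mapping[str, str]] = None) -> dict:
    """Return settings for the first provider whose API key is set."""
    env = os.environ if env is None else env
    for name, (key_var, model_var, temp_var) in PROVIDERS.items():
        api_key = env.get(key_var, "").strip()
        if api_key:
            return {
                "provider": name,
                "api_key": api_key,
                "model": env.get(model_var),  # None if no default is configured
                "temperature": float(env.get(temp_var, "0.0")),
            }
    expected = ", ".join(key_var for key_var, _, _ in PROVIDERS.values())
    raise RuntimeError(f"No LLM API key found; set one of: {expected}")
```

Passing a plain dictionary instead of `os.environ` makes the selection easy to check without touching the real environment.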

#### Step 3: Understanding Nuthatch Transport Modes

**STDIO Mode (Recommended for Local):**
- Nuthatch MCP server runs as a subprocess
- Automatically started by the app
- No separate server process needed
- Set `NUTHATCH_USE_STDIO=true`

**HTTP Mode (Alternative for Local):**
- Nuthatch MCP server runs as a separate HTTP server
- Useful for debugging or serving multiple clients
- Requires running the server in a separate terminal

To use HTTP mode:

```bash
# Terminal 1: Run Nuthatch MCP server
python nuthatch_tools.py --http --port 8001

# Terminal 2: Run the app
# Set in .env:
# NUTHATCH_USE_STDIO=false
# NUTHATCH_MCP_URL=http://localhost:8001/mcp
python app.py
```
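
The choice between the two modes comes down to one flag. The following is a minimal sketch of that decision, assuming STDIO is the default when the flag is unset. The function name `nuthatch_transport`, the returned dictionary layout, and the `Authorization: Bearer` header are hypothetical; the real client setup lives in `langgraph_agent/mcp_clients.py` and may look different.

```python
import sys
from typing import Mapping


def nuthatch_transport(env: Mapping[str, str]) -> dict:
    """Describe how to reach the Nuthatch MCP server for the given settings."""
    use_stdio = env.get("NUTHATCH_USE_STDIO", "true").strip().lower() == "true"
    if use_stdio:
        # STDIO: the app launches the server itself as a child process.
        return {
            "transport": "stdio",
            "command": sys.executable,
            "args": ["nuthatch_tools.py"],
        }

    # HTTP: the server must already be running at NUTHATCH_MCP_URL.
    url = env.get("NUTHATCH_MCP_URL")
    if not url:
        raise ValueError("NUTHATCH_MCP_URL is required when NUTHATCH_USE_STDIO=false")
    config = {"transport": "http", "url": url}
    auth_key = env.get("NUTHATCH_MCP_AUTH_KEY")
    if auth_key:
        # Header name and scheme are assumptions for illustration only.
        config["headers"] = {"Authorization": f"Bearer {auth_key}"}
    return config
```

Failing early when `NUTHATCH_MCP_URL` is missing gives a clearer error than a connection failure later in the request path.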

#### Step 4: Run the App

```bash
# With STDIO mode (default, easiest):
python app.py

# Or using Gradio CLI:
gradio app.py
```

The app will be available at: `http://127.0.0.1:7860`

---

### ☁️ HuggingFace Spaces Deployment

#### Step 1: Create a New Space

1. Go to [huggingface.co/new-space](https://huggingface.co/new-space)
2. Choose:
   - **SDK**: Gradio
   - **Hardware**: CPU Basic (free) or CPU Upgrade (faster)
   - **Visibility**: Public or Private

#### Step 2: Upload Your Code

**Option A: Using `upload_to_space.py` (Recommended)**

```bash
# 1. Install HuggingFace CLI
pip install huggingface_hub

# 2. Login
huggingface-cli login

# 3. Update upload_to_space.py with your Space name
# Edit line with repo_id:
# repo_id="YOUR-USERNAME/YOUR-SPACE-NAME"

# 4. Upload
python upload_to_space.py
```

**Option B: Using Git**

```bash
git remote add hf-space https://huggingface.co/spaces/YOUR-USERNAME/YOUR-SPACE-NAME
git push hf-space main
```

#### Step 3: Configure Secrets in HuggingFace Spaces

⚠️ **CRITICAL**: Spaces use **Secrets**, not `.env` files!

Go to your Space → **Settings** → **Variables and secrets**

**Add these secrets:**

```text
# REQUIRED: Modal Bird Classifier
MODAL_MCP_URL = https://your-modal-app--mcp-server.modal.run/mcp
BIRD_CLASSIFIER_API_KEY = your-modal-api-key-here

# REQUIRED: Nuthatch Species Database
NUTHATCH_API_KEY = your-nuthatch-api-key-here
NUTHATCH_BASE_URL = https://nuthatch.lastelm.software/v2  # Optional
NUTHATCH_USE_STDIO = true  # MUST be "true" for Spaces

# OPTIONAL: Backend-provided LLM keys (users can provide their own)
# Only add if you want to provide default keys:
# OPENAI_API_KEY = sk-your-key-here
# ANTHROPIC_API_KEY = sk-ant-your-key-here
```

**Important Notes:**
- ✅ **Always** use `NUTHATCH_USE_STDIO=true` on Spaces (subprocess mode)
- ⚠️ HTTP mode is not supported on Spaces (port binding restrictions)
- ✅ Users can provide their own LLM keys via the UI
- ⚠️ Environment variables set on the Space are **not** automatically inherited by the Nuthatch subprocess
  - The app explicitly passes `NUTHATCH_API_KEY` and `NUTHATCH_BASE_URL` to the subprocess (see `mcp_clients.py`)
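
To illustrate the last point, here is a minimal sketch of explicit environment forwarding using only the standard library. It is not the code from `mcp_clients.py`: the helper names `build_child_env` and `child_sees` are hypothetical, and keeping `PATH` and `SYSTEMROOT` is an assumption made so the child interpreter can start on common platforms.

```python
import subprocess
import sys
from typing import Mapping

# Variables the README says are passed to the Nuthatch subprocess.
FORWARDED_VARS = ("NUTHATCH_API_KEY", "NUTHATCH_BASE_URL")


def build_child_env(parent_env: Mapping[str, str]) -> dict:
    """Build a minimal environment holding only what the child needs."""
    # Keep a few OS-level variables so the child interpreter can start.
    child_env = {k: parent_env[k] for k in ("PATH", "SYSTEMROOT") if k in parent_env}
    for name in FORWARDED_VARS:
        if name in parent_env:
            child_env[name] = parent_env[name]
    return child_env


def child_sees(name: str, env: Mapping[str, str]) -> str:
    """Run a tiny child process and return the value it sees for `name`."""
    code = "import os, sys; sys.stdout.write(os.environ.get(sys.argv[1], ''))"
    result = subprocess.run(
        [sys.executable, "-c", code, name],
        env=dict(env),
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout
```

Calling `child_sees("NUTHATCH_API_KEY", build_child_env(os.environ))` (after `import os`) is a quick way to confirm that the key actually reaches a child process in a given deployment.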

#### Step 4: Verify Deployment

1. Wait for Space to build (2-5 minutes)
2. Check **Logs** tab for errors
3. Try the app - upload a bird photo or ask about species

---

## 📁 Project Structure

```
hackathon_draft/
├── app.py                      # Main Gradio app
├── upload_to_space.py          # HF Spaces upload script
├── requirements.txt            # Python dependencies
├── .env.example                # Environment template
├── langgraph_agent/
│   ├── __init__.py
│   ├── agents.py               # Agent factory (single/multi-agent)
│   ├── config.py               # Configuration loader
│   ├── mcp_clients.py          # MCP client setup
│   ├── subagent_config.py      # Agent mode definitions
│   ├── prompts.py              # System prompts
│   └── structured_output.py    # Response formatting
├── nuthatch_tools.py           # Nuthatch MCP server
└── agent_cache.py              # Session-based agent caching
```

---

## 🏗️ Architecture

### MCP Servers

**1. Modal Bird Classifier (GPU)**
- Hosted on Modal (serverless GPU)
- ResNet50 trained on 555 bird species
- Tools: `classify_from_url`, `classify_from_base64`
- Transport: Streamable HTTP

**2. Nuthatch Species Database**
- Species reference API (1000+ birds)
- Tools: `search_birds`, `get_bird_info`, `get_bird_images`, `get_bird_audio`, `search_by_family`, `filter_by_status`, `get_all_families`
- Transport: **STDIO** (subprocess on Spaces), STDIO or HTTP (local)
- Data sources: Unsplash (images), xeno-canto (audio)

### Agent Modes

**Mode 1: Specialized Subagents (3 Specialists)**
- **Router** orchestrates 3 specialized agents:
  1. **Image Identifier**: classify images, show reference photos
  2. **Species Explorer**: search by name, provide multimedia
  3. **Taxonomy Specialist**: conservation status, family search
- Each specialist has a focused tool subset

**Mode 2: Audio Finder Agent**
- Single specialized agent for finding bird audio
- Tools: `search_birds`, `get_bird_info`, `get_bird_audio`
- Optimized workflow for xeno-canto recordings
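
The idea of focused tool subsets can be sketched as a simple allow-list per agent. Only the Audio Finder tool list is stated in this README; the assignments for the three specialists below are guesses based on their descriptions, and `tools_for` is a hypothetical helper, not the function used in `subagent_config.py`.

```python
from typing import Dict, List, Set

# All tool names listed under "MCP Servers".
ALL_TOOLS: List[str] = [
    "classify_from_url", "classify_from_base64",
    "search_birds", "get_bird_info", "get_bird_images", "get_bird_audio",
    "search_by_family", "filter_by_status", "get_all_families",
]

TOOL_SUBSETS: Dict[str, Set[str]] = {
    # Stated in this README:
    "audio_finder": {"search_birds", "get_bird_info", "get_bird_audio"},
    # Assumed from the specialist descriptions (not confirmed by the source):
    "image_identifier": {"classify_from_url", "classify_from_base64", "get_bird_images"},
    "species_explorer": {"search_birds", "get_bird_info", "get_bird_images", "get_bird_audio"},
    "taxonomy_specialist": {"search_by_family", "filter_by_status", "get_all_families"},
}


def tools_for(agent: str, available: List[str]) -> List[str]:
    """Return the tools an agent may use, preserving the order of `available`."""
    allowed = TOOL_SUBSETS[agent]
    return [name for name in available if name in allowed]
```

Restricting each agent to a handful of tools keeps its prompt short and reduces the chance of the model picking an irrelevant tool.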

### Tech Stack

- **Frontend**: Gradio 6.0 with custom CSS (cloud/sky theme)
- **Agent Framework**: LangGraph with streaming
- **MCP Integration**: FastMCP client library
- **LLM Support**: OpenAI, Anthropic, HuggingFace
- **Session Management**: In-memory agent caching
- **Output Parsing**: LlamaIndex Pydantic + regex (optimized)
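
For the session management item, the following is one possible shape for an in-memory, session-keyed agent cache. It is a sketch under stated assumptions: the class name `AgentCache`, the cache key `(session_id, provider, mode)`, and the factory signature are all hypothetical and are not taken from `agent_cache.py`.

```python
from typing import Any, Callable, Dict, Tuple

CacheKey = Tuple[str, str, str]  # (session_id, provider, mode)


class AgentCache:
    """Reuse one agent per (session, provider, mode) for the process lifetime."""

    def __init__(self, factory: Callable[[str, str], Any]) -> None:
        self._factory = factory  # builds an agent from (provider, mode)
        self._agents: Dict[CacheKey, Any] = {}

    def get(self, session_id: str, provider: str, mode: str) -> Any:
        key = (session_id, provider, mode)
        if key not in self._agents:
            # Build lazily so sessions only pay for the agents they use.
            self._agents[key] = self._factory(provider, mode)
        return self._agents[key]

    def clear_session(self, session_id: str) -> int:
        """Drop all agents for a session and return how many were removed."""
        stale = [key for key in self._agents if key[0] == session_id]
        for key in stale:
            del self._agents[key]
        return len(stale)
```

Because the cache lives in process memory, its contents are lost whenever the app restarts.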

---

## 🎨 Special Features

### Dual Streaming Output
- **Chat Panel**: LLM responses with markdown rendering
- **Tool Log Panel**: Real-time tool execution traces (inputs/outputs)
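
One way to feed two panels from a single agent run is a generator that yields a `(chat, tool_log)` pair after every event, which fits Gradio's support for generator handlers with multiple outputs. The event dictionaries below (`type`, `text`, `name`, `args`, `output`) are a made-up format for illustration; the real events from the agent stream will be shaped differently.

```python
from typing import Dict, Iterable, Iterator, Tuple


def split_stream(events: Iterable[Dict]) -> Iterator[Tuple[str, str]]:
    """Accumulate chat text and tool log separately, yielding both each step."""
    chat, tool_log = "", ""
    for event in events:
        kind = event["type"]
        if kind == "token":
            chat += event["text"]
        elif kind == "tool_call":
            tool_log += f"-> {event['name']}({event['args']})\n"
        elif kind == "tool_result":
            tool_log += f"<- {event['name']}: {event['output']}\n"
        # Yield the full state every time so both panels can be redrawn.
        yield chat, tool_log
```

Yielding the accumulated text rather than the latest fragment means each update can replace the panel contents outright.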

### Dynamic Examples
- Examples change based on selected agent mode
- Photo examples always visible
- Text examples adapt to Audio Finder vs Multi-Agent

### Structured Output
- Automatic image/audio URL extraction
- Markdown formatting for media
- xeno-canto audio links (browser-friendly)
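
The extraction and formatting steps above can be approximated with a regular expression and a small formatter. This is a simplified sketch, not the logic in `structured_output.py`: the URL pattern, the host and extension checks, and the function names `extract_media` and `to_markdown` are assumptions.

```python
import re
from typing import List, Tuple

URL_RE = re.compile(r"https?://[^\s)\]>\"']+")
IMAGE_EXTENSIONS = (".jpg", ".jpeg", ".png", ".webp", ".gif")


def extract_media(text: str) -> Tuple[List[str], List[str]]:
    """Split URLs found in `text` into (image_urls, audio_urls)."""
    images, audio = [], []
    for url in URL_RE.findall(text):
        url = url.rstrip(".,;")  # drop trailing sentence punctuation
        path = url.split("?", 1)[0].lower()
        if "xeno-canto.org" in url:
            audio.append(url)
        elif path.endswith(IMAGE_EXTENSIONS) or "images.unsplash.com" in url:
            images.append(url)
    return images, audio


def to_markdown(images: List[str], audio: List[str]) -> str:
    """Render images inline and audio as plain links."""
    lines = [f"![Bird photo]({url})" for url in images]
    lines += [f"[Listen on xeno-canto]({url})" for url in audio]
    return "\n".join(lines)
```

Audio is rendered as a link rather than an embedded player so the recording opens directly in the browser.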

---

## 📝 API Key Sources

| Service | Get Key From | Purpose |
|---------|-------------|---------|
| **Modal** | [modal.com](https://modal.com) | GPU bird classifier |
| **Nuthatch** | [nuthatch.lastelm.software](https://nuthatch.lastelm.software) | Species database |
| **OpenAI** | [platform.openai.com/api-keys](https://platform.openai.com/api-keys) | LLM (recommended) |
| **Anthropic** | [console.anthropic.com/settings/keys](https://console.anthropic.com/settings/keys) | LLM (Claude) |
| **HuggingFace** | [huggingface.co/settings/tokens](https://huggingface.co/settings/tokens) | LLM (limited support) |

---

## 🐛 Troubleshooting

### Space stuck on "Building"
- Check **Logs** tab for errors
- Verify all required secrets are set
- Try Factory Reboot (Settings → Factory Reboot)

### "Invalid API key" errors
- Ensure secrets are set correctly (no quotes needed)
- Check secret names match exactly (case-sensitive)
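
A quick startup check can catch both problems before any API call is made. The sketch below assumes the three names marked REQUIRED in the secrets list; `check_secrets` is a hypothetical helper and is not part of the project.

```python
from typing import List, Mapping

REQUIRED_SECRETS = ("MODAL_MCP_URL", "BIRD_CLASSIFIER_API_KEY", "NUTHATCH_API_KEY")


def check_secrets(env: Mapping[str, str]) -> List[str]:
    """Return human-readable problems; an empty list means all checks passed."""
    problems = []
    for name in REQUIRED_SECRETS:
        value = env.get(name)
        if not value:
            # Look for the same name in a different case, e.g. nuthatch_api_key.
            lookalikes = [key for key in env if key.upper() == name and key != name]
            hint = f" (found '{lookalikes[0]}'; names are case-sensitive)" if lookalikes else ""
            problems.append(f"{name} is missing{hint}")
        elif value != value.strip() or value[0] in "\"'" or value[-1] in "\"'":
            problems.append(f"{name} has surrounding quotes or whitespace")
    return problems
```

Running `check_secrets(os.environ)` (after `import os`) at startup and printing the result puts any problems in the Space's **Logs** tab.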

### HuggingFace provider fails with "function calling not supported"
- HuggingFace Inference API has limited tool calling
- Use OpenAI or Anthropic instead

### Nuthatch server not starting (local)
- Check `NUTHATCH_API_KEY` is set in `.env`
- Verify API key is valid
- Try STDIO mode: `NUTHATCH_USE_STDIO=true`

### Audio links broken
- Check that `AUDIO_FINDER_PROMPT` is being applied (Audio Finder mode selected)
- Verify xeno-canto URLs include `/download`
- Check structured output parsing logs
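
The `/download` check can be automated with a small normalizer. This sketch assumes recording links have the form `https://xeno-canto.org/<numeric id>` with an optional `/download` suffix; the function name `normalize_xeno_canto_url` is hypothetical.

```python
import re
from typing import Optional

# Matches a recording page URL, with or without the /download suffix.
XC_RECORDING = re.compile(r"^https?://(?:www\.)?xeno-canto\.org/(\d+)(?:/download)?/?$")


def normalize_xeno_canto_url(url: str) -> Optional[str]:
    """Return the direct download URL, or None if this is not a recording link."""
    match = XC_RECORDING.match(url.strip())
    if match is None:
        return None
    return f"https://xeno-canto.org/{match.group(1)}/download"
```

Returning `None` for anything that is not a recording link makes it easy to log and skip unexpected URLs instead of rewriting them incorrectly.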

---

## 📚 Documentation

For detailed implementation docs, see:
- `project_docs/implementation/phase_5_final.md` - Complete agent architecture
- `project_docs/commands_guide/git_spaces_cheatsheet.md` - Deployment guide

---

## 🏆 Credits

- **Bird Species Data**: [Nuthatch API](https://nuthatch.lastelm.software) by Last Elm Software
- **Bird Audio**: [xeno-canto.org](https://xeno-canto.org) - Community bird recordings
- **Reference Images**: [Unsplash](https://unsplash.com) + curated collections
- **MCP Protocol**: [Anthropic Model Context Protocol](https://github.com/modelcontextprotocol)
- **Hackathon**: [HuggingFace MCP-1st-Birthday](https://huggingface.co/MCP-1st-Birthday)

---

## 📄 License

MIT License - Built for educational and research purposes