MCP-1st-Birthday/BirdScopeAI

facemelter committed 16 days ago

Commit db789ae (verified) · 1 Parent(s): 7246469

Ensuring agentic consistency; fixing structured output errors

Files changed (6)

.gitattributes +1 -0
app.py +62 -8
examples/bird_example_7.jpg +3 -0
langgraph_agent/prompts.py +11 -9
langgraph_agent/structured_output.py +17 -6
langgraph_agent/subagent_config.py +11 -9

.gitattributes CHANGED

@@ -38,3 +38,4 @@ examples/bird_example_1.jpg filter=lfs diff=lfs merge=lfs -text
 examples/bird_example_2.jpg filter=lfs diff=lfs merge=lfs -text
 examples/bird_exmample_4.jpg filter=lfs diff=lfs merge=lfs -text
 examples/bird_example_4.jpg filter=lfs diff=lfs merge=lfs -text
+examples/bird_example_7.jpg filter=lfs diff=lfs merge=lfs -text

app.py CHANGED

@@ -655,22 +655,76 @@ def format_tool_output_for_chat(tool_output):
     """
     Parse tool output and format images/content for display in chatbot.
     Detects image URLs and converts them to markdown image syntax.
     """
     import re
-    output_str = str(tool_output)
-    # Pattern to match image URLs (common image formats)
-    image_pattern = r'(https?://[^\s<>"{}|\\^\[\]`]+\.(?:jpg|jpeg|png|gif|webp|svg)(?:\?[^\s]*)?)'
-    # Find all image URLs
-    image_urls = re.findall(image_pattern, output_str, re.IGNORECASE)
-    if image_urls:
         # Format images as markdown
         formatted_output = ""
-        for url in image_urls[:3]:  # Limit to first 3 images to avoid clutter
             formatted_output += f"![Image]({url})\n\n"
         return formatted_output
     # If no images, return truncated text

     """
     Parse tool output and format images/content for display in chatbot.
     Detects image URLs and converts them to markdown image syntax.
+    Handles both JSON-formatted MCP responses and plain text.
     """
     import re
+    # Extract content from ToolMessage objects (LangGraph wraps outputs in ToolMessage)
+    if hasattr(tool_output, 'content'):
+        output_str = tool_output.content
+        print(f"[FORMAT_TOOL_OUTPUT] Extracted content from ToolMessage")
+    elif isinstance(tool_output, dict) and 'content' in tool_output:
+        output_str = tool_output['content']
+        print(f"[FORMAT_TOOL_OUTPUT] Extracted content from dict")
+    else:
+        output_str = str(tool_output)
+        print(f"[FORMAT_TOOL_OUTPUT] Using str() fallback")
+    image_urls = []
+    # Try to parse as JSON first (MCP tools often return JSON)
+    try:
+        import json
+        parsed = json.loads(output_str)
+        print(f"[FORMAT_TOOL_OUTPUT] Successfully parsed JSON")
+        # Extract URLs from common JSON structures
+        if isinstance(parsed, dict):
+            # Check for "data" field (Nuthatch MCP format)
+            data = parsed.get("data", [])
+            if isinstance(data, list):
+                # data is a list of URLs
+                for item in data:
+                    if isinstance(item, str) and item.startswith("http"):
+                        image_urls.append(item)
+            elif isinstance(data, str) and data.startswith("http"):
+                image_urls.append(data)
+            # Also check for images in nested structures
+            for key, value in parsed.items():
+                if isinstance(value, list):
+                    for item in value:
+                        if isinstance(item, str) and item.startswith("http") and any(ext in item.lower() for ext in ['.jpg', '.jpeg', '.png', '.gif', '.webp', '.svg']):
+                            image_urls.append(item)
+    except (json.JSONDecodeError, ValueError):
+        # Not JSON, fallback to regex extraction
+        pass
+    # Fallback: regex extraction for non-JSON or additional URLs
+    if not image_urls:
+        # Updated pattern: more permissive to catch URLs even with surrounding JSON characters
+        # Match URLs ending in image extensions, allowing any characters before the extension
+        image_pattern = r'https?://[^\s]+?\.(?:jpg|jpeg|png|gif|webp|svg)(?:\?[^\s"]*)?'
+        found_urls = re.findall(image_pattern, output_str, re.IGNORECASE)
+        image_urls.extend(found_urls)
+    # Remove duplicates while preserving order
+    seen = set()
+    unique_urls = []
+    for url in image_urls:
+        # Clean URL (remove trailing quotes, brackets, etc.)
+        clean_url = url.rstrip('",}]')
+        if clean_url not in seen:
+            seen.add(clean_url)
+            unique_urls.append(clean_url)
+    if unique_urls:
         # Format images as markdown
         formatted_output = ""
+        for url in unique_urls[:3]:  # Limit to first 3 images to avoid clutter
             formatted_output += f"![Image]({url})\n\n"
+        print(f"[FORMAT_TOOL_OUTPUT] ✅ Formatted {len(unique_urls[:3])} images as markdown")
         return formatted_output
     # If no images, return truncated text
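
The JSON-first, regex-fallback extraction added in this hunk can be tried in isolation. The following is a minimal sketch, not the code from `app.py`: the function name `extract_image_urls` and the `limit` parameter are hypothetical, and it assumes the tool output is a string whose JSON form may carry a `"data"` field, as the hunk's comments describe. Unlike the hunk, it also catches `TypeError`, which `json.loads` raises for non-string input such as a list, and it deduplicates with `dict.fromkeys` to keep first-seen order.

```python
import json
import re

IMAGE_EXTENSIONS = (".jpg", ".jpeg", ".png", ".gif", ".webp", ".svg")

# Same fallback pattern as the hunk above, compiled once.
IMAGE_URL_RE = re.compile(
    r'https?://[^\s]+?\.(?:jpg|jpeg|png|gif|webp|svg)(?:\?[^\s"]*)?',
    re.IGNORECASE,
)


def extract_image_urls(output_str, limit=3):
    """Collect image URLs from a JSON payload, falling back to regex.

    Hypothetical helper for illustration; not part of the commit.
    """
    urls = []
    try:
        parsed = json.loads(output_str)
    except (TypeError, ValueError):
        # json.JSONDecodeError subclasses ValueError; TypeError covers
        # non-string input such as a list of content blocks.
        parsed = None

    if isinstance(parsed, dict):
        # "data" may be a single URL or a list of URLs.
        data = parsed.get("data", [])
        if isinstance(data, str):
            data = [data]
        if isinstance(data, list):
            urls.extend(
                u for u in data if isinstance(u, str) and u.startswith("http")
            )
        # Any other list value may also hold image URLs.
        for value in parsed.values():
            if isinstance(value, list):
                urls.extend(
                    u for u in value
                    if isinstance(u, str)
                    and u.startswith("http")
                    and any(ext in u.lower() for ext in IMAGE_EXTENSIONS)
                )

    if not urls:
        # Not JSON, or JSON without URLs: scan the raw text instead.
        urls = IMAGE_URL_RE.findall(str(output_str))

    # Strip trailing JSON punctuation, then dedupe in first-seen order.
    unique = list(dict.fromkeys(u.rstrip('",}]') for u in urls))
    return unique[:limit]


if __name__ == "__main__":
    payload = '{"data": ["https://example.com/a.jpg", "https://example.com/b.png"]}'
    for url in extract_image_urls(payload):
        print(f"![Image]({url})")
```

Because the pattern is non-greedy, it stops at the first image extension it finds, so a URL whose host or path contains something like `.png` before the real file name would be cut short. That limitation is shared with the pattern in the hunk.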

examples/bird_example_7.jpg ADDED

Git LFS Details

SHA256: b1fce9ae21320a3fb1aac52ee117d619b3ce182c23d77247aceb0f56d110491a
Pointer size: 131 Bytes
Size of remote file: 551 kB

langgraph_agent/prompts.py CHANGED

@@ -268,20 +268,21 @@ IMAGE_IDENTIFIER_PROMPT_HF = """You are an Image Identification Specialist.
 **Your Job:**
 1. Classify uploaded bird images
 2. Show confidence score
-3. Get bird information
-4. Show reference images
 **Tools:**
 - classify_from_url(url) - Identify bird from image URL
 - classify_from_base64(image) - Identify bird from base64
 - get_bird_info(name) - Get species details
-- get_bird_images(name) - Get reference photos
 **Response Format:**
 1. Bird name (Common and Scientific)
 2. Confidence: X%
 3. Key features
-4. Reference images as: ![Bird](url)
 **CRITICAL - No Hallucination:**
 - If get_bird_images returns empty: Tell user "No reference images available"
@@ -360,11 +361,12 @@ ROUTER_PROMPT_HF = """You are BirdScope AI Supervisor. Route user requests to sp
 **Routing Rules:**
 1. Image uploads → image_identifier
-2. "Search" or "find" or "examples" or "list birds" → generalist
-3. "Audio" or "sound" or "song" → generalist
-4. Species info by name → image_identifier
-5. "Conservation" or "endangered" → taxonomy_specialist
-6. "Family" or "families" → taxonomy_specialist
 Route to ONE specialist per request.

 **Your Job:**
 1. Classify uploaded bird images
 2. Show confidence score
+3. Get bird information using get_bird_info
+4. ALWAYS call get_bird_images to fetch reference photos
+5. Display reference images for the user
 **Tools:**
 - classify_from_url(url) - Identify bird from image URL
 - classify_from_base64(image) - Identify bird from base64
 - get_bird_info(name) - Get species details
+- get_bird_images(name) - ALWAYS call this to get reference photos
 **Response Format:**
 1. Bird name (Common and Scientific)
 2. Confidence: X%
 3. Key features
+4. ALWAYS call get_bird_images and show: ![Bird](url)
 **CRITICAL - No Hallucination:**
 - If get_bird_images returns empty: Tell user "No reference images available"
 **Routing Rules:**
 1. Image uploads → image_identifier
+2. "Show me image" or "picture" or "photo" requests → image_identifier
+3. Species info by name → image_identifier
+4. "Search" or "find" or "examples" or "list birds" → generalist
+5. "Audio" or "sound" or "song" → generalist
+6. "Conservation" or "endangered" → taxonomy_specialist
+7. "Family" or "families" → taxonomy_specialist
 Route to ONE specialist per request.
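
One way to read the reordered routing rules is as a first-match-wins list, where the new image/picture/photo rule now sits ahead of the "search"/"find" rule. The sketch below is a toy, deterministic stand-in for that reading. It is an assumption for illustration only: the actual router is an LLM following the prompt, and the names `ROUTING_RULES` and `route`, the `has_image` flag, and the use of the "species info by name" rule as the default are all hypothetical.

```python
# Keyword rules in the order the updated prompt lists them.
# Hypothetical table; the real router is prompt-driven, not keyword-driven.
ROUTING_RULES = [
    (("show me image", "picture", "photo"), "image_identifier"),
    (("search", "find", "examples", "list birds"), "generalist"),
    (("audio", "sound", "song"), "generalist"),
    (("conservation", "endangered"), "taxonomy_specialist"),
    (("family", "families"), "taxonomy_specialist"),
]


def route(message, has_image=False, default="image_identifier"):
    """Return the first specialist whose keywords appear in the message.

    Image uploads always go to image_identifier (rule 1). Requests that
    match no keyword fall back to `default`, standing in for the
    "species info by name" rule.
    """
    if has_image:
        return "image_identifier"
    text = message.lower()
    for keywords, specialist in ROUTING_RULES:
        if any(keyword in text for keyword in keywords):
            return specialist
    return default


if __name__ == "__main__":
    # "find" and "photo" both appear; the photo rule is checked first.
    print(route("Find a photo of a robin"))
    print(route("Play the song of a wren"))
```

Under this reading, a request such as "Find a photo of a robin" reaches `image_identifier` only because the photo rule precedes the "find" rule, which may be what the reordering in this hunk is meant to achieve.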

langgraph_agent/structured_output.py CHANGED

@@ -37,22 +37,33 @@ def extract_urls_from_text(text: str) -> tuple[List[str], List[str]]:
     """
     Extract image and audio URLs from text using regex.
     Returns:
         tuple: (image_urls, audio_urls)
     """
-    # Pattern for image URLs (jpg, jpeg, png, gif, webp, svg)
-    image_pattern = r'https?://[^\s<>"{}|\\^`\[\]]+\.(?:jpg|jpeg|png|gif|webp|svg)(?:\?[^\s]*)?'
     # Pattern for audio URLs - handles both direct audio files AND xeno-canto links
-    # Matches: .mp3, .wav, .ogg, .m4a files OR xeno-canto.org URLs with /download
-    audio_pattern_files = r'https?://[^\s<>"{}|\\^`\[\]]+\.(?:mp3|wav|ogg|m4a)(?:\?[^\s]*)?'
     audio_pattern_xenocanto = r'https?://xeno-canto\.org/\d+/download'
     # Extract all URLs
-    image_urls = list(set(re.findall(image_pattern, text, re.IGNORECASE)))
-    audio_urls_files = list(set(re.findall(audio_pattern_files, text, re.IGNORECASE)))
     audio_urls_xenocanto = list(set(re.findall(audio_pattern_xenocanto, text, re.IGNORECASE)))
     # Combine both types of audio URLs
     audio_urls = audio_urls_files + audio_urls_xenocanto

     """
     Extract image and audio URLs from text using regex.
+    Updated to handle URLs within markdown, JSON, and plain text.
     Returns:
         tuple: (image_urls, audio_urls)
     """
+    # Updated pattern for image URLs - more permissive to catch URLs in various contexts
+    # Matches URLs ending in image extensions, allowing most characters before the extension
+    # Stops at whitespace or common delimiters like ), ], }
+    image_pattern = r'https?://[^\s)}\]]+?\.(?:jpg|jpeg|png|gif|webp|svg)(?:\?[^\s)}\]]*)?'
     # Pattern for audio URLs - handles both direct audio files AND xeno-canto links
+    # Updated to be more permissive like image pattern
+    audio_pattern_files = r'https?://[^\s)}\]]+?\.(?:mp3|wav|ogg|m4a)(?:\?[^\s)}\]]*)?'
     audio_pattern_xenocanto = r'https?://xeno-canto\.org/\d+/download'
     # Extract all URLs
+    raw_image_urls = re.findall(image_pattern, text, re.IGNORECASE)
+    raw_audio_urls_files = re.findall(audio_pattern_files, text, re.IGNORECASE)
     audio_urls_xenocanto = list(set(re.findall(audio_pattern_xenocanto, text, re.IGNORECASE)))
+    # Clean URLs (remove trailing quotes, commas, etc.)
+    def clean_url(url: str) -> str:
+        return url.rstrip('",;)')
+    image_urls = list(set(clean_url(url) for url in raw_image_urls))
+    audio_urls_files = list(set(clean_url(url) for url in raw_audio_urls_files))
     # Combine both types of audio URLs
     audio_urls = audio_urls_files + audio_urls_xenocanto
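
The updated patterns can be exercised on text that mixes markdown, JSON, and bare URLs. This is a minimal sketch that reuses the three patterns from the hunk verbatim. The function name `extract_urls_ordered` and the helper `_dedupe` are hypothetical, and the sketch deliberately differs from the hunk in one respect: it deduplicates with `dict.fromkeys` rather than `set`, so URLs come back in the order they first appear.

```python
import re
from typing import List, Tuple

# Patterns copied from the hunk above.
IMAGE_RE = re.compile(
    r'https?://[^\s)}\]]+?\.(?:jpg|jpeg|png|gif|webp|svg)(?:\?[^\s)}\]]*)?',
    re.IGNORECASE,
)
AUDIO_FILE_RE = re.compile(
    r'https?://[^\s)}\]]+?\.(?:mp3|wav|ogg|m4a)(?:\?[^\s)}\]]*)?',
    re.IGNORECASE,
)
XENO_CANTO_RE = re.compile(r'https?://xeno-canto\.org/\d+/download', re.IGNORECASE)


def _dedupe(urls: List[str]) -> List[str]:
    """Strip trailing punctuation and drop repeats, keeping first-seen order."""
    return list(dict.fromkeys(url.rstrip('",;)') for url in urls))


def extract_urls_ordered(text: str) -> Tuple[List[str], List[str]]:
    """Return (image_urls, audio_urls); hypothetical order-preserving variant."""
    image_urls = _dedupe(IMAGE_RE.findall(text))
    # Direct audio files first, then xeno-canto links, as in the hunk.
    audio_urls = _dedupe(AUDIO_FILE_RE.findall(text) + XENO_CANTO_RE.findall(text))
    return image_urls, audio_urls


if __name__ == "__main__":
    sample = (
        '![Bird](https://example.com/robin.jpg) '
        '{"url": "https://example.com/robin.jpg"} '
        'song: https://xeno-canto.org/12345/download'
    )
    print(extract_urls_ordered(sample))
```

Stable ordering matters if downstream code shows only the first few URLs, since `set` iteration order can vary between runs for strings.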

langgraph_agent/subagent_config.py CHANGED

@@ -68,14 +68,15 @@ class SubAgentConfig:
 **Your Role:**
 1. Use classification tools to identify birds from uploaded images
 2. Provide accurate species identification with confidence scores
-3. Fetch basic species information (taxonomy, size, status)
-4. Show reference images to help users verify identification
 **Response Style:**
 - Lead with the bird's common name and scientific name
 - Always cite confidence scores from classifier
 - Describe key identifying features visible in the image
-- Show reference images using markdown image syntax: ![Bird Name](image_url)
 - Mention if confidence is low and suggest why
 - Keep responses focused and concise
@@ -222,12 +223,13 @@ Analyze each user request and route it to the MOST appropriate specialist.
 **Routing Guidelines:**
 1. **Image uploads/URLs** → image_identifier (has classification tools)
-2. **"Search"/"find"/"examples"/"list birds"** → generalist (has search_birds tool for database queries)
-3. **"Audio"/"sound"/"song"/"call"/"recording"** → generalist (has audio search and retrieval)
-4. **Species info by name** → image_identifier (has get_bird_info and get_bird_images)
-5. **"Family"/"families" + broad questions** → taxonomy_specialist (has family tools)
-6. **"Conservation"/"endangered"/"threatened"** → taxonomy_specialist (has status filters)
-7. **Taxonomic relationships** → taxonomy_specialist (specializes in classification)
 **Decision-making:**
 - Consider the user's INTENT, not just keywords

 **Your Role:**
 1. Use classification tools to identify birds from uploaded images
 2. Provide accurate species identification with confidence scores
+3. Fetch basic species information (taxonomy, size, status) using get_bird_info
+4. ALWAYS call get_bird_images to fetch reference photos for the identified species
+5. Display reference images to help users verify identification
 **Response Style:**
 - Lead with the bird's common name and scientific name
 - Always cite confidence scores from classifier
 - Describe key identifying features visible in the image
+- ALWAYS call get_bird_images and show reference images using markdown: ![Bird Name](image_url)
 - Mention if confidence is low and suggest why
 - Keep responses focused and concise
 **Routing Guidelines:**
 1. **Image uploads/URLs** → image_identifier (has classification tools)
+2. **"Show me image"/"picture"/"photo" requests** → image_identifier (ONLY agent with get_bird_images tool)
+3. **Species info by name** → image_identifier (has get_bird_info and get_bird_images)
+4. **"Search"/"find"/"examples"/"list birds"** → generalist (has search_birds tool for database queries)
+5. **"Audio"/"sound"/"song"/"call"/"recording"** → generalist (has audio search and retrieval)
+6. **"Family"/"families" + broad questions** → taxonomy_specialist (has family tools)
+7. **"Conservation"/"endangered"/"threatened"** → taxonomy_specialist (has status filters)
+8. **Taxonomic relationships** → taxonomy_specialist (specializes in classification)
 **Decision-making:**
 - Consider the user's INTENT, not just keywords