File size: 2,769 Bytes
77bf197
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
import gradio as gr
import json
from PIL import Image
import numpy as np

# --- User Defined Logic ---
import duckdb
import pandas as pd
import requests
import tempfile
import os

def paris_trees_sql(sql_query: str) -> str:
    """
    Execute a SQL query on the Paris trees dataset using DuckDB.
    
    Downloads the official Paris open data file (les-arbres) as a Parquet file,
    loads it into an in-memory DuckDB database as table 'paris_trees', and
    executes the provided SQL query against it. Results are limited to 1000 rows.
    
    Args:
        sql_query (str): A valid SQL query to run against the paris_trees table.
    
    Returns:
        str: A formatted table string with column headers and up to 1000 rows of results.
             If an error occurs, returns a descriptive error message.
    """
    url = "https://opendata.paris.fr/api/explore/v2.1/catalog/datasets/les-arbres/exports/parquet?lang=fr&timezone=Europe%2FBerlin"
    
    try:
        # Download the Parquet file
        response = requests.get(url, timeout=60)
        response.raise_for_status()
        
        # Save to a temporary file
        with tempfile.NamedTemporaryFile(delete=False, suffix=".parquet") as tmp:
            tmp.write(response.content)
            tmp_path = tmp.name
        
        # Connect to DuckDB in-memory and load the data
        conn = duckdb.connect(":memory:")
        conn.execute(f"CREATE TABLE paris_trees AS SELECT * FROM read_parquet('{tmp_path}')")
        
        # Execute user query with a limit
        limited_query = f"{sql_query.rstrip(';')} LIMIT 1000"
        result_df = conn.execute(limited_query).df()
        
        # Format output as a table string
        if result_df.empty:
            return "Query executed successfully but returned no results."
        
        output = result_df.to_string(index=False)
        return output
    
    except requests.exceptions.RequestException as e:
        return f"Error downloading dataset: {e}"
    except duckdb.CatalogException as e:
        return f"SQL error (possibly invalid table/column names): {e}"
    except duckdb.ParserException as e:
        return f"SQL syntax error: {e}"
    except Exception as e:
        return f"Unexpected error: {e}"
    finally:
        # Clean up temporary file
        if 'tmp_path' in locals() and os.path.exists(tmp_path):
            os.remove(tmp_path)

# --- Interface Factory ---
def create_interface():
    return gr.Interface(
        fn=paris_trees_sql,
        inputs=[gr.Textbox(label=k) for k in ['sql_query']],
        outputs=gr.Textbox(label="Query results formatted as a table (max 1000 rows) or an error message"),
        title="paris_trees_sql",
        description="Auto-generated tool: paris_trees_sql"
    )