# RAG Pipeline API Documentation

## Overview
A FastAPI-based RAG (Retrieval-Augmented Generation) pipeline that uses a GLM model through OpenRouter to decide when to call the RAG tool.

## Base URL
```
http://localhost:8000
```

## Endpoints

### `/chat` - Main Chat Endpoint
**Method:** `POST`  
**Description:** Intelligent chat with RAG tool calling. GLM automatically determines when to use RAG vs. general conversation.

#### Request Body
```json
{
  "messages": [
    {
      "role": "user|assistant|system",
      "content": "string"
    }
  ]
}
```

#### Response Format
```json
{
  "response": "string",
  "tool_calls": [
    {
      "name": "rag_qa",
      "arguments": "{\"question\": \"string\", \"dataset\": \"string\"}"
    }
  ] | null
}
```
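
In the format above, `tool_calls` is `null` when no tool was used, and each `arguments` value is a JSON-encoded string rather than a nested object, so clients need a second decoding step. The following is a minimal sketch of that step; the helper name `decode_tool_calls` is hypothetical and not part of the API.

```python
import json
from typing import Any


def decode_tool_calls(payload: dict[str, Any]) -> list[dict[str, Any]]:
    """Return tool calls with their JSON-encoded arguments parsed into dicts."""
    decoded = []
    # `tool_calls` is null (None in Python) when the model answered without RAG.
    for call in payload.get("tool_calls") or []:
        decoded.append(
            {"name": call["name"], "arguments": json.loads(call["arguments"])}
        )
    return decoded


# Sample payload shaped like the response format documented above.
sample = {
    "response": "string",
    "tool_calls": [
        {"name": "rag_qa", "arguments": "{\"question\": \"What is your current role?\"}"}
    ],
}
print(decode_tool_calls(sample))
print(decode_tool_calls({"response": "Hi!", "tool_calls": None}))
```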

#### Examples

**1. General Greeting (No RAG):**
```bash
curl -X POST http://localhost:8000/chat \
  -H "Content-Type: application/json" \
  -d '{"messages":[{"role":"user","content":"hi"}]}'
```

**Response:**
```json
{
  "response": "Hi! I'm Rohit's AI assistant. I can help you learn about his professional background, skills, and experience. What would you like to know about Rohit?",
  "tool_calls": null
}
```

**2. Portfolio Question (RAG Enabled):**
```bash
curl -X POST http://localhost:8000/chat \
  -H "Content-Type: application/json" \
  -d '{"messages":[{"role":"user","content":"What is your current role?"}]}'
```

**Response:**
```json
{
  "response": "Based on the portfolio information, Rohit is currently working as a Tech Lead at FleetEnable, where he leads UI development for a logistics SaaS product focused on drayage and freight management...",
  "tool_calls": [
    {
      "name": "rag_qa",
      "arguments": "{\"question\": \"What is your current role?\"}"
    }
  ]
}
```
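
The same requests can be sent from Python. This sketch uses only the standard library and mirrors the `curl` calls above; the function names `build_chat_request` and `chat` are illustrative assumptions, and the 60-second timeout is a guess chosen to leave room for a slow first RAG query.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8000"  # Base URL from this document


def build_chat_request(content: str, base_url: str = BASE_URL) -> urllib.request.Request:
    """Build a POST /chat request carrying a single user message."""
    body = json.dumps({"messages": [{"role": "user", "content": content}]})
    return urllib.request.Request(
        f"{base_url}/chat",
        data=body.encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def chat(content: str, timeout: float = 60.0) -> dict:
    """Send the request and return the parsed JSON body.

    Requires the server to be running at BASE_URL.
    """
    with urllib.request.urlopen(build_chat_request(content), timeout=timeout) as resp:
        return json.load(resp)
```

With the server running, `chat("What is your current role?")` should return a dictionary with the `response` and `tool_calls` keys shown above.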

### `/health` - Health Check
**Method:** `GET`  
**Description:** Check API and dataset loading status.

#### Response
```json
{
  "status": "healthy",
  "datasets_loaded": 1,
  "available_datasets": ["developer-portfolio"]
}
```
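
A client can use this response to decide whether a given dataset is usable before sending RAG questions. The check below is a sketch that assumes the fields shown above are always present in this shape; `dataset_ready` is a hypothetical helper, not part of the API.

```python
def dataset_ready(health: dict, dataset: str = "developer-portfolio") -> bool:
    """Return True when the API reports healthy and lists the dataset."""
    return (
        health.get("status") == "healthy"
        and dataset in health.get("available_datasets", [])
    )


# Sample payload copied from the documented /health response.
health = {
    "status": "healthy",
    "datasets_loaded": 1,
    "available_datasets": ["developer-portfolio"],
}
print(dataset_ready(health))
```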

### `/datasets` - List Available Datasets
**Method:** `GET`  
**Description:** Get list of available datasets.

#### Response
```json
{
  "datasets": ["developer-portfolio"]
}
```

## Features

### 🧠 Intelligent Tool Calling
- **Automatic Detection:** GLM determines when questions need RAG vs. general conversation
- **Context-Aware:** Uses portfolio information for relevant questions
- **Natural Responses:** Synthesizes RAG results into conversational answers

### 🎯 Third-Person AI Assistant
- **Portfolio Focus:** Responds about Rohit's experience (not "my" experience)
- **Professional Tone:** Maintains proper third-person references
- **Context Integration:** Combines multiple data points coherently

### ⚡ Performance Optimizations
- **On-Demand Loading:** Datasets load only when RAG is needed
- **Clean Output:** No verbose ML logging for general conversations
- **Fast Responses:** Sub-second for greetings; about 20 s for the first RAG query, which has to load the dataset on demand
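
On-demand loading can be sketched as a small cache that runs the loader on first access only. This is an illustration of the general pattern, not the project's actual code: `LazyDatasets` and `fake_loader` are hypothetical names, and the fake loader simply returns 19 placeholder strings.

```python
from typing import Callable


class LazyDatasets:
    """Load each dataset on first access and reuse it afterwards."""

    def __init__(self, loader: Callable[[str], list[str]]) -> None:
        self._loader = loader
        self._cache: dict[str, list[str]] = {}

    def get(self, name: str) -> list[str]:
        if name not in self._cache:
            # Slow path: taken only the first time a dataset is requested.
            self._cache[name] = self._loader(name)
        return self._cache[name]


load_calls: list[str] = []


def fake_loader(name: str) -> list[str]:
    """Stand-in for an expensive load (model and index setup in a real app)."""
    load_calls.append(name)
    return [f"{name}-doc-{i}" for i in range(19)]


store = LazyDatasets(fake_loader)
store.get("developer-portfolio")  # triggers the load
store.get("developer-portfolio")  # served from the cache
print(len(load_calls))  # prints 1
```

Greetings never call `get`, so they skip the load entirely; only the first RAG query pays the loading cost.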

## Available Datasets

### `developer-portfolio`
- **Content:** Work experience, skills, projects, achievements
- **Topics:** FleetEnable, Coditude, technologies, leadership
- **Size:** 19 documents with full metadata

## Error Handling

### Common Responses
- **Datasets Loading:** "RAG Pipeline is running but datasets are still loading..."
- **Dataset Not Found:** "Dataset 'xyz' not available. Available datasets: [...]"
- **API Errors:** HTTP 500 with error details

### Status Codes
- `200` - Success
- `400` - Bad Request (invalid JSON, missing fields)
- `500` - Internal Server Error
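
To avoid `400` responses, a client can check the request body before sending it. The sketch below covers the two causes listed above (invalid JSON and missing fields) plus the role values from the request schema; `validate_chat_body` is a hypothetical helper, and the server's own validation rules may differ.

```python
import json

ALLOWED_ROLES = ("user", "assistant", "system")  # roles from the request schema


def validate_chat_body(raw: str) -> list[str]:
    """Return a list of problems; an empty list means the body looks valid."""
    try:
        body = json.loads(raw)
    except json.JSONDecodeError:
        return ["invalid JSON"]
    if not isinstance(body, dict) or not isinstance(body.get("messages"), list):
        return ["missing field: messages"]
    problems = []
    for i, msg in enumerate(body["messages"]):
        if not isinstance(msg, dict):
            problems.append(f"messages[{i}] is not an object")
            continue
        if msg.get("role") not in ALLOWED_ROLES:
            problems.append(f"messages[{i}].role is invalid")
        if not isinstance(msg.get("content"), str):
            problems.append(f"messages[{i}].content is missing")
    return problems


print(validate_chat_body('{"messages":[{"role":"user","content":"hi"}]}'))
print(validate_chat_body('{"messages":[{"role":"bot"}]}'))
```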

## Environment Variables

Create a `.env` file:
```bash
OPENROUTER_API_KEY=sk-or-v1-your-key-here
PORT=8000
TOKENIZERS_PARALLELISM=false
```
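
One way an application might read these values at startup is sketched below. It reads from the process environment and does not parse the `.env` file itself, so it assumes the variables have already been exported or loaded by some other means. The `Settings` class, `load_settings` function, and the fallback defaults are hypothetical.

```python
import os
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class Settings:
    openrouter_api_key: str
    port: int
    tokenizers_parallelism: str


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Build settings from an environment mapping, failing fast on a missing key."""
    key = env.get("OPENROUTER_API_KEY", "")
    if not key:
        raise RuntimeError("OPENROUTER_API_KEY is not set")
    return Settings(
        openrouter_api_key=key,
        port=int(env.get("PORT", "8000")),  # assumed default, matching the docs
        tokenizers_parallelism=env.get("TOKENIZERS_PARALLELISM", "false"),
    )


settings = load_settings({"OPENROUTER_API_KEY": "sk-or-v1-your-key-here", "PORT": "8000"})
print(settings.port)
```

Passing the mapping as a parameter keeps the function easy to test without touching the real environment.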

## Development

### Running Locally
```bash
# Install dependencies
pip install -r requirements.txt

# Start server
python -m uvicorn app.main:app --host 0.0.0.0 --port 8000 --reload

# Or use script
./start.sh
```

### Testing
```bash
# Health check
curl http://localhost:8000/health

# Chat test
curl -X POST http://localhost:8000/chat \
  -H "Content-Type: application/json" \
  -d '{"messages":[{"role":"user","content":"hi"}]}'
```

## Deployment

### Docker
```bash
# Build
docker build -t rag-pipeline .

# Run
docker run -p 8000:8000 rag-pipeline
```

### Hugging Face Spaces
1. Push code to repository
2. Connect Space to repository
3. Set environment variables in Space settings
4. Let the Space deploy automatically from the `main` branch

## Architecture

```
OpenRouter GLM-4.5-air (Parent AI)
├── Tool Calling Logic
│   ├── Automatically detects RAG-worthy questions
│   └── Falls back to general knowledge
├── RAG Tool Function
│   ├── Dataset selection (developer-portfolio)
│   ├── Document retrieval
│   └── Context formatting
└── Response Generation
    ├── Tool results integration
    └── Natural language responses
```
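
The flow in the diagram can be sketched as a single chat turn: ask the model, run any tool it requests, then ask again with the tool result. Everything here is an assumption for illustration: `stub_model` stands in for the OpenRouter call, `rag_qa` returns a placeholder instead of retrieved documents, and the message shapes are simplified rather than taken from the real implementation.

```python
import json
from typing import Callable


def rag_qa(question: str, dataset: str = "developer-portfolio") -> str:
    """Stand-in for the RAG tool; a real one would retrieve and format documents."""
    return f"[context from {dataset} for: {question}]"


TOOLS: dict[str, Callable[..., str]] = {"rag_qa": rag_qa}


def run_turn(model: Callable[[list[dict]], dict], messages: list[dict]) -> dict:
    """One chat turn: query the model, run requested tools, then query again."""
    first = model(messages)
    calls = first.get("tool_calls")
    if not calls:
        # General conversation: no tool was requested.
        return {"response": first["content"], "tool_calls": None}
    followup = list(messages)
    for call in calls:
        args = json.loads(call["arguments"])  # arguments arrive as a JSON string
        followup.append({"role": "tool", "content": TOOLS[call["name"]](**args)})
    final = model(followup)
    return {"response": final["content"], "tool_calls": calls}


def stub_model(messages: list[dict]) -> dict:
    """Fake model: requests rag_qa for questions, answers directly otherwise."""
    last = messages[-1]
    if last["role"] == "tool":
        return {"content": f"Answer based on {last['content']}"}
    if last["content"].endswith("?"):
        arguments = json.dumps({"question": last["content"]})
        return {"content": "", "tool_calls": [{"name": "rag_qa", "arguments": arguments}]}
    return {"content": "Hi!"}


print(run_turn(stub_model, [{"role": "user", "content": "hi"}]))
print(run_turn(stub_model, [{"role": "user", "content": "What is your current role?"}]))
```

The returned dictionary has the same `response` and `tool_calls` keys as the `/chat` response, which keeps the sketch easy to compare against the documented examples.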

## Changelog

### v2.0 - Current
- ✅ OpenRouter GLM integration with tool calling
- ✅ Intelligent RAG vs. conversation detection
- ✅ Third-person AI assistant for Rohit's portfolio
- ✅ On-demand dataset loading
- ✅ Removed `/answer` endpoint (use `/chat` only)
- ✅ Environment variable configuration
- ✅ Performance optimizations

### v1.0 - Legacy
- Google Gemini integration
- Multiple endpoints (`/answer`, `/chat`)
- Background dataset loading
- First-person responses