# RAG Pipeline API Documentation

## Overview

FastAPI-based RAG (Retrieval-Augmented Generation) pipeline with OpenRouter GLM integration for intelligent tool calling.

## Base URL

```
http://localhost:8000
```

## Endpoints
### `/chat` - Main Chat Endpoint

**Method:** `POST`

**Description:** Intelligent chat with RAG tool calling. GLM automatically determines when to use RAG vs. general conversation.

#### Request Body

```json
{
  "messages": [
    {
      "role": "user|assistant|system",
      "content": "string"
    }
  ]
}
```
#### Response Format

```json
{
  "response": "string",
  "tool_calls": [
    {
      "name": "rag_qa",
      "arguments": "{\"question\": \"string\", \"dataset\": \"string\"}"
    }
  ]
}
```

`tool_calls` is `null` when the model answers without invoking the RAG tool. Note that `arguments` is a JSON-encoded string, not a nested object.
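Because `arguments` arrives as a JSON string rather than an object, clients must decode it a second time. A minimal parsing sketch (the sample response below is illustrative):

```python
import json

raw = '''{
  "response": "Rohit is currently a Tech Lead at FleetEnable...",
  "tool_calls": [
    {"name": "rag_qa", "arguments": "{\\"question\\": \\"What is your current role?\\"}"}
  ]
}'''

data = json.loads(raw)
if data["tool_calls"]:
    for call in data["tool_calls"]:
        # Second decode: `arguments` is itself a JSON-encoded string.
        args = json.loads(call["arguments"])
        print(call["name"], args["question"])
```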
#### Examples

**1. General Greeting (No RAG):**

```bash
curl -X POST http://localhost:8000/chat \
  -H "Content-Type: application/json" \
  -d '{"messages":[{"role":"user","content":"hi"}]}'
```

**Response:**

```json
{
  "response": "Hi! I'm Rohit's AI assistant. I can help you learn about his professional background, skills, and experience. What would you like to know about Rohit?",
  "tool_calls": null
}
```

**2. Portfolio Question (RAG Enabled):**

```bash
curl -X POST http://localhost:8000/chat \
  -H "Content-Type: application/json" \
  -d '{"messages":[{"role":"user","content":"What is your current role?"}]}'
```

**Response:**

```json
{
  "response": "Based on the portfolio information, Rohit is currently working as a Tech Lead at FleetEnable, where he leads UI development for a logistics SaaS product focused on drayage and freight management...",
  "tool_calls": [
    {
      "name": "rag_qa",
      "arguments": "{\"question\": \"What is your current role?\"}"
    }
  ]
}
```
### `/health` - Health Check

**Method:** `GET`

**Description:** Check API and dataset loading status.

#### Response

```json
{
  "status": "healthy",
  "datasets_loaded": 1,
  "available_datasets": ["developer-portfolio"]
}
```

### `/datasets` - List Available Datasets

**Method:** `GET`

**Description:** Get the list of available datasets.

#### Response

```json
{
  "datasets": ["developer-portfolio"]
}
```
## Features

### Intelligent Tool Calling

- **Automatic Detection:** GLM determines when questions need RAG vs. general conversation
- **Context-Aware:** Uses portfolio information for relevant questions
- **Natural Responses:** Synthesizes RAG results into conversational answers

### Third-Person AI Assistant

- **Portfolio Focus:** Responds about Rohit's experience (not "my" experience)
- **Professional Tone:** Maintains proper third-person references
- **Context Integration:** Combines multiple data points coherently

### Performance Optimizations

- **On-Demand Loading:** Datasets load only when RAG is needed
- **Clean Output:** No verbose ML logging for general conversations
- **Fast Responses:** Sub-second for greetings, ~20s for first RAG query
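The on-demand loading described above can be sketched as a cached loader: the expensive load runs only on first request, and later calls hit the cache. `get_dataset` is a stand-in for whatever the pipeline actually uses:

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def get_dataset(name: str):
    """Load a dataset the first time it is requested, then serve it from cache."""
    # Stand-in for real work (reading documents, computing embeddings, etc.).
    print(f"loading {name}...")  # verbose work happens only once
    return {"name": name, "documents": [f"doc-{i}" for i in range(19)]}

ds1 = get_dataset("developer-portfolio")  # slow path: loads the dataset
ds2 = get_dataset("developer-portfolio")  # fast path: served from cache
```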
## Available Datasets

### `developer-portfolio`

- **Content:** Work experience, skills, projects, achievements
- **Topics:** FleetEnable, Coditude, technologies, leadership
- **Size:** 19 documents with full metadata
## Error Handling

### Common Responses

- **Datasets Loading:** "RAG Pipeline is running but datasets are still loading..."
- **Dataset Not Found:** "Dataset 'xyz' not available. Available datasets: [...]"
- **API Errors:** HTTP 500 with error details

### Status Codes

- `200` - Success
- `400` - Bad Request (invalid JSON, missing fields)
- `500` - Internal Server Error
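A client can branch on these codes; a minimal sketch covering only the statuses documented above (`handle_response` is a hypothetical client helper):

```python
def handle_response(status: int, body: dict) -> str:
    """Map documented /chat status codes to a client-side outcome."""
    if status == 200:
        return body.get("response", "")
    if status == 400:
        raise ValueError(f"bad request: {body}")
    if status == 500:
        raise RuntimeError(f"server error: {body}")
    raise RuntimeError(f"unexpected status {status}")

reply = handle_response(200, {"response": "Hi!", "tool_calls": None})
```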
## Environment Variables

Create a `.env` file:

```bash
OPENROUTER_API_KEY=sk-or-v1-your-key-here
PORT=8000
TOKENIZERS_PARALLELISM=false
```
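At startup the application presumably reads these variables; a stdlib-only sketch that fails fast when the API key is missing (the key value and `load_config` helper are illustrative, and the defaults mirror the `.env` above):

```python
import os

def load_config() -> dict:
    """Read required and optional settings from the environment."""
    key = os.environ.get("OPENROUTER_API_KEY")
    if not key:
        raise RuntimeError("OPENROUTER_API_KEY is required")
    return {
        "api_key": key,
        "port": int(os.environ.get("PORT", "8000")),  # default matches the docs
    }

os.environ.setdefault("OPENROUTER_API_KEY", "sk-or-v1-example")  # demo only
config = load_config()
```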
## Development

### Running Locally

```bash
# Install dependencies
pip install -r requirements.txt

# Start server
python -m uvicorn app.main:app --host 0.0.0.0 --port 8000 --reload

# Or use script
./start.sh
```

### Testing

```bash
# Health check
curl http://localhost:8000/health

# Chat test
curl -X POST http://localhost:8000/chat \
  -H "Content-Type: application/json" \
  -d '{"messages":[{"role":"user","content":"hi"}]}'
```
## Deployment

### Docker

```bash
# Build
docker build -t rag-pipeline .

# Run
docker run -p 8000:8000 rag-pipeline
```

### Hugging Face Spaces

1. Push code to repository
2. Connect Space to repository
3. Set environment variables in Space settings
4. Automatic deployment from `main` branch
## Architecture

```
OpenRouter GLM-4.5-air (Parent AI)
├── Tool Calling Logic
│   ├── Automatically detects RAG-worthy questions
│   └── Falls back to general knowledge
├── RAG Tool Function
│   ├── Dataset selection (developer-portfolio)
│   ├── Document retrieval
│   └── Context formatting
└── Response Generation
    ├── Tool results integration
    └── Natural language responses
```
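The flow above can be sketched as a single dispatch step: if the model's reply carries tool calls, decode each one and run the RAG tool; otherwise pass the answer through. `run_rag` and `chat_turn` are hypothetical stand-ins, not the pipeline's actual functions:

```python
import json

def run_rag(question: str, dataset: str = "developer-portfolio") -> str:
    # Stand-in retrieval: real code would search the dataset's documents.
    return f"[context for {question!r} from {dataset}]"

def chat_turn(model_reply: dict) -> dict:
    """Dispatch one model reply: execute tool calls if present, else pass through."""
    calls = model_reply.get("tool_calls")
    if not calls:
        return {"response": model_reply["response"], "tool_calls": None}
    results = []
    for call in calls:
        args = json.loads(call["arguments"])  # arguments arrive as a JSON string
        results.append(run_rag(**args))
    # Real code would send `results` back to the model for final synthesis.
    return {"response": " ".join(results), "tool_calls": calls}

out = chat_turn({"response": "", "tool_calls": [
    {"name": "rag_qa", "arguments": '{"question": "current role?"}'}
]})
```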
## Changelog

### v2.0 - Current

- OpenRouter GLM integration with tool calling
- Intelligent RAG vs. conversation detection
- Third-person AI assistant for Rohit's portfolio
- On-demand dataset loading
- Removed `/answer` endpoint (use `/chat` only)
- Environment variable configuration
- Performance optimizations

### v1.0 - Legacy

- Google Gemini integration
- Multiple endpoints (`/answer`, `/chat`)
- Background dataset loading
- First-person responses