# Implementation Roadmap: DeepCritical (Vertical Slices)
**Philosophy:** AI-Native Engineering, Vertical Slice Architecture, TDD, Modern Tooling (2025).
This roadmap defines the execution strategy to deliver **DeepCritical** effectively. We reject "overplanning" in favor of **ironclad, testable vertical slices**. Each phase delivers a fully functional slice of end-to-end value.
---
## The 2025 "Gucci" Tooling Stack
We are using the bleeding edge of Python engineering to ensure speed, safety, and developer joy.
| Category | Tool | Why? |
|----------|------|------|
| **Package Manager** | **`uv`** | Rust-based; advertised as 10-100x faster than pip. Manages Python versions, venvs, and deps. |
| **Linting/Format** | **`ruff`** | Rust-based, near-instant. Replaces Black, isort, and Flake8. |
| **Type Checking** | **`mypy`** | Strict static typing. Run via `uv run mypy`. |
| **Testing** | **`pytest`** | The standard. |
| **Test Plugins** | **`pytest-sugar`** | Instant feedback, progress bars. "Gucci" visuals. |
| **Test Plugins** | **`pytest-asyncio`** | Essential for our async agent loop. |
| **Test Plugins** | **`pytest-cov`** | Coverage reporting to ensure TDD adherence. |
| **Git Hooks** | **`pre-commit`** | Enforce ruff/mypy before commit. |
---
## Architecture: Vertical Slices
Instead of horizontal layers (e.g., "Building the Database Layer"), we build **Vertical Slices**.
Each slice implements a feature from **Entry Point (UI/API) -> Logic -> Data/External**.
### Directory Structure (Maintainer's Structure)
```bash
src/
├── app.py                      # Entry point (Gradio UI)
├── orchestrator.py             # Agent loop (Search -> Judge -> Loop)
├── agent_factory/              # Agent creation and judges
│   ├── __init__.py
│   ├── agents.py               # PydanticAI agent definitions
│   └── judges.py               # JudgeHandler for evidence assessment
├── tools/                      # Search tools
│   ├── __init__.py
│   ├── pubmed.py               # PubMed E-utilities tool
│   ├── websearch.py            # DuckDuckGo search tool
│   └── search_handler.py       # Orchestrates multiple tools
├── prompts/                    # Prompt templates
│   ├── __init__.py
│   └── judge.py                # Judge prompts
├── utils/                      # Shared utilities
│   ├── __init__.py
│   ├── config.py               # Settings/configuration
│   ├── exceptions.py           # Custom exceptions
│   ├── models.py               # Shared Pydantic models
│   ├── dataloaders.py          # Data loading utilities
│   └── parsers.py              # Parsing utilities
├── middleware/                 # (Future: middleware components)
├── database_services/          # (Future: database integrations)
└── retrieval_factory/          # (Future: RAG components)

tests/
├── unit/
│   ├── tools/
│   │   ├── test_pubmed.py
│   │   ├── test_websearch.py
│   │   └── test_search_handler.py
│   ├── agent_factory/
│   │   └── test_judges.py
│   └── test_orchestrator.py
└── integration/
    └── test_pubmed_live.py
```
---
## Phased Execution Plan
### **Phase 1: Foundation & Tooling (Day 1)**
*Goal: A rock-solid, CI-ready environment with `uv` and `pytest` configured.*
- [ ] Initialize `pyproject.toml` with `uv`.
- [ ] Configure `ruff` (strict) and `mypy` (strict).
- [ ] Set up `pytest` with sugar and coverage.
- [ ] Implement `src/utils/config.py` (Configuration Slice).
- [ ] Implement `src/utils/exceptions.py` (Custom exceptions).
- **Deliverable**: A repo that passes CI with `uv run pytest`.
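
As a rough illustration of the configuration and exception items above, the sketch below shows one way `src/utils/config.py` and `src/utils/exceptions.py` could fit together. It uses stdlib dataclasses to stay dependency-free. The class names (`Settings`, `DeepCriticalError`, `ConfigurationError`, `SearchError`), the field names, and the `DC_*` environment variables are all assumptions made for this example, not names defined by the roadmap.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


class DeepCriticalError(Exception):
    """Base class for project errors (hypothetical name)."""


class ConfigurationError(DeepCriticalError):
    """Raised when a setting is missing or malformed."""


class SearchError(DeepCriticalError):
    """Raised when a search tool fails."""


@dataclass(frozen=True)
class Settings:
    """Runtime settings; fields and defaults are illustrative assumptions."""

    max_iterations: int = 5
    max_results_per_tool: int = 10

    @classmethod
    def from_env(cls, env: Optional[Mapping[str, str]] = None) -> "Settings":
        # Accept an explicit mapping so tests never touch the real environment.
        source = os.environ if env is None else env
        try:
            settings = cls(
                max_iterations=int(source.get("DC_MAX_ITERATIONS", "5")),
                max_results_per_tool=int(source.get("DC_MAX_RESULTS", "10")),
            )
        except ValueError as exc:
            raise ConfigurationError(f"Invalid numeric setting: {exc}") from exc
        if settings.max_iterations < 1 or settings.max_results_per_tool < 1:
            raise ConfigurationError("Settings must be positive integers")
        return settings
```

Passing the environment as a plain mapping keeps the first TDD test trivial to write: a unit test can call `Settings.from_env({"DC_MAX_ITERATIONS": "3"})` without monkeypatching.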
### **Phase 2: The "Search" Vertical Slice (Day 2)**
*Goal: Agent can receive a query and get raw results from PubMed/Web.*
- [ ] **TDD**: Write test for `SearchHandler`.
- [ ] Implement `src/tools/pubmed.py` (PubMed E-utilities).
- [ ] Implement `src/tools/websearch.py` (DuckDuckGo).
- [ ] Implement `src/tools/search_handler.py` (Orchestrates tools).
- [ ] Implement `src/utils/models.py` (Evidence, Citation, SearchResult).
- **Deliverable**: Function that takes "long covid" -> returns `List[Evidence]`.
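
The search slice above could be sketched roughly as follows. This is a minimal, network-free illustration: the roadmap calls for Pydantic models, but dataclasses are used here to keep the example self-contained. The field names, the `SearchTool` protocol, the `execute` method, and the `StubTool` stand-in are all assumptions. A real `pubmed.py` or `websearch.py` would replace the stub.

```python
import asyncio
from dataclasses import dataclass, field
from typing import Protocol


@dataclass(frozen=True)
class Citation:
    source: str  # e.g. "pubmed" or "web"
    title: str
    url: str


@dataclass(frozen=True)
class Evidence:
    content: str
    citation: Citation


@dataclass
class SearchResult:
    query: str
    evidence: list[Evidence] = field(default_factory=list)
    errors: list[str] = field(default_factory=list)


class SearchTool(Protocol):
    """Hypothetical interface each search tool would satisfy."""

    name: str

    async def search(self, query: str, max_results: int) -> list[Evidence]: ...


class SearchHandler:
    """Runs every tool concurrently and merges results; one failure does not sink the rest."""

    def __init__(self, tools: list[SearchTool]) -> None:
        self.tools = tools

    async def execute(self, query: str, max_results: int = 10) -> SearchResult:
        outcomes = await asyncio.gather(
            *(tool.search(query, max_results) for tool in self.tools),
            return_exceptions=True,
        )
        result = SearchResult(query=query)
        for tool, outcome in zip(self.tools, outcomes):
            if isinstance(outcome, BaseException):
                result.errors.append(f"{tool.name}: {outcome}")
            else:
                result.evidence.extend(outcome)
        return result


class StubTool:
    """Test double: returns canned evidence and never touches the network."""

    def __init__(self, name: str, fail: bool = False) -> None:
        self.name = name
        self.fail = fail

    async def search(self, query: str, max_results: int) -> list[Evidence]:
        if self.fail:
            raise RuntimeError("unavailable")
        citation = Citation(self.name, f"Stub result for {query}", "https://example.invalid/1")
        return [Evidence(content=f"{self.name} finding about {query}", citation=citation)]
```

In a TDD flow, the first test would build `SearchHandler([StubTool("pubmed"), StubTool("web", fail=True)])`, run `execute("long covid")`, and assert that one piece of evidence and one error come back.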
### **Phase 3: The "Judge" Vertical Slice (Day 3)**
*Goal: Agent can decide if evidence is sufficient.*
- [ ] **TDD**: Write test for `JudgeHandler` (Mocked LLM).
- [ ] Implement `src/prompts/judge.py` (Structured outputs).
- [ ] Implement `src/agent_factory/judges.py` (LLM interaction).
- **Deliverable**: Function that takes `List[Evidence]` -> returns `JudgeAssessment`.
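
A judge with a mocked LLM, as described above, might look something like the sketch below. The LLM is injected as a plain async callable so tests can swap in a fake. The `JudgeAssessment` fields, the `assess` method, the JSON reply shape, and the prompt wording are assumptions for illustration. The roadmap's actual implementation is expected to use PydanticAI structured outputs, which this example does not attempt to reproduce.

```python
import asyncio
import json
from dataclasses import dataclass, field
from typing import Awaitable, Callable


@dataclass(frozen=True)
class Evidence:
    content: str
    source: str


@dataclass(frozen=True)
class JudgeAssessment:
    sufficient: bool
    reasoning: str
    next_search_queries: list[str] = field(default_factory=list)


# Hypothetical LLM interface: prompt in, JSON string out.
LLMCall = Callable[[str], Awaitable[str]]


class JudgeHandler:
    def __init__(self, llm: LLMCall) -> None:
        self.llm = llm

    async def assess(self, question: str, evidence: list[Evidence]) -> JudgeAssessment:
        # Short-circuit: nothing to judge, so ask for another search round.
        if not evidence:
            return JudgeAssessment(False, "No evidence collected yet.", [question])
        bullet_list = "\n".join(f"- [{e.source}] {e.content}" for e in evidence)
        prompt = (
            f"Question: {question}\nEvidence:\n{bullet_list}\n"
            'Reply as JSON with keys "sufficient", "reasoning", "next_search_queries".'
        )
        data = json.loads(await self.llm(prompt))
        return JudgeAssessment(
            sufficient=bool(data["sufficient"]),
            reasoning=str(data["reasoning"]),
            next_search_queries=list(data.get("next_search_queries", [])),
        )


async def fake_llm(prompt: str) -> str:
    """Mocked LLM: always declares the evidence sufficient."""
    return json.dumps({"sufficient": True, "reasoning": "Stubbed verdict."})
```

Injecting the LLM call keeps the unit test deterministic and free of API keys. The live model only needs to appear in integration tests.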
### **Phase 4: The "Loop" & UI Slice (Day 4)**
*Goal: End-to-End User Value.*
- [ ] Implement `src/orchestrator.py` (Connects Search + Judge loops).
- [ ] Build `src/app.py` (Gradio with Streaming).
- **Deliverable**: A working DeepCritical agent running on Hugging Face.
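
The Search -> Judge -> Loop flow that `src/orchestrator.py` connects can be sketched as an async generator, which also suits a streaming UI because each yielded string is a progress event. Everything here is an assumption for illustration: the `run_loop` name, the callable signatures, the event strings, and the stub functions. Evidence is reduced to plain strings to keep the example short.

```python
import asyncio
from typing import AsyncIterator, Awaitable, Callable

# Hypothetical signatures: search returns evidence strings; judge returns
# (sufficient, follow-up queries).
SearchFn = Callable[[str], Awaitable[list[str]]]
JudgeFn = Callable[[str, list[str]], Awaitable[tuple[bool, list[str]]]]


async def run_loop(
    question: str,
    search: SearchFn,
    judge: JudgeFn,
    max_iterations: int = 3,
) -> AsyncIterator[str]:
    """Search, judge, and repeat until the judge is satisfied or the budget runs out."""
    evidence: list[str] = []
    queries = [question]
    for iteration in range(1, max_iterations + 1):
        yield f"iteration {iteration}: searching {queries}"
        batches = await asyncio.gather(*(search(q) for q in queries))
        for batch in batches:
            evidence.extend(batch)
        sufficient, next_queries = await judge(question, evidence)
        if sufficient:
            yield f"done: {len(evidence)} evidence items"
            return
        # Fall back to the original question if the judge suggests nothing.
        queries = next_queries or [question]
    yield f"stopped: budget exhausted with {len(evidence)} evidence items"


async def stub_search(query: str) -> list[str]:
    return [f"result for {query}"]


async def stub_judge(question: str, evidence: list[str]) -> tuple[bool, list[str]]:
    # Pretend two pieces of evidence are enough.
    return len(evidence) >= 2, [f"{question} treatment"]


async def collect_events(question: str) -> list[str]:
    return [event async for event in run_loop(question, stub_search, stub_judge)]
```

Capping the loop with `max_iterations` is a deliberate design choice in this sketch: it bounds cost and latency even if the judge never reports sufficiency.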
---
### **Phase 5: Magentic Integration (OPTIONAL - Post-MVP)**
*Goal: Upgrade orchestrator to use Microsoft Agent Framework patterns.*
- [ ] Wrap SearchHandler as `AgentProtocol` (SearchAgent) with strict protocol compliance.
- [ ] Wrap JudgeHandler as `AgentProtocol` (JudgeAgent) with strict protocol compliance.
- [ ] Implement `MagenticOrchestrator` using `MagenticBuilder`.
- [ ] Create factory pattern for switching implementations.
- **Deliverable**: Same API, better multi-agent orchestration engine.
**NOTE**: Only implement Phase 5 if time permits after MVP is shipped.
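
The "factory pattern for switching implementations" item could take a shape like the sketch below. It deliberately does not call the Microsoft Agent Framework: the `MagenticOrchestrator` class here is an empty placeholder, and the `Orchestrator` protocol, registry, and `create_orchestrator` function are hypothetical names chosen for this example.

```python
from typing import Callable, Protocol


class Orchestrator(Protocol):
    """Hypothetical shared interface so callers never depend on a concrete engine."""

    name: str

    def run(self, question: str) -> str: ...


class SimpleOrchestrator:
    name = "simple"

    def run(self, question: str) -> str:
        return f"[simple] answering: {question}"


class MagenticOrchestrator:
    """Placeholder only; the real Phase 5 class would wrap the framework's builder."""

    name = "magentic"

    def run(self, question: str) -> str:
        raise NotImplementedError("Phase 5 is optional and not built yet")


_REGISTRY: dict[str, Callable[[], Orchestrator]] = {
    "simple": SimpleOrchestrator,
    "magentic": MagenticOrchestrator,
}


def create_orchestrator(mode: str = "simple") -> Orchestrator:
    """Return the orchestrator registered under `mode`, or fail loudly."""
    try:
        factory = _REGISTRY[mode]
    except KeyError:
        raise ValueError(f"Unknown orchestrator mode: {mode!r}") from None
    return factory()
```

With this arrangement the UI would call `create_orchestrator(mode)` and stay unchanged whichever engine sits behind it, which is what "same API" in the deliverable implies.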
---
## Spec Documents
1. **[Phase 1 Spec: Foundation](01_phase_foundation.md)**
2. **[Phase 2 Spec: Search Slice](02_phase_search.md)**
3. **[Phase 3 Spec: Judge Slice](03_phase_judge.md)**
4. **[Phase 4 Spec: UI & Loop](04_phase_ui.md)**
5. **[Phase 5 Spec: Magentic Integration](05_phase_magentic.md)** *(Optional)*
*Start by reading the Phase 1 Spec to initialize the repo.*