Implementation Roadmap: DeepCritical (Vertical Slices)
Philosophy: AI-Native Engineering, Vertical Slice Architecture, TDD, Modern Tooling (2025).
This roadmap defines the execution strategy to deliver DeepCritical effectively. We reject "overplanning" in favor of ironclad, testable vertical slices. Each phase delivers a fully functional slice of end-to-end value.
The 2025 "Gucci" Tooling Stack
We are using the bleeding edge of Python engineering to ensure speed, safety, and developer joy.
| Category | Tool | Why? |
|---|---|---|
| Package Manager | uv | Rust-based, 10-100x faster than pip/poetry. Manages Python versions, venvs, and deps. |
| Linting/Format | ruff | Rust-based, instant. Replaces black, isort, flake8. |
| Type Checking | mypy | Strict static typing. Run via uv run mypy. |
| Testing | pytest | The standard. |
| Test Plugins | pytest-sugar | Instant feedback, progress bars. "Gucci" visuals. |
| Test Plugins | pytest-asyncio | Essential for our async agent loop. |
| Test Plugins | pytest-cov | Coverage reporting to ensure TDD adherence. |
| Git Hooks | pre-commit | Enforce ruff/mypy before commit. |
Architecture: Vertical Slices
Instead of horizontal layers (e.g., "Building the Database Layer"), we build Vertical Slices. Each slice implements a feature from Entry Point (UI/API) -> Logic -> Data/External.
Directory Structure (Maintainer's Structure)
src/
├── app.py                  # Entry point (Gradio UI)
├── orchestrator.py         # Agent loop (Search -> Judge -> Loop)
├── agent_factory/          # Agent creation and judges
│   ├── __init__.py
│   ├── agents.py           # PydanticAI agent definitions
│   └── judges.py           # JudgeHandler for evidence assessment
├── tools/                  # Search tools
│   ├── __init__.py
│   ├── pubmed.py           # PubMed E-utilities tool
│   ├── websearch.py        # DuckDuckGo search tool
│   └── search_handler.py   # Orchestrates multiple tools
├── prompts/                # Prompt templates
│   ├── __init__.py
│   └── judge.py            # Judge prompts
├── utils/                  # Shared utilities
│   ├── __init__.py
│   ├── config.py           # Settings/configuration
│   ├── exceptions.py       # Custom exceptions
│   ├── models.py           # Shared Pydantic models
│   ├── dataloaders.py      # Data loading utilities
│   └── parsers.py          # Parsing utilities
├── middleware/             # (Future: middleware components)
├── database_services/      # (Future: database integrations)
└── retrieval_factory/      # (Future: RAG components)
tests/
├── unit/
│   ├── tools/
│   │   ├── test_pubmed.py
│   │   ├── test_websearch.py
│   │   └── test_search_handler.py
│   ├── agent_factory/
│   │   └── test_judges.py
│   └── test_orchestrator.py
└── integration/
    └── test_pubmed_live.py
Phased Execution Plan
Phase 1: Foundation & Tooling (Day 1)
Goal: A rock-solid, CI-ready environment with uv and pytest configured.
- Initialize `pyproject.toml` with `uv`.
- Configure `ruff` (strict) and `mypy` (strict).
- Set up `pytest` with sugar and coverage.
- Implement `src/utils/config.py` (Configuration Slice).
- Implement `src/utils/exceptions.py` (Custom exceptions).
- Deliverable: A repo that passes CI with `uv run pytest`.
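As a rough illustration of the Configuration Slice, the sketch below pairs a small exception hierarchy with an environment-driven settings loader. It uses only the standard library as a stand-in for whatever settings library the project adopts. The class names, the `LLM_API_KEY` and `MAX_ITERATIONS` variables, and the `load_settings` helper are assumptions made for this example, not the project's actual names.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


class DeepCriticalError(Exception):
    """Base class for project-specific errors (hypothetical name)."""


class ConfigurationError(DeepCriticalError):
    """Raised when a required setting is missing or invalid."""


@dataclass(frozen=True)
class Settings:
    """Immutable runtime settings; field names are illustrative."""

    llm_api_key: str
    max_iterations: int = 5


def load_settings(env: Optional[Mapping[str, str]] = None) -> Settings:
    """Build Settings from a mapping, defaulting to os.environ."""
    env = os.environ if env is None else env

    api_key = env.get("LLM_API_KEY", "")
    if not api_key:
        raise ConfigurationError("LLM_API_KEY is not set")

    raw = env.get("MAX_ITERATIONS", "5")
    try:
        max_iterations = int(raw)
    except ValueError as exc:
        raise ConfigurationError(
            f"MAX_ITERATIONS must be an integer, got {raw!r}"
        ) from exc

    return Settings(llm_api_key=api_key, max_iterations=max_iterations)
```

Passing the environment in as a mapping keeps the loader easy to unit test: a test can hand it a plain dict instead of patching `os.environ`.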
Phase 2: The "Search" Vertical Slice (Day 2)
Goal: Agent can receive a query and get raw results from PubMed/Web.
- TDD: Write test for `SearchHandler`.
- Implement `src/tools/pubmed.py` (PubMed E-utilities).
- Implement `src/tools/websearch.py` (DuckDuckGo).
- Implement `src/tools/search_handler.py` (Orchestrates tools).
- Implement `src/utils/models.py` (Evidence, Citation, SearchResult).
- Deliverable: Function that takes "long covid" -> returns `List[Evidence]`.
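One way the Search slice could fit together is sketched below: a handler fans a query out to several tools concurrently and merges whatever comes back. The models are plain dataclasses standing in for the Pydantic models the roadmap calls for, and `FakeTool` replaces the real PubMed and DuckDuckGo clients so the example makes no network calls. The field names, the `search`/`execute` method names, and the decision to skip failing tools are all assumptions for illustration.

```python
import asyncio
from dataclasses import dataclass
from typing import List, Protocol, Sequence


@dataclass(frozen=True)
class Citation:
    source: str
    title: str
    url: str


@dataclass(frozen=True)
class Evidence:
    content: str
    citation: Citation


class SearchTool(Protocol):
    """Anything with an async search method can be plugged in."""

    async def search(self, query: str, max_results: int) -> List[Evidence]: ...


class FakeTool:
    """Stand-in for a real search tool; returns one canned result."""

    def __init__(self, name: str, fail: bool = False) -> None:
        self.name = name
        self.fail = fail

    async def search(self, query: str, max_results: int) -> List[Evidence]:
        if self.fail:
            raise RuntimeError(f"{self.name} is unavailable")
        citation = Citation(
            source=self.name, title=query, url=f"https://example.org/{self.name}"
        )
        return [Evidence(content=f"{self.name} result for {query}", citation=citation)]


class SearchHandler:
    """Runs every tool concurrently and merges the successful results."""

    def __init__(self, tools: Sequence[SearchTool]) -> None:
        self.tools = list(tools)

    async def execute(self, query: str, max_results: int = 10) -> List[Evidence]:
        results = await asyncio.gather(
            *(tool.search(query, max_results) for tool in self.tools),
            return_exceptions=True,
        )
        evidence: List[Evidence] = []
        for result in results:
            if isinstance(result, BaseException):
                continue  # one failing tool should not sink the whole search
            evidence.extend(result)
        return evidence
```

A unit test can then drive the handler with fakes only, for example `asyncio.run(SearchHandler([FakeTool("pubmed")]).execute("long covid"))`, and leave live API calls to the integration suite.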
Phase 3: The "Judge" Vertical Slice (Day 3)
Goal: Agent can decide if evidence is sufficient.
- TDD: Write test for `JudgeHandler` (Mocked LLM).
- Implement `src/prompts/judge.py` (Structured outputs).
- Implement `src/agent_factory/judges.py` (LLM interaction).
- Deliverable: Function that takes `List[Evidence]` -> returns `JudgeAssessment`.
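The "mocked LLM" test strategy can be sketched by injecting the model call into the handler, so a test swaps in a deterministic coroutine. Everything below is a minimal sketch under assumptions: the `JudgeAssessment` fields, the `assess` method, and `mock_llm` are hypothetical, and the real handler would call a PydanticAI agent rather than a bare callable.

```python
import asyncio
from dataclasses import dataclass
from typing import Awaitable, Callable, List


@dataclass(frozen=True)
class Evidence:
    content: str


@dataclass(frozen=True)
class JudgeAssessment:
    sufficient: bool
    reasoning: str
    next_queries: List[str]


# For this sketch an "LLM" is any async callable with this shape.
LLMCall = Callable[[str, List[Evidence]], Awaitable[JudgeAssessment]]


class JudgeHandler:
    """Decides whether the collected evidence answers the question."""

    def __init__(self, llm: LLMCall) -> None:
        self.llm = llm

    async def assess(self, question: str, evidence: List[Evidence]) -> JudgeAssessment:
        if not evidence:
            # Nothing to judge yet, so skip the model call entirely.
            return JudgeAssessment(
                sufficient=False,
                reasoning="No evidence collected yet.",
                next_queries=[question],
            )
        return await self.llm(question, evidence)


async def mock_llm(question: str, evidence: List[Evidence]) -> JudgeAssessment:
    """Deterministic stand-in: two or more items count as sufficient."""
    enough = len(evidence) >= 2
    return JudgeAssessment(
        sufficient=enough,
        reasoning=f"Saw {len(evidence)} evidence item(s).",
        next_queries=[] if enough else [f"{question} review"],
    )
```

Because the handler only depends on the callable's shape, the same tests keep passing when the mock is replaced by a real model call that returns the same assessment type.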
Phase 4: The "Loop" & UI Slice (Day 4)
Goal: End-to-End User Value.
- Implement `src/orchestrator.py` (Connects Search + Judge loops).
- Build `src/app.py` (Gradio with Streaming).
- Deliverable: Working DeepCritical Agent on HuggingFace.
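The Search -> Judge -> Loop cycle might look roughly like the sketch below. It takes the search and judge steps as injected coroutines and caps the number of rounds so the agent always terminates. The `run_loop` name, the `Assessment` shape, and the fallback to the original question when the judge proposes no follow-up queries are assumptions for illustration. Evidence is reduced to plain strings to keep the example short, and the Gradio UI is left out.

```python
import asyncio
from dataclasses import dataclass, field
from typing import Awaitable, Callable, List


@dataclass
class Assessment:
    sufficient: bool
    next_queries: List[str] = field(default_factory=list)


SearchFn = Callable[[str], Awaitable[List[str]]]
JudgeFn = Callable[[str, List[str]], Awaitable[Assessment]]


async def run_loop(
    question: str,
    search: SearchFn,
    judge: JudgeFn,
    max_iterations: int = 3,
) -> List[str]:
    """Search, judge, and repeat until the judge is satisfied or the cap is hit."""
    evidence: List[str] = []
    queries = [question]
    for _ in range(max_iterations):
        for query in queries:
            evidence.extend(await search(query))
        assessment = await judge(question, evidence)
        if assessment.sufficient:
            break
        # Follow the judge's suggestions, or retry the original question.
        queries = assessment.next_queries or [question]
    return evidence


async def fake_search(query: str) -> List[str]:
    return [f"hit:{query}"]


async def fake_judge(question: str, evidence: List[str]) -> Assessment:
    if len(evidence) >= 2:
        return Assessment(sufficient=True)
    return Assessment(sufficient=False, next_queries=[f"{question} treatment"])
```

With these fakes, `asyncio.run(run_loop("long covid", fake_search, fake_judge))` performs two rounds: the first on the original question, the second on the judge's follow-up query.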
Phase 5: Magentic Integration (OPTIONAL - Post-MVP)
Goal: Upgrade orchestrator to use Microsoft Agent Framework patterns.
- Wrap SearchHandler as `AgentProtocol` (SearchAgent) with strict protocol compliance.
- Wrap JudgeHandler as `AgentProtocol` (JudgeAgent) with strict protocol compliance.
- Implement `MagenticOrchestrator` using `MagenticBuilder`.
- Create factory pattern for switching implementations.
- Deliverable: Same API, better multi-agent orchestration engine.
NOTE: Only implement Phase 5 if time permits after MVP is shipped.
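The "factory pattern for switching implementations" could be as small as a registry keyed by mode name, as in the sketch below. This is an assumption-heavy illustration: `Orchestrator`, `SimpleOrchestrator`, `MagenticStyleOrchestrator`, and `create_orchestrator` are hypothetical names, and the Magentic-style class is a placeholder that does not call the Microsoft Agent Framework at all.

```python
from typing import Callable, Dict, Protocol


class Orchestrator(Protocol):
    """Shared interface, so callers never depend on a concrete engine."""

    def run(self, question: str) -> str: ...


class SimpleOrchestrator:
    """Placeholder for the MVP Search -> Judge loop."""

    def run(self, question: str) -> str:
        return f"simple:{question}"


class MagenticStyleOrchestrator:
    """Placeholder only; a real version would wrap the agent framework."""

    def run(self, question: str) -> str:
        return f"magentic:{question}"


# Registry of mode name -> zero-argument constructor.
_REGISTRY: Dict[str, Callable[[], Orchestrator]] = {
    "simple": SimpleOrchestrator,
    "magentic": MagenticStyleOrchestrator,
}


def create_orchestrator(mode: str = "simple") -> Orchestrator:
    """Return the orchestrator registered under the given mode."""
    try:
        factory = _REGISTRY[mode]
    except KeyError:
        known = ", ".join(sorted(_REGISTRY))
        raise ValueError(f"Unknown mode {mode!r}; expected one of: {known}") from None
    return factory()
```

Since both classes satisfy the same protocol, the UI can call `create_orchestrator(mode).run(question)` without knowing which engine is behind it, which is what keeps the API unchanged across the upgrade.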
Spec Documents
- Phase 1 Spec: Foundation
- Phase 2 Spec: Search Slice
- Phase 3 Spec: Judge Slice
- Phase 4 Spec: UI & Loop
- Phase 5 Spec: Magentic Integration (Optional)
Start by reading Phase 1 Spec to initialize the repo.