# Phase 1 Implementation Spec: Foundation & Tooling

**Goal**: Establish a "Gucci Banger" development environment using 2025 best practices.
**Philosophy**: "If the build isn't solid, the agent won't be."

---

## 1. Prerequisites

Before starting, make sure `uv` is installed:

```bash
# Install uv (Rust-based package manager)
curl -LsSf https://astral.sh/uv/install.sh | sh

# Verify
uv --version  # Should be >= 0.4.0
```
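
For bootstrap scripts or CI, the version check above can be automated. The following is a minimal sketch, assuming `uv --version` prints a line containing a dotted `major.minor.patch` version; the names `parse_uv_version`, `uv_is_recent_enough`, and `MIN_UV_VERSION` are hypothetical helpers, not part of the project.

```python
"""Check that the installed uv meets the minimum version (illustrative sketch)."""
import re
import shutil
import subprocess

MIN_UV_VERSION = (0, 4, 0)  # Minimum version stated in this spec


def parse_uv_version(output: str) -> tuple[int, int, int]:
    """Extract (major, minor, patch) from text such as 'uv 0.4.30 (abc123 2024-11-04)'."""
    match = re.search(r"(\d+)\.(\d+)\.(\d+)", output)
    if match is None:
        raise ValueError(f"Could not find a version number in: {output!r}")
    major, minor, patch = (int(part) for part in match.groups())
    return (major, minor, patch)


def uv_is_recent_enough() -> bool:
    """Return True if uv is on PATH and at least MIN_UV_VERSION."""
    if shutil.which("uv") is None:
        return False
    result = subprocess.run(
        ["uv", "--version"], capture_output=True, text=True, check=True
    )
    return parse_uv_version(result.stdout) >= MIN_UV_VERSION


if __name__ == "__main__":
    print("uv OK" if uv_is_recent_enough() else "uv missing or older than 0.4.0")
```

Comparing tuples rather than strings avoids the classic pitfall where `"0.10.0" < "0.4.0"` is true as text.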

---

## 2. Project Initialization

```bash
# From project root
uv init --name deepcritical
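uv python install 3.11  # Download Python 3.11 if it is not already available
uv python pin 3.11      # Pin the project to 3.11 (writes .python-version)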
```

---

## 3. The Tooling Stack (Exact Dependencies)

### `pyproject.toml` (Complete, Copy-Paste Ready)

```toml
[project]
name = "deepcritical"
version = "0.1.0"
description = "AI-Native Drug Repurposing Research Agent"
readme = "README.md"
requires-python = ">=3.11"
dependencies = [
    # Core
    "pydantic>=2.7",
    "pydantic-settings>=2.2",      # For BaseSettings (config)
    "pydantic-ai>=0.0.16",          # Agent framework

    # HTTP & Parsing
    "httpx>=0.27",                   # Async HTTP client
    "beautifulsoup4>=4.12",          # HTML parsing
    "xmltodict>=0.13",               # PubMed XML -> dict

    # Search
    "duckduckgo-search>=6.0",        # Free web search

    # UI
    "gradio>=5.0",                   # Chat interface

    # Utils
    "python-dotenv>=1.0",            # .env loading
    "tenacity>=8.2",                 # Retry logic
    "structlog>=24.1",               # Structured logging
]

[project.optional-dependencies]
dev = [
    # Testing
    "pytest>=8.0",
    "pytest-asyncio>=0.23",
    "pytest-sugar>=1.0",
    "pytest-cov>=5.0",
    "pytest-mock>=3.12",
    "respx>=0.21",                   # Mock httpx requests

    # Quality
    "ruff>=0.4.0",
    "mypy>=1.10",
    "pre-commit>=3.7",
]

[build-system]
requires = ["hatchling"]
build-backend = "hatchling.build"

[tool.hatch.build.targets.wheel]
packages = ["src"]

# ============== RUFF CONFIG ==============
[tool.ruff]
line-length = 100
target-version = "py311"
src = ["src", "tests"]

[tool.ruff.lint]
select = [
    "E",    # pycodestyle errors
    "F",    # pyflakes
    "B",    # flake8-bugbear
    "I",    # isort
    "N",    # pep8-naming
    "UP",   # pyupgrade
    "PL",   # pylint
    "RUF",  # ruff-specific
]
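ignore = [
    "PLR0913",  # Too many arguments (agents need many params)
]

[tool.ruff.lint.per-file-ignores]
"tests/**/*.py" = ["PLR2004"]  # Tests compare against literal expected values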

[tool.ruff.lint.isort]
known-first-party = ["src"]

# ============== MYPY CONFIG ==============
[tool.mypy]
python_version = "3.11"
strict = true
ignore_missing_imports = true
disallow_untyped_defs = true
warn_return_any = true
warn_unused_ignores = true

# ============== PYTEST CONFIG ==============
[tool.pytest.ini_options]
testpaths = ["tests"]
asyncio_mode = "auto"
addopts = [
    "-v",
    "--tb=short",
    "--strict-markers",
]
markers = [
    "unit: Unit tests (mocked)",
    "integration: Integration tests (real APIs)",
    "slow: Slow tests",
]

# ============== COVERAGE CONFIG ==============
[tool.coverage.run]
source = ["src"]
omit = ["*/__init__.py"]

[tool.coverage.report]
exclude_lines = [
    "pragma: no cover",
    "if TYPE_CHECKING:",
    "raise NotImplementedError",
]
```

---

## 4. Directory Structure (Maintainer's Structure)

```bash
# Execute these commands to create the directory structure
mkdir -p src/utils
mkdir -p src/tools
mkdir -p src/prompts
mkdir -p src/agent_factory
mkdir -p src/middleware
mkdir -p src/database_services
mkdir -p src/retrieval_factory
mkdir -p tests/unit/tools
mkdir -p tests/unit/agent_factory
mkdir -p tests/unit/utils
mkdir -p tests/integration

# Create __init__.py files (required for imports)
touch src/__init__.py
touch src/utils/__init__.py
touch src/tools/__init__.py
touch src/prompts/__init__.py
touch src/agent_factory/__init__.py
touch tests/__init__.py
touch tests/unit/__init__.py
touch tests/unit/tools/__init__.py
touch tests/unit/agent_factory/__init__.py
touch tests/unit/utils/__init__.py
touch tests/integration/__init__.py
```
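
On systems where `mkdir -p` and `touch` are unavailable (for example a plain Windows shell), the same skeleton can be created from Python. This is a sketch that mirrors the commands above; the `scaffold` function and the two list names are hypothetical.

```python
"""Create the Phase 1 directory skeleton with pathlib (illustrative sketch)."""
from pathlib import Path

# Directories created by the mkdir commands above
DIRECTORIES = [
    "src/utils",
    "src/tools",
    "src/prompts",
    "src/agent_factory",
    "src/middleware",
    "src/database_services",
    "src/retrieval_factory",
    "tests/unit/tools",
    "tests/unit/agent_factory",
    "tests/unit/utils",
    "tests/integration",
]

# Packages that receive an __init__.py, matching the touch commands above
PACKAGES = [
    "src",
    "src/utils",
    "src/tools",
    "src/prompts",
    "src/agent_factory",
    "tests",
    "tests/unit",
    "tests/unit/tools",
    "tests/unit/agent_factory",
    "tests/unit/utils",
    "tests/integration",
]


def scaffold(root: Path) -> None:
    """Create all directories and empty __init__.py files under root."""
    for directory in DIRECTORIES:
        (root / directory).mkdir(parents=True, exist_ok=True)
    for package in PACKAGES:
        (root / package / "__init__.py").touch(exist_ok=True)


if __name__ == "__main__":
    scaffold(Path.cwd())
```

Both `mkdir(..., exist_ok=True)` and `touch(exist_ok=True)` make the script safe to re-run on a partially created tree.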

### Final Structure:

```
src/
├── __init__.py
├── app.py                      # Entry point (Gradio UI)
├── orchestrator.py             # Agent loop
├── agent_factory/              # Agent creation and judges
│   ├── __init__.py
│   ├── agents.py
│   └── judges.py
├── tools/                      # Search tools
│   ├── __init__.py
│   ├── pubmed.py
│   ├── websearch.py
│   └── search_handler.py
├── prompts/                    # Prompt templates
│   ├── __init__.py
│   └── judge.py
├── utils/                      # Shared utilities
│   ├── __init__.py
│   ├── config.py
│   ├── exceptions.py
│   ├── models.py
│   ├── dataloaders.py
│   └── parsers.py
├── middleware/                 # (Future)
├── database_services/          # (Future)
└── retrieval_factory/          # (Future)

tests/
├── __init__.py
├── conftest.py
├── unit/
│   ├── __init__.py
│   ├── tools/
│   │   ├── __init__.py
│   │   ├── test_pubmed.py
│   │   ├── test_websearch.py
│   │   └── test_search_handler.py
│   ├── agent_factory/
│   │   ├── __init__.py
│   │   └── test_judges.py
│   ├── utils/
│   │   ├── __init__.py
│   │   └── test_config.py
│   └── test_orchestrator.py
└── integration/
    ├── __init__.py
    └── test_pubmed_live.py
```

---

## 5. Configuration Files

### `.env.example` (Copy to `.env` and fill)

```bash
# LLM Provider (choose one)
OPENAI_API_KEY=sk-your-key-here
ANTHROPIC_API_KEY=sk-ant-your-key-here

# Optional: PubMed API key (higher rate limits)
NCBI_API_KEY=your-ncbi-key-here

# Optional: For HuggingFace deployment
HF_TOKEN=hf_your-token-here

# Agent Config
MAX_ITERATIONS=10
LOG_LEVEL=INFO
```

### `.pre-commit-config.yaml`

```yaml
repos:
  - repo: https://github.com/astral-sh/ruff-pre-commit
    rev: v0.4.4
    hooks:
      - id: ruff
        args: [--fix]
      - id: ruff-format

  - repo: https://github.com/pre-commit/mirrors-mypy
    rev: v1.10.0
    hooks:
      - id: mypy
        files: ^src/  # Match `make typecheck`, which checks src only
        additional_dependencies:
          - pydantic>=2.7
          - pydantic-settings>=2.2
        args: [--ignore-missing-imports]
```

### `tests/conftest.py` (Pytest Fixtures)

```python
"""Shared pytest fixtures for all tests."""
from unittest.mock import AsyncMock

import pytest


@pytest.fixture
def mock_httpx_client(mocker):
    """Mock httpx.AsyncClient for API tests."""
    mock = mocker.patch("httpx.AsyncClient")
    mock.return_value.__aenter__ = AsyncMock(return_value=mock.return_value)
    mock.return_value.__aexit__ = AsyncMock(return_value=None)
    return mock


@pytest.fixture
def mock_llm_response():
    """Factory fixture for mocking LLM responses."""
    def _mock(content: str):
        return AsyncMock(return_value=content)
    return _mock


@pytest.fixture
def sample_evidence():
    """Sample Evidence objects for testing."""
    from src.utils.models import Evidence, Citation
    return [
        Evidence(
            content="Metformin shows promise in Alzheimer's...",
            citation=Citation(
                source="pubmed",
                title="Metformin and Alzheimer's Disease",
                url="https://pubmed.ncbi.nlm.nih.gov/12345678/",
                date="2024-01-15"
            ),
            relevance=0.85
        )
    ]
```
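
The `mock_llm_response` fixture returns a factory, not a mock. The sketch below shows the same pattern outside pytest using only the standard library; `make_llm_mock`, `ask`, and `run_demo` are hypothetical names standing in for the fixture and for agent code that awaits an LLM call.

```python
"""How an AsyncMock factory behaves when awaited (illustrative sketch)."""
import asyncio
from unittest.mock import AsyncMock


def make_llm_mock(content: str) -> AsyncMock:
    """Same body as the factory inside the mock_llm_response fixture."""
    return AsyncMock(return_value=content)


async def ask(llm: AsyncMock, prompt: str) -> str:
    """Hypothetical caller: awaits the LLM callable and returns its reply."""
    return await llm(prompt)


def run_demo() -> str:
    llm = make_llm_mock("Metformin is a candidate.")
    answer = asyncio.run(ask(llm, "Which drug?"))
    # AsyncMock records awaits, so the test can assert on the prompt sent.
    llm.assert_awaited_once_with("Which drug?")
    return answer
```

Because the factory takes the content as an argument, each test can build a mock with its own canned reply instead of sharing one fixed response.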

---

## 6. Core Utilities Implementation

### `src/utils/config.py`

```python
"""Application configuration using Pydantic Settings."""
import logging
from typing import Literal

import structlog
from pydantic import Field
from pydantic_settings import BaseSettings, SettingsConfigDict


class Settings(BaseSettings):
    """Strongly-typed application settings."""

    model_config = SettingsConfigDict(
        env_file=".env",
        env_file_encoding="utf-8",
        case_sensitive=False,
        extra="ignore",
    )

    # LLM Configuration
    openai_api_key: str | None = Field(default=None, description="OpenAI API key")
    anthropic_api_key: str | None = Field(default=None, description="Anthropic API key")
    llm_provider: Literal["openai", "anthropic"] = Field(
        default="openai",
        description="Which LLM provider to use"
    )
    openai_model: str = Field(default="gpt-4o", description="OpenAI model name")
    anthropic_model: str = Field(
        default="claude-3-5-sonnet-20241022",
        description="Anthropic model name",
    )

    # PubMed Configuration
    ncbi_api_key: str | None = Field(
        default=None,
        description="NCBI API key for higher rate limits",
    )

    # Agent Configuration
    max_iterations: int = Field(default=10, ge=1, le=50)
    search_timeout: int = Field(default=30, description="Seconds to wait for search")

    # Logging
    log_level: Literal["DEBUG", "INFO", "WARNING", "ERROR"] = "INFO"

    def get_api_key(self) -> str:
        """Get the API key for the configured provider."""
        if self.llm_provider == "openai":
            if not self.openai_api_key:
                raise ValueError("OPENAI_API_KEY not set")
            return self.openai_api_key
        else:
            if not self.anthropic_api_key:
                raise ValueError("ANTHROPIC_API_KEY not set")
            return self.anthropic_api_key


def get_settings() -> Settings:
    """Factory function to get settings (allows mocking in tests)."""
    return Settings()


def configure_logging(settings: Settings) -> None:
    """Configure structured logging at the level given by settings.log_level."""
    # filter_by_level defers to the stdlib logger's level, so the standard
    # library logger must be configured for log_level to take effect.
    logging.basicConfig(format="%(message)s", level=settings.log_level)
    structlog.configure(
        processors=[
            structlog.stdlib.filter_by_level,
            structlog.stdlib.add_logger_name,
            structlog.stdlib.add_log_level,
            structlog.processors.TimeStamper(fmt="iso"),
            structlog.processors.JSONRenderer(),
        ],
        wrapper_class=structlog.stdlib.BoundLogger,
        context_class=dict,
        logger_factory=structlog.stdlib.LoggerFactory(),
    )


# Singleton for easy import
settings = get_settings()
```

### `src/utils/exceptions.py`

```python
"""Custom exceptions for DeepCritical."""


class DeepCriticalError(Exception):
    """Base exception for all DeepCritical errors."""
    pass


class SearchError(DeepCriticalError):
    """Raised when a search operation fails."""
    pass


class JudgeError(DeepCriticalError):
    """Raised when the judge fails to assess evidence."""
    pass


class ConfigurationError(DeepCriticalError):
    """Raised when configuration is invalid."""
    pass


class RateLimitError(SearchError):
    """Raised when we hit API rate limits."""
    pass
```

---

## 7. TDD Workflow: First Test

### `tests/unit/utils/test_config.py`

```python
"""Unit tests for configuration loading."""
import os
from unittest.mock import patch

import pytest


class TestSettings:
    """Tests for Settings class."""

    def test_default_max_iterations(self):
        """Settings should have default max_iterations of 10."""
        from src.utils.config import Settings

        # Clear env vars and skip the local .env so only defaults apply
        with patch.dict(os.environ, {}, clear=True):
            settings = Settings(_env_file=None)
            assert settings.max_iterations == 10

    def test_max_iterations_from_env(self):
        """Settings should read MAX_ITERATIONS from env."""
        from src.utils.config import Settings

        with patch.dict(os.environ, {"MAX_ITERATIONS": "25"}):
            settings = Settings()
            assert settings.max_iterations == 25

    def test_invalid_max_iterations_raises(self):
        """Settings should reject invalid max_iterations."""
        from src.utils.config import Settings
        from pydantic import ValidationError

        with patch.dict(os.environ, {"MAX_ITERATIONS": "100"}):
            with pytest.raises(ValidationError):
                Settings()  # 100 > 50 (max)

    def test_get_api_key_openai(self):
        """get_api_key should return OpenAI key when provider is openai."""
        from src.utils.config import Settings

        with patch.dict(os.environ, {
            "LLM_PROVIDER": "openai",
            "OPENAI_API_KEY": "sk-test-key"
        }):
            settings = Settings()
            assert settings.get_api_key() == "sk-test-key"

    def test_get_api_key_missing_raises(self):
        """get_api_key should raise when key is not set."""
        from src.utils.config import Settings

        # Skip the local .env, which may define OPENAI_API_KEY
        with patch.dict(os.environ, {"LLM_PROVIDER": "openai"}, clear=True):
            settings = Settings(_env_file=None)
            with pytest.raises(ValueError, match="OPENAI_API_KEY not set"):
                settings.get_api_key()
```
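
The tests above rely on `patch.dict(os.environ, ...)` to isolate environment variables. The standard-library sketch below shows what that context manager does: it overrides or clears the environment inside the `with` block and restores it afterwards. `read_max_iterations` and `demo` are hypothetical stand-ins for `Settings`, used here so the snippet has no third-party dependencies.

```python
"""What patch.dict(os.environ, ...) does inside and after the block (illustrative sketch)."""
import os
from unittest.mock import patch


def read_max_iterations(default: int = 10) -> int:
    """Hypothetical stand-in for Settings: read MAX_ITERATIONS from the environment."""
    return int(os.environ.get("MAX_ITERATIONS", str(default)))


def demo() -> tuple[int, int, bool]:
    before = dict(os.environ)

    # Override a single variable; other variables stay visible.
    with patch.dict(os.environ, {"MAX_ITERATIONS": "25"}):
        overridden = read_max_iterations()

    # clear=True empties the environment first, so the default is used.
    with patch.dict(os.environ, {}, clear=True):
        cleared = read_max_iterations()

    # On exit, patch.dict restores the original environment.
    restored = dict(os.environ) == before
    return overridden, cleared, restored
```

Note that `clear=True` only affects process environment variables. A `.env` file on disk is read separately by `pydantic-settings`, so tests that assert on defaults also need to keep that file out of the picture.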

---

## 8. Makefile (Developer Experience)

Create a `Makefile` for standard devex commands:

```makefile
.PHONY: install test test-cov lint format typecheck check clean

install:
	uv sync --all-extras
	uv run pre-commit install

test:
	uv run pytest tests/unit/ -v

test-cov:
	uv run pytest --cov=src --cov-report=term-missing

lint:
	uv run ruff check src tests

format:
	uv run ruff format src tests

typecheck:
	uv run mypy src

check: lint typecheck test
	@echo "All checks passed!"

clean:
	rm -rf .pytest_cache .mypy_cache .ruff_cache __pycache__ .coverage
	find . -type d -name "__pycache__" -exec rm -rf {} + 2>/dev/null || true
```

---

## 9. Execution Commands

```bash
# Install all dependencies
uv sync --all-extras

# Run tests (should pass after implementing config.py)
uv run pytest tests/unit/utils/test_config.py -v

# Run full test suite with coverage
uv run pytest --cov=src --cov-report=term-missing

# Run linting
uv run ruff check src tests
uv run ruff format src tests

# Run type checking
uv run mypy src

# Set up pre-commit hooks
uv run pre-commit install
```

---

## 10. Implementation Checklist

- [ ] Install `uv` and verify version
- [ ] Run `uv init --name deepcritical`
- [ ] Create `pyproject.toml` (copy from above)
- [ ] Create directory structure (run mkdir commands)
- [ ] Create `.env.example` and `.env`
- [ ] Create `.pre-commit-config.yaml`
- [ ] Create `Makefile` (copy from above)
- [ ] Create `tests/conftest.py`
- [ ] Implement `src/utils/config.py`
- [ ] Implement `src/utils/exceptions.py`
- [ ] Write tests in `tests/unit/utils/test_config.py`
- [ ] Run `make install`
- [ ] Run `make check` — **ALL CHECKS MUST PASS**
- [ ] Commit: `git commit -m "feat: phase 1 foundation complete"`

---

## 11. Definition of Done

Phase 1 is **COMPLETE** when:

1. `uv run pytest` passes with 100% of tests green
2. `uv run ruff check src tests` has 0 errors
3. `uv run mypy src` has 0 errors
4. Pre-commit hooks are installed and working
5. `from src.utils.config import settings` works in Python REPL

**Proceed to Phase 2 ONLY after every item in the Section 10 checklist is ticked and all five criteria above are met.**