# Phase 1 Implementation Spec: Foundation & Tooling

**Goal**: Establish a "Gucci Banger" development environment using 2025 best practices.
**Philosophy**: "If the build isn't solid, the agent won't be."

---

## 1. Prerequisites

Before starting, ensure these are installed:

```bash
# Install uv (Rust-based package manager)
curl -LsSf https://astral.sh/uv/install.sh | sh

# Verify
uv --version  # Should be >= 0.4.0
```

---

## 2. Project Initialization

```bash
# From project root
uv init --name deepcritical
uv python install 3.11  # Install Python 3.11
uv python pin 3.11      # Pin the project to it (writes .python-version)
```

---

## 3. The Tooling Stack (Exact Dependencies)

### `pyproject.toml` (Complete, Copy-Paste Ready)

```toml
[project]
name = "deepcritical"
version = "0.1.0"
description = "AI-Native Drug Repurposing Research Agent"
readme = "README.md"
requires-python = ">=3.11"
dependencies = [
    # Core
    "pydantic>=2.7",
    "pydantic-settings>=2.2",      # For BaseSettings (config)
    "pydantic-ai>=0.0.16",          # Agent framework

    # HTTP & Parsing
    "httpx>=0.27",                   # Async HTTP client
    "beautifulsoup4>=4.12",          # HTML parsing
    "xmltodict>=0.13",               # PubMed XML -> dict

    # Search
    "duckduckgo-search>=6.0",        # Free web search

    # UI
    "gradio>=5.0",                   # Chat interface

    # Utils
    "python-dotenv>=1.0",            # .env loading
    "tenacity>=8.2",                 # Retry logic
    "structlog>=24.1",               # Structured logging
]

[project.optional-dependencies]
dev = [
    # Testing
    "pytest>=8.0",
    "pytest-asyncio>=0.23",
    "pytest-sugar>=1.0",
    "pytest-cov>=5.0",
    "pytest-mock>=3.12",
    "respx>=0.21",                   # Mock httpx requests

    # Quality
    "ruff>=0.4.0",
    "mypy>=1.10",
    "pre-commit>=3.7",
]

[build-system]
requires = ["hatchling"]
build-backend = "hatchling.build"

[tool.hatch.build.targets.wheel]
packages = ["src"]

# ============== RUFF CONFIG ==============
[tool.ruff]
line-length = 100
target-version = "py311"
src = ["src", "tests"]

[tool.ruff.lint]
select = [
    "E",    # pycodestyle errors
    "F",    # pyflakes
    "B",    # flake8-bugbear
    "I",    # isort
    "N",    # pep8-naming
    "UP",   # pyupgrade
    "PL",   # pylint
    "RUF",  # ruff-specific
]
ignore = [
    "PLR0913",  # Too many arguments (agents need many params)
]

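[tool.ruff.lint.per-file-ignores]
"tests/**" = ["PLR2004"]  # Magic values are fine in test assertions

[tool.ruff.lint.isort]
known-first-party = ["src"]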

# ============== MYPY CONFIG ==============
[tool.mypy]
python_version = "3.11"
strict = true
ignore_missing_imports = true
disallow_untyped_defs = true
warn_return_any = true
warn_unused_ignores = true

# ============== PYTEST CONFIG ==============
[tool.pytest.ini_options]
testpaths = ["tests"]
asyncio_mode = "auto"
addopts = [
    "-v",
    "--tb=short",
    "--strict-markers",
]
markers = [
    "unit: Unit tests (mocked)",
    "integration: Integration tests (real APIs)",
    "slow: Slow tests",
]

# ============== COVERAGE CONFIG ==============
[tool.coverage.run]
source = ["src"]
omit = ["*/__init__.py"]

[tool.coverage.report]
exclude_lines = [
    "pragma: no cover",
    "if TYPE_CHECKING:",
    "raise NotImplementedError",
]
```

---

## 4. Directory Structure (Maintainer's Structure)

```bash
# Execute these commands to create the directory structure
mkdir -p src/utils
mkdir -p src/tools
mkdir -p src/prompts
mkdir -p src/agent_factory
mkdir -p src/middleware
mkdir -p src/database_services
mkdir -p src/retrieval_factory
mkdir -p tests/unit/tools
mkdir -p tests/unit/agent_factory
mkdir -p tests/unit/utils
mkdir -p tests/integration

# Create __init__.py files (required for imports)
touch src/__init__.py
touch src/utils/__init__.py
touch src/tools/__init__.py
touch src/prompts/__init__.py
touch src/agent_factory/__init__.py
touch tests/__init__.py
touch tests/unit/__init__.py
touch tests/unit/tools/__init__.py
touch tests/unit/agent_factory/__init__.py
touch tests/unit/utils/__init__.py
touch tests/integration/__init__.py
```

### Final Structure:

```
src/
├── __init__.py
├── app.py                      # Entry point (Gradio UI)
├── orchestrator.py             # Agent loop
├── agent_factory/              # Agent creation and judges
│   ├── __init__.py
│   ├── agents.py
│   └── judges.py
├── tools/                      # Search tools
│   ├── __init__.py
│   ├── pubmed.py
│   ├── websearch.py
│   └── search_handler.py
├── prompts/                    # Prompt templates
│   ├── __init__.py
│   └── judge.py
├── utils/                      # Shared utilities
│   ├── __init__.py
│   ├── config.py
│   ├── exceptions.py
│   ├── models.py
│   ├── dataloaders.py
│   └── parsers.py
├── middleware/                 # (Future)
├── database_services/          # (Future)
└── retrieval_factory/          # (Future)

tests/
├── __init__.py
├── conftest.py
├── unit/
│   ├── __init__.py
│   ├── tools/
│   │   ├── __init__.py
│   │   ├── test_pubmed.py
│   │   ├── test_websearch.py
│   │   └── test_search_handler.py
│   ├── agent_factory/
│   │   ├── __init__.py
│   │   └── test_judges.py
│   ├── utils/
│   │   ├── __init__.py
│   │   └── test_config.py
│   └── test_orchestrator.py
└── integration/
    ├── __init__.py
    └── test_pubmed_live.py
```

---

## 5. Configuration Files

### `.env.example` (Copy to `.env` and fill)

```bash
# LLM Provider (choose one)
OPENAI_API_KEY=sk-your-key-here
ANTHROPIC_API_KEY=sk-ant-your-key-here

# Optional: PubMed API key (higher rate limits)
NCBI_API_KEY=your-ncbi-key-here

# Optional: For HuggingFace deployment
HF_TOKEN=hf_your-token-here

# Agent Config
MAX_ITERATIONS=10
LOG_LEVEL=INFO
```
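To make the file format concrete, here is a minimal stdlib-only sketch of how `KEY=VALUE` lines like those above map to a dictionary. It is a simplified illustration, not how `python-dotenv` or `pydantic-settings` actually parse the file (they also handle quoting and other details); the `parse_env_text` helper is hypothetical and not part of the project.

```python
def parse_env_text(text: str) -> dict[str, str]:
    """Parse simple KEY=VALUE lines, skipping blank lines and # comments."""
    values: dict[str, str] = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if not sep:
            continue  # Not a KEY=VALUE line; this sketch just ignores it
        values[key.strip()] = value.strip()
    return values


example = """
# Agent Config
MAX_ITERATIONS=10
LOG_LEVEL=INFO
"""
parsed = parse_env_text(example)
print(parsed)
```

Every value comes out as a string (`"10"`, not `10`); converting to typed fields is the job of the `Settings` class in `src/utils/config.py`.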

### `.pre-commit-config.yaml`

```yaml
repos:
  - repo: https://github.com/astral-sh/ruff-pre-commit
    rev: v0.4.4
    hooks:
      - id: ruff
        args: [--fix]
      - id: ruff-format

  - repo: https://github.com/pre-commit/mirrors-mypy
    rev: v1.10.0
    hooks:
      - id: mypy
        files: ^src/  # Match `mypy src`; strict mode would reject untyped tests
        additional_dependencies:
          - pydantic>=2.7
          - pydantic-settings>=2.2
        args: [--ignore-missing-imports]
```

### `tests/conftest.py` (Pytest Fixtures)

```python
"""Shared pytest fixtures for all tests."""
from unittest.mock import AsyncMock

import pytest


@pytest.fixture
def mock_httpx_client(mocker):
    """Mock httpx.AsyncClient for API tests."""
    mock = mocker.patch("httpx.AsyncClient")
    mock.return_value.__aenter__ = AsyncMock(return_value=mock.return_value)
    mock.return_value.__aexit__ = AsyncMock(return_value=None)
    return mock


@pytest.fixture
def mock_llm_response():
    """Factory fixture for mocking LLM responses."""
    def _mock(content: str):
        return AsyncMock(return_value=content)
    return _mock


@pytest.fixture
def sample_evidence():
    """Sample Evidence objects for testing."""
    from src.utils.models import Citation, Evidence
    return [
        Evidence(
            content="Metformin shows promise in Alzheimer's...",
            citation=Citation(
                source="pubmed",
                title="Metformin and Alzheimer's Disease",
                url="https://pubmed.ncbi.nlm.nih.gov/12345678/",
                date="2024-01-15"
            ),
            relevance=0.85
        )
    ]
```

---

## 6. Core Utilities Implementation

### `src/utils/config.py`

```python
"""Application configuration using Pydantic Settings."""
import logging
from typing import Literal

import structlog
from pydantic import Field
from pydantic_settings import BaseSettings, SettingsConfigDict


class Settings(BaseSettings):
    """Strongly-typed application settings."""

    model_config = SettingsConfigDict(
        env_file=".env",
        env_file_encoding="utf-8",
        case_sensitive=False,
        extra="ignore",
    )

    # LLM Configuration
    openai_api_key: str | None = Field(default=None, description="OpenAI API key")
    anthropic_api_key: str | None = Field(default=None, description="Anthropic API key")
    llm_provider: Literal["openai", "anthropic"] = Field(
        default="openai",
        description="Which LLM provider to use"
    )
    openai_model: str = Field(default="gpt-4o", description="OpenAI model name")
    anthropic_model: str = Field(
        default="claude-3-5-sonnet-20241022",
        description="Anthropic model"
    )

    # PubMed Configuration
    ncbi_api_key: str | None = Field(
        default=None,
        description="NCBI API key for higher rate limits"
    )

    # Agent Configuration
    max_iterations: int = Field(default=10, ge=1, le=50)
    search_timeout: int = Field(default=30, description="Seconds to wait for search")

    # Logging
    log_level: Literal["DEBUG", "INFO", "WARNING", "ERROR"] = "INFO"

    def get_api_key(self) -> str:
        """Get the API key for the configured provider."""
        if self.llm_provider == "openai":
            if not self.openai_api_key:
                raise ValueError("OPENAI_API_KEY not set")
            return self.openai_api_key
        else:
            if not self.anthropic_api_key:
                raise ValueError("ANTHROPIC_API_KEY not set")
            return self.anthropic_api_key


def get_settings() -> Settings:
    """Factory function to get settings (allows mocking in tests)."""
    return Settings()


def configure_logging(settings: Settings) -> None:
    """Configure structured logging at the configured log level."""
    # filter_by_level defers to the stdlib logger, so set its level first
    logging.basicConfig(format="%(message)s", level=settings.log_level)
    structlog.configure(
        processors=[
            structlog.stdlib.filter_by_level,
            structlog.stdlib.add_logger_name,
            structlog.stdlib.add_log_level,
            structlog.processors.TimeStamper(fmt="iso"),
            structlog.processors.JSONRenderer(),
        ],
        wrapper_class=structlog.stdlib.BoundLogger,
        context_class=dict,
        logger_factory=structlog.stdlib.LoggerFactory(),
    )


# Singleton for easy import
settings = get_settings()
```

### `src/utils/exceptions.py`

```python
"""Custom exceptions for DeepCritical."""


class DeepCriticalError(Exception):
    """Base exception for all DeepCritical errors."""
    pass


class SearchError(DeepCriticalError):
    """Raised when a search operation fails."""
    pass


class JudgeError(DeepCriticalError):
    """Raised when the judge fails to assess evidence."""
    pass


class ConfigurationError(DeepCriticalError):
    """Raised when configuration is invalid."""
    pass


class RateLimitError(SearchError):
    """Raised when we hit API rate limits."""
    pass
```
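A short sketch of why the hierarchy is layered this way: because `RateLimitError` subclasses `SearchError`, callers can handle the specific case first and fall back to the broader one. The classes are repeated here only so the snippet runs on its own; `flaky_search` and `describe_failure` are hypothetical helpers, not part of the spec.

```python
class DeepCriticalError(Exception):
    """Base exception (repeated from above so the sketch is standalone)."""


class SearchError(DeepCriticalError):
    """Raised when a search operation fails."""


class RateLimitError(SearchError):
    """Raised when we hit API rate limits."""


def flaky_search(query: str) -> list[str]:
    """Hypothetical search call that always hits a rate limit."""
    raise RateLimitError(f"rate limited while searching for {query!r}")


def describe_failure(query: str) -> str:
    """Handle the most specific exception first, then the broader one."""
    try:
        flaky_search(query)
    except RateLimitError as exc:
        return f"retry later: {exc}"
    except SearchError as exc:
        return f"search failed: {exc}"
    return "ok"


message = describe_failure("metformin")
print(message)
```

A caller that does not care about the distinction can catch `DeepCriticalError` alone and still see every error raised by the project.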

---

## 7. TDD Workflow: First Test

### `tests/unit/utils/test_config.py`

```python
"""Unit tests for configuration loading."""
import os
from unittest.mock import patch

import pytest


class TestSettings:
    """Tests for Settings class."""

    def test_default_max_iterations(self):
        """Settings should have default max_iterations of 10."""
        from src.utils.config import Settings

        # Clear env vars and skip any local .env so only defaults apply
        with patch.dict(os.environ, {}, clear=True):
            settings = Settings(_env_file=None)
            assert settings.max_iterations == 10

    def test_max_iterations_from_env(self):
        """Settings should read MAX_ITERATIONS from env."""
        from src.utils.config import Settings

        with patch.dict(os.environ, {"MAX_ITERATIONS": "25"}):
            settings = Settings()
            assert settings.max_iterations == 25

    def test_invalid_max_iterations_raises(self):
        """Settings should reject invalid max_iterations."""
        from src.utils.config import Settings
        from pydantic import ValidationError

        with patch.dict(os.environ, {"MAX_ITERATIONS": "100"}):
            with pytest.raises(ValidationError):
                Settings()  # 100 > 50 (max)

    def test_get_api_key_openai(self):
        """get_api_key should return OpenAI key when provider is openai."""
        from src.utils.config import Settings

        with patch.dict(os.environ, {
            "LLM_PROVIDER": "openai",
            "OPENAI_API_KEY": "sk-test-key"
        }):
            settings = Settings()
            assert settings.get_api_key() == "sk-test-key"

    def test_get_api_key_missing_raises(self):
        """get_api_key should raise when key is not set."""
        from src.utils.config import Settings

        with patch.dict(os.environ, {"LLM_PROVIDER": "openai"}, clear=True):
            settings = Settings(_env_file=None)  # Ignore any local .env
            with pytest.raises(ValueError, match="OPENAI_API_KEY not set"):
                settings.get_api_key()
```

---

## 8. Makefile (Developer Experience)

Create a `Makefile` for standard devex commands:

```makefile
.PHONY: install test test-cov lint format typecheck check clean

install:
	uv sync --all-extras
	uv run pre-commit install

test:
	uv run pytest tests/unit/ -v

test-cov:
	uv run pytest --cov=src --cov-report=term-missing

lint:
	uv run ruff check src tests

format:
	uv run ruff format src tests

typecheck:
	uv run mypy src

check: lint typecheck test
	@echo "All checks passed!"

clean:
	rm -rf .pytest_cache .mypy_cache .ruff_cache __pycache__ .coverage
	find . -type d -name "__pycache__" -exec rm -rf {} + 2>/dev/null || true
```

---

## 9. Execution Commands

```bash
# Install all dependencies
uv sync --all-extras

# Run tests (should pass after implementing config.py)
uv run pytest tests/unit/utils/test_config.py -v

# Run full test suite with coverage
uv run pytest --cov=src --cov-report=term-missing

# Run linting
uv run ruff check src tests
uv run ruff format src tests

# Run type checking
uv run mypy src

# Set up pre-commit hooks
uv run pre-commit install
```

---

## 10. Implementation Checklist

- [ ] Install `uv` and verify version
- [ ] Run `uv init --name deepcritical`
- [ ] Create `pyproject.toml` (copy from above)
- [ ] Create directory structure (run mkdir commands)
- [ ] Create `.env.example` and `.env`
- [ ] Create `.pre-commit-config.yaml`
- [ ] Create `Makefile` (copy from above)
- [ ] Create `tests/conftest.py`
- [ ] Implement `src/utils/config.py`
- [ ] Implement `src/utils/exceptions.py`
- [ ] Write tests in `tests/unit/utils/test_config.py`
- [ ] Run `make install`
- [ ] Run `make check`: **ALL CHECKS MUST PASS**
- [ ] Commit: `git commit -m "feat: phase 1 foundation complete"`

---

## 11. Definition of Done

Phase 1 is **COMPLETE** when:

1. `uv run pytest` passes with 100% of tests green
2. `uv run ruff check src tests` has 0 errors
3. `uv run mypy src` has 0 errors
4. Pre-commit hooks are installed and working
5. `from src.utils.config import settings` works in Python REPL
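
The first three criteria can be checked in one pass. The following is a minimal sketch, assuming `uv` is on the `PATH` and the script is run from the project root; `CHECKS` and `failed_checks` are hypothetical names, and items 4 and 5 still need to be verified by hand.

```python
import subprocess
from collections.abc import Callable, Sequence

# The three commands from the Definition of Done, as argument lists
CHECKS: list[list[str]] = [
    ["uv", "run", "pytest"],
    ["uv", "run", "ruff", "check", "src", "tests"],
    ["uv", "run", "mypy", "src"],
]


def _run(cmd: Sequence[str]) -> int:
    """Run one command and return its exit code (0 means success)."""
    return subprocess.run(list(cmd), check=False).returncode


def failed_checks(
    checks: Sequence[Sequence[str]],
    run: Callable[[Sequence[str]], int] = _run,
) -> list[str]:
    """Return the commands that exited non-zero, in the order they ran."""
    return [" ".join(cmd) for cmd in checks if run(cmd) != 0]
```

Calling `failed_checks(CHECKS)` returns an empty list when all three commands succeed. The `run` parameter exists so the function can be exercised with a stub instead of spawning real processes.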

**Proceed to Phase 2 ONLY after all checkboxes are complete.**