# Services API Reference

This page documents the API for DeepCritical services.

## EmbeddingService

**Module**: `src.services.embeddings`

**Purpose**: Runs sentence-transformers models locally for semantic search and deduplication.

### Methods

#### `embed`

```python
async def embed(self, text: str) -> list[float]
```

Generates an embedding for a text string.

**Parameters**:
- `text`: Text to embed

**Returns**: Embedding vector as list of floats.

#### `embed_batch`

```python
async def embed_batch(self, texts: list[str]) -> list[list[float]]
```

Generates embeddings for multiple texts.

**Parameters**:
- `texts`: List of texts to embed

**Returns**: List of embedding vectors.
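
Both `embed` and `embed_batch` are coroutines, so they have to be awaited from an event loop. The sketch below shows that calling pattern only. `StubEmbeddingService` is a hypothetical stand-in with a toy two-number "embedding"; it is not the real `EmbeddingService` and does not reflect how the actual model computes vectors.

```python
import asyncio


class StubEmbeddingService:
    """Hypothetical stand-in used only to show the async calling pattern."""

    async def embed(self, text: str) -> list[float]:
        # Toy vector: [character count, vowel count]. A real embedding
        # model would return a much longer, learned vector.
        vowels = sum(ch in "aeiou" for ch in text.lower())
        return [float(len(text)), float(vowels)]

    async def embed_batch(self, texts: list[str]) -> list[list[float]]:
        # One vector per input text, in the same order as the input.
        return [await self.embed(t) for t in texts]


async def main() -> list[list[float]]:
    service = StubEmbeddingService()
    return await service.embed_batch(["aspirin", "ibuprofen"])


if __name__ == "__main__":
    print(asyncio.run(main()))
```

With the real service, the same `await service.embed_batch(...)` shape should apply, assuming the instance is obtained through the factory function documented on this page.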

#### `similarity`

```python
async def similarity(self, text1: str, text2: str) -> float
```

Calculates similarity between two texts.

**Parameters**:
- `text1`: First text
- `text2`: Second text

**Returns**: Similarity score (0.0-1.0).
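
This page does not state which metric `similarity` uses. Cosine similarity is a common choice for sentence embeddings, so the following is a minimal sketch under that assumption. `cosine_similarity` is a hypothetical helper, not part of the service API.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        # Convention chosen for this sketch: a zero vector matches nothing.
        return 0.0
    return dot / (norm_a * norm_b)
```

Raw cosine similarity can be negative for arbitrary vectors, so a documented 0.0-1.0 range would imply either non-negative scores in practice or some clamping or rescaling. Which of these applies here is not stated on this page.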

#### `find_duplicates`

```python
async def find_duplicates(
    self,
    texts: list[str],
    threshold: float = 0.85
) -> list[tuple[int, int]]
```

Finds duplicate texts based on a similarity threshold.

**Parameters**:
- `texts`: List of texts to check
- `threshold`: Similarity threshold (default: 0.85)

**Returns**: List of (index1, index2) tuples for duplicate pairs.
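
One way to picture the returned index pairs is a pairwise comparison over precomputed embedding vectors. The sketch below assumes cosine similarity and an inclusive (`>=`) threshold. Both are assumptions, and `find_duplicate_pairs` is a hypothetical helper rather than the service's implementation.

```python
import math
from itertools import combinations


def _cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def find_duplicate_pairs(
    vectors: list[list[float]],
    threshold: float = 0.85,
) -> list[tuple[int, int]]:
    """Return (i, j) index pairs with i < j whose similarity >= threshold."""
    pairs = []
    # combinations() yields each unordered pair exactly once, with i < j.
    for (i, a), (j, b) in combinations(enumerate(vectors), 2):
        if _cosine(a, b) >= threshold:
            pairs.append((i, j))
    return pairs
```

This compares every pair, so the work grows quadratically with the number of texts. That is fine for small evidence lists, but it may need batching or an approximate index for large ones.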

### Factory Function

#### `get_embedding_service`

```python
@lru_cache(maxsize=1)
def get_embedding_service() -> EmbeddingService
```

Returns the singleton `EmbeddingService` instance.
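
The `@lru_cache(maxsize=1)` decorator on a zero-argument function is what gives the singleton behaviour: the first call builds the object, and later calls return the cached one. The sketch below shows the pattern with a hypothetical `DemoService` class in place of the real service.

```python
from functools import lru_cache


class DemoService:
    """Hypothetical stand-in for a service that is expensive to construct."""

    def __init__(self) -> None:
        self.ready = True


@lru_cache(maxsize=1)
def get_demo_service() -> DemoService:
    # Runs once; subsequent calls return the cached instance.
    return DemoService()


first = get_demo_service()
second = get_demo_service()
# first and second refer to the same object.
```

In tests, calling `get_demo_service.cache_clear()` forces the next call to build a fresh instance. `cache_clear` is part of the standard `functools.lru_cache` wrapper.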

## LlamaIndexRAGService

**Module**: `src.services.rag`

**Purpose**: Retrieval-Augmented Generation using LlamaIndex.

### Methods

#### `ingest_evidence`

```python
async def ingest_evidence(self, evidence: list[Evidence]) -> None
```

Ingests evidence into the RAG service.

**Parameters**:
- `evidence`: List of Evidence objects to ingest

**Note**: Requires an OpenAI API key for embeddings.

#### `retrieve`

```python
async def retrieve(
    self,
    query: str,
    top_k: int = 5
) -> list[Document]
```

Retrieves relevant documents for a query.

**Parameters**:
- `query`: Search query string
- `top_k`: Number of top results to return (default: 5)

**Returns**: List of Document objects with metadata.
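
To illustrate what `top_k` means, the sketch below selects the highest-scoring items from a list of already scored documents. It is an assumption-laden illustration: the real service delegates retrieval to LlamaIndex, and `top_k_by_score` and the `(score, text)` tuples are hypothetical.

```python
import heapq


def top_k_by_score(
    scored_docs: list[tuple[float, str]],
    top_k: int = 5,
) -> list[str]:
    """Return the text of the top_k highest-scoring documents, best first."""
    if top_k <= 0:
        return []
    # nlargest returns at most top_k items, sorted from highest score down.
    best = heapq.nlargest(top_k, scored_docs, key=lambda pair: pair[0])
    return [text for _score, text in best]
```

If fewer than `top_k` documents are available, this sketch returns all of them. Whether the real `retrieve` behaves the same way is not stated on this page.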

#### `query`

```python
async def query(
    self,
    query: str,
    top_k: int = 5
) -> str
```

Queries the RAG service and returns formatted results.

**Parameters**:
- `query`: Search query string
- `top_k`: Number of top results to return (default: 5)

**Returns**: Formatted query results as a string.

### Factory Function

#### `get_rag_service`

```python
@lru_cache(maxsize=1)
def get_rag_service() -> LlamaIndexRAGService | None
```

Returns the singleton `LlamaIndexRAGService` instance, or `None` if no OpenAI API key is available.
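
Because the factory can return `None`, callers need to handle the missing-service case. The sketch below shows one possible shape for such a factory. The `OPENAI_API_KEY` environment variable and the `DemoRAGService` class are assumptions made for illustration, since this page does not say how the key is detected.

```python
import os
from functools import lru_cache
from typing import Optional


class DemoRAGService:
    """Hypothetical stand-in for the real RAG service."""


@lru_cache(maxsize=1)
def get_demo_rag_service() -> Optional[DemoRAGService]:
    # Assumed check: treat a missing or empty variable as "no key".
    if not os.environ.get("OPENAI_API_KEY"):
        return None
    return DemoRAGService()


def describe_rag_status() -> str:
    service = get_demo_rag_service()
    if service is None:
        return "RAG disabled: no OpenAI API key"
    return "RAG enabled"
```

`lru_cache` also caches a `None` result. In a factory shaped like this sketch, setting the key after the first call has no effect until `cache_clear()` is called.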

## StatisticalAnalyzer

**Module**: `src.services.statistical_analyzer`

**Purpose**: Secure execution of AI-generated statistical code.

### Methods

#### `analyze`

```python
async def analyze(
    self,
    hypothesis: str,
    evidence: list[Evidence],
    data_description: str | None = None
) -> AnalysisResult
```

Analyzes a hypothesis using statistical methods.

**Parameters**:
- `hypothesis`: Hypothesis to analyze
- `evidence`: List of Evidence objects
- `data_description`: Optional data description

**Returns**: `AnalysisResult` with:
- `verdict`: SUPPORTED, REFUTED, or INCONCLUSIVE
- `code`: Generated analysis code
- `output`: Execution output
- `error`: Error message if execution failed

**Note**: Requires Modal credentials for sandbox execution.
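
A caller will typically branch on `error` first and then on `verdict`. The sketch below mirrors the four documented fields in a hypothetical `DemoAnalysisResult` dataclass. The field types and the representation of the verdict as a plain string are assumptions; the real `AnalysisResult` may differ.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class DemoAnalysisResult:
    """Hypothetical mirror of the documented fields; not the real class."""

    verdict: str  # "SUPPORTED", "REFUTED", or "INCONCLUSIVE"
    code: str
    output: str
    error: Optional[str] = None


def summarize(result: DemoAnalysisResult) -> str:
    # Check for an execution failure before trusting the verdict.
    if result.error:
        return f"Analysis failed: {result.error}"
    if result.verdict == "SUPPORTED":
        return "Evidence supports the hypothesis."
    if result.verdict == "REFUTED":
        return "Evidence refutes the hypothesis."
    return "Evidence is inconclusive."
```

Checking `error` first avoids reporting a verdict that came from a run whose code never executed successfully.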

## See Also

- [Architecture - Services](../architecture/services.md) - Architecture overview
- [Configuration](../configuration/index.md) - Service configuration