Joseph Pollack

Services API Reference

This page documents the API for DeepCritical services.

EmbeddingService

Module: src.services.embeddings

Purpose: Local sentence-transformers embeddings for semantic search and deduplication.

Methods

embed

async def embed(self, text: str) -> list[float]

Generates an embedding for a text string.

Parameters:

  • text: Text to embed

Returns: Embedding vector as list of floats.

embed_batch

async def embed_batch(self, texts: list[str]) -> list[list[float]]

Generates embeddings for multiple texts.

Parameters:

  • texts: List of texts to embed

Returns: List of embedding vectors.
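Both methods are coroutines and must be awaited. A hypothetical usage sketch with a stubbed service (the stub and its toy fixed-size vectors are illustrative only, not the real model-backed implementation):

```python
import asyncio


class StubEmbeddingService:
    """Stand-in with the same async surface as EmbeddingService."""

    async def embed(self, text: str) -> list[float]:
        # The real service runs a sentence-transformers model;
        # the stub just returns a toy 2-dimensional vector.
        return [float(len(text)), 0.0]

    async def embed_batch(self, texts: list[str]) -> list[list[float]]:
        return [await self.embed(t) for t in texts]


async def main() -> None:
    service = StubEmbeddingService()
    vec = await service.embed("hello")
    batch = await service.embed_batch(["a", "bb"])
    print(len(vec), len(batch))  # 2 2


asyncio.run(main())
```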

similarity

async def similarity(self, text1: str, text2: str) -> float

Calculates similarity between two texts.

Parameters:

  • text1: First text
  • text2: Second text

Returns: Similarity score (0.0-1.0).
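Similarity over embeddings is conventionally cosine similarity; a minimal sketch of that computation, with negative scores clamped to 0.0 to match the documented 0.0-1.0 range (this exact formula is an assumption, not the service's verified code):

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine similarity of two embedding vectors, clamped to [0.0, 1.0]."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    if norm == 0.0:
        return 0.0
    return max(0.0, dot / norm)


print(cosine_similarity([1.0, 0.0], [1.0, 0.0]))  # identical vectors -> 1.0
print(cosine_similarity([1.0, 0.0], [0.0, 1.0]))  # orthogonal vectors -> 0.0
```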

find_duplicates

async def find_duplicates(
    self,
    texts: list[str],
    threshold: float = 0.85
) -> list[tuple[int, int]]

Finds duplicate texts based on similarity threshold.

Parameters:

  • texts: List of texts to check
  • threshold: Similarity threshold (default: 0.85)

Returns: List of (index1, index2) tuples for duplicate pairs.
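The returned pairs can be produced by comparing every pair of embedded texts against the threshold. A self-contained sketch over precomputed vectors (the pairwise loop is an assumption about the approach, not the service's exact code):

```python
import math


def find_duplicate_pairs(
    vectors: list[list[float]], threshold: float = 0.85
) -> list[tuple[int, int]]:
    """Return (i, j) index pairs whose cosine similarity meets the threshold."""

    def cos(a: list[float], b: list[float]) -> float:
        dot = sum(x * y for x, y in zip(a, b))
        na = math.sqrt(sum(x * x for x in a))
        nb = math.sqrt(sum(y * y for y in b))
        return dot / (na * nb) if na and nb else 0.0

    pairs = []
    for i in range(len(vectors)):
        for j in range(i + 1, len(vectors)):
            if cos(vectors[i], vectors[j]) >= threshold:
                pairs.append((i, j))
    return pairs


vecs = [[1.0, 0.0], [0.99, 0.1], [0.0, 1.0]]
print(find_duplicate_pairs(vecs))  # [(0, 1)]: first two vectors are near-duplicates
```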

Factory Function

get_embedding_service

@lru_cache(maxsize=1)
def get_embedding_service() -> EmbeddingService

Returns a singleton EmbeddingService instance.
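The @lru_cache(maxsize=1) decorator on a zero-argument factory is a standard singleton pattern: the first call constructs the service and every later call returns the cached instance. A minimal sketch with a stand-in class:

```python
from functools import lru_cache


class EmbeddingService:
    """Stand-in for src.services.embeddings.EmbeddingService."""


@lru_cache(maxsize=1)
def get_embedding_service() -> EmbeddingService:
    # Constructed once; subsequent calls hit the cache.
    return EmbeddingService()


a = get_embedding_service()
b = get_embedding_service()
print(a is b)  # True: both names refer to the same singleton
```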

LlamaIndexRAGService

Module: src.services.rag

Purpose: Retrieval-Augmented Generation using LlamaIndex.

Methods

ingest_evidence

async def ingest_evidence(self, evidence: list[Evidence]) -> None

Ingests evidence into the RAG service.

Parameters:

  • evidence: List of Evidence objects to ingest

Note: Requires OpenAI API key for embeddings.

retrieve

async def retrieve(
    self,
    query: str,
    top_k: int = 5
) -> list[Document]

Retrieves relevant documents for a query.

Parameters:

  • query: Search query string
  • top_k: Number of top results to return (default: 5)

Returns: List of Document objects with metadata.

query

async def query(
    self,
    query: str,
    top_k: int = 5
) -> str

Queries the RAG service and returns formatted results.

Parameters:

  • query: Search query string
  • top_k: Number of top results to return (default: 5)

Returns: Formatted query results as string.
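Conceptually, query is retrieve followed by a formatting step. A hypothetical sketch of rendering retrieved documents into one string (the simplified Document shape and the output format here are assumptions, not the service's actual rendering):

```python
from dataclasses import dataclass, field


@dataclass
class Document:
    """Simplified stand-in for the retrieved document type."""

    text: str
    metadata: dict = field(default_factory=dict)


def format_results(query: str, docs: list[Document]) -> str:
    """Render retrieved documents as a numbered, human-readable block."""
    lines = [f"Results for: {query}"]
    for i, doc in enumerate(docs, start=1):
        source = doc.metadata.get("source", "unknown")
        lines.append(f"{i}. [{source}] {doc.text}")
    return "\n".join(lines)


docs = [Document("CRISPR edits genes.", {"source": "pubmed"})]
print(format_results("gene editing", docs))
```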

Factory Function

get_rag_service

@lru_cache(maxsize=1)
def get_rag_service() -> LlamaIndexRAGService | None

Returns a singleton LlamaIndexRAGService instance, or None if no OpenAI API key is available.

StatisticalAnalyzer

Module: src.services.statistical_analyzer

Purpose: Secure execution of AI-generated statistical code.

Methods

analyze

async def analyze(
    self,
    hypothesis: str,
    evidence: list[Evidence],
    data_description: str | None = None
) -> AnalysisResult

Analyzes a hypothesis using statistical methods.

Parameters:

  • hypothesis: Hypothesis to analyze
  • evidence: List of Evidence objects
  • data_description: Optional data description

Returns: AnalysisResult with:

  • verdict: SUPPORTED, REFUTED, or INCONCLUSIVE
  • code: Generated analysis code
  • output: Execution output
  • error: Error message if execution failed

Note: Requires Modal credentials for sandbox execution.

See Also