Article EMO: Pretraining mixture of experts for emergent modularity about 12 hours ago • 15
EmbedDistill: A Geometric Knowledge Distillation for Information Retrieval Paper • 2301.12005 • Published Jul 3, 2023 • 1
XTR Replicability Collection All the models used in experiments from "A Replicability Study of XTR" • 16 items • Updated 4 days ago • 6
Prism-Reranker: Beyond Relevance Scoring -- Jointly Producing Contributions and Evidence for Agentic Retrieval Paper • 2604.23734 • Published 13 days ago • 3
BidirLM-Embedding Collection BidirLM is a family of 5 frontier bidirectional encoders, including an omnimodal variant at 2.5B. • 6 items • Updated Apr 7 • 6
Embed Mamba2 Collection Text embedding models based on Mamba2 with linear-time and constant-memory inference through vertical chunking. • 5 items • Updated 18 days ago • 3
VISA: Retrieval Augmented Generation with Visual Source Attribution Paper • 2412.14457 • Published Dec 19, 2024 • 1
Scaling Language-Centric Omnimodal Representation Learning Paper • 2510.11693 • Published Oct 13, 2025 • 108
DenseOn & LateOn Collection A collection of open state-of-the-art single and multi-vector models • 7 items • Updated 17 days ago • 9
Article DenseOn with the LateOn: Open State-of-the-Art Single and Multi-Vector Models 18 days ago • 37
Portuguese PII and De-Identification Collection 35 open-source Portuguese PII detection models. 54 entity types. Apache 2.0. • 31 items • Updated 19 days ago • 25