Ai2 Release Partners

Team

non-profit

Recent Activity

soldni

authored 11 papers 10 days ago

2 OLMo 2 Furious

Paper • 2501.00656 • Published Dec 31, 2024 • 22

Organize the Web: Constructing Domains Enhances Pre-Training Data Curation

Paper • 2502.10341 • Published Feb 14, 2025 • 3

olmOCR: Unlocking Trillions of Tokens in PDFs with Vision Language Models

Paper • 2502.18443 • Published Feb 25, 2025 • 9

DataDecide: How to Predict Best Pretraining Data with Small Experiments

Paper • 2504.11393 • Published Apr 15, 2025 • 18

Teaching Models to Understand (but not Generate) High-risk Data

Paper • 2505.03052 • Published May 5, 2025 • 6

The Common Pile v0.1: An 8TB Dataset of Public Domain and Openly Licensed Text

Paper • 2506.05209 • Published Jun 5, 2025 • 60

FlexOlmo: Open Language Models for Flexible Data Use

Paper • 2507.07024 • Published Jul 9, 2025 • 9

olmOCR 2: Unit Test Rewards for Document OCR

Paper • 2510.19817 • Published Oct 22, 2025 • 16

DR Tulu: Reinforcement Learning with Evolving Rubrics for Deep Research

Paper • 2511.19399 • Published Nov 24, 2025 • 61

Olmo 3

Paper • 2512.13961 • Published Dec 15, 2025 • 28

Bolmo: Byteifying the Next Generation of Language Models

Paper • 2512.15586 • Published Dec 17, 2025 • 17

tairaa

authored a paper 29 days ago

Molmo2: Open Weights and Data for Vision-Language Models with Video Understanding and Grounding

Paper • 2601.10611 • Published 30 days ago • 28

pradeepd

authored a paper 3 months ago

DR Tulu: Reinforcement Learning with Evolving Rubrics for Deep Research

Paper • 2511.19399 • Published Nov 24, 2025 • 61

natolambert

authored a paper 10 months ago

Reinforcement Learning from Human Feedback

Paper • 2504.12501 • Published Apr 16, 2025 • 4

soldni

authored a paper 10 months ago

OLMoTrace: Tracing Language Model Outputs Back to Trillions of Training Tokens

Paper • 2504.07096 • Published Apr 9, 2025 • 77

tairaa

authored a paper 10 months ago

OLMoTrace: Tracing Language Model Outputs Back to Trillions of Training Tokens

Paper • 2504.07096 • Published Apr 9, 2025 • 77

taylorb

authored a paper 10 months ago

OLMoTrace: Tracing Language Model Outputs Back to Trillions of Training Tokens

Paper • 2504.07096 • Published Apr 9, 2025 • 77

pradeepd

authored a paper 12 months ago

Large-Scale Data Selection for Instruction Tuning

Paper • 2503.01807 • Published Mar 3, 2025 • 14

natolambert

authored 2 papers about 1 year ago

Objective Mismatch in Model-based Reinforcement Learning

Paper • 2002.04523 • Published Feb 11, 2020

Confidence-Building Measures for Artificial Intelligence: Workshop Proceedings

Paper • 2308.00862 • Published Aug 1, 2023