FastContext: Training Efficient Repository Explorer for Coding Agents Paper • 2606.14066 • Published 6 days ago • 81
VibeThinker-3B: Exploring the Frontier of Verifiable Reasoning in Small Language Models Paper • 2606.16140 • Published 3 days ago • 81
Data Journalist Agent: Transforming Data into Verifiable Multimodal Stories Paper • 2606.11176 • Published 9 days ago • 108
Memory is Reconstructed, Not Retrieved: Graph Memory for LLM Agents Paper • 2606.06036 • Published 14 days ago • 64
WeaveBench: A Long-Horizon, Real-World Benchmark for Computer-Use Agents with Hybrid Interfaces Paper • 2606.09426 • Published 10 days ago • 100
Gliner Guard v1 Collection GLiNER2-based guardrail for PII, content safety classification, prompt attack detection, and more via a single forward pass • 5 items • Updated May 9 • 7
Article PaddleOCR 3.5: Running OCR and Document Parsing Tasks with a Transformers Backend PaddlePaddle • 30 days ago • 37
Article MTEB Leaderboard: From a slow demo to feature-rich leaderboard Samoed • 5 days ago • 21
SubtleMemory: A Benchmark for Fine-Grained Relational Memory Discrimination in Long-Horizon AI Agents Paper • 2606.05761 • Published 14 days ago • 19
When Tools Fail: Benchmarking Dynamic Replanning and Anomaly Recovery in LLM Agents Paper • 2606.05806 • Published 14 days ago • 23
LLM Explainability with Counterfactual Chains and Causal Graphs Paper • 2606.05972 • Published 14 days ago • 17
Crafter: A Multi-Agent Harness for Editable Scientific Figure Generation from Diverse Inputs Paper • 2605.30611 • Published 21 days ago • 193
COLLEAGUE.SKILL: Automated AI Skill Generation via Expert Knowledge Distillation Paper • 2605.31264 • Published 20 days ago • 112
Trust-Region Behavior Blending for On-Policy Distillation Paper • 2605.31159 • Published 20 days ago • 66
JLT: Clean-Latent Prediction in Latent Diffusion Transformers Paper • 2605.27102 • Published 23 days ago • 33
Macaron-A2UI: A Model for Generative UI in Personal Agents Paper • 2605.24830 • Published 25 days ago • 82