papers - a passagereptile455 Collection

papers

updated 13 days ago

GR00T N1: An Open Foundation Model for Generalist Humanoid Robots

Paper • 2503.14734 • Published Mar 18, 2025 • 5
Mobile ALOHA: Learning Bimanual Mobile Manipulation with Low-Cost Whole-Body Teleoperation

Paper • 2401.02117 • Published Jan 4, 2024 • 33
SmolVLA: A Vision-Language-Action Model for Affordable and Efficient Robotics

Paper • 2506.01844 • Published Jun 2, 2025 • 148
Vision-Guided Chunking Is All You Need: Enhancing RAG with Multimodal Document Understanding

Paper • 2506.16035 • Published Jun 19, 2025 • 89
Deep Researcher with Test-Time Diffusion

Paper • 2507.16075 • Published Jul 21, 2025 • 68
The Geometry of LLM Quantization: GPTQ as Babai's Nearest Plane Algorithm

Paper • 2507.18553 • Published Jul 24, 2025 • 41
MMBench-GUI: Hierarchical Multi-Platform Evaluation Framework for GUI Agents

Paper • 2507.19478 • Published Jul 25, 2025 • 32
CLEAR: Error Analysis via LLM-as-a-Judge Made Easy

Paper • 2507.18392 • Published Jul 24, 2025 • 20
PRIX: Learning to Plan from Raw Pixels for End-to-End Autonomous Driving

Paper • 2507.17596 • Published Jul 23, 2025 • 7
Specification Self-Correction: Mitigating In-Context Reward Hacking Through Test-Time Refinement

Paper • 2507.18742 • Published Jul 24, 2025 • 6
Chat with AI: The Surprising Turn of Real-time Video Communication from Human to AI

Paper • 2507.10510 • Published Jul 14, 2025 • 5
GEPA: Reflective Prompt Evolution Can Outperform Reinforcement Learning

Paper • 2507.19457 • Published Jul 25, 2025 • 29
Frontier AI Risk Management Framework in Practice: A Risk Analysis Technical Report

Paper • 2507.16534 • Published Jul 22, 2025 • 8
A Survey of Context Engineering for Large Language Models

Paper • 2507.13334 • Published Jul 17, 2025 • 261
GLM-4.1V-Thinking: Towards Versatile Multimodal Reasoning with Scalable Reinforcement Learning

Paper • 2507.01006 • Published Jul 1, 2025 • 250
Group Sequence Policy Optimization

Paper • 2507.18071 • Published Jul 24, 2025 • 316
Scaling RL to Long Videos

Paper • 2507.07966 • Published Jul 10, 2025 • 160
MemOS: A Memory OS for AI System

Paper • 2507.03724 • Published Jul 4, 2025 • 159
Kwai Keye-VL Technical Report

Paper • 2507.01949 • Published Jul 2, 2025 • 130
GUI-G^2: Gaussian Reward Modeling for GUI Grounding

Paper • 2507.15846 • Published Jul 21, 2025 • 133
Beyond Context Limits: Subconscious Threads for Long-Horizon Reasoning

Paper • 2507.16784 • Published Jul 22, 2025 • 122
T-LoRA: Single Image Diffusion Model Customization Without Overfitting

Paper • 2507.05964 • Published Jul 8, 2025 • 120
MiroMind-M1: An Open-Source Advancement in Mathematical Reasoning via Context-Aware Multi-Stage Policy Optimization

Paper • 2507.14683 • Published Jul 19, 2025 • 134
LongMemEval: Benchmarking Chat Assistants on Long-Term Interactive Memory

Paper • 2410.10813 • Published Oct 14, 2024 • 14
LiveCodeBench Pro: How Do Olympiad Medalists Judge LLMs in Competitive Programming?

Paper • 2506.11928 • Published Jun 13, 2025 • 24
Defeating Prompt Injections by Design

Paper • 2503.18813 • Published Mar 24, 2025 • 24
Darwin Godel Machine: Open-Ended Evolution of Self-Improving Agents

Paper • 2505.22954 • Published May 29, 2025 • 14
Questioning Representational Optimism in Deep Learning: The Fractured Entangled Representation Hypothesis

Paper • 2505.11581 • Published May 16, 2025 • 3
The AI Scientist: Towards Fully Automated Open-Ended Scientific Discovery

Paper • 2408.06292 • Published Aug 12, 2024 • 128
Evaluating Large Language Models Trained on Code

Paper • 2107.03374 • Published Jul 7, 2021 • 8
Self-Refine: Iterative Refinement with Self-Feedback

Paper • 2303.17651 • Published Mar 30, 2023 • 2
Gorilla: Large Language Model Connected with Massive APIs

Paper • 2305.15334 • Published May 24, 2023 • 5
HuggingGPT: Solving AI Tasks with ChatGPT and its Friends in HuggingFace

Paper • 2303.17580 • Published Mar 30, 2023 • 15
Communicative Agents for Software Development

Paper • 2307.07924 • Published Jul 16, 2023 • 6
AutoGen: Enabling Next-Gen LLM Applications via Multi-Agent Conversation Framework

Paper • 2308.08155 • Published Aug 16, 2023 • 10
The Illusion of Diminishing Returns: Measuring Long Horizon Execution in LLMs

Paper • 2509.09677 • Published Sep 11, 2025 • 35
In-the-Flow Agentic System Optimization for Effective Planning and Tool Use

Paper • 2510.05592 • Published Oct 7, 2025 • 107
Absolute Zero: Reinforced Self-play Reasoning with Zero Data

Paper • 2505.03335 • Published May 6, 2025 • 189
Inference-Time Scaling for Generalist Reward Modeling

Paper • 2504.02495 • Published Apr 3, 2025 • 58
BAP v2: An Enhanced Task Framework for Instruction Following in Minecraft Dialogues

Paper • 2501.10836 • Published Jan 18, 2025 • 1
Executable Code Actions Elicit Better LLM Agents

Paper • 2402.01030 • Published Feb 1, 2024 • 186
DynaSaur: Large Language Agents Beyond Predefined Actions

Paper • 2411.01747 • Published Nov 4, 2024 • 37
If LLM Is the Wizard, Then Code Is the Wand: A Survey on How Code Empowers Large Language Models to Serve as Intelligent Agents

Paper • 2401.00812 • Published Jan 1, 2024 • 11
Agent Data Protocol: Unifying Datasets for Diverse, Effective Fine-tuning of LLM Agents

Paper • 2510.24702 • Published Oct 28, 2025 • 29
Strategic Dishonesty Can Undermine AI Safety Evaluations of Frontier LLMs

Paper • 2509.18058 • Published Sep 22, 2025 • 12
Speculative Safety-Aware Decoding

Paper • 2508.17739 • Published Aug 25, 2025
Latent Fusion Jailbreak: Blending Harmful and Harmless Representations to Elicit Unsafe LLM Outputs

Paper • 2508.10029 • Published Aug 8, 2025
Context Misleads LLMs: The Role of Context Filtering in Maintaining Safe Alignment of LLMs

Paper • 2508.10031 • Published Aug 9, 2025
Poison Once, Refuse Forever: Weaponizing Alignment for Injecting Bias in LLMs

Paper • 2508.20333 • Published Aug 28, 2025
Mitigating Jailbreaks with Intent-Aware LLMs

Paper • 2508.12072 • Published Aug 16, 2025
D-REX: A Benchmark for Detecting Deceptive Reasoning in Large Language Models

Paper • 2509.17938 • Published Sep 22, 2025 • 4
A Simple and Efficient Jailbreak Method Exploiting LLMs' Helpfulness

Paper • 2509.14297 • Published Sep 17, 2025
Less is More: Recursive Reasoning with Tiny Networks

Paper • 2510.04871 • Published Oct 6, 2025 • 506
HumanEval Pro and MBPP Pro: Evaluating Large Language Models on Self-invoking Code Generation

Paper • 2412.21199 • Published Dec 30, 2024 • 13
Solving Inequality Proofs with Large Language Models

Paper • 2506.07927 • Published Jun 9, 2025 • 20
ReForm: Reflective Autoformalization with Prospective Bounded Sequence Optimization

Paper • 2510.24592 • Published Oct 28, 2025 • 17
FineWeb2: One Pipeline to Scale Them All -- Adapting Pre-Training Data Processing to Every Language

Paper • 2506.20920 • Published Jun 26, 2025 • 77
GAIA: a benchmark for General AI Assistants

Paper • 2311.12983 • Published Nov 21, 2023 • 244
AssetOpsBench: Benchmarking AI Agents for Task Automation in Industrial Asset Operations and Maintenance

Paper • 2506.03828 • Published Jun 4, 2025 • 15
MMGR: Multi-Modal Generative Reasoning

Paper • 2512.14691 • Published Dec 16, 2025 • 118
Next-Embedding Prediction Makes Strong Vision Learners

Paper • 2512.16922 • Published Dec 18, 2025 • 86
mHC: Manifold-Constrained Hyper-Connections

Paper • 2512.24880 • Published Dec 31, 2025 • 295
SmolLM2: When Smol Goes Big -- Data-Centric Training of a Small Language Model

Paper • 2502.02737 • Published Feb 4, 2025 • 254
