DiLu: A Knowledge-Driven Approach to Autonomous Driving with Large Language Models Paper • 2309.16292 • Published Sep 28, 2023 • 1
On the Road with GPT-4V(ision): Early Explorations of Visual-Language Model on Autonomous Driving Paper • 2311.05332 • Published Nov 9, 2023 • 13
Drive Like a Human: Rethinking Autonomous Driving with Large Language Models Paper • 2307.07162 • Published Jul 14, 2023
OASim: an Open and Adaptive Simulator based on Neural Rendering for Autonomous Driving Paper • 2402.03830 • Published Feb 6, 2024 • 2
DocGenome: An Open Large-scale Scientific Document Benchmark for Training and Testing Multi-modal Large Language Models Paper • 2406.11633 • Published Jun 17, 2024 • 1
MM-Eureka: Exploring Visual Aha Moment with Rule-based Large-scale Reinforcement Learning Paper • 2503.07365 • Published Mar 10, 2025 • 61
TrustGeoGen: Scalable and Formal-Verified Data Engine for Trustworthy Multi-modal Geometric Problem Solving Paper • 2504.15780 • Published Apr 22, 2025 • 6
O$^2$-Searcher: A Searching-based Agent Model for Open-Domain Open-Ended Question Answering Paper • 2505.16582 • Published May 22, 2025
KG-TRACES: Enhancing Large Language Models with Knowledge Graph-constrained Trajectory Reasoning and Attribution Supervision Paper • 2506.00783 • Published Jun 1, 2025 • 1
IWR-Bench: Can LVLMs reconstruct interactive webpage from a user interaction video? Paper • 2509.24709 • Published Sep 29, 2025 • 6
RE-Searcher: Robust Agentic Search with Goal-oriented Planning and Self-reflection Paper • 2509.26048 • Published Sep 30, 2025 • 7
Learning on the Job: An Experience-Driven Self-Evolving Agent for Long-Horizon Tasks Paper • 2510.08002 • Published Oct 9, 2025 • 23