Collections
Discover the best community collections!
Collections including paper arxiv:2508.18265

- Can Large Language Models Understand Context?
  Paper • 2402.00858 • Published • 23
- OLMo: Accelerating the Science of Language Models
  Paper • 2402.00838 • Published • 85
- Self-Rewarding Language Models
  Paper • 2401.10020 • Published • 151
- SemScore: Automated Evaluation of Instruction-Tuned LLMs based on Semantic Textual Similarity
  Paper • 2401.17072 • Published • 25

- InternVL3: Exploring Advanced Training and Test-Time Recipes for Open-Source Multimodal Models
  Paper • 2504.10479 • Published • 303
- Qwen3 Technical Report
  Paper • 2505.09388 • Published • 317
- InternVL3.5: Advancing Open-Source Multimodal Models in Versatility, Reasoning, and Efficiency
  Paper • 2508.18265 • Published • 208
- How Far are VLMs from Visual Spatial Intelligence? A Benchmark-Driven Perspective
  Paper • 2509.18905 • Published • 29

- InternVL3.5: Advancing Open-Source Multimodal Models in Versatility, Reasoning, and Efficiency
  Paper • 2508.18265 • Published • 208
- SmolVLM: Redefining small and efficient multimodal models
  Paper • 2504.05299 • Published • 200
- Eagle 2.5: Boosting Long-Context Post-Training for Frontier Vision-Language Models
  Paper • 2504.15271 • Published • 67

- InternVL3.5: Advancing Open-Source Multimodal Models in Versatility, Reasoning, and Efficiency
  Paper • 2508.18265 • Published • 208
- WebWatcher: Breaking New Frontier of Vision-Language Deep Research Agent
  Paper • 2508.05748 • Published • 140
- AgentFly: Fine-tuning LLM Agents without Fine-tuning LLMs
  Paper • 2508.16153 • Published • 158
- Chain-of-Agents: End-to-End Agent Foundation Models via Multi-Agent Distillation and Agentic RL
  Paper • 2508.13167 • Published • 129

- Apriel-1.5-15b-Thinker
  Paper • 2510.01141 • Published • 117
- MMR1: Enhancing Multimodal Reasoning with Variance-Aware Sampling and Open Resources
  Paper • 2509.21268 • Published • 103
- LLaVA-Critic-R1: Your Critic Model is Secretly a Strong Policy Model
  Paper • 2509.00676 • Published • 84
- Visual Representation Alignment for Multimodal Large Language Models
  Paper • 2509.07979 • Published • 83

- rStar2-Agent: Agentic Reasoning Technical Report
  Paper • 2508.20722 • Published • 116
- AgentFly: Fine-tuning LLM Agents without Fine-tuning LLMs
  Paper • 2508.16153 • Published • 158
- Beyond Pass@1: Self-Play with Variational Problem Synthesis Sustains RLVR
  Paper • 2508.14029 • Published • 118
- InternVL3.5: Advancing Open-Source Multimodal Models in Versatility, Reasoning, and Efficiency
  Paper • 2508.18265 • Published • 208

- InternVL3.5: Advancing Open-Source Multimodal Models in Versatility, Reasoning, and Efficiency
  Paper • 2508.18265 • Published • 208
- OpenGVLab/InternVL3_5-241B-A28B-HF
  Image-Text-to-Text • 241B • Updated • 93 • 11
- OpenGVLab/InternVL3_5-38B-HF
  Image-Text-to-Text • 38B • Updated • 893 • 6
- OpenGVLab/InternVL3_5-30B-A3B-HF
  Image-Text-to-Text • 31B • Updated • 256 • 5
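
The three model entries above carry the -HF suffix, which typically marks transformers-native checkpoints, and are tagged with the Image-Text-to-Text task. A minimal loading sketch follows, assuming a recent transformers release that includes the image-text-to-text pipeline, an installed accelerate package for device_map="auto", and hardware large enough for the chosen checkpoint; the image URL is a placeholder, not from this page.

```python
# Minimal sketch, not from the papers above: assumes a recent transformers
# release with the "image-text-to-text" pipeline task and enough accelerator
# memory for a 38B checkpoint.
from transformers import pipeline

pipe = pipeline(
    "image-text-to-text",
    model="OpenGVLab/InternVL3_5-38B-HF",  # one of the checkpoints listed above
    device_map="auto",   # requires accelerate; shards the model across devices
    torch_dtype="auto",  # load in the dtype stored in the checkpoint
)

# Chat-style input: one user turn containing an image plus a text question.
messages = [
    {
        "role": "user",
        "content": [
            {"type": "image", "url": "https://example.com/photo.jpg"},  # placeholder URL
            {"type": "text", "text": "Describe this image in one sentence."},
        ],
    }
]

out = pipe(text=messages, max_new_tokens=64)
print(out[0]["generated_text"])
```

The same call should work for the other two listed checkpoints by swapping the model id; the mixture-of-experts variants (241B-A28B, 30B-A3B) activate only a subset of parameters per token but still need the full weights in memory.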

- A Survey of Context Engineering for Large Language Models
  Paper • 2507.13334 • Published • 259
- GUI-G^2: Gaussian Reward Modeling for GUI Grounding
  Paper • 2507.15846 • Published • 133
- ScreenCoder: Advancing Visual-to-Code Generation for Front-End Automation via Modular Multimodal Agents
  Paper • 2507.22827 • Published • 99
- InternVL3.5: Advancing Open-Source Multimodal Models in Versatility, Reasoning, and Efficiency
  Paper • 2508.18265 • Published • 208