4 54 18

Xinyu Fang

nebulae09

FangXinyu-0913

AI & ML interests

None yet

Recent Activity

authored a paper 3 days ago

IWR-Bench: Can LVLMs reconstruct interactive webpage from a user interaction video?

authored a paper 3 days ago

ATLAS: A High-Difficulty, Multidisciplinary Benchmark for Frontier Scientific Reasoning

authored a paper 3 days ago

ARM-Thinker: Reinforcing Multimodal Generative Reward Models with Agentic Tool Use and Visual Reasoning

View all activity

Organizations

authored 3 papers 3 days ago

IWR-Bench: Can LVLMs reconstruct interactive webpage from a user interaction video?

Paper • 2509.24709 • Published Sep 29 • 6

ATLAS: A High-Difficulty, Multidisciplinary Benchmark for Frontier Scientific Reasoning

Paper • 2511.14366 • Published 20 days ago • 15

ARM-Thinker: Reinforcing Multimodal Generative Reward Models with Agentic Tool Use and Visual Reasoning

Paper • 2512.05111 • Published 3 days ago • 43

upvoted 4 papers 3 days ago

IWR-Bench: Can LVLMs reconstruct interactive webpage from a user interaction video?

Paper • 2509.24709 • Published Sep 29 • 6

ATLAS: A High-Difficulty, Multidisciplinary Benchmark for Frontier Scientific Reasoning

Paper • 2511.14366 • Published 20 days ago • 15

ARM-Thinker: Reinforcing Multimodal Generative Reward Models with Agentic Tool Use and Visual Reasoning

Paper • 2512.05111 • Published 3 days ago • 43

Nex-N1: Agentic Models Trained via a Unified Ecosystem for Large-Scale Environment Construction

Paper • 2512.04987 • Published 4 days ago • 64

liked a Space 13 days ago

ATLAS Benchmark

🧪

ATLAS for Frontier Scientific Benchmark

upvoted a paper 13 days ago

DR Tulu: Reinforcement Learning with Evolving Rubrics for Deep Research

Paper • 2511.19399 • Published 13 days ago • 54

upvoted 3 papers about 1 month ago

Spatial-SSRL: Enhancing Spatial Understanding via Self-Supervised Reinforcement Learning

Paper • 2510.27606 • Published Oct 31 • 27

Video-Thinker: Sparking "Thinking with Videos" via Reinforcement Learning

Paper • 2510.23473 • Published Oct 27 • 84

JanusCoder: Towards a Foundational Visual-Programmatic Interface for Code Intelligence

Paper • 2510.23538 • Published Oct 27 • 96

upvoted 3 papers about 2 months ago

VideoCanvas: Unified Video Completion from Arbitrary Spatiotemporal Patches via In-Context Conditioning

Paper • 2510.08555 • Published Oct 9 • 63

Agent Learning via Early Experience

Paper • 2510.08558 • Published Oct 9 • 266

MM-HELIX: Boosting Multimodal Long-Chain Reflective Reasoning with Holistic Platform and Adaptive Hybrid Policy Optimization

Paper • 2510.08540 • Published Oct 9 • 109

upvoted a paper 2 months ago

MCPMark: A Benchmark for Stress-Testing Realistic and Comprehensive MCP Use

Paper • 2509.24002 • Published Sep 28 • 173

upvoted 2 papers 3 months ago

CMPhysBench: A Benchmark for Evaluating Large Language Models in Condensed Matter Physics

Paper • 2508.18124 • Published Aug 25 • 48

InternVL3.5: Advancing Open-Source Multimodal Models in Versatility, Reasoning, and Efficiency

Paper • 2508.18265 • Published Aug 25 • 208

liked a dataset 4 months ago

HuggingFaceTB/smoltalk2

Viewer • Updated Oct 31 • 8.61M • 9.05k • 127

upvoted a paper 4 months ago

CompassVerifier: A Unified and Robust Verifier for LLMs Evaluation and Outcome Reward

Paper • 2508.03686 • Published Aug 5 • 37

Xinyu Fang

AI & ML interests

Recent Activity

Organizations

nebulae09's activity

ATLAS Benchmark