Modality Gap-Driven Subspace Alignment Training Paradigm For Multimodal Large Language Models Paper • 2602.07026 • Published 9 days ago • 127
UniPercept: Towards Unified Perceptual-Level Image Understanding across Aesthetics, Quality, Structure, and Texture Paper • 2512.21675 • Published Dec 25, 2025 • 25
LoopTool: Closing the Data-Training Loop for Robust LLM Tool Calls Paper • 2511.09148 • Published Nov 12, 2025 • 18
PICABench: How Far Are We from Physically Realistic Image Editing? Paper • 2510.17681 • Published Oct 20, 2025 • 64
ArtiMuse: Fine-Grained Image Aesthetics Assessment with Joint Scoring and Expert-Level Understanding Paper • 2507.14533 • Published Jul 19, 2025 • 6
Lumina-DiMOO: An Omni Diffusion Large Language Model for Multi-Modal Generation and Understanding Paper • 2510.06308 • Published Oct 7, 2025 • 55
Factuality Matters: When Image Generation and Editing Meet Structured Visuals Paper • 2510.05091 • Published Oct 6, 2025 • 20