Mobile User Interface Element Detection Via Adaptively Prompt Tuning Paper • 2305.09699 • Published May 16, 2023
InterAnimate: Taming Region-aware Diffusion Model for Realistic Human Interaction Animation Paper • 2504.10905 • Published Apr 15, 2025
Lumina-DiMOO: An Omni Diffusion Large Language Model for Multi-Modal Generation and Understanding Paper • 2510.06308 • Published Oct 7, 2025 • 53
MultiEdit: Advancing Instruction-based Image Editing on Diverse and Challenging Tasks Paper • 2509.14638 • Published Sep 18, 2025 • 11
Stochastic Layer-Wise Shuffle: A Good Practice to Improve Vision Mamba Training Paper • 2408.17081 • Published Aug 30, 2024
Towards Explainable Fake Image Detection with Multi-Modal Large Language Models Paper • 2504.14245 • Published Apr 19, 2025
Grove MoE: Towards Efficient and Superior MoE LLMs with Adjugate Experts Paper • 2508.07785 • Published Aug 11, 2025 • 28
DeMamba: AI-Generated Video Detection on Million-Scale GenVideo Benchmark Paper • 2405.19707 • Published May 30, 2024 • 8
Model-Aware Contrastive Learning: Towards Escaping the Dilemmas Paper • 2207.07874 • Published Jul 16, 2022