DynamicVerse: A Physically-Aware Multimodal Framework for 4D World Modeling • Paper • 2512.03000 • Published 5 days ago • 25
ARTDECO: Towards Efficient and High-Fidelity On-the-Fly 3D Reconstruction with Structured Scene Representation • Paper • 2510.08551 • Published Oct 9 • 32
StreamVLN: Streaming Vision-and-Language Navigation via SlowFast Context Modeling • Paper • 2507.05240 • Published Jul 7 • 47
IR3D-Bench: Evaluating Vision-Language Model Scene Understanding as Agentic Inverse Rendering • Paper • 2506.23329 • Published Jun 29 • 8
JarvisArt: Liberating Human Artistic Creativity via an Intelligent Photo Retouching Agent • Paper • 2506.17612 • Published Jun 21 • 64
PartCrafter: Structured 3D Mesh Generation via Compositional Latent Diffusion Transformers • Paper • 2506.05573 • Published Jun 5 • 81
SpatialLM: Training Large Language Models for Structured Indoor Modeling • Paper • 2506.07491 • Published Jun 9 • 50
LightGaussian: Unbounded 3D Gaussian Compression with 15x Reduction and 200+ FPS • Paper • 2311.17245 • Published Nov 28, 2023 • 2
SPIN-Bench: How Well Do LLMs Plan Strategically and Reason Socially? • Paper • 2503.12349 • Published Mar 16 • 44
3DGraphLLM: Combining Semantic Graphs and Large Language Models for 3D Scene Understanding • Paper • 2412.18450 • Published Dec 24, 2024 • 36
Large Spatial Model: End-to-end Unposed Images to Semantic 3D • Paper • 2410.18956 • Published Oct 24, 2024 • 1
InstantSplat: Unbounded Sparse-view Pose-free Gaussian Splatting in 40 Seconds • Paper • 2403.20309 • Published Mar 29, 2024 • 19
LGM: Large Multi-View Gaussian Model for High-Resolution 3D Content Creation • Paper • 2402.05054 • Published Feb 7, 2024 • 28