Article Tensor Parallelism (TP) in Transformers: 5 Minutes to Understand • 4 days ago • 45
Post 1104 FYI: Mistral.Ministral-3 dequantizer FP8->BF16: https://github.com/csabakecskemeti/ministral-3_dequantizer_fp8-bf16 (The instruct model weights are in FP8.)
PixelDiT: Pixel Diffusion Transformers for Image Generation Paper • 2511.20645 • Published 12 days ago • 25
Deep Forcing: Training-Free Long Video Generation with Deep Sink and Participative Compression Paper • 2512.05081 • Published 3 days ago • 19
LATTICE: Democratize High-Fidelity 3D Generation at Scale Paper • 2512.03052 • Published 14 days ago • 7
Generative Neural Video Compression via Video Diffusion Prior Paper • 2512.05016 • Published 3 days ago • 7
TV2TV: A Unified Framework for Interleaved Language and Video Generation Paper • 2512.05103 • Published 3 days ago • 10
Reward Forcing: Efficient Streaming Video Generation with Rewarded Distribution Matching Distillation Paper • 2512.04678 • Published 3 days ago • 32
UltraImage: Rethinking Resolution Extrapolation in Image Diffusion Transformers Paper • 2512.04504 • Published 4 days ago • 13
NeuralRemaster: Phase-Preserving Diffusion for Structure-Aligned Generation Paper • 2512.05106 • Published 3 days ago • 11
Lotus-2: Advancing Geometric Dense Prediction with Powerful Image Generative Model Paper • 2512.01030 • Published 7 days ago • 16
Accelerating Streaming Video Large Language Models via Hierarchical Token Compression Paper • 2512.00891 • Published 7 days ago • 14
CookAnything: A Framework for Flexible and Consistent Multi-Step Recipe Image Generation Paper • 2512.03540 • Published 4 days ago • 11