Ming Chen

ChenMing-thu14

AI & ML interests

3D Human Pose Estimation

Organizations

None yet

Recent Activity

upvoted a paper 3 days ago

LongCat-Next: Lexicalizing Modalities as Discrete Tokens

Paper • 2603.27538 • Published 6 days ago • 127

upvoted a paper 4 days ago

PackForcing: Short Video Training Suffices for Long Video Sampling and Long Context Inference

Paper • 2603.25730 • Published 8 days ago • 48

upvoted a paper 5 days ago

ShotStream: Streaming Multi-Shot Video Generation for Interactive Storytelling

Paper • 2603.25746 • Published 8 days ago • 151

upvoted 2 papers 11 days ago

Speed by Simplicity: A Single-Stream Architecture for Fast Audio-Video Generative Foundation Model

Paper • 2603.21986 • Published 11 days ago • 120

Versatile Editing of Video Content, Actions, and Dynamics without Training

Paper • 2603.17989 • Published 16 days ago • 16

upvoted a paper 15 days ago

SAMA: Factorized Semantic Anchoring and Motion Alignment for Instruction-Guided Video Editing

Paper • 2603.19228 • Published 15 days ago • 67

upvoted 2 papers 17 days ago

WorldCam: Interactive Autoregressive 3D Gaming Worlds with Camera Pose as a Unifying Geometric Representation

Paper • 2603.16871 • Published 17 days ago • 60

Grounding World Simulation Models in a Real-World Metropolis

Paper • 2603.15583 • Published 18 days ago • 152

upvoted a paper 18 days ago

OmniForcing: Unleashing Real-time Joint Audio-Visual Generation

Paper • 2603.11647 • Published 22 days ago • 31

upvoted a paper 22 days ago

ShotVerse: Advancing Cinematic Camera Control for Text-Driven Multi-Shot Video Creation

Paper • 2603.11421 • Published 23 days ago • 34

upvoted a paper 25 days ago

EffectMaker: Unifying Reasoning and Generation for Customized Visual Effect Creation

Paper • 2603.06014 • Published 28 days ago • 9

upvoted a paper 26 days ago

Penguin-VL: Exploring the Efficiency Limits of VLM with LLM-based Vision Encoders

Paper • 2603.06569 • Published 28 days ago • 117

upvoted a paper 30 days ago

Helios: Real Real-Time Long Video Generation Model

Paper • 2603.04379 • Published about 1 month ago • 178

upvoted 2 papers about 1 month ago

Kling-MotionControl Technical Report

Paper • 2603.03160 • Published Mar 3 • 26

Generated Reality: Human-centric World Simulation using Interactive Video Generation with Hand and Camera Control

Paper • 2602.18422 • Published Feb 20 • 30

upvoted 5 papers about 2 months ago

TimeChat-Captioner: Scripting Multi-Scene Videos with Time-Aware and Structural Audio-Visual Captions

Paper • 2602.08711 • Published Feb 9 • 28

Covo-Audio Technical Report

Paper • 2602.09823 • Published Feb 10 • 13

Semantic Routing: Exploring Multi-Layer LLM Feature Weighting for Diffusion Transformers

Paper • 2602.03510 • Published Feb 3 • 27

HY3D-Bench: Generation of 3D Assets

Paper • 2602.03907 • Published Feb 3 • 23

3D-Aware Implicit Motion Control for View-Adaptive Human Video Generation

Paper • 2602.03796 • Published Feb 3 • 64
