AI & ML interests

Video generation, video reasoning


VMEvalKit 🎥🧠

Github video reasoning evaluation toolkit
VMevalkit slack community for VMEvalKit

Results | Paper | Hugging Face | WeChat | Homepage

A framework to evaluate reasoning capabilities in video generation models at scale.

Invitation to Collaborate 🤝

VMEvalKit is meant to be a permissively licensed, open-source shared playground for everyone. If you’re interested in machine cognition, video models, evaluation, or anything anything 🦄✨, we’d love to build with you:

  • πŸ§ͺ Add new reasoning tasks (planning, causality, social, physical, etc.)
  • πŸŽ₯ Plug in new video models (APIs or open-source)
  • πŸ“Š Experiment with better evaluation metrics and protocols
  • 🧱 Improve infrastructure, logging, and the web dashboard
  • πŸ“š Use VMEvalKit in your own research and share back configs/scripts
  • πŸŒŸπŸŽ‰ Or Anything anything πŸ¦„βœ¨

💬 Join us on Slack to ask questions, propose ideas, or start a collab: Slack Invite 🚀

Research

Here we keep track of papers spun off from this code infrastructure, along with some works in progress.

This paper implements our experimental framework and demonstrates that leading video generation models (such as Sora-2) can perform visual reasoning tasks with success rates above 60%. See results.
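As a rough illustration of what a per-task success rate means, here is a minimal sketch that assumes each evaluated video is reduced to a pass/fail outcome. The function name, the task names, and the records are hypothetical placeholders. They are not part of VMEvalKit's API and they are not results from the paper.

```python
from collections import defaultdict


def success_rates(records):
    """Return {task: fraction of passed attempts} from (task, passed) pairs."""
    totals = defaultdict(int)
    passed = defaultdict(int)
    for task, ok in records:
        totals[task] += 1
        passed[task] += int(ok)
    return {task: passed[task] / totals[task] for task in totals}


# Made-up placeholder outcomes for illustration only (not real results).
records = [
    ("maze", True),
    ("maze", False),
    ("chess", True),
    ("chess", True),
]
rates = success_rates(records)
for task, rate in rates.items():
    print(f"{task}: {rate:.0%}")
```

Reporting the rate per task rather than one pooled number keeps an easy task family from hiding a weak one.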

License

Apache 2.0

Citation

If you find VMEvalKit useful in your research, please cite:

@misc{VMEvalKit,
  author       = {VMEvalKit Team},
  title        = {VMEvalKit: A framework for evaluating reasoning abilities in foundational video models},
  year         = {2025},
  howpublished = {\url{https://github.com/Video-Reason/VMEvalKit}}
}
