How to use prolongvid/prolongvid_stage2_7B with Transformers:
# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM
tokenizer = AutoTokenizer.from_pretrained("prolongvid/prolongvid_stage2_7B")
model = AutoModelForCausalLM.from_pretrained("prolongvid/prolongvid_stage2_7B")

The ProLongVid-v1 models are 7B-parameter models trained on ProLongVid_data, based on our extended Qwen2.5 language model with a context window of 256K tokens.
This prolongvid_stage2_7B model builds on prolongvid_stage1_7B and is trained on the stage-2 long-video data of ProLongVid_data.
We suggest testing this model with 128 frames.
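
The card does not say how the 128 frames should be chosen. One common option is to sample them uniformly across the video, so the following is a minimal sketch under that assumption. The helper name `uniform_frame_indices` and its parameters are hypothetical and are not part of the ProLongVid code.

```python
def uniform_frame_indices(total_frames: int, num_frames: int = 128) -> list[int]:
    """Pick evenly spaced frame indices from a video.

    Assumption: uniform sampling over the whole clip. The model card only
    recommends 128 frames and does not specify a sampling strategy.
    """
    if total_frames <= 0 or num_frames <= 0:
        raise ValueError("total_frames and num_frames must be positive")
    # Short videos: keep every frame rather than repeating any.
    if total_frames <= num_frames:
        return list(range(total_frames))
    # A single frame: take the middle of the clip.
    if num_frames == 1:
        return [total_frames // 2]
    # Spread indices from the first frame (0) to the last (total_frames - 1).
    last = total_frames - 1
    return [(i * last) // (num_frames - 1) for i in range(num_frames)]


# Example: a 3000-frame video reduced to the suggested 128 frames.
indices = uniform_frame_indices(3000)
```

The resulting indices can then be passed to whichever video decoder you use to extract the frames before preprocessing them for the model.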
@inproceedings{wang2025prolongvid,
title={ProLongVid: A Simple but Strong Baseline for Long-context Video Instruction Tuning},
author={Wang, Rui and Li, Bohao and Dai, Xiyang and Yang, Jianwei and Chen, Yi-Ling and Xing, Zhen and Yang, Yifan and Chen, Dongdong and Qiu, Xipeng and Wu, Zuxuan and others},
booktitle={EMNLP},
year={2025}
}
Base model: Qwen/Qwen2.5-7B