The ViCaS dataset is designed to help AI models understand video content at both a broad (holistic) and detailed (pixel-level) scale. It is significantly larger than previous benchmarks like YouTube-VOS.
The dataset includes 20,416 annotated videos.
It contains over 65,500 labeled object tracks and more than 11,000 unique noun phrases describing those objects.
The videos average roughly 9.1 seconds in length.
Unlike datasets that label objects with single-word categories only, ViCaS provides holistic video-level captions and detailed segmentation masks for objects.
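For a rough sense of scale, the figures quoted above can be combined into two derived numbers: object tracks per video and total footage. The sketch below is only back-of-the-envelope arithmetic on those quoted figures. The function name `derived_stats` is illustrative, and because the text gives "over 65,500" tracks and "roughly 9.1 seconds", the results are estimates rather than official statistics.

```python
# Figures quoted in the text; the derived values are rough estimates only.
NUM_VIDEOS = 20_416     # annotated videos
NUM_TRACKS = 65_500     # "over 65,500" tracks, so treated as a lower bound
AVG_LENGTH_S = 9.1      # "roughly 9.1 seconds" per video


def derived_stats(num_videos, num_tracks, avg_length_s):
    """Return average tracks per video and total footage in hours."""
    tracks_per_video = num_tracks / num_videos
    total_hours = num_videos * avg_length_s / 3600
    return {"tracks_per_video": tracks_per_video, "total_hours": total_hours}


stats = derived_stats(NUM_VIDEOS, NUM_TRACKS, AVG_LENGTH_S)
print(f"{stats['tracks_per_video']:.1f} tracks per video (at least)")
print(f"{stats['total_hours']:.1f} hours of video (approx.)")
```

On these assumptions, each video carries a little over three annotated object tracks on average, and the whole collection amounts to roughly 50 hours of footage.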