RVOS is the first real-world video object segmentation dataset. Last updated on Mar. 28, 2018.
This dataset makes two main contributions:
First, there is a lack of real-world data for video object segmentation. DAVIS was captured with expensive video cameras, whereas we use mobile phones. Our photographers are volunteers with no professional training. RVOS fills this gap for real-world video object segmentation.
Second, in e-commerce, sellers need to showcase products using mobile phones. A seller captures a product video, which is then segmented in an e-commerce mobile application so that the background can be replaced or composited with a better one. We hope our dataset is useful for this purpose.
It consists of two hundred short videos focusing on common objects found in homes and offices.
The dataset is challenging because the objects in these videos vary widely in category, and the sequences contain occlusions, sudden motion, and other complexities. The videos are captured in portrait orientation using mobile phones. In each video, about 35 frames are selected uniformly and annotated. Every frame is rotated and resized to 854x480 (480p), the same resolution as the DAVIS dataset.
Raw videos can be downloaded via: Google Drive
Selected frames and annotations can be downloaded via: Google Drive
All sequences are owned by the authors of RVOS and are licensed under the Creative Commons Attribution 4.0 License.
README.md # readme of the dataset
JPEGImages # folders containing frames
Annotations # folders containing human annotations
train.txt # list of all the sequences for training
trainval.txt # list of all the sequences for training and validation
val.txt # list of all the sequences for validation
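Given the layout above, loading a split and pairing frames with their annotations might look like the following sketch. The exact file extensions (`.jpg` frames, `.png` masks) and the per-sequence subfolder naming are assumptions, not confirmed by the README; the mock directory built at the end exists only to demonstrate usage.

```python
import os
import tempfile

def load_split(root, split_file):
    """Return the sequence names listed in a split file (e.g. train.txt)."""
    with open(os.path.join(root, split_file)) as f:
        return [line.strip() for line in f if line.strip()]

def frame_annotation_pairs(root, seq):
    """Pair each JPEG frame with its annotation mask by shared filename stem.

    Assumes JPEGImages/<seq>/<stem>.jpg and Annotations/<seq>/<stem>.png.
    """
    img_dir = os.path.join(root, "JPEGImages", seq)
    ann_dir = os.path.join(root, "Annotations", seq)
    pairs = []
    for name in sorted(os.listdir(img_dir)):
        stem, _ = os.path.splitext(name)
        pairs.append((os.path.join(img_dir, name),
                      os.path.join(ann_dir, stem + ".png")))
    return pairs

# Build a tiny mock dataset layout to show the functions in use.
root = tempfile.mkdtemp()
os.makedirs(os.path.join(root, "JPEGImages", "mug"))
os.makedirs(os.path.join(root, "Annotations", "mug"))
open(os.path.join(root, "JPEGImages", "mug", "00000.jpg"), "w").close()
open(os.path.join(root, "Annotations", "mug", "00000.png"), "w").close()
with open(os.path.join(root, "train.txt"), "w") as f:
    f.write("mug\n")

seqs = load_split(root, "train.txt")
pairs = frame_annotation_pairs(root, seqs[0])
```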
To be added after the TMM paper review.