Video Multi-Person Human Parsing
We evaluate video instance-level human parsing with three metrics.
a. Mean IoU (%) for semantic part segmentation, as defined in the FCN paper.
b. Following the Mask R-CNN paper, we use the mean of the mean Average Precision (mAP) values computed at IoU thresholds from 0.5 to 0.95 to evaluate human instance segmentation, referred to as APr.
c. APrvol for instance-level human parsing, as defined in the paper "Holistic, Instance-Level Human Parsing".
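As a rough illustration of metrics (a) and (b), the sketch below computes a per-image mean IoU from two label maps and averages a per-threshold AP value over a set of IoU thresholds. It is not the official evaluation code. The function names, the choice to skip classes that are absent from both maps, and the 0.05 threshold step (borrowed from the COCO convention) are assumptions that the text above does not state.

```python
from collections import Counter


def mean_iou(pred, gt, num_classes):
    """Mean IoU over the classes present in the prediction or the ground truth.

    pred and gt are equally sized 2-D lists of integer class labels.
    Assumption: classes absent from both maps are left out of the mean.
    """
    inter = Counter()      # pixels where prediction and ground truth agree on class c
    pred_area = Counter()  # pixels predicted as class c
    gt_area = Counter()    # pixels labelled as class c in the ground truth
    for pred_row, gt_row in zip(pred, gt):
        for p, g in zip(pred_row, gt_row):
            pred_area[p] += 1
            gt_area[g] += 1
            if p == g:
                inter[p] += 1

    ious = []
    for c in range(num_classes):
        union = pred_area[c] + gt_area[c] - inter[c]
        if union > 0:
            ious.append(inter[c] / union)
    return sum(ious) / len(ious) if ious else 0.0


def average_over_thresholds(ap_at_threshold):
    """Average a per-threshold AP function over IoU thresholds 0.5 .. 0.95.

    Assumption: thresholds are spaced 0.05 apart, giving ten values.
    ap_at_threshold is a hypothetical callable returning the AP at one threshold.
    """
    thresholds = [round(0.5 + 0.05 * i, 2) for i in range(10)]
    return sum(ap_at_threshold(t) for t in thresholds) / len(thresholds)
```

A benchmark would normally accumulate the intersection and union counts over the whole test set before dividing; the per-image form here only keeps the example short.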
2. Submission format
The submission is a parent folder named vp_results.zip (a template file is available for download) containing 50 sub-folders. Each sub-folder holds the results for one test-set video. Each video folder contains 3 sub-folders:
- A folder of PNG images named "global_parsing". Each id.png holds the global (instance-agnostic) human parsing result for the corresponding image, at exactly the same size.
- Each id.png is the indexed instance segmentation image, at exactly the same size as the corresponding image. Each human instance is assigned a unique human index id. 0 is always the background label.
A text file id.txt. Each line is of the format
. The first line of this file corresponds to human instance index 1 in the indexed instance segmentation image, the second line to index 2, and so on.
- An indexed PNG image with the segmentation, in which each number identifies a unique part. 0 is always the background label.
- A text file. Each line is of the format < class_id score >. The first line of this file corresponds to index 1 in the indexed PNG, the second line to index 2, and so on.
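The line-to-index convention used by the text files can be sanity-checked before submitting. The sketch below parses a file in the < class_id score > format and reports image indices that have no matching line. It assumes each line holds an integer class id and a float score separated by whitespace, with no literal angle brackets. The function names are hypothetical and not part of any official toolkit.

```python
def parse_score_lines(text):
    """Map each index (the 1-based line number) to a (class_id, score) pair.

    Assumption: every line is 'class_id score', separated by whitespace.
    """
    entries = {}
    for index, line in enumerate(text.splitlines(), start=1):
        fields = line.split()
        if len(fields) != 2:
            raise ValueError(
                f"line {index}: expected 'class_id score', got {line!r}"
            )
        entries[index] = (int(fields[0]), float(fields[1]))
    return entries


def missing_indices(entries, png_indices):
    """Return the non-background indices used in the image but absent from the file.

    png_indices is any iterable of the integer values found in the indexed PNG;
    0 is skipped because it is the background label.
    """
    return sorted(i for i in set(png_indices) if i != 0 and i not in entries)
```

For example, parsing a two-line file and comparing it against an image that uses indices 0 to 3 reports index 3 as missing, which points to a text file that is one line short.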
3. Class Definition