[Topic] Holistic Video Understanding

Introduction

We address the problem of multiclass semantic video segmentation, aiming to integrate object reasoning, scene layout estimation with pixel labeling. To this end, we have developed an object-augmented structured model in spatio-temporal domain, which captures long-range dependency between superpixels, and imposes consistency between object and superpixel labels. We also designed an efficient inference algorithm to jointly infer the superpixel labels, object activations and their occlusion relations for a large number of object hypotheses.

3D Object Structure Recovery via Semi-supervised Learning on Videos Qian He, Desen Zhou, Xuming He British Machine Vision Conference (BMVC),2018

Learning Dynamic Hierarchical Models for Anytime Scene Labeling, Buyu Liu, Xuming He European Conference on Computer Vision (ECCV), 2016

Multiclass Semantic Video Segmentation with Object-Level Active Inference, Buyu Liu, Xuming He IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2015