An EM Framework for Online Incremental Learning of Semantic Segmentation

Model Overview

Abstract

Incremental learning of semantic segmentation has emerged as a promising strategy for visual scene interpretation in the open-world setting. However, it remains challenging to acquire novel classes in an online fashion for the segmentation task, mainly due to its continuously evolving semantic label space, partial pixelwise ground-truth annotations, and constrained data availability. To address this, we propose an incremental learning strategy that can quickly adapt deep segmentation models without catastrophic forgetting, using streaming input data with pixel annotations on the novel classes only. To this end, we develop a unified learning strategy based on the Expectation-Maximization (EM) framework, which integrates an iterative relabeling strategy that fills in the missing labels and a rehearsal-based incremental learning step that balances the stability and plasticity of the model. Moreover, our EM algorithm adopts an adaptive sampling method to select informative training data and a class-balancing training strategy in the incremental model updates, both of which improve the efficacy of model learning. We validate our approach on the PASCAL VOC 2012 and ADE20K datasets, and the results demonstrate its superior performance over existing incremental methods.
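To make the relabeling idea concrete, the following is a minimal sketch of how missing labels could be filled in from model predictions. It is not the paper's implementation: the function name `relabel`, the `IGNORE` marker value of 255, and the fixed confidence `threshold` are all assumptions chosen for illustration.

```python
import numpy as np

IGNORE = 255  # assumed marker for pixels that carry no annotation


def relabel(partial_labels, probs, threshold=0.7):
    """Fill unlabeled pixels with confident model predictions (hypothetical helper).

    partial_labels: (H, W) integer array; annotated novel-class pixels hold
                    their class id, all other pixels hold IGNORE.
    probs:          (C, H, W) float array of per-pixel class probabilities
                    produced by the current model.
    """
    pred = probs.argmax(axis=0)   # most likely class per pixel
    conf = probs.max(axis=0)      # probability of that class
    filled = partial_labels.copy()
    # Only touch pixels that are unlabeled AND confidently predicted;
    # ground-truth annotations are never overwritten.
    mask = (partial_labels == IGNORE) & (conf >= threshold)
    filled[mask] = pred[mask]
    return filled
```

In this sketch, pixels whose prediction falls below the threshold keep the `IGNORE` marker, so a training loss that skips ignored pixels would not be affected by uncertain pseudo-labels. Repeating the call after each model update gives a simple iterative variant.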

Publication
In The 29th ACM International Conference on Multimedia
Shipeng Yan
ByteDance

My research interests include few/low-shot learning, incremental learning and representation learning.

Jiale Zhou
MEGVII

My research interests include few-shot learning, incremental learning and reinforcement learning.

Jiangwei Xie
SenseTime

My research interests include automated machine learning, multi-task learning and lifelong learning.

Songyang Zhang
Shanghai AI Lab

My research interests include few/low-shot learning, graph neural networks and video understanding.

Xuming He
Associate Professor

My research interests include few/low-shot learning, graph neural networks and video understanding.
