Masked Cross-image Encoding for Few-shot Segmentation

Wenbo Xu, Huaxi Huang, Ming Cheng, Litao Yu, Qiang Wu, Jian Zhang

July, 2023

Abstract

Few-shot segmentation (FSS) aims to label the pixels of an image given only a few annotated images. The main challenge in FSS is determining the query pixel labels by referencing the class prototypes learned from the few labeled support exemplars. Previous methods focus on learning class-wise descriptors from the support images independently, which ignores detailed contextual information and the mutual relations between support and query features. To address this issue, we propose a joint learning method, Masked Cross-Image Encoding (MCE), which mines common visual properties that describe object details and learns bidirectional inter-image dependencies that enhance feature interaction. MCE is more than a visual representation enrichment module; it also considers cross-image mutual dependencies and contextual information. Thus, the labeled visual feature representations of the support images are better exploited to guide the prediction for a query image. Experiments on the public FSS benchmarks PASCAL-5ⁱ and COCO-20ⁱ demonstrate the state-of-the-art meta-learning performance of the proposed method.

Type

Conference paper

Publication

In IEEE International Conference on Multimedia and Expo

Huaxi Huang

Researcher

My research interests include multimedia data analysis, computer vision, and trustworthy machine learning.