TOAN: Target-Oriented Alignment Network for Fine-Grained Image Categorization with Few Labeled Samples

Abstract

In this paper, we study the fine-grained categorization problem under the few-shot setting, i.e., each fine-grained class contains only a few labeled examples, termed Fine-Grained Few-Shot classification (FGFS). The core predicament in FGFS is the high intra-class variance yet low inter-class variance in the dataset. In traditional fine-grained classification, the high intra-class variance can be somewhat relieved by supervised training on abundant labeled samples. However, with few labeled examples, it is hard for an FGFS model to learn a robust class representation under the significantly higher intra-class variance. Moreover, the inter- and intra-class variances are closely related: the significant intra-class variance in FGFS often aggravates the low inter-class variance issue. To address these challenges, we propose a Target-Oriented Alignment Network (TOAN) that tackles the FGFS problem from both the intra- and inter-class perspectives. To reduce the intra-class variance, we propose a target-oriented matching mechanism that reformulates the spatial features of each support image to match those of the query in the embedding space. To enhance the inter-class discrimination, we devise discriminative fine-grained features by integrating local compositional concept representations with global second-order pooling. We conducted extensive experiments on four public fine-grained categorization datasets, and the results show that the proposed TOAN achieves state-of-the-art performance.
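The two ideas named in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. It assumes spatial features are already extracted and flattened to shape (positions, channels), uses a plain cosine-similarity attention as a stand-in for the target-oriented matching mechanism, and uses a signed-square-root covariance descriptor as a stand-in for global second-order pooling. The local compositional concept representations are omitted, and all function names are hypothetical.

```python
import numpy as np


def softmax(x, axis=-1):
    # Numerically stable softmax.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def align_support_to_query(support, query):
    """Re-express support features at the query's spatial positions.

    support, query: arrays of shape (positions, channels).
    Assumption: cosine-similarity attention stands in for the
    target-oriented matching described in the abstract.
    """
    s = support / (np.linalg.norm(support, axis=1, keepdims=True) + 1e-8)
    q = query / (np.linalg.norm(query, axis=1, keepdims=True) + 1e-8)
    attn = softmax(q @ s.T, axis=1)  # (query positions, support positions)
    return attn @ support            # (query positions, channels)


def second_order_pool(feat):
    """Global second-order pooling: channel co-occurrence descriptor.

    feat: (positions, channels) -> unit-norm vector of length channels**2.
    Assumption: signed square root plus L2 normalization, a common choice
    for bilinear features, not necessarily the paper's exact variant.
    """
    gram = feat.T @ feat / feat.shape[0]          # (channels, channels)
    gram = np.sign(gram) * np.sqrt(np.abs(gram))  # signed square root
    vec = gram.reshape(-1)
    return vec / (np.linalg.norm(vec) + 1e-8)


# Toy data: a 5x5 feature map with 8 channels, flattened to 25 positions.
rng = np.random.default_rng(0)
support = rng.normal(size=(25, 8))
query = rng.normal(size=(25, 8))

aligned = align_support_to_query(support, query)
score = float(second_order_pool(aligned) @ second_order_pool(query))
```

In this sketch, `score` is a cosine similarity between the pooled descriptors of the aligned support image and the query, so a few-shot classifier could assign the query to the class whose support images score highest.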

Publication
In IEEE Transactions on Circuits and Systems for Video Technology
Huaxi Huang
Researcher

My research interests include multimedia data analysis, computer vision, and trustworthy machine learning.