FSDA: Frequency re-scaling in data augmentation for corruption-robust image classification

Citations

Web of Science: 10
Scopus: 10

Abstract

Modern convolutional neural networks (CNNs) are used in a range of applications, including computer vision, speech recognition, and robotics. However, practical deployment requires large-scale datasets, and real-world data contains corruptions that degrade model performance because the training and testing distributions are inconsistent. In this study, we propose Frequency re-Scaling Data Augmentation (FSDA) to improve the classification performance, robustness against corruption, and localizability of classifiers trained on various image classification datasets. Our method consists of two processes: a mask generation process (MGP) and a pattern re-scaling process (PSP). MGP clusters the frequency-domain spectra to group similar frequency patterns, and PSP then rescales the frequencies by learning rescaling parameters from those patterns. Because a CNN classifies images by focusing on the structural features that FSDA highlights, a CNN trained with the proposed method is more robust against corruption than one trained with other data augmentations (DAs). Our technique outperforms existing DAs on four public image classification datasets: CIFAR-10/100, STL-10, and ImageNet. In particular, our strategy increases the robustness of the classifier against different corruption errors by an average of 5.04% over the baseline.
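
The two-stage idea in the abstract (group frequency components, then rescale each group) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify the clustering algorithm or how the rescaling parameters are learned, so the sketch assumes a simple 1-D k-means on log-amplitude for the mask step and hand-set per-cluster scale factors in place of learned ones. The function names `cluster_frequency_masks` and `rescale_frequencies` and the parameters `n_clusters`, `n_iters`, and `scales` are hypothetical.

```python
import numpy as np


def cluster_frequency_masks(amplitude, n_clusters=3, n_iters=10):
    """Toy stand-in for mask generation: 1-D k-means on log-amplitude.

    Assumption: the paper's actual clustering features are not given in
    the abstract; log-amplitude is used here only for illustration.
    """
    values = np.log1p(amplitude).ravel()
    # Start centroids at evenly spaced quantiles so the result is deterministic.
    centroids = np.quantile(values, np.linspace(0.0, 1.0, n_clusters))
    for _ in range(n_iters):
        # Assign every frequency bin to its nearest centroid.
        labels = np.argmin(np.abs(values[:, None] - centroids[None, :]), axis=1)
        # Move each non-empty centroid to the mean of its members.
        for k in range(n_clusters):
            members = values[labels == k]
            if members.size > 0:
                centroids[k] = members.mean()
    return labels.reshape(amplitude.shape)


def rescale_frequencies(image, scales, n_iters=10):
    """Toy stand-in for pattern re-scaling on a 2-D grayscale image.

    Assumption: `scales` holds one fixed factor per cluster; in the paper
    these parameters are learned rather than set by hand.
    """
    scales = np.asarray(scales, dtype=float)
    spectrum = np.fft.fft2(image)
    labels = cluster_frequency_masks(np.abs(spectrum), len(scales), n_iters)
    # Look up the scale factor for each frequency bin from its cluster label.
    scale_map = scales[labels]
    # Scaling by a real factor changes the amplitude and leaves the phase intact.
    augmented = np.fft.ifft2(spectrum * scale_map)
    return np.real(augmented)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    image = rng.random((32, 32))
    augmented = rescale_frequencies(image, scales=[1.2, 1.0, 0.8])
    print(augmented.shape)
```

With all scale factors set to 1.0 the sketch returns the input image up to floating-point error, which is a convenient sanity check before experimenting with other values.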

Keywords

Deep learning; Image classification; Convolutional neural networks; Data augmentation; Frequency domain
Title
FSDA: Frequency re-scaling in data augmentation for corruption-robust image classification
Authors
Nam, Ju-Hyeon; Lee, Sang-Chul
DOI
10.1016/j.patcog.2024.110332
Issue Date
2024-06
Type
Article
Journal
Pattern Recognition, vol. 150