Random image frequency aggregation dropout in image classification for deep convolutional neural networks

Abstract

Modern deep neural networks are core technologies used in various fields of artificial intelligence. However, they require a sufficiently large dataset to leverage their high representation power and reach their full potential. As a result, data augmentation is an essential method for fully utilizing these networks' capabilities. This study proposes a novel data augmentation method called random image frequency aggregation dropout (RIFAD). RIFAD consists of two sub-algorithms: Fourier spectrum analysis (FSA) and frequency aggregation dropout (FAD). In FSA, an angular distribution is extracted by analyzing the Fourier spectrum of the input image. In FAD, an angle is randomly selected from this angular distribution and the corresponding frequency aggregation is deleted. Empirically, we demonstrated that our method significantly improves on prior state-of-the-art data augmentation methods across various convolutional neural network (CNN) architectures for image classification. With Wide ResNet-40-2, we achieved Top-1 errors of 5.14%, 24.65%, and 21.45% on CIFAR-10, CIFAR-100, and STL-10, respectively. We also demonstrated that the CNN with RIFAD had the best accuracy of 0.8961, precision of 0.9034, recall of 0.9046, F1-score of 0.8942, and area under the curve (AUC) of 0.9858 on a chest X-ray dataset.
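
The two steps named in the abstract can be illustrated with a short NumPy sketch. This is not the authors' implementation: the abstract does not specify how the angular distribution is computed or how a frequency aggregation is delimited, so the orientation histogram, the bin count `n_bins`, the choice to keep the DC component, and the function names `angular_distribution` and `frequency_angle_dropout` are all assumptions made for illustration on a single-channel image.

```python
import numpy as np


def angular_distribution(image, n_bins=36):
    """FSA-like step (assumed form): spectral magnitude histogram over orientation."""
    h, w = image.shape
    # Centered 2-D spectrum; fftshift places the DC term at (h // 2, w // 2).
    magnitude = np.abs(np.fft.fftshift(np.fft.fft2(image)))
    ys, xs = np.indices((h, w))
    dy, dx = ys - h // 2, xs - w // 2
    # Fold orientation into [0, pi): a real image has a symmetric spectrum.
    theta = np.mod(np.arctan2(dy, dx), np.pi)
    bins = np.minimum((theta / np.pi * n_bins).astype(int), n_bins - 1)
    magnitude[h // 2, w // 2] = 0.0  # ignore DC when building the histogram
    hist = np.bincount(bins.ravel(), weights=magnitude.ravel(), minlength=n_bins)
    total = hist.sum()
    probs = hist / total if total > 0 else np.full(n_bins, 1.0 / n_bins)
    return probs, bins


def frequency_angle_dropout(image, n_bins=36, rng=None):
    """FAD-like step (assumed form): sample an angle bin and zero its frequencies."""
    rng = np.random.default_rng() if rng is None else rng
    probs, bins = angular_distribution(image, n_bins)
    k = int(rng.choice(n_bins, p=probs))  # angle bin drawn from the distribution
    spectrum = np.fft.fftshift(np.fft.fft2(image))
    mask = bins != k                      # drop every frequency in the chosen bin
    h, w = image.shape
    mask[h // 2, w // 2] = True           # assumption: keep DC (mean brightness)
    # Taking the real part discards any small imaginary residue.
    augmented = np.fft.ifft2(np.fft.ifftshift(spectrum * mask)).real
    return augmented, k


# Usage: vertical stripes concentrate spectral energy at orientation bin 0.
x = np.arange(32)
stripes = np.tile(np.sin(2 * np.pi * 4 * x / 32), (32, 1))
augmented, chosen_bin = frequency_angle_dropout(stripes, rng=np.random.default_rng(0))
```

For a color image, one plausible extension is to apply the same sampled angle bin to each channel; whether RIFAD does this is not stated in the abstract.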

Keywords

Deep learning; Convolutional neural network; Image classification; Data augmentation; Frequency domain
Title
Random image frequency aggregation dropout in image classification for deep convolutional neural networks
Authors
Nam, Ju-Hyeon; Lee, Sang-Chul
DOI
10.1016/j.cviu.2023.103684
Publication date
2023-07
Type
Article
Journal
Computer Vision and Image Understanding
Volume 232