A Study on Class Imbalance Handling in the WM-811K Dataset

Original title: WM-811K 데이터셋의 클래스 불균형 방안에 관한 연구

Abstract

This paper proposes a data augmentation-based oversampling technique to mitigate severe class imbalance and improve the classification of wafer map defect patterns in semiconductor manufacturing. Using the publicly available WM-811K dataset, the training data, excluding the dominant None class, are augmented to match different proportions of the None class (0%, 25%, 50%, and 100%) in order to determine the most effective augmentation level. Three deep learning models (DenseNet121, EfficientNet-B2, and ResNet50) are employed for performance benchmarking. In addition to overall accuracy, two evaluation metrics that account for class imbalance, the Geometric Mean (GM) and the Weighted Geometric Mean (GM+), are used to assess model effectiveness. Experimental results show that ResNet50 yields the most stable performance in terms of GM+, and that 50% augmentation achieves results comparable to full (100%) augmentation, indicating that excessive augmentation may be unnecessary. These findings suggest that rotation-based oversampling tailored to the characteristics of wafer maps significantly enhances defect pattern classification and can support the development of machine-learning-based defect analysis systems in semiconductor manufacturing.
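
The oversampling scheme and the GM metric described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation, and several details are assumptions: wafer maps are taken to be square arrays of identical size, rotations are restricted to multiples of 90 degrees, every defect class is topped up to the same target count (ratio times the None-class count), and GM is computed with the common definition as the geometric mean of per-class recall. The function names `oversample_by_rotation` and `geometric_mean_recall` are hypothetical, and GM+ is omitted because its weighting is not defined in the abstract.

```python
import numpy as np


def oversample_by_rotation(maps, labels, none_label, ratio, seed=None):
    """Top up each defect class to ratio * (None-class count) with rotated copies.

    Assumptions (not taken from the paper): maps has shape (N, H, H) and
    rotations are random multiples of 90 degrees.
    """
    rng = np.random.default_rng(seed)
    target = int(round(ratio * np.sum(labels == none_label)))
    out_maps, out_labels = [maps], [labels]
    for cls in np.unique(labels):
        if cls == none_label:
            continue  # the dominant None class is never augmented
        idx = np.flatnonzero(labels == cls)
        deficit = target - idx.size
        if deficit <= 0:
            continue  # class already at or above the target (e.g. ratio = 0)
        picks = rng.choice(idx, size=deficit, replace=True)
        turns = rng.integers(1, 4, size=deficit)  # 1, 2 or 3 quarter turns
        rotated = [np.rot90(maps[i], int(k)) for i, k in zip(picks, turns)]
        out_maps.append(np.stack(rotated))
        out_labels.append(np.full(deficit, cls, dtype=labels.dtype))
    return np.concatenate(out_maps), np.concatenate(out_labels)


def geometric_mean_recall(y_true, y_pred):
    """Geometric mean of per-class recall (a common definition of GM)."""
    classes = np.unique(y_true)
    recalls = np.array([np.mean(y_pred[y_true == c] == c) for c in classes])
    return float(np.prod(recalls) ** (1.0 / classes.size))


# Toy data: 8 None maps (label 0), 2 maps of class 1, 1 map of class 2.
toy_rng = np.random.default_rng(0)
maps = toy_rng.integers(0, 3, size=(11, 4, 4))
labels = np.array([0] * 8 + [1] * 2 + [2])

# 50% augmentation: each defect class is raised to half the None-class count.
aug_maps, aug_labels = oversample_by_rotation(maps, labels, none_label=0, ratio=0.5, seed=0)

# GM on a toy prediction where one class-1 sample is misclassified as None.
gm = geometric_mean_recall(np.array([0, 0, 1, 1]), np.array([0, 0, 1, 0]))
```

Because GM is a product of per-class recalls, a single class with zero recall drives the score to zero, which is why it is more sensitive to minority-class failures than overall accuracy.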

Keywords

Wafer Map, Oversampling, Data Augmentation, Augmentation Ratio, Deep Learning, Semiconductor Defect, Imbalanced Dataset
Authors
박상현, 김지성, 남춘성
Publication date
2025-10
Type
Y
Journal
멀티미디어학회논문지 (Journal of Korea Multimedia Society), Vol. 28, No. 10
Pages
1576 ~ 1585