A divide-oversampling and conquer algorithm based support vector machine for massive and highly imbalanced data
- Bang, Sungwan;
- Kim, Jaeoh
Abstract
The support vector machine (SVM) has been applied successfully to many classification problems and achieves high classification accuracy. However, its heavy computational cost makes it infeasible for analyzing massive data. Furthermore, when the data are imbalanced, with classes of different sizes, the classification accuracy of the SVM on the minority class may drop significantly because the classifier can be biased toward the majority class. To overcome these problems, we propose the DOC-SVM method, which uses a divide-oversampling and conquer technique. The proposed DOC-SVM divides the majority class into a few subsets and applies an oversampling technique to the minority class to produce balanced subsets. The DOC-SVM then obtains the final classifier by aggregating all the SVM classifiers fitted on the balanced subsets. Simulation studies demonstrate the satisfactory performance of the proposed method.
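The divide, oversample, and aggregate steps described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation, and several choices are assumptions that the abstract does not specify: random oversampling with replacement, aggregation by averaging decision values, and a small linear SVM fitted by plain stochastic gradient descent on the regularized hinge loss as a stand-in for a full SVM solver. The function names (`train_linear_svm`, `doc_svm_fit`, `doc_svm_predict`) and all parameter values are hypothetical.

```python
import random


def decision(w, b, x):
    """Linear decision value w.x + b."""
    return sum(wj * xj for wj, xj in zip(w, x)) + b


def train_linear_svm(X, y, lam=0.01, eta=0.01, epochs=30, seed=0):
    """Stand-in SVM (assumption): SGD on the L2-regularized hinge loss.

    Labels must be +1 / -1. Returns the weight vector and the bias.
    """
    rng = random.Random(seed)
    w, b = [0.0] * len(X[0]), 0.0
    order = list(range(len(X)))
    for _ in range(epochs):
        rng.shuffle(order)
        for i in order:
            violated = y[i] * decision(w, b, X[i]) < 1.0
            # Shrink the weights (gradient of the L2 penalty).
            w = [(1.0 - eta * lam) * wj for wj in w]
            if violated:
                # Hinge-loss subgradient step for a margin violation.
                w = [wj + eta * y[i] * xj for wj, xj in zip(w, X[i])]
                b += eta * y[i]
    return w, b


def doc_svm_fit(X_major, X_minor, n_subsets, seed=0):
    """Divide the majority class, oversample the minority class, fit one SVM per subset.

    Majority points are labeled -1 and minority points +1 (a convention of this sketch).
    """
    rng = random.Random(seed)
    major = list(X_major)
    rng.shuffle(major)
    # Divide: split the shuffled majority class into n_subsets disjoint parts.
    subsets = [major[k::n_subsets] for k in range(n_subsets)]
    models = []
    for k, subset in enumerate(subsets):
        # Oversample (assumption): keep every minority point and add random
        # duplicates until the minority side matches the subset size.
        n_extra = max(0, len(subset) - len(X_minor))
        minority = list(X_minor) + [rng.choice(X_minor) for _ in range(n_extra)]
        X = subset + minority
        y = [-1] * len(subset) + [1] * len(minority)
        models.append(train_linear_svm(X, y, seed=seed + k))
    return models


def doc_svm_predict(models, x):
    """Conquer (assumption): average the decision values of all sub-classifiers."""
    score = sum(decision(w, b, x) for w, b in models) / len(models)
    return 1 if score >= 0 else -1


# Toy imbalanced data: 80 majority points and 8 minority points in the plane.
rng = random.Random(1)
X_major = [[rng.uniform(-3, -1), rng.uniform(-3, -1)] for _ in range(80)]
X_minor = [[rng.uniform(1, 3), rng.uniform(1, 3)] for _ in range(8)]
models = doc_svm_fit(X_major, X_minor, n_subsets=4)
print(doc_svm_predict(models, [2.0, 2.0]), doc_svm_predict(models, [-2.0, -2.0]))
```

Each sub-classifier sees only one part of the majority class, so the cost of every fit depends on the subset size rather than on the full sample, while the oversampled minority class keeps each training set balanced. Majority voting over the predicted labels would be an equally plausible aggregation rule under the same assumptions.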
- Title: A divide-oversampling and conquer algorithm based support vector machine for massive and highly imbalanced data
- Authors: Bang, Sungwan; Kim, Jaeoh
- Publication date: 2022-04
- Type: Article
- Journal: 응용통계연구 (The Korean Journal of Applied Statistics)
- Volume: 35
- Issue: 2
- Pages: 177-188