VIZECGNET: VISUAL ECG IMAGE NETWORK FOR CARDIOVASCULAR DISEASES CLASSIFICATION WITH MULTI-MODAL TRAINING AND KNOWLEDGE DISTILLATION

Citations

Web of Science: 1
Scopus: 2

Abstract

An electrocardiogram (ECG) captures the heart's electrical signal to assess various heart conditions. In practice, ECG data is stored as either digitized signals or printed images. Despite the emergence of numerous deep learning models for digitized signals, many hospitals prefer image storage due to cost considerations. Recognizing the unavailability of raw ECG signals in many clinical settings, we propose VizECGNet, which uses only printed ECG graphics to determine the prognosis of multiple cardiovascular diseases. During training, cross-modal attention modules (CMAM) integrate information from the two modalities, image and signal, while self-modality attention modules (SMAM) capture the inherent long-range dependencies in the ECG data of each modality. Additionally, we utilize knowledge distillation to improve the similarity between the two distinct predictions from the modality streams. This multi-modal deep learning architecture requires only ECG images during inference. VizECGNet with image input achieves higher precision, recall, and F1-score than signal-based ECG classification models, with improvements of 3.50%, 8.21%, and 7.38%, respectively.
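
The two training-time ideas named in the abstract, cross-modal attention and distillation between the two prediction streams, can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the abstract does not specify the module internals, so the code assumes plain scaled dot-product attention without learned projections and a generic temperature-softened KL divergence as the distillation term. The function names, token shapes, and the temperature value are all hypothetical.

```python
import numpy as np


def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def cross_modal_attention(queries, context):
    """Tokens of one modality (queries) attend to tokens of the other (context).

    Assumption: plain scaled dot-product attention with no learned
    query/key/value projections; a real CMAM would almost certainly add them.
    queries: (n_q, d), context: (n_c, d) -> output: (n_q, d)
    """
    d = queries.shape[-1]
    scores = queries @ context.T / np.sqrt(d)   # (n_q, n_c) similarity scores
    weights = softmax(scores, axis=-1)          # each row sums to 1
    return weights @ context                    # weighted mix of context tokens


def distillation_loss(logits_a, logits_b, temperature=2.0):
    """KL(p_b || p_a) between temperature-softened class distributions.

    Assumption: a generic knowledge-distillation term; the loss used in the
    paper may differ (for example, a multi-label formulation).
    """
    p_a = softmax(logits_a / temperature)
    p_b = softmax(logits_b / temperature)
    return float(np.sum(p_b * (np.log(p_b) - np.log(p_a))))


# Hypothetical token sequences: 6 image tokens and 10 signal tokens, width 8.
rng = np.random.default_rng(0)
image_tokens = rng.normal(size=(6, 8))
signal_tokens = rng.normal(size=(10, 8))

# Image stream enriched with signal information (used during training only).
fused_image = cross_modal_attention(image_tokens, signal_tokens)

# Distillation term pulling the two streams' predictions together.
loss = distillation_loss(np.array([1.0, 2.0, 3.0]), np.array([3.0, 2.0, 1.0]))
```

In this sketch the distillation term is zero when both streams produce identical logits and positive otherwise, which is the property that lets the image stream be trained toward the signal stream's predictions and then used alone at inference.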

Keywords

Deep Learning; Signal Processing; Multi-Modality Learning; ECG Classification
Title
VIZECGNET: VISUAL ECG IMAGE NETWORK FOR CARDIOVASCULAR DISEASES CLASSIFICATION WITH MULTI-MODAL TRAINING AND KNOWLEDGE DISTILLATION
Authors
Nam, Ju-Hyeon; Park, Seo-Hyung; Kim, Su Jung; Lee, Sang-Chul
DOI
10.1109/ICIP51287.2024.10647478
Publication date
2024
Type
Proceedings Paper
Journal
Proceedings - International Conference on Image Processing, ICIP
Pages
3219-3223