DualDistill: Self and cross-modal knowledge distillation for multimodal emotion recognition in conversation

Citations

Scopus: 0

Abstract

Emotion Recognition in Conversation (ERC) has become increasingly critical in diverse applications, including health care and virtual assistants. This paper proposes DualDistill, a multimodal ERC model. The proposed model employs a self-distillation strategy based on the Exponential Moving Average (EMA) to incorporate soft-label signals and enhance text representations, providing rich emotional cues. In addition, cross-modal Knowledge Distillation (KD) is applied to transfer contextual emotional cues from text to non-verbal modalities (audio and visual), alleviating modality imbalance and improving multimodal fusion. The DualDistill method achieves state-of-the-art performance on IEMOCAP and MELD benchmarks, demonstrating robustness and strong generalizability. Copyright © 2026. Published by Elsevier B.V.
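
The two distillation ideas named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the record gives no loss formulas or hyperparameters, so the function names (`ema_update`, `kd_loss`), the decay of 0.99, the temperature of 2.0, and the use of a temperature-scaled KL divergence are all assumptions chosen because they are common in knowledge distillation. Here the text branch plays the teacher and a non-verbal branch (for example audio) plays the student.

```python
import numpy as np

def softmax(logits, temperature=1.0):
    """Numerically stable softmax over the last axis."""
    z = np.asarray(logits, dtype=float) / temperature
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def ema_update(teacher, student, decay=0.99):
    """EMA self-distillation step (assumed form):
    teacher <- decay * teacher + (1 - decay) * student, per parameter."""
    return {name: decay * teacher[name] + (1.0 - decay) * student[name]
            for name in teacher}

def kd_loss(student_logits, teacher_logits, temperature=2.0):
    """Batch-mean KL(teacher || student) on temperature-softened
    distributions, scaled by temperature**2 (a common KD convention,
    assumed here rather than taken from the paper)."""
    p = softmax(teacher_logits, temperature)  # soft labels from the teacher
    q = softmax(student_logits, temperature)  # student predictions
    eps = 1e-12                               # guards against log(0)
    kl = np.sum(p * (np.log(p + eps) - np.log(q + eps)), axis=-1)
    return float(kl.mean() * temperature ** 2)

# Hypothetical usage: 4 utterances, 6 emotion classes, random logits.
rng = np.random.default_rng(0)
text_logits = rng.normal(size=(4, 6))    # teacher (text) branch
audio_logits = rng.normal(size=(4, 6))   # student (audio) branch
cross_modal_loss = kd_loss(audio_logits, text_logits)
```

In this sketch the loss is zero when the student reproduces the teacher's distribution and positive otherwise, which is the signal that would pull the audio or visual branch toward the text branch's soft labels.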

Keywords

Cross-modal knowledge distillation; Gated fusion; Multimodal emotion recognition in conversation; Self-distillation
Title
DualDistill: Self and cross-modal knowledge distillation for multimodal emotion recognition in conversation
Authors
Kim, DeogHwa; Kim, Deok-Hwan
DOI
10.1016/j.icte.2026.04.004
Publication date
2026
Type
Article in press
Journal
ICT Express