Towards AI Standardization: A Survey on Multimodal Emotion Recognition Datasets

Abstract

Multimodal Emotion Recognition (MER) is a rapidly expanding field in artificial intelligence (AI), enabling machines to interpret human affective states by integrating speech, facial expressions, text, and physiological signals. It underpins applications in human-computer interaction, affective computing, healthcare, education, driver state detection, and socially assistive robotics. By combining complementary modalities, MER achieves greater robustness than unimodal approaches. Progress in this area relies heavily on benchmark datasets such as IEMOCAP, MELD, CMU-MOSEI, DEAP, and MAHNOB-HCI, which provide diverse modalities, annotation schemes, and recording conditions. However, heterogeneity across these resources (categorical labels in IEMOCAP and MELD, dimensional ratings in DEAP and MAHNOB-HCI, and hybrid annotations in CMU-MOSEI) creates difficulties for cross-dataset benchmarking, reproducibility, and unified evaluation. Additional issues include differences in recording environments, data quality, sample imbalance, and limited accessibility, as well as ethical concerns surrounding sensitive multimodal data. This survey analyzes representative MER datasets with emphasis on annotation schemes, key challenges, and implications for reproducibility and trustworthy AI. We argue that standardization is essential and propose several directions for future research: establishing unified dimensional frameworks, developing automated and semi-automated annotation methods to reduce subjectivity, constructing large-scale and culturally diverse corpora to enhance generalizability, and implementing cross-dataset mapping protocols for fair benchmarking. Ensuring privacy and ethical safeguards remains central. Together, these efforts will support the creation of standardized and reliable benchmarks, foster reproducible research, and enable the deployment of safe, ethical, and socially responsible emotion-aware AI systems.
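
As a rough illustration of what a cross-dataset mapping protocol could look like, the following Python sketch projects categorical labels and dimensional ratings onto a shared set of coarse valence/arousal quadrants. It is not the method proposed in the survey. The label-to-quadrant table, the function names, and the rating midpoint of 5.0 (which assumes a 1 to 9 rating scale) are all hypothetical choices made for this example.

```python
from typing import Optional

# Hypothetical mapping from categorical emotion labels to coarse
# valence/arousal quadrants. These assignments are assumptions made
# for illustration only; they are not taken from the survey or from
# any dataset's official documentation.
LABEL_TO_QUADRANT = {
    "joy": "high_valence_high_arousal",
    "happy": "high_valence_high_arousal",
    "anger": "low_valence_high_arousal",
    "angry": "low_valence_high_arousal",
    "sadness": "low_valence_low_arousal",
    "sad": "low_valence_low_arousal",
}


def quadrant_from_label(label: str) -> Optional[str]:
    """Map a categorical label to a quadrant, or None if it is unmapped."""
    return LABEL_TO_QUADRANT.get(label.strip().lower())


def quadrant_from_ratings(valence: float, arousal: float,
                          midpoint: float = 5.0) -> str:
    """Map dimensional ratings to a quadrant by thresholding at `midpoint`.

    The default midpoint assumes ratings on a 1 to 9 scale.
    """
    v = "high" if valence >= midpoint else "low"
    a = "high" if arousal >= midpoint else "low"
    return f"{v}_valence_{a}_arousal"


if __name__ == "__main__":
    print(quadrant_from_label("Angry"))
    print(quadrant_from_ratings(valence=7.0, arousal=8.0))
```

Labels without an obvious quadrant (for example, a neutral class) return `None` here rather than being forced into a category, which keeps the information loss of any such mapping visible to the evaluator.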

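Title
Towards AI Standardization: A Survey on Multimodal Emotion Recognition Datasets
Author
KIM DEOKHWAN
Conference
2025 Fall Conference of the Society for Standards, Certification and Safety (표준인증안전학회)
Venue
Ara Convention Hall, Jeju National University
Conference dates
2025-10-30 ~ 2025-11-01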