Enhancing Speech Emotion Recognition with Hybrid Graph Neural Networks: A GCN-GAT Framework

Abstract

This paper proposes a speech emotion recognition method that models speech signals as circular or linear graphs, which allows node features to be extracted and the relationships between nodes to be analyzed effectively. The method combines Graph Convolutional Network (GCN) and Graph Attention Network (GAT) layers to leverage the strengths of each in processing graph data. Specifically, the GCN layers capture local relationships by aggregating information from neighboring nodes, while the GAT layers better capture complex global relationships by assigning attention weights to neighboring nodes. Experiments on the IEMOCAP dataset validate the approach, showing performance comparable to state-of-the-art models on emotion recognition tasks. The results offer new insights and methods for further work in speech signal processing.
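To make the pipeline in the abstract concrete, the following is a minimal NumPy sketch, not the authors' implementation. It assumes one graph node per speech frame, a GCN layer followed by a single-head GAT layer, mean pooling over nodes, and four emotion classes; the abstract specifies none of these details. The function names (`frame_graph`, `gcn_layer`, `gat_layer`), the layer sizes, and the random weights are all hypothetical and chosen only for illustration.

```python
import numpy as np


def frame_graph(num_nodes, cyclic=True):
    """Adjacency matrix linking each frame node to its temporal neighbours.

    cyclic=True closes the chain into a circular graph; False gives a linear graph.
    """
    adj = np.zeros((num_nodes, num_nodes))
    idx = np.arange(num_nodes - 1)
    adj[idx, idx + 1] = 1.0
    adj[idx + 1, idx] = 1.0
    if cyclic and num_nodes > 2:
        adj[0, -1] = adj[-1, 0] = 1.0
    return adj


def gcn_layer(adj, x, weight):
    """Standard GCN propagation: ReLU(D^-1/2 (A + I) D^-1/2 X W)."""
    a_hat = adj + np.eye(adj.shape[0])            # add self-loops
    d_inv_sqrt = 1.0 / np.sqrt(a_hat.sum(axis=1))
    a_norm = a_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]
    return np.maximum(a_norm @ x @ weight, 0.0)


def gat_layer(adj, x, weight, attn):
    """Single-head GAT: attention-weighted sum over each node's neighbours."""
    h = x @ weight
    f = h.shape[1]
    # e_ij = LeakyReLU(a^T [h_i || h_j]), split into a source and a target term
    scores = (h @ attn[:f])[:, None] + (h @ attn[f:])[None, :]
    scores = np.where(scores > 0, scores, 0.2 * scores)
    # keep only real edges (plus self-loops) before the softmax
    mask = (adj + np.eye(adj.shape[0])) > 0
    scores = np.where(mask, scores, -np.inf)
    scores = scores - scores.max(axis=1, keepdims=True)
    alpha = np.exp(scores)
    alpha = alpha / alpha.sum(axis=1, keepdims=True)
    return alpha @ h, alpha


# Assumed sizes: 8 frames, 16-dim frame features, 12 hidden units, 4 emotions.
rng = np.random.default_rng(0)
num_frames, feat_dim, hidden_dim, num_classes = 8, 16, 12, 4

x = rng.normal(size=(num_frames, feat_dim))       # stand-in for acoustic features
adj = frame_graph(num_frames, cyclic=True)

h1 = gcn_layer(adj, x, 0.1 * rng.normal(size=(feat_dim, hidden_dim)))
h2, alpha = gat_layer(
    adj,
    h1,
    0.1 * rng.normal(size=(hidden_dim, hidden_dim)),
    0.1 * rng.normal(size=2 * hidden_dim),
)

# Mean-pool node embeddings into one utterance vector, then score each class.
logits = h2.mean(axis=0) @ rng.normal(size=(hidden_dim, num_classes))
print(logits.shape, alpha.sum(axis=1))
```

In this sketch the GCN step mixes neighbours with fixed, degree-based weights, whereas the GAT step computes a separate weight `alpha[i, j]` for every edge from the node features themselves. Passing `cyclic=False` to `frame_graph` switches from the circular to the linear graph without changing anything else.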

Keywords

Speech signal modeling; Graph Convolutional Networks (GCN); Graph Attention Networks (GAT); Emotion Recognition; Node Feature Extraction; Relationship Analysis
Authors
왕함, 김덕화, 김덕환
DOI
10.23019/kingpc.20.4.202408.001
Publication date
2024-08
Type
Y
Journal
한국차세대컴퓨팅학회 논문지
Vol. 20, No. 4
Pages
7 ~ 20