Multimodal Attention Network for Continuous-Time Emotion Recognition Using Video and EEG Signals

Choi, Dong Yoon; Kim, Deok-Hwan; Song, Byung Cheol

doi:10.1109/ACCESS.2020.3036877

Citations (Web of Science): 24

Citations (Scopus): 42

Abstract

Emotion recognition is a key technique for interaction between humans and artificial intelligence systems. For effective emotion recognition in the continuous-time domain, this article presents a multimodal fusion network that integrates a video modality network and an electroencephalogram (EEG) modality network. To calculate the attention weights of the facial video features and the corresponding EEG features during fusion, a multimodal attention network that uses bilinear pooling based on low-rank decomposition is proposed. Continuous-domain valence values are then computed from the two modality network outputs and the attention weights. Experimental results show that the proposed fusion network improves performance by about 6.9% over the video modality network on the MAHNOB human-computer interface (MAHNOB-HCI) dataset. A performance improvement was also achieved on our proprietary dataset.
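The fusion step described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' architecture: the abstract gives no layer sizes or activation functions, so the feature dimensions, the rank of 16, the tanh nonlinearity, the softmax over two modalities, and the names low_rank_bilinear_attention, fuse_valence, U, V, and P are all assumptions chosen for illustration. The sketch shows only the general idea of low-rank bilinear pooling (two projections followed by an element-wise product) producing attention weights that blend two valence predictions.

```python
import numpy as np


def low_rank_bilinear_attention(x_video, x_eeg, U, V, P):
    """Return two attention weights (video, EEG) that sum to 1.

    Hypothetical helper: U, V, and P stand in for learned parameters.
    """
    # Project each modality into a shared rank-d space (low-rank factors).
    h_video = np.tanh(U.T @ x_video)
    h_eeg = np.tanh(V.T @ x_eeg)
    # The element-wise product replaces a full bilinear weight tensor.
    joint = h_video * h_eeg
    # Map the joint vector to one logit per modality.
    logits = P.T @ joint
    logits = logits - logits.max()  # subtract the max for numerical stability
    weights = np.exp(logits)
    return weights / weights.sum()


def fuse_valence(valence_video, valence_eeg, weights):
    """Blend the two modality outputs with the attention weights."""
    return weights[0] * valence_video + weights[1] * valence_eeg


# Random stand-ins for one time step; all sizes are assumptions.
rng = np.random.default_rng(0)
x_video = rng.standard_normal(128)        # facial video feature vector
x_eeg = rng.standard_normal(32)           # EEG feature vector
U = rng.standard_normal((128, 16)) * 0.1  # video projection, rank 16
V = rng.standard_normal((32, 16)) * 0.1   # EEG projection, rank 16
P = rng.standard_normal((16, 2)) * 0.1    # joint space to two logits

weights = low_rank_bilinear_attention(x_video, x_eeg, U, V, P)
fused = fuse_valence(0.4, -0.2, weights)  # example per-modality valence outputs
print(weights, fused)
```

Because the weights are non-negative and sum to 1, the fused valence always lies between the two per-modality predictions. In a trained network the projections would be learned, and the weights would be recomputed at every time step.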

Keywords

Emotion recognition; video; EEG; multimodality; multimodal fusion; attention; facial expression recognition

Title: Multimodal Attention Network for Continuous-Time Emotion Recognition Using Video and EEG Signals

Authors: Choi, Dong Yoon; Kim, Deok-Hwan; Song, Byung Cheol

DOI: 10.1109/ACCESS.2020.3036877

Published: 2020

Type: Article

Journal: IEEE Access

Volume: 8

Pages: 203814-203826