Multi-attention multimodal sentiment analysis

Citations (Scopus)

36

Abstract

Sentiment analysis plays an important role in natural-language processing. It has been performed on multimodal data including text, audio, and video, but previous research has not made full use of such heterogeneous data. In this study, we propose a Multi-Attention Recurrent Neural Network (MA-RNN) for sentiment analysis on multimodal data. The proposed network consists of two attention layers and a Bidirectional Gated Recurrent Neural Network (BiGRU). The first attention layer performs data fusion and dimensionality reduction, and the second attention layer augments the BiGRU so that it captures the key parts of the contextual information among utterances. Experiments on multimodal sentiment analysis indicate that the proposed model achieves state-of-the-art performance, with 84.31% accuracy on the Multimodal Corpus of Sentiment Intensity and Subjectivity Analysis (CMU-MOSI) dataset. Furthermore, an ablation study evaluates the contributions of the different components of the network. We believe that the findings of this study may also offer helpful insights into the design of models using multimodal data. © 2020 ACM.
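The abstract does not give the equations of the first attention layer, so the following is only a minimal sketch of one common way to fuse modalities with attention, not the authors' implementation. It assumes each modality (text, audio, video) has already been projected to a vector of the same length, and it scores each modality against a query vector that is fixed here but would be learned in a trained model. The names `attention_fuse` and `query` and all numeric values are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_fuse(features, query):
    """Fuse equal-length modality vectors into a single vector.

    features: list of vectors, one per modality (assumed same length).
    query:    vector of that length (assumption: learned in a real model).
    Returns the fused vector and the attention weight per modality.
    """
    # Dot-product score of each modality against the query.
    scores = [sum(f * q for f, q in zip(vec, query)) for vec in features]
    weights = softmax(scores)
    # Weighted sum: three vectors of length d collapse into one of length d,
    # which is where the dimensionality reduction comes from in this sketch.
    dim = len(query)
    fused = [sum(w * vec[j] for w, vec in zip(weights, features))
             for j in range(dim)]
    return fused, weights


# Toy inputs (made up for illustration only).
text = [1.0, 0.0, 2.0]
audio = [0.5, 0.5, 0.5]
video = [0.0, 1.0, 0.0]
query = [1.0, 0.0, 1.0]

fused, weights = attention_fuse([text, audio, video], query)
```

The same weighted-sum pattern could in principle be applied a second time over the per-utterance outputs of a recurrent layer, which is the role the abstract assigns to the second attention layer; how MA-RNN actually parameterizes either layer is not stated here.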

Keywords

Deep learning; Multimodal machine learning; Sentimental analysis on multimedia
Title
Multi-attention multimodal sentiment analysis
Authors
Kim, Taeyong; Lee, Bowon
DOI
10.1145/3372278.3390698
Publication date
2020
Type
Conference paper
Journal
ICMR 2020 - Proceedings of the 2020 International Conference on Multimedia Retrieval
Pages
436-441