RBL: A Multimodal Emotion Analysis Model Based on Language Guided Transformer

Han Qiao, Yuan Sun, Chenxuan Li, Xinya Li

Abstract


Emotion analysis is valuable in economic and social applications such as public opinion analysis, market research, and risk prediction. The rapid growth of short-video media has extended the subject matter of emotion analysis from text to multimedia content. Multimodal emotion analysis mines emotional information from the fusion of text, voice, and images. Research on emotion analysis for multimodal data has made some progress, but current methods still fall short because of the sheer volume of multimodal information and the contextual emotional correlations that exist both between and within modalities. To address this, this study proposes RBL, a multimodal sentiment analysis model based on LGT (Linguistics-Guided Transformer), which uses BERT and ResNet to extract text and image features, respectively, and the Librosa toolkit to extract voice features.
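
The abstract does not describe how LGT combines the three feature streams. As a minimal sketch only, the Python below assumes one common reading of "language-guided" fusion: each text feature vector acts as a query in scaled dot-product attention over the image or voice features. The function name `language_guided_attention` and the toy feature values are hypothetical; in practice the inputs would be projected BERT, ResNet, and Librosa features of a shared dimension.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def language_guided_attention(text_feats, other_feats):
    """Assumed fusion step: text vectors are queries, the other modality
    supplies keys and values (scaled dot-product attention).

    Returns one fused vector per text vector and the attention weights.
    """
    dim = len(text_feats[0])
    fused, weights = [], []
    for q in text_feats:
        scores = [
            sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(dim)
            for k in other_feats
        ]
        w = softmax(scores)
        fused.append(
            [sum(wj * v[i] for wj, v in zip(w, other_feats)) for i in range(dim)]
        )
        weights.append(w)
    return fused, weights


# Toy stand-ins for already-projected features (hypothetical values,
# not real BERT / ResNet / Librosa outputs).
text = [[1.0, 0.0], [0.0, 1.0]]
image = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
audio = [[0.2, 0.8], [0.9, 0.1]]

text_image, w_img = language_guided_attention(text, image)
text_audio, w_aud = language_guided_attention(text, audio)

# Concatenate the text vector with its image-guided and voice-guided views.
joint = [t + ti + ta for t, ti, ta in zip(text, text_image, text_audio)]
```

Here `joint` holds one concatenated vector per text position, which a downstream classifier could consume. Whether RBL uses this attention direction or concatenation is not stated in the abstract.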

Keywords


Multimodal; Deep learning; Affective analysis

DOI: http://dx.doi.org/10.18686/aitr.v2i2.4021
