Application of Multimodal Emotion Recognition Technology in Recommendation Systems
DOI: https://doi.org/10.54097/ek3g1e85
Keywords: Multimodal emotion recognition technology, recommender system, large language model.
Abstract
Traditional recommender systems rely mainly on users' historical behavioral data to predict preferences and often overlook the role of real-time emotion in decision-making; as a result, they fail to meet users' emotional needs. This study addresses that gap by integrating multimodal emotion recognition technology with recommender systems. It establishes the necessity of such integration by analyzing the shift in recommender systems from correlation mining to causal understanding and the value of multimodal emotion recognition. The research systematically analyzes five cutting-edge technologies: empathetic recommendation via large language models, robustness enhancement through causal inference, generative recommendation using diffusion models, emotion alignment via cross-modal contrastive learning, and privacy-preserving recommendation based on federated learning. These technologies effectively tackle issues such as emotional understanding, data bias, sparsity, modal alignment, and privacy protection. The research then explores core challenges, including efficiency, interpretability, and short-term homogenization, along with solutions such as knowledge distillation and neuro-symbolic methods. The study concludes that these technologies are converging, with future directions focusing on causal explanation, efficiency improvement, and long-term user well-being, driving recommender systems toward greater intelligence, robustness, reliability, and human-centricity.
License
Copyright (c) 2025 Highlights in Science, Engineering and Technology

This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.