Application of Multimodal Emotion Recognition Technology in Recommendation Systems

Authors

  • Wenhao Deng

DOI:

https://doi.org/10.54097/ek3g1e85

Keywords:

Multimodal emotion recognition technology, recommender system, large language model.

Abstract

Traditional recommender systems, which rely mainly on users' historical behavioral data to predict preferences, often overlook the role of real-time emotions in decision-making and consequently fail to meet users' emotional needs. This study focuses on integrating multimodal emotion recognition technology with recommender systems to address this issue. It establishes the necessity of such integration by analyzing the shift in recommender systems from correlation mining to causal understanding and the value of multimodal emotion recognition. The research systematically analyzes five cutting-edge technologies: empathetic recommendation via large language models, robustness enhancement through causal inference, generative recommendation using diffusion models, emotion alignment via cross-modal contrastive learning, and privacy-preserving recommendation based on federated learning. These technologies effectively tackle issues such as emotional understanding, data bias, sparsity, modal alignment, and privacy protection. The research then explores core challenges, including efficiency, interpretability, and short-term homogenization, along with solutions such as knowledge distillation and neuro-symbolic methods. The study concludes that these technologies are converging, with future directions focusing on causal explanation, efficiency improvement, and long-term user well-being, driving recommender systems toward greater intelligence, robustness, reliability, and human-centricity.


References

[1] D. Wang, X. Zhao, "Affective video recommender systems: A survey," Frontiers in Neuroscience, vol. 16, Art. no. 984404, 2022.

[2] Q. Liu, J. Hu, Y. Xiao, X. Zhao, J. Gao, W. Wang, Q. Li, J. Tang, "Multimodal Recommender Systems: A Survey," arXiv preprint arXiv:2302.03883, 2024.

[3] J. Pan, Z. He, Z. Li, Y. Liang, L. Qiu, "A review of multimodal emotion recognition," CAAI Transactions on Intelligent Systems, vol. 15, no. 4, pp. 633-645, 2020.

[4] Y. Wu, Q. Mi, T. Gao, "A Comprehensive Review of Multimodal Emotion Recognition: Techniques, Challenges, and Future Directions," Biomimetics, vol. 10, no. 7, Art. no. 418, 2025.

[5] D. Mamieva, A. B. Abdusalomov, A. Kutlimuratov, B. Muminov, T. K. Whangbo, "Multimodal Emotion Detection via Attention-Based Fusion of Extracted Facial and Speech Features," Sensors, vol. 23, no. 12, Art. no. 5475, 2023.

[6] N. Lee, J. Kim, "SEALR: Sequential Emotion-Aware LLM-Based Personalized Recommendation System," 48th International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR '25), Padua, Italy, 2025, pp. 1-5.

[7] S. Li, F. Xue, K. Liu, D. Guo, R. Hong, "Multimodal Graph Causal Embedding for Multimedia-Based Recommendation," IEEE Transactions on Knowledge and Data Engineering, vol. 36, no. 12, pp. 8842-8858, 2024.

[8] Y. Jiang, L. Xia, W. Wei, D. Luo, K. Lin, C. Huang, "DiffMM: Multi-Modal Diffusion Model for Recommendation," arXiv preprint arXiv:2406.11781, 2024.

[9] H. Liao, S. Wang, H. Cheng, W. Zhang, J. Zhang, M. Zhou, K. Lu, R. Mao, X. Xie, "Aspect-Enhanced Explainable Recommendation with Multi-modal Contrastive Learning," ACM Transactions on Intelligent Systems and Technology, vol. 16, no. 1, Article 8, 2025.

[10] X. Zhou, Q. Yang, X. Zheng, W. Liang, K. I-K. Wang, J. Ma, Y. Pan, Q. Jin, "Personalized Federated Learning with Model-Contrastive Learning for Multi-Modal User Modeling in Human-Centric Metaverse," IEEE Journal on Selected Areas in Communications, vol. 42, no. 4, pp. 817-832, 2024.

[11] Y. Cui, F. Liu, P. Wang, B. Wang, H. Tang, Y. Wan, J. Wang, J. Chen, "Distillation Matters: Empowering Sequential Recommenders to Match the Performance of Large Language Models," 18th ACM Conference on Recommender Systems (RecSys '24), Bari, Italy, 2024, pp. 507-517.

[12] T. Carraro, "Overcoming Recommendation Limitations with Neuro-Symbolic Integration," 17th ACM Conference on Recommender Systems (RecSys '23), Singapore, 2023, pp. 1325-1331.

[13] L. Chen, Q. Dai, Z. Zhang, X. Feng, M. Zhang, P. Tang, X. Chen, Y. Zhu, Z. Dong, "RecUserSim: A Realistic and Diverse User Simulator for Evaluating Conversational Recommender Systems," ACM Web Conference 2025 (WWW '25 Companion), Sydney, Australia, 2025, pp. 133-142.

Published

12-03-2026

How to Cite

Deng, W. (2026). Application of Multimodal Emotion Recognition Technology in Recommendation Systems. Highlights in Science, Engineering and Technology, 161, 43-52. https://doi.org/10.54097/ek3g1e85