Application of Multimodal Emotion Recognition Technology in Recommendation Systems

Authors

  • Wenhao Deng

DOI:

https://doi.org/10.54097/ek3g1e85

Keywords:

Multimodal emotion recognition technology, recommender system, large language model.

Abstract

Traditional recommender systems, which rely mainly on users' historical behavioral data to predict preferences, often overlook the role of real-time emotions in decision-making and consequently fail to meet users' emotional needs. This study focuses on integrating multimodal emotion recognition technology with recommender systems to address this issue. It establishes the necessity of such integration by analyzing the shift in recommender systems from correlation mining to causal understanding and the value of multimodal emotion recognition. The research systematically analyzes five cutting-edge technologies: empathetic recommendation via large language models, robustness enhancement through causal inference, generative recommendation using diffusion models, emotion alignment via cross-modal contrastive learning, and privacy-preserving recommendation based on federated learning. These technologies effectively tackle issues such as emotional understanding, data bias, sparsity, modal alignment, and privacy protection. The research then explores core challenges, including efficiency, interpretability, and short-term homogenization, along with solutions such as knowledge distillation and neuro-symbolic methods. The study concludes that these technologies are converging, with future directions focusing on causal explanation, efficiency improvement, and long-term user well-being, driving recommender systems toward greater intelligence, robustness, reliability, and human-centricity.


References

[1] D. Wang, X. Zhao, "Affective video recommender systems: A survey," Frontiers in Neuroscience, vol. 16, Art. no. 984404, 2022.

[2] Q. Liu, J. Hu, Y. Xiao, X. Zhao, J. Gao, W. Wang, Q. Li, J. Tang, "Multimodal Recommender Systems: A Survey," arXiv preprint arXiv:2302.03883, 2024.

[3] J. Pan, Z. He, Z. Li, Y. Liang, L. Qiu, "A review of multimodal emotion recognition," CAAI Transactions on Intelligent Systems, vol. 15, no. 4, pp. 633-645, 2020.

[4] Y. Wu, Q. Mi, T. Gao, "A Comprehensive Review of Multimodal Emotion Recognition: Techniques, Challenges, and Future Directions," Biomimetics, vol. 10, no. 7, Art. no. 418, 2025.

[5] D. Mamieva, A. B. Abdusalomov, A. Kutlimuratov, B. Muminov, T. K. Whangbo, "Multimodal Emotion Detection via Attention-Based Fusion of Extracted Facial and Speech Features," Sensors, vol. 23, no. 12, Art. no. 5475, 2023.

[6] N. Lee, J. Kim, "SEALR: Sequential Emotion-Aware LLM-Based Personalized Recommendation System," 48th International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR '25), Padua, Italy, 2025, pp. 1-5.

[7] S. Li, F. Xue, K. Liu, D. Guo, R. Hong, "Multimodal Graph Causal Embedding for Multimedia-Based Recommendation," IEEE Transactions on Knowledge and Data Engineering, vol. 36, no. 12, pp. 8842-8858, 2024.

[8] Y. Jiang, L. Xia, W. Wei, D. Luo, K. Lin, C. Huang, "DiffMM: Multi-Modal Diffusion Model for Recommendation," arXiv preprint arXiv:2406.11781, 2024.

[9] H. Liao, S. Wang, H. Cheng, W. Zhang, J. Zhang, M. Zhou, K. Lu, R. Mao, X. Xie, "Aspect-Enhanced Explainable Recommendation with Multi-modal Contrastive Learning," ACM Transactions on Intelligent Systems and Technology, vol. 16, no. 1, Article 8, 2025.

[10] X. Zhou, Q. Yang, X. Zheng, W. Liang, K. I-K. Wang, J. Ma, Y. Pan, Q. Jin, "Personalized Federated Learning with Model-Contrastive Learning for Multi-Modal User Modeling in Human-Centric Metaverse," IEEE Journal on Selected Areas in Communications, vol. 42, no. 4, pp. 817-832, 2024.

[11] Y. Cui, F. Liu, P. Wang, B. Wang, H. Tang, Y. Wan, J. Wang, J. Chen, "Distillation Matters: Empowering Sequential Recommenders to Match the Performance of Large Language Models," 18th ACM Conference on Recommender Systems (RecSys '24), Bari, Italy, 2024, pp. 507-517.

[12] T. Carraro, "Overcoming Recommendation Limitations with Neuro-Symbolic Integration," 17th ACM Conference on Recommender Systems (RecSys '23), Singapore, 2023, pp. 1325-1331.

[13] L. Chen, Q. Dai, Z. Zhang, X. Feng, M. Zhang, P. Tang, X. Chen, Y. Zhu, Z. Dong, "RecUserSim: A Realistic and Diverse User Simulator for Evaluating Conversational Recommender Systems," ACM Web Conference 2025 (WWW '25 Companion), Sydney, Australia, 2025, pp. 133-142.

Published

12-03-2026

How to Cite

Deng, W. (2026). Application of Multimodal Emotion Recognition Technology in Recommendation Systems. Highlights in Science, Engineering and Technology, 161, 43-52. https://doi.org/10.54097/ek3g1e85