Welcome to the MM Group

Welcome to the Multimedia (MM) Research Group at the School of Artificial Intelligence, College of Intelligence and Computing, Tianjin University. Our research focuses on fundamental theories, efficient learning algorithms, and intelligent system applications in multimedia, computer vision, and artificial intelligence, with a particular emphasis on intelligent perception and embodied intelligence in open and complex environments. We aim to move beyond traditional static visual understanding paradigms and advance intelligent systems from passive perception toward autonomous agents capable of cognition, decision-making, and action.

To address intelligence in real-world open environments, our recent research centers on open-world perception and 3D environmental understanding, including autonomous driving perception and decision-making, embodied navigation and interactive learning, 3D visual perception and scene modeling, and unified Vision–Language–Action (VLA) modeling frameworks. These efforts aim to enable intelligent systems to achieve continual generalization, adaptive perception, and reliable action in unseen environments and under dynamic distribution shifts. Building upon these directions, we further investigate the safety and reliability of deep learning models in real-world deployment scenarios. Our research systematically explores out-of-distribution generalization, adversarial robustness, and interpretable learning mechanisms in open environments. By integrating cross-modal understanding, representation learning, causal modeling, and adversarial learning, we seek to uncover the underlying behavioral mechanisms of intelligent models and establish theoretical and methodological foundations for building safe, trustworthy, and long-term autonomous systems for intelligent perception and embodied decision-making.

We have published papers in leading journals and conferences in multimedia, computer vision, machine learning, and artificial intelligence, including IEEE TPAMI, IEEE TIP, IEEE TMM, IEEE TKDE, IEEE TIFS, IEEE TNNLS, IEEE TCSVT, IEEE TCYB, ACM MM, CVPR, ICCV, ECCV, NeurIPS, ICML, AAAI, IJCAI, and SIGIR. One PhD student was awarded the China Society of Image and Graphics (CSIG) Outstanding Dissertation 2021, and two master's students were selected for the Tencent Rhino-Bird Elite Talent Training Program in 2018 and 2020, respectively. Our team achieved third place overall in the Untargeted and Targeted Attack Tracks of the NeurIPS 2018 Adversarial Vision Challenge. Our paper was a “Best Paper Finalist” at ACM Multimedia 2017. We have also placed at the top of major technical challenges, winning the Large Scale Movie Description Challenge (LSMDC 2017, held jointly with ICCV 2017) and finishing as runner-up in the 2nd MSR Large-Scale Video to Language Challenge (Honorable Mention Award of the Grand Challenge at ACM MM 2017).

We are looking for passionate new PhD and master's students to join the team!
If you are interested, please contact Prof. Yahong Han.

Undergraduate courses:
Media Computing

News

7. July 2026

Zihao's paper "Prototype-Anchored Generalized Manifold Regression for Unknown-Domain Object Detection" was accepted by IEEE TPAMI.

28. May 2026

Zihao Zhang was selected for the 2026 CCF Doctoral Student Funding Program (2026年CCF博士生资助计划).

1. May 2026

Xitie's paper "Decompose and Recompose: Reasoning New Skills from Existing Abilities for Cross-Task Robotic Manipulation" was accepted by ICML 2026 (CCF-A).

21. February 2026

Three papers on 3D perception, embodied navigation, and end-to-end autonomous driving were accepted by CVPR 2026 (CCF-A).

20. February 2026

The paper "Hierarchical Cross-Modal Reasoning for Visible-Infrared Camouflaged Object Detection" was accepted by IEEE TMM.

12. February 2026

Zihao's paper "Fourier-KAN: Feature Distribution Decomposition and Recombination for Unknown-Domain Object Detection" was accepted by IEEE Transactions on Image Processing (CCF-A).

24. December 2025

Nana Yu's paper "Enhanced Visual Prompt Meets Low-Light Saliency Detection" was accepted by Pattern Recognition.
