Trustworthy AI Lab
Research Field
Shao-Yuan Lo is a newly appointed Assistant Professor at National Taiwan University (NTU). Before joining NTU, he was a Research Scientist at Honda Research Institute USA. He received his Ph.D. from Johns Hopkins University in 2023 and his M.S. and B.S. degrees from National Chiao Tung University in 2019 and 2017, respectively. His recent research focuses on Multimodal LLMs and Trustworthy AI. He has first- or corresponding-authored nearly 20 publications in venues such as IEEE T-PAMI, IEEE T-IP, IJCV, ICML (Spotlight), CVPR (Highlight), and ECCV. He received the Outstanding Reviewer Award at CVPR 2025 and the Best Paper Award at ACM Multimedia Asia 2019.
The Trustworthy AI Lab at NTU was founded in August 2025. Our recent research centers on Multimodal Large Language Models (MLLMs), including enhancing their safety, such as robustness and alignment, and leveraging MLLM reasoning to improve performance on multimodal tasks, such as anomaly detection, affective understanding, and theory-of-mind reasoning.
- Multimodal AI
- Trustworthy AI
- Machine Learning
- Computer Vision
- Yushan Young Fellow, Ministry of Education, Taiwan, 2025
- Outstanding Reviewer, IEEE/CVF CVPR, 2025
- Robert F. Wagner All-Conference Best Student Paper Award, SPIE Medical Imaging, 2024
- CVPR Travel Award, IEEE/CVF CVPR, 2023
- Best Paper Award, ACM Multimedia Asia, 2019
- Best Master's Thesis Award, Chinese Image Processing and Pattern Recognition Society, 2019
- Students’ Outstanding Contribution Award, National Chiao Tung University, 2019
- Ph.D. in ECE, Johns Hopkins University, 2023
- M.S. in EE, National Chiao Tung University, 2019
- B.S. in EECS, National Chiao Tung University, 2017
Job Description
This research focuses on visual reasoning with multimodal large language models (MLLMs). It aims to develop advanced capabilities for integrating visual and textual information in multimodal reasoning. The work emphasizes the use of state-of-the-art MLLMs as the core technology and explores their applications in smart manufacturing and human–AI collaboration. In addition, the research investigates improved training methods that enable more stable logical reasoning, thereby improving both the accuracy and interpretability of model predictions.
Preferred Intern Educational Level
B.S., M.S., or Ph.D. in Computer Science or related field
Skill sets or Qualities
- Solid background in Machine Learning and Deep Learning
- Proficiency in Linux, Python, and PyTorch