National Taiwan University

Trustworthy AI Lab

Shao-Yuan Lo
https://shaoyuanlo.github.io

Research Field

Smart Computing (Information)

Introduction

Shao-Yuan Lo is a newly appointed Assistant Professor at National Taiwan University (NTU). Before joining NTU, he was a Research Scientist at Honda Research Institute USA. He received his Ph.D. from Johns Hopkins University in 2023 and his M.S. and B.S. degrees from National Chiao Tung University in 2019 and 2017, respectively. His recent research focuses on Multimodal LLMs and Trustworthy AI. He has published nearly 20 first- or corresponding-author papers in venues such as IEEE T-PAMI, IEEE T-IP, IJCV, ICML (Spotlight), CVPR (Highlight), and ECCV. He received an Outstanding Reviewer Award at CVPR 2025 and the Best Paper Award at ACM Multimedia Asia 2019.

The Trustworthy AI Lab at NTU was founded in August 2025. Our recent research centers on Multimodal Large Language Models (MLLMs): enhancing their safety, including robustness and alignment, and leveraging MLLM reasoning to improve performance on multimodal tasks such as anomaly detection, affective understanding, and theory-of-mind reasoning.


Research Topics
  • Multimodal AI
  • Trustworthy AI
  • Machine Learning
  • Computer Vision

Honors
  • Outstanding Reviewer Award, CVPR 2025
  • Best Paper Award, ACM Multimedia Asia 2019

Educational Background
  • Ph.D. in ECE, Johns Hopkins University, 2023
  • M.S. in EE, National Chiao Tung University, 2019
  • B.S. in EECS, National Chiao Tung University, 2017

Job Description

This research focuses on visual reasoning with multimodal large language models (MLLMs). It aims to develop models that integrate visual and textual information to reason over multimodal inputs. The work uses state-of-the-art MLLMs as the core technology and explores their applications in smart manufacturing and human–AI collaboration. In addition, the research investigates improved training methods that make logical reasoning more stable, thereby improving both the accuracy and interpretability of model predictions.
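
As a rough illustration of the kind of pipeline involved, the sketch below queries an off-the-shelf MLLM about a manufacturing image and asks it to reason step by step before answering. It is a minimal sketch assuming the Hugging Face transformers library and a LLaVA-style checkpoint; the model ID, image path, and prompt are illustrative placeholders, not the lab's actual setup.

    # Minimal sketch: step-by-step visual reasoning with an off-the-shelf MLLM.
    # Assumes transformers, torch, and pillow are installed, plus a GPU.
    import torch
    from PIL import Image
    from transformers import AutoProcessor, LlavaForConditionalGeneration

    model_id = "llava-hf/llava-1.5-7b-hf"  # illustrative checkpoint
    processor = AutoProcessor.from_pretrained(model_id)
    model = LlavaForConditionalGeneration.from_pretrained(
        model_id, torch_dtype=torch.float16, device_map="auto"
    )

    # Hypothetical input: a photo from a smart-manufacturing inspection station.
    image = Image.open("assembly_line.jpg")
    prompt = (
        "USER: <image>\nIs the component in this image assembled correctly? "
        "Reason step by step before giving a final answer. ASSISTANT:"
    )

    inputs = processor(images=image, text=prompt, return_tensors="pt").to(
        model.device, torch.float16
    )
    output_ids = model.generate(**inputs, max_new_tokens=256)
    print(processor.decode(output_ids[0], skip_special_tokens=True))

The actual research would build on stronger MLLMs and task-specific training, but the interface sketched here, an image plus a reasoning prompt answered in text, is the same.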

Preferred Intern Educational Level

B.S., M.S., or Ph.D. in Computer Science or related field

Skill sets or Qualities

  • Solid background in Machine Learning and Deep Learning
  • Proficiency in Linux, Python, and PyTorch
