About me
I am a CS PhD student at UIUC, advised by Prof. Tong Zhang and Prof. Huan Zhang. Previously, I earned my bachelor’s and master’s degrees from the Department of Automation at Tsinghua University and CSE, HKUST. My current research focuses on agents, trustworthy LLMs/VLMs, and deep reinforcement learning.
Prior to my PhD, I was fortunate to work closely with Prof. Chongjie Zhang (Washington University in St. Louis), Dr. Lei Han (Tencent AI Lab), and Prof. Meng Fang (University of Liverpool).
News
- 🎉 (2025.9) GUI-Actor and ADG are accepted to NeurIPS 2025! MergeBench is accepted to the Datasets & Benchmarks Track! Congrats to all co-authors!
- 🎉 (2025.8) MiCRo is accepted to EMNLP 2025 main conference with award nomination!
- 🌟 (2025.6) We’ve released GUI-Actor, a novel GUI grounding model that combines an attention-based action head with a grounding verifier. Explore more on our project page!
- 💻 (2025.5) Starting my internship at Microsoft Research, Redmond.
- 🎉 (2025.5) Decomposed Reward Models (DRMs) is accepted to ACL 2025.
- 🎉 (2025.5) EmbodiedBench is accepted to ICML 2025 as an oral paper! Thanks to my co-authors!
- 🌟 (2025.2) We released EmbodiedBench, a new comprehensive and multifaceted benchmark for multimodal embodied agents. Check out our paper and project page.
- 🎉 (2025.1) Robust Decision Transformer and DynaMath are accepted by ICLR 2025! Updated versions will be posted soon.
- 🌟 (2024.10) A dynamic visual math benchmark is out! Check the project page and the DynaMath paper.
- 🎉 (2024.9) GRM is accepted by NeurIPS 2024! Check out our GRM series here.
- 🎉 (2024.5) Rewards-in-Context (RiC) is accepted by ICML 2024! Thanks to my co-authors!
- 🎉 (2024.1) Robust IQL is accepted by ICLR 2024 as a spotlight paper!
Selected Publications
Multimodal GUI Agent and Embodied Agent
GUI-Actor: Attention-based Grounding with Verifiable Action Head for GUI Agents. NeurIPS 2025. [code] [website]
Qianhui Wu$^*$, Kanzhi Cheng$^*$, Rui Yang$^*$, Chaoyun Zhang, Jianwei Yang, Huiqiang Jiang, Jian Mu, Baolin Peng, Bo Qiao, Reuben Tan, Si Qin, Lars Liden, Qingwei Lin, Huan Zhang, Tong Zhang, Jianbing Zhang, Dongmei Zhang, Jianfeng Gao
EmbodiedBench: Comprehensive Benchmarking Multi-modal Large Language Models for Vision-Driven Embodied Agents. ICML 2025 (Oral). [code] [website]
Rui Yang$^*$, Hanyang Chen $^*$, Junyu Zhang $^*$, Mark Zhao $^*$, Cheng Qian, Kangrui Wang, Qineng Wang, Teja Venkat Koripella, Marziyeh Movahedi, Manling Li, Heng Ji, Huan Zhang, Tong Zhang
Multimodal Math Reasoning
- DynaMath: A Dynamic Visual Benchmark for Evaluating Mathematical Reasoning Robustness of Vision Language Models. ICLR 2025. [code] [website]
Chengke Zou $^*$, Xingang Guo $^*$, Rui Yang $^*$, Junyu Zhang, Bin Hu, Huan Zhang.
ML for LLMs
Regularizing Hidden States Enables Learning Generalizable Reward Model for LLMs. NeurIPS 2024. [code]
Rui Yang, Ruomeng Ding, Yong Lin, Huan Zhang, Tong Zhang.
Rewards-in-Context: Multi-objective Alignment of Foundation Models with Dynamic Preference Adjustment. ICML 2024. [code]
Rui Yang $^*$, Xiaoman Pan $^*$, Feng Luo $^*$, Shuang Qiu $^*$, Han Zhong, Dong Yu, Jianshu Chen.
Rethinking Diverse Human Preference Learning through Principal Component Analysis. ACL 2025 (Findings).
Feng Luo$^*$, Rui Yang$^*$, Hao Sun, Chunyuan Deng, Jiarui Yao, Jingyan Shen, Huan Zhang, Hanjie Chen.
MiCRo: Mixture Modeling and Context-aware Routing for Personalized Preference Learning. EMNLP 2025 (Main).
Jingyan Shen$^*$, Jiarui Yao$^*$, Rui Yang$^*$, Yifan Sun, Feng Luo, Rui Pan, Tong Zhang, Han Zhao.
Robust Offline RL
Robust Decision Transformer: Tackling Data Corruption in Offline RL via Sequence Modeling. ICLR 2025.
Jiawei Xu $^*$, Rui Yang $^*$, Shuang Qiu, Feng Luo, Meng Fang, Baoxiang Wang, Lei Han.
Towards Robust Offline Reinforcement Learning under Diverse Data Corruption. ICLR 2024. (Spotlight) [code]
Rui Yang $^*$, Han Zhong $^*$, Jiawei Xu $^*$, Amy Zhang, Chongjie Zhang, Lei Han, Tong Zhang.
Corruption-Robust Offline Reinforcement Learning with General Function Approximation. NeurIPS 2023. [code]
Chenlu Ye $^*$, Rui Yang $^*$, Quanquan Gu, Tong Zhang.
RORL: Robust Offline Reinforcement Learning via Conservative Smoothing. NeurIPS 2022. (Spotlight) [code]
Rui Yang $^*$, Chenjia Bai $^*$, Xiaoteng Ma, Zhaoran Wang, Chongjie Zhang, Lei Han.
Goal-conditioned RL
What Is Essential for Unseen Goal Generalization of Offline Goal-conditioned RL? ICML 2023. [code]
Rui Yang, Yong Lin, Xiaoteng Ma, Hao Hu, Chongjie Zhang, Tong Zhang.
Rethinking Goal-conditioned Supervised Learning and Its Connection to Offline RL. ICLR 2022. [code]
Rui Yang, Yiming Lu, Wenzhe Li, Hao Sun, Meng Fang, Yali Du, Xiu Li, Lei Han, Chongjie Zhang.
Experiences
Research Intern at Microsoft Research, 2025.
Research Intern at Tencent AI Lab and Robotics X Lab, 2020-2022 (Multiple internship terms).
ML Intern at Meituan Financial Service Group, 2019.
Services
Conference Reviewer: ICML, ICLR, NeurIPS ($\color{red}{\text{Top Reviewer}}$ in NeurIPS 2023), ACL/ARR, ICRA, AAMAS.
Journal Reviewer: IEEE Robotics and Automation Letters (RA-L), IEEE Transactions on Neural Networks and Learning Systems (TNNLS), IEEE Transactions on Artificial Intelligence (TAI), Machine Learning, Journal of Artificial Intelligence Research.
Teaching Assistant: COMP 4211 Machine Learning, HKUST; COMP 1021 Introduction to Computer Science, HKUST.
Hobbies
In my leisure time, I enjoy sports like running, table tennis, and swimming. During my time at Tsinghua University, I was an amateur long-distance runner. In 2019, I completed a half marathon (21.0975 km) in 1 h 30 min and a full marathon (42.195 km) in 3 h 36 min. However, since starting my PhD I haven’t had time for regular running training, so I’ve let it slide. Hopefully I’ll get a chance to update my record once I graduate🙂.