I’m a postgraduate student in the Department of Computer Science and Engineering (CSE) at the Hong Kong University of Science and Technology, supervised by Prof. Tong Zhang. I received my master’s and bachelor’s degrees from the Department of Automation at Tsinghua University. I am fortunate to work closely with Prof. Chongjie Zhang (Washington University in St. Louis), Dr. Lei Han (Tencent AI Lab), and Prof. Meng Fang (University of Liverpool). My research interests lie in deep reinforcement learning (RL), especially goal-conditioned RL, offline RL, model-based RL, and the application of RL algorithms to Large Language Models (LLMs) and game AI.

Currently, I am working on improving the robustness and generalization of deep reinforcement learning, as well as the trustworthiness of LLMs. Feel free to contact me by email if you are interested in a discussion or collaboration.


  • 🎉 (2024.5) Rewards-in-Context (RiC) has been accepted to ICML 2024! Thanks to my co-authors!
  • 🎉 (2024.5) GOPlan has been accepted to Transactions on Machine Learning Research (TMLR)!
  • 🎉 (2024.1) Robust IQL has been accepted to ICLR 2024 as a spotlight paper!

  • 2022.09 - now, postgraduate student, Department of Computer Science and Engineering, the Hong Kong University of Science and Technology.
  • 2019.09 - 2022.07, Master, Department of Automation, Tsinghua University.
  • 2015.09 - 2019.07, Bachelor, Department of Automation, Tsinghua University.


  • Internship at Tencent AI Lab

  • Internship at Meituan Financial Service Group


Conference Reviewer: ICML (2022, 2024), ICLR (2024), NeurIPS (2022, 2023 $\color{red}{\text{Top Reviewer}}$), ICRA (2023), AAMAS (2024).

Journal Reviewer: IEEE Robotics and Automation Letters (RA-L), IEEE Transactions on Artificial Intelligence (TAI), Machine Learning.


During my leisure time, I like sports such as running, table tennis and swimming. I used to be an amateur long-distance runner at Tsinghua University. I finished a half marathon (21.0975 km) in 1h30min and a marathon (42.195 km) in 3h36min.