Hanze Dong

About

Hanze Dong is a Senior Researcher at Microsoft Research and a founding member of the Singapore research lab. He was previously a Research Scientist at Salesforce Research. He serves as the managing editor of JMLR and has published in leading machine learning journals and conferences. Hanze earned a PhD in Mathematics from HKUST under the supervision of Professor Tong Zhang and a BSc in Mathematics from Fudan University.
Recently, he has been working on the reproducibility and interpretability of modern foundation models. He is a core author of several influential open-source post-training packages on GitHub. His theoretical work explores the core principles of generalization and optimization in foundation models, encompassing theory-guaranteed training algorithms and the analysis of diffusion-like dynamics.

His current research focuses on the post-training and alignment of foundation models, generative modeling, and Monte Carlo sampling. He also has some experience in designing robust and efficient ML algorithms as well as ML theory.

Hanze is open to research collaborations, especially on post-training, RL, and diffusion models. He can be reached via email.

News

[2025-09] One paper has been accepted at NeurIPS 2025.
[2025-08] Will serve as the publication chair of ICML 2026.
[2025-05] One paper has been accepted at ACL 2025 (Findings).
[2025-05] RSD has been accepted at ICML 2025.
[2025-04] Attending ICLR 2025 in Singapore.
[2025-01] Two papers have been accepted to ICLR 2025. See you in Singapore.
[2024-12] Attending NeurIPS 2024 in Vancouver, Canada.
[2024-12] RAFT (TMLR 2023) has been invited for presentation at ICLR 2025.
[2024-10] PAPAL has been accepted to JMLR.
[2024-09] Two papers have been accepted to NeurIPS 2024.
[2024-09] Three papers have been accepted to EMNLP 2024 Main Conference.
[2024-09] RLHF Workflow has been accepted at TMLR.

[2024-07] Reverse Transition Kernel was selected as the best paper at the ICML 2024 Workshop SPIGM.
[2024-07] Attending ICML 2024 in Vienna, Austria.
[2024-06] LMFlow has won the Best Demo Award at NAACL 2024!
[2024-05] One paper has been accepted to COLT 2024; it provides the first sampling algorithm that supports general non-log-Sobolev distributions with quasi-polynomial computational complexity.
[2024-05] Two papers have been accepted to ICML 2024, including Gibbs sampling from human feedback (GSHF) and stochastic proximal sampler.
[2024-03] Excited to share that the LMFlow paper was accepted to the NAACL 2024 Demo Track.
[2024-01] Two papers have been accepted to ICLR 2024, including rdMC and Spurious feature diversification.
[2023-12] The defense of the PhD thesis was successfully completed on November 30th, 2023.
[2023-11] Excited to share that RAFT has been accepted to TMLR. [Link].
[2023-07] Attending ICML 2023 in Honolulu, Hawaiʻi, USA.
[2023-05] Received HKUST RedBird Academic Excellence Award.
[2023-05] Attending ICLR 2023 in Kigali, Rwanda.
[2023-03] We're thrilled to release our project, LMFlow! This framework streamlines the development of LLMs, including fine-tuning, inference, and RLHF, in a more cost-effective and effortless manner. Our aspiration is that LMFlow will incite a broader range of imaginative applications of LLMs and cultivate a larger community of LLM aficionados!

Research

* denotes equal contribution. ^† denotes corresponding author. Name denotes mentored student/intern.
Full publication list can be found in Google Scholar.

Alignment of Foundation Models

Optimizing Chain-of-Thought Reasoners via Gradient Variance Minimization in Rejection Sampling and RL
Jiarui Yao, Yifan Hao, Hanning Zhang, Hanze Dong, Wei Xiong, Nan Jiang, Tong Zhang;
Annual Conference on Neural Information Processing Systems (NeurIPS), 2025. [Paper]
Automatic Curriculum Expert Iteration for Reliable LLM Reasoning
Zirui Zhao, Hanze Dong^†, Amrita Saha, Caiming Xiong, Doyen Sahoo;
The Thirteenth International Conference on Learning Representations (ICLR), 2025. [Paper]
Online Iterative Reinforcement Learning from Human Feedback with General Preference Model
Chenlu Ye*, Wei Xiong*, Yuheng Zhang*, Hanze Dong*, Nan Jiang, Tong Zhang;
Annual Conference on Neural Information Processing Systems (NeurIPS), 2024. [Paper]
RLHF Workflow: From Reward Modeling to Online RLHF
Hanze Dong*, Wei Xiong*, Bo Pang*, Haoxiang Wang*, Han Zhao, Yingbo Zhou, Nan Jiang, Doyen Sahoo, Caiming Xiong, Tong Zhang;
Transactions on Machine Learning Research (TMLR), 2024. [Paper]
Iterative Preference Learning from Human Feedback: Bridging Theory and Practice for RLHF under KL-constraint
Wei Xiong*, Hanze Dong*, Chenlu Ye*, Ziqi Wang, Han Zhong, Heng Ji, Nan Jiang, Tong Zhang;
International Conference on Machine Learning (ICML), 2024.
ICLR 2024 Workshop on Mathematical and Empirical Understanding of Foundation Models (Oral). [Paper]
LMFlow: An Extensible Toolkit for Finetuning and Inference of Large Foundation Models
Shizhe Diao*, Rui Pan*, Hanze Dong*, Kashun Shum, Jipeng Zhang, Wei Xiong, Tong Zhang;
Annual Conference of the North American Chapter of the Association for Computational Linguistics (NAACL) -- System Demonstration Track, 2024. (Best Demo Award) [Paper] [Github]
RAFT: Reward ranked finetuning for generative foundation model alignment
Hanze Dong*, Wei Xiong*, Deepanshu Goyal, Yihan Zhang, Winnie Chow, Rui Pan, Shizhe Diao, Jipeng Zhang, Kashun Shum, Tong Zhang;
Transactions on Machine Learning Research (TMLR), 2023. [Paper]
DetGPT: Detect What You Need via Reasoning
Renjie Pi*, Jiahui Gao*, Shizhe Diao*, Rui Pan, Hanze Dong, Jipeng Zhang, Lewei Yao, Jianhua Han, Hang Xu, Lingpeng Kong, Tong Zhang;
The Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023. [Paper]

Generative Modeling, Sampling, and Optimization over the Probability Space

PAPAL: A Provable PArticle-based Primal-Dual ALgorithm for Mixed Nash Equilibrium
Shihong Ding*, Hanze Dong*, Cong Fang, Zhouchen Lin, Tong Zhang;
Journal of Machine Learning Research (JMLR), 2024. [Paper]
Reverse Transition Kernel: A Flexible Framework to Accelerate Diffusion Inference
Xunpeng Huang, Difan Zou, Hanze Dong, Yi Zhang, Yian Ma, Tong Zhang;
Annual Conference on Neural Information Processing Systems (NeurIPS), 2024. (Spotlight)
Best Paper Award of ICML 2024 Workshop SPIGM. [Paper]
Faster Sampling without Isoperimetry via Diffusion-based Monte Carlo
Xunpeng Huang, Difan Zou, Hanze Dong, Yian Ma, Tong Zhang;
Annual Conference on Learning Theory (COLT), 2024. [Paper]
Faster Sampling via Stochastic Gradient Proximal Sampler
Xunpeng Huang, Difan Zou, Hanze Dong, Yian Ma, Tong Zhang;
International Conference on Machine Learning (ICML), 2024. [Paper]
Reverse Diffusion Monte Carlo
Xunpeng Huang*, Hanze Dong*, Yifan Hao, Yian Ma, Tong Zhang;
International Conference on Learning Representations (ICLR), 2024. [Paper]
Particle-based Variational Inference with Preconditioned Functional Gradient Flow
Hanze Dong, Xi Wang, Yong Lin, Tong Zhang;
International Conference on Learning Representations (ICLR), 2023. [Paper]
Weakly Supervised Disentangled Generative Causal Representation Learning
Xinwei Shen, Furui Liu, Hanze Dong, Qing Lian, Zhitang Chen, Tong Zhang;
Journal of Machine Learning Research (JMLR), 2022. [Paper]
Normalizing Flow with Variational Latent Representation
Hanze Dong*, Shizhe Diao*, Weizhong Zhang, Tong Zhang;
arXiv preprint arXiv:2211.11638, 2022. [Paper]
Mathematical models of Overparameterized Neural Networks
Cong Fang, Hanze Dong, Tong Zhang;
Proceedings of the IEEE (PIEEE), 2021. [Paper]

Robust and Efficient Machine Learning

Reward-Guided Speculative Decoding for Efficient LLM Reasoning
Baohao Liao*, Yuhui Xu*, Hanze Dong*, Junnan Li, Christof Monz, Silvio Savarese, Doyen Sahoo, Caiming Xiong;
The Forty-Second International Conference on Machine Learning (ICML), 2025. [Paper]
ThinK: Thinner Key Cache by Query-Driven Pruning
Yuhui Xu, Zhanming Jie, Hanze Dong, Lei Wang, Xudong Lu, Aojun Zhou, Amrita Saha, Caiming Xiong, Doyen Sahoo;
The Thirteenth International Conference on Learning Representations (ICLR), 2025. [Paper]
Mitigating the alignment tax of RLHF
Yong Lin*, Hangyu Lin*, Wei Xiong*, Shizhe Diao*, Jianmeng Liu, Jipeng Zhang, Rui Pan, Haoxiang Wang, Wenbin Hu, Hanning Zhang, Hanze Dong, Renjie Pi, Han Zhao, Nan Jiang, Heng Ji, Yuan Yao, Tong Zhang;
The Conference on Empirical Methods in Natural Language Processing (EMNLP), 2024.
MLLM-Protector: Ensuring MLLM's Safety without Hurting Performance
Renjie Pi, Tianyang Han, Yueqi Xie, Rui Pan, Qing Lian, Hanze Dong, Jipeng Zhang, Tong Zhang;
The Conference on Empirical Methods in Natural Language Processing (EMNLP), 2024.
Spurious feature diversification improves out-of-distribution generalization
Yong Lin, Lu Tan, Yifan Hao, Honam Wong, Hanze Dong, Weizhong Zhang, Yujiu Yang, Tong Zhang;
International Conference on Learning Representations (ICLR), 2024.
Catalyst Acceleration of Error Compensated Methods Leads to Better Communication Complexity
Xun Qian, Hanze Dong, Tong Zhang, Peter Richtárik;
International Conference on Artificial Intelligence and Statistics (AISTATS), 2023. [Paper]
Bayesian Invariant Risk Minimization
Yong Lin*, Hanze Dong*, Hao Wang, Tong Zhang;
Conference on Computer Vision and Pattern Recognition (CVPR), 2022. (Oral) [Paper]
Local Augmentation for Graph Neural Networks
Songtao Liu, Rex Ying, Hanze Dong, Lanqing Li, Tingyang Xu, Yu Rong, Peilin Zhao, Junzhou Huang, Dinghao Wu;
International Conference on Machine Learning (ICML), 2022. [Paper]
Learning the Compositional Spaces for Generalized Zero-shot Learning
Hanze Dong, Yanwei Fu, Leonid Sigal, Sung Ju Hwang, Xiangyang Xue;
Computer Vision and Image Understanding (CVIU), 2022. [Paper]
Vocabulary-informed Zero-shot and Open-set Learning
Yanwei Fu, Xiaomei Wang, Hanze Dong, Yu-Gang Jiang, Meng Wang, Xiangyang Xue, Leonid Sigal;
IEEE Transactions on Pattern Analysis and Machine Intelligence, 2019. [Paper]

Academic Services

Managing Editor of JMLR.
Area Chair for NeurIPS and ACL.
Reviewer for ICML, ICLR, CVPR, IEEE Transactions on Signal Processing, and TMLR.

Teaching

2022-2023 Fall -- MATH 2011: Introduction to Multivariable Calculus.

2022-2023 Fall -- MATH 6450J/COMP 6211E: Optimization for Machine Learning.

2021-2022 Spring -- MATH 2011: Introduction to Multivariable Calculus.

2021-2022 Fall -- MATH 1023: Honor Calculus.

2020-2021 Spring -- MATH 1014: Calculus II.

2020-2021 Fall -- MATH 1013: Calculus IB.

2019-2020 Spring -- MATH 1014: Calculus II.

2019-2020 Spring -- MATH 2411: Applied Statistics.

Awards

2024 -- Best Paper Award of ICML 2024 Workshop on Structured Probabilistic Inference & Generative Modeling.

2024 -- NAACL Best Demo Paper Award.

2022/2023 -- RedBird PhD Award, The Hong Kong University of Science and Technology.

2020/2021/2022 -- Best TA Award, The Hong Kong University of Science and Technology.

2019 -- Outstanding Graduates, Fudan University.

2016 -- First Prize, China Undergraduate Physics Tournament.

2015 -- Outstanding Freshmen, Fudan University

2014 -- Bronze Medal and First Prize, Chinese Physics Olympiad.

Contact Me

A at B dot com
A = hendrydong
B = gmail

For work-related inquiries, please contact C at D dot com
C = hanzedong
D = microsoft

For JMLR-related inquiries, please contact E at jmlr.org
E = hanze.dong