Publications
Authors marked with * contributed equally.
2024
Distributionally Robust Reinforcement Learning with Interactive Data Collection: Fundamental Hardness and Near-Optimal Algorithm
Miao Lu*, Han Zhong*, Tong Zhang, Jose Blanchet
arXiv preprint, Apr 2024 [PDF]
Benign Oscillation of Stochastic Gradient Descent with Large Learning Rates
Miao Lu*, Beining Wu*, Xiaodong Yang, Difan Zou
NeurIPS Workshop on Mathematics of Modern Machine Learning (M3L) 2023
International Conference on Learning Representations (ICLR) 2024 [PDF] [Poster]
2023
Maximize to Explore: One Objective Function Fusing Estimation, Planning, and Exploration
Zhihan Liu*, Miao Lu*, Wei Xiong*, Han Zhong, Hao Hu, Shenao Zhang, Sirui Zheng, Zhuoran Yang, Zhaoran Wang
Advances in Neural Information Processing Systems (NeurIPS) 2023, Spotlight [PDF] [Code] [Slides]
Double Pessimism is Provably Efficient for Distributionally Robust Offline Reinforcement Learning: Generic Algorithm and Robust Partial Coverage
Jose Blanchet, Miao Lu, Tong Zhang, Han Zhong (alphabetical)
Advances in Neural Information Processing Systems (NeurIPS) 2023 (short version)
arXiv preprint (long version), Aug 2023 [PDF] [Slides]
Pessimism in the Face of Confounders: Provably Efficient Offline Reinforcement Learning in Partially Observable Markov Decision Processes
Miao Lu, Yifei Min, Zhaoran Wang, Zhuoran Yang
International Conference on Learning Representations (ICLR) 2023 [PDF] [Slides]
2022
Welfare Maximization in Competitive Equilibrium: Reinforcement Learning for Markov Exchange Economy
Zhihan Liu*, Miao Lu*, Zhaoran Wang, Michael I. Jordan, Zhuoran Yang
International Conference on Machine Learning (ICML) 2022 [PDF] [Code] [Slides]
Learning Pruning-Friendly Networks via Frank-Wolfe: One-Shot, Any-Sparsity, and No Retraining
Miao Lu*, Xiaolong Luo*, Tianlong Chen, Wuyang Chen, Dong Liu, Zhangyang Wang
International Conference on Learning Representations (ICLR) 2022, Spotlight [PDF] [Code] [Slides]
Learning Robust Policy against Disturbance in Transition Dynamics via State-Conservative Policy Optimization
Yufei Kuang, Miao Lu, Jie Wang, Qi Zhou, Bin Li, Houqiang Li
AAAI Conference on Artificial Intelligence (AAAI) 2022 [PDF] [Code] [Slides]