Publications

Authors marked with * contributed equally.

2024

  1. Distributionally Robust Reinforcement Learning with Interactive Data Collection: Fundamental Hardness and Near-Optimal Algorithm
    Miao Lu*, Han Zhong*, Tong Zhang, Jose Blanchet
    arXiv preprint, Apr 2024 [PDF]

  2. Benign Oscillation of Stochastic Gradient Descent with Large Learning Rates
    Miao Lu*, Beining Wu*, Xiaodong Yang, Difan Zou
    NeurIPS Workshop on Mathematics of Modern Machine Learning (M3L) 2023
    International Conference on Learning Representations (ICLR) 2024 [PDF] [Poster]

2023

  1. Maximize to Explore: One Objective Function Fusing Estimation, Planning, and Exploration
    Zhihan Liu*, Miao Lu*, Wei Xiong*, Han Zhong, Hao Hu, Shenao Zhang, Sirui Zheng, Zhuoran Yang, Zhaoran Wang
    Advances in Neural Information Processing Systems (NeurIPS) 2023, Spotlight [PDF] [Code] [Slides]

  2. Double Pessimism is Provably Efficient for Distributionally Robust Offline Reinforcement Learning: Generic Algorithm and Robust Partial Coverage
    Jose Blanchet, Miao Lu, Tong Zhang, Han Zhong (alphabetical)
    Advances in Neural Information Processing Systems (NeurIPS) 2023 (short version)
    arXiv preprint (long version), Aug 2023 [PDF] [Slides]

  3. Pessimism in the Face of Confounders: Provably Efficient Offline Reinforcement Learning in Partially Observable Markov Decision Processes
    Miao Lu, Yifei Min, Zhaoran Wang, Zhuoran Yang
    International Conference on Learning Representations (ICLR) 2023 [PDF] [Slides]

2022

  1. Welfare Maximization in Competitive Equilibrium: Reinforcement Learning for Markov Exchange Economy
    Zhihan Liu*, Miao Lu*, Zhaoran Wang, Michael I. Jordan, Zhuoran Yang
    International Conference on Machine Learning (ICML) 2022 [PDF] [Code] [Slides]

  2. Learning Pruning-Friendly Networks via Frank-Wolfe: One-Shot, Any-Sparsity, and No Retraining
    Miao Lu*, Xiaolong Luo*, Tianlong Chen, Wuyang Chen, Dong Liu, Zhangyang Wang
    International Conference on Learning Representations (ICLR) 2022, Spotlight [PDF] [Code] [Slides]

  3. Learning Robust Policy against Disturbance in Transition Dynamics via State-Conservative Policy Optimization
    Yufei Kuang, Miao Lu, Jie Wang, Qi Zhou, Bin Li, Houqiang Li
    AAAI Conference on Artificial Intelligence (AAAI) 2022 [PDF] [Code] [Slides]