My recent research centers on reinforcement learning (RL) and long-horizon LLM post-training.
I have a background in probability and statistics, and my past research covers the mathematical theory and algorithm design of RL, training dynamics and generalization in deep learning, and RL for operations and economics.