News

06/2026 🌋: Check out our new work ACTS, which builds a controller agent that steers chain-of-thought in-flight for efficient and budget-aware reasoning!
05/2026 🌋: Our multi-objective alignment paper has been accepted to ICML 2026!
04/2026 🌋: Check out our new work HiLL, which learns to generate adaptive and transferable hints via RL to address GRPO signal collapse!
04/2026 🌋: I am joining Meta this summer as a research scientist intern. See you in the Bay Area!