Hi there! I'm a research scientist at Google DeepMind working on scalable Reinforcement Learning (RL) & post-training recipes. I was most recently a captain/co-owner for Gemini 3 Flash and a core contributor to Gemini 3 Pro. Previously, I was a Member of Technical Staff at Cohere, where I worked on the post-training of the Command series of models and worked under Sara Hooker, advancing the frontier of post-training research for LLMs. In a previous lifetime, I worked on CAD tool optimization for Field Programmable Gate Arrays (FPGAs) and various aspects of hardware design at the University of Toronto. In yet another lifetime, I played video games professionally :)

Select Publications

  1. Back to basics: Revisiting REINFORCE style optimization for learning from human feedback in LLMs ACL 2024
    Arash Ahmadian, Chris Cremer, Matthias Gallé, Marzieh Fadaee, Julia Kreutzer, Olivier Pietquin, Ahmet Üstün, Sara Hooker
  2. Intriguing Properties of Quantization at Scale NeurIPS 2023
    Arash Ahmadian*, Saurabh Dash*, Hongyu Chen*, Bharat Venkitesh, Stephen Gou, Phil Blunsom, Ahmet Üstün, Sara Hooker
  3. Aya Expanse: Combining research breakthroughs for a new multilingual frontier
    John Dang*, Shivalika Singh*, Daniel D’souza*, Arash Ahmadian*, et al.
  4. Self-improving robust preference optimization ICLR 2025
    Eugene Choi*, Arash Ahmadian*, Matthieu Geist, Olivier Pietquin, Mohammad Gheshlaghi Azar
  5. Extremely parameter efficient MoE for instruction tuning ICLR 2024
    Ted Zadouri, Ahmet Üstün, Arash Ahmadian, Beyza Ermiş, Acyr Locatelli, Sara Hooker
  6. RLHF can speak many languages: Unlocking multilingual preference optimization for LLMs EMNLP 2024 (Oral)
    John Dang, Arash Ahmadian, Kelly Marchisio, Julia Kreutzer, Ahmet Üstün, Sara Hooker