Arash Ahmadian

Computer Engineer · Researcher in RL & Large Models

About

I work at the intersection of reinforcement learning and large models, focusing on scalable RL, preference training, and post-training techniques for deployed models. My work spans model training, efficiency at scale, and practical post-training recipes for large language and multimodal models.

Experience

  • Senior Research Scientist · Google DeepMind

    June 2025 – present

    Scalable RL & post-training; core post-training for released Gemini models.

  • Senior Member of Technical Staff · Cohere

    Sept 2024 – June 2025

    Research: RL, Preference Training, Model Merging. Technical lead on multiple model releases.

  • Researcher / Intern roles · Vector Institute, Univ. of Toronto, Cerebras

Awards & Scholarships

  • NSERC Research Award (2020)
  • Google Summer of Code (2021)
  • Multiple Academic In-course Scholarships (2020–2022)

Select Publications

  1. Back to basics: Revisiting REINFORCE-style optimization for learning from human feedback in LLMs · ACL 2024
  2. Intriguing properties of quantization at scale · NeurIPS 2023
  3. Aya Expanse: Combining research breakthroughs for a new multilingual frontier · (multi-author)
  4. Self-improving robust preference optimization · ICLR 2025
  5. Extremely parameter-efficient MoE for instruction tuning · ICLR 2024
  6. RLHF can speak many languages: Unlocking multilingual preference optimization for LLMs · EMNLP 2024 (Oral)