Will Brown

Researcher · Prime Intellect

Twitter / X

Bio

Columbia PhD (2024). Lead author on PRIME-RL (async & decentralized RL training at scale) and Verifiers (environments for LLM RL). Focus: multi-turn reasoning in LLM agents, credit assignment, distributed RL. Published at NeurIPS/ICLR. Ex-Morgan Stanley. willcb.com

Session (1)

Day 3 · 1:55pm-2:15pm · Track 9

PRIME-RL: Async & Decentralized RL Training at Scale

Will Brown is Research Lead at Prime Intellect and the creator and maintainer of the open-source `verifiers` library and the Environments Hub, which makes him a hands-on practitioner in agentic reinforcement learning. His session covers the infrastructure decisions and research behind multi-turn RL training at scale.

Recent talks (4)

RL Environments at Scale – Will Brown, Prime Intellect · AI Engineer Code Summit · Nov 2025

How Prime Intellect Builds Scalable Infrastructure for Agentic RL · Ray Summit 2025 · Nov 2025

Training Agentic Reasoners — Will Brown, Prime Intellect · CO/AI · Jul 2025

Open Questions in Agentic RL — Will Brown (Prime Intellect) · Intelligence Unbound · May 2025

GitHub

@willccbb

Recent writing (2)

Reinforcing Multi-Turn Reasoning in LLM Agents via Turn-Level Reward Design · paper · May 2025
Technical Report: Full-Stack Fine-Tuning for the Q Programming Language · paper · Aug 2025

Podcasts & interviews (2)

Multi-Turn RL for Multi-Hour Agents — with Will Brown, Prime Intellect · Latent Space · May 2025
Building the GitHub for RL Environments: Prime Intellect's Will Brown & Johannes Hagemann · Training Data (Sequoia Capital) · Feb 2026

Public activity researched automatically · as of Jun 2026