How to Stop Shipping Low-Quality RL Environments

Session | Engineering track | tentative

Day: Day 2 (Session Day 1)
Time: 3:45pm-4:05pm
Room: Track 9
Track: Forward Deployed Engineering

Accessible with the Engineering pass and above.

About this session

Training harnesses are data generators: when environments are flaky, stale, reward-hacked, or mismatched to production, models learn the wrong behavior. This talk distills common RL environment failures across coding, SaaS, and support-agent workflows, then offers a practical framework for building production-grade harnesses with clean signal, realistic state, fail-fast behavior, and trustworthy rewards.

Speaker