Why building agent quality platforms is hard.

Session · Expo track · confirmed

Day
Day 2 — Session Day 1
Time
12:05pm-12:25pm
Room
Expo Stage 2
Track
Expo
Accessible with the Expo Explorer pass and above.

About this session

An eval platform is not just a test runner. You are building a shared definition of what "good" means, along with reliable data pipelines, labeling workflows, versioning, and trust in results that holds across many teams and model changes. This session breaks down the hidden complexity, the common failure modes, and the design principles that make evals credible and usable in day-to-day engineering.

Speaker