Superhuman performance is a shape, not just nines.

Session · Leadership track · confirmed

Day: Day 3 — Session Day 2
Time: 1:55pm-2:15pm
Room: Leadership 2
Track: AI Architects: Tokenmaxxing

Accessible with the Leadership (All-Access) pass and above.

About this session

I spent 500B tokens structuring and connecting the entire corpus of biopharma drug data for systems in use by 19 of the top 20 pharmas. These systems perform reliably, without catastrophic errors, on PhD-level tasks at scale, in a rapidly evolving domain.

Past a certain point, the shape of production error rates matters much more than overall accuracy. For example:

- A false positive due to name collisions in biology? For our users, this is a forgivable mistake, the kind a human would make, barely worth a second thought.
- A false negative without near force majeure? Years of broken trust.

Understanding what error shape delivers superhuman value requires product, domain expertise, and customer feedback. I'll review case examples from our experience and highlight non-obvious wins: the cross-org meeting structure, taxonomy of errors, and org-wide eval management/triage strategies we used to know what and when to ship.

Topics

Evals & Observability, LLM Production Infra, AI Architects, AI in Enterprise/Fortune 500, AI in Healthcare

Speaker

Matthew Jewkes

Standard Cybernetics