Mousepower: agents that can’t be measured, can’t be managed.

Session, Engineering track, confirmed

Day: Day 3 — Session Day 2
Time: 12:05pm-12:25pm
Room: Track 6
Track: AI Designers/Design Engineers

Accessible with the Engineering pass and above.

About this session

Agents have a measurement problem, which makes them impossible to manage efficiently. You’ve likely heard it said that execution is now cheap and judgement is the new bottleneck. This is because our evaluation frameworks weren’t designed for systems that produce output tirelessly and in parallel. The canary in the coal mine is code generation: it has become largely solved, but at the expense of breaking code review. As agents reverberate across all knowledge work, the same fracture will spread to artifacts, actions, and decisions. Yet without a scalable quality measure, we can’t ascend to a higher level of abstraction because we won’t trust the foundation below. So how do we design measurements that are efficient, intuitive, and trustworthy? Past paradigm shifts offer inspiration: James Watt not only built a better engine but also invented horsepower to map it onto existing mental models. We need an equivalent quantification to communicate the “mousepower” of agents. Information theory gives us the starting point: concepts like entropy, ergodic processes, and Hamiltonian problems point us toward the most tractable trajectories, those that are easier to verify than they are to solve.

Topics

AI Designers/Design Engineering, Evals & Observability, AI Product Management (PMs)

Speaker

Maximillian Piras

Yutori