The Problem with Averages

What a LinkedIn puzzle can teach us about the hidden flaws in our engineering and AI metrics.

I play those LinkedIn puzzles every day. As they get harder, my completion time goes up. No surprise there. But I noticed something else. My time was also getting much further away from the average completion time. My first thought was, "Am I slowing down more than everyone else?"

Then it hit me. Maybe the "everyone else" was changing. As the puzzles get harder, fewer people finish them. The average is no longer calculated from the whole group. It’s calculated from the small group of people who actually made it to the end. The metric was hiding a crucial piece of the story.

It's Not Just puzzles

This whole thing reminded me of work, especially when we talk about engineering metrics. We love to track things like velocity or cycle time. And when a team’s velocity drops, my first reaction is often, “They’re slowing down.”

But maybe not. Maybe the team just picked up a task with a ton of hidden complexity. Or maybe they’re tackling tech debt that will speed everyone up later. The difficulty of their "puzzle" went way up, but the metric doesn't show that. It just shows a lower number.

Looking at a single number seldom gives us the full picture. It tells us what happened, but it almost never tells us why.

The Same Trap in AI

Now we have AI that doesn't just answer questions. It takes action. We call them agents. They can plan tasks, use tools, and try to get things done on their own. So how do we measure them? The obvious metric is Task Success. Did it book the flight? Did it summarize the documents? Yes or no. Simple.

But it's the same trap, all over again. An agent could have a 100% success rate. But what if it took 50 steps to do something that should have taken three? What if it spent a ton of money on different tools to get there? Or what if a human had to jump in and help it along the way?

That "100% success" metric looks great on a chart. But it hides how clumsy, expensive, or helpless the agent really was. It’s yet another simple number that makes us feel good, while hiding a much more complicated truth.

Metrics Are a Starting Point

Whether it’s a puzzle average time to complete, a team's velocity, or an AI's accuracy, a single number isn't the answer. It’s a signal. It's a reason to pause and ask questions.