The Real AI Productivity Story Isn't About Your Best People

  • Feb 24
  • 5 min read

When a Fortune 500 software company quietly rolled out an AI conversational assistant to its customer support team, the productivity results came back strong. Agents resolved 15% more issues per hour on average. Customer sentiment improved. Requests to speak to a manager dropped by 25%.

But the headline number buried the most interesting finding.


The AI barely moved the needle for the company's best, most experienced agents. The people who improved most dramatically - resolving up to 35% more issues per hour - were the newest, least experienced workers on the team. Within two months of using the tool, junior agents were performing at the level it previously took eight months to reach.


The AI hadn't replaced the experts. It had packaged their expertise and handed it to everyone else.



Augmentation Is Not a Buzzword


For the better part of a decade, the dominant public conversation about AI at work has been about displacement - which jobs will survive, which won't, who's next. It's a legitimate concern, and it deserves serious treatment. But it has crowded out an equally important and more immediately actionable question for most leaders: how do you actually design work so that humans and AI make each other better?

That's the question the research is now starting to answer - and the findings are more nuanced, and more useful, than the replacement narrative allows.


The study above, conducted by economists Erik Brynjolfsson, Danielle Li, and Lindsey Raymond and published in the Quarterly Journal of Economics, is one of the most rigorous real-world examinations of generative AI in the workplace to date. Its core finding — that AI augmentation disproportionately benefits less experienced workers by democratizing access to the tacit knowledge of top performers — upends a common assumption about who gains from new technology.


Previous waves of workplace technology, from computers to robotics, consistently favored high-skilled workers. They widened the gap between top performers and everyone else. This study suggests generative AI may work differently: compressing that gap rather than expanding it, at least in certain work contexts.

That is a significant organizational opportunity. And most companies are not yet designed to capture it.


The Complementarity Problem


Here is the central challenge that organizations keep getting wrong. They deploy AI as if it is either a replacement for human work or a simple accelerant of it. The research points to a third, more productive framing: complementarity.


A 2024 paper in Management Science modeled what optimal human-AI collaboration actually looks like across different task types. The findings were precise. AI should automate routine, easily defined tasks. It should augment humans on tasks where human and AI performance are roughly comparable - because combining them produces results better than either could achieve alone. And for the most complex, ambiguous, judgment-intensive tasks, humans should work largely without AI oversight, because that's where human reasoning genuinely outperforms algorithmic pattern-matching.


The implication is that the question "should we use AI for this?" is almost always the wrong question. The right question is "which parts of this task benefit from AI input, and which parts require undistorted human judgment?" Organizations that can make that distinction with precision are the ones extracting real value. Organizations that apply AI uniformly, across all task types, are leaving performance on the table - and in some cases actively degrading it.


The Brynjolfsson study surfaced a telling detail consistent with this: for the most experienced, highest-skilled agents, AI assistance produced small but measurable declines in conversation quality. The AI was getting in the way of expert judgment, not supporting it.


The Skill-Building Paradox


There is a subtler issue underneath the productivity data, and it is one that forward-looking HR leaders should take seriously now.


A 2025 study in Scientific Reports found something troubling: while collaboration with generative AI enhanced immediate task performance, those performance gains did not persist when workers later performed the same tasks independently. People got better results with the AI. They did not get better at the work.

This creates what might be called the skill-building paradox of AI augmentation. Used well, AI is a development accelerant - as the Brynjolfsson study showed, junior workers move down the experience curve faster. Used carelessly, AI can become a developmental shortcut that produces capable-looking output without building the underlying capability that output is supposed to reflect.


The difference between these two outcomes is not the tool. It is the design of the work around the tool. When AI handles the thinking that would have taught a junior employee something, the employee is faster today and no more capable tomorrow. When AI handles the lookup, the formatting, the pattern-matching - and frees the employee to engage with the harder judgment calls - genuine development happens.

This is a new design challenge for managers and L&D teams, and almost no organization has thought it through systematically.


What Great Human-AI Collaboration Actually Requires


Research across organizational behavior, management science, and information systems points to several conditions that determine whether human-AI teaming produces sustained value or just short-term efficiency gains.


Task clarity comes first. Teams that perform best with AI have clearly mapped which parts of their work benefit from algorithmic support and which require human discretion. This sounds straightforward; in practice it requires a level of task analysis that most job designs have never demanded.


Trust has to be calibrated, not assumed. Workers who defer to AI too readily - following every suggestion without independent judgment - actually underperform compared to those who treat AI recommendations as one input among many. Research consistently finds that the sweet spot is appropriate reliance: trusting AI where it demonstrably excels, overriding it where context and experience matter more. The Brynjolfsson study found that agents followed AI suggestions only 38% of the time - and that selective use was associated with better outcomes than either ignoring or always accepting the tool.


Psychological safety determines honest AI use. In organizations where employees fear that admitting AI limitations reflects badly on them, people quietly cover for tool failures rather than flagging them. This prevents the feedback loops that make AI systems better and prevents the organizational learning that makes human-AI collaboration improve over time.


Managers need new skills. Managing a team that works alongside AI requires a different capability set than traditional people management. Understanding what the AI can and can't do, designing work to maximize complementarity, coaching for human judgment rather than just output - these are genuinely new competencies, and most management development programs have not caught up.


The Democratization Opportunity Organizations Are Missing


Step back from the operational detail for a moment and the Brynjolfsson finding points to something strategically important.

If AI can package the tacit knowledge of your best people and make it available to everyone - faster, more consistently, and at scale - then the organizations that will win are not necessarily the ones with the most AI. They are the ones who are most deliberate about whose knowledge gets encoded, how it gets encoded, and what gets left to human judgment.


That is fundamentally a human resources question. It is about understanding what your top performers actually do differently, designing AI tools that reflect those practices, and building the organizational conditions in which everyone else can learn from them.

Most AI deployment conversations today are led by technology teams and focused on cost reduction. The research suggests the bigger opportunity belongs to leaders who are asking a different question: not what can we automate, but what can we learn from our best people - and how do we give that to everyone?


© 2025 by People in Focus. All rights reserved.
