Why Tokenmaxxing Is Both Right and Wrong for Enterprise AI Adoption
By Staff Writer | Published: May 11, 2026 | Category: Leadership
Meta's viral AI token leaderboard has exposed a deeper truth about enterprise AI strategy: the real question is not whether to tokenmaxx, but what metrics your organization needs at each stage of AI adoption.
When AI token leaderboards make sense (and when they don’t)
When Meta Platforms’ internal dashboard ranking employees by AI token consumption leaked into public view in April 2026, it triggered a response out of all proportion to its origins as a side project by a single employee. Executives, venture capitalists, and startup founders rushed to stake out positions on “tokenmaxxing,” the practice of maximizing AI token consumption as a measure of AI adoption. The debate that followed, captured in Isabelle Bousquette’s Wall Street Journal report, was vigorous and revealing. It was also, in important ways, misdirected.
Both camps have real arguments. The tokenmaxxing advocates—companies like Writer and Sendbird and investors like Sequoia Capital’s Sonya Huang—are right that aggressive AI usage can drive the behavioral and cultural change organizations need to remain competitive. The critics, including HubSpot’s Yamini Rangan, Jellyfish’s Andrew Lau, and Blitzy’s Brian Elliott, are right that token counts untethered from business outcomes become a measurement failure. The mistake is treating these as mutually exclusive positions when they are, in fact, arguments about different problems at different stages of organizational AI adoption.
The real question is not whether to tokenmaxx. It is what metrics your organization needs right now, and what it will need next.
The cultural problem token leaderboards try to solve
Before dismissing token leaderboards as a misguided vanity-metric exercise, leaders need to understand what problem these tools are designed to solve. Writer’s CEO May Habib is explicit about this: the goal is not precise return on investment measurement. It is a mindset shift. “The second you start thinking about the individual business ROI of one agentic action, you will never do anything agentically,” she told the Journal. The broader goal, she emphasizes, is getting employees to think and operate differently.
This is not a naive position. It addresses a genuine organizational challenge that every company undertaking large-scale technology adoption faces: the gap between tool availability and tool proficiency. Research by McKinsey & Company consistently identifies human adoption, not technology capability, as the primary barrier to capturing AI value. In the firm’s 2024 State of AI report, McKinsey found that while 65 percent of organizations reported regularly using generative AI in at least one business function (double the previous year’s figure), fewer than a third had systematic processes for capturing the financial value of those deployments. The technology was available. The behavioral change was not.
Token leaderboards address adoption inertia directly. They create social proof, generate competitive energy, and normalize the behavior of reaching for AI tools when tackling problems. Sendbird’s Abhi Jothilingam told the Journal that the leaderboard pushed him to test more ideas and build faster—behavioral changes that, while not captured in a raw token count, represent exactly the kind of experimentation that builds durable AI capability over time. The leaderboard was not measuring his productivity. It was changing it.
The Goodhart’s Law problem
Here is where tokenmaxxing advocates need to exercise more intellectual honesty. The moment token consumption becomes a formal target—particularly when attached to public rankings and rewards—it activates one of management science’s most durable principles: Goodhart’s Law.
First articulated by British economist Charles Goodhart in 1975 and later reformulated by anthropologist Marilyn Strathern into the version most managers recognize—“When a measure becomes a target, it ceases to be a good measure”—this principle has been documented across every domain where proxy metrics replace substantive evaluation. Sales teams chase call volume at the expense of relationship quality. Students optimize for test scores at the expense of genuine learning. And now, predictably, employees run AI agents on personal projects to climb a leaderboard.
Habib acknowledges this candidly. She knows the metric is gameable. She knows employees will burn tokens on personal projects. Her position is that the cultural upside outweighs the measurement noise, a defensible stance for an AI startup where urgency is real and tolerance for ambiguity is culturally embedded. It is a substantially riskier position for a large enterprise where the signal-to-noise ratio in tokenmaxxing behavior will be lower, and where the cultural side effects—anxiety, performative AI use, and neglect of tasks that do not generate token activity—could be meaningful.
Research on gamification in enterprise settings supports this concern. Kevin Werbach and Dan Hunter’s foundational work on organizational gamification found that competitive leaderboard mechanics reliably drive short-term behavioral change but frequently create adverse long-term effects: demotivation among lower-ranked participants, metric gaming by higher-ranked ones, and erosion of intrinsic motivation in favor of extrinsic reward. These are not hypothetical risks in AI adoption contexts. They are predictable outcomes that need to be actively managed.
What outcome metrics actually look like
The critics of tokenmaxxing, including Rangan, Lau, and Elliott, are correct that outcomes should be the ultimate measure of AI adoption success. Their position becomes less satisfying, however, when pressed on specifics: which outcomes, measured how, and on what timeline? This is harder than it sounds.
AI productivity measurement presents genuine methodological challenges.