Field note · July 3, 2026

Governing AI as operational change, not software activity

Most enterprise AI reporting answers a question nobody funded: are people using the tool? The question leadership actually paid for is whether recurring work got better. Those are different questions, and the gap between them is where AI programs fail.

Microsoft Copilot, ChatGPT Enterprise, and comparable platforms expose seat-level telemetry by default: active users, session frequency, feature engagement, adoption curves. Most organizations can produce these reports within days of deployment, and many treat them as evidence of progress. They are evidence of software activity. Leadership funds AI to change recurring work (faster cycle times, more throughput, less rework, better decisions at the point of action), and none of those outcomes appear in an active-user dashboard.

The cost of that gap is now documented at scale. MIT's NANDA initiative found that 95 percent of enterprise generative-AI pilots deliver no measurable P&L impact, against an estimated $30–40 billion in enterprise investment. Gartner predicted in 2024 that at least 30 percent of generative-AI projects would be abandoned after proof of concept by the end of 2025, citing poor data quality, inadequate risk controls, escalating costs, and unclear business value. Whatever the exact numbers turn out to be in hindsight, the direction is consistent: activity everywhere, decision-grade evidence almost nowhere.

Why the usual reporting fails

Activity metrics break down for reasons any portfolio operator will recognize. There is no denominator: a count of active users says nothing about what share of the people doing the work it represents, and a user who opened the tool twice looks identical to one whose weekly workflow depends on it. Power users distort averages, making a narrow footprint look like broad deployment. Baselines rarely exist, so post-launch claims are retrospective and anecdotal. And usage data lives in one system while business outcomes live in Jira, ServiceNow, Salesforce, or a BI layer, with no deliberate bridge between them. The result is visible reporting without decision-grade signal.
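
To make the denominator and power-user points concrete, here is a minimal sketch in Python. It assumes a per-user session export is available; the function name usage_summary, the field names, the 20 percent "top group" cutoff, and every number in the sample data are hypothetical and chosen only for illustration, not taken from any platform's actual report format.

```python
from statistics import mean, median


def usage_summary(sessions_by_user, licensed_seats, top_share=0.2):
    """Summarize breadth and concentration from a per-user session export.

    sessions_by_user: hypothetical mapping of user id -> sessions in the period.
    licensed_seats: the denominator, i.e. everyone who could be using the tool.
    top_share: fraction of active users treated as the "power user" group.
    """
    if licensed_seats <= 0:
        raise ValueError("licensed_seats must be positive")
    # Keep only users with at least one session, heaviest users first.
    active = sorted((c for c in sessions_by_user.values() if c > 0), reverse=True)
    if not active:
        return {"active_rate": 0.0, "mean_sessions": 0.0,
                "median_sessions": 0.0, "top_group_session_share": 0.0}
    top_n = max(1, round(len(active) * top_share))
    return {
        "active_rate": len(active) / licensed_seats,
        "mean_sessions": mean(active),
        "median_sessions": median(active),
        "top_group_session_share": sum(active[:top_n]) / sum(active),
    }


# Illustrative data only: two heavy users and eight people who opened the tool twice.
sessions = {"u01": 120, "u02": 80, **{f"u{i:02d}": 2 for i in range(3, 11)}}
summary = usage_summary(sessions, licensed_seats=40)
for name, value in summary.items():
    print(f"{name}: {value:.2f}")
```

With this made-up data the mean is 21.6 sessions per active user while the median is 2, the top two users account for roughly 93 percent of all sessions, and only a quarter of licensed seats are active at all. An average alone would report the first number and hide the other three.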

A standard worth governing against

Meaningful adoption is a condition, not a rate: AI is in the critical path of recurring work, at sufficient scale and consistency, to produce measurable change in workflow outcomes or decision inputs — across more than a small concentration of power users. Meeting that standard takes three layers working together. Platform telemetry confirms breadth: who is active, how often, whether adoption is concentrated. Workflow performance metrics carry the value question: did cycle time improve, did throughput rise, did quality hold, did rework fall — answers that already exist in the systems organizations use to run operations. And explanatory validation supplies interpretation: pre/post comparison on the same workflow, matched cohorts, manager pulse checks, quality sampling. Perfect attribution is not the goal. Decision-grade evidence is.
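
The pre/post comparison named above can be sketched in a few lines of Python. This is one simple way to do it under stated assumptions, not a prescribed method: it assumes cycle times for the same workflow can be exported for a baseline window and a post-launch window, and it compares medians so that a handful of outliers does not drive the result. The function name pre_post_change and all sample values are hypothetical.

```python
from statistics import median


def pre_post_change(baseline, post):
    """Compare median cycle time for one workflow before and after a change.

    baseline, post: lists of cycle times (same unit, same workflow definition).
    Returns both medians and the relative change; negative means faster.
    """
    if not baseline or not post:
        raise ValueError("both windows need at least one observation")
    b = median(baseline)
    p = median(post)
    if b == 0:
        raise ValueError("baseline median is zero; relative change is undefined")
    return {"baseline_median": b, "post_median": p, "relative_change": (p - b) / b}


# Illustrative cycle times in days; not real measurements.
baseline_days = [10, 12, 11, 14, 13]
post_days = [9, 8, 10, 9, 11]
result = pre_post_change(baseline_days, post_days)
print(result)
```

On these invented numbers the median drops from 12 days to 9, a 25 percent reduction. A number like that is a starting point for the explanatory layer rather than proof of attribution: it still needs the matched cohort, the manager pulse check, or the quality sample to rule out other causes.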

What this looked like in practice

At Doosan GridTech, I used AI ahead of each governance cadence to decompose an integrated plan spanning parallel deployment streams, validate portfolio data and dependencies, and flag only the inconsistencies and critical-path risks that needed human attention. The value claim was specific and checkable: cadence time shifted from reconstructing what had happened to deciding what to do next. At T-Mobile, AI scanned inconsistent tracking fields across roughly five hundred initiatives and proposed reclassifications that product managers confirmed, declined, or amended one by one. In both cases the AI work was attached to a defined recurring workflow, the output had a named human owner, and the outcome could be challenged with evidence. That is the whole model.

If your AI program reports monthly and you cannot name the workflow that improved, the baseline it improved against, and the person accountable for that claim, you are reporting software activity. More telemetry won't fix that. Treat AI like every other investment the portfolio governs: funded against an outcome, measured against a baseline, and owned by someone who has to defend the number.

A shorter version of this note circulates on LinkedIn. This is the complete version.