After the Standing Ovation

Align

I keep having the same conversation.

A VP of Engineering calls me. Their exec team just got offstage at the company’s annual conference. Standing ovation. They demo’d “autonomous AI agents” that would transform customer workflows. Stock bumped 3%. Press coverage. LinkedIn posts with fire emojis.

Then someone walked backstage and said four words:

“Ship it by Q3.”

And now the engineering team is staring at a prototype held together with curated data and scripted inputs, wondering how to turn a controlled demo into something that works when real users touch it.

The gap between what gets promised on stage and what it takes to actually deliver a reliable agent is the most dangerous disconnect in enterprise tech right now.

The Data Is Catching Up to the Hype

An S&P Global survey of over 1,000 IT leaders found that the percentage of companies abandoning the majority of their AI initiatives nearly tripled in one year.

Companies abandoning AI initiatives

2024

17%

2025

42%

Nearly 3x in one year

Credit: S&P Global

The Cliff of AI Initiative Abandonment

From 17% to 42% in a single year

Credit: S&P Global

Nearly half of all AI POCs are dying on the vine.
Not because the tech doesn’t work. Because the infrastructure doesn’t exist.

95% of “Agent” Products Are Rebadged Software

Gartner coined a term for what’s happening: agent washing. Of the thousands of vendors claiming agentic AI capabilities, roughly 95% are rebadged existing software.

Each square = 1% of "agent" vendors

~5% Genuinely agentic

~95% Rebadged software

Credit: Gartner

The Agent Washing Field

Genuinely agentic

Rebadged software

Credit: Gartner

When your CTO comes back from Dreamforce convinced agents are a solved problem because every vendor on the floor said so, the false confidence cascades into timelines, staffing, and customer promises your engineering team then has to live with.

Build

The Demo Worked. Production Won’t.

A tech executive goes onstage. The agent performs beautifully: summarizing customer data, generating reports, even making a recommendation. The audience applauds.

Then the executive walks backstage and tells their product and engineering teams to deliver this capability across every workflow. By next quarter.

The teams are thinking: That demo ran against 50 curated test cases. We have 50,000 edge cases in production. These are not the same thing.

On Stage

curated test cases / hand-picked data / scripted inputs

≠

In Production

50,000

edge cases / messy real-world data / unpredictable users

Enterprise Customers Give You One Shot

Getting a Fortune 500 company to adopt a new AI capability takes months of relationship building, security reviews, and proof-of-concept work.

When the agent finally goes live, the team gets one shot.

If the agent hallucinates on a customer’s first interaction — if it surfaces wrong data, makes a nonsensical recommendation, or loses context mid-conversation — that customer doesn’t file a bug report and wait for v2. They shut it off.

“We don’t get an alpha and a beta with these customers. We get one shot. And right now I can’t tell you with any confidence what will happen when we take the guardrails off.”

— VP of Engineering, enterprise AI company

Agent Reliability Has a Formula

And your denominator is killing you.

Eval Precision × Ontology Depth × Trace Coverage

Time Pressure

High numerator + managed pressure

Reliable agent that compounds daily. Every failure is a lesson.

Low numerator + high pressure

Brittle agent that embarrasses the company after being hyped on stage.

Investing in evals feels slower upfront.
It’s the only thing that makes you faster.

Get the full newsletter, free.

Join founders and builders who read Self Aligned every week.

After the Standing Ovation

The Data Is Catching Up to the Hype

Companies abandoning AI initiatives

The Cliff of AI Initiative Abandonment

95% of “Agent” Products Are Rebadged Software

Each square = 1% of "agent" vendors

The Agent Washing Field

The Demo Worked. Production Won’t.

Enterprise Customers Give You One Shot

Agent Reliability Has a Formula

Get the full newsletter, free.

Same Agent. Different Users. Wildly Different Results.

The Performance Canyon

The People Who Sang the Land Into Being

Get the full newsletter, free.