The Agentic Thinking Divide
Anthropic just put a name to the thing that separates companies getting real returns from AI and companies stuck at the demo stage. It's not the model. It's what you teach it.
Anthropic published a guide last week called "Building AI Agents for the Enterprise." I read it the way I read most things that confirm what I've been saying in client rooms for a year: relieved that someone with more reach finally drew the picture.
The picture is a fork in the road they call the agentic thinking divide.
Here's the setup. As of September 2025, 40% of U.S. employees report using AI at work — double the 20% from 2023. So the adoption question is settled. The open question is whether all that usage produces a lasting advantage or fizzles into incremental gains that plateau by the end of the quarter.
Their answer, and mine: it depends entirely on scope.
Point solutions get point-solution results
"Organizations that treat AI as a point solution get point-solution results. A chatbot here, a summarizer there, a pilot that impresses as a demo but never scales beyond the team that built it."
That's the report's line, and it's the most honest sentence in it. I've watched this exact movie. A team buys seats, runs a slick pilot, the demo gets applause in the all-hands — and six months later nothing about how the company operates is different. The tool sits to the side of the work instead of inside it.
I wrote about this last year as the difference between AI addition and AI integration. Anthropic frames the same gap as chatbot vs. agent. A chatbot answers a question. An agent plans, makes decisions, uses tools across multiple steps, and applies your domain expertise to actually finish the task. One responds. The other operates.
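For readers who want to see that distinction rather than take my word for it, here's a minimal sketch. Everything in it is hypothetical and mine, not the report's: the two tools, the order ID, the credit rule, and `decide_next_step`, which stands in for the model's decision with plain Python so the example runs on its own. The chatbot returns text once. The agent loops: decide, call a tool, look at the result, decide again, until the task is finished.

```python
from typing import Callable


# Hypothetical tools. In a real system these would call internal APIs.
def lookup_order(order_id: str) -> dict:
    return {"order_id": order_id, "status": "delayed", "days_late": 4}


def issue_credit(order_id: str, amount: float) -> dict:
    return {"order_id": order_id, "credit": amount}


TOOLS: dict[str, Callable[..., dict]] = {
    "lookup_order": lookup_order,
    "issue_credit": issue_credit,
}


def chatbot(question: str) -> str:
    # One turn: text in, text out. Nothing is looked up and nothing changes.
    return "Sorry your order is late. Please contact support about a credit."


def decide_next_step(history: list[dict]) -> dict:
    # Stand-in for the model. A real agent would ask an LLM to choose here.
    if not history:
        return {"tool": "lookup_order", "args": {"order_id": "A100"}}
    last = history[-1]["result"]
    if last.get("status") == "delayed":
        # Assumed domain rule: 5.00 of credit per day late.
        amount = 5.0 * last["days_late"]
        return {"tool": "issue_credit",
                "args": {"order_id": last["order_id"], "amount": amount}}
    return {"tool": None,
            "answer": f"Issued credit of {last['credit']:.2f} on order {last['order_id']}."}


def agent(max_steps: int = 5) -> str:
    history: list[dict] = []
    for _ in range(max_steps):
        step = decide_next_step(history)
        if step["tool"] is None:          # the task is finished
            return step["answer"]
        result = TOOLS[step["tool"]](**step["args"])
        history.append({"step": step, "result": result})
    return "Stopped: step limit reached."


print(chatbot("Where is my order?"))
print(agent())
```

The `max_steps` cap is a deliberate choice in the sketch: a loop that can act also needs a point where it stops.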
The companies pulling ahead aren't the ones with the best chatbot. They're the ones who rethought three things at once: how their people work, how their processes run, and what their products can do.
The cases are impressive. The mechanism is the point.
The report leans on three customer stories. Worth repeating, because the numbers are real and they came from companies you've heard of:
- L'Oréal built an internal platform that routes plain-English questions through 15+ specialized agents and returns sourced answers with visualizations. 44,000 monthly users, 2.5M messages a month, and accuracy on conversational analytics that went from 90% with their previous GenAI approach to 99.9% with Claude.
- Lyft put Claude behind customer support. Resolution time dropped 87% — what used to be a 30-to-40-minute wait now resolves in seconds — and decision accuracy rose over 30%. They reinvested the savings into their support agents.
- Rakuten offloaded their agent infrastructure to a managed harness and went from shipping major releases once a quarter to every two weeks, with critical errors down 97%.
It would be easy to read those as "Claude is good." That's not the lesson. Two companies running the identical model can get wildly different results, and the variable that decides it is how much of their own context they encoded into the system around it.
A general-purpose model gives you general-purpose output — the kind your team has to edit, deepen, and fact-check before it's usable. The gains show up when the system knows your standards, your terminology, your tools, your institutional knowledge. That's the difference between AI that drafts a document and AI that drafts a document your team can actually ship.
This is the part I spend most of my time on. The model is a commodity you can rent by the token. The advantage is the encoded knowledge sitting on top of it — your sales methodology, your compliance framework, your brand voice, your chart of accounts. That's not something a competitor can buy.
Why early movers compound
The best systems in the report do something subtle: they feed human expertise back into the AI's knowledge base. Every time a subject-matter expert reviews and corrects an output, that correction becomes the new baseline for everyone.
In a normal process, every project starts from scratch and demands the same review effort. In a compounding system, every expert review makes every future output faster and more accurate. Tribal knowledge stops living in your most experienced person's head and becomes infrastructure that every new hire inherits on day one.
That's why this is a head-start game. Every month of accumulated approvals, feedback, and refinement widens the gap. The company that starts narrow today isn't just ahead by one project — it's on a different curve.
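Here's a rough sketch of that feedback loop, under loud assumptions: `SharedKnowledge` is a name I made up, the corrections are simple phrase swaps, and a real system would store much richer feedback than a dictionary. What it shows is the shape of the mechanism: an expert corrects something once, and every later draft from anyone on the team gets that correction for free.

```python
from dataclasses import dataclass, field


@dataclass
class SharedKnowledge:
    """Hypothetical store of expert corrections, shared by every user."""
    corrections: dict[str, str] = field(default_factory=dict)

    def record_review(self, flagged: str, approved: str) -> None:
        # An expert replaces a flagged phrase with the approved one, once.
        self.corrections[flagged] = approved

    def apply(self, draft: str) -> tuple[str, int]:
        # Apply every stored correction and count how many were needed,
        # i.e. fixes the expert no longer has to make by hand.
        fixed = 0
        for flagged, approved in self.corrections.items():
            if flagged in draft:
                draft = draft.replace(flagged, approved)
                fixed += 1
        return draft, fixed


kb = SharedKnowledge()

# Before any reviews, the system has nothing to contribute.
first_draft = "Payment is due net 30. The client may cancel anytime."
_, fixed_before = kb.apply(first_draft)

# One expert review of that draft becomes the baseline for everyone.
kb.record_review("net 30", "net 45")
kb.record_review("client", "customer")

# A different draft, written later by someone else, inherits both fixes.
next_draft = "Invoice the client on net 30 terms."
improved, fixed_after = kb.apply(next_draft)
print(fixed_before, fixed_after, improved)
```

The count returned by `apply` is the part worth watching over time: it is review work that used to be repeated on every project.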
If you're not L'Oréal
Here's where I'd push past the report. Most of the businesses I work with don't have an engineering org that can stand up a 15-agent orchestration layer. They read case studies like these and conclude the whole thing is for companies with a different budget.
It isn't. The mechanism scales down cleanly, and the rest of the report is essentially a recipe for doing it without a custom build — through tools like Claude Cowork and shared plugins that package a team's expertise once and distribute it to everyone. The four principles they end on are the same ones I'd give a 40-person company:
- Start with specificity, not scale. Give the system your real context from the first interaction. If the output is generic on day one, people never come back to the tool.
- Pick pilots with a measurable finish line. "Improve productivity" produces results that are easy to dismiss. "Cut contract review from five days to one" does not. Define the number before you start.
- Build for reuse from the beginning. Encode the knowledge once; let every team that needs it install it. The marginal cost of sharing is close to zero and the marginal value is enormous.
- Don't skip governance. Admin controls, audit trails, and approved-tool catalogs are prerequisites for rollout, not features you bolt on after things sprawl.
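To make the second principle concrete, here's a small sketch of what "define the number before you start" can look like when you write it down. `PilotGoal` and its fields are my own illustration, not anything from the report; the figures are the contract-review example from the list above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PilotGoal:
    """Hypothetical record of a pilot's finish line, fixed before launch."""
    name: str
    baseline: float   # where the process stands today
    target: float     # the number that counts as success
    unit: str

    def reduction_needed(self) -> float:
        # Fraction of the baseline that has to disappear to hit the target.
        return (self.baseline - self.target) / self.baseline

    def met(self, measured: float) -> bool:
        # Lower is better for turnaround-time goals like this one.
        return measured <= self.target


goal = PilotGoal("contract review", baseline=5.0, target=1.0, unit="days")
print(f"{goal.name}: needs a {goal.reduction_needed():.0%} reduction")
print("met at 2.5 days?", goal.met(2.5))
print("met at 1.0 days?", goal.met(1.0))
```

Five days to one is an 80% cut. Writing that figure down first is what makes a result of 2.5 days read as progress but not yet success, instead of something to argue about.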
Their deployment arc runs about six months: a few weeks defining success criteria, two months on a real production pilot with two or three teams, then months four through six scaling to the rest of the org — where each new wave moves faster because it inherits the context the last wave built.
The actual takeaway
The closing line of the report is the one I'd underline:
"Your organization doesn't need a perfect plan. It only needs a specific starting point, quantifiable success criteria, and the willingness to learn from what happens next."
The most common mistake is waiting until the strategy is comprehensive before taking the first step. The companies winning with this started narrow, learned fast, and expanded with conviction. They picked one process where the pain was obvious, gave the system enough context to do real work, and measured honestly.
That's the whole job. The divide isn't between companies with AI and companies without it — almost everyone has it now. It's between companies that bolted it on and companies that built it in.
The report is Anthropic's "Building AI Agents for the Enterprise." The framing, the cases, and the metrics are theirs. If you want to figure out where your specific starting point is — the one process with an obvious finish line — let's talk.