Post 1 of 5 in “Enterprise AI from the Inside Out”
In 2024, enterprises built nearly half their AI solutions in-house. By 2025, 76% were buying off the shelf (Wharton GBK AI Adoption Report, 2025) [1]. That’s the fastest build-to-buy collapse in recent enterprise software history. But engineering leaders are still trying to build their way into AI — and they might be right to. The question is why.
Over the past year, I’ve been talking to engineering leaders who are standing up enterprise AI teams. I manage embedded platform engineering for 200K IoT devices, so I expected these conversations to orbit around model architectures and prompt engineering — the technical stuff. Instead, every single conversation landed on legal review timelines, data residency requirements, hallucination liability, and who owns the output when an LLM generates it. One engineering director told me their legal team spent four months approving a Slack bot that summarizes meeting notes. Four months. For an internal tool. That gap — between the technical capability we have and the organizational readiness we don’t — broke open a question: what if the entry point to enterprise AI isn’t a model? What if it’s a governed environment to run one in?
That’s the thesis behind this series: the fastest path to enterprise AI isn’t picking a model — it’s building the internal platform infrastructure to govern one.
The scaling gap nobody talks about
McKinsey’s March 2025 State of AI report [2] found that only 7% of enterprises have fully scaled AI enterprise-wide. Seven percent. And early 2026 data isn’t more encouraging — ModelOp’s March 2026 Governance Benchmark [11] found that 67% of enterprises now have 101–250 proposed AI use cases, but 94% have fewer than 25 in production. The ideas aren’t the problem.
Depending on how strictly you define “fail,” the picture gets worse. MIT’s 2025 research [3] found that 95% of enterprise AI pilots fail to deliver measurable financial returns. That’s their definition: no measurable financial returns. You can argue it’s too strict, and I think it probably is. Plenty of pilots teach organizations something valuable without showing up in a quarterly report. But even with generous accounting, the number is brutal.
The bottleneck isn’t model capability. MIT’s same research [3] found that 73% of enterprise data leaders cite data quality and completeness as the primary barrier to scaling AI. Not compute. Not talent. Data quality. And the realistic timelines reflect this: Deloitte’s 2025–2026 State of AI in Enterprise report [4] pegs a typical pilot at 10–14 weeks, with enterprise deployment taking 4–9 months on top of that.
This is an infrastructure problem. Data lineage, access controls, governance frameworks — the boring stuff. The stuff platform engineers have been solving in other domains for years.
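That "boring stuff" can literally be code. As a minimal sketch of what a governance guardrail looks like in practice (every name, dataset, and rule here is my own illustrative assumption, not a real framework or anything from the sources above), an access-control gate in front of a model call might be as small as:

```python
# Illustrative governance gate in front of an LLM call.
# ALLOWED_SOURCES and the PII pattern are hypothetical examples.
import re

ALLOWED_SOURCES = {"hr_policies", "it_runbooks"}    # datasets approved for AI use
PII_PATTERN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")  # crude SSN check, example only

def guarded_prompt(prompt: str, source: str) -> str:
    """Raise before the prompt ever reaches a model if policy is violated."""
    if source not in ALLOWED_SOURCES:
        raise PermissionError(f"source '{source}' is not approved for AI use")
    if PII_PATTERN.search(prompt):
        raise ValueError("prompt contains possible PII; blocked by policy")
    return prompt  # safe to forward to the model

guarded_prompt("Summarize the vacation policy", "hr_policies")  # passes the gate
```

The point isn't this particular check; it's that lineage, access control, and data hygiene are enforceable in code, which is exactly the kind of problem platform teams already know how to solve.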
Dev Note: I assumed the scaling problem was about compute. After a dozen conversations with people actually shipping enterprise AI, I realized it’s about data lineage and access controls — the same boring problems I’ve solved for IoT telemetry for years.
Why the biggest companies all started the same way
Goldman Sachs rolled out an internal AI assistant firm-wide in June 2025 after testing with smaller pilot groups. Their CIO emphasized amplifying internal workflows before anything client-facing (Fortune, June 2025) [5]. BCG gave 33,000 employees ChatGPT Enterprise access. What happened? Employees created 18,000 custom GPTs internally, without top-down direction (Medium consulting industry analysis, 2024–2025). Stripe built internal tools — a Radar Assistant for fraud rules, a Sigma Assistant for SQL generation — before shipping any customer-facing AI. Their Head of Data & AI put it simply: the functions requiring specialized skills are increasingly augmentable by AI (Latent Space podcast, 2024–2025).
Three very different companies. Same playbook.
There’s a compliance reason this works that every engineering leader I’ve talked to understands instinctively: internal AI tools face lighter regulatory scrutiny. Contained exposure. No third-party liability. Under the EU AI Act, whose first provisions took effect in February 2025, internal-only systems don’t carry the same reporting requirements as customer-facing ones. And every compliance team I’ve spoken with approves internal tools faster.

There’s a CFO reason too. Workday’s 2024 research [6] found that 50% of CFOs will cut AI funding if it can’t prove ROI within 12 months. Internal tools clear that bar because the math is simple: time saved multiplied by fully-loaded hourly rate equals hard savings. Don’t pitch “AI transformation.” Pitch “we eliminated 5 hours per week of policy lookup per employee.” One number lands in a budget review. The other gets a polite nod and a follow-up that never gets scheduled.
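That math is simple enough to sketch. Here is a back-of-envelope version using the 5-hours-per-week example above; the headcount, fully-loaded rate, and working weeks are illustrative assumptions of mine, not figures from any of the cited reports:

```python
# Back-of-envelope hard-savings calc for an internal AI tool.
# All inputs are illustrative assumptions.
hours_saved_per_week = 5    # policy-lookup time eliminated per employee
fully_loaded_rate = 75.0    # assumed fully-loaded hourly cost, USD
employees = 200             # assumed headcount using the tool
working_weeks = 48          # assumed working weeks per year

annual_savings = hours_saved_per_week * fully_loaded_rate * employees * working_weeks
print(f"${annual_savings:,.0f} per year")  # → $3,600,000 per year
```

One number, one line of arithmetic, and it survives a budget review in a way that "AI transformation" never will.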
Dev Note: I keep coming back to the BCG number — 18,000 custom GPTs built by employees when you just give them access. That’s not a top-down AI strategy. That’s what happens when you remove the friction and let domain experts find their own use cases.
The trap: when internal tools become pilot purgatory
Here’s where my own argument starts to fall apart.
Remember that opening stat? In 2024, 47% of enterprise AI solutions were built internally versus 53% purchased. By 2025, that flipped hard: 76% purchased versus 24% built (Wharton GBK AI Adoption Report, 2025) [1].

That’s not a gradual shift. That’s a market telling you something.
The contrarian data cuts deeper than that. Menlo Ventures’ 2025 State of Generative AI report [12] found that internal-facing and customer-facing AI use cases actually move through the pipeline at nearly identical rates, which undercuts the assumption that internal is the easier on-ramp. TechTarget’s analysis of MIT deployment data [13] found that enterprises building AI entirely in-house were twice as likely to fail as those using external platforms. The very thing I’m arguing for, building it yourself, has a measurably worse track record.
It gets more uncomfortable from there. ISACA’s 2025 research [7] found that 68% of enterprise employees using GenAI at work are accessing publicly available tools through personal accounts. Fifty-seven percent admitted to entering sensitive company data into those tools. The irony stings: “internal tools first” as a safety play creates its own safety problem if you move too slowly. Employees will route around you. They already are. Shadow AI is here; the only choice is whether you ignore that demand or channel it.
Then there’s the cost problem. MIT and Fortune’s 2025 reporting found that 85% of organizations underestimate AI costs by more than 10%, with 24% off by 50% or more (CIO Magazine, 2025) [3][8]. I’ll dig deeper into this in a later post, but the numbers suggest most teams don’t know what they’re signing up for when they decide to build.
I don’t have a clean resolution for this section. The data genuinely pulls in two directions.
Dev Note: I’ll be honest — this is the section that made me question my own thesis. If 76% of enterprise AI is now purchased, should I even bother building from scratch? Is the instinct to build just a builder’s bias? Maybe — but I think the learning is in the building, not the tool. The rest of this series is me finding out.
Platform, not tool — what the survivors did differently
The enterprises that escaped pilot purgatory share one thing: they built platforms, not tools.
Pfizer’s PACT platform with AWS spans 14 AI and ML projects, saved 16,000 hours annually, and cut infrastructure costs by 55% (AWS/Pfizer case study, 2025) [10]. They built a centralized platform during their internal phase and deployed projects on top of it. Guardian Life centralized data and AI governance tied to measurable outcomes before scaling enterprise-wide (RT Insights, 2025). Stanford Health Care’s ChatEHR showed 30–40% faster chart reviews built on a governed data layer (RT Insights, 2025).
The distinction matters. Building a tool teaches you prompting. Building a platform teaches you governance, data pipelines, cost management, and organizational adoption patterns. The second set of skills is what scales. The tool is the excuse. The platform is the point.
This isn’t a new pattern for platform engineers. The progression from “one device integration” to “a managed IoT platform” mirrors “one AI chatbot” to “an enterprise AI platform” almost exactly. The technology changes. The scaling pattern doesn’t. Enterprises that treat each IoT device as a one-off project never scale past a handful. The ones that built a platform scaled to thousands. Same logic applies here.
Some argue enterprises should skip internal tool infrastructure entirely. a16z’s “One Prompt, Zero Engineers” (2026) [9] makes the case for vibe-coding replacements on demand. It’s worth taking seriously — but it conflates the tool with the platform underneath it. You can vibe-code a chatbot. You can’t vibe-code data governance.
I should be honest about one gap: most documented platform successes are large enterprise — Pfizer, Guardian Life. Whether platform thinking scales down to mid-market companies with smaller teams and tighter budgets is a fair question I don’t have a satisfying answer for yet.
Dev Note: This is the insight that reframed everything for me. The question isn’t “should I build or buy my first AI tool?” The question is “am I learning platform patterns or just building a demo?” Every enterprise that escaped pilot purgatory answered that question honestly.
What comes next
Does the internal-tools-first strategy actually create the organizational muscle for customer-facing AI, or does it just create a comfortable optimization loop that never graduates? The data can’t tell you yet. The only way to know is to do it and be honest about what happens.
That’s what this series is about. I’m going to build a policy Q&A bot starting at $0 in AWS costs to learn the four enterprise patterns — compliance, data integrity, hallucination mitigation, adoption — that every conversation pointed me toward. Then I’ll upgrade it to production on Bedrock, break down what it costs, and build a second tool to see if the patterns hold.
Stay tuned…
References
- [1] Wharton GBK AI Adoption Report (2025) — Enterprise AI spending and build-vs-buy trends
- [2] McKinsey State of AI Report (March 2025) — Enterprise AI scaling and deployment statistics
- [3] MIT/Fortune Report (2025) — Enterprise AI pilot failure rates and cost estimation gaps
- [4] Deloitte State of AI in Enterprise (2025–2026) — Deployment timelines and readiness data
- [5] Fortune (June 2025) — Goldman Sachs internal AI assistant firm-wide rollout
- [6] Workday (2024) — CFO expectations for AI ROI timelines
- [7] ISACA (2025) — Shadow AI and enterprise employee GenAI usage patterns
- [8] CIO Magazine (2025) — Enterprise AI cost overruns
- [9] a16z “One Prompt, Zero Engineers” (2026) — Contrarian take on vibe-coding enterprise tools
- [10] AWS/Pfizer PACT Platform Case Study (2025) — Centralized AI platform results
- [11] ModelOp 2026 AI Governance Benchmark Report (March 2026) — AI use case explosion vs. production gap
- [12] Menlo Ventures State of Generative AI in the Enterprise (December 2025) — Internal vs. customer-facing AI pipeline conversion rates
- [13] TechTarget/MIT (2025) — In-house AI build failure rates vs. external platforms
