October 21, 2016. Half the internet went dark. Twitter, Netflix, GitHub, PayPal, The New York Times — dozens of the world's largest sites became unreachable for six to eight hours. Their servers were running. Their code was fine. But a company called Dyn, one of the major DNS providers, was drowning under a 1.2-terabit-per-second DDoS attack from the Mirai botnet.
The sites did not fail. The invisible infrastructure they depended on failed. And that made the sites useless. Most users had never heard of DNS. Most users had never heard of Dyn. The protocol was so good at its job that billions of people used it every day without knowing it existed — until it broke.
The agent economy is building its own DNS right now. Beneath the visible layer of AI applications — the chatbots, the copilots, the impressive demos — is an invisible foundation I call the dependency layer. It has four components. The companies that build it will be the most valuable in the agent economy. And almost nobody is paying attention.
The Model Is Not the Product
Every technology era follows the same law: the infrastructure layer becomes more valuable than the application layer built on top of it. Railroads were worth more than the cargo companies. The electrical grid outlasted the early appliance makers. Cisco — which builds the routers nobody sees — hit $555 billion in market cap at its dot-com peak. Verisign — managing just two domain extensions — is worth $20 billion. Cloudflare delivers an estimated 15-30% of all web traffic. Most consumers have never heard of any of them.
Cloud infrastructure versus SaaS tells the recent story. Three companies — AWS, Azure, GCP — control roughly 65-70% of a $370 billion infrastructure market. Their parent companies are worth $2-3 trillion each. The largest pure SaaS company, Salesforce, is worth $260 billion. The infrastructure layer consolidated. The application layer fragmented. Per company, the infrastructure players are worth vastly more than the application-layer leaders.
The agent economy is following the same pattern. In 2024, AI venture capital hit $97-100 billion — and the overwhelming majority went to applications. Cognition AI raised $175 million. Sierra AI raised $110 million. Meanwhile, the companies building agent infrastructure — LangChain ($25M), CrewAI ($18M), Langfuse ($4M) — are an order of magnitude less funded. The visibility bias is operating at industrial scale. Everyone funds what they can see. Almost nobody funds what intelligence depends on.
The model is not the product. The model is the most visible part of a system that cannot function without the invisible layers beneath it. Here are those layers.
Layer 1 — Identity and Attestation
When Agent A calls Agent B, how does B know A is authorized? When a fleet of agents from one company interacts with a fleet from another, what protocol governs authentication? The answer today is: API keys and prayer. OAuth was built for humans clicking buttons in browsers. It was not built for agent fleets authenticating at machine speed across organizational boundaries.
Google announced the Agent2Agent (A2A) protocol in early 2025. Anthropic released the Model Context Protocol (MCP) in late 2024. But A2A focuses on communication, not identity. MCP focuses on tool access, not agent-to-agent authentication. A true "OAuth for agents" does not exist. Every enterprise deploying fifty agents across procurement, legal, compliance, and customer support is running hundreds of agent-to-agent interactions per day — each one a handshake in the dark.
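To make the gap concrete, here is a minimal sketch of what an "OAuth for agents" primitive might look like: short-lived, scoped, signed credentials that one agent can verify before honoring another's request. Everything here is hypothetical — the registry secret, the agent names, and the claim format are illustrative stand-ins, not any real protocol (a production design would use asymmetric keys and a trust registry rather than a shared secret).

```python
import base64
import hashlib
import hmac
import json
import time

# Hypothetical: a secret issued by an agent identity registry.
SECRET = b"shared-registry-secret"

def issue_credential(agent_id: str, scopes: list[str], ttl: int = 300) -> str:
    """Mint a short-lived, HMAC-signed credential for an agent."""
    claims = {"sub": agent_id, "scopes": scopes, "exp": time.time() + ttl}
    payload = base64.urlsafe_b64encode(json.dumps(claims).encode()).decode()
    sig = hmac.new(SECRET, payload.encode(), hashlib.sha256).hexdigest()
    return payload + "." + sig

def verify_credential(token: str, required_scope: str) -> bool:
    """Agent B checks Agent A's credential before honoring a request."""
    payload, sig = token.rsplit(".", 1)
    expected = hmac.new(SECRET, payload.encode(), hashlib.sha256).hexdigest()
    if not hmac.compare_digest(sig, expected):
        return False  # forged or tampered credential
    claims = json.loads(base64.urlsafe_b64decode(payload))
    return claims["exp"] > time.time() and required_scope in claims["scopes"]

token = issue_credential("procurement-agent-7", ["invoices:read"])
print(verify_credential(token, "invoices:read"))   # True
print(verify_credential(token, "invoices:write"))  # False: scope not granted
```

The point of the sketch is the shape of the missing primitive: machine-speed issuance, machine-speed verification, explicit scopes, and short expiry — none of which API keys provide.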
Attestation is even harder. When an AI agent reports that a building passed a safety inspection, where is the proof? When an agent claims environmental sensors showed compliant readings, where is the cryptographic link between the sensor data and the agent's report? This is the gap between the digital world where agents operate and the physical world where consequences land. Almost no company is working on it. I spent years looking for attestation infrastructure and found nothing — not a product, not a protocol, not a credible attempt.
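What would closing that gap look like? One plausible primitive is a hash chain over the raw sensor readings, with the resulting digest embedded in the agent's report, so an auditor holding the raw data can prove the report refers to exactly that data. This is a sketch under assumptions, not an existing product; the reading format is invented for illustration.

```python
import hashlib
import json

def sensor_digest(readings: list[dict]) -> str:
    """Hash-chain raw sensor readings so none can be altered or dropped later."""
    digest = hashlib.sha256(b"genesis").hexdigest()
    for r in readings:
        record = json.dumps(r, sort_keys=True).encode()
        digest = hashlib.sha256(digest.encode() + record).hexdigest()
    return digest

readings = [
    {"sensor": "co2-ppm", "value": 412, "ts": 1730000000},
    {"sensor": "co2-ppm", "value": 415, "ts": 1730000060},
]

# The agent's report carries a cryptographic link back to the evidence.
report = {"finding": "compliant", "evidence_digest": sensor_digest(readings)}

# An auditor with the raw readings recomputes the digest and confirms
# the report refers to exactly this data, unmodified.
assert report["evidence_digest"] == sensor_digest(readings)
tampered = [dict(readings[0], value=900)] + readings[1:]
assert report["evidence_digest"] != sensor_digest(tampered)
```

A real attestation stack would also need signed sensors and a timestamping authority, but even this toy version shows the property missing today: a report that cannot be separated from its evidence.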
Layer 2 — Orchestration and Handoff
A single agent performing a single task works in isolation. The moment you need agents to work together — one researches, another writes, a third fact-checks, a fourth publishes — you need orchestration. Something has to manage the sequence, handle handoffs, and decide what happens when one agent fails mid-workflow.
The landscape is fragmented across 15+ frameworks. LangChain has 95,000+ GitHub stars. CrewAI manages role-based multi-agent teams. Microsoft backs AutoGen and Semantic Kernel. OpenAI released an Agents SDK. Google has Vertex AI Agent Builder. Amazon has Bedrock Agents. No single player commands more than 20-25% of developer mindshare. This is the classic pre-consolidation phase — the same shape as cloud computing between 2006 and 2010, before AWS emerged as the clear leader.
Without orchestration, complex workflows — the kind that create real business value — are impossible. You get impressive demos of individual agents but cannot wire them into the coordinated systems that replace actual business processes. A company automating customer onboarding needs five agents executing in sequence, passing data, handling exceptions. Without orchestration, you have five disconnected tools.
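The core of what an orchestrator must do is small enough to sketch: run agents in sequence, hand state between them, and stop with a usable audit trail when one fails. The agents and the onboarding flow below are stand-ins, not any particular framework's API.

```python
from dataclasses import dataclass, field
from typing import Callable

@dataclass
class Handoff:
    data: dict
    history: list = field(default_factory=list)  # audit trail of completed steps

def run_pipeline(steps: list[tuple[str, Callable[[dict], dict]]],
                 payload: dict) -> Handoff:
    """Execute agents in sequence; stop and surface context when one fails."""
    h = Handoff(data=payload)
    for name, agent in steps:
        try:
            h.data = agent(h.data)      # handoff: each agent enriches shared state
            h.history.append((name, "ok"))
        except Exception as e:
            h.history.append((name, f"failed: {e}"))
            break  # a real orchestrator might retry, reroute, or escalate here
    return h

# Stand-in agents for a customer-onboarding flow.
steps = [
    ("collect",   lambda d: {**d, "kyc": "passed"}),
    ("verify",    lambda d: {**d, "verified": True}),
    ("provision", lambda d: {**d, "account": "acct-001"}),
]
result = run_pipeline(steps, {"customer": "acme"})
print(result.history)  # [('collect', 'ok'), ('verify', 'ok'), ('provision', 'ok')]
```

Every framework in the list above is, at bottom, a more sophisticated version of this loop plus opinions about retries, parallelism, and state.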
Layer 3 — Memory and State
An agent that forgets everything between sessions is a parlour trick. Production agents need persistent memory — what happened in the last customer interaction, what decisions were made three weeks ago, what context matters for this specific workflow. The current state is ad-hoc: vector databases bolted onto agent frameworks, custom retrieval pipelines, fragile context windows that collapse under load.
The trust problem lives here too. When one of my agent businesses runs a security audit, I face a question no tool can answer: how do I know the agent actually checked everything it claims to have checked? The agent produces a text output and a confidence score. If it says it scanned 10,000 lines of code and found no critical vulnerabilities, I have two choices: believe it or redo the work myself. Neither scales. State verification — not just state storage — is the missing piece. Companies like Patronus AI and Galileo AI build evaluation tools, but they verify what agents say. Nobody is verifying what agents do.
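One way to picture state verification, as opposed to state storage: require the agent to emit a work log of content hashes for everything it claims to have inspected, then check that log against the actual artifacts. This is an illustrative sketch of the idea, not a description of any shipping tool; the file contents and names are invented.

```python
import hashlib

def fingerprint(content: str) -> str:
    """Content hash used to prove an agent actually read a specific artifact."""
    return hashlib.sha256(content.encode()).hexdigest()

def verify_coverage(claimed_log: dict[str, str],
                    codebase: dict[str, str]) -> list[str]:
    """Return files the agent claims coverage of but provably did not read:
    missing from its log, or logged with the wrong content hash."""
    gaps = []
    for path, content in codebase.items():
        if claimed_log.get(path) != fingerprint(content):
            gaps.append(path)
    return gaps

codebase = {"auth.py": "def login(): ...", "db.py": "def query(): ..."}
# The agent's work log: it actually read auth.py but never touched db.py.
log = {"auth.py": fingerprint("def login(): ...")}
print(verify_coverage(log, codebase))  # ['db.py']
```

The shift is from trusting the agent's summary to checking a machine-verifiable claim about the work itself — which is what makes "believe it or redo it" stop being the only two options.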
Layer 4 — Observability and Audit
Traditional application monitoring — Datadog, New Relic — tracks request-level metrics: latency, token usage, error rates. But agents do not merely process requests. They make choices. When an agent decides to skip a step in a security check, or hallucinates a data point in a financial report, or chooses the wrong tool for a task, traditional APM does not detect it. The metrics are green. The latency is fine. The error rate is zero. And the output is wrong.
Datadog launched "LLM Observability" in 2024. New Relic launched "AI Monitoring." LangSmith captures execution traces. These monitor agent performance, not the quality of agent decisions. A fast, reliable agent that makes bad decisions is worse than a slow one that makes good ones. The result is silent failure — hallucinations propagate undetected, and an agent that is confidently wrong looks identical to one that is confidently right.
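The difference between performance monitoring and decision monitoring can be shown in a few lines: log what the agent could have done, what it chose, and why, so a reviewer or policy engine can flag bad choices after the fact. The record schema and agent names below are hypothetical, a sketch of the missing audit primitive rather than any vendor's product.

```python
import time
from dataclasses import asdict, dataclass

@dataclass
class DecisionRecord:
    agent: str
    step: str
    options: list    # what the agent could have done
    chosen: str      # what it actually did
    rationale: str   # the agent's stated reason
    ts: float

audit_log: list[dict] = []

def record_decision(agent: str, step: str, options: list,
                    chosen: str, rationale: str) -> None:
    """Append a record of a choice, not just a timing metric."""
    audit_log.append(asdict(
        DecisionRecord(agent, step, options, chosen, rationale, time.time())))

record_decision("sec-audit-agent", "dependency-scan",
                options=["scan-all", "scan-changed-only"],
                chosen="scan-changed-only",
                rationale="diff under 50 files")

# A policy engine can now catch skipped steps that APM would report as green:
flagged = [d for d in audit_log
           if d["step"] == "dependency-scan" and d["chosen"] != "scan-all"]
print(len(flagged))  # 1
```

Latency dashboards would show this agent as fast and healthy; only the decision log reveals that it narrowed the scan.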
Why Every Failed POC Broke at One of These Layers
77% of enterprise AI agent projects never reach production. The default explanation — the models aren't ready — is wrong. The models are extraordinary. They reason, plan, write code, and analyze at levels that would have been science fiction in 2022. I've watched this pattern across 150+ client engagements over a decade: the models keep getting better, the production rate barely moves.
Every failure maps to one of these four layers. A financial services firm built a code agent better than their junior engineers — it sat in staging for six months because nobody could verify who built it or audit its 400 decisions. A procurement agent secured 12-18% better terms — killed by the vendor's legal team because it had no verifiable identity. A logistics agent reduced costs 23% in simulations — collapsed when an insurance claim required physical proof of a warehouse inspection.
These are not intelligence problems. GPT-5 will not fix them. Claude 5 will not fix them. You can make the agent 10x smarter and it still cannot get through a wall that is not an intelligence wall. These are infrastructure problems — the same kind that held back e-commerce before SSL and mobile payments before payment rails.
What to Build First If You're Starting Now
The enterprises that get agents to production flip the sequence. Instead of starting with the model and bolting on infrastructure later, they build the dependency layer first.
Identity and governance first. Define boundaries, liability, kill switches, and verifiable agent credentials before the agent writes its first line of code. Attestation second. Build cryptographic proof mechanisms for decisions and physical-world events. Then deploy intelligence into an environment where the trust architecture already exists.
The agent dependency layer will exceed $100 billion by 2030. The window to build it is open now — comparable to cloud computing not in 2015, when winners were evident, but in 2006, when most enterprises thought "the cloud" was a marketing term. By 2029, the dominant players will have locked in their positions. The infrastructure window does not stay open forever.
We are in the equivalent of 1995. Everyone sees the applications. Almost nobody sees the protocols. The people building the agent economy's plumbing in the next three years will be the Ciscos, the Verisigns, and the Cloudflares of the next era.
The invisible workforce needs an invisible foundation. That foundation is being built right now — by the people who can see it. If you're building across this layer, I want to hear what you've hit. Reach out on LinkedIn or X. The deeper dive is in the book.
Frequently Asked Questions
What is the dependency layer in AI?
The dependency layer is the infrastructure AI agents depend on but cannot generate themselves. It includes four components: identity and attestation, orchestration and handoff, memory and state, and observability and audit. Without this layer, agents fail in production regardless of model capability.
What infrastructure do AI agents need to run in production?
AI agents need four infrastructure layers: (1) identity and attestation for verifiable credentials and proof of actions, (2) orchestration and handoff for multi-agent coordination, (3) memory and state for persistent context, and (4) observability and audit for monitoring decision quality.
Why do most AI agent POCs fail to reach production?
Most POCs fail because they start with model capability and try to bolt on infrastructure later. The missing pieces — trust verification, agent identity, decision monitoring, physical-world attestation — cannot be retrofitted. The 77% failure rate reflects infrastructure gaps, not intelligence gaps.
How is the dependency layer different from the application layer?
The application layer is the visible part — chatbots, copilots, AI features. The dependency layer is the invisible infrastructure underneath. Historically, infrastructure layers (TCP/IP, DNS, cloud) become more valuable than the applications built on top. The same pattern is playing out in the agent economy.
What are the four layers every AI agent depends on?
The four layers are: (1) Identity and Attestation — how agents prove who they are and what they did, (2) Orchestration and Handoff — how agents coordinate multi-step workflows, (3) Memory and State — how agents maintain context across sessions, (4) Observability and Audit — how you monitor decision quality, not just performance metrics.
Related reading
From the same content cluster.
Cluster pillar
The Dependency Layer Thesis
When intelligence is free, value lives in what intelligence cannot generate — trust, identity, physical attestation, governance.
Related post
When Intelligence Is Free, What's Left?
The core economic argument: as models commoditize, value migrates to the infrastructure agents cannot build themselves.
Related post
Why 77% Can't Get AI Agents to Production
The operational evidence behind the dependency-layer thesis — four walls between every brilliant demo and production.
Glossary
Glossary: Dependency Layer
Canonical definition — the four layers of infrastructure every AI agent depends on.
From the book
The AI Agent Economy — Book 1
The full thesis, developed across ten chapters and fifteen falsifiable predictions.