There's a conversation happening in engineering circles right now that sounds something like this:

"If AI agents can provision infrastructure by calling cloud APIs directly, why do we need Terraform anymore? Why bother with infrastructure as code at all?"

It's a fair question. And honestly, I get why people are asking it.

We're watching generative AI reshape how we write code, debug systems, and interact with complex platforms. Agentic workflows, where AI agents autonomously execute tasks, are moving from research papers into production environments. So yeah, it's reasonable to wonder if IaC is about to become legacy tech.

But here's what I think is actually happening: AI isn't going to kill infrastructure as code. It's going to make IaC absolutely essential.

Let me explain why.

The Case Against IaC (And Why It Sounds Convincing)

The argument goes something like this:

An AI agent with access to AWS, Azure, or GCP APIs can provision a VPC, spin up compute instances, configure security groups, and wire everything together. It can do this conversationally. You describe what you need, the agent figures out the API calls, executes them, and your infrastructure exists.

No Terraform to write. No state files to manage. No modules to version. Just natural language → cloud resources.

Why would we maintain thousands of lines of HCL when an agent can handle it dynamically?

It's a seductive vision. But it falls apart the moment you think about how production infrastructure actually works.

What Happens When AI Calls Cloud APIs Directly

Let's play this out.

You ask an AI agent to "provision a production-ready Kubernetes cluster with autoscaling, monitoring, and network policies that comply with our security standards."

The agent makes 47 API calls across your cloud provider. Resources appear. Everything seems fine.

Now answer these questions:

What exactly did the agent create? Can you see a declarative representation of those 47 API calls? Or do you need to reverse-engineer it from the console?

How do you audit changes? When the agent modifies infrastructure next week, what's your diff? How do you know what changed and why?

What's your rollback strategy? If something breaks, are you asking the agent to "undo what you did" and hoping it remembers? How do you roll back to a known-good state?

How do you enforce policies? Before those API calls executed, did anyone validate them against your security requirements, cost budgets, or compliance frameworks?

How do you prevent catastrophic mistakes? What stops an agent from deleting your production database because it misunderstood context in a conversation?

How does this work across teams? When the agent provisions infrastructure, how do other engineers discover it? How do they modify it? Do they also need to talk to the agent?

Here's the thing: these aren't edge cases. They're the entire job of infrastructure management. And this is exactly why agentic CloudOps will amplify IaC, not replace it.

IaC Is the Control Plane AI Needs

Think about infrastructure as code differently for a moment.

IaC isn't just a way for humans to manage infrastructure. It's a codified, auditable, version-controlled representation of desired state. It's the governance layer between intent and execution.

When an AI agent works with IaC instead of calling cloud APIs directly, something powerful happens:

The agent's work becomes visible. Instead of ephemeral API calls that vanish into cloud provider logs, you get Terraform files, Pulumi code, or Crossplane manifests. You can read exactly what the agent intends to create before it happens.

Source control becomes your audit trail. Every infrastructure change, whether made by a human or an AI agent, gets committed to Git. You have a complete history. You can see who (or what) made changes, when, and why. This isn't just nice to have. It's how you maintain control in an agentic world.

Shift-left guardrails become possible. When agents generate IaC, you can run the same CI pipeline checks you run on human-written code. Policy validation. Security scanning. Cost estimation. Compliance checks. The agent's proposed changes get validated before they touch production.

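Rollback becomes straightforward. Something went wrong? Revert the commit and apply the previous configuration, and you're back to a known-good definition of your infrastructure. No need to ask an agent to remember and undo its actions. (Data inside stateful resources like databases still needs its own backups, but the configuration itself is one revert away.)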

Collaboration stays sane. Other engineers can discover, understand, and modify infrastructure by reading code. They don't need to interrogate an agent about what it did three months ago.

IaC isn't the bottleneck in this workflow. It's the control plane that makes autonomous infrastructure management safe.
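To make the guardrail idea concrete, here's a minimal sketch of a pre-apply check. It assumes the pipeline has already exported the plan as JSON (Terraform can do this with `terraform show -json` on a saved plan, which includes a `resource_changes` list). The `PROTECTED_TYPES` set, the function name, and the sample plan are all made up for illustration, not taken from any real policy framework.

```python
import json

# Hypothetical policy: resource types an automated change may never delete.
PROTECTED_TYPES = {"aws_db_instance", "aws_s3_bucket"}


def find_blocked_changes(plan: dict, protected_types=PROTECTED_TYPES) -> list[str]:
    """Return addresses of protected resources that the plan would delete."""
    blocked = []
    for rc in plan.get("resource_changes", []):
        actions = rc.get("change", {}).get("actions", [])
        # A replacement lists both "delete" and "create", so it is caught too.
        if "delete" in actions and rc.get("type") in protected_types:
            blocked.append(rc["address"])
    return blocked


# A trimmed, hand-written sample of plan JSON (illustrative only).
example_plan = json.loads("""
{
  "resource_changes": [
    {"address": "aws_db_instance.orders", "type": "aws_db_instance",
     "change": {"actions": ["delete"]}},
    {"address": "aws_instance.web", "type": "aws_instance",
     "change": {"actions": ["update"]}},
    {"address": "aws_s3_bucket.logs", "type": "aws_s3_bucket",
     "change": {"actions": ["delete", "create"]}}
  ]
}
""")

blocked = find_blocked_changes(example_plan)
if blocked:
    print("Blocked by policy:", ", ".join(blocked))
```

The point is that the check runs against a proposed change, before anything executes. In practice you'd more likely reach for a dedicated policy-as-code tool, but the shape is the same: the plan is data, and data can be validated.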

How Agentic Workflows Actually Use IaC

Let's get concrete about what this looks like.

Scenario 1: AI-Assisted Infrastructure Provisioning

A developer asks an AI agent: "I need a staging environment for our new microservice with a PostgreSQL database, Redis cache, and API gateway."

Here's what happens in an IaC-first agentic workflow:

  1. The agent analyzes the request and your existing infrastructure patterns
  2. It generates Terraform code (or Pulumi, or Crossplane manifests) that defines the required resources
  3. The generated code gets committed to a feature branch
  4. CI pipeline runs: policy checks validate security rules, cost estimation flags the PostgreSQL instance size, compliance scanning ensures encryption at rest
  5. A human reviews the PR (or an approval agent validates against predefined rules)
  6. The code merges and CD pipeline applies the Terraform plan
  7. Infrastructure provisions, and the agent reports back with endpoints and credentials

The agent did the heavy lifting, translating requirements into infrastructure code. But the actual provisioning happened through your existing IaC toolchain, with full governance.
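The gates in that workflow can be sketched in a few lines. This is a toy model, not a real CI system: `ProposedChange`, the two checks, the $500 budget, and the `approve` callback are all hypothetical names and values I've picked for illustration.

```python
from dataclasses import dataclass, field
from typing import Callable, Optional


@dataclass
class ProposedChange:
    author: str                # "agent" or a human username
    branch: str
    monthly_cost_usd: float    # assumed to come from a cost-estimation step
    encrypted_at_rest: bool
    findings: list[str] = field(default_factory=list)


# A check returns None when it passes, or a message describing the problem.
Check = Callable[[ProposedChange], Optional[str]]


def check_encryption(change: ProposedChange) -> Optional[str]:
    return None if change.encrypted_at_rest else "database must be encrypted at rest"


def check_cost(change: ProposedChange, budget_usd: float = 500.0) -> Optional[str]:
    if change.monthly_cost_usd > budget_usd:
        return (f"estimated cost ${change.monthly_cost_usd:.0f}/month "
                f"exceeds budget ${budget_usd:.0f}")
    return None


def run_pipeline(change: ProposedChange, checks: list[Check],
                 approve: Callable[[ProposedChange], bool]) -> str:
    """Run every check, then the approval step; report where the change ended up."""
    change.findings = [msg for check in checks if (msg := check(change)) is not None]
    if change.findings:
        return "rejected"
    if not approve(change):
        return "awaiting-approval"
    # A real pipeline would invoke the IaC tool's apply step here.
    return "applied"
```

Notice that nothing in `run_pipeline` cares whether `author` is a person or an agent. Same checks, same approval step, same outcome.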

Scenario 2: Autonomous Drift Remediation

An AI agent continuously monitors infrastructure drift. It detects that someone manually modified a security group in the AWS console (we've all been there at 2am).

In an IaC-first workflow:

  1. The agent detects the drift
  2. It determines whether the manual change should be codified or reverted
  3. If codifying: it generates a code update that matches the current state
  4. If reverting: it creates a remediation plan that restores the original configuration
  5. Both options generate PRs with clear explanations
  6. After approval, the agent runs a plan and applies the fix through the IaC pipeline

The agent isn't making blind API calls. It's working through the same IaC control plane your team uses, so drift gets resolved in code and the repository stays the single source of truth.
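Here's a rough sketch of the decision at the heart of that workflow: compare what the code declares with what actually exists, then prepare both remediation options for review. The attribute names and the dictionary shape are assumptions for illustration. In a real setup the comparison would come from the IaC tool itself (for example, `terraform plan -detailed-exitcode` exits with code 2 when the plan contains changes).

```python
def diff_state(declared: dict, actual: dict) -> dict:
    """Return {attribute: (declared_value, actual_value)} for every difference."""
    keys = declared.keys() | actual.keys()
    return {
        k: (declared.get(k), actual.get(k))
        for k in sorted(keys)
        if declared.get(k) != actual.get(k)
    }


def propose_remediations(declared: dict, actual: dict) -> list[dict]:
    """Build one PR proposal per option; a reviewer picks which one to merge."""
    drift = diff_state(declared, actual)
    if not drift:
        return []
    changed = ", ".join(drift)
    return [
        {   # Option 1: accept the manual change by updating the code.
            "option": "codify",
            "summary": f"Update code to match live values for: {changed}",
            "new_declared": dict(actual),
        },
        {   # Option 2: keep the code as-is and re-apply it over the manual change.
            "option": "revert",
            "summary": f"Restore declared values for: {changed}",
            "new_declared": dict(declared),
        },
    ]


# Hypothetical security group: someone opened ingress to the world by hand.
declared = {"ingress_cidr": "10.0.0.0/8", "port": 443}
actual = {"ingress_cidr": "0.0.0.0/0", "port": 443}

for proposal in propose_remediations(declared, actual):
    print(proposal["option"], "->", proposal["summary"])
```

Either way, the outcome is a reviewable change to code, not a silent API call.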

Scenario 3: Multi-Cloud Resource Orchestration

An agentic workflow needs to provision resources across AWS, Azure, and GCP while maintaining consistent networking, security policies, and observability.

With a tool like Crossplane, the agent:

  1. Generates Crossplane composite resource definitions (XRDs), Kubernetes-native APIs that abstract cloud providers
  2. Composes infrastructure using declarative manifests
  3. Lets Crossplane handle the actual cloud API orchestration
  4. Maintains everything in version-controlled YAML that other systems can read and validate

The agent gets the power of multi-cloud management and abstraction. Your organization gets auditable, governed infrastructure changes.
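As a rough illustration of what "generating an XRD" might look like, here's a sketch that builds a minimal definition as plain data and serializes it. I'm assuming the `apiextensions.crossplane.io/v1` API version; the group `platform.example.org`, the kind `XPostgresInstance`, and the `storageGB` and `region` fields are invented for the example. The output is JSON, which is also valid YAML, so it avoids a third-party YAML library.

```python
import json


def build_xrd(group: str, kind: str, plural: str) -> dict:
    """Build a minimal CompositeResourceDefinition as a plain dictionary."""
    return {
        "apiVersion": "apiextensions.crossplane.io/v1",  # assumed API version
        "kind": "CompositeResourceDefinition",
        # By convention the XRD is named <plural>.<group>.
        "metadata": {"name": f"{plural}.{group}"},
        "spec": {
            "group": group,
            "names": {"kind": kind, "plural": plural},
            "versions": [{
                "name": "v1alpha1",
                "served": True,
                "referenceable": True,
                "schema": {"openAPIV3Schema": {
                    "type": "object",
                    "properties": {"spec": {
                        "type": "object",
                        # Hypothetical, provider-neutral parameters.
                        "properties": {
                            "storageGB": {"type": "integer"},
                            "region": {"type": "string"},
                        },
                        "required": ["storageGB"],
                    }},
                }},
            }],
        },
    }


xrd = build_xrd("platform.example.org", "XPostgresInstance", "xpostgresinstances")
manifest = json.dumps(xrd, indent=2)  # commit this file; review it like any other code
print(manifest)
```

Because the manifest is just structured text in a repository, the same review, scanning, and approval steps apply to it as to any Terraform file.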

Why Terraform (and Friends) Aren't Going Anywhere

Let's talk about the IaC tools themselves.

Terraform remains the most widely adopted IaC tool because it's cloud-agnostic and has mature state management and a massive provider ecosystem. AI agents generating Terraform code can leverage thousands of existing modules and patterns. They're not starting from scratch.

Crossplane is increasingly important in agentic scenarios because it turns infrastructure into Kubernetes resources. Agents can manage infrastructure using the same patterns they use for application workloads. Everything becomes declarative, composable, and GitOps-friendly.

Pulumi and other code-first IaC tools give agents the ability to generate infrastructure in general-purpose languages. An agent fluent in TypeScript or Python can generate Pulumi code that feels natural to application developers.

These tools aren't competitors to AI. They're the interfaces through which AI safely manages infrastructure.

The Governance Problem No One's Talking About

Here's what keeps me up at night about direct cloud API access for AI agents:

Without IaC as an intermediary layer, how do you govern autonomous infrastructure changes at scale?

You can't rely on cloud provider audit logs alone. They're verbose, hard to query, and don't capture intent. They tell you what happened, not why or whether it aligned with your policies.

You can't trust agents to self-govern. Even the most sophisticated AI models make mistakes, hallucinate, or misinterpret context. You need systematic guardrails.

Source control + IaC gives you both.

When every infrastructure change flows through code, you get:

  • Pre-execution validation (shift-left security, policy-as-code)
  • Human-readable diffs that show exactly what's changing
  • Approval workflows that can be human, automated, or hybrid
  • Complete audit trails that survive beyond cloud provider retention windows
  • The ability to replay, roll back, or branch infrastructure changes

This isn't theoretical. This is how you run infrastructure safely in an agentic world.
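For a sense of what a hybrid approval workflow could look like, here's a small sketch that routes a change to a review path based on what it does and where. The three route names and the rules themselves are assumptions of mine, not a standard; your own risk thresholds will differ.

```python
def route_approval(actions: set[str], environment: str) -> str:
    """Pick a review path for a proposed change.

    Returns one of:
      "human"     - a person must approve
      "hybrid"    - automated checks plus one human reviewer
      "automated" - automated checks only
    """
    # Anything destructive always goes to a person, in any environment.
    if "delete" in actions:
        return "human"
    # Non-destructive production changes get automated checks plus a reviewer.
    if environment == "production":
        return "hybrid"
    # Creates and updates outside production can merge on green checks alone.
    return "automated"


print(route_approval({"create"}, "staging"))
print(route_approval({"update"}, "production"))
print(route_approval({"delete", "create"}, "staging"))
```

Rules like these only work because the change exists as a diff before it exists as infrastructure. With direct API calls, there's nothing to route.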

What This Means for Cloud Engineers

If you're a cloud engineer wondering how AI fits into your workflow, here's my take:

Don't fear AI replacing IaC. Start thinking about AI that amplifies your IaC practice.

The engineers who thrive in the next few years will be the ones who:

  • Understand how to design IaC that's agent-friendly (composable modules, clear abstractions, well-documented patterns)
  • Build CI/CD pipelines that validate agent-generated code with the same rigor as human-written code
  • Implement policy-as-code frameworks that work regardless of whether a human or agent wrote the Terraform
  • Treat source control as the authoritative system of record for all desired infrastructure state

You're not becoming obsolete. You're becoming the architects of systems that let agents work safely and effectively.

The Future Is Agentic IaC, Not Agentic Cloud APIs

Look, I'm not saying AI agents should never call cloud APIs directly. For ephemeral resources, development environments, or exploratory work, direct API access makes sense.

But for production infrastructure? For systems that need auditability, governance, and control?

IaC is the answer. It's the layer that turns autonomous agent actions into manageable, reviewable, reversible infrastructure changes.

Generative AI and agentic workflows aren't making infrastructure as code obsolete. They're making it more critical than ever. Because the more autonomous our systems become, the more we need codified, version-controlled representations of what they're doing.

The question isn't whether AI will kill IaC.

It's whether your IaC practice is ready for AI to make it infinitely more powerful.