Terraform works because of one core concept that often gets misunderstood: state.
Most tutorials mention it briefly, but in real environments, Terraform state is one of the most important parts of your entire setup.
If you don’t understand it properly, you’ll eventually run into problems, especially when working in teams or deploying production infrastructure.
This post explains what Terraform state actually is, why it exists, and what really matters when you’re using Terraform in real-world environments.
What Terraform State Actually Is
At a basic level, Terraform state is a record of the infrastructure Terraform manages.
When you run terraform apply, Terraform provisions resources in Azure (or any other provider). At the same time, it writes a state file, in JSON format, that contains information about those resources.
This file is not just a list of resources. It is a mapping between your Terraform configuration and the real infrastructure running in the cloud.
For example, if you define a virtual network in Terraform, the state file stores details such as:
- The resource ID in Azure
- The properties of the resource
- How that resource maps to your Terraform code
Without this mapping, Terraform would have no way of knowing what it created or how to manage it.
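To make the mapping concrete, here is a minimal sketch of a virtual network definition. The resource names, region, and address range are illustrative assumptions rather than values from a real environment, and the provider is assumed to pick up its credentials and subscription from the Azure CLI or environment variables.

```hcl
terraform {
  required_providers {
    azurerm = {
      source = "hashicorp/azurerm"
    }
  }
}

provider "azurerm" {
  # Required by the azurerm provider, even when left empty.
  features {}
}

# Hypothetical resource group that holds the network.
resource "azurerm_resource_group" "example" {
  name     = "rg-example"
  location = "uksouth"
}

# The address "azurerm_virtual_network.example" is the key that state
# uses to link this block to the real virtual network in Azure.
resource "azurerm_virtual_network" "example" {
  name                = "vnet-example"
  location            = azurerm_resource_group.example.location
  resource_group_name = azurerm_resource_group.example.name
  address_space       = ["10.0.0.0/16"]
}
```

After an apply, the state file records the address azurerm_virtual_network.example alongside the Azure resource ID (of the form /subscriptions/<subscription-id>/resourceGroups/rg-example/providers/Microsoft.Network/virtualNetworks/vnet-example) and the attributes the provider returned. Running terraform state list shows the addresses Terraform is currently tracking.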
Why Terraform Needs State
Terraform is declarative. You describe what you want the end result to look like, rather than writing step-by-step instructions to build it.
Because of that, Terraform needs a way to understand the difference between what you’ve defined in your code and what actually exists in the cloud.
That comparison is what allows Terraform to decide what needs to be created, updated, or removed.
This is where state comes in.
When you run terraform plan, Terraform doesn’t just look at your code. It also reads the current state file and, by default, refreshes it by querying the real infrastructure through the provider.
By combining these three views, it can work out what changes are required to move from the current state to the desired state.
Without state, Terraform wouldn’t be able to do this. It wouldn’t know what it has already created, what still exists, or what needs to change. It would effectively be operating without any memory of previous deployments.
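One way to see state acting as Terraform’s memory is to rename a resource in code. Because state is keyed by resource address, a plain rename looks to Terraform like one resource disappearing and a new one appearing, so the plan proposes a destroy and a create. The sketch below assumes Terraform 1.1 or later, where a moved block can tell Terraform that the existing state entry should follow the new name. All names here are hypothetical.

```hcl
# Before the rename, this block was labelled "example":
#   resource "azurerm_virtual_network" "example" { ... }
resource "azurerm_virtual_network" "hub" {
  name                = "vnet-example"
  location            = "uksouth"
  resource_group_name = "rg-example" # assumed to exist already
  address_space       = ["10.0.0.0/16"]
}

# Tells Terraform that the existing state entry now belongs to the new
# address, so the plan does not propose destroying and recreating it.
moved {
  from = azurerm_virtual_network.example
  to   = azurerm_virtual_network.hub
}
```

Nothing changes in Azure in this case. Only the mapping held in state is updated.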
Local State vs Remote State
When you first start using Terraform, the state is stored locally by default, in a file called terraform.tfstate in your working directory.
For small projects or learning environments, this works perfectly well.
However, as soon as you move into a real environment, local state quickly becomes a problem.
If more than one person is working on the same infrastructure, each person having their own local state file creates inconsistencies.
Terraform can lose track of resources, and in the worst cases, changes can overwrite or destroy existing infrastructure because the state is out of sync.
This is why real environments use remote state.
In Azure, remote state is typically stored as a blob in a Storage Account container, configured through the azurerm backend.
Instead of sitting on someone’s laptop, the state is centralised and shared.
This means everyone is working from the same source of truth, pipelines can access it reliably, and changes become far more predictable.
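A remote backend for Azure might be configured along these lines. This is a sketch under assumptions: the resource group, storage account, container, and key names are hypothetical, the storage account and container are assumed to exist already, and authentication is assumed to come from the Azure CLI or environment variables.

```hcl
terraform {
  backend "azurerm" {
    resource_group_name  = "rg-tfstate"             # hypothetical
    storage_account_name = "sttfstateexample"       # hypothetical, must be globally unique
    container_name       = "tfstate"                # blob container that holds state files
    key                  = "prod.terraform.tfstate" # blob name for this environment's state
  }
}
```

After adding or changing a backend block, terraform init needs to be run again. If local state already exists, init offers to migrate it to the new backend.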
State Locking (Why It Matters More Than You Think)
One of the biggest risks when using Terraform in a team is multiple people making changes at the same time.
If two engineers run terraform apply against the same environment simultaneously, their changes can conflict, the state file can become inconsistent, and deployments can end up in a partially completed state.
Remote backends help solve this by introducing state locking.
When Terraform starts an operation, it locks the state so that no one else can modify it until the process finishes.
This prevents overlapping changes and ensures that deployments happen in a controlled, predictable way.
In Azure, the azurerm backend handles this automatically by taking a lease on the state blob while an operation is running. It’s easy to overlook, but in real-world environments, it’s one of the most important safeguards you have.
State Drift (Where Things Go Wrong)
State drift is one of the most common issues you’ll encounter with Terraform.
It happens when the real infrastructure changes outside of Terraform.
This could be something as simple as a setting being modified in the Azure Portal, or as serious as a resource being deleted manually.
When that happens, Terraform’s state no longer reflects reality.
The next time you run terraform plan, you may see unexpected changes as Terraform tries to reconcile the difference between what it thinks exists and what actually exists.
If you don’t understand what’s happening, this can lead to confusion or even unintended changes being applied.
In production environments, this is where problems tend to surface.
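As an illustration, suppose someone edits a tag in the Azure Portal on the resource below. On the next plan, Terraform would normally propose an in-place update to put the configured value back. Where an attribute is deliberately managed outside Terraform (tags applied by another process, for example), one option is the ignore_changes lifecycle setting. The resource name and tag values are hypothetical, and whether ignoring an attribute is appropriate depends on the team’s workflow.

```hcl
resource "azurerm_resource_group" "app" {
  name     = "rg-app-prod" # hypothetical
  location = "uksouth"

  tags = {
    environment = "prod"
  }

  lifecycle {
    # With this setting, Terraform does not plan updates for tags that
    # were changed outside Terraform after the resource was created.
    # Without it, a tag edited in the portal shows up in the next plan
    # as an in-place update back to the configured value.
    ignore_changes = [tags]
  }
}
```

This should be used sparingly. It hides drift for the listed attributes rather than fixing it.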
Why Manual Changes Are a Problem
In an ideal setup, Terraform acts as the single source of truth for your infrastructure.
In reality, teams often make manual changes in the portal, especially when something needs to be fixed quickly. While this might solve an immediate problem, it introduces inconsistency between the real infrastructure and the Terraform state.
Terraform doesn’t know about these changes until the next plan or apply, when it refreshes the state against the real infrastructure and reconciles the difference.
As a result, you can end up with confusing plan outputs, unexpected updates, or even outages if Terraform attempts to revert or overwrite those manual changes.
The more disciplined a team is about managing infrastructure through Terraform rather than the portal, the more reliable and predictable the environment becomes.
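Where a resource was created by hand and should be brought under Terraform’s control rather than recreated, one option (assuming Terraform 1.5 or later) is an import block. The sketch below uses a placeholder subscription ID and hypothetical names, and the resource block needs to match what actually exists in Azure.

```hcl
# Links an existing, manually created resource group to a Terraform address.
import {
  to = azurerm_resource_group.manual
  id = "/subscriptions/00000000-0000-0000-0000-000000000000/resourceGroups/rg-manual"
}

# Configuration describing the imported resource. If it differs from the
# real resource, the plan shows the changes Terraform would make.
resource "azurerm_resource_group" "manual" {
  name     = "rg-manual"
  location = "uksouth"
}
```

The next plan then reports the resource as one to import rather than one to create, and from that point on it is tracked in state like any other resource.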
What Actually Matters in Real Environments
In practice, Terraform state is less about the file itself and more about how you manage it.
What matters is using remote state from the beginning, ensuring that state locking is in place, and avoiding manual changes outside of Terraform wherever possible.
It’s also important to understand what Terraform is doing during the plan and apply phases, rather than treating it as a black box.
When these things are handled properly, Terraform becomes a very predictable and safe tool, even in complex environments.
When they’re not, issues tend to appear quickly and can be difficult to diagnose.
Final Thoughts
Terraform state isn’t just an internal detail. It’s a fundamental part of how Terraform works.
Once you understand that state is the link between your code and the real infrastructure, everything else starts to fall into place.
Plan outputs make more sense, drift becomes easier to recognise, and team-based workflows become much more manageable.
If you’re using Terraform seriously, getting this right early will save you time, reduce risk, and make your deployments far more reliable.