Terraform Security

I think the Terraform state is great – and I’m tired of pretending it’s not!

TL;DR The Terraform state has more benefits than drawbacks, and it should not be considered a disadvantage in a well-managed infrastructure as code environment!

What is Terraform state?

Terraform is used for creating infrastructure. When Terraform creates a resource, a representation of that resource is almost always recorded in a text-based «database»: the state. Terraform refreshes its view of this database against the real infrastructure whenever you run terraform plan or terraform apply, or when you explicitly run terraform apply -refresh-only (the older terraform refresh command is deprecated). When we run terraform plan, the configuration, the state, and the actual infrastructure are compared to determine what has changed and what needs to be changed. This is why Terraform can generate the nicely formatted plan, with all the well explained steps needed for your infrastructure to be reconciled with the “code” you have written.

All in all a good thing 👍

Why do we need it?

There are several reasons why Terraform needs a state file. Some of them are:

  • Keeping track of metadata about the resources created
  • Seeing which changes happened outside of your IaC
  • Determining that resources are not changed from run to run (idempotency)
  • Showing detailed information of subservices created, removed, or updated
  • Allowing for predictability in deployments
  • Making sure there are no concurrent updates to the same resources from different sources
  • Knowing what the current Terraform working directory is actually managing, and making sure it doesn’t manage anything else

Lots of reasons why we need the Terraform state! There are more, but I think these cover the most important aspects.

How is it stored?

Creating a Terraform state is not something you have to do manually. It is done for you the first time you run terraform apply in a Terraform folder that has some form of resources in its config.

The default state storage option is a local state file. This is a file named terraform.tfstate in the working directory where the init command is run.

A local state file could look something like this (without actual resource content):

{
  "version": 4,
  "terraform_version": "x.y.z",
  "serial": 8,
  "lineage": "<some guid>",
  "outputs": {},
  "resources": [],
  "check_results": null
}

This is a very simple state file, but you can see the skeleton structure of such a file. As you might have noticed, the state file is a JSON file. This does not change even if you use remote state.

Remote state

When you are working on your own, the local state file will work as an initial configuration. The challenges appear when you need to work together with other people. They can’t see your local state file, and you can’t see theirs. This can lead to some interesting situations, where resources could exist in several different state files. The resource to state file mapping must be 1:1, and a single resource can never be predictably managed by two different state files.

This is where storing your state in a remote backend comes in. A remote backend for your state makes sure the actual state file is created and stored in a remote location. In my case this is almost always an azurerm backend in a storage account.
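As a rough illustration of what such a backend definition can look like, here is a minimal sketch of an azurerm backend block. The resource group, storage account, container, and key names are hypothetical placeholders, and it assumes the storage account and container already exist.

```hcl
terraform {
  backend "azurerm" {
    # All names below are hypothetical; replace them with your own.
    resource_group_name  = "rg-tfstate"
    storage_account_name = "sttfstateexample"
    container_name       = "tfstate"

    # Name of the state blob inside the container.
    key = "example.terraform.tfstate"

    # Authenticate to the storage account with Entra ID instead of access keys.
    use_azuread_auth = true
  }
}
```

After adding or changing a backend block, terraform init has to be run again so Terraform can configure the backend and, if needed, offer to migrate an existing local state.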

You can find different backends here.

How is it secured?

You need to protect your state file! This is achieved in several ways but the most important are network restriction, authentication, and encryption.

Network restriction

Store the state file in a secure location where it is not publicly available. This can be an Azure storage account with either the network firewall enabled, or with private endpoints for the most secure configuration.

Terraform Cloud (now called HCP Terraform, part of the HashiCorp Cloud Platform) is also an option. I haven’t actually used HCP, so this is not covered here.

Whatever network option you choose, your CI/CD runners/agents or your local client need network access over HTTPS to reach the state for read and update operations. It goes without saying that you should allow only HTTPS traffic to your storage account.

Authentication

The state file storage account should be configured with the most secure form of authentication. In the case of Azure storage account this is Entra ID (previously Azure Active Directory) authentication with shared access keys disabled. Read more on authentication methods here.

The easy way out for authentication here is shared access keys, but this is not the most secure way. You should avoid using keys whenever possible.

Encryption

The state file is not encrypted by default. The content is plaintext JSON. This means that you need to encrypt this file by other means. OpenTofu now supports native encryption, but this is not in scope of my post.

The most user friendly way of making sure your file is encrypted at rest is to keep it in an Azure storage account which has encryption by default. To increase the security you can also add infrastructure encryption, and going even further you can use your own encryption keys. More on encryption here.
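To make the network, authentication, and encryption points above a bit more concrete, here is a sketch of a storage account for state files. It assumes version 4.x of the azurerm provider (some argument names differ in 3.x), and every name, the location, and the IP address are hypothetical placeholders.

```hcl
resource "azurerm_storage_account" "tfstate" {
  # Hypothetical names and location; replace with your own.
  name                     = "sttfstateexample"
  resource_group_name      = "rg-tfstate"
  location                 = "westeurope"
  account_tier             = "Standard"
  account_replication_type = "GRS"

  # Network: only allow HTTPS with a modern TLS version.
  https_traffic_only_enabled = true
  min_tls_version            = "TLS1_2"

  # Authentication: disable shared access keys so only Entra ID can be used.
  shared_access_key_enabled = false

  # Encryption: add infrastructure encryption on top of the default
  # encryption at rest. This can only be set when the account is created.
  infrastructure_encryption_enabled = true

  # Network: deny by default and allow only known addresses.
  network_rules {
    default_action = "Deny"
    bypass         = ["AzureServices"]
    ip_rules       = ["203.0.113.10"] # hypothetical CI/CD runner address
  }
}
```

Note that with shared access keys disabled, the azurerm provider itself also has to talk to the storage account with Entra ID (the storage_use_azuread provider setting), and the identity running Terraform needs a data plane role on the account.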

Managing the state

You should never manually edit the state file! If you are in a situation where you need to make the state match your environment, you should use the special blocks I mention below.

These Terraform blocks can be used:

Moved is used to move (rename) a resource from one address to another. Resources can also be moved into modules, and modules can be moved to other modules.
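A minimal sketch of a moved block, assuming Terraform 1.1 or newer. The resource addresses are hypothetical; the block tells Terraform that an object already in state now lives under a new address, so the plan shows a move instead of a destroy and create.

```hcl
# The resource block has been renamed in code from "old_name" to "new_name".
# Both addresses are hypothetical examples.
moved {
  from = azurerm_resource_group.old_name
  to   = azurerm_resource_group.new_name
}
```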

Removed can be used to remove resources from the state if you are not managing with Terraform anymore. You also need to combine this with a lifecycle block to prevent Terraform from trying to remove the resource for you (unless you want it to, in which case you could just remove the resource from your code).
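A minimal sketch of a removed block, assuming Terraform 1.7 or newer and a hypothetical resource address. The matching resource block must be deleted from the configuration at the same time.

```hcl
# Stop managing this (hypothetical) resource group with Terraform.
removed {
  from = azurerm_resource_group.legacy

  lifecycle {
    # false = forget the resource in state but leave the real resource alone.
    destroy = false
  }
}
```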

Import is used to import existing resources you want to start managing with Terraform. Please note that you can actually import resources and have Terraform generate the configuration for you. Not perfect, but can be a good starting point.
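A minimal sketch of an import block, assuming Terraform 1.5 or newer. The resource address, the all-zero subscription ID, and the resource group name are hypothetical placeholders.

```hcl
# Bring an existing (hypothetical) resource group under Terraform management.
import {
  to = azurerm_resource_group.example
  id = "/subscriptions/00000000-0000-0000-0000-000000000000/resourceGroups/example-rg"
}
```

If no matching resource block exists yet, running terraform plan -generate-config-out=generated.tf asks Terraform to write a first draft of the configuration to a new file, which you can then review and clean up.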

The legacy commands terraform state mv, terraform state rm, and terraform import are not optimal for use anymore, unless you are using older Terraform versions which lack these features. The new blocks are much more user friendly and predictable. There is less chance of making a mistake, and you get to see the changes in your plan before applying them with apply.

What benefits do we get from the Terraform state?

This is a list of what I consider benefits of the Terraform state. They are from my recollection, and might be opinionated.

  • Predictable changes
  • Know exactly what changed between runs (configuration drift)
  • Implementation control (what am I actually changing with this apply operation?)
  • Prevents multiple concurrent deployments with state locking
  • Easy to use with CI/CD pipelines
  • Easily accessible review of changes (plain text explanations of what is happening)
  • Use resource information as source for other values (resource names, ids, ip addresses – without reading from the actual resources)
  • Keep track of generated values (uuids, random strings, pet names)
  • Use as data source in other relevant environments
  • Versioned infrastructure «snapshots» that can be backed up and looked at
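The “use as data source in other relevant environments” point can be sketched with the terraform_remote_state data source. The backend settings, the state key, and the subnet_id output are hypothetical; the sketch assumes another configuration stores its state in that blob and declares a root output with that name.

```hcl
# Read the root outputs of another configuration's state (hypothetical names).
data "terraform_remote_state" "network" {
  backend = "azurerm"

  config = {
    resource_group_name  = "rg-tfstate"
    storage_account_name = "sttfstateexample"
    container_name       = "tfstate"
    key                  = "network.terraform.tfstate"
    use_azuread_auth     = true
  }
}

# Reuse a value that the network configuration exposes as an output.
output "shared_subnet_id" {
  value = data.terraform_remote_state.network.outputs.subnet_id
}
```

Only root module outputs are exposed this way, but whoever runs this configuration still needs read access to the whole state blob, so the usual protection rules apply.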

What are the drawbacks?

The fact that my list of drawbacks is shorter than the list of benefits might indicate that I am biased. I might be biased for all I know, and I hope someone will let me know if I am dead wrong. Getting feedback is a great way to learn!

  • Secrets are stored plaintext in state
  • Variables are stored plaintext in state
  • Can become corrupt if not maintained correctly, or if it grows too large
  • Requires encrypted storage; state is not natively encrypted in Terraform from HashiCorp
  • CI/CD agents/runners need network access; can lead to catch 22 situations when configuring storage firewall or authentication methods

These are some of my recommendations for handling the Terraform state. It is not something to play with or treat lightly, but something to protect and secure. Treat it as sensitive information, and use your best judgement. When in doubt, enable the firewall.

  • Use remote state!
  • Least privilege with granular RBAC
  • Entra ID authentication only
  • Use a backend with native lease support, so you get state locking
  • Use ephemeral resources for e.g. key vault secrets
  • Remember that terraform plan output is also plaintext with content from your state
  • Enable multiple levels of encryption if possible
  • Add the state file to your .gitignore together with the other files better ignored
  • Use partial configuration for your backend to avoid storing secrets in your code
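As a sketch of the partial configuration recommendation: the backend block in code stays mostly empty, and the environment-specific settings are supplied at init time. The two file names and all values are hypothetical, and the two parts shown here live in separate files.

```hcl
# --- backend.tf (committed to the repository) ---
terraform {
  backend "azurerm" {
    # Only non-sensitive, shared settings live in code.
    use_azuread_auth = true
  }
}

# --- prod.azurerm.tfbackend (separate file, passed in at init time) ---
# resource_group_name  = "rg-tfstate"
# storage_account_name = "sttfstateexample"
# container_name       = "tfstate"
# key                  = "prod.terraform.tfstate"
```

The remaining settings are then supplied with terraform init -backend-config=prod.azurerm.tfbackend, or as individual -backend-config="key=value" arguments from a pipeline.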

In conclusion

You get both advantages and disadvantages with Terraform state. If you are using Terraform, you need to handle state in one form or another. It is a core part of the Terraform experience. In my opinion, the advantages far outweigh the disadvantages.

Terraform state is a good thing. Use it for what it is worth, and protect it as it was intended. There are advantages and disadvantages, but in no way does this make Terraform any less of a tool compared to any other.

Please leave a comment on LinkedIn if you have questions. Also leave a comment if you disagree and think I have missed something! I am working on adding comments to my blog, but it has proved difficult to get working.