Testing Terraform code with Go and Terratest

Towards high-quality infrastructure-as-code part 1


As a cloud engineer, I love Terraform. With Terraform, I don’t have to worry about keeping track of infrastructure changes or computing the dependencies between components. Terraform is also cloud-agnostic, so the Terraform knowledge I’ve accrued over the years transfers quickly between cloud providers and even into Kubernetes clusters.

While Terraform protects the user against many common mistakes, errors still creep in. One I’ve encountered many times is a network security group misconfiguration that prevents VMs from communicating inside a VNet. The Terraform code is syntactically correct but does not work as intended.

Given a spec, it’s valuable to check whether your infrastructure-as-code (IaC) works as expected. For example, you’d like to make sure you can reach your public-facing webservers from the internet, but not your databases. Or perhaps ensure your network security rules do not expose SSH/RDP ports to the internet.

This blog post will walk you through a methodology for testable IaC and provide an example implementation to get you started.

Terratest

I’ve chosen Terratest to define my IaC tests. In essence, it wraps Terraform’s CLI in a Go API. Terratest allows you to programmatically pass inputs as variables to Terraform and retrieve outputs from the Terraform state.

It’s important to note that this approach is not exclusive to Go. For example, pytest-terraform implements similar functionality in Python.

Overall, it’s the team’s choice what tool they should use for testing infrastructure. This blog post’s advice is still relevant no matter the library used and the cloud provider targeted.

High-level description of a testable IaC

The testing framework will follow this script:

  1. Define a test fixture in Terraform that references the code under test.
  2. Use Terratest to pass in inputs to the test fixture.
  3. Use Terratest to init and apply the test fixture to a lab subscription.
  4. Use Terratest to retrieve the outputs from the test fixture.
  5. Take the Terraform outputs and use them to retrieve the actual behavior of the deployed resources. You can do this either by querying the cloud’s SDK or by connecting directly to the compute resources, for example, SSH’ing into a VM.
  6. Verify that the actual behavior matches the expected behavior.
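The six steps above can be sketched as a small Go driver. This is only an illustration under stated assumptions: the `Fixture` interface, the `Probe` type, `RunSpec`, and the `workload` / `resource_group_name` names are all hypothetical. In a real test, the `Fixture` methods would be backed by Terratest calls such as `terraform.InitAndApply`, `terraform.Output`, and `terraform.Destroy`; here a fake stands in so the flow can run without a cloud account.

```go
package main

import "fmt"

// Fixture abstracts steps 2-4 (hypothetical interface for this sketch).
type Fixture interface {
	Apply(vars map[string]string) error
	Output(name string) (string, error)
	Destroy() error
}

// Probe abstracts step 5: given a value taken from the Terraform outputs,
// ask the cloud what was actually deployed.
type Probe func(id string) (actual string, err error)

// RunSpec drives the steps in order and always destroys the fixture.
func RunSpec(f Fixture, vars map[string]string, output string, probe Probe, expected string) (err error) {
	// Tear down even when a step fails, so nothing is left running.
	defer func() {
		if derr := f.Destroy(); derr != nil && err == nil {
			err = fmt.Errorf("destroy: %w", derr)
		}
	}()
	// Steps 2-3: pass the inputs and apply the fixture.
	if aerr := f.Apply(vars); aerr != nil {
		return fmt.Errorf("apply: %w", aerr)
	}
	// Step 4: read an output from the state.
	id, oerr := f.Output(output)
	if oerr != nil {
		return fmt.Errorf("output %q: %w", output, oerr)
	}
	// Step 5: observe the deployed resource.
	actual, perr := probe(id)
	if perr != nil {
		return fmt.Errorf("probe: %w", perr)
	}
	// Step 6: compare actual and expected behavior.
	if actual != expected {
		return fmt.Errorf("expected %q, got %q", expected, actual)
	}
	return nil
}

// fakeFixture is an in-memory stand-in for a real Terraform deployment.
type fakeFixture struct {
	vars      map[string]string
	destroyed bool
}

func (f *fakeFixture) Apply(vars map[string]string) error { f.vars = vars; return nil }

func (f *fakeFixture) Output(name string) (string, error) {
	if name != "resource_group_name" {
		return "", fmt.Errorf("unknown output %q", name)
	}
	return f.vars["workload"] + "-rg", nil
}

func (f *fakeFixture) Destroy() error { f.destroyed = true; return nil }

func main() {
	fixture := &fakeFixture{}
	// Pretend the cloud reports the region of the resource group.
	probe := func(id string) (string, error) { return "westeurope", nil }
	err := RunSpec(fixture, map[string]string{"workload": "demo"}, "resource_group_name", probe, "westeurope")
	fmt.Println("spec error:", err, "destroyed:", fixture.destroyed)
}
```

The deferred teardown is the one design choice worth keeping in any real version: a failed assertion should never leave cloud resources behind.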

This methodology has some similarities to unit testing, but the tests behave more like integration or end-to-end tests. They tend to run for upwards of 30 minutes, create cloud resources, and are susceptible to cloud errors or network issues. Due to the long deployment times, you might not test all possible combinations of inputs. You must carefully choose what tests to run and maximize the number of tests run in parallel.
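One practical consequence of running tests in parallel against a shared subscription is that resource names can collide. A common remedy is to give each test run its own randomized name. The sketch below is an assumption of mine rather than part of the example: `UniqueWorkload` and its alphabet are hypothetical, and Terratest ships its own helper for this purpose (`random.UniqueId`, as far as I know).

```go
package main

import (
	"crypto/rand"
	"fmt"
)

// alphabet is restricted to lowercase letters and digits so the result
// stays valid in most cloud resource names.
const alphabet = "abcdefghijklmnopqrstuvwxyz0123456789"

// UniqueWorkload returns prefix plus "-" plus n random characters.
// The modulo introduces a slight bias, which is acceptable for naming.
func UniqueWorkload(prefix string, n int) (string, error) {
	buf := make([]byte, n)
	if _, err := rand.Read(buf); err != nil {
		return "", fmt.Errorf("reading random bytes: %w", err)
	}
	for i, b := range buf {
		buf[i] = alphabet[int(b)%len(alphabet)]
	}
	return prefix + "-" + string(buf), nil
}

func main() {
	name, err := UniqueWorkload("demo", 8)
	if err != nil {
		panic(err)
	}
	// Each parallel test would pass its own name as the workload input.
	fmt.Println(name)
}
```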

Example

The following is an example deployment with some tests. By no means does it represent a real workload, nor does it follow Go or Terraform best practices. It is, however, a helpful template to get you started. We’ll be using the Azure cloud for this example, but this advice should work for all cloud providers.

Consider the task of implementing a network solution for a generic workload with the following constraints:

  • The network must be in the `West Europe` region.
  • The network must be in the CIDR block `10.20.0.0/16`.
  • The network must be inside a new resource group.
  • The resource group’s name must be in the format `${WORKLOAD}-rg`, where WORKLOAD is a user input.
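Before touching the cloud, it can help to write the constraints above as small comparison helpers that the tests later call. This is a minimal sketch using only the standard library; the function names are hypothetical. The region comparison ignores case and spaces because, in my experience, Azure APIs tend to report the region as `westeurope` rather than `West Europe`.

```go
package main

import (
	"fmt"
	"net/netip"
	"strings"
)

// ExpectedResourceGroupName applies the `${WORKLOAD}-rg` naming rule.
func ExpectedResourceGroupName(workload string) string {
	return workload + "-rg"
}

// SameRegion compares region names while ignoring case and spaces,
// so "West Europe" and "westeurope" are treated as equal.
func SameRegion(a, b string) bool {
	norm := func(s string) string {
		return strings.ToLower(strings.ReplaceAll(s, " ", ""))
	}
	return norm(a) == norm(b)
}

// SameCIDR reports whether two CIDR strings denote the same network.
func SameCIDR(a, b string) (bool, error) {
	pa, err := netip.ParsePrefix(a)
	if err != nil {
		return false, fmt.Errorf("parsing %q: %w", a, err)
	}
	pb, err := netip.ParsePrefix(b)
	if err != nil {
		return false, fmt.Errorf("parsing %q: %w", b, err)
	}
	// Masked zeroes the host bits, so both prefixes are in canonical form.
	return pa.Masked() == pb.Masked(), nil
}

func main() {
	fmt.Println(ExpectedResourceGroupName("demo"))
	fmt.Println(SameRegion("West Europe", "westeurope"))
	same, err := SameCIDR("10.20.0.0/16", "10.20.0.0/16")
	fmt.Println(same, err)
}
```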

Deciding what and how to test is an art in itself. For this example, we will focus on testing if we are meeting the requirements.

We’ll organize the code in the following structure:

example_network_module
├── src
│   └── main.tf
└── test
    └── spec_test.go

According to the spec, we defined `src/main.tf` as such:

So far, so good. The code creates a resource group with the appropriate naming scheme and in the right region. Inside that resource group, the code also creates a network with a suitable CIDR block.

Now, on to testing the code. Using Terratest, we pass the inputs to Terraform, run a `terraform apply` operation, and retrieve the Terraform outputs. Afterwards, we use Azure’s Go SDK to assert that the resources exist and respect the requirements.
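To make the "retrieve the outputs" step less of a black box: Terratest reads outputs for you, but what it reads is essentially what `terraform output -json` prints, a JSON object mapping each output name to an entry with a `value` field. The sketch below decodes that shape with the standard library. `ParseStringOutputs` is a hypothetical helper, and the output names in the sample are assumptions, not the names used in the real example.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// outputValue mirrors one entry of `terraform output -json`.
// Only the value is needed here; other fields are ignored.
type outputValue struct {
	Value json.RawMessage `json:"value"`
}

// ParseStringOutputs decodes the JSON and keeps string-valued outputs.
func ParseStringOutputs(data []byte) (map[string]string, error) {
	var raw map[string]outputValue
	if err := json.Unmarshal(data, &raw); err != nil {
		return nil, fmt.Errorf("decoding terraform outputs: %w", err)
	}
	out := make(map[string]string, len(raw))
	for name, o := range raw {
		var s string
		if err := json.Unmarshal(o.Value, &s); err != nil {
			continue // not a string (list, map, number); skipped in this sketch
		}
		out[name] = s
	}
	return out, nil
}

func main() {
	// Sample payload with hypothetical output names.
	sample := []byte(`{
	  "resource_group_name": {"sensitive": false, "type": "string", "value": "demo-rg"},
	  "address_space": {"sensitive": false, "type": ["list", "string"], "value": ["10.20.0.0/16"]}
	}`)
	outputs, err := ParseStringOutputs(sample)
	if err != nil {
		panic(err)
	}
	fmt.Println(outputs["resource_group_name"])
}
```

The values returned here are what you would then hand to the cloud SDK to look up the deployed resources.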

Depending on the cloud provider, it may be necessary to authenticate the user account. In Azure, the easiest way is to install the `az` CLI tool and log in with the command:

$ az login

Make sure the subscription you’ll use is a non-production one. On Azure, you can set the default subscription with the following command:

$ az account set --subscription "My-test-subscription"

In the `test` folder, install the Go dependencies:

# -t is for installing test dependencies; -v is for verbose output
$ go get -t -v ./...

Finally, run the tests with:

$ go test -timeout 30m -parallel 10

I’ve hosted the complete example here.

Conclusion

As with business code, writing testable infrastructure leads to more straightforward and modular code. Good tests prevent a broader class of infrastructure errors and give the infrastructure team (and their stakeholders) more confidence when deploying IaC.

Unfortunately, testable infrastructure is not without disadvantages. Adding a testing methodology means the infrastructure team now has to know an additional programming language on top of Terraform. Testable infrastructure is also harder to change and takes longer to develop. In the example from the previous section, adding tests doubled the number of lines of code.

I believe, in most situations, the tradeoff leans towards having at least a few tests. If your team wants to embrace testable IaC, make sure you discuss the choice with the stakeholders and stress the need to balance the speed of development and quality-of-service.

Happy coding!

Founder and Software Engineer at Unicoeding. I help organizations with their cloud and data infrastructure. You can find us at www.unicoeding.com.
