Testing Framework in Terraform 1.6: A deep-dive
In my previous blog post A Comprehensive Guide to Testing in Terraform: Keep your tests, validations, checks, and policies in order I went through all the options for testing and validation that are available to you when you write your Terraform configurations and modules. We saw check blocks, pre-conditions and post-conditions related to a resource's lifecycle, custom conditions for input variables and output values, and more. The latest and greatest topic I covered was the testing framework that arrived in Terraform 1.6.
In this post I want to focus more on the testing framework to uncover the possibilities that it brings.
Background on testing with Terraform
If this is the first time you have heard about the new testing framework for Terraform, here is a short introduction to what it is.
Terraform 1.6 brought a new testing framework to general availability after it had been available as an experimental feature for a period of time. The release notes for version 1.6 listed a single new feature [1]:
terraform test: The terraform test command is now generally available. This comes with a significant change to how tests are written and executed, based on feedback from the experimental phase.
Terraform tests are written in .tftest.hcl files, containing a series of run blocks. Each run block executes a Terraform plan and optional apply against the Terraform configuration under test and can check conditions against the resulting plan and state.
What we learn from the release notes is that we can now write tests for our Terraform configurations by including one or more .tftest.hcl files that each contain a series of run blocks. We also learn that a run block runs a terraform plan or terraform apply command. This means that these tests could create real infrastructure from your configuration. For me this is a good thing. I strongly believe that when it comes to testing infrastructure-as-code, there is no way to be sure it will work unless you actually deploy it for real. Why is that? There are just too many things that could go wrong. There might be hidden dependencies that you are unaware of until you actually try to create new infrastructure.
What is not clear from the release notes is who this testing framework is intended for. Is it meant for all Terraform users? Should you immediately jump on the TDD-with-Terraform train and start writing tests for all Terraform configuration? This is not the case. At least this is not the primary intended case. The testing framework is designed for module producers.
Are you in charge of creating infrastructure modules internally for your organization or publicly for the Terraform community? Then you are a module producer and this testing framework is for you.
Are you consuming modules in order to create the infrastructure for your application? Then this testing framework is not primarily intended for you. However, there is nothing stopping you from using it if you think it makes sense for your situation.
Module producers write code that other users will consume. Users of your modules depend on the contract that your module exposes. What is part of the module contract? Generally this includes the following items:
- The input variables your module expects.
- The output values your module produces.
- Any externally visible resources your module creates. This could include configuration files, application gateways in Azure, network loadbalancers in AWS, a Cloud Bigtable table in GCP, or it could be pretty much anything else.
That last point might make you wonder if there are any resources created in a module that are not part of the contract. There definitely could be. Some resources could be internal implementation details that are required in order to construct the rest of the infrastructure. If a resource could be swapped out for a different resource without module consumers noticing, then it is an internal implementation detail and not part of the contract.
If you are a module producer and you make an update of your module where you unintentionally make a significant change to the contract your module exposes, then this mistake could end up causing trouble for your module consumers.
This is exactly the reasoning behind any other kind of test in software development. Testing Terraform modules is no different!
One last point to make about the new testing framework is that you write your tests in HashiCorp Configuration Language (HCL). This means there is no need to learn a new language in order to write tests for your Terraform code. There is no need to install an additional test tool that you need to keep track of and update and scan for vulnerabilities and so on. There is no need to mix your Terraform configuration with a bunch of test-related framework files. Run your tests and deploy your infrastructure using one and the same Terraform binary.
With all that background out of the way, let's move on to seeing all the nitty-gritty details of what this testing framework can do.
Nomenclature
Sometimes I need to remind myself of the nomenclature of the Terraform HCL code. To make sure we are all on the same page I introduce the nomenclature I use here:
- A block is a container for other content. A block has a block type and zero or more labels. The generic block looks like this:
<BLOCK TYPE> "<BLOCK LABEL 1>" "<BLOCK LABEL 2>" ... "<BLOCK LABEL N>" {
  # block content
}
In Terraform the number of labels is zero, one, or two. A block can contain other blocks. Some common block types are terraform, provider, resource, data, and run.
- An expression represents a literal value such as a string or a number, or it references other values such as the name of a resource. An expression could also be more complex, consisting of function calls, string interpolations, and references. Some examples of expressions:
"this is a string value"
12345
azurerm_storage_account.my_account.name
"rg-${var.name}-${var.location}"
- An argument is the assignment of a value (from an expression) to a name. Arguments appear inside of blocks. Some examples of arguments:
resource_group_name = "rg-my-resource-group"
type = list(string)
count = length(var.number_of_regions)
Even if the details of HCL are familiar to you, the nomenclature might not be.
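To tie the three terms together, here is a small annotated sketch. The resource label example and the timeouts value are chosen purely for illustration, and var.name and var.location are assumed to be declared elsewhere in the configuration.

```hcl
# Block of type "resource" with two labels:
# "azurerm_resource_group" and "example" (a hypothetical name).
resource "azurerm_resource_group" "example" {
  # Argument: the name "name" is assigned the value of an expression,
  # here a string with interpolated references.
  name = "rg-${var.name}-${var.location}"

  # Argument whose expression is a plain reference to an input variable.
  location = var.location

  # Nested block of type "timeouts" with zero labels.
  timeouts {
    create = "30m"
  }
}
```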
Test framework patterns
The run block is what executes a given test when we run terraform test. This block is central to the testing framework, so this is a block you need to become familiar with.
In the following subsections I will go through three testing patterns [2] that you might see in Terraform.
Before we look at the patterns let's briefly look at a typical directory structure for Terraform tests.
First of all, remember that your test files should use the .tftest.hcl file ending. If not, the terraform test command will not execute the run blocks for your tests.
When you execute terraform test you should be located in the module root directory. This is the directory where your primary main.tf file exists. The Terraform binary will look for your test files in the root directory or in a directory named tests. I recommend you place your tests in the tests directory, and do not place test files in the module root directory.
A typical directory structure for a simple module with tests is this:
$ tree .
.
├── main.tf
├── outputs.tf
├── providers.tf
├── tests
│   └── main.tftest.hcl
└── variables.tf
If you place your test files somewhere else you need to add -test-directory=path/to/tests to the terraform test command. But once again I recommend that you keep your test files in the tests directory to avoid confusing future contributors to your module.
How many test files should you have? The simple answer is: it depends! If you are building a large module with many moving parts you will probably need several test files, each covering a coherent part of your module. If your module has a small interface (variables and outputs) a single test file might suffice. Use common sense here: if it feels like a file has too many tests then it probably does.
Pattern 1: Assertions
The first pattern is simple: it consists of a run block with a nested assert block:
run "file_contents_should_be_valid_json" {
  command = plan
  assert {
    condition     = try(jsondecode(local_file.config.content), null) != null
    error_message = "The file is not valid JSON"
  }
}
Let's break down this run block:
- The run block has one label. This label is the name of the test. You should give your test a self-explanatory name that describes what it does. If the test fails you should immediately know why it failed. In this example the name is file_contents_should_be_valid_json. If this test fails I know that the contents of a file were not valid JSON.
- This run block executes a terraform plan command. You specify what command you would like the test to execute in the command = <command> argument. If you leave this out it will default to executing an apply command. Personally I think it is a good idea to always add the command = <command> argument to be explicit about what the test does.
- The run block can contain zero or more nested assert blocks. Each assert block has a condition = <expression> argument where <expression> should evaluate to true or false to indicate if the test passes (true) or fails (false). If <expression> evaluates to false then the expression in error_message = <expression> will be displayed in the terminal (or in Terraform Cloud). In this case the error message is The file is not valid JSON.
Although this example showed a single run block containing a single assert block, remember that you could include multiple run blocks, each containing multiple assert blocks.
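As a minimal sketch of several assert blocks in one run block, the following extends the earlier example. It assumes the same local_file.config resource. The test name and the second assertion, which checks that the file name ends in .json, are hypothetical additions for illustration only.

```hcl
run "config_file_should_be_valid" {
  command = plan

  # First assertion: the content must parse as JSON (same check as before).
  assert {
    condition     = try(jsondecode(local_file.config.content), null) != null
    error_message = "The file is not valid JSON"
  }

  # Second assertion (hypothetical): the file name must end in .json.
  assert {
    condition     = endswith(local_file.config.filename, ".json")
    error_message = "The file name does not end with .json"
  }
}
```

Each assert block is reported with its own error message, so a failing test points at the specific condition that was violated.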
Pattern 2: Expecting failures
The second pattern concerns tests where we expect the plan or apply operation to fail, and we want the test to report success if it does. This is a common testing strategy. The following run block has a nested variables block and an expect_failures = [ ... ] argument:
run "bad_input_url_should_stop_deployment" {
  command = plan
  variables {
    config_url = "http://example.com"
  }
  expect_failures = [
    var.config_url
  ]
}
There are some new things to look at in this run block so let's break it down:
- The variables block allows you to provide input values for any variables that your module expects. In this case the variables block is provided as a nested block in the run block, but it could also be provided as a standalone block outside of any run blocks. In that case the values would apply to all run blocks in the entire file. If you include a standalone variables block you can still include a nested variables block inside of a run block to override the global values.
- The expect_failures = [ ... ] argument specifies that we expect the operation in this test to fail, and we list the expected sources of failure in the array expression of the argument. In this particular example I say that I expect a failure due to the variable named config_url. This basically means that I validate the value provided for the config_url variable in my Terraform module, and the value provided in this test (http://example.com) should result in a failing validation. If the plan can proceed as normal without any failures, then this test would fail.
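For context, the module under test would need a validation rule on config_url for this test to pass. The declaration below is only a sketch of what such a rule could look like. The HTTPS-only condition and the error message are assumptions, since the actual variable declaration is not shown here.

```hcl
# Hypothetical declaration in the module's variables.tf
variable "config_url" {
  type = string

  validation {
    # Assumed rule: only HTTPS URLs are accepted.
    condition     = startswith(var.config_url, "https://")
    error_message = "Please provide a config_url that uses HTTPS."
  }
}
```

With a rule like this, the value http://example.com fails validation during the plan, which is exactly the failure the test expects.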
It is worth spending some time discussing expect_failures. The values in this array must be checkable objects with associated custom conditions. In my previous article I wrote a lot about custom conditions. Objects that can include custom conditions are variables, outputs, resources, data sources, and check blocks.
An important point about these custom conditions is that all of them except for the check block will actually cause Terraform to halt the execution of a plan or apply operation. What does this mean for your tests? It means that if you want to combine expect_failures with assert blocks you have to be careful in how you construct your module and your corresponding tests. If you include a variable in the expect_failures array of values and at the same time have an assert block that expects a plan to finish, then the assert block would never even be evaluated because the custom condition for the variable would halt the execution of the plan.
For this reason I suggest that each test uses either one or more assert blocks or the expect_failures = [ ... ] argument, but not both unless you really know what you are doing.
Note that the array value of expect_failures could contain multiple values. But you most likely would not want to mix different types of checkable objects in this array, for the reason discussed above.
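The one combination that should be safe is a check block together with assert blocks, since a failing check does not halt the plan. The following is a sketch under that assumption. The check block name, the test name, and the idea of expressing the URL rule as a check block instead of a variable validation are all hypothetical, and local_file.config is borrowed from the first pattern.

```hcl
# Hypothetical check block in the module under test (main.tf)
check "config_url_uses_https" {
  assert {
    condition     = startswith(var.config_url, "https://")
    error_message = "The config_url value does not use HTTPS."
  }
}

# Hypothetical test in tests/main.tftest.hcl
run "http_url_should_fail_check_only" {
  command = plan

  variables {
    config_url = "http://example.com"
  }

  # The check is expected to fail, but it does not stop the plan.
  expect_failures = [
    check.config_url_uses_https
  ]

  # Because the plan completes, this assertion is still evaluated.
  assert {
    condition     = try(jsondecode(local_file.config.content), null) != null
    error_message = "The file is not valid JSON"
  }
}
```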
Pattern 3: Using helper modules
Sometimes it is necessary to create supporting infrastructure before you can test your module. This could be the case if your module creates resources in Azure and it expects that you use an existing resource group for all the resources. In order to test a module like that there must be an existing resource group you can use. A simple solution to this is to create a resource group up front and just let it sit there in your cloud environment for as long as required. A better solution is to create the resource group when you launch the terraform test command.
To illustrate what this looks like we have the following directory structure:
$ tree .
.
├── main.tf
├── outputs.tf
├── testing
│   └── setup
│       └── main.tf
├── tests
│   └── main.tftest.hcl
└── variables.tf
4 directories, 5 files
I have created a testing directory that contains a setup directory with a main.tf file. The contents of this file are:
// testing/setup/main.tf
variable "resource_group_name" {
  type = string
}
variable "location" {
  type = string
}
resource "azurerm_resource_group" "rg" {
  name     = var.resource_group_name
  location = var.location
}
It is a simple file that uses the azurerm provider to create a resource group. The module under test is:
// main.tf
terraform {
  required_providers {
    azurerm = {
      source  = "hashicorp/azurerm"
      version = ">= 3.0.0"
    }
  }
}
locals {
  resource_name_suffix = "${var.name_suffix}-${var.location}"
}
resource "azurerm_service_plan" "plan" {
  name                = "plan-${local.resource_name_suffix}"
  resource_group_name = var.resource_group_name
  location            = var.location
  os_type             = "Linux"
  sku_name            = var.appservice_plan_sku
}
resource "azurerm_linux_web_app" "app" {
  name                = "app-${local.resource_name_suffix}"
  service_plan_id     = azurerm_service_plan.plan.id
  resource_group_name = var.resource_group_name
  location            = var.location
  https_only          = true
  site_config {
    always_on           = contains(["Free", "F1", "D1"], var.appservice_plan_sku) ? false : true
    http2_enabled       = true
    minimum_tls_version = "1.2"
  }
}
This module creates two resources: an App Service plan and a Linux Web App. The variables.tf file has the following content:
// variables.tf
variable "name_suffix" {
  type = string
}
variable "resource_group_name" {
  type = string
}
variable "location" {
  type = string
}
variable "appservice_plan_sku" {
  type = string
  validation {
    condition = contains([
      "B1", "B2", "B3", "D1", "F1", "S1", "S2", "S3"
    ], var.appservice_plan_sku)
    error_message = "Please provide a valid App Service Plan SKU"
  }
}
How do we create the resource group before we run our tests? The test file main.tftest.hcl looks like this:
// tests/main.tftest.hcl
provider "azurerm" {
  features {}
}
variables {
  resource_group_name = "rg-app-service-test"
  location            = "swedencentral"
  appservice_plan_sku = "F1"
  name_suffix         = "apptest"
}
run "setup" {
  module {
    source = "./testing/setup"
  }
}
run "http_should_not_be_allowed" {
  command = plan
  assert {
    condition     = azurerm_linux_web_app.app.https_only == true
    error_message = "Web App accepts HTTP traffic"
  }
}
run "confirm_always_on" {
  command = plan
  variables {
    name_suffix         = "testingalwayson"
    appservice_plan_sku = "S1"
  }
  assert {
    condition     = azurerm_linux_web_app.app.site_config[0].always_on == true
    error_message = "Always-On is off for S1 SKU"
  }
}
There are a few new things to look at in this test file. Let's break it down.
First of all we configure the azurerm provider:
provider "azurerm" {
  features {}
}
This allows you to configure the provider in any way that fits your tests. In this case I use default settings (an empty features block is required). Note that you could also configure the provider to use a separate test subscription instead of any other default subscription you have configured.
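As a sketch of pointing the tests at a dedicated subscription, the provider block in the test file could set subscription_id explicitly. The all-zero ID below is a placeholder, not a real subscription.

```hcl
provider "azurerm" {
  features {}

  # Placeholder value: replace with the ID of a dedicated test subscription.
  subscription_id = "00000000-0000-0000-0000-000000000000"
}
```

Keeping test resources in their own subscription makes it easier to spot and remove anything a failed test run leaves behind.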
The next piece in the test file defines global variables:
variables {
  resource_group_name = "rg-app-service-test"
  location            = "swedencentral"
  appservice_plan_sku = "F1"
  name_suffix         = "apptest"
}
These variables will be used for the setup module and all the following tests, unless the tests override these values.
Next we have the setup module:
run "setup" {
  module {
    source = "./testing/setup"
  }
}
A setup (or helper) module is created in its own run block. I set the label of this block to setup, but you can set it to whatever fits your context. The run block only contains a nested module block that specifies the source of the module to be my module located in the testing/setup directory. This run block is the first run block in the test file, so it will be run first (they are run in order). If I place the setup run block somewhere else in the file then the tests defined above the setup block would fail because the resource group would not exist.
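Because run blocks execute in order, a later run block can also read the outputs of an earlier one using a run.<name>.<output> reference. The sketch below assumes that an output named resource_group_name is added to the setup module. That output is hypothetical and not part of the setup module shown earlier.

```hcl
// testing/setup/main.tf (hypothetical addition)
output "resource_group_name" {
  value = azurerm_resource_group.rg.name
}

// tests/main.tftest.hcl
run "setup" {
  module {
    source = "./testing/setup"
  }
}

run "http_should_not_be_allowed" {
  command = plan

  variables {
    # Use the name reported by the earlier "setup" run block
    # instead of repeating the literal value.
    resource_group_name = run.setup.resource_group_name
  }

  assert {
    condition     = azurerm_linux_web_app.app.https_only == true
    error_message = "Web App accepts HTTP traffic"
  }
}
```

Passing the value this way keeps the test tied to what the setup module actually created, which helps if the helper module generates or modifies the name.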
The rest of the file contains two tests in two separate run blocks. The first block is similar to what we have seen before, but notice that we have a nested variables block in the other run block:
run "confirm_always_on" {
  command = plan
  variables {
    name_suffix         = "testingalwayson"
    appservice_plan_sku = "S1"
  }
  assert {
    condition     = azurerm_linux_web_app.app.site_config[0].always_on == true
    error_message = "Always-On is off for S1 SKU"
  }
}
This means that for this test we override the name_suffix and appservice_plan_sku variables.
I can run the tests with terraform test:
$ terraform test
tests/main.tftest.hcl... in progress
  run "setup"... pass
  run "http_should_not_be_allowed"... pass
  run "confirm_always_on"... pass
tests/main.tftest.hcl... tearing down
tests/main.tftest.hcl... pass
Success! 3 passed, 0 failed.
Notice that the output says 3 passed even though we only really had two tests. How come? This is because the setup module runs inside a run block, so it is considered to be a test from Terraform's point of view. I think this is a bit unfortunate, but for now we'll have to live with it.
My tests in this case used command = plan, so they are relatively fast to run. When you use command = apply you have to prepare for a potentially long test run, depending on what resources your module creates. If you have a module that creates multiple Kubernetes clusters and installs various components in these clusters, an apply could take some time, especially if you run multiple independent tests.
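To illustrate when command = apply is worth the wait, here is a sketch of a test that asserts on a value that is only known after the resources exist. The test name is hypothetical. It relies on the default_hostname attribute that the azurerm_linux_web_app resource exports.

```hcl
run "hostname_should_be_assigned" {
  # This test creates real resources, so expect a longer run.
  command = apply

  assert {
    # default_hostname is only known once Azure has created the Web App.
    condition     = azurerm_linux_web_app.app.default_hostname != ""
    error_message = "The Web App was not assigned a default hostname"
  }
}
```

A plan-only test cannot check values like this, because they are still unknown at plan time.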
Terraform test state file and resource destruction
How does Terraform know what resources it should remove and in what order it should do it? If you are familiar with Terraform you know that there is usually a state file somewhere. When you run tests Terraform keeps the state files in memory, so you won't see any state files appearing in your module directory.
Terraform creates one state file for the main configuration under test, and one state file for each alternate module that you create through a run block. An example of an alternate module is the setup module we saw in an example above.
The state files are created in the order of the tests, and they are destroyed in the reverse order. An illustrative sample of what state files are created, updated, and destroyed and in what order, is shown below:
// first alternate module call
// creates state file for ./modules/setup-1
run "setup-1" {
  module {
    source = "./modules/setup-1"
  }
}
// second alternate module call
// creates state file for ./modules/setup-2
run "setup-2" {
  module {
    source = "./modules/setup-2"
  }
}
// test the main configuration
// creates the main state file for the module's main.tf
run "test-1" {
  assert {
    ...
  }
}
// third alternate module call, once again to the setup-2 module
// updates the state file for ./modules/setup-2
run "setup-2-again" {
  module {
    source = "./modules/setup-2"
  }
}
// second test for the main configuration
// updates the main state file for the module's main.tf
run "test-2" {
  assert {
    ...
  }
}
// After everything is run clean-up starts:
// 1. The module's main.tf state file is destroyed
// 2. The alternate modules' state files are destroyed in
//    reverse order from how they were created
//    - first the modules/setup-2 state file is destroyed
//    - then the modules/setup-1 state file is destroyed
A question that came to my mind when I first heard about the test framework was: what happens if the destruction of resources fails after a test? Let's see what happens!
I will run a test where an Azure App Service is created. I will use the setup module from before where I created a resource group. To make the test fail I will issue the following Azure CLI command in order to lock the resource group so that Terraform can't destroy it:
$ az lock create \
    --name failure \
    --resource-group rg-app-service-test \
    --lock-type ReadOnly
The test output is the following:
$ terraform test
tests/main.tftest.hcl... in progress
  run "setup"... pass
.
. (output truncated)
.
Terraform left the following resources in state after executing
tests/main.tftest.hcl/http_should_not_be_allowed, and they need to
be cleaned up manually:
  - azurerm_linux_web_app.app
  - azurerm_service_plan.plan
There we go!
We are instructed that a number of resources need to be cleaned up manually. In Azure this is usually relatively simple if you have put all resources in the same resource group. However, if you are working with AWS you might be in for a tedious cleanup process if your module created a lot of resources!
I can see that this behavior could be an issue during development of your module and tests where you are not sure if everything works as intended. You will most likely end up with a few failing test cleanups.
Exploring the test command in the CLI
To cover everything we can about the test framework let's see what else we can do with the terraform test command:
$ terraform test -h
Usage: terraform [global options] test [options]
[ background description truncated for brevity ... ]
Options:
  -cloud-run=source     If specified, Terraform will execute this test run
                        remotely using Terraform Cloud. You must specify the
                        source of a module registered in a private module
                        registry as the argument to this flag. This allows
                        Terraform to associate the cloud run with the correct
                        Terraform Cloud module and organization.
  -filter=testfile      If specified, Terraform will only execute the test files
                        specified by this flag. You can use this option multiple
                        times to execute more than one test file.
  -json                 If specified, machine readable output will be printed in
                        JSON format
  -no-color             If specified, output won't contain any color.
  -test-directory=path  Set the Terraform test directory, defaults to "tests".
  -var 'foo=bar'        Set a value for one of the input variables in the root
                        module of the configuration. Use this option more than
                        once to set more than one variable.
  -var-file=filename    Load variable values from the given file, in addition
                        to the default files terraform.tfvars and *.auto.tfvars.
                        Use this option more than once to include more than one
                        variables file.
  -verbose              Print the plan or state for each test run block as it
                        executes.
There are a few interesting flags we can use. I want to highlight a few:
- -cloud-run=source is useful if you have your module in a private registry in Terraform Cloud, and you want to trigger a test run in the cloud. I will cover testing in Terraform Cloud in a future post.
- -filter is useful if you have a lot of test files and you only want one or a few of the files to run. This is especially useful if you are testing a large module where your tests execute apply operations that take a long time.
- -test-directory is useful if you place your test files somewhere other than in the tests directory. But as I mentioned earlier in this article you should probably stick to using the tests directory.
Summary
In this post we have looked at most of what the new testing framework in Terraform 1.6 has to offer, but not all of it. There is more to say about the testing framework when it comes to Terraform Cloud. In a future post I will cover how we run tests in Terraform Cloud and some of the unique features that are available there.
The example patterns in this post have intentionally been left relatively simple. In reality creating good tests for your modules will require a lot of work. My purpose with this post has been to illustrate what we can do, what syntax is available, and a few of the behaviors we can expect from this framework.
I expect that there will be additional features added as HashiCorp receives feedback from users of this framework. We live in exciting times!
[1] Apart from this single new feature there were enhancements and bug fixes included as well.
[2] I call them patterns here to use a familiar nomenclature. As with all patterns you will most likely not see them isolated in the real world. All patterns I present are most likely mixed and matched for real Terraform modules. The idea with patterns here is to introduce the testing framework piece by piece.