Test permutations with Terraform and GitHub Actions
I have been exploring the new test framework for Terraform 1.6 extensively since HashiConf in October this year. I have already written two very long posts on the topic of testing and validation with Terraform that you can read here:
- A Comprehensive Guide to Testing in Terraform: Keep your tests, validations, checks, and policies in order
- Testing Framework in Terraform 1.6: A deep-dive
In this post I want to illustrate a pattern for scaling up your testing using GitHub Actions.
Scenario
You are part of a platform team developing a Terraform module that sets up an Azure storage account according to a specification appropriate for your organization.
Your module has a dependency on a module created by a different team in your organization. That module sets up the Azure resource group in which the storage account your module produces is placed.
The source code for your module is stored in a GitHub repository and you want to use GitHub Actions to perform testing before you publish new versions of your module.
The scenario is illustrated in the figure below:
You expect that new versions of your module should be compatible with the three most recent minor versions of the other team's resource group module that you depend on. You also expect that your module works as intended for each of the Azure regions that your organization is operating in. These regions are swedencentral, northeurope, and westeurope.
Your team has already figured out that there will be many tests to write to cover every combination of the above criteria. To be precise, each individual test you write will need to be repeated 3 x 3 = 9 times (three locations, three versions).
The Terraform module your team is developing consists of a single main.tf file (to keep this scenario simple):
// main.tf
terraform {
  required_providers {
    azurerm = {
      source  = "hashicorp/azurerm"
      version = "3.80.0"
    }
    random = {
      source  = "hashicorp/random"
      version = "3.5.1"
    }
  }
}

provider "azurerm" {
  features {}
}

variable "resource_group_name" {
  type = string
}

resource "random_id" "this" {
  keepers = {
    resource_group_name = var.resource_group_name
  }
  byte_length = 6
}

data "azurerm_resource_group" "this" {
  name = var.resource_group_name
}

resource "azurerm_storage_account" "this" {
  name                      = "st${random_id.this.dec}"
  access_tier               = "Hot"
  account_kind              = "StorageV2"
  account_replication_type  = "LRS"
  account_tier              = "Standard"
  resource_group_name       = data.azurerm_resource_group.this.name
  location                  = data.azurerm_resource_group.this.location
  enable_https_traffic_only = true
  tags                      = data.azurerm_resource_group.this.tags
}
The important part to notice in this module is that it takes an input variable named resource_group_name:
variable "resource_group_name" {
  type = string
}
This variable is used in a data source for a resource group:
data "azurerm_resource_group" "this" {
  name = var.resource_group_name
}
And this data source is later referenced in arguments of the storage account resource:
resource "azurerm_storage_account" "this" {
  // ...
  resource_group_name = data.azurerm_resource_group.this.name
  location            = data.azurerm_resource_group.this.location
  tags                = data.azurerm_resource_group.this.tags
}
For illustrative purposes your team is interested in running the following test as defined in tests/main.tftest.hcl:
// tests/main.tftest.hcl
run "setup" {
  variables {
    location    = "swedencentral"
    name_suffix = "tftest-swedencentral-1.1.0"
  }

  module {
    source  = "app.terraform.io/your-tf-org/resource-group-module/azurerm"
    version = "1.1.0"
  }
}

run "proper_tags_should_be_propagated" {
  variables {
    resource_group_name = run.setup.resource_group.name
  }

  command = apply

  assert {
    condition = alltrue([
      contains(keys(azurerm_storage_account.this.tags), "source"),
      contains(keys(azurerm_storage_account.this.tags), "module")
    ])
    error_message = "Proper tags are not propagated to the storage account"
  }
}
There are two run blocks in this test file. The first run block, named setup, uses the module you depend on. In this case it specifically uses version 1.1.0 of the module. This run block defines a location variable that is set to swedencentral. This covers one of the nine cases we want to test.
The second run block is our actual test. It makes sure that appropriate tags are propagated to the storage account from the resource group. Specifically we require that two tags are set, source and module.
Now we turn to the solution to the issue of how to test all permutations of versions and locations.
Solution
There are a number of options for how to solve the permutation of tests you need to run in this scenario. In this post I present a simple solution using a strategy in GitHub Actions.
Before we look at the GitHub Actions workflow, let's look at a modified version of our test file, this one stored in templates/main.tftest.hcl.tpl:
// templates/main.tftest.hcl.tpl
run "setup" {
  variables {
    location    = "{{LOCATION}}"
    name_suffix = "tftest-{{LOCATION}}-{{VERSION}}"
  }

  module {
    source  = "app.terraform.io/your-tf-org/resource-group-module/azurerm"
    version = "{{VERSION}}"
  }
}

run "proper_tags_should_be_propagated" {
  variables {
    resource_group_name = run.setup.resource_group.name
  }

  command = apply

  assert {
    condition = alltrue([
      contains(keys(azurerm_storage_account.this.tags), "source"),
      contains(keys(azurerm_storage_account.this.tags), "module")
    ])
    error_message = "Proper tags are not propagated to the storage account"
  }
}
The only difference from the test file shown before is that each explicit version is replaced by the placeholder {{VERSION}}, and each explicit location is replaced by {{LOCATION}}.
The GitHub Actions workflow is located in the file .github/workflows/tftest.yaml in our repo. We start building our workflow like this:
on: workflow_dispatch

permissions:
  id-token: write
  contents: read
To start with, we only want to trigger the workflow manually, which is why we use on: workflow_dispatch as the trigger. We add a few permissions that are required for the Azure login action, where we use a federated identity in Azure (see the documentation for details on how to set this up).
Next we start defining our job:
jobs:
  test:
    runs-on: ubuntu-latest
    strategy:
      fail-fast: false
      matrix:
        version: ["1.1.0", "1.2.0", "1.3.0"]
        location: ["swedencentral", "westeurope", "northeurope"]
We have a single job named test. We will run the job on Ubuntu, using the latest version available. Next we define the strategy with the following configurations:
- fail-fast: false is required so that the remaining tests are not cancelled if a single test fails. The default value for fail-fast is true.
- matrix is used to configure the settings that we want to vary between tests. In this case we vary version and location. There will be one run for each combination of version and location, for a total of nine runs.
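To make the expansion concrete, here is a minimal shell sketch that walks the same two lists and prints one line per combination. It only illustrates the counting; it is not how GitHub Actions expands the matrix internally, and the helper name list_matrix_jobs is an assumption made for this example.

```shell
# Expand the two matrix lists into every (version, location) pair.
# list_matrix_jobs is a hypothetical helper name, not part of GitHub Actions.
versions="1.1.0 1.2.0 1.3.0"
locations="swedencentral westeurope northeurope"

list_matrix_jobs() {
  for version in $versions; do
    for location in $locations; do
      printf 'version=%s location=%s\n' "$version" "$location"
    done
  done
}

# Prints nine lines, one per job the matrix would start.
list_matrix_jobs
```

Adding a fourth version or location to either list grows the number of runs multiplicatively, which is worth keeping in mind before extending the matrix.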
The last part of the workflow consists of the steps we want to run for each test:
    steps:
      - uses: azure/login@v1
        with:
          client-id: ${{ secrets.AZURE_CLIENT_ID }}
          tenant-id: ${{ secrets.AZURE_TENANT_ID }}
          subscription-id: ${{ secrets.AZURE_SUBSCRIPTION_ID }}
      - uses: actions/checkout@v4
      - run: |
          sed 's/{{VERSION}}/${{ matrix.version }}/g; s/{{LOCATION}}/${{ matrix.location }}/g' \
            templates/main.tftest.hcl.tpl > tests/main.tftest.hcl
      - uses: hashicorp/setup-terraform@v2
        with:
          terraform_wrapper: false
      - run: terraform init
        env:
          TF_TOKEN_app_terraform_io: ${{ secrets.TF_TOKEN }}
      - run: terraform test
The first step uses the azure/login@v1 action to sign in to Azure. This is required because we will be creating resources in Azure using Terraform (remember: the Terraform test framework runs actual plan and apply operations, creating actual resources!).
The second step of the workflow uses the actions/checkout@v4 action to check out the source code. If you come from an Azure DevOps background you might be surprised that you need to explicitly add this step. I prefer that GitHub Actions requires you to add this step if you intend to do something with the source code in the repository; I find the behind-the-scenes checkout in Azure DevOps confusing.
The third step requires some explanation:
- run: |
    sed 's/{{VERSION}}/${{ matrix.version }}/g; s/{{LOCATION}}/${{ matrix.location }}/g' \
      templates/main.tftest.hcl.tpl > tests/main.tftest.hcl
Here we use sed to:
- Find each occurrence of {{VERSION}} in the file templates/main.tftest.hcl.tpl and replace it with ${{ matrix.version }}, which in turn is replaced by GitHub Actions with a value from version in the matrix we defined above.
- Find each occurrence of {{LOCATION}} in the file templates/main.tftest.hcl.tpl and replace it with ${{ matrix.location }}, which in turn is replaced by GitHub Actions with a value from location in the matrix we defined above.
The result of these search-and-replace operations is stored in tests/main.tftest.hcl.
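If you want to try the substitution outside of GitHub Actions, the following shell sketch renders a template for a single combination. It assumes the values are passed as arguments instead of coming from ${{ matrix.* }}; the helper name render_template and the cut-down template are made up for this example.

```shell
# Render a test template for one combination, outside of GitHub Actions.
# render_template is a hypothetical helper. The values must not contain
# characters that are special to sed ("/", "&" or "\").
render_template() {
  # $1 = module version, $2 = Azure location, $3 = path to the template
  sed "s/{{VERSION}}/$1/g; s/{{LOCATION}}/$2/g" "$3"
}

# A cut-down stand-in for templates/main.tftest.hcl.tpl so the sketch is self-contained.
template="$(mktemp)"
cat > "$template" <<'EOF'
run "setup" {
  variables {
    location    = "{{LOCATION}}"
    name_suffix = "tftest-{{LOCATION}}-{{VERSION}}"
  }
}
EOF

# Write the rendered test file to standard output.
render_template "1.2.0" "westeurope" "$template"
rm -f "$template"
```

In the real repository you would point the third argument at templates/main.tftest.hcl.tpl and redirect the output to tests/main.tftest.hcl, as the workflow step does.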
Here we opted for a simple sed search-and-replace, which is fine when there are only a few variables to replace. If we needed to replace many more than two variables we would probably look into using a templating tool instead. Use common sense here: if sed works for your purposes then there is no need to involve any other tool.
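One thing a plain search-and-replace will not tell you is whether a placeholder was left behind, for example after adding a new placeholder to the template but not to the sed expression. A possible safeguard, sketched below under the assumption that placeholders always look like {{UPPER_CASE}}, is to search the rendered file for leftovers. The helper name has_unrendered_placeholders is hypothetical.

```shell
# Detect a {{PLACEHOLDER}} that was not replaced in a rendered file.
# has_unrendered_placeholders is a hypothetical helper name.
has_unrendered_placeholders() {
  # $1 = path to the rendered file; exit status 0 means a placeholder is left
  grep -q '{{[A-Z_][A-Z_]*}}' "$1"
}

# Example input in which {{VERSION}} was forgotten.
rendered="$(mktemp)"
printf 'location = "swedencentral"\nversion = "{{VERSION}}"\n' > "$rendered"

if has_unrendered_placeholders "$rendered"; then
  echo "unrendered placeholder found"
fi
rm -f "$rendered"
```

A check like this could run as an extra workflow step before terraform init, so that a templating mistake fails quickly instead of surfacing later as a Terraform error.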
The last few steps set up the Terraform CLI, initialize Terraform, and finally run the tests:
- uses: hashicorp/setup-terraform@v2
  with:
    terraform_wrapper: false
- run: terraform init
  env:
    TF_TOKEN_app_terraform_io: ${{ secrets.TF_TOKEN }}
- run: terraform test
The terraform init step requires an environment variable TF_TOKEN_app_terraform_io with the value of a Terraform Cloud token. This is because the module we are using is located in a private Terraform registry (see the documentation for details on this).
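The variable name is derived from the registry hostname: Terraform looks for TF_TOKEN_ followed by the hostname with its periods replaced by underscores. The small shell sketch below shows that derivation. The helper name tf_token_var_name is made up for this example, and it deliberately covers only the period rule, which is all that app.terraform.io needs.

```shell
# Derive the credentials variable name from a registry hostname.
# tf_token_var_name is a hypothetical helper; it only applies the
# period-to-underscore rule, which is enough for app.terraform.io.
tf_token_var_name() {
  printf 'TF_TOKEN_%s\n' "$(printf '%s' "$1" | tr '.' '_')"
}

tf_token_var_name "app.terraform.io"
```

If your private registry lives on another hostname, the name of the environment variable in the workflow has to change accordingly.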
The full workflow for reference:
on: workflow_dispatch

permissions:
  id-token: write
  contents: read

jobs:
  test:
    strategy:
      fail-fast: false
      matrix:
        version: ["1.1.0", "1.2.0", "1.3.0"]
        location: ["swedencentral", "westeurope", "northeurope"]
    runs-on: ubuntu-latest
    steps:
      - name: Azure login
        uses: azure/login@v1
        with:
          client-id: ${{ secrets.AZURE_CLIENT_ID }}
          tenant-id: ${{ secrets.AZURE_TENANT_ID }}
          subscription-id: ${{ secrets.AZURE_SUBSCRIPTION_ID }}
      - uses: actions/checkout@v4
      - run: |
          sed 's/{{VERSION}}/${{ matrix.version }}/g; s/{{LOCATION}}/${{ matrix.location }}/g' \
            templates/main.tftest.hcl.tpl > tests/main.tftest.hcl
      - uses: hashicorp/setup-terraform@v2
        with:
          terraform_wrapper: false
      - run: terraform init
        env:
          TF_TOKEN_app_terraform_io: ${{ secrets.TF_TOKEN }}
      - run: terraform test
When we trigger this single workflow we can see that nine individual jobs are started:
Jumping into Azure after a short while we can confirm that nine different resource groups have been created:
Finally, after a few minutes all jobs are finished [1]:
The results clearly indicate that a regression has been introduced in version 1.3.0 of the resource group module that we depend on. We can conclude that our module is not ready to be released as a new version, and it is time to talk to the team responsible for the resource group module to see what changes they have made.
[1] When you use a matrix with many different variables (I use two), the number of billable minutes in GitHub Actions can quickly escalate. In this simple example I used 43 billable minutes.