1. Overview
Previously, we've covered Terraform's basic concepts and usage. Now, let's dig deeper and cover some of the best practices when using this popular DevOps tool.
2. Resource Files Organization
When we start using Terraform, it's not uncommon to put every resource definition, variable, and output in a single file. This approach, however, quickly leads to code that is hard to maintain and even harder to reuse.
A better approach is to take advantage of the fact that, within a module, Terraform reads any “.tf” file and processes its contents. The order in which we declare resources in those files is not relevant; sorting that out is Terraform's job, after all. Still, we should keep them organized so that we can better understand what's going on.
Here, consistency is more important than how we choose to organize resources in our files. A common practice is to use at least three files per module:
- variables.tf: All module's input variables go here, along with their default values when applicable
- main.tf: This is where we'll put our resource definitions. Assuming that we're using the Single Responsibility principle, its size should stay under control
- modules/: If our module contains any sub-modules, this directory is where they'll go
- outputs.tf: Exported data items should go here
- providers.tf: Used only in the top-level directory, this declares which providers we'll use in the project, including their versions (see the sketch below)
This organization allows any team member who wants to use our module to quickly locate its required variables and output data.
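As an illustration, a minimal top-level providers.tf for a project using AWS might look like the sketch below; the version constraint is just a placeholder, and the aws_region variable is the one we'll define in the provider configuration section:

terraform {
  required_providers {
    aws = {
      source  = "hashicorp/aws"
      # Placeholder constraint; pin to whatever version the project actually uses
      version = "~> 4.0"
    }
  }
}

provider "aws" {
  region = var.aws_region
}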
Also, as our module evolves, we must keep an eye on the main.tf file's size. Once it grows beyond a handful of resource definitions, that's a good sign we should consider refactoring it into sub-modules. At this point, we should move tightly coupled resources, such as an EC2 instance and an attached EBS volume, into nested modules. In the end, chances are that our top-level main.tf file will contain only module references stitched together.
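Here's a hypothetical sketch of what such a top-level main.tf could end up looking like after refactoring; the module names, paths, variables, and outputs are purely illustrative:

# Each nested module groups tightly coupled resources;
# the top-level file only wires the modules together.
module "network" {
  source = "./modules/network"
  cidr   = var.vpc_cidr
}

module "app_server" {
  source    = "./modules/app_server"
  # Consumes an output exported by the network module
  subnet_id = module.network.subnet_id
}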
3. Modules Usage
Modules are a powerful tool, but, as in any larger software project, it takes some time to find the right level of abstraction so we can maximize reuse across projects. Given that Terraform, like the whole infrastructure-as-code practice, is relatively new, this is an area where we see many different approaches.
That said, we can still reuse some lessons learned from application codebases that can help with proper module organization. Among those lessons, the Single Responsibility principle from the S.O.L.I.D. set is quite useful.
In our context, this means a module should focus on a single aspect of the infrastructure, such as setting up a VPC or creating a virtual machine – and just that.
Let's take a look at a sample Terraform project directory layout that uses this principle:
$ tree .
├── main.tf
├── modules
│   ├── ingress
│   │   └── www.petshop.com.br
│   │       ├── main.tf
│   │       ├── outputs.tf
│   │       └── variables.tf
... other services omitted
│   └── SvcFeedback
│       ├── main.tf
│       ├── outputs.tf
│       └── variables.tf
├── outputs.tf
├── terraform.tfvars
└── variables.tf
Here, we've used modules for each significant aspect of our infrastructure: database, ingress, messaging, external services, and backend services. In this layout, each folder containing .tf files is a module containing three files:
- variables.tf – Input variables for the module
- main.tf – Resource definitions
- outputs.tf – Output attributes definitions
This convention has the benefit that module consumers can go straight to the module's “contract” – its variables and outputs – skipping implementation details if they want to.
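For example, the “contract” of the SvcFeedback module from the layout above might boil down to something like this; the variable and output names are assumptions for illustration, not taken from an actual project:

# modules/SvcFeedback/variables.tf
variable "environment" {
  type        = string
  description = "Name of the target environment, e.g. dev or prd"
}

# modules/SvcFeedback/outputs.tf
output "service_url" {
  description = "Public endpoint of the feedback service"
  # References a load balancer defined in this module's main.tf (not shown)
  value       = aws_lb.feedback.dns_name
}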
4. Provider Configuration
Most Terraform providers require valid configuration parameters so they can manipulate resources. For instance, the AWS provider needs an access key/secret and a region to access our account and execute tasks.
Since those parameters contain sensitive, deployment-target-specific information, we should avoid including them in our project's code. Instead, we should use variables or provider-specific methods to configure them.
4.1. Using Variables to Configure a Provider
In this approach, we define a project variable for every required provider parameter:
variable "aws_region" { type = string } variable "aws_access_key" { type = string }a variable "aws_secret_key" { type = string }
Now, we use them in our provider declaration:
provider "aws" { region = var.aws_region access_key = var.aws_access_key secret_key = var.aws_secret_key }
Finally, we provide actual values using a .tfvars file:
aws_access_key = "xxxxx"
aws_secret_key = "yyyyy"
aws_region     = "us-east-1"
We can also combine .tfvars files and environment variables when running Terraform commands such as plan or apply:
$ export TF_VAR_aws_region="us-east-1"
$ terraform plan -var="aws_access_key=xxxx" -var-file=./aws.tfvars
Here, we've used a mix of environment variables and command-line arguments to pass variable values. In addition to those sources, Terraform will also look at variables defined in a terraform.tfvars file and any file with the “.auto.tfvars” extension in the project's folder.
4.2. Using Provider-Specific Configuration
In many cases, Terraform providers can pick credentials from the same place used by the native tool. A typical example is the Kubernetes provider. If our environment already has the native utility kubectl configured to point to our target cluster, then we don't need to provide any extra information.
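For example, here's a minimal sketch of such a provider declaration, assuming kubectl's configuration lives in its default location:

provider "kubernetes" {
  # Reuse the cluster address and credentials already configured for kubectl
  config_path = "~/.kube/config"
}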
5. State Management
Terraform state files usually contain sensitive information, so we must take proper measures to secure them. Let's take a look at a few of those measures:
- Always use an exclusion rule for *.tfstate files in our VCS configuration. For Git, this can go in a global exclusion rule or in our project's .gitignore file, as shown in the snippet after this list.
- Adopt a remote backend instead of the default local backend as soon as possible. Also, double-check access restrictions to the chosen backend.
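For reference, a minimal Git exclusion list for a Terraform project could look like this; the .terraform entry is a common addition that keeps the local provider and module cache out of the repository:

# Local state files and their backups
*.tfstate
*.tfstate.*

# Local provider plugins and module cache
.terraform/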
Moving from the default state backend – local files – to a remote one is a simple task. We just have to add a backend definition in one of our project's files:
terraform {
  backend "pg" {}
}
Here, we're informing Terraform that it will use the PostgreSQL backend to store state information. Remote backends usually require additional configuration. Since backend blocks can't reference variables, the recommended approach is to pass the needed parameters through environment variables or -backend-config options when running terraform init.
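For instance, with the pg backend above, we could supply its connection string through an environment variable right before initializing the project; the connection string below is just a placeholder:

# The pg backend reads its connection string from the PG_CONN_STR environment variable
$ export PG_CONN_STR="postgres://user:password@db.example.com/terraform_backend"
$ terraform init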
The main reason to adopt a remote backend is to enable multiple collaborators/tools to run Terraform against the same target environment. In those scenarios, we should avoid more than one concurrent Terraform run, as that can cause all sorts of race conditions and conflicts and will likely create havoc.
By adopting a remote backend, we can avoid those issues, as remote backends support the concept of locking. This means that only one collaborator at a time can run commands such as terraform plan or terraform apply against a given state.
Another way to enforce proper management of state files is to use a dedicated server to run Terraform. We can use any CI/CD tools for this, such as Jenkins, GitLab, and others. For small teams/organizations, we can also use the free-forever tier of Terraform's SaaS offering.
6. Workspaces
Workspaces allow us to store multiple state files for a single project, working somewhat like branches in a VCS. We should start using them on a project as soon as we must deal with multiple target environments. This way, we can keep a single codebase and recreate the same resources no matter where we point Terraform.
Of course, environments can and will vary in some way or another — for example, in machine sizing/count. Even so, we can address those aspects with input variables passed at apply time.
With those points in mind, a common practice is to name workspaces after environment names. For instance, we can use names such as DEV, QA, and PRD, so they match our existing environments.
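Terraform also exposes the name of the currently selected workspace as terraform.workspace, which we can use to vary environment-specific settings such as instance count; all values in this sketch are illustrative:

# Map the number of application instances to each workspace/environment
variable "instance_count" {
  type = map(number)
  default = {
    DEV = 1
    QA  = 1
    PRD = 3
  }
}

resource "aws_instance" "app" {
  # terraform.workspace evaluates to the name of the currently selected workspace
  count         = var.instance_count[terraform.workspace]
  ami           = "ami-0123456789abcdef0" # Placeholder AMI ID
  instance_type = "t3.micro"
}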
If we had multiple teams working on the same project, we could also include their names. For instance, we could have a DEV-SQUAD1 workspace for a team working on new features and a DEV-SUPPORT workspace for another team to reproduce and fix production issues.
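Putting it together, a typical command-line flow might look like this; the dev.tfvars file is an assumption standing in for whatever per-environment variable values we need:

# Create one workspace per target environment
$ terraform workspace new DEV
$ terraform workspace new QA
$ terraform workspace new PRD

# Select the environment we want to work on, then plan with its variable values
$ terraform workspace select DEV
$ terraform plan -var-file=./dev.tfvars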
7. Testing
As we start adopting standard coding practices to deal with our infrastructure, it is natural that we also adopt one of its hallmarks: automated testing. Those tests are particularly useful in the context of modules, as they enhance our confidence that those modules will work as expected in different scenarios.
A typical test consists of deploying a test configuration into a temporary environment and running a series of tests against it. What should tests cover? Well, it largely depends on the specifics of what we're creating, but some are quite common:
- Accessibility: Did we create our resources correctly? Are they reachable?
- Security: Did we leave any open non-essential network ports? Did we disable default credentials?
- Correctness: Did our module use its parameters correctly? Did it flag any missing parameters?
As of this writing, Terraform testing is still an evolving topic. We can write our tests using whatever framework we want, but frameworks that focus on integration tests are generally better suited for this task. Some examples include FitNesse, Spock, and Protractor, among others. We can also create our tests using regular shell scripts and add them to our CI/CD pipeline.
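As a concrete illustration, a minimal smoke test written as a shell script might look like this; the test/fixture directory and the service_url output are assumptions about the module under test:

#!/usr/bin/env bash
set -euo pipefail

# Deploy the test configuration into a temporary environment
cd test/fixture

# Destroy the temporary environment when the script exits, even on failure
trap 'terraform destroy -input=false -auto-approve' EXIT

terraform init -input=false
terraform apply -input=false -auto-approve

# Accessibility check: the deployed endpoint should answer successfully
endpoint=$(terraform output -raw service_url)
curl --fail --silent --show-error "$endpoint" > /dev/null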
8. Conclusion
In this article, we've covered some best practices while using Terraform. Given this is still a relatively new field, we should take those just as a starting point. As more people adopt infrastructure-as-code tools, we're likely to see new practices and tools emerge.
As usual, all code is available over on GitHub.