Infrastructure as Code (OpenTofu)
We use OpenTofu (open-source Terraform fork) for all Infrastructure-as-Code (IaC). It manages resources across GCP, AWS, and Azure.
Installation
Install OpenTofu using Homebrew:
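A minimal sketch, assuming Homebrew is already installed and that the formula is published under the name opentofu:

```shell
# Assumption: the Homebrew formula is named "opentofu"; it provides the "tofu" binary.
brew install opentofu
```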
Verify the installed version:
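Assuming the tofu binary is on your PATH after installation, the version subcommand prints the installed release:

```shell
# Prints the OpenTofu version and platform.
tofu version
```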
Note
OpenTofu is a drop-in replacement for Terraform. All terraform commands have an equivalent tofu command with the same syntax.
Our Infrastructure Repositories
dt-gcp-infrastructure
Manages all Data Team GCP resources in the dw-prod-gwiiag project.
What it manages:
- BigQuery datasets, tables, and permissions
- GCS buckets and PubSub notifications for file ingestion
- Service accounts and their IAM bindings
- Secret Manager secrets (e.g., keys shared with AWS)
- Cloud Functions triggers and configurations
Environments: dev, prod
When to use: Adding new GCS buckets, creating service accounts, managing BigQuery permissions, or rotating keys.
dt-aws-infrastructure
Manages all Data Team AWS resources in the Majority BI account.
What it manages:
- S3 buckets for data storage and file delivery
- Kinesis streams for event ingestion (e.g., mParticle)
- Lambda functions for event processing
- IAM users and roles (e.g., majority-braze, majority-adjust, majority-mparticle)
- Secrets Manager secrets (e.g., GCP service account keys shared across clouds)
Environments: dev, prod
When to use: Adding new S3 buckets, managing Kinesis streams, creating IAM users for third-party integrations, or configuring Lambda functions.
platform-infrastructure
Shared infrastructure repository managed together with the Platform team. Requires 2 PR approvals.
What it manages:
- AKS (Azure Kubernetes Service) cluster configurations
- AWS IAM roles for cross-cloud trust (e.g., Airflow pods accessing AWS)
- Kubernetes service accounts
When to use: Setting up cross-cloud access for Airflow DAGs or requesting new AWS roles for AKS workloads. See this PR for reference.
Shared Repository
Changes to platform-infrastructure require coordination with the Platform team and 2 PR approvals before deployment.
Workflow
1. Clone and Navigate
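A sketch of this step, assuming the repositories are hosted on GitHub and cloned over SSH. The organization name below is a placeholder, not the real one; the directory layout follows the Project Structure section.

```shell
# Hypothetical values: replace ORG with the real GitHub organization.
ORG="your-org"
REPO="dt-gcp-infrastructure"
ENVIRONMENT="dev"

git clone "git@github.com:${ORG}/${REPO}.git"

# Each environment is its own working directory with its own state.
cd "${REPO}/environments/${ENVIRONMENT}"
```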
2. Initialize
Download providers and configure the backend:
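Run from inside the environment directory. This is the plain invocation, assuming the backend is fully configured in the repository and needs no extra flags:

```shell
# Downloads provider plugins and connects to the remote state backend.
tofu init
```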
3. Plan
Preview changes before applying:
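A minimal sketch, assuming variables are picked up automatically from terraform.tfvars in the current environment directory:

```shell
# Read-only: computes the difference between configuration and real infrastructure.
tofu plan
```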
Review the output carefully:
- "+" resources to be created
- "~" resources to be modified
- "-" resources to be destroyed
4. Apply
Apply the changes:
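The plain form, assuming an interactive terminal so that the confirmation prompt can be answered:

```shell
# Prompts for "yes" before making any change.
tofu apply
```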
OpenTofu will show the plan again, ask for confirmation (yes), execute the changes, and display the results.
5. Targeted Operations
To apply changes to specific resources only:
tofu plan -target='module.my_module.resource_type.resource_name'
tofu apply -target='module.my_module.resource_type.resource_name'
Use -target Sparingly
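Targeted applies only consider the targeted resource and what it depends on; everything else, including resources that depend on the target, is ignored and can drift out of sync. Only use them for key rotations or isolated changes where you know the full impact.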
Project Structure
Typical layout of our OpenTofu repositories:
dt-gcp-infrastructure/
├── environments/
│ ├── dev/
│ │ ├── main.tf
│ │ ├── variables.tf
│ │ └── terraform.tfvars
│ └── prod/
│ ├── main.tf
│ ├── variables.tf
│ └── terraform.tfvars
├── modules/
│ ├── bigquery/
│ ├── storage/
│ ├── service_accounts/
│ └── pubsub/
└── README.md
- environments/: per-environment configurations (dev, prod), each with its own state
- modules/: reusable modules shared across environments
Good Practices
State Management
- State is stored remotely (never commit .tfstate files)
- Each environment has its own state file
- Run tofu init when switching between environments
Code Organization
- Use modules for reusable resource definitions
- Keep environment-specific values in terraform.tfvars
- Use variables with description and type; avoid hardcoding values in resource blocks
- Use locals for computed values and to reduce repetition
Planning and Applying
- Always run tofu plan before tofu apply and review the output
- Never use -auto-approve in prod unless it's a well-understood, targeted operation (e.g., key rotation)
- Test changes in dev before applying to prod
No Direct Prod Changes
Always validate changes in dev first. The only exception is key rotation, where you apply to prod locally and then verify with an empty plan in the pipeline.
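The empty-plan check can also be approximated locally. The sketch below is a hypothetical helper, not part of our pipeline; it relies on the -detailed-exitcode flag of tofu plan, which exits with 0 when there are no changes, 1 on error, and 2 when changes are pending.

```shell
# Hypothetical helper: report whether the plan for the current directory is empty.
check_empty_plan() {
  local rc=0
  # Exit codes with -detailed-exitcode: 0 = no changes, 1 = error, 2 = changes pending.
  tofu plan -detailed-exitcode -input=false >/dev/null || rc=$?
  case "$rc" in
    0) echo "empty plan: state matches configuration" ;;
    2) echo "plan is not empty: review pending changes"; return 2 ;;
    *) echo "tofu plan failed" >&2; return 1 ;;
  esac
}
```

After a local key rotation, running check_empty_plan from environments/prod should report an empty plan before you rely on the pipeline result.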
Naming Conventions
- Resources: snake_case (e.g., sa_aws_lambda_kinesis)
- Modules: descriptive names matching their purpose (e.g., data_warehouse_service_accounts)
- Variables: snake_case with clear descriptions
Pull Requests and CI/CD
- All infrastructure changes go through PRs
- The GitHub Actions pipeline runs tofu plan on PRs so reviewers can see the impact
- After merge, the pipeline runs tofu apply for prod
- For key rotations, apply locally first, then verify the pipeline shows an empty plan
Security
- Never commit secrets, key files, or .tfstate files
- Use Secret Manager (GCP) or Key Vault (Azure) for sensitive values
- Use .gitignore to exclude *.tfstate, *.tfstate.backup, *.json key files, and .terraform/
- Comment out key generation blocks after rotation and delete local key files
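As an illustration, the ignore patterns listed above could be appended to .gitignore as follows. This is a sketch, assuming it is run from the repository root. Note that *.json ignores every JSON file, not only key files, so narrow the pattern if the repository tracks other JSON.

```shell
# Append the ignore patterns to .gitignore in the current directory.
cat >> .gitignore <<'EOF'
*.tfstate
*.tfstate.backup
*.json
.terraform/
EOF
```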
Common Operations
Import Existing Resources
To bring an existing resource under OpenTofu management:
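The general form is tofu import followed by the resource address and the provider-specific ID. The address and bucket name below are hypothetical; the matching resource block must already exist in the configuration.

```shell
# Hypothetical example: adopt an existing GCS bucket into a storage module.
tofu import 'module.storage.google_storage_bucket.my_bucket' my-bucket-name

# Confirm that the configuration matches the imported resource.
tofu plan
```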
View Current State
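A minimal sketch using the state subcommands; the resource address is a hypothetical example.

```shell
# List every resource address tracked in the current state.
tofu state list

# Show the recorded attributes of a single resource.
tofu state show 'module.storage.google_storage_bucket.my_bucket'
```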
Remove from State (Without Destroying)
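A minimal sketch; the resource address is a hypothetical example.

```shell
# Stop tracking the resource without touching it in the cloud.
tofu state rm 'module.storage.google_storage_bucket.my_bucket'
```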
Warning
state rm only removes the resource from OpenTofu tracking. The actual cloud resource remains untouched.
Authentication
OpenTofu uses the CLI credentials of each cloud provider:
| Provider | Auth Command | Docs |
|---|---|---|
| GCP | gcloud auth application-default login | GCP Setup |
| AWS | aws sso login | AWS Setup |
| Azure | az login | Azure Setup |
Make sure you're authenticated with the relevant provider before running tofu plan or tofu apply.