Pipelines and Pizza 🍕

GitHub Actions for Terraform and Ansible

Imagine opening a pull request and seeing your entire infrastructure change laid out in the comments — every resource to be created, modified, or destroyed — before anyone clicks merge. A teammate approves the PR, it merges to main, and the infrastructure deploys itself. No one SSHed into a jump box. No one ran terraform apply from their laptop at 4:55 PM on a Friday.

That’s the dream, and GitHub Actions makes it achievable without a dedicated platform team or an expensive CI/CD product. I’ve been running this pattern across Terraform and Ansible workloads for a while now, and the results speak for themselves: fewer misconfigurations, faster reviews, and engineers who actually trust the deployment process.

Let’s build it.


The Core Pattern: Plan in the PR, Apply on Merge

The fundamental workflow for Terraform CI/CD in GitHub Actions follows a two-phase pattern:

  1. On pull request — run terraform plan, post the output as a PR comment
  2. On merge to main — run terraform apply against the same code

This gives reviewers visibility into exactly what will change before they approve, and it ensures that only reviewed, approved changes reach your infrastructure. No surprises.

For Ansible, the pattern is similar in spirit but different in mechanics:

  1. On pull request — lint, syntax-check, and run Molecule tests
  2. On merge to main — execute the playbook against target environments

| Tool | PR Phase | Merge Phase |
| --- | --- | --- |
| Terraform | plan + PR comment | apply |
| Ansible | ansible-lint + molecule test | ansible-playbook against target |

OIDC Authentication: No More Stored Secrets

Before we write any workflows, let’s talk authentication. If you’re still storing cloud credentials as long-lived GitHub secrets, stop. GitHub Actions supports OpenID Connect (OIDC) federation, which means your workflow gets a short-lived token from GitHub’s OIDC provider, presents it to your cloud (Azure, AWS, GCP), and receives temporary credentials in return.

No secrets to rotate. No credentials to leak. Nothing stored in your repository.

For Azure, the setup looks like this:

  1. Create or use an existing App Registration (Service Principal) in Entra ID
  2. Add a Federated Identity Credential that trusts tokens from your GitHub repo
  3. Scope the trust to specific branches or environments (e.g., only main, only the production environment)
  4. Set id-token: write in your workflow permissions

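The azurerm Terraform provider picks this up when ARM_USE_OIDC=true is set. Your workflow authenticates without a single credential stored in GitHub: the client, tenant, and subscription IDs it still needs are identifiers, not passwords, even though the workflows below keep them in GitHub secrets for convenience.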
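Step 3 is where most setups go wrong, so here is a minimal sketch of how a job's definition determines the subject claim that the federated credential has to match. The repository name my-org/my-infra is a placeholder and the echo steps stand in for real work; the subject formats in the comments follow GitHub's documented OIDC token claims.

```yaml
# Sketch: which OIDC subject each job presents to Entra ID.
# "my-org/my-infra" is a hypothetical repository name.
name: "OIDC subject sketch"

on:
  pull_request:
    branches: [main]
  push:
    branches: [main]

permissions:
  contents: read
  id-token: write # Lets the job request an OIDC token

jobs:
  plan:
    # Pull request trigger, no environment.
    # Subject: repo:my-org/my-infra:pull_request
    if: github.event_name == 'pull_request'
    runs-on: ubuntu-latest
    steps:
      - run: echo "plan would run here"

  apply:
    # A job that names an environment presents the environment subject,
    # not the branch subject.
    # Subject: repo:my-org/my-infra:environment:production
    if: github.event_name == 'push'
    runs-on: ubuntu-latest
    environment: production
    steps:
      - run: echo "apply would run here"

  # A push job on main WITHOUT an environment would instead present:
  # Subject: repo:my-org/my-infra:ref:refs/heads/main
```

With this layout the App Registration would need two federated credentials: one for pull requests and one for the production environment.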


Complete Terraform CI/CD Workflow

Here’s a complete, production-ready workflow. It plans on PRs, posts the output as a comment, and applies on merge — with concurrency guards and OIDC auth baked in.

name: "Terraform CI/CD"

on:
  pull_request:
    branches: [main]
    paths: ["infra/terraform/**"]
  push:
    branches: [main]
    paths: ["infra/terraform/**"]

permissions:
  contents: read
  pull-requests: write
  id-token: write # Required for OIDC

concurrency:
  group: terraform-${{ github.ref }}
  cancel-in-progress: false # Never cancel an in-progress apply

env:
  TF_DIR: infra/terraform
  ARM_USE_OIDC: true
  ARM_CLIENT_ID: ${{ secrets.AZURE_CLIENT_ID }}
  ARM_SUBSCRIPTION_ID: ${{ secrets.AZURE_SUBSCRIPTION_ID }}
  ARM_TENANT_ID: ${{ secrets.AZURE_TENANT_ID }}

jobs:
  plan:
    name: "Terraform Plan"
    if: github.event_name == 'pull_request'
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

      - uses: hashicorp/setup-terraform@v3
        with:
          terraform_version: "1.9.x"

      - name: Azure Login (OIDC)
        uses: azure/login@v2
        with:
          client-id: ${{ secrets.AZURE_CLIENT_ID }}
          tenant-id: ${{ secrets.AZURE_TENANT_ID }}
          subscription-id: ${{ secrets.AZURE_SUBSCRIPTION_ID }}

      - name: Terraform Init
        working-directory: ${{ env.TF_DIR }}
        run: terraform init -input=false

      - name: Terraform Validate
        working-directory: ${{ env.TF_DIR }}
        run: terraform validate

      - name: Terraform Plan
        id: plan
        working-directory: ${{ env.TF_DIR }}
        run: terraform plan -no-color -input=false -out=tfplan
        continue-on-error: true

      - name: Post Plan to PR
        if: steps.plan.outcome == 'success' # No plan file exists if the plan failed
        uses: borchero/terraform-plan-comment@v2
        with:
          token: ${{ github.token }}
          planfile: ${{ env.TF_DIR }}/tfplan
          working-directory: ${{ env.TF_DIR }}

      - name: Fail on Plan Error
        if: steps.plan.outcome == 'failure'
        run: exit 1

  apply:
    name: "Terraform Apply"
    if: github.event_name == 'push' && github.ref == 'refs/heads/main'
    runs-on: ubuntu-latest
    environment: production # Requires approval
    steps:
      - uses: actions/checkout@v4

      - uses: hashicorp/setup-terraform@v3
        with:
          terraform_version: "1.9.x"

      - name: Azure Login (OIDC)
        uses: azure/login@v2
        with:
          client-id: ${{ secrets.AZURE_CLIENT_ID }}
          tenant-id: ${{ secrets.AZURE_TENANT_ID }}
          subscription-id: ${{ secrets.AZURE_SUBSCRIPTION_ID }}

      - name: Terraform Init
        working-directory: ${{ env.TF_DIR }}
        run: terraform init -input=false

      - name: Terraform Apply
        working-directory: ${{ env.TF_DIR }}
        run: terraform apply -auto-approve -input=false -lock-timeout=5m

A few things worth calling out:

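  • concurrency group prevents two applies from running simultaneously. The cancel-in-progress: false setting ensures a running apply is never killed mid-execution, which is how you end up with half-deployed infrastructure and a stale state lock. GitHub keeps only one pending run per group, so if several merges queue up behind a running apply, the older pending runs are cancelled and only the newest one waits. That is harmless here because the newest commit on main already contains the earlier changes.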
  • -lock-timeout=5m on the apply step tells Terraform to wait up to five minutes for the state lock instead of failing immediately. This is your second layer of protection if concurrency controls somehow overlap.
  • environment: production on the apply job ties it to a GitHub environment, which you can configure with required reviewers, wait timers, and branch restrictions.
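One possible refinement, sketched below, is to let superseded plans on a pull request cancel each other while still protecting applies. It relies on cancel-in-progress accepting an expression; the group name is the one used in the workflow above.

```yaml
# Sketch: cancel outdated PR plans, but never cancel an apply on main.
concurrency:
  group: terraform-${{ github.ref }}
  # Evaluates to true only for pull_request runs; push runs (applies) get false
  cancel-in-progress: ${{ github.event_name == 'pull_request' }}
```

Because github.ref differs per pull request, a new push to one PR only cancels the older plan for that same PR.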

Complete Ansible CI/CD Workflow

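Ansible pipelines need a different approach. There’s no true plan equivalent (check mode, ansible-playbook --check --diff, comes closest, but not every module supports it), so we lean harder on linting, syntax validation, and Molecule testing before anything touches a real host.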

name: "Ansible CI/CD"

on:
  pull_request:
    branches: [main]
    paths: ["infra/ansible/**"]
  push:
    branches: [main]
    paths: ["infra/ansible/**"]

concurrency:
  group: ansible-${{ github.ref }}
  cancel-in-progress: false

jobs:
  lint-and-test:
    name: "Lint & Test"
    if: github.event_name == 'pull_request'
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

      - name: Set up Python
        uses: actions/setup-python@v5
        with:
          python-version: "3.12"
          cache: "pip" # Needs a requirements.txt or pyproject.toml in the repo to key the cache

      - name: Install Dependencies
        run: |
          pip install ansible ansible-lint molecule "molecule-plugins[docker]" yamllint

      - name: YAML Lint
        run: yamllint -s infra/ansible/

      - name: Ansible Lint
        run: ansible-lint infra/ansible/

      - name: Ansible Syntax Check
        run: |
          ansible-playbook infra/ansible/site.yml --syntax-check

      - name: Molecule Test
        working-directory: infra/ansible/roles/webserver
        run: molecule test
        env:
          MOLECULE_DISTRO: ubuntu2204

  deploy:
    name: "Deploy to ${{ matrix.environment }}"
    if: github.event_name == 'push' && github.ref == 'refs/heads/main'
    runs-on: ubuntu-latest
    environment: ${{ matrix.environment }}
    strategy:
      max-parallel: 1
      matrix:
        environment: [staging, production]
    steps:
      - uses: actions/checkout@v4

      - name: Set up Python
        uses: actions/setup-python@v5
        with:
          python-version: "3.12"
          cache: "pip" # Needs a requirements.txt or pyproject.toml in the repo to key the cache

      - name: Install Ansible
        run: pip install ansible

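      - name: Write Vault Password
        # Pass the secret through env so its contents are never spliced into the script
        env:
          ANSIBLE_VAULT_PASSWORD: ${{ secrets.ANSIBLE_VAULT_PASSWORD }}
        run: printf '%s\n' "$ANSIBLE_VAULT_PASSWORD" > .vault_pass
        shell: bash

      - name: Write SSH Key
        env:
          SSH_PRIVATE_KEY: ${{ secrets.SSH_PRIVATE_KEY }}
        run: |
          printf '%s\n' "$SSH_PRIVATE_KEY" > deploy_key
          chmod 600 deploy_key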

      - name: Run Playbook
        run: |
          ansible-playbook infra/ansible/site.yml \
            -i infra/ansible/inventory/${{ matrix.environment }}.yml \
            --vault-password-file .vault_pass \
            --private-key deploy_key \
            -e "target_env=${{ matrix.environment }}"

      - name: Cleanup Secrets
        if: always()
        run: rm -f .vault_pass deploy_key

Key design decisions:

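  • max-parallel: 1 with matrix strategy deploys to one environment at a time, staging first and then production, without duplicating the entire job. In practice GitHub starts matrix jobs in the order listed, but that ordering isn’t a documented guarantee, so use separate jobs chained with needs if production must strictly wait for staging.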
  • Vault password handling writes the password to a temporary file, uses it, and cleans up in an always() step so it’s removed even if the playbook fails.
  • Environment-specific secrets live in GitHub environments. Staging and production each have their own SSH keys and vault passwords, and each run picks its inventory file from the repo by environment name.
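If you would rather not depend on matrix ordering, the same promotion can be written as two jobs chained with needs. This is a trimmed sketch, not a drop-in replacement: paths mirror the workflow above, and the vault password and SSH key steps are left out for brevity.

```yaml
# Sketch: strict staging -> production ordering with needs instead of a matrix.
jobs:
  deploy-staging:
    if: github.event_name == 'push' && github.ref == 'refs/heads/main'
    runs-on: ubuntu-latest
    environment: staging
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v5
        with:
          python-version: "3.12"
      - run: pip install ansible
      - name: Run Playbook
        run: |
          ansible-playbook infra/ansible/site.yml \
            -i infra/ansible/inventory/staging.yml \
            -e "target_env=staging"

  deploy-production:
    needs: [deploy-staging] # Skipped if staging fails or is skipped
    runs-on: ubuntu-latest
    environment: production
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v5
        with:
          python-version: "3.12"
      - run: pip install ansible
      - name: Run Playbook
        run: |
          ansible-playbook infra/ansible/site.yml \
            -i infra/ansible/inventory/production.yml \
            -e "target_env=production"
```

The trade-off is duplication: the matrix version is shorter, while this version makes the ordering explicit.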

Environment Protection Rules: Your Safety Net

GitHub environments are the backbone of safe IaC deployments. Here’s how I configure them:

| Environment | Protection Rules | Purpose |
| --- | --- | --- |
| dev | None (auto-deploy) | Fast feedback loop |
| staging | Branch restriction (main only) | Only tested code reaches staging |
| production | Required reviewers + 10-min wait timer | Human approval + cool-off period |

To set this up, go to Settings > Environments in your repo. Create each environment and configure its protection rules. The production environment should require at least one reviewer from your infrastructure team, and the wait timer gives you a window to cancel if someone spots an issue after approval.

Each environment gets its own secrets too. Your staging Azure subscription ID is different from production, your Ansible inventory points to different hosts, and your vault passwords can be rotated independently. GitHub Actions only exposes environment secrets to jobs that reference that environment and pass all protection rules.
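As a small illustration of that scoping, the sketch below uses the same secret name in two jobs and lets the environment decide which value it resolves to. It assumes AZURE_SUBSCRIPTION_ID is defined as an environment secret in both staging and production; the echo steps are placeholders.

```yaml
# Sketch: one secret name, two values, selected by the job's environment.
jobs:
  apply-staging:
    runs-on: ubuntu-latest
    environment: staging # Resolves secrets from the staging environment
    env:
      ARM_SUBSCRIPTION_ID: ${{ secrets.AZURE_SUBSCRIPTION_ID }}
    steps:
      - run: echo "Applying to the staging subscription"

  apply-production:
    needs: [apply-staging]
    runs-on: ubuntu-latest
    environment: production # Job waits here until the protection rules pass
    env:
      ARM_SUBSCRIPTION_ID: ${{ secrets.AZURE_SUBSCRIPTION_ID }}
    steps:
      - run: echo "Applying to the production subscription"
```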


Multi-Environment Promotion: Dev to Staging to Prod

For organizations running multiple environments, here’s the promotion pattern I recommend:

# Simplified multi-environment Terraform workflow
jobs:
  plan-dev:
    if: github.event_name == 'pull_request'
    uses: ./.github/workflows/terraform-reusable.yml
    with:
      environment: dev
      tf_var_file: environments/dev.tfvars

  apply-dev:
    if: github.event_name == 'push'
    uses: ./.github/workflows/terraform-reusable.yml
    with:
      environment: dev
      tf_var_file: environments/dev.tfvars
      apply: true

  apply-staging:
    if: github.event_name == 'push'
    needs: [apply-dev]
    uses: ./.github/workflows/terraform-reusable.yml
    with:
      environment: staging
      tf_var_file: environments/staging.tfvars
      apply: true

  apply-production:
    if: github.event_name == 'push'
    needs: [apply-staging]
    uses: ./.github/workflows/terraform-reusable.yml
    with:
      environment: production
      tf_var_file: environments/production.tfvars
      apply: true

The needs chain creates a linear promotion: dev must succeed before staging starts, and staging must succeed before production. Combined with environment protection rules, production won’t even begin until someone manually approves it.

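Use reusable workflows (uses: ./.github/workflows/...) to avoid duplicating your Terraform init/plan/apply logic across every environment. One workflow, parameterized by environment. A job that calls a reusable workflow can’t set environment itself, so the called workflow has to apply the environment input to its own job for the protection rules to take effect.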
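For reference, here is one way the called workflow could look. Treat it as a sketch under assumptions: the input names match the caller above, but the steps, the infra/terraform working directory, and the AZURE_* secrets (assumed to be defined per environment) are borrowed from the earlier workflow rather than taken from a real terraform-reusable.yml.

```yaml
# .github/workflows/terraform-reusable.yml (hypothetical contents)
# The caller must grant id-token: write; a called workflow cannot raise permissions.
name: "Terraform (reusable)"

on:
  workflow_call:
    inputs:
      environment:
        required: true
        type: string
      tf_var_file:
        required: true
        type: string
      apply:
        required: false
        type: boolean
        default: false

jobs:
  terraform:
    runs-on: ubuntu-latest
    environment: ${{ inputs.environment }} # Protection rules and environment secrets attach here
    env:
      ARM_USE_OIDC: true
      ARM_CLIENT_ID: ${{ secrets.AZURE_CLIENT_ID }}
      ARM_SUBSCRIPTION_ID: ${{ secrets.AZURE_SUBSCRIPTION_ID }}
      ARM_TENANT_ID: ${{ secrets.AZURE_TENANT_ID }}
      TF_VAR_FILE: ${{ inputs.tf_var_file }}
    defaults:
      run:
        working-directory: infra/terraform
    steps:
      - uses: actions/checkout@v4

      - uses: hashicorp/setup-terraform@v3

      - name: Terraform Init
        run: terraform init -input=false

      - name: Terraform Plan
        if: ${{ !inputs.apply }}
        run: terraform plan -no-color -input=false -var-file="$TF_VAR_FILE"

      - name: Terraform Apply
        if: ${{ inputs.apply }}
        run: terraform apply -auto-approve -input=false -lock-timeout=5m -var-file="$TF_VAR_FILE"
```

Because the job names the environment, environment-level secrets resolve inside the called workflow without the caller passing them.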


Hands-On Lab: Combined Terraform + Ansible Pipeline

Let’s build a realistic pipeline where Terraform provisions infrastructure and Ansible configures it. This is the pattern I use in production: Terraform creates the VMs, and Ansible installs the software.

Step 1: Repository Structure

my-infra/
  .github/workflows/
    infra-cicd.yml
  terraform/
    main.tf
    outputs.tf
    dev.tfvars
  ansible/
    site.yml
    inventory/
      dev.yml
    roles/
      webserver/

Step 2: The Combined Workflow

name: "Infrastructure CI/CD"

on:
  push:
    branches: [main]
  pull_request:
    branches: [main]

permissions:
  contents: read
  pull-requests: write
  id-token: write

concurrency:
  group: infra-${{ github.ref }}
  cancel-in-progress: false

jobs:
  terraform-plan:
    name: "Terraform Plan"
    if: github.event_name == 'pull_request'
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: hashicorp/setup-terraform@v3
      - name: Init & Plan
        working-directory: terraform
        run: |
          terraform init -input=false
          terraform plan -no-color -var-file=dev.tfvars -out=tfplan

  ansible-lint:
    name: "Ansible Lint & Syntax"
    if: github.event_name == 'pull_request'
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v5
        with:
          python-version: "3.12"
      - run: pip install ansible ansible-lint yamllint
      - run: yamllint ansible/
      - run: ansible-lint ansible/
      - run: ansible-playbook ansible/site.yml --syntax-check

  terraform-apply:
    name: "Provision Infrastructure"
    if: github.event_name == 'push' && github.ref == 'refs/heads/main'
    runs-on: ubuntu-latest
    environment: dev
    outputs:
      host_ip: ${{ steps.output.outputs.host_ip }}
    steps:
      - uses: actions/checkout@v4
      - uses: hashicorp/setup-terraform@v3
        with:
          terraform_wrapper: false
      - name: Apply
        working-directory: terraform
        run: |
          terraform init -input=false
          terraform apply -auto-approve -var-file=dev.tfvars
      - name: Get Outputs
        id: output
        working-directory: terraform
        run: echo "host_ip=$(terraform output -raw vm_public_ip)" >> "$GITHUB_OUTPUT"

  ansible-configure:
    name: "Configure Infrastructure"
    if: github.event_name == 'push' && github.ref == 'refs/heads/main'
    needs: [terraform-apply]
    runs-on: ubuntu-latest
    environment: dev
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v5
        with:
          python-version: "3.12"
      - run: pip install ansible
      - name: Write Dynamic Inventory
        run: |
          cat > ansible/inventory/dynamic.yml <<EOF
          all:
            hosts:
              web:
                ansible_host: ${{ needs.terraform-apply.outputs.host_ip }}
                ansible_user: azureuser
          EOF
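      - name: Write SSH Key
        # Pass the secret through env so its contents are never spliced into the script
        env:
          SSH_PRIVATE_KEY: ${{ secrets.SSH_PRIVATE_KEY }}
        run: |
          printf '%s\n' "$SSH_PRIVATE_KEY" > deploy_key
          chmod 600 deploy_key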
      - name: Run Playbook
        run: |
          ansible-playbook ansible/site.yml \
            -i ansible/inventory/dynamic.yml \
            --private-key deploy_key
      - name: Cleanup
        if: always()
        run: rm -f deploy_key

This is the real power of combining Terraform and Ansible in CI/CD: Terraform provisions the VM, passes its IP address as a job output, and Ansible picks it up to configure the host. One merge, two tools, fully automated.
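One practical gap to watch for: a freshly provisioned VM has a host key the runner has never seen, and Ansible checks host keys by default, so the first connection can fail. The fragment below is one hedged way around that for a throwaway dev VM. It is a variant of the Run Playbook step, and disabling host key checking is a security trade-off you should not carry into production without thought.

```yaml
# Fragment: variant of the "Run Playbook" step in ansible-configure.
# Assumption: the runner is ephemeral and the target is a brand-new dev VM,
# so skipping host key verification is an accepted trade-off here.
- name: Run Playbook
  env:
    ANSIBLE_HOST_KEY_CHECKING: "False"
  run: |
    ansible-playbook ansible/site.yml \
      -i ansible/inventory/dynamic.yml \
      --private-key deploy_key
```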


Troubleshooting Guide

| Problem | Cause | Fix |
| --- | --- | --- |
| Error acquiring the state lock | Concurrent apply or crashed run left a stale lock | Add concurrency group to prevent parallel runs. Use -lock-timeout=5m. As a last resort, terraform force-unlock LOCK_ID |
| Plan comment not appearing on PR | Missing pull-requests: write permission | Add permissions: pull-requests: write at the workflow level |
| OIDC login fails with “No matching federated credential” | Subject claim mismatch | Verify the federated credential entity type matches (branch, environment, or PR) and the repo/org are correct |
| Ansible vault decryption fails | Vault password secret is empty or has a trailing newline | Re-save the secret in GitHub without trailing whitespace. Use echo -n when creating it |
| Molecule tests pass locally but fail in CI | Missing Docker socket or different Python version | Use the Molecule Docker driver and pin your Python version in setup-python |
| Apply job runs but skips environment approval | Job doesn’t reference the environment key | Add environment: production to the job definition |
| Terraform plan shows changes but apply says “no changes” | Code changed between plan and apply | Enable branch protection: require branches to be up-to-date before merging |

State Locking: Your Double Safety Net

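Terraform state locking deserves special attention in CI/CD. When you run terraform plan or terraform apply, Terraform acquires a lock on the state to prevent concurrent modifications, provided the backend supports locking (Azure Storage does out of the box; S3 needs a DynamoDB table or, on newer Terraform versions, its native lockfile option). In a pipeline, this matters a lot because back-to-back merges can trigger overlapping runs.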

The defense-in-depth approach:

  1. GitHub Actions concurrency groups — prevent two workflow runs from executing the same Terraform directory simultaneously
  2. Terraform state locking — the backend (Azure Storage, S3, etc.) provides an additional lock at the state file level
  3. -lock-timeout=5m — if a lock is held, wait instead of failing immediately

If you hit a stale lock from a crashed pipeline run, check that the run actually failed (don’t unlock a state that’s actively being applied). Then use terraform force-unlock <LOCK_ID> — but only after confirming no apply is in progress.
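If you want that last-resort unlock to go through the same guardrails as an apply, one option is a manually triggered workflow. The sketch below is an assumption-laden example, not part of the pipelines above: the workflow name and the lock_id input are made up, while the directory, secrets, and environment reuse the earlier Terraform workflow.

```yaml
# Sketch: manual, approval-gated force-unlock.
name: "Terraform Force Unlock"

on:
  workflow_dispatch:
    inputs:
      lock_id:
        description: "Lock ID from the failed run's error message"
        required: true
        type: string

permissions:
  contents: read
  id-token: write

# Same group the apply workflow uses on main, so this run queues behind
# any apply that is still in progress instead of unlocking underneath it.
concurrency:
  group: terraform-refs/heads/main
  cancel-in-progress: false

jobs:
  unlock:
    runs-on: ubuntu-latest
    environment: production # Reuses the required-reviewer gate
    env:
      ARM_USE_OIDC: true
      ARM_CLIENT_ID: ${{ secrets.AZURE_CLIENT_ID }}
      ARM_SUBSCRIPTION_ID: ${{ secrets.AZURE_SUBSCRIPTION_ID }}
      ARM_TENANT_ID: ${{ secrets.AZURE_TENANT_ID }}
      LOCK_ID: ${{ inputs.lock_id }}
    defaults:
      run:
        working-directory: infra/terraform
    steps:
      - uses: actions/checkout@v4

      - uses: hashicorp/setup-terraform@v3

      - name: Terraform Init
        run: terraform init -input=false

      - name: Force Unlock
        # -force skips the interactive confirmation prompt
        run: terraform force-unlock -force "$LOCK_ID"
```

Sharing the concurrency group is the design choice that matters: it turns "confirm no apply is in progress" from a manual check into something the scheduler enforces.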


What’s Next

Next post: Dependabot and Supply Chain Security. We’ll dig into automated dependency updates, how to configure Dependabot for Terraform providers and Ansible collections, and what supply chain attacks actually look like in the infrastructure world.

Happy automating!