Today’s Weather: A brisk 32 F in Dallas, TX @ 3:30PM. Sunny, but I’m glad to be inside.
Background
I’ve been spending a lot of time with Terraform recently and wanted to build something with some real-world use. I had done some light reading on Azure Automation Runbooks and Hybrid Workers, and decided to give them a try.
All built with Terraform, NO GUI clicking required.
Here’s a link to the GitHub repo in case you want to check it out. I’ll be breaking it down block by block here.
Architecture & Design
Azure Automation Accounts and Key Vaults are PaaS services that run in Azure’s managed infra. If you wanted to execute code and authenticate with certificates in Key Vault, you wouldn’t even need to use a VM. However, I wanted to disable public access to my Key Vault.
Once a Key Vault is set to private, you will need to connect it to your subnet with a Private Endpoint, and use a VM with the Hybrid Worker Extension to run your code.
The Automation Account holds the runbooks (in this case containing PowerShell and Azure CLI scripts) and passes them to the VM for execution. The Automation Account also defines my Hybrid Worker Groups and Hybrid Workers. I only have one VM in this example, but in theory you could have dozens of VMs in dozens of Worker Groups to distribute the load.
At runtime, the Automation Account first authenticates with the Key Vault via RBAC linked to the Automation Account’s Managed Identity. The Automation Account then passes the script to the VM for actual execution. The VM reaches out to the Key Vault via the Private Endpoint to retrieve any secrets or certificates requested in the runbook. From there the VM runs whatever script it was sent, and returns any outputs to the Automation Account job log.
One thing worth noting here is the way authentication to the Key Vault works. When I was testing this, I gave the VM’s Managed Identity the “Key Vault Secrets User” role, but authentication kept failing. It seems that even though the script is running on the VM, and the VM is actually reaching out to the Key Vault for secrets, the Automation Account’s Managed Identity is what needs to be given the “Key Vault Secrets User” or “Key Vault Certificate User” role.
Costs
These services are fairly low cost. If you use the serverless version of the automation runbooks, you get 500 free minutes of runtime each month. So your compute would be totally free if you kept it under that amount. However, we are using an Azure VM for compute, so we can’t take advantage of that.
I defined several schedules to keep costs as low as possible. Let’s say I want my script to run once a week at 3pm. I have a runbook that will start the VM a few minutes ahead of 3pm. The script then runs at 3pm, and another runbook deallocates the VM a few minutes later. While this may not be ideal for larger workloads, deploying it this way keeps the VM's compute charges to the few minutes a week it is actually running.
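As a rough sketch of that weekly pattern, the three schedules could be declared in one go with for_each. Everything named here (the weekly_start_times variable, the schedule keys, the choice of Friday) is an assumption for illustration and not taken from the repo; the start times have to be RFC 3339 timestamps that are still in the future when you apply.

```hcl
# Hypothetical variable: one RFC 3339 start time per schedule, for example
# "start-vm" a few minutes before 3pm, "run-script" at 3pm, "stop-vm" shortly after.
variable "weekly_start_times" {
  description = "Map of schedule name to first run time (RFC 3339, must be in the future)"
  type        = map(string)
}

# One weekly schedule per map entry. Names and weekday are placeholders.
resource "azurerm_automation_schedule" "weekly" {
  for_each                = var.weekly_start_times
  name                    = "${each.key}-weekly"
  resource_group_name     = azurerm_resource_group.rg.name
  automation_account_name = azurerm_automation_account.aa.name
  frequency               = "Week"
  interval                = 1
  week_days               = ["Friday"]
  start_time              = each.value
  timezone                = "America/Chicago"
}
```

Each schedule still needs its own azurerm_automation_job_schedule to tie it to a runbook, as covered in the code breakdown.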
Code Breakdown
First, our opening Terraform block. Mostly standard. We will use data "azurerm_client_config" "current" {} later to reference our Azure connection context.

    terraform {
      required_providers {
        azurerm = {
          source  = "hashicorp/azurerm"
          version = "~> 4.0"
        }
      }
      required_version = ">=1.1.0"
    }

    # Exposes the tenant ID and object ID of the identity Terraform runs as
    data "azurerm_client_config" "current" {}

    provider "azurerm" {
      features {}
      subscription_id = "e7cc5b12-3e04-4af0-a26f-30657aa9395f"
    }

    provider "random" {}

The following block creates our Resource Group, Virtual Network, and Subnet. We also use the “random” provider to generate a random UUID to use later.

    #----------------------------------------------------
    # Core
    #----------------------------------------------------
    resource "azurerm_resource_group" "rg" {
      name     = "hybrid-worker-test-rg"
      location = var.location
    }

    resource "azurerm_virtual_network" "vnet" {
      name                = "hybrid-worker-test-vnet"
      location            = azurerm_resource_group.rg.location
      resource_group_name = azurerm_resource_group.rg.name
      address_space       = ["10.0.0.0/24"]
    }

    resource "azurerm_subnet" "subnet" {
      name                              = "hybrid-worker-test-subnet1"
      resource_group_name               = azurerm_resource_group.rg.name
      virtual_network_name              = azurerm_virtual_network.vnet.name
      address_prefixes                  = ["10.0.0.0/26"]
      private_endpoint_network_policies = "Disabled"
    }

    # Random UUID, referenced later in the configuration
    resource "random_uuid" "random" {}

Now we create our Key Vault, add a test secret to it, and create a few role assignments. We give the Automation Account the “Virtual Machine Contributor” role so it can start/deallocate the VM. We use the earlier mentioned azurerm_client_config to reference the account used to authenticate to Azure so we can give that account the “Key Vault Secrets Officer” role. This will allow us to actually add secrets to the Key Vault.
Lastly we give the Automation Account the “Key Vault Secrets User” role, so our runbooks can read Key Vault secrets.

    #----------------------------------------------------
    # Key Vault and IAM
    #----------------------------------------------------
    resource "azurerm_key_vault" "kv" {
      name                          = "hybrid-worker-test-kv"
      location                      = azurerm_resource_group.rg.location
      resource_group_name           = azurerm_resource_group.rg.name
      tenant_id                     = data.azurerm_client_config.current.tenant_id
      sku_name                      = "standard"
      rbac_authorization_enabled    = true
      public_network_access_enabled = false
    }

    resource "azurerm_key_vault_secret" "secret" {
      name         = "test-secret"
      value        = "Hello there! "
      key_vault_id = azurerm_key_vault.kv.id

      # Wait for our own Secrets Officer role before writing the secret
      depends_on = [azurerm_role_assignment.kv_self_admin]
    }

    # Lets the Automation Account start and deallocate the VM
    resource "azurerm_role_assignment" "aa_vm" {
      scope                = azurerm_windows_virtual_machine.vm.id
      role_definition_name = "Virtual Machine Contributor"
      principal_id         = azurerm_automation_account.aa.identity[0].principal_id
    }

    # Lets the identity running Terraform manage secrets in the vault
    resource "azurerm_role_assignment" "kv_self_admin" {
      scope                = azurerm_key_vault.kv.id
      role_definition_name = "Key Vault Secrets Officer"
      principal_id         = data.azurerm_client_config.current.object_id
    }

    # Lets runbooks read secrets via the Automation Account's Managed Identity
    resource "azurerm_role_assignment" "aa_secrets_reader" {
      scope                = azurerm_key_vault.kv.id
      role_definition_name = "Key Vault Secrets User"
      principal_id         = azurerm_automation_account.aa.identity[0].principal_id
    }

Now we create the VM. The important parts here are the “powershell_modules” and “hybridworkerextension” blocks. Because we are running the scripts on an actual VM, that VM must have the PowerShell modules you plan to reference in your runbooks. Instead of connecting to the VM after creation and manually installing the modules you need, the “powershell_modules” extension passes a PowerShell command to the VM that downloads the modules I need for this test case.
The “hybridworkerextension” block installs the Hybrid Worker Extension on the VM, which allows us to add our VM as a Hybrid Worker in the Automation Account.

    #----------------------------------------------------
    # Compute
    #----------------------------------------------------
    resource "azurerm_windows_virtual_machine" "vm" {
      name                = "hybridworkertest-vm"
      computer_name       = "workervm"
      resource_group_name = azurerm_resource_group.rg.name
      location            = azurerm_resource_group.rg.location
      zone                = var.vm_zone
      size                = var.vm_size
      admin_username      = var.vm_admin_un
      admin_password      = var.vm_admin_pw

      network_interface_ids = [
        azurerm_network_interface.vnic.id
      ]

      os_disk {
        caching              = "None"
        storage_account_type = "Standard_LRS"
      }

      source_image_reference {
        publisher = "MicrosoftWindowsServer"
        offer     = "windowsserver"
        sku       = "2022-datacenter-azure-edition"
        version   = "latest"
      }

      identity {
        type = "SystemAssigned"
      }
    }

    # Installs the PowerShell modules the runbooks rely on
    resource "azurerm_virtual_machine_extension" "powershell_modules" {
      name                 = "PowerShellModules"
      virtual_machine_id   = azurerm_windows_virtual_machine.vm.id
      publisher            = "Microsoft.Compute"
      type                 = "CustomScriptExtension"
      type_handler_version = "1.10"

      settings = jsonencode({
        commandToExecute = "powershell -ExecutionPolicy Unrestricted -Command \"Install-PackageProvider -Name NuGet -MinimumVersion 2.8.5.201 -Force; Install-Module Az.Accounts -Scope AllUsers -Force -AllowClobber; Install-Module Az.KeyVault -Scope AllUsers -Force -AllowClobber\""
      })
    }

    # Registers the VM with the Automation Account as a Hybrid Worker
    resource "azurerm_virtual_machine_extension" "hybridworkerextension" {
      name                 = "HybridWorkerExtension"
      virtual_machine_id   = azurerm_windows_virtual_machine.vm.id
      publisher            = "Microsoft.Azure.Automation.HybridWorker"
      type                 = "HybridWorkerForWindows"
      type_handler_version = "1.1"

      settings = jsonencode({
        "AutomationAccountURL" = azurerm_automation_account.aa.hybrid_service_url
      })
    }

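The extension alone does not put the VM into a Worker Group. The post doesn't show that part, so here is a minimal sketch of how it might look using the azurerm provider's Hybrid Worker resources. The resource labels and the group name are hypothetical, and using random_uuid.random as the worker ID is my guess at what the UUID from the Core block is for.

```hcl
# Hypothetical Worker Group name; the repo may use a different one.
resource "azurerm_automation_hybrid_runbook_worker_group" "workers" {
  name                    = "hybrid-worker-test-group"
  resource_group_name     = azurerm_resource_group.rg.name
  automation_account_name = azurerm_automation_account.aa.name
}

# Registers the VM as a worker in that group.
# worker_id must be a unique UUID; reusing random_uuid.random here is an assumption.
resource "azurerm_automation_hybrid_runbook_worker" "worker" {
  resource_group_name     = azurerm_resource_group.rg.name
  automation_account_name = azurerm_automation_account.aa.name
  worker_group_name       = azurerm_automation_hybrid_runbook_worker_group.workers.name
  vm_resource_id          = azurerm_windows_virtual_machine.vm.id
  worker_id               = random_uuid.random.result
}
```

Scaling out to more VMs would mean one worker resource per VM, each with its own UUID.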
Next we create a Virtual NIC for the VM and create a Private Endpoint linking our Key Vault to our subnet.

    #----------------------------------------------------
    # Networking
    #----------------------------------------------------
    resource "azurerm_network_interface" "vnic" {
      name                = "hybrid-worker-test-vnic"
      location            = azurerm_resource_group.rg.location
      resource_group_name = azurerm_resource_group.rg.name

      ip_configuration {
        name                          = "internal"
        subnet_id                     = azurerm_subnet.subnet.id
        private_ip_address_allocation = "Dynamic"
      }
    }

    # Gives the Key Vault a private IP inside our subnet
    resource "azurerm_private_endpoint" "pe" {
      name                = "hybrid-worker-test-pe"
      location            = azurerm_resource_group.rg.location
      resource_group_name = azurerm_resource_group.rg.name
      subnet_id           = azurerm_subnet.subnet.id

      private_service_connection {
        name                           = "connection-to-kv"
        is_manual_connection           = false
        private_connection_resource_id = azurerm_key_vault.kv.id
        subresource_names              = ["vault"]
      }
    }

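One thing the block above doesn't show is DNS. For the VM to resolve the vault's hostname to the Private Endpoint's private IP, a private DNS zone is usually linked to the VNet. The repo may already handle this elsewhere; if not, a sketch under that assumption could look like the following. The zone name privatelink.vaultcore.azure.net is the standard one for Key Vault in the public Azure cloud, while the resource labels and link names are made up for illustration.

```hcl
# Private DNS zone for Key Vault private endpoints (public Azure cloud)
resource "azurerm_private_dns_zone" "kv" {
  name                = "privatelink.vaultcore.azure.net"
  resource_group_name = azurerm_resource_group.rg.name
}

# Link the zone to the VNet so the VM uses it for name resolution
resource "azurerm_private_dns_zone_virtual_network_link" "kv" {
  name                  = "hybrid-worker-test-kv-dns-link" # hypothetical name
  resource_group_name   = azurerm_resource_group.rg.name
  private_dns_zone_name = azurerm_private_dns_zone.kv.name
  virtual_network_id    = azurerm_virtual_network.vnet.id
}

# Variation of the "pe" resource with a DNS zone group added, so the
# endpoint's A record is created in the zone automatically.
resource "azurerm_private_endpoint" "pe" {
  name                = "hybrid-worker-test-pe"
  location            = azurerm_resource_group.rg.location
  resource_group_name = azurerm_resource_group.rg.name
  subnet_id           = azurerm_subnet.subnet.id

  private_service_connection {
    name                           = "connection-to-kv"
    is_manual_connection           = false
    private_connection_resource_id = azurerm_key_vault.kv.id
    subresource_names              = ["vault"]
  }

  private_dns_zone_group {
    name                 = "kv-dns" # hypothetical name
    private_dns_zone_ids = [azurerm_private_dns_zone.kv.id]
  }
}
```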
Lastly we have the automation and scheduling blocks. Here we create the Automation Account, create runbooks within it, define schedules, and apply those schedules to the runbooks. Instead of showing the entire block, which you surely won’t read, I will just include the creation of the Automation Account, a runbook, and a schedule.
To apply a schedule to a runbook, you first create the schedule and then use azurerm_automation_job_schedule to define which runbook the schedule should be applied to.

    resource "azurerm_automation_account" "aa" {
      name                = "hybrid-worker-test-aa"
      resource_group_name = azurerm_resource_group.rg.name
      location            = azurerm_resource_group.rg.location
      sku_name            = var.automation_account_sku

      identity {
        type = "SystemAssigned"
      }
    }

    resource "azurerm_automation_runbook" "startvm" {
      name                    = "startvm"
      location                = azurerm_resource_group.rg.location
      resource_group_name     = azurerm_resource_group.rg.name
      automation_account_name = azurerm_automation_account.aa.name
      log_verbose             = true
      log_progress            = true
      description             = "Allocates VM before script runtime"
      runbook_type            = "PowerShell"

      content = <<-EOT
        Connect-AzAccount -Identity
        $vmName = "hybridworkertest-vm"
        $resourceGroup = "hybrid-worker-test-rg"
        Write-Output "Starting VM"
        Start-AzVM -ResourceGroupName $resourceGroup -Name $vmName
      EOT
    }

    resource "azurerm_automation_schedule" "start_vm_schedule" {
      name                    = "Start-vm-schedule"
      resource_group_name     = azurerm_resource_group.rg.name
      automation_account_name = azurerm_automation_account.aa.name
      frequency               = "OneTime"
      start_time              = var.start_vm_time
      timezone                = "America/Chicago"
    }

    # Attaches the schedule to the runbook
    resource "azurerm_automation_job_schedule" "start_vm_job" {
      resource_group_name     = azurerm_resource_group.rg.name
      automation_account_name = azurerm_automation_account.aa.name
      runbook_name            = azurerm_automation_runbook.startvm.name
      schedule_name           = azurerm_automation_schedule.start_vm_schedule.name
    }

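The start/stop runbooks can run in Azure's own sandbox, but the runbook that actually reads from the private Key Vault has to be sent to the Hybrid Worker. Here is a hedged sketch of what that might look like, using the run_on argument of azurerm_automation_job_schedule. The runbook, schedule, and variable names and the Worker Group name are all hypothetical, and the PowerShell body is just one way to fetch the test secret.

```hcl
# Hypothetical variable for when the script should run (RFC 3339, in the future)
variable "run_script_time" {
  description = "Start time for the secret-reading runbook"
  type        = string
}

resource "azurerm_automation_runbook" "readsecret" {
  name                    = "readsecret"
  location                = azurerm_resource_group.rg.location
  resource_group_name     = azurerm_resource_group.rg.name
  automation_account_name = azurerm_automation_account.aa.name
  log_verbose             = true
  log_progress            = true
  description             = "Reads the test secret from the private Key Vault"
  runbook_type            = "PowerShell"

  # Logs only the secret's length so the value never lands in the job log
  content = <<-EOT
    Connect-AzAccount -Identity | Out-Null
    $secret = Get-AzKeyVaultSecret -VaultName "hybrid-worker-test-kv" -Name "test-secret" -AsPlainText
    Write-Output "Retrieved a secret of length $($secret.Length)"
  EOT
}

resource "azurerm_automation_schedule" "run_script_schedule" {
  name                    = "Run-script-schedule"
  resource_group_name     = azurerm_resource_group.rg.name
  automation_account_name = azurerm_automation_account.aa.name
  frequency               = "OneTime"
  start_time              = var.run_script_time
  timezone                = "America/Chicago"
}

resource "azurerm_automation_job_schedule" "read_secret_job" {
  resource_group_name     = azurerm_resource_group.rg.name
  automation_account_name = azurerm_automation_account.aa.name
  runbook_name            = azurerm_automation_runbook.readsecret.name
  schedule_name           = azurerm_automation_schedule.run_script_schedule.name

  # Name of the Hybrid Worker Group to execute on (assumed name).
  # Omitting run_on makes the job run in the Azure sandbox instead.
  run_on = "hybrid-worker-test-group"
}
```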
Lessons Learned
This was my largest Terraform project so far. I wrote all of it under a single main.tf file and it quickly became a huge mess. I relied on the awesome Terraform Style Guide to help me clean it up. I also split my providers and backend sections into different .tf files to reduce the noise in the main.tf file.
Going forward I plan to be much more modular with my Terraform projects, and to avoid cramming everything into an unreadable main.tf.
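In that spirit, the input variables referenced throughout the blocks above are a natural thing to pull into their own variables.tf. The variable names below come from the code in this post; the types, descriptions, and the sensitive flag are my assumptions, since the post doesn't show the declarations.

```hcl
variable "location" {
  description = "Azure region for all resources"
  type        = string
}

variable "vm_zone" {
  description = "Availability zone for the worker VM"
  type        = string
}

variable "vm_size" {
  description = "VM size (SKU) for the worker VM"
  type        = string
}

variable "vm_admin_un" {
  description = "Local admin username for the worker VM"
  type        = string
}

variable "vm_admin_pw" {
  description = "Local admin password for the worker VM"
  type        = string
  sensitive   = true # keeps the value out of plan and apply output
}

variable "automation_account_sku" {
  description = "SKU of the Automation Account"
  type        = string
}

variable "start_vm_time" {
  description = "RFC 3339 start time for the start-VM schedule"
  type        = string
}
```

Values can then be supplied through a .tfvars file or TF_VAR_ environment variables, which keeps the password out of version control.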
Thanks for reading!