Google Cloud Managed Instance Groups (MIGs) are groups of identical virtual machine instances that serve the same purpose.
Instances are created from an Instance Template, which defines the configuration that all instances share, including image, machine type, network, etc.
MIGs that host services are fronted by a load balancer which distributes client requests across the instances in the group.
MIG instances can also run batch processing applications which do not serve client requests and do not require a load balancer.
MIGs can be configured to autoscale, adding or removing VM instances based on CPU utilization or other measures of demand.
They can also auto-heal by replacing failed instances; health checks verify that each instance is responding correctly.
MIGs should be Regional and use VM instances in at least two different zones of a region. Regional MIGs can have up to 2000 instances.
Terraform
Two different modules authored by Google can be used to create an Instance Template and MIG:
- Instance template:
terraform-google-modules/vm/google//modules/instance_template
- Multi-version MIG:
terraform-google-modules/vm/google//modules/mig_with_percent
To optionally create an internal TCP/UDP load balancer, use: GoogleCloudPlatform/lb-internal/google
The examples below create a service account, two instance templates, a MIG, and an internal load balancer.
Prerequisites
- A custom image should be created with nginx installed and running at boot.
- A VPC with a suitable subnet is required (a proxy-only subnet is needed only if the MIG is fronted by an internal HTTP(S) load balancer).
- The instance template requires a service account.
- The iam.googleapis.com API must be enabled on the project.
# Enable IAM API
resource "google_project_service" "project" {
  project            = "my-gcp-project-1234"
  service            = "iam.googleapis.com"
  disable_on_destroy = false
}
# Service Account required for the Instance Template module
resource "google_service_account" "sa" {
  project      = "my-gcp-project-1234"
  account_id   = "sa-mig-test"
  display_name = "Service Account MIG test"
  depends_on   = [google_project_service.project]
}
Update `project`, `account_id`, and `display_name` with appropriate values.
Instance Templates
The instance template defines the instance configuration. This includes which network to join, any labels to apply to the instance, the size of the instance, network tags, disks, custom image, etc.
The MIG deployment requires an instance template.
The instance template requires that a source image have already been created.
In this terraform code example, two instance templates are created:
- “A” template – initial version to use in the MIG
- “B” template – future upgrade version to use with an optional canary update method
During the initial deployment, both instance templates can point to the same custom image for the `source_image` value. In the future, each instance template should point to a different custom image.
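Since both templates initially reference the same image, one option is to keep the image names in a single place and reference them from each template. This is a minimal sketch; the `locals` block and the `image-nginx-v2` name are hypothetical:

# Hypothetical: track each template's image in one place.
locals {
  image_a = "image-nginx"   # image for template "A" (current version)
  image_b = "image-nginx"   # later, point at e.g. "image-nginx-v2" for a canary
}

Each template's module call would then set `source_image = local.image_a` (or `local.image_b`) instead of a literal string.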
# Instance Template "A"
# Module src: https://github.com/terraform-google-modules/terraform-google-vm/blob/master/modules/instance_template
# Registry: https://registry.terraform.io/modules/terraform-google-modules/vm/google/latest/submodules/instance_template
# Creates google_compute_instance_template
module "instance_template_A" {
  source     = "terraform-google-modules/vm/google//modules/instance_template"
  region     = "us-central1"
  project_id = "my-gcp-project-1234"
  subnetwork = "us-central-01"
  service_account = {
    email  = google_service_account.sa.email
    scopes = ["cloud-platform"]
  }
  name_prefix          = "nginx-a"
  tags                 = ["nginx"]
  labels               = { mig = "nginx" }
  machine_type         = "f1-micro"
  startup_script       = "sed -i 's/nginx/'$HOSTNAME'/g' /var/www/html/index.nginx-debian.html"
  source_image_project = "my-gcp-project-1234"
  source_image         = "image-nginx"
  disk_size_gb         = 10
  disk_type            = "pd-balanced"
  preemptible          = true
}
# Instance Template "B"
module "instance_template_B" {
  source     = "terraform-google-modules/vm/google//modules/instance_template"
  region     = "us-central1"
  project_id = "my-gcp-project-1234"
  subnetwork = "us-central-01"
  service_account = {
    email  = google_service_account.sa.email
    scopes = ["cloud-platform"]
  }
  name_prefix          = "nginx-b"
  tags                 = ["nginx"]
  labels               = { mig = "nginx" }
  machine_type         = "f1-micro"
  startup_script       = "sed -i 's/nginx/'$HOSTNAME'/g' /var/www/html/index.nginx-debian.html"
  source_image_project = "my-gcp-project-1234"
  source_image         = "image-nginx"
  disk_size_gb         = 10
  disk_type            = "pd-balanced"
  preemptible          = true
}
Update the following with appropriate values:
- Module name
- `region`
- `project_id`
- `subnetwork` – the VPC subnet to use for instances deployed via the template
- `name_prefix` – prefix for the instance template name; a version number is appended to it.
  - Be sure to include any specific versioning to indicate what is in the custom image.
  - Lowercase only.
- `tags` – any required network tags
- `labels` – labels to apply to instances deployed via the template
- `machine_type` – machine size to use
- `startup_script` – startup script to run on each boot (not just deployment)
- `source_image_project` – project where the image resides
- `source_image` – image name (see the image-family sketch after this list)
- `disk_size_gb` – size of the boot disk
- `disk_type` – type of boot disk
- `preemptible` – if set to true, instances can be preempted as needed by Google Cloud.
  - Preemptible instances can run for up to 24 hours before being stopped.
  - The MIG will recreate replacements when preemptible capacity is available again.
  - This is a cost-saving measure and should be used where possible.
  - See Instance groups | Compute Engine Documentation | Google Cloud
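Rather than hard-coding an image name, the templates can also resolve the newest image in an image family. This sketch assumes the custom images are published to a hypothetical `nginx` image family:

# Hypothetical: look up the latest non-deprecated image in the "nginx" family.
data "google_compute_image" "nginx" {
  family  = "nginx"
  project = "my-gcp-project-1234"
}

# In the instance template module call:
#   source_image = data.google_compute_image.nginx.name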
More instance template module options are available:
- See the module source for `variables.tf` and `main.tf`
- See the resource definition for google_compute_instance_template
Changes to the instance template result in a new version of the template. The MIG will be modified to use the new version, and all MIG instances will be recreated. See the `update_policy` section of the MIG module definition (below) to control the update behavior.
Managed Instance Group
The MIG creates the set of instances using the same instance template and custom image. Instances are customized as usual during first boot.
A custom startup script can run every time an instance starts to configure the VM further. See Overview | Compute Engine Documentation | Google Cloud
In this Regional MIG Terraform example, the initial set of instances is deployed using the “A” template, set as the `instance_template_initial_version`. The same “A” template is also set for the `instance_template_next_version`, with a value of 0 for `next_version_percent`. In a future canary update, set the `instance_template_next_version` to the “B” template with an appropriate value for `next_version_percent`.
# Regional Managed Instance Group with support for canary updates
# Module src: https://github.com/terraform-google-modules/terraform-google-vm/tree/master/modules/mig_with_percent
# Registry: https://registry.terraform.io/modules/terraform-google-modules/vm/google/latest/submodules/mig_with_percent
# Creates (optionally) google_compute_health_check.http, google_compute_health_check.https,
# google_compute_health_check.tcp, google_compute_region_autoscaler.autoscaler,
# and google_compute_region_instance_group_manager.mig
module "mig_nginx" {
  source      = "terraform-google-modules/vm/google//modules/mig_with_percent"
  project_id  = "my-gcp-project-1234"
  hostname    = "mig-nginx"
  region      = "us-central1"
  target_size = 4

  instance_template_initial_version = module.instance_template_A.self_link
  instance_template_next_version    = module.instance_template_A.self_link
  next_version_percent              = 0

  # distribution_policy_zones = ["us-central1-a", "us-central1-f"]

  # See https://registry.terraform.io/providers/hashicorp/google/latest/docs/resources/compute_region_instance_group_manager#nested_update_policy
  update_policy = [{
    type                         = "PROACTIVE"
    instance_redistribution_type = "PROACTIVE"
    minimal_action               = "REPLACE"
    max_surge_percent            = null
    max_unavailable_percent      = null
    max_surge_fixed              = 4
    max_unavailable_fixed        = null
    min_ready_sec                = 50
    replacement_method           = "SUBSTITUTE"
  }]

  named_ports = [{
    name = "web"
    port = 80
  }]

  health_check = {
    type                = "http"
    initial_delay_sec   = 30
    check_interval_sec  = 30
    healthy_threshold   = 1
    timeout_sec         = 10
    unhealthy_threshold = 5
    response            = ""
    proxy_header        = "NONE"
    port                = 80
    request             = ""
    request_path        = "/"
    host                = ""
  }

  autoscaling_enabled = false
  /*
  max_replicas                 = var.max_replicas
  min_replicas                 = var.min_replicas
  cooldown_period              = var.cooldown_period
  autoscaling_cpu              = var.autoscaling_cpu
  autoscaling_metric           = var.autoscaling_metric
  autoscaling_lb               = var.autoscaling_lb
  autoscaling_scale_in_control = var.autoscaling_scale_in_control
  */
}
Update the following with appropriate values:
- Module name
- `project_id`
- `hostname` – the prefix for provisioned VM names/hostnames; a random set of 4 characters is appended to the end.
- `region`
- `target_size` – number of instances to create in the MIG. Does not need to equal the number of zones in `distribution_policy_zones`.
- `instance_template_initial_version` – template to use for the initial deployment
- `instance_template_next_version` – template to use for a future canary update
- `next_version_percent` – percentage of instances in the group (of `target_size`) that should use the canary update
- `distribution_policy_zones` – zone names in the region where VMs should be provisioned.
  - Optional. If not specified, the Google-authored Terraform module will automatically select each zone in the region.
    - Example: the `us-central1` region has 4 zones, so each zone will be populated in this field. This directly impacts the `update_policy` and its `max_surge_fixed` value.
  - This value cannot be changed later; the module will ignore any changes.
    - The MIG would need to be destroyed and recreated to change the zones in use.
  - More than two zones can be specified.
  - The `target_size` does not need to match the number of zones specified.
  - See About regional MIGs | Compute Engine Documentation | Google Cloud.
- `update_policy` – specifies how instances should be recreated when a new version of the instance template is available.
  - `type` set to `PROACTIVE` will update all instances in a rolling fashion.
    - Leave `max_unavailable_fixed` as `null`, which results in a value of 0, meaning no live instances can be unavailable.
    - Recommended.
  - `type` set to `OPPORTUNISTIC` means “only when you manually initiate the update on selected instances or when new instances are created. New instances can be created when you or another service, such as an autoscaler, resizes the MIG. Compute Engine does not actively initiate requests to apply opportunistic updates on existing instances.”
    - Not recommended.
  - `max_surge_fixed` indicates the number of additional instances that are temporarily added to the group during an update.
    - These new instances will use the updated template.
    - Should be greater than or equal to the number of zones in `distribution_policy_zones`. If no zones are specified in `distribution_policy_zones`, as mentioned previously, the Google-authored MIG module will automatically select all the zones in the region.
  - `replacement_method` can be set to either of the following values:
    - `RECREATE` – the instance name is preserved by deleting the old instance and then creating a new one with the same name.
    - `SUBSTITUTE` – creates new instances with new names.
      - Results in a faster upgrade of the MIG: instances are available sooner than with `RECREATE`.
      - Recommended.
  - See the Terraform Registry and Automatically apply VM configuration updates in a MIG | Compute Engine Documentation | Google Cloud
- `named_ports` – set the port name and port number as appropriate
- `health_check` – set the check `type`, `port`, and `request_path` as appropriate
- Autoscaling can also be configured (see the sketch after this list). See Autoscaling groups of instances | Compute Engine Documentation | Google Cloud
More MIG module options are available:
- See the module source for `variables.tf` and `main.tf`
- See the resource definition for google_compute_region_instance_group_manager
Changes to the MIG may result in VMs needing to be updated. See the `update_policy` section of the MIG module definition (above) to configure the behavior when updating MIG members.
Load Balancer
An Internal Load Balancer can make a MIG highly available to internal clients.
module "ilb_nginx" {
source = "GoogleCloudPlatform/lb-internal/google"
version = "~4.0"
project = "my-gcp-project-1234"
network = module.vpc_central.network_name
subnetwork = module.vpc_central.subnets["us-central1/central-01-subnet-ilb"].name
region = "us-central1"
name = "ilb-nginx"
ports = ["80"]
source_tags = ["nginx"]
target_tags = ["nginx"]
backends = [{
group = module.mig_nginx.instance_group
description = ""
failover = false
}]
health_check = {
type = "http"
check_interval_sec = 30
healthy_threshold = 1
timeout_sec = 10
unhealthy_threshold = 5
response = ""
proxy_header = "NONE"
port = 80
request = ""
request_path = "/"
host = ""
enable_log = false
port_name = "web"
}
}
Update the following with appropriate values:
- Module name
- `project`
- `network` and `subnetwork` – the VPC and subnet to use
- `region`
- `name`
- `ports` – the port(s) to listen on
- `source_tags` and `target_tags` – network tags to use; these should be present on the MIG members via the instance template.
- `backends` – points to the MIG
- `health_check` – should generally match the MIG health check.
More options are available; see the module source for `variables.tf` and `main.tf`.
Be sure to consider any necessary firewall rules, especially if using network tags.
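For example, Google Cloud health check probes originate from the documented ranges 130.211.0.0/22 and 35.191.0.0/16 and must be allowed to reach the instances. A sketch; the rule name and the VPC reference are illustrative:

# Allow Google Cloud health check probes to reach the MIG instances.
resource "google_compute_firewall" "allow_health_checks" {
  name    = "allow-health-checks-nginx"
  project = "my-gcp-project-1234"
  network = module.vpc_central.network_name

  allow {
    protocol = "tcp"
    ports    = ["80"]
  }

  source_ranges = ["130.211.0.0/22", "35.191.0.0/16"]
  target_tags   = ["nginx"]
}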
The Google-authored MIG module has `create_before_destroy` set to true, so a new MIG can replace an existing one as a backend behind the load balancer with only a very brief outage (less than 10 seconds). The new MIG is created and added as a backend, and then the old MIG is destroyed.
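For reference, this is the standard Terraform lifecycle pattern the module applies internally (illustrative only; it is not something to add to the module call):

# Inside the module's instance group manager resource:
lifecycle {
  create_before_destroy = true
}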
Day 2 operations
Changing size of MIG
If needed, adjust the `target_size` value of the MIG module to increase or decrease the number of instances. Adjustments take effect right away.
If the number of instances is being increased, a new template is in place, and the `update_policy` is `OPPORTUNISTIC`, the new instances will be deployed using the new template.
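For example, growing the group from 4 to 6 instances is a one-line change in the module call (illustrative):

# In module "mig_nginx":
target_size = 6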
Changing the zones to use for a MIG
The zones cannot be changed after creation; the MIG must be destroyed and recreated.
Deleting MIG members
Deleting a MIG member automatically reduces the target number of instances for the MIG. The deleted member is not replaced.
Restarting MIG members
Do not manually restart a MIG instance from within the VM itself; this will cause its health check to fail, and the MIG will delete and recreate the VM using the template.
Instead, use the RESTART/REPLACE button in the Cloud Console and choose the Restart option. This affects all instances in the group, but can be limited to acting on a maximum number at a time (“Maximum unavailable instances”).
The “Replace” option within RESTART/REPLACE will delete and recreate instances using the current template.
Updating MIG instances to a new version
When a new version of the custom image is released, such as when it has been updated with new software, the MIG can be updated in a controlled fashion until all members are running the updated version, without any outage.
The MIG module `update_policy` setting is very important for this process to ensure there is no outage:
- `max_surge_fixed` is the number of additional instances created in the MIG and verified healthy before the old ones are removed.
  - Should be set greater than or equal to the number of zones in `distribution_policy_zones`.
- `max_unavailable_fixed` should be set to `null`, which equals 0: no live instances will be unavailable during the update.
The MIG module has options for two different instance templates in order to support a canary update, where only a percentage of instances are upgraded to the new version:
- `instance_template_initial_version` – template to use for the initial deployment
- `instance_template_next_version` – template to use for a future canary update
- `next_version_percent` – percentage of instances in the group (of `target_size`) that should use the canary update
Initially, both options may point to the same template, with 0% allocated to the “next” version.
If a load balancer is used, newly created instances that are verified healthy will automatically be selected to respond to client requests.
Canary update
To move a percentage of instances to the “next” version via a “canary” update:
- Set the `instance_template_next_version` to point to an instance template that uses an updated custom image.
- Set the `next_version_percent` to an appropriate percentage of instances in the group that should use the “next” template.
- Make sure `update_policy` has `type` set to `PROACTIVE` – this causes the change to take effect right away.
When applied via Terraform, all instances will be recreated (adhering to the `update_policy`), but a percentage of instances will be created using the “next” template.
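As a sketch using the modules defined earlier, a 25% canary (1 of the 4 instances) would look like the following, assuming `instance_template_B` now references the updated custom image:

# In module "mig_nginx": move 25% of instances to template "B".
instance_template_initial_version = module.instance_template_A.self_link
instance_template_next_version    = module.instance_template_B.self_link
next_version_percent              = 25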
After the canary update has been validated and all instances should be upgraded, see the steps below for a Regular update.
Regular update
To update all instances at once (adhering to the `update_policy`):
- Set both the `instance_template_initial_version` and `instance_template_next_version` to point to an instance template that uses an updated custom image.
- Set the `next_version_percent` to 0.
- Make sure `update_policy` has `type` set to `PROACTIVE` – this causes the change to take effect right away.
When applied via Terraform, all instances will be recreated (adhering to the `update_policy`).
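Continuing the canary sketch above, completing the rollout to template “B” would look like:

# In module "mig_nginx": all instances now use template "B".
instance_template_initial_version = module.instance_template_B.self_link
instance_template_next_version    = module.instance_template_B.self_link
next_version_percent              = 0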