Working with Google Cloud Managed Instance Groups

Google Cloud Managed Instance Groups (MIGs) are groups of identical virtual machine instances that serve the same purpose.

Instances are created from an Instance Template, which defines the configuration that all instances will use, including image, machine type, network, etc.

MIGs that host services are fronted by a load balancer which distributes client requests across the instances in the group.

MIG instances can also run batch processing applications which do not serve client requests and do not require a load balancer.

MIGs can be configured for autoscaling, which adds or removes VM instances in the group based on CPU load or other measures of demand.

They can also auto-heal by replacing failed instances. Health checks are used to make sure each instance is responding correctly.

MIGs should be Regional and use VM instances in at least two different zones of a region. Regional MIGs can have up to 2000 instances.

Terraform

Two different modules authored by Google can be used to create an Instance Template and MIG:

  • Instance template: terraform-google-modules/vm/google//modules/instance_template
  • Multi-version MIG: terraform-google-modules/vm/google//modules/mig_with_percent

To optionally create an internal load balancer, use: GoogleCloudPlatform/lb-internal/google

The examples below create a service account, two instance templates, a MIG, and an internal load balancer.

Pre-requisites

  • A custom image should be created with nginx installed and running at boot.
  • A VPC with a proxy-only subnet is required.
  • The instance template requires a service account.

# Enable IAM API
resource "google_project_service" "project" {
 project = "my-gcp-project-1234"
 service = "iam.googleapis.com"
 disable_on_destroy = false
}

# Service Account required for the Instance Template module
resource "google_service_account" "sa" {
 project = "my-gcp-project-1234"
 account_id = "sa-mig-test"
 display_name = "Service Account MIG test"
 depends_on = [ google_project_service.project ]
}

Update project, account_id, and display_name with appropriate values.

Instance Templates

The instance template defines the instance configuration. This includes which network to join, any labels to apply to the instance, the size of the instance, network tags, disks, custom image, etc.

The MIG deployment requires an instance template.

The instance template requires that a source image have already been created.
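One way to confirm that the image exists before any template refers to it is a data source lookup. This is a minimal sketch, not part of the original example: it assumes the image is named image-nginx in project my-gcp-project-1234, and the data source and output names are placeholders. If the image cannot be found, the plan fails at this lookup.

```hcl
# Assumption: the custom image is named "image-nginx" and lives in this project.
data "google_compute_image" "nginx" {
  project = "my-gcp-project-1234"
  name    = "image-nginx"
}

# Placeholder output, handy for checking which image was resolved.
output "nginx_image_self_link" {
  value = data.google_compute_image.nginx.self_link
}
```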

In this Terraform code example, two instance templates are created:

  • “A” template – initial version to use in the MIG
  • “B” template – future upgrade version to use with an optional canary update method

During the initial deployment, each instance template can point to the same custom image for the source_image value. In the future, each instance template should point to a different custom image.

# Instance Template "A"
# Module src: https://github.com/terraform-google-modules/terraform-google-vm/blob/master/modules/instance_template
# Registry: https://registry.terraform.io/modules/terraform-google-modules/vm/google/latest/submodules/instance_template
# Creates google_compute_instance_template
module "instance_template_A" {
 source = "terraform-google-modules/vm/google//modules/instance_template"
 region = "us-central1"
 project_id = "my-gcp-project-1234"
 subnetwork = "us-central-01"
 
 service_account = {
  email = google_service_account.sa.email
  scopes = ["cloud-platform"]
 }

 name_prefix = "nginx-a"
 tags = ["nginx"]
 labels = { mig = "nginx" }
 machine_type = "f1-micro"
 startup_script = "sed -i 's/nginx/'$HOSTNAME'/g' /var/www/html/index.nginx-debian.html"

 source_image_project = "my-gcp-project-1234"
 source_image = "image-nginx"
 disk_size_gb = 10
 disk_type = "pd-balanced"
 preemptible = true
}

# Instance Template "B"
module "instance_template_B" {
 source = "terraform-google-modules/vm/google//modules/instance_template"
 region = "us-central1"
 project_id = "my-gcp-project-1234"
 subnetwork = "us-central-01"
 
 service_account = {
  email = google_service_account.sa.email
  scopes = ["cloud-platform"]
 }

 name_prefix = "nginx-b"
 tags = ["nginx"]
 labels = { mig = "nginx" }
 machine_type = "f1-micro"
 startup_script = "sed -i 's/nginx/'$HOSTNAME'/g' /var/www/html/index.nginx-debian.html"

 source_image_project = "my-gcp-project-1234"
 source_image = "image-nginx"
 disk_size_gb = 10
 disk_type = "pd-balanced"
 preemptible = true
}

Update the following with appropriate values:

  • Module name
  • region
  • project_id
  • subnetwork – the VPC subnet to use for instances deployed via the template
  • name_prefix – prefix for the name of the instance template; a version will be attached to the name.
    • Be sure to include any specific versioning to indicate what is in the custom image.
    • Lowercase only.
  • tags – any required network tags
  • labels – labels to apply to instances deployed via the template
  • machine_type – machine size to use
  • startup_script – startup script to run on each boot (not just deployment)
  • source_image_project – project where the image resides
  • source_image – image name
  • disk_size_gb – size of the boot disk
  • disk_type – type of boot disk
  • preemptible – if set to true, instances can be preempted as needed by Google Cloud.

More instance template module options are available; see the Registry link in the comments at the top of the template "A" example above.

Changes to the instance template will result in a new version of the template. The MIG will be modified to use the new version. All MIG instances will be recreated. See the update_policy section of the MIG module definition (below) to control the update behavior.
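As a sketch of what such a change could look like, template "B" might later point at a newer custom image, with the version reflected in name_prefix. The image name image-nginx-v2 and the prefix nginx-b-v2 are hypothetical; only the changed arguments are shown.

```hcl
# Fragment: only the arguments that change are listed here.
module "instance_template_B" {
  source = "terraform-google-modules/vm/google//modules/instance_template"

  # ... remaining arguments as in the template "B" example above ...

  # Hypothetical names: encode the image version in the prefix (lowercase only).
  name_prefix          = "nginx-b-v2"
  source_image_project = "my-gcp-project-1234"
  source_image         = "image-nginx-v2"
}
```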

Managed Instance Group

The MIG creates the set of instances using the same custom image and instance template. Instances are customized as usual during first boot.

A custom startup script can run every time the instance starts and configure the VM further. See the startup scripts overview in the Compute Engine documentation.

In this Regional MIG Terraform example, the initial set of instances is deployed using the “A” template set as the instance_template_initial_version.

The same “A” template is also set for the instance_template_next_version with a value of 0 for the next_version_percent.

In a future canary update, set the instance_template_next_version to the “B” template with an appropriate value for next_version_percent.

# Regional Managed Instance Group with support for canary updates 
# Module src: https://github.com/terraform-google-modules/terraform-google-vm/tree/master/modules/mig_with_percent 
# Registry: https://registry.terraform.io/modules/terraform-google-modules/vm/google/latest/submodules/mig_with_percent 
# Creates google_compute_health_check.http (optional), google_compute_health_check.https (optional), google_compute_health_check.tcp (optional), google_compute_region_autoscaler.autoscaler (optional), google_compute_region_instance_group_manager.mig 

module "mig_nginx" { 
 source = "terraform-google-modules/vm/google//modules/mig_with_percent" 
 project_id = "my-gcp-project-1234" 
 hostname = "mig-nginx" 
 region = "us-central1" 
 target_size = 4
 
 instance_template_initial_version = module.instance_template_A.self_link 

 instance_template_next_version = module.instance_template_A.self_link 
 next_version_percent = 0 
 
 //distribution_policy_zones = ["us-central1-a", "us-central1-f"]
 
 update_policy = [{ # See https://registry.terraform.io/providers/hashicorp/google/latest/docs/resources/compute_region_instance_group_manager#nested_update_policy 
  type = "PROACTIVE" 
  instance_redistribution_type = "PROACTIVE" 
  minimal_action = "REPLACE" 
  max_surge_percent = null 
  max_unavailable_percent = null 
  max_surge_fixed = 4 
  max_unavailable_fixed = null 
  min_ready_sec = 50 
  replacement_method = "SUBSTITUTE" 
  }] 
 
 named_ports = [{ 
  name = "web" 
  port = 80 
 }] 
 
 health_check = { 
  type = "http" 
  initial_delay_sec = 30 
  check_interval_sec = 30 
  healthy_threshold = 1 
  timeout_sec = 10 
  unhealthy_threshold = 5 
  response = "" 
  proxy_header = "NONE" 
  port = 80 
  request = "" 
  request_path = "/" 
  host = "" 
 } 
 
 autoscaling_enabled = "false" 
 /* 
 max_replicas = var.max_replicas 
 min_replicas = var.min_replicas 
 cooldown_period = var.cooldown_period 
 autoscaling_cpu = var.autoscaling_cpu 
 autoscaling_metric = var.autoscaling_metric 
 autoscaling_lb = var.autoscaling_lb 
 autoscaling_scale_in_control = var.autoscaling_scale_in_control 
 */ 
} 

Update the following with appropriate values:

  • Module name
  • project_id
  • hostname – the prefix for provisioned VM names/hostnames. Will have a random set of 4 characters appended to the end.
  • region
  • target_size – number of instances to create in the MIG. Does not need to equal the number of zones in distribution_policy_zones.
  • instance_template_initial_version – template to use for initial deployment
  • instance_template_next_version – template to use for future canary update
  • next_version_percent – percentage of instances in the group (of target_size) that should use the canary update
  • distribution_policy_zones – zone names in the region where VMs should be provisioned.
    • Optional. If not specified, the Google-authored terraform module will automatically select each zone in the region.
      • Example: us-central1 region has 4 zones so each zone will be populated in this field. This directly impacts the update_policy and its max_surge_fixed value.
    • This value cannot be changed later. The module will ignore any changes.
      • The MIG will need to be destroyed and recreated to update the zones to use.
    • More than two zones can be specified.
    • The target_size does not need to match the number of zones specified.
    • See “About regional MIGs” in the Compute Engine documentation.
  • update_policy – specifies how instances should be recreated when a new version of the instance template is available.
    • type set to
      • PROACTIVE will update all instances in a rolling fashion.
        • Leave max_unavailable_fixed as null which results in a value of 0, meaning no live instances can be unavailable.
        • Recommended
      • OPPORTUNISTIC means “only when you manually initiate the update on selected instances or when new instances are created. New instances can be created when you or another service, such as an autoscaler, resizes the MIG. Compute Engine does not actively initiate requests to apply opportunistic updates on existing instances.”
        • Not recommended
    • max_surge_fixed indicates the number of additional instances that are temporarily added to the group during an update.
      • These new instances will use the updated template.
      • Should be greater than or equal to the number of zones in distribution_policy_zones. If there are no zones specified in distribution_policy_zones, as mentioned previously, the Google-authored MIG module will automatically select all the zones in the region.
    • replacement_method can be set to either of the following values:
      • RECREATE preserves the instance name by deleting the old instance and then creating a new one with the same name.
      • SUBSTITUTE will create new instances with new names.
        • Results in a faster upgrade of the MIG – instances are available sooner than using RECREATE.
        • Recommended.
    • See the Terraform Registry link in the example above and “Automatically apply VM configuration updates in a MIG” in the Compute Engine documentation.
  • named_ports – set the port name and port number as appropriate
  • health_check – set the check type, port, and request_path as appropriate
  • Autoscaling can also be configured. See “Autoscaling groups of instances” in the Compute Engine documentation.
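For orientation, the optional autoscaler that the module lists among the resources it creates corresponds to a google_compute_region_autoscaler resource. The sketch below shows that resource on its own, with illustrative values (2 to 8 replicas, 60% CPU target). It assumes the module exposes a self_link output for the MIG. In practice the module's own autoscaling inputs would normally be used instead of a separate resource, and their exact shape should be checked in the module's variables.tf.

```hcl
# Illustration only: do not declare this alongside the module's own autoscaler.
resource "google_compute_region_autoscaler" "nginx" {
  project = "my-gcp-project-1234"
  name    = "autoscaler-nginx" # hypothetical name
  region  = "us-central1"

  # Assumption: the MIG module exposes the group's self link as "self_link".
  target = module.mig_nginx.self_link

  autoscaling_policy {
    min_replicas    = 2  # illustrative values
    max_replicas    = 8
    cooldown_period = 60 # seconds to wait for a new VM to initialize

    cpu_utilization {
      target = 0.6 # aim for about 60% average CPU across the group
    }
  }
}
```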

More MIG module options are available; see the Registry link in the comments at the top of the MIG example above.

Changes to the MIG may result in VMs needing to be updated. See the update_policy section of the MIG module definition (above) to configure the behavior when updating the MIG members.

Load Balancer

An Internal Load Balancer can make a MIG highly available to internal clients.

module "ilb_nginx" {
 source = "GoogleCloudPlatform/lb-internal/google"
 version = "~> 4.0"
 project = "my-gcp-project-1234"
 network = module.vpc_central.network_name
 subnetwork = module.vpc_central.subnets["us-central1/central-01-subnet-ilb"].name
 region = "us-central1"
 name = "ilb-nginx"
 ports = ["80"]
 source_tags = ["nginx"]
 target_tags = ["nginx"]

 backends = [{
  group = module.mig_nginx.instance_group
  description = ""
  failover = false
 }]

 health_check = {
  type = "http"
  check_interval_sec = 30
  healthy_threshold = 1
  timeout_sec = 10
  unhealthy_threshold = 5
  response = ""
  proxy_header = "NONE"
  port = 80
  request = ""
  request_path = "/"
  host = ""
  enable_log = false
  port_name = "web"
 }
}

Update the following with appropriate values:

  • Module name
  • project
  • network and subnetwork – the VPC and proxy-only subnet to use
  • region
  • name
  • ports – the port to listen on
  • source_tags and target_tags – network tags to use, should be present on the MIG members via the instance template.
  • backends – points to the MIG
  • health_check – should generally match the MIG health check.

More options are available: see variables.tf and main.tf in the module source.

Be sure to consider any necessary firewall rules, especially if using network tags.
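If a standalone rule turns out to be needed, a minimal sketch could look like the following. It allows Google Cloud health check probes (source ranges 130.211.0.0/22 and 35.191.0.0/16) to reach port 80 on instances carrying the nginx network tag. The rule name and the network name vpc-central are assumptions; substitute the VPC used for the MIG.

```hcl
# Assumption: "vpc-central" is a placeholder for the VPC the MIG instances join.
resource "google_compute_firewall" "allow_health_checks_nginx" {
  project   = "my-gcp-project-1234"
  name      = "allow-health-checks-nginx" # hypothetical rule name
  network   = "vpc-central"
  direction = "INGRESS"

  # Source ranges used by Google Cloud health check probes.
  source_ranges = ["130.211.0.0/22", "35.191.0.0/16"]

  # Applies only to instances with the network tag set in the instance template.
  target_tags = ["nginx"]

  allow {
    protocol = "tcp"
    ports    = ["80"]
  }
}
```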

The Google-authored MIG module has create_before_destroy set to true, so a new MIG can replace an existing one as a backend behind the load balancer with only a very brief outage (less than 10 seconds). The new MIG will be created and added as a backend, and then the old MIG will be destroyed.

Day 2 operations

Changing size of MIG

If needed, adjust the target_size value of the MIG module to increase or decrease the number of instances. The adjustment takes effect as soon as the change is applied.

If the number of instances is increased while a new template is in place and the update_policy type is OPPORTUNISTIC, the added instances will be deployed using the new template.

Changing the zones to use for a MIG

The zones cannot be changed after creation. The MIG must be destroyed and recreated.

Deleting MIG members

Deleting a MIG member automatically reduces the target number of instances for the MIG. The deleted member is not replaced.

Restarting MIG members

Do not manually restart a MIG instance from within the VM itself. This will cause its health check to fail, and the MIG will delete and recreate the VM using the template.

Use the RESTART/REPLACE button in the Cloud Console and choose the Restart option. This affects all instances in the group, but can be limited to only acting against a maximum number at a time (“Maximum unavailable instances”).

The “Replace” option within RESTART/REPLACE will delete and recreate instances using the current template.

Updating MIG instances to a new version

When a new version of the custom image is released, such as when it has been updated with new software, the MIG can be updated in a controlled fashion until all members are running the updated version, without any outage.

The MIG module update_policy setting is very important for this process to ensure there is no outage:

  • max_surge_fixed is the number of additional instances created in the MIG and verified healthy before the old ones are removed.
    • Should be set to greater than or equal to the number of zones in distribution_policy_zones
  • max_unavailable_fixed should be set to null which equals 0: no live instances will be unavailable during the update.

The MIG module has options for two different instance templates in order to support performing a canary update where only a percentage of instances are upgraded to the new version:

  • instance_template_initial_version – template to use for initial deployment
  • instance_template_next_version – template to use for future canary update
  • next_version_percent – percentage of instances in the group (of target_size) that should use the canary update

Initially, both options may point to the same template and 0% is allocated to the “next” version.

If a load balancer is used, newly created instances that are verified healthy will automatically be selected to respond to client requests.

Canary update

To move a percentage of instances to the “next” version via a “canary” update:

  1. Set the instance_template_next_version to point to an instance template which uses an updated custom image
  2. Set the next_version_percent to an appropriate percentage of instances in the group that should use the “next” template.
  3. Make sure update_policy has type set to PROACTIVE – this will cause the change to take effect right away.
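The steps above could be sketched as the following change to the MIG module block. Only the affected arguments are shown; the value 25 is illustrative and would target roughly a quarter of the group.

```hcl
# Fragment: canary update, with part of the group moved to template "B".
module "mig_nginx" {
  source = "terraform-google-modules/vm/google//modules/mig_with_percent"

  # ... remaining arguments as in the MIG example, with update_policy type = "PROACTIVE" ...

  instance_template_initial_version = module.instance_template_A.self_link
  instance_template_next_version    = module.instance_template_B.self_link
  next_version_percent              = 25 # illustrative percentage
}
```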

When applied via Terraform, all instances will be recreated (adhering to the update_policy) but a percentage of instances will be created using the “next” template.

After the canary update has been validated and all instances should be upgraded, see the steps below for a Regular update.

Regular update

To update all instances at once (adhering to the update_policy):

  1. Set both the instance_template_initial_version and instance_template_next_version to point to an instance template which uses an updated custom image
  2. Set the next_version_percent to 0.
  3. Make sure update_policy has type set to PROACTIVE – this will cause the change to take effect right away.
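As a sketch of these steps, assuming template "B" is the one built from the updated custom image, both version arguments point at it and the percentage returns to 0. Only the affected arguments are shown.

```hcl
# Fragment: regular update, with every instance moved to template "B".
module "mig_nginx" {
  source = "terraform-google-modules/vm/google//modules/mig_with_percent"

  # ... remaining arguments as in the MIG example, with update_policy type = "PROACTIVE" ...

  instance_template_initial_version = module.instance_template_B.self_link
  instance_template_next_version    = module.instance_template_B.self_link
  next_version_percent              = 0
}
```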

When applied via Terraform, all instances will be recreated (adhering to the update_policy).
