Similar to other public cloud providers, Google Cloud offers the ability to attach local high-speed disks to virtual machines (VMs). This offering is called Local SSD.
The “Local” in Local SSD means that the disks are physically attached to the hypervisor host where the VM is running. This gives the VM high-speed, low-latency data operations over the NVMe interface.
On Windows GCE instances (VMs), Local SSD volumes are typically used for temporary or ephemeral data such as the Windows Pagefile, Microsoft SQL Server Temp database, or other high-speed caching needs.
Certain GCE instance machine sizes come with different quantities of Local SSD disks, which are always 375 GB in size. They can be striped together to create larger volumes with higher performance using logical volume management tools in the Operating System.
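As a quick illustration of the sizing arithmetic, here is a minimal Python sketch that computes the raw capacity of a striped volume. It assumes only the fixed 375 GB disk size stated above; the function name is hypothetical, and the usable capacity after formatting will be somewhat lower.

```python
LOCAL_SSD_SIZE_GB = 375  # fixed size of each Local SSD disk, as stated above


def striped_capacity_gb(disk_count: int) -> int:
    """Raw capacity of a striped (no redundancy) volume built from Local SSD disks."""
    if disk_count < 1:
        raise ValueError("at least one Local SSD disk is required")
    # Striping adds no parity or mirroring overhead, so raw capacity is additive.
    return disk_count * LOCAL_SSD_SIZE_GB


if __name__ == "__main__":
    for count in (1, 2, 4, 8):
        print(f"{count} disk(s) -> {striped_capacity_gb(count)} GB raw")
```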
However, again like other public clouds, Local SSD storage is ephemeral and comes with some risks.
Data on volumes created with Local SSD disks can be irretrievably lost and no guarantee is made as to the safety or durability of that data.
Specifically, if the OS is shut down from within the Guest Operating System, the GCE instance will power off and the Local SSD disks and the data they contain will be lost. When the VM is started back up via the Cloud Console, it will have new, empty Local SSD disks which must be initialized again in order to be used.
There are other situations where data on Local SSD disks can be lost, such as if the hypervisor host has an issue and Google Cloud cannot migrate the Local SSD data along with the VM to a new host.
So, generally speaking, Local SSDs offer great performance, but their behavior needs to be understood and the application must be able to tolerate losing the data stored on them.
Usage with localssd_init
In Google Cloud, Local SSD disks show up as raw disks that need to be initialized and formatted before they can be used.
To automate the initialization of Local SSDs on Windows GCE instances and create usable volumes, I have developed a simple PowerShell script called localssd_init.ps1 which can be found in my GitHub repository gce-localssd-init-powershell.
The script is configured to check if certain drive letters are present. If the drive letters are missing, it will recreate them using a specific configuration of Local SSD disks.
The script uses Windows Storage Spaces to create a new Storage Pool from the necessary quantity of Local SSD disks with “Simple” resiliency, which is equivalent to striping. It then creates a new Volume using the maximum size of the pool, formats it as NTFS with the required allocation unit size, and mounts it at the required drive letter.
During server creation, the localssd_init.ps1 script can be used to create the Local SSD-backed volumes and then run at every subsequent boot to make sure they exist and recreate them if they are missing.
Simply run the script and the disks will be created. The script will log to the standard output and also create entries in the System Event Log.
To enable the script to run at every boot, use this snippet to configure a new Windows Task Scheduler task that runs at startup. Be sure to update the path to the script:
If you are on the fence about using Local SSD on Windows GCE instances because those disks might need to be initialized again after an outage, check out localssd_init.ps1 to make sure they’re always available for use.
Private Google Access (PGA) is the method by which clients in Google Cloud Compute Engine (GCE), Google Cloud VMware Engine (GCVE), or on-premise can access Google Cloud APIs without having Internet access or needing an assigned external IP address.
For on-premise clients, this connectivity can take place through private hybrid connections such as Interconnect which may be faster than the subscribed Internet service.
To enable clients to use PGA to access common APIs, two things need to be configured:
Routing: Clients need to have a network route to reach these IP addresses.
DNS: Clients need to resolve the public API fully-qualified domain name (FQDN, such as storage.googleapis.com) to one of the 4 IP addresses in the 199.36.153.8/30 range: 199.36.153.8, .9, .10, and .11.
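To make the /30 range concrete, the following Python sketch expands it into its four addresses using the standard ipaddress module. It assumes 199.36.153.8/30, the range Google documents for private.googleapis.com; the helper names are hypothetical.

```python
import ipaddress

# 199.36.153.8/30 is the range Google documents for private.googleapis.com.
PGA_RANGE = ipaddress.ip_network("199.36.153.8/30")


def pga_addresses(network=PGA_RANGE):
    """Return every address in the block; all four are usable as API endpoints."""
    # Iterating the network (rather than .hosts()) keeps the first and last address.
    return [str(ip) for ip in network]


def is_pga_address(ip: str) -> bool:
    """True if the given IPv4 address falls inside the PGA range."""
    return ipaddress.ip_address(ip) in PGA_RANGE


if __name__ == "__main__":
    print(pga_addresses())
```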
In order to access the 199.36.153.8/30 range for private.googleapis.com, we need to first determine where our clients are located on the network and how they get to the Internet.
Google Compute Engine
Compute Engine clients on a VPC must meet the requirements to use PGA. Two of these requirements are that the VM must not have an external public IP address and that it must use a subnet which has PGA enabled. Enabling PGA is done per-subnet, not per-VPC. Clients will access the 199.36.153.8/30 range through the VPC’s standard default internet gateway route.
If the default internet gateway route has been overridden by a custom route that directs Internet-bound traffic (0.0.0.0/0) to a firewall appliance, then an additional custom route must be created for the 199.36.153.8/30 range with a next hop of default-internet-gateway and a priority that is higher than the custom route for 0.0.0.0/0.
This way, traffic for 199.36.153.8/30 will go out the default Internet gateway of the VPC. It doesn’t actually go to the Internet – instead the underlying Google Cloud network will accept that traffic and direct it to the private internal interfaces of Google APIs.
In the screenshot below, the routes are ordered from highest priority (lowest number) to lowest priority (highest number). The private-google-access route has a priority of 90 which is a higher priority than the route for 0.0.0.0/0 with a priority of 100 that leads to another network. Therefore, traffic for the PGA range 199.36.153.8/30 will go out the default Internet gateway of the VPC and not onward to the other network like all other Internet traffic.
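The effect of these two routes can be sketched with a simplified route-selection model in Python. This is an illustration only, not Google Cloud's implementation: it assumes the most specific matching destination is preferred and that the lowest priority number breaks ties. The priorities 90 and 100 come from the example above; the route name to-other-network and the next hop other-network are hypothetical placeholders.

```python
import ipaddress
from dataclasses import dataclass


@dataclass(frozen=True)
class Route:
    name: str
    dest: str       # destination range in CIDR notation
    priority: int   # lower number means higher priority
    next_hop: str


def select_route(routes, destination_ip):
    """Pick a route: most specific destination first, then lowest priority number."""
    ip = ipaddress.ip_address(destination_ip)
    candidates = [r for r in routes if ip in ipaddress.ip_network(r.dest)]
    if not candidates:
        return None
    return min(
        candidates,
        key=lambda r: (-ipaddress.ip_network(r.dest).prefixlen, r.priority),
    )


ROUTES = [
    # Hypothetical custom default route that sends Internet traffic elsewhere.
    Route("to-other-network", "0.0.0.0/0", 100, "other-network"),
    # Route for the PGA range, as described in the text.
    Route("private-google-access", "199.36.153.8/30", 90, "default-internet-gateway"),
]

if __name__ == "__main__":
    for target in ("199.36.153.10", "8.8.8.8"):
        print(target, "->", select_route(ROUTES, target).next_hop)
```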
The VPC in your project should be advertising a route for 0.0.0.0/0 with a priority that overrides the default internet gateway route. Therefore, an additional custom route must be created for the 199.36.153.8/30 range with a next hop of default-internet-gateway and a priority that is higher than the custom route for 0.0.0.0/0.
Typically, on-premise clients trying to reach the 199.36.153.8/30 range for private.googleapis.com would take the default route out to the internet. But, since this range is not available on the internet, communication will fail.
Supporting Private Google Access with on-premise clients requires hybrid connectivity between the Google Cloud project VPC and the on-premise environment such as through Interconnect or Cloud VPN.
A route for 199.36.153.8/30 must be advertised by the Cloud Router that is associated with the hybrid connectivity. This way, clients on-premise will have a route to 199.36.153.8/30 that leads back over the hybrid connectivity, instead of out to the internet.
Additionally, if the VPC associated with the hybrid connectivity (Interconnect or Cloud VPN) has its default internet gateway route overridden by a custom route that directs Internet-bound traffic (0.0.0.0/0) to a firewall appliance, then an additional custom route must be created for the 199.36.153.8/30 range with a next hop of default-internet-gateway and a priority that is higher than the custom route for 0.0.0.0/0.
This way, on-premise clients trying to reach the 199.36.153.8/30 range will be directed over the hybrid connectivity and then out the VPC’s default internet gateway to access PGA.
To test connectivity for any client to the 199.36.153.8/30 range for private.googleapis.com once routing has been configured, use the Test-NetConnection PowerShell cmdlet with port 80 or 443 (only HTTP/S is supported, ICMP Ping is not). The TcpTestSucceeded value should return True if the routing was configured successfully:
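On clients without PowerShell, roughly the same check can be sketched in Python. This is not the Test-NetConnection cmdlet, just a minimal TCP connect test under the assumption that a completed TCP handshake on port 443 indicates working routing; the function name tcp_test is hypothetical.

```python
import socket


def tcp_test(host: str, port: int = 443, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port can be opened within the timeout."""
    try:
        # create_connection performs the TCP handshake; no application data is sent.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable networks.
        return False
```

For example, tcp_test("199.36.153.8", 443) would be expected to return True from a client whose route to the PGA range is in place.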
Verification of connectivity should be completed prior to making any DNS changes to avoid an outage for clients trying to reach Google APIs.
Once routing is in place and tested, changes in DNS must be made. These changes should be done on the DNS server used by clients.
To override the DNS resolution of a Google API fully-qualified domain name (FQDN), we need to create a new zone in DNS along with A- and CNAME-records that resolve the API FQDN to the private IPs. This way, clients will no longer use the publicly-advertised DNS zone and records which resolve to the public IP of the API and will instead use the private IP from the DNS server’s overriding zone.
If a client is configured to check its local hosts file first for DNS resolution prior to using the internal or public DNS, entries in the hosts file can be configured to point a public API FQDN to a PGA IP. This is a safe way to fully test PGA on a single host before affecting all clients by updating the common internal DNS server.
In this example, we override resolution of storage.googleapis.com and point it to one of the private.googleapis.com IP addresses in the 199.36.153.8/30 range.
% cat /etc/hosts
# Host Database
# localhost is used to configure the loopback interface
# when the system is booting. Do not change this entry.
Any requests on this client to storage.googleapis.com will now go to that private.googleapis.com IP address and should be successful as long as routing is in place.
Within the new googleapis.com zone, create A-records for private.googleapis.com using the four different IP addresses in the 199.36.153.8/30 range: 199.36.153.8, 199.36.153.9, 199.36.153.10, and 199.36.153.11.
On-premise Active Directory clients
If clients are part of an Active Directory (AD) domain, AD DNS can be used to create the zone and A-records:
Be sure to create A-records for each of the IPs:
Next, create a wildcard CNAME-record for *.googleapis.com which points to the private.googleapis.com A records.
Now, when any client tries to resolve any googleapis.com hostname, it will be given one of the PGA IPs.
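The resulting resolution behavior can be modeled with a small Python sketch. It is a toy lookup over an in-memory zone, not a real DNS server: it assumes the four A-records and the wildcard CNAME described above, and the ZONE structure and resolve function are hypothetical names used only for illustration.

```python
# The four addresses of the documented private.googleapis.com range.
PRIVATE_VIPS = ["199.36.153.8", "199.36.153.9", "199.36.153.10", "199.36.153.11"]

# Toy representation of the overriding googleapis.com zone.
ZONE = {
    "A": {"private.googleapis.com": PRIVATE_VIPS},
    "CNAME": {"*.googleapis.com": "private.googleapis.com"},
}


def resolve(name, zone=ZONE, max_depth=5):
    """Follow exact and wildcard CNAMEs until an A-record set is found."""
    name = name.rstrip(".").lower()
    for _ in range(max_depth):
        if name in zone["A"]:
            return list(zone["A"][name])
        if name in zone["CNAME"]:
            name = zone["CNAME"][name]
            continue
        # No exact match: try wildcards by replacing leading labels with "*".
        labels = name.split(".")
        for i in range(1, len(labels)):
            wildcard = "*." + ".".join(labels[i:])
            if wildcard in zone["CNAME"]:
                name = zone["CNAME"][wildcard]
                break
        else:
            return []  # name is not covered by the overriding zone
    return []


if __name__ == "__main__":
    print(resolve("storage.googleapis.com"))
```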
Compute Engine with Google Cloud DNS
If clients use Google Cloud DNS, such as Compute Engine instances, this same configuration can be made in Cloud DNS.
Create a new private zone for googleapis.com, and then create a record set for private.googleapis.com and add the four IP addresses, and then also create a record set for the wildcard CNAME *.googleapis.com.
Terraform can be used to create zones and record sets. Following is an example for setting up googleapis.com:
Google Cloud VMware Engine clients will need to use a DNS server such as Active Directory or Google Cloud DNS (through a Cloud DNS inbound server policy) to resolve overriding domain names.
Additional API domains
Additional zones may need to be configured in the DNS server other than googleapis.com depending on what the client is trying to access. This could include gcr.io and others. See Domain Options for private.googleapis.com.
For example, to override the domain gcr.io, create A records for the domain itself and then a wildcard CNAME for *.gcr.io which points to the domain A records.
When using any Google Cloud service, be sure to review which API FQDNs are in use and determine if additional overriding zones need to be created in the DNS server.
Changes to the instance template will result in a new version of the template. The MIG will be modified to use the new version. All MIG instances will be recreated. See the update_policy section of the MIG module definition (below) to control the update behavior.
Managed Instance Group
The MIG creates the set of instances using the same custom image and instance template. Instances are customized as usual during first boot.
update_policy – specifies how instances should be recreated when a new version of the instance template is available.
type set to
PROACTIVE will update all instances in a rolling fashion.
Leave max_unavailable_fixed as null which results in a value of 0, meaning no live instances can be unavailable.
OPPORTUNISTIC means “only when you manually initiate the update on selected instances or when new instances are created. New instances can be created when you or another service, such as an autoscaler, resizes the MIG. Compute Engine does not actively initiate requests to apply opportunistic updates on existing instances.”
max_surge_fixed indicates the number of additional instances that are temporarily added to the group during an update.
These new instances will use the updated template.
Should be greater than or equal to the number of zones in distribution_policy_zones. If there are no zones specified in distribution_policy_zones, as mentioned previously, the Google-authored MIG module will automatically select all the zones in the region.
replacement_method can be set to either of the following values:
RECREATE preserves the instance name by deleting the old instance and then creating a new one with the same name.
SUBSTITUTE will create new instances with new names.
Results in a faster upgrade of the MIG – instances are available sooner than using RECREATE.
Be sure to consider any necessary firewall rules, especially if using network tags.
The Google-authored MIG module has create_before_destroy set to true, so a new MIG can replace an existing one as a backend behind the load balancer with a very minimal outage (less than 10 seconds). The new MIG will be created and added as a backend, and then the old MIG will be destroyed.
Day 2 operations
Changing size of MIG
If needed, adjust the target_size value of the MIG module to increase or decrease the number of instances. Adjustments take place right away.
If increasing the number of instances and a new template is in place and the update_policy is OPPORTUNISTIC, the new instances will be deployed using the new template.
Changing the zones to use for a MIG
The zones cannot be changed after creation. The MIG must be destroyed and recreated.
Deleting MIG members
Deleting a MIG member automatically reduces the target number of instances for the MIG. The deleted member is not replaced.
Restarting MIG members
Do not manually restart a MIG instance from within the VM itself. This will cause its healthcheck to fail and the MIG will delete/recreate the VM using the template.
Use the RESTART/REPLACE button in the Cloud Console and choose the Restart option. This affects all instances in the group, but can be limited to only acting against a maximum number at a time (“Maximum unavailable instances”).
The “Replace” option within RESTART/REPLACE will delete and recreate instances using the current template.
Updating MIG instances to a new version
When a new version of the custom image is released, such as when it has been updated with new software, the MIG can be updated in a controlled fashion until all members are running the updated version, without any outage.
The MIG module update_policy setting is very important for this process to ensure there is no outage:
max_surge_fixed is the number of additional instances created in the MIG and verified healthy before the old ones are removed.
Should be set to greater than or equal to the number of zones in distribution_policy_zones
max_unavailable_fixed should be set to null which equals 0: no live instances will be unavailable during the update.
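These two rules can be expressed as a small pre-flight check. The Python sketch below is only an illustration of the guidance above, not part of the MIG module: the function name is hypothetical, and treating a null (None) max_unavailable_fixed as 0 follows the behavior described in this post.

```python
def check_update_policy(max_surge_fixed, max_unavailable_fixed, zone_count):
    """Return a list of problems with the update_policy values (empty if none)."""
    problems = []
    # Per the text above, leaving max_unavailable_fixed as null results in 0.
    effective_unavailable = 0 if max_unavailable_fixed is None else max_unavailable_fixed
    if max_surge_fixed < zone_count:
        problems.append(
            f"max_surge_fixed ({max_surge_fixed}) should be >= number of zones ({zone_count})"
        )
    if effective_unavailable != 0:
        problems.append(
            f"max_unavailable_fixed ({effective_unavailable}) allows live instances to be unavailable"
        )
    return problems


if __name__ == "__main__":
    print(check_update_policy(max_surge_fixed=3, max_unavailable_fixed=None, zone_count=3))
```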
The MIG module has options for two different instance templates in order to support performing a canary update where only a percentage of instances are upgraded to the new version:
instance_template_initial_version – template to use for initial deployment
instance_template_next_version – template to use for future canary update
next_version_percent – percentage of instances in the group (of target_size) that should use the canary update
Initially, both options may point to the same template and 0% is allocated to the “next” version.
If a load balancer is used, newly created instances that are verified healthy will automatically be selected to respond to client requests.
To move a percentage of instances to the “next” version via a “canary” update:
Set the instance_template_next_version to point to an instance template which uses an updated custom image
Set the next_version_percent to an appropriate percentage of instances in the group that should use the “next” template.
Make sure update_policy has type set to PROACTIVE – this will cause the change to take effect right away.
When applied via Terraform, all instances will be recreated (adhering to the update_policy) but a percentage of instances will be created using the “next” template.
After the canary update has been validated and all instances should be upgraded, see the steps below for a Regular update.
To update all instances at once (adhering to the update_policy):
Set both the instance_template_initial_version and instance_template_next_version to point to an instance template which uses an updated custom image
Set the next_version_percent to 0.
Make sure update_policy has type set to PROACTIVE – this will cause the change to take effect right away.
When applied via Terraform, all instances will be recreated (adhering to the update_policy).
Minecraft on Xbox and other consoles only supports connecting to private servers on the local LAN or to official servers over the Internet. If you want to connect Minecraft on a console to a remote private Minecraft server, you need a method to fool it into thinking the remote server is actually local.
If you host your own Minecraft Bedrock server in your house and you have an Xbox or other console, it will show up for you in the Friends tab in Minecraft on the console, but friends using a console in their house will not be able to see it in their Friends tab, even if you are in the game and try to invite them.
In order to get Minecraft on Xbox or other consoles on the local network to connect to a private Minecraft Bedrock server over the internet, we first need to understand how Minecraft discovers local servers. I have not done packet capture analysis to validate this theory, but it would appear that Minecraft scans the local subnet, making connection attempts against all local hosts on UDP 19132, looking for valid responses.
If we run a service on a host on the local network that listens on UDP 19132 and forwards any packets sent to it out over the internet to the real Minecraft Bedrock server, the console should believe it’s a local Minecraft server and list it in the Friends tab.
The following method uses a Windows laptop connected to the same local network as a console running Minecraft, running a service which listens on UDP 19132 and forwards connection requests to a remote Minecraft Bedrock server over the internet.
To Minecraft on the console, it will show a LAN server in the Friends tab which is actually the laptop, forwarding to the remote Minecraft Bedrock server.
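To show what such a forwarding service does, here is a minimal UDP relay sketched in Python. It is not sudppipe, the tool used in the steps that follow, and it has not been validated against Minecraft or a console; it only illustrates the listen-and-forward idea. The function name and parameters are hypothetical.

```python
import selectors
import socket


def run_udp_relay(listen_addr, remote_addr, stop_event=None, poll_seconds=0.5):
    """Relay UDP datagrams between local clients and one remote server.

    listen_addr and remote_addr are (host, port) tuples. One upstream socket
    is opened per client so replies can be routed back to the right sender.
    Runs until stop_event (a threading.Event, optional) is set.
    """
    sel = selectors.DefaultSelector()
    listener = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    listener.bind(listen_addr)
    sel.register(listener, selectors.EVENT_READ, data=None)
    upstreams = {}  # client address -> socket connected to the remote server
    try:
        while stop_event is None or not stop_event.is_set():
            for key, _ in sel.select(timeout=poll_seconds):
                try:
                    if key.data is None:
                        # Datagram from a local client: forward it upstream.
                        payload, client = listener.recvfrom(65535)
                        upstream = upstreams.get(client)
                        if upstream is None:
                            upstream = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
                            upstream.connect(remote_addr)
                            sel.register(upstream, selectors.EVENT_READ, data=client)
                            upstreams[client] = upstream
                        upstream.send(payload)
                    else:
                        # Reply from the remote server: return it to that client.
                        payload = key.fileobj.recv(65535)
                        listener.sendto(payload, key.data)
                except OSError:
                    # Ignore transient errors such as ICMP-triggered resets.
                    continue
    finally:
        sel.close()
        for sock in upstreams.values():
            sock.close()
        listener.close()
```

Calling run_udp_relay(("0.0.0.0", 19132), ("server.hostname.com", 19132)) would listen on the default Bedrock port and forward to the remote server, where server.hostname.com is a placeholder for the real server address.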
Follow these steps to get your console running Minecraft to connect to a remotely-hosted Bedrock server, whether it’s at a friend’s house, or a hosting service.
If you have the Bedrock server running locally, your remote friend will need to perform these steps, not you.
Replace server.hostname.com with the DNS hostname or IP address of the Minecraft Bedrock server. Leave the default UDP port 19132.
Windows Security Alert may pop up asking if it’s OK for this application to communicate on networks.
Check the boxes for Domain, Private, and Public networks, then click Allow Access
This is safe to do because it only applies to your home network and only when the sudppipe program is running.
Leave the command prompt window open with sudppipe running
In Minecraft on the console, go to Play and then select the Friends tab
If everything is working properly, a LAN game should now be listed – this is the Minecraft Bedrock server that sudppipe is providing access to.
You should now be able to connect and join the game.
When done, on the laptop in the Command Prompt box, press CTRL+C to quit sudppipe. The console won’t be able to see or communicate with the Bedrock server anymore.
Next time, just open a Command Prompt again and follow the steps above starting at Step 4 and run the commands again. If savvy enough, you can make a batch file script which performs the steps for you, and then just run the batch file.
This was validated with a Windows 10 laptop and Xbox One console but has not been tested with Nintendo Switch or PlayStation 4.
Attempts were made to use a Chromebook with UDP forwarding apps for Chrome or the Linux container, but they were not successful.
I’ve always wanted to set up a Site to Site VPN between a cloud provider and my home network. What follows is a guide inspired by Configure Google Cloud HA VPN with BGP on pfSense but customized for a Google Wi-Fi home network and updated with some pfSense changes that I had to figure out.
When we built the house in 2015, I set up a 3-pack of original Google Wi-Fi (not the “Nest” version) to use as my router and access points throughout the house. Google Wi-Fi is great – it’s very easy to get started. Once deployed, it can generally be thought of as “set it and forget it”. However, it doesn’t provide all the bells and whistles that some of the more advanced home routers offer, but this can be a blessing in disguise because there is less to fiddle with and potentially mess up. Most importantly, it delivers a reliable experience for the family.
My home lab is a simple Intel NUC with a dual-core Intel Core i3-6100U 2.3 GHz CPU and 32 GB RAM. It runs a standalone instance of VMware ESXi 7. I run a few VMs when I need to, but nothing “production”.
Site-to-Site VPN with Google Cloud
Since switching to a full-time focus on cloud engineering and architecture, one of the things I’ve always wanted to try is to set up a Site-to-Site IPsec VPN tunnel with BGP between my home and a virtual private cloud (VPC) network to better understand the customer experience for VPN configuration and network management.
As I mentioned earlier, Google Wi-Fi is rather basic and doesn’t offer any VPN capability, but it can do port forwarding, and when combined with a virtual appliance, that’s all we really need.
Since Google Wi-Fi does not have any VPN capabilities, I intend to use a pfSense virtual appliance in ESXi to act as a router for virtual machine clients on an internal ESXi host-only network. The host-only network will have no physical uplinks so the only way out to the Internet or the private cloud network is through the pfSense router.
pfSense will provide DHCP, DNS, NAT, and routing/default gateway services only to the clients on the internal host-only network.
Following are some installation tips that I found to be helpful:
Upload the pfSense ISO to an ESXi datastore – don’t forget to unzip it first.
When creating a new VM for pfSense on ESXi 7, select Guest OS family “Other” and Guest OS version “FreeBSD Pre-11 versions (64-bit)”
Set CPU to 2
Set Memory to 1 GB
Set Hard Disk to 8 GB
Make sure the SCSI adapter is LSI Logic Parallel
Set Network Adapter 1 to the home/internet network, mark it as Connect
Add a second Network Adapter for the host-only network, leave it as E1000, mark it as Connect
Set CD/DVD Drive 1 to Datastore ISO file and browse for the pfSense ISO, mark it as Connect
Boot the VM off the ISO, accept the defaults and let it reboot.
On first boot, the WAN interface will have a DHCP IP from the home network (Google Wi-Fi assigns in the 192.168.86.0/24 range) and the internal-facing LAN interface will have a static IP of 192.168.1.1. If this is incorrect, use the “Assign interfaces” menu item in the console to set which NIC corresponds to WAN and LAN appropriately. Use the ESXi configuration page to find the MAC address of each NIC and which network it is connected to in order to configure them appropriately.
Port-forward IPSec ports to pfSense
After pfSense is installed, we need to port-forward the external Internet-facing IPSec ports on the Google Wi-Fi router to the pfSense VM.
Google has recently relocated management of Google Wi-Fi to the Google Home app. Look for the Wi-Fi area, click the “gear” icon in upper right, select “Advanced networking”, and then “Port management”.
Use the “+” button to add a new rule. Scroll through the IPv4 tab to find the new “pfSense” entry and select it. Verify the MAC address shown is the same as the pfSense VM’s WAN NIC connected to the home network (“VM Network”). Add an entry for UDP 500. Repeat for UDP 4500.
Note: It is not possible to configure port forwarding unless the internal target is online. The Google Home app will only show a list of active targets that are connected to the network. If the pfSense host is not present, verify the VM is powered on and connected to the home network.
By default, pfSense only allows management access through its LAN interface, so the next step is to deploy a Jump VM with a web browser on the host-only network. Use the VM console to access the Jump VM desktop and launch the browser since it will not be reachable on the home network (in case you wanted to RDP). Verify it has a 192.168.1.0/24 IP. It should also be able to reach the internet but this is not required.
Set the Hostname and Domain to something different than the rest of the network.
Configure WAN interface: Uncheck “Block RFC1918 Private Networks”
Set a secure password for admin
Select Interfaces | WAN
Uncheck “Block bogon networks” if selected
Click Save and then Apply
Google Cloud VPN configuration
Use the Google Cloud Console for the following steps:
Networking | VPC Networks
Create a new VPC network or use an existing one. Should have Dynamic routing mode set to Global.
Networking | Hybrid Connectivity | VPN
Create a new VPN Connection
Select VPC network created earlier
Create a new external IP address or use an available one
Tunnels – set Remote peer IP address to the home external internet IPv4 address (from home, visit https://whatismyipaddress.com/ and note the IPv4 address)
Generate and save the pre-shared key – it is needed for pfSense.
Select Dynamic (BGP) routing option and create a new Cloud Router. Set Google ASN to 65000. Create a new BGP session, set Peer ASN (pfSense) to 65001. Enter Cloud Router BGP IP of 169.254.0.1, and BGP peer IP (pfSense) of 169.254.0.2
Note the external public IP address of the Cloud VPN.
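The BGP addressing chosen above can be sanity-checked with a short Python sketch. It assumes that both BGP IPs should be link-local (169.254.0.0/16) host addresses in the same /30, which matches the values used in this guide; the function name is hypothetical.

```python
import ipaddress


def check_bgp_peers(cloud_router_ip, peer_ip, prefix_len=30):
    """True if both IPs are distinct link-local hosts in the same small subnet."""
    a = ipaddress.ip_interface(f"{cloud_router_ip}/{prefix_len}")
    b = ipaddress.ip_interface(f"{peer_ip}/{prefix_len}")
    usable = set(a.network.hosts())  # excludes the network and broadcast addresses
    return (
        a.ip.is_link_local
        and b.ip.is_link_local
        and a.network == b.network
        and a.ip != b.ip
        and a.ip in usable
        and b.ip in usable
    )


if __name__ == "__main__":
    # Cloud Router BGP IP and pfSense BGP peer IP from the steps above.
    print(check_bgp_peers("169.254.0.1", "169.254.0.2"))
```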
pfSense IPsec configuration
Use the Jump VM web browser for these steps in the pfSense web interface:
Set Remote Gateway to the Google Cloud VPN external public IP recorded previously.
Set “My identifier” to be “IP address” and enter the external public IPv4 address of the home network recorded earlier.
Enter the Pre-Shared Key generated for the Google Cloud VPN tunnel
It may not be possible to paste the key into the VM console – visit https://pastebin.com and create a new “Burn after reading” paste with the key and then access the paste from the Jump VM to retrieve the key.
Set the Phase 1 Encryption Algorithm to AES256-GCM
Set Life Time to 36000
Save and apply changes
Show P2 entries, Add P2
Mode: Routed (VTI)
Local network: Address, BGP IP 169.254.0.2
Remote network: Address, BGP IP 169.254.0.1
AES, 128 bits
AES128-GCM, 128 bits
Hash Algorithms: SHA256
PFS key group: 14 (2048 bit)
Save, Apply changes
Click on Firewall | Rules, select IPsec from along the top, Add a new rule
Set Protocol to Any
Save rule, Apply changes
pfSense BGP configuration
Go to System | Package Manager, click on Available Packages, search for “frr”. Install “frr”. This will connect out to the Internet to retrieve the packages. Wait for it to complete successfully.
Go to Services | FRR Global/Zebra
Enter a master password.
Set Syslog Logging to enabled and set Package Logging Level to Extended
Click on Access Lists along the top
Add a new Access List
Access List Entries: set Sequence to 0, set Action to Permit, check box for Source Any
Click on Prefix Lists along the top
Add a new Prefix List
Prefix List Entries: set Sequence to 0, set Action to Permit, check box for Any
Click on BGP along the top
Enable BGP Routing
Set Local AS to 65001 (GCP Cloud Router was set to 65000)
Set Router ID to 169.254.0.2 (GCP Cloud Router was set to 169.254.0.1)
Set Hold Time to 30
At the bottom, set Networks to Distribute to 192.168.1.0/24
Click Neighbors along the top, add a new Neighbor
Remote AS: 65000
Prefix List Filter: IPv4-any, for both Inbound & Outbound
Path Advertise: All Paths to Neighbor
In pfSense, click on Status | FRR
In the Zebra Routes area, you should see “B>*” entries for subnets in the GCP VPC “via 169.254.0.1” (BGP IP of GCP Cloud Router)
In the BGP Routes area, you should see Networks listed for GCP VPC subnets, with Next Hop of 169.254.0.1 (BGP IP of GCP Cloud Router) and Path of 65000 (GCP Cloud Router ASN)
BGP Neighbors should list 169.254.0.1 as a neighbor with remote AS 65000, local AS 65001 and a number of “accepted prefixes” which are the VPC subnets.
Visit the Cloud VPN area in Google Cloud Console. The VPN Tunnel should show Established, and the BGP session should also show BGP established.
Visit the VPC and click on its Routes. There should be one listed for the on-premise pfSense LAN, 192.168.1.0/24 via next hop 169.254.0.2.
At this point, VMs in GCP should be able to communicate with VMs in the on-premise pfSense LAN 192.168.1.0/24 network.
Create a GCE instance with no public IP and attach it to the VPC subnet. Make sure the firewall rules that apply to the instance permit ingress traffic from the 192.168.1.0/24 network and allow the appropriate ports and protocols:
TCP 22 for SSH
TCP 3389 for RDP
If things are not connecting, double-check everything, but also be sure to check the logs in pfSense and in GCP Cloud Logging. The most frequent issue I encountered was a mismatch of proposals by not selecting the right ciphers for the tunnel, or not setting my identifier properly. Also consider how firewall rules will impact communication.
Finally, the settings outlined here are obviously not meant for production use. I don’t claim to understand BGP any more than what it took to get pfSense working with Cloud VPN, so some of the settings I recommend could be enhanced and tightened from a security perspective. As always, your mileage may vary.