Applying GCP OS Login to Terraform and Ansible
My Terraform/Ansible script stopped working after I turned on GCP OS Login. I didn’t know what OS Login was and just turned it on. Then I spent a couple of hours checking whether my custom image (OEL7) was the cause. It turns out it was not. OS Login is a better authentication method (OAuth2) for enterprise customers. In short, GCP OS Login lets you use your own desktop ssh key to log in to all GCE instances you are allowed to access (limited by service account). It is pretty straightforward if you use “gcloud compute ssh”.
Wait…what’s that sa_1233243151 user in the login session? My public key doesn’t have that ID. It is “sa_” followed by the “uniqueId” of the service account I used in my GCP Terraform project. I like it a lot. But when I tried it in my Terraform/Ansible project, it didn’t work: Ansible complained that the ssh connection was refused. This article describes how I modified my Terraform/Ansible project for OS Login. The high-level plan is:
- Creating a GCP service account/key/binding for my Terraform project
- Creating OS Login resource and adding metadata
- Parsing uniqueId from the service account
- Assigning the uniqueId as ansible_user in host inventory
Creating a GCP service account/key/binding
Since this is OS Login, I think gcloud on my desktop is the better choice for creating a service account. You can do the same in the GCP console.
First, use gcloud to log in as a GCP user. The login command opens a browser and asks you to sign in to google.com. Once authenticated, come back and create a service account, then create its private key and store it on your desktop. Next, bind the service account to the “compute.osAdminLogin” role (if you need sudo privilege) or the “compute.osLogin” role. Finally, activate the service account so that it is the identity used to talk to the GCP APIs.
Creating OS Login resource and adding metadata
Now that the service account is bound to the OS Login role, we create a “google_os_login_ssh_public_key” resource to associate the desktop ssh key with the service account.
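A minimal sketch of that resource is below. It assumes the Google provider is authenticated as the service account, so that “google_client_openid_userinfo” returns the service account's email, and that the desktop public key sits at ~/.ssh/id_rsa.pub. Both the resource label and the key path are my placeholders; adjust them to your setup.

```hcl
# Assumption: the provider runs with the service account's credentials,
# so this data source returns the service account's email address.
data "google_client_openid_userinfo" "me" {}

# Register the desktop public key against that identity for OS Login.
resource "google_os_login_ssh_public_key" "desktop" {
  user = data.google_client_openid_userinfo.me.email

  # Assumption: hypothetical key path. file() does not expand "~",
  # so pathexpand() is applied first.
  key = file(pathexpand("~/.ssh/id_rsa.pub"))
}
```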
Next, we tell the GCE VMs to use OS Login. There are two ways to inject the OS Login setting into GCE VM resources. One is a separate metadata resource, and the other is a metadata annotation inside the VM resource itself. I use the annotation way.
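As a rough sketch, the annotation way amounts to a “metadata” entry with the key “enable-oslogin” on the instance. The VM name stan1 is the sample VM from this article; the machine type, default zone, network, and the “image” variable are assumptions for illustration only.

```hcl
# Assumption: hypothetical variables; point them at your own zone and image.
variable "zone" {
  type    = string
  default = "us-central1-a"
}

variable "image" {
  type = string # e.g. the self link of a custom OEL7 image
}

resource "google_compute_instance" "stan1" {
  name         = "stan1"
  machine_type = "e2-small" # assumption: any machine type works here
  zone         = var.zone

  boot_disk {
    initialize_params {
      image = var.image
    }
  }

  network_interface {
    network = "default"
    access_config {} # ephemeral external IP so Ansible can reach the VM
  }

  # The annotation way: enable OS Login on this VM only.
  metadata = {
    "enable-oslogin" = "TRUE"
  }
}
```

The other way should be equivalent to setting the same key project-wide, for example with a “google_compute_project_metadata_item” resource, which turns OS Login on for every VM in the project.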
Parsing uniqueId from the service account
So far, so good. I created a sample VM, and “gcloud compute ssh stan1” did log me in to my GCE VM as the service account uniqueId user. However, the “null_resource” local-exec provisioner timed out, and the OS secure log showed “no such user: ysung.” Ansible was defaulting to my desktop username, so now I need to tell Ansible which user to ssh as (ansible_user).
To inject the “uniqueId” into the Ansible host inventory, I use the “data.external” data source to hold the uniqueId parsed from the service account. Note that “data.external” expects the result as a JSON object. It would be great if there were a data module for this.
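One possible shape for this, as a sketch: let gcloud print the uniqueId as a JSON object and feed that straight to “data.external”. The variable name and the local value are hypothetical. The “json(uniqueId)” format projection is my assumption about the simplest way to get an object whose only value is a string, which is what the external data source requires.

```hcl
# Assumption: hypothetical variable holding the service account's email.
variable "service_account_email" {
  type = string
}

# Runs the gcloud CLI locally; the command must print a JSON object
# whose values are all strings, e.g. {"uniqueId": "..."}.
data "external" "sa_unique_id" {
  program = [
    "gcloud", "iam", "service-accounts", "describe",
    var.service_account_email,
    "--format=json(uniqueId)",
  ]
}

locals {
  # OS Login usernames for service accounts are "sa_" plus the uniqueId.
  ansible_user = "sa_${data.external.sa_unique_id.result.uniqueId}"
}
```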
With the local_file resource, we can inject the “uniqueId” into the Ansible host inventory. Note that the username is “sa_” followed by the uniqueId.
The hosts.tpl template file is a regular Ansible inventory in which ansible_user is a template variable that Terraform fills in.
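The two pieces can be sketched together as follows. This version uses the built-in templatefile() function; the variable names, the output filename, and the template contents shown in the comment are assumptions, not the exact files from my project.

```hcl
# Assumption: hypothetical inputs. In a real project these would come from
# the external data source and the VM resource rather than from variables.
variable "sa_unique_id" {
  type = string
}

variable "host_ip" {
  type = string
}

# hosts.tpl is assumed to contain something like:
#   [gce]
#   ${host_ip} ansible_user=${ansible_user}

resource "local_file" "ansible_inventory" {
  filename = "${path.module}/hosts"

  # Render the template, passing the OS Login username as ansible_user.
  content = templatefile("${path.module}/hosts.tpl", {
    host_ip      = var.host_ip
    ansible_user = "sa_${var.sa_unique_id}"
  })
}
```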
Terraform does provide “data” and “resource” types for GCP service accounts. If you already have a service account, you can declare a service account data source and bind the service account to the “osAdminLogin” or “osLogin” role.
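A hedged sketch of that alternative is below. The variable names and resource labels are placeholders; the role shown is the sudo-capable one, and “roles/compute.osLogin” can be swapped in if sudo is not needed.

```hcl
# Assumption: hypothetical variables for an existing service account.
variable "project_id" {
  type = string
}

variable "account_id" {
  type = string # the part of the service account email before the "@"
}

# Look up the existing service account.
data "google_service_account" "terraform" {
  project    = var.project_id
  account_id = var.account_id
}

# Bind it to the OS Login admin role at the project level.
resource "google_project_iam_member" "os_admin_login" {
  project = var.project_id
  role    = "roles/compute.osAdminLogin"
  member  = "serviceAccount:${data.google_service_account.terraform.email}"
}
```

The same data source also exports a “unique_id” attribute in the provider documentation, which may make the gcloud lookup unnecessary. I have not checked this against the provider version used in this project.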
This is not a perfect solution. The Terraform/Ansible project now depends on the gcloud CLI for full automation. Still, it is relatively easy, and the changes are minimal thanks to Terraform's module design.