Taming Google Container Builder

Google Container Builder is Google Cloud Platform’s way of building container images for your projects.

It’s basically a pipeline of “steps” that you can configure to build a chain that ultimately creates a container image. However, as of this writing (Dec 2017), it’s decidedly very… primitive.

I had already dabbled with Container Builder about a year ago, and could not make it do what I wanted. But recently I had to make a decision to leave my current CI/CD pipeline service, so I took a fresh look, and this time around, voila. I was able to do everything I wanted.

This post is basically my notes on the most useful and important points (read: where I stumbled) from my porting work.

Please note that I expect you to have a basic understanding of how to work with Google Cloud Platform, and to have at least glanced through Google’s official documentation for the basics of how to use Container Builder.

High Level Overview

Compared to other Continuous Integration/Continuous Delivery services such as Circle CI, Travis, and the like, the conceptual model that Container Builder provides is very minimal.

(And BTW, I use all those services in my other projects)

Like other services, Container Builder expects a YAML configuration file, cloudbuild.yaml, to describe the steps to build your software. However, the documentation is a little lacking. I believe this is the only page where the cloudbuild.yaml is described in its entirety:

The simplest way to describe the contents of cloudbuild.yaml is this: “You are specifying a list of arbitrary container images to be executed”. I believe this is the conceptual equivalent of what you are writing to cloudbuild.yaml:

docker run --rm $image-name $arg1 $arg2 $arg3 ...
docker run --rm $image-name $arg1 $arg2 $arg3 ...
docker run --rm $image-name $arg1 $arg2 $arg3 ...

The cloudbuild.yaml is just a way to write this in an alternate form.

Notice that there are no “before_script”, “artifacts”, “cache” or whatever in cloudbuild.yaml. It’s very… barebones.

The Secret Sauce: “entrypoint”

Before describing my pipeline, I’m going to start with what turned out to be the crucial revelation that finally allowed me to do what I wanted in Google Container Builder.

It was the entrypoint parameter in the step configuration.

The basic premise of the Container Builder is to break the entire process into containerized steps, such as “git”, “go”, “bazel”, etc. The stumbling block here is that when you read the docs, it seems as if you can only supply arguments to the principal command that the container was built for. For example, for the docker step, you see something like the following:

- name: gcr.io/cloud-builders/docker
  args:
  - build
  - -t
  - my.registry.com/project/image-name:latest
  - .

So what happens when you run this step and get an error? Is it because I have my files in the wrong place? Am I lacking some environment variables? This is frustrating because it looks as though the only thing you can do there is to provide more arguments to the docker command, and if you have done it before, you know that the interesting bits of docker failure logs are all with dockerd, not the docker client command. It almost looks as if there’s no way to diagnose the situation.

But do not despair.

This is where the entrypoint option comes in (I could swear this option did not exist previously, but the important thing is that I know of it now). This parameter in the cloudbuild.yaml step entry allows you to change the ENTRYPOINT command to an arbitrary executable in the container.

Therefore, you can run diagnostics, helper commands, or anything else you want for that matter, if you put a shell in the entrypoint. The sample below prints out the environment variables before executing go build:

- name: gcr.io/cloud-builders/go
  entrypoint: 'ash' # because this is alpine
  args:
  - -c
  - |
    # following commands are executed by the real go step, so
    # we replicate it here
    . /builder/prepare_workspace.inc
    prepare_workspace || exit
    # Hey, let's print out the environment variables to
    # see what's going on
    env | sort
    # Then do what you normally do
    go build ....

So with entrypoint, your steps are basically the equivalent of the following:

docker run --rm --entrypoint ash \
  gcr.io/cloud-builders/go -c "$script"

The entrypoint parameter, combined with a shell, gives you great, great flexibility to make Container Builder do what you want, and quite frankly, I don’t think I could have done it without it.

I have one quick note about the . /builder/prepare_workspace.inc and prepare_workspace || exit bit above. One of the great things about Container Builder is that the basic steps are available for you on Github:

So if you ever need to peek at, tweak, and/or customize their steps, you can always refer to their Dockerfiles and other helper scripts. This is how I learnt that the gcr.io/cloud-builders/go container was running those two commands before executing the go command.

Use The Local Builder

Seeing your jobs being processed through Google’s system may give you some thrills at the beginning, but you will soon realize the pain of waiting for remote jobs to execute, especially when you are initially developing your pipeline.

You will also most likely want to run many jobs in succession as you tweak your parameters, so those extra seconds and minutes that the remote jobs require start adding up.

The answer is to run the jobs locally. Luckily for us, Google provides us with a simplified version of the Container Builder pipeline that you can run on your machine.

https://github.com/GoogleCloudPlatform/container-builder-local

You can install this via the gcloud components command.

$ gcloud components install container-builder-local

As far as I could tell, this command will give you exactly what you would see in the production environment, up to the service account that will normally execute your jobs, which allows you to debug even those nasty permission problems.

Throughout this entry, I will use the real gcloud command for examples, but always remember that you can substitute container-builder-local for gcloud container builds submit calls.
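As a rough sketch of what a local run might look like: the helper below wraps the local builder with the flags I understand it to accept (--config and --dryrun). The function name run_local_build is made up for illustration, and the flags may differ in your version, so check the tool's --help output first.

```shell
# Hypothetical helper: run the pipeline locally instead of submitting it.
# Assumes container-builder-local is on PATH and Docker is running.
run_local_build() {
  build_config=${1:-cloudbuild.yaml}   # path to the build config
  source_dir=${2:-.}                   # directory to use as the build source
  # Assumption: dry-run is the default mode, so it is switched off
  # explicitly to really execute the steps.
  container-builder-local \
    --config="$build_config" \
    --dryrun=false \
    "$source_dir"
}
```

With that in place, run_local_build cloudbuild.yaml . plays the role of the gcloud container builds submit call.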

The Pipeline

Finally, getting back to my Container Builder pipeline: The basic premise is as follows:

  1. I have a monorepo with a handful of sub-projects/repos that are written in different languages
  2. I host my code on Gitlab, and in a private repository.
  3. I would like to deploy everything to Google Kubernetes Engine.

There are a few pain points that I’m sure everybody will stumble upon in this scenario. Let’s see what I did.

Cloning Of The Private Repository

First, we need to clone the repository. But the repository is hosted outside of the Google environment, and is not supported by the likes of GCP’s Cloud Source Repositories (which, if I were hosting on Github, would give me an automatic mirroring option to make authentication a non-problem).

To start configuring, first we need a deploy key. Head over to your service, in this case Gitlab, and they should tell you how to register an SSH key to be used as a deploy key. (Newbie reminder: Create a new key only for this purpose. You register the public key with the code repository service, and keep your private key.)

Just for the heck of it, I created an ed25519 key. Remember to provide an empty passphrase.

ssh-keygen -t ed25519
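If you would rather not answer the interactive prompts, something like the following should produce the same kind of key in one go. This is only a sketch: the function name, the default file name gitlab-deploy-key, and the comment string are placeholders I picked for illustration.

```shell
# Hypothetical helper: create a passphrase-less ed25519 deploy key.
make_deploy_key() {
  key_file=${1:-./gitlab-deploy-key}   # private key path; the public key gets a .pub suffix
  # -N "" sets an empty passphrase, -C adds a comment to help identify the key later.
  ssh-keygen -t ed25519 -N "" -C "container-builder-deploy-key" -f "$key_file"
}
```

After running make_deploy_key ./gitlab-deploy-key, the .pub file is the one to register with Gitlab, and the other file is the private key that gets encrypted in the next section.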

Encrypt For Your Eyes Only

Now, in order for Container Builder to clone the repository, we need to somehow give it your private key, so that cloning over SSH works properly.

It would be rather insecure to just hand your private key to the service by sending the raw file or adding it in the source code repository, so we should be using a Key Management Service (KMS) to at least encrypt it.

Turns out Google also has a KMS. I mean, Google has everything.

The instructions on how to do this are pretty straightforward, so it should be easy to follow the guide below. Note that they are already using the entrypoint hack that I described earlier in this article:

For those who are not already familiar with how KMS works, it’s comprised of keys used to encrypt/decrypt arbitrary data. The keys belong to keyrings, which are logical groupings of such keys.

What the above document is describing is how to take an SSH private key, encrypt it using a KMS key, and send it to the Container Builder environment for further processing. Because the private key is now encrypted, it is, um, safe to be sent to Container Builder.

In my case I had to encrypt my Gitlab key (obviously, I created my keyring and key beforehand):

$ gcloud kms encrypt \
    --plaintext-file=$path-to-private-key \
    --ciphertext-file=gitlab-deploy-key.enc \
    --location=global \
    --keyring=my-keyring \
    --key=my-key
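The keyring and key have to exist before the encrypt command above will work. A minimal sketch of that one-time setup, assuming the same my-keyring and my-key names and the global location, could look like this (the function name is hypothetical):

```shell
# Hypothetical helper: create the keyring and key used by the encrypt command.
# Both commands fail if the resource already exists, so this is a one-time setup.
create_kms_key() {
  keyring_name=${1:-my-keyring}
  key_name=${2:-my-key}
  gcloud kms keyrings create "$keyring_name" --location=global &&
  gcloud kms keys create "$key_name" \
    --location=global \
    --keyring="$keyring_name" \
    --purpose=encryption
}
```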

Using The Encrypted Files

I suppose that once encrypted, one could add this private key in the repository, but I personally opted to store the master version in Cloud Storage. When the build process starts, the necessary files are downloaded via gsutil, and processed.
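For the Cloud Storage part, plain gsutil cp calls are enough. The sketch below assumes a bucket called my-build-secrets, which is a placeholder and not the name I actually use; the helper names are made up as well.

```shell
# Hypothetical helpers; replace the default bucket name with your own.

# Run once from your machine after encrypting the key.
upload_encrypted_key() {
  enc_file=$1
  bucket=${2:-my-build-secrets}
  gsutil cp "$enc_file" "gs://$bucket/"
}

# Run at the start of the build to fetch the file into the current directory.
download_encrypted_key() {
  enc_file=$1
  bucket=${2:-my-build-secrets}
  gsutil cp "gs://$bucket/$enc_file" .
}
```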

Now that you have the encrypted private key (and other files) accessible from Container Builder, you can decrypt it and unpack it under /root/.ssh/.

The document then tells you to patch the SSH configuration so that when git attempts to access the source repository, it will use that private key to authenticate against the server.

One thing to note is that you need to give your Container Builder service account (which is automatically created when you enable the service) the proper permissions to decrypt your data. You should be able to do this by going to the IAM console, and giving the service account that looks like $your-numeric-project-id@cloudbuild.gserviceaccount.com the authorization to perform KMS decryption.
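If you prefer the command line over the console, the same grant can probably be expressed with gcloud. This is a sketch under the assumption that the decrypter role on the single key is all the build needs; the helper names are hypothetical.

```shell
# Build the IAM member string for the Container Builder service account.
cloudbuild_member() {
  printf 'serviceAccount:%s@cloudbuild.gserviceaccount.com\n' "$1"
}

# Hypothetical helper: allow that service account to decrypt with one KMS key.
grant_decrypt() {
  project_number=$1
  keyring_name=$2
  key_name=$3
  gcloud kms keys add-iam-policy-binding "$key_name" \
    --location=global \
    --keyring="$keyring_name" \
    --member="$(cloudbuild_member "$project_number")" \
    --role=roles/cloudkms.cryptoKeyDecrypter
}
```

The numeric project ID can be looked up with gcloud projects describe "$PROJECT_ID" --format='value(projectNumber)', after which grant_decrypt 123456789 my-keyring my-key does the rest (the number here is obviously made up).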

Having done that, you can decrypt your key like this. Don’t forget to keep the results in a “volume” so that they persist between steps.

- name: 'gcr.io/cloud-builders/gcloud'
  args:
  - kms
  - decrypt
  - --ciphertext-file=your-encrypted-key.enc
  - --plaintext-file=/root/.ssh/id_ed25519
  - --location=global
  - --keyring=your-keyring-name
  - --key=your-key-name
  volumes:
  - name: 'ssh'
    path: /root/.ssh

Also, do not forget to properly configure git to use this key. In my case, I’m going to configure git clone against gitlab.com to use this key. In the following example, the container gcr.io/cloud-builders/git doesn’t really mean anything: all you need is a container with bash and the required commands. Again, do not forget to put the results in a persistent volume.

- name: 'gcr.io/cloud-builders/git'
  entrypoint: bash
  args:
  - '-c'
  - |
    chmod 600 /root/.ssh/id_ed25519
    # Use a Host block so the key only applies to gitlab.com
    cat <<EOF >/root/.ssh/config
    Host gitlab.com
      IdentityFile /root/.ssh/id_ed25519
    EOF
    mv known_hosts /root/.ssh/known_hosts
  volumes:
  - name: 'ssh'
    path: /root/.ssh
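The known_hosts file moved in the step above has to come from somewhere. One way to produce it, sketched here with a made-up helper name, is to run ssh-keyscan on your own machine and ship the result alongside the build:

```shell
# Hypothetical helper: collect a host's public keys into a known_hosts file.
make_known_hosts() {
  scan_host=${1:-gitlab.com}
  hosts_file=${2:-known_hosts}
  # Ask only for ed25519 and RSA host keys.
  ssh-keyscan -t ed25519,rsa "$scan_host" > "$hosts_file"
}
```

Since ssh-keyscan trusts whatever answers on the network, it is worth comparing the fingerprints printed by ssh-keygen -lf known_hosts against the ones Gitlab publishes before relying on the file.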

Clone The Source

Once you have the necessary files, you can now run git clone git@gitlab.com:... or whatever else that requires an SSH key to be present.

Cloning The Necessary Bits

Because we’re just building from a particular commit/branch, all that the pipeline needs from git are the files from that one commit. So a regular clone, which downloads all of the history, is overkill here.

I opted to work around this by explicitly telling Container Builder which branch we want to work with through the command line. In Container Builder, the way to do this is to provide Substitution Rules.

gcloud container builds submit \
    --substitutions _BRANCH=$(git rev-parse --abbrev-ref HEAD) \
    --config cloudbuild.yaml .

This sends the variable $_BRANCH to Container Builder with the current branch name. These variables will be substituted before your cloudbuild.yaml is processed, so in your git clone step, you can write something like below to get a shallow clone:

- name: 'gcr.io/cloud-builders/git'
  args:
  - clone
  - --depth=1
  - --branch=$_BRANCH
  - --single-branch
  - git@gitlab.com:my-project/my-repository.git
  volumes:
  - name: 'ssh'
    path: /root/.ssh

This will clone just the target branch, and it will not download the history.
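One caveat with the git rev-parse --abbrev-ref HEAD call used for _BRANCH: when HEAD is detached (for example after checking out a tag or a bare commit), it prints the literal string HEAD, which is not a branch you can clone. A small guard like the following, with a function name of my own choosing, can catch that before the build is submitted:

```shell
# Print the current branch name, or fail when HEAD is detached.
current_branch() {
  branch_name=$(git rev-parse --abbrev-ref HEAD) || return 1
  if [ "$branch_name" = "HEAD" ]; then
    echo "detached HEAD: pass the branch name explicitly" >&2
    return 1
  fi
  printf '%s\n' "$branch_name"
}
```

It slots into the submit call as --substitutions _BRANCH="$(current_branch)".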

Saving The Commit SHA

Because I could not use the automated triggers that build on a git push, I had to find the current commit SHA manually. Normally you probably want to embed a call like the following close to where you would use it:

COMMIT_SHA=$(git rev-parse HEAD)

But remember that we’re using containers here: the git command may not be available in other container images, so it’s safer to just save it in /workspace:

$ git rev-parse HEAD > /workspace/COMMIT_SHA

Then all you need later is the cat command to get this value.
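Put together, the save and read halves might look like the sketch below. The WORKSPACE variable and both function names are my own additions for illustration; inside Container Builder the directory is simply /workspace.

```shell
# Directory shared between steps; /workspace in Container Builder.
WORKSPACE=${WORKSPACE:-/workspace}

# Run in a step whose image has git, from inside the cloned repository.
save_commit_sha() {
  git rev-parse HEAD > "$WORKSPACE/COMMIT_SHA"
}

# Run in any later step; it only needs cat.
read_commit_sha() {
  cat "$WORKSPACE/COMMIT_SHA"
}
```

A later docker step could then tag the image with something like docker build -t "gcr.io/my-project/my-image:$(read_commit_sha)" . where the project and image names are placeholders.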

Afterthoughts

Should You Use Container Builder?

I will be the first to admit that Container Builder is not for everybody. It’s opinionated, and it’s primitive.

While it gives you enough leverage to do whatever you want, nice-to-haves such as hooks and integrations, a real Web UI, and automated wrappers for things that I had to painstakingly explain in this article are glaringly missing.

If you are comfortable with your current CI/CD pipeline, I do not think you need to consider switching, unless you have your reasons.

But on the other hand, now that running CI/CD on containers is becoming ubiquitous, the approach taken by Container Builder does not feel strange.

Local Builds FTW

However, one place where Container Builder absolutely shines is its ability to run your pipeline locally from a simple binary.

If you have ever crafted pipelines in other services you know what I am talking about: the long waits, the errors that occur because you made one typo, and then the long waits again...

Heck, on another service I once had to wait over 5 hours for one of my pipelines to kick off (and this is precisely one of the main reasons I had to look for alternatives). It was apparently caused by an infrastructure glitch, which is understandable, but that’s nonetheless 5 hours of development time gone.

Container Builder takes away much of the above frustration by allowing you to run your pipeline right in front of your eyes, right when you want to run it. This, I feel, is a big win for the developer.

…And The Remote Runs

To complement the local runs, I must say that the pipelines that run on GCP are blazingly fast too. Not just the execution speed, but you practically have no queuing.

This can only be done by services that actually own the infrastructure. YMMV, but it feels nice not to be completely stuck on that “pending” icon when all you want to know is if your build has passed.

Conclusion

I think the Container Builder is a very useful framework. It’s fast, development times are shorter because of local builds, and generally it gets the job done. After all, if you need more functionality, you can just create your own containers to do what you want.

On the other hand, if you’re not invested in the Google infrastructure, and if you absolutely need that fancy Web UI, it’s probably not for you.

Go/perl hacker; author of peco; works @ Mercari; ex-mastermind of builderscon; Proud father of three boys;