Deploying Domain Mappings in GCP with GitLab CI

12 November 2022, Haarsteeg

TL;DR

Many projects I’ve worked on of late have been deployed to Google’s Cloud Run service, a Container as a Service facility in Google Cloud Platform (GCP). Creating domain mappings for the services always remained a manual task after deployment. This post details how I’ve now started to deploy the domain mappings for those serverless instances in an automated fashion.

The Set Up

Most of these projects have a very similar setup, even though the applications’ technologies differ quite a bit. The application is built and packaged in a Docker image, which is published to the container registry for the GCP project. The deployment of that image is then issued towards Cloud Run, where it’s exposed to the outside world. Cloud Run supports the creation of domain mappings, which configure the domain name (of a domain you are the verified owner of, of course) you would like to route to the instance that is deployed. For instance, requests to my-site.my-domain.com should be directed to the instance deployed in Cloud Run with identifier my-instance-production, whereas those for test.my-domain.com should go to the my-instance-test instance.

Some projects are hosted in Bitbucket, others in GitLab instances; the pipelines to build them use the native pipelines features that those platforms offer and differ mainly in configuration syntax for these projects. For the purposes of the examples below, the GitLab flavour is used.

The Problem

Setting up the domain mappings in Google’s Cloud Console is a relatively straightforward process. You navigate to the project, then to the Cloud Run section to view the deployed instances and click a button to manage the domain mappings. When creating a new mapping, you select the instance you want to direct the traffic to, choose from the verified domains and configure the (sub)domain that should point to it. Then you just wait until the certificates and routing have been provisioned, after which you can start using the mapping. This action is performed once; subsequent deployments of the application’s Cloud Run instance don’t invalidate the mapping. Traffic for the domain mapping will be routed to the new revision of the instance automatically.

So far, so good. Once you’ve discovered the pattern for a couple of projects, it’s easy to set up. But it still requires the manual action to set up the mapping. Surely this can be automated?

The Google Cloud SDK, which is used in these projects to interact with the GCP APIs, offers a way to create these domain mappings automatically.

$ gcloud run domain-mappings create \
    --service=my-instance-production \
    --domain=my-site.my-domain.com

There are various other options that can be supplied, but that’s the gist of it. This action can be added to our pipeline steps and that’s that. At least, that’s what you might think. This will definitely work the first time you run the pipeline. Subsequent runs will fail, because the pipeline will attempt to create the domain mapping again, which will be rejected by GCP, and rightfully so. The new problem is that although there are commands to create, to delete and to list the domain mappings, there is no command to create-if-none-exists-yet. What’s missing is something similar to the apply command that kubectl offers.

The Solution

Why this command doesn’t exist in the API is a whole different discussion. Fortunately, we can use our pipeline to configure a conditional action, of sorts.

As it turns out, there is a command that will describe a domain mapping. If the mapping exists, a set of metadata about it is returned. If the mapping doesn’t exist, the command will fail. Assuming we have an environment variable DOMAIN for the target mapping, we can issue the following command.

$ gcloud beta run domain-mappings describe \
    --domain=${DOMAIN} \
    --format='value(spec.routeName)'

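With the --format option shown, only the value of spec.routeName (the name of the mapped instance) is printed. Without that option, the command returns the full description of the mapping, for example (shortened for brevity):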

apiVersion: domains.cloudrun.com/v1
kind: DomainMapping
metadata:
  annotations:
    serving.knative.dev/creator: <creator-email-address>
  creationTimestamp: '2022-11-13T12:02:22.975505Z'
  generation: 1
  labels:
    cloud.googleapis.com/location: europe-west4
    run.googleapis.com/overrideAt: '2022-11-13T12:02:29.649Z'
  name: my-site.my-domain.com
  uid: 6787bcfa-3156-4fb0-a5c4-00aa7da7139c
spec:
  certificateMode: AUTOMATIC
  routeName: my-instance-production
status:
  conditions:
  ...
  mappedRouteName: my-instance-production
  observedGeneration: 1
  resourceRecords:
  - name: my-site
    rrdata: ghs.googlehosted.com.
    type: CNAME

The details are actually not that important, but the domain name and the instance are clearly recognisable. The final section is used to instruct you to create the appropriate CNAME records in the domain’s DNS so they point to Google’s gateway.
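Whether that CNAME record is actually in place can be checked from a shell as well. The function below is only a sketch: cname_points_to is a hypothetical helper name of my own, and it assumes the dig utility is installed on the machine running it. The target value is the rrdata from the output above.

```shell
# Hypothetical helper: succeeds when <domain> has a CNAME record with the
# expected target (the rrdata value from the describe output).
# Usage: cname_points_to <domain> <expected-target>
cname_points_to() {
  # `dig +short CNAME` prints only the record data, or nothing if there is none.
  actual=$(dig +short CNAME "$1")
  [ "$actual" = "$2" ]
}
```

It could be used as cname_points_to my-site.my-domain.com ghs.googlehosted.com. followed by && echo "DNS is in place". Note the trailing dot in the target, which dig includes in its output.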

In a first attempt to automate this, you might turn to the option to add a bit of multiline scripting to a pipeline’s script. First, issue the describe command. Then, check the status of the previous command using the standard shell variable, $?, which holds 0 for a successful command and a non-zero value for a failed one. You would be on the right track, but checking $? will not work in case the domain mapping is not yet in place. This is due to the fact that a failed command to describe the domain mapping immediately fails the pipeline too, so the check is never reached.
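This behaviour can be reproduced locally without touching GCP. The snippet below is a small sketch, assuming the pipeline runs its script the way a shell does with the -e (errexit) option enabled. The false command is merely a stand-in for a describe call on a mapping that doesn’t exist.

```shell
# Run the naive script in a separate shell with errexit (-e) enabled.
# `false` stands in for the failing describe command.
naive_status=0
naive_output=$(sh -ec 'false; echo "exit status was $?"') || naive_status=$?

# The echo inside the script never ran: the shell stopped at the failing command.
echo "captured output: '${naive_output}', exit status: ${naive_status}"
```

The captured output is empty and the exit status is non-zero, which is exactly what the pipeline reports as a failed job.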

Another Bash feature comes to the rescue, though. Instead of allowing the describe command to fail on its own, a one-line "or" statement (using the || operator) can make sure the statement as a whole always succeeds, while we still capture the result of trying to describe the domain mapping. The general structure to do that looks like the statement below.

$ some_command_that_may_fail || some_command_that_always_succeeds

By adding a variable assignment to the or part of the statement (the part that will only run if the first part - our describe - is unsuccessful), a subsequent if block can act according to the value of that variable (or its non-existence).
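To see the pattern in isolation, here is a minimal sketch that runs without GCP. The functions mapping_exists and mapping_missing are hypothetical stand-ins for a describe call that succeeds and one that fails; check is just a wrapper for the demonstration.

```shell
# Stand-ins for a describe call on an existing and on a missing mapping.
mapping_exists() { return 0; }
mapping_missing() { return 1; }

check() {
  # Reset the flag, then set it only when the command passed in "$@" fails.
  NO_MAPPING=
  "$@" || NO_MAPPING=true
  if [ "$NO_MAPPING" = true ]; then
    echo "create"
  else
    echo "skip"
  fi
}

check mapping_exists    # prints "skip"
check mapping_missing   # prints "create"
```

Because the failing command sits on the left-hand side of ||, the statement as a whole succeeds and the script carries on to the if block.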

Our job in the pipeline could look like the section below. Note the use of the > character to open a multiline block in the YAML and the assignment of the NO_MAPPING variable in the or statement. The if block either creates the domain mapping the first time the pipeline runs, or simply logs that it is already in place and doesn’t perform the create action again. The variables used, such as $MY_INSTANCE, $IMAGE_NAME and $DOMAIN, are set up elsewhere in the pipeline, making them available for substitution here.

.gitlab-ci.yml

deploy:
  stage: deploy
  image: google/cloud-sdk
  script:
    # ... set up authentication and project details for GCP
    - gcloud run deploy $MY_INSTANCE --image $IMAGE_NAME --allow-unauthenticated --port=80
    - >
      gcloud beta run domain-mappings describe --domain=${DOMAIN} --format='value(spec.routeName)' || NO_MAPPING=true;
      if [ "$NO_MAPPING" = true ]; then
        echo "Creating new domain mapping from ${DOMAIN} to instance ${MY_INSTANCE}"
        gcloud beta run domain-mappings create --service ${MY_INSTANCE} --domain ${DOMAIN}
      else
        echo "The domain mapping from ${DOMAIN} to instance ${MY_INSTANCE} is already active."
      fi
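A possible variation, which I offer as a sketch rather than as what runs in these pipelines, is to use the describe command directly as the condition of the if statement. A command used as a condition doesn’t abort the script when it fails, so the intermediate variable is no longer needed. The function name ensure_domain_mapping is hypothetical; the gcloud calls are the same as in the job above.

```shell
# Hypothetical helper: create the mapping only when describe fails.
# Usage: ensure_domain_mapping <service> <domain>
ensure_domain_mapping() {
  service=$1
  domain=$2
  # The describe output is discarded; only its exit status matters here.
  if gcloud beta run domain-mappings describe --domain="$domain" \
       --format='value(spec.routeName)' >/dev/null 2>&1; then
    echo "The domain mapping from ${domain} to instance ${service} is already active."
  else
    echo "Creating new domain mapping from ${domain} to instance ${service}"
    gcloud beta run domain-mappings create --service "$service" --domain "$domain"
  fi
}
```

In the pipeline this would be called as ensure_domain_mapping "$MY_INSTANCE" "$DOMAIN". As with the original job, any failure of describe (not only a missing mapping) leads to a create attempt, and discarding the error output makes such cases harder to diagnose, so you may prefer to keep it visible.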

As is so often the case, once you see the solution, it’s actually pretty simple. It is the combination of a couple of features that eliminates yet another manual step in the rollout of services.