Rebuilding Jenkins from scratch

Restoring an existing Jenkins instance

If you’re restoring a Jenkins instance that’s disappeared for some reason, you can reuse the existing jenkins Terraform module definition in the digitalmarketplace-aws repo.

Run AWS_PROFILE=main-infrastructure make plan then AWS_PROFILE=main-infrastructure make apply from the main account folder.

Creating a new Jenkins instance

To create a new Jenkins instance (e.g. for testing), you’ll need to:

  • Create the AWS resources for the new Jenkins instance using Terraform.

  • Set up authorisation for the new instance

  • Provision the Jenkins app using Ansible

  • Set up a new URL for the instance

  • Copy any data (e.g. build history) from the old instance to the new instance

Warning

These steps assume that the shared AWS infrastructure for all Jenkins instances (the IAM profile/policy document, ELB certificate, EBS snapshot policy and the S3 bucket to store access logs) are already set up for the account. If you need to rebuild the shared resources, check with the team first.

Creating the AWS resources for the new instance

The infrastructure on which each Jenkins instance relies is created by the jenkins module in the main AWS account. The module can be instantiated in the main account as many times as we like (typically once).

Define a new Jenkins module in the main account’s jenkins.tf, with a unique name to allow Terraform to namespace resources:

module "jenkins2_the_jenkinsening" { ...

The module will require the following variables to be defined:

  • source: The relative path to the Jenkins module.

  • name: A unique name for the AWS resources, to allow namespacing. A good value for this would be the module block name.

  • dev_user_ips: A list of IP addresses to allowlist for the security groups. Automatically injected from the credentials repo using var.dev_user_ips.

  • jenkins_public_key_name: The name of the key pair that will be put on the EC2 instance to allow access. This can be reused. If you use the existing variable var.jenkins_public_key you will be able to reuse the private key defined in our credentials repo. Alternatively, generate a new key pair and use its name here.

  • jenkins_instance_profile: The name of the shared instance profile that gives Jenkins its permissions (this should exist already).

  • jenkins_wildcard_elb_cert_arn: The ARN of the shared ELB certificate defined in the main account. Terraform grabs this automatically with aws_acm_certificate.jenkins_wildcard_elb_certificate.arn.

  • ami_id: The ID of the machine image to base the instance on. If you’re upgrading the server to a new operating system, or fixing security issues, this is what you’ll want to update. New images can be found here.

  • instance_type: The type of EC2 instance to use. Currently we use t3.large.

  • dns_zone_id: The ID of our DNS zone. Terraform grabs this automatically with aws_route53_zone.marketplace_team.zone_id.

  • dns_name: The DNS address of the new instance, for example ci2.marketplace.team.

  • log_bucket_name: The name of the shared S3 bucket that Jenkins should log to (this should exist already).
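Putting the variables together, a module definition in jenkins.tf might look something like the sketch below. The module source path, instance profile and bucket names are illustrative placeholders; copy the real values from the existing module block in the repo.

```hcl
# Hypothetical example - module path and placeholder values are illustrative
module "jenkins2_the_jenkinsening" {
  source = "../../modules/jenkins" # relative path to the Jenkins module (assumed)

  name                          = "jenkins2_the_jenkinsening"
  dev_user_ips                  = var.dev_user_ips
  jenkins_public_key_name       = var.jenkins_public_key
  jenkins_instance_profile      = "<SHARED-INSTANCE-PROFILE>"
  jenkins_wildcard_elb_cert_arn = aws_acm_certificate.jenkins_wildcard_elb_certificate.arn
  ami_id                        = "<AMI-ID>"
  instance_type                 = "t3.large"
  dns_zone_id                   = aws_route53_zone.marketplace_team.zone_id
  dns_name                      = "ci2.marketplace.team"
  log_bucket_name               = "<SHARED-LOG-BUCKET>"
}
```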

From the main account Terraform project, digitalmarketplace-aws/terraform/accounts/main, run:

$ AWS_PROFILE=main-infrastructure make plan

This will plan the new module and output to stdout what it’s going to do. Check it carefully to make sure everything looks correct. If all looks good, run:

$ AWS_PROFILE=main-infrastructure make apply

This will cause Terraform to actually create the module. Cross your fingers.

If there are any errors, Terraform will let you know; fix them, then plan and apply again. Otherwise, all the new infrastructure should be up and running. You may need to wait a little while for DNS records to propagate, but you should be able to ssh into the new instance immediately using the elastic IP address created (find it through the AWS console).

You should now have created:

  • a new EC2 instance

  • a new elastic load balancer (ELB), that uses the shared certificate

  • a new DNS ‘A’ record in Route 53

  • new security groups for the EC2 and ELB instances

Granting the box access to test environments

To run some jobs, Jenkins needs access to our Preview and Staging environments and Admin interfaces, which are not publicly accessible by default.

  • Find the Public IPv4 address of the instance as displayed in the AWS EC2 Console.

  • Add it to the user_ips, dev_user_ips and admin_user_ips blocks in the credentials files: vars/preview.yaml, vars/staging.yaml and vars/production.yaml.

  • Once the changes have been merged, re-release the router app in all environments, using the Jenkins Job Release application to PaaS.
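As a sketch, the allowlist entries might look like this. The key names come from the bullet above, but the exact layout of the credentials files is assumed here; follow the existing structure in vars/preview.yaml. The IP address is a made-up example.

```yaml
# Hypothetical fragment of vars/preview.yaml - layout is illustrative
user_ips:
  - 203.0.113.10/32   # Jenkins instance public IPv4 (example address)
dev_user_ips:
  - 203.0.113.10/32
admin_user_ips:
  - 203.0.113.10/32
```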

Creating a new OAuth app in Github

  • Log into Github as the dm-ssp-jenkins user. Credentials are in the credentials repo in pass/github.com/jenkins-ci. You will need someone with 2FA for dm-ssp-jenkins handy.

  • Go to Settings / Developer Settings / OAuth Apps and click New OAuth App.

  • Give it a friendly name. Something like Jenkins 2 OAuth app.

  • Set Homepage URL as the full URL to the new instance, for example https://ci2.marketplace.team

  • Application description is optional.

  • Set Authorization callback URL to <Homepage URL>/securityRealm/finishLogin. To follow the example from above you would use https://ci2.marketplace.team/securityRealm/finishLogin

  • Click Register application and take note of the provided Client ID and Client Secret.

  • Add the client ID and client secret for the app to digitalmarketplace-credentials/jenkins-vars/jenkins.yaml. They need to be stored as a nested dict under jenkins_github_auth_by_hostname, with the host name for the new instance as the key, alongside the existing application. For example:

    jenkins_github_auth_by_hostname:
        ci2.marketplace.team:
            client_id: <CLIENT-ID>
            client_secret: <CLIENT-SECRET>
        ci.marketplace.team:
            .....
    

Provision the box with Ansible

In the digitalmarketplace-jenkins repo, update /playbooks/hosts with:

  • The URL of the new instance

  • The data volume ID. Usually this will be jenkins_data_device=/dev/disk/by-id/nvme-Amazon_Elastic_Block_Store_<VOLUME_ID_WITHOUT_HYPHENS>
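The device path is just the volume ID with its hyphen stripped and a fixed prefix added; a minimal sketch of deriving it (the volume ID below is a made-up example; find the real one in the EC2 console):

```shell
# Hypothetical volume ID - substitute the real one from the AWS console
VOLUME_ID="vol-0123456789abcdef0"

# AWS exposes EBS volumes on Nitro instances under this by-id path,
# using the volume ID without the hyphen
echo "jenkins_data_device=/dev/disk/by-id/nvme-Amazon_Elastic_Block_Store_${VOLUME_ID//-/}"
```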

If you used a new ssh key pair when defining jenkins_public_key_name, you will need to update deploy-jenkins.sh to copy your new private key to the $PRIVATE_KEY_FILE variable. If you used the existing key, do nothing.

Run the full Ansible playbook from the root of the Jenkins repo as follows:

$ make install

This will start the Ansible playbook for Jenkins and apply all configuration to the instance. Answer “yes” if prompted about authenticity by ssh. The process may take around 20 minutes, so get some biscuits.

When running make install, all jobs will be disabled by default. This is so jobs don’t run unexpectedly once Jenkins is up and running.

Once Ansible is finished, you should be able to access the new Jenkins at the URL you defined earlier.

Checking everything is working and secure

When you login to the new Jenkins server for the first time, there are some things you’ll want to check:

  • Are all the jobs disabled? When you’re first setting up a new Jenkins instance you probably don’t want it to start building things right away. The Ansible playbook should have disabled all jobs, but it’s worth checking before you get some nasty surprises!

  • Is the server accessible outside of the GDS building? It shouldn’t be! See if you can access it when your device is connected to the GovWifi network or tethered to your phone.

  • You might need to dismiss the “access control for builds” warning (see this screenshot for an example). This warning isn’t relevant for us, as we don’t use access control for Jenkins users.

Copying build history from an old instance to a new instance

It’s important that we maintain the history of some of our jobs, if possible, for audit purposes. We can get the builds of these jobs onto a new box by copying certain files over. We don’t have an automated way of doing this at the moment; the method below is how it has been done before, and there may well be better ways.

  • ssh on to the old instance (let’s assume it’s using ci as the subdomain):

    $ ssh -i <path to private key> ubuntu@ci.marketplace.team
    
  • Switch to the jenkins user:

    $ sudo su - jenkins
    
  • Compress the contents of the jobs directory:

    $ tar -zcvf /tmp/jobs.tgz --exclude 'workspace**' /data/jenkins/jobs/
    
  • Copy the file to the new instance (let’s assume it’s using ci2 as the subdomain). From your local machine, run:

    $ scp ubuntu@ci.marketplace.team:/tmp/jobs.tgz /tmp
    $ scp /tmp/jobs.tgz ubuntu@ci2.marketplace.team:/tmp
    
  • Unpack the file on the new machine:

    $ ssh ubuntu@ci2.marketplace.team
    $ sudo su - jenkins
    $ tar -xvf /tmp/jobs.tgz -C /
    
  • Restart the new instance by going to: https://ci2.marketplace.team/safeRestart

  • Once done, the history of the old jobs should appear on the new instance. Any pipelines will need to be restarted, as copying the files doesn’t seem to transfer their current state.

  • Delete the compressed files created on the local and remote machines.

Changing the URL for the new instance

  • You may want to change the URL of the new instance. We normally use ci.marketplace.team.

  • If so, update the dns_name variable in the new module definition in the jenkins.tf file of the main account. If another module is already using this name, you may need to change or remove it first.

  • Run make plan then make apply.

  • It may take a few minutes for DNS settings to propagate.
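For example, swapping the DNS name in the new module block might look like this (module name and values are illustrative, reusing the example from earlier; the old module currently holding ci.marketplace.team would need its dns_name changed or the module removed first):

```hcl
module "jenkins2_the_jenkinsening" {
  # ... other variables unchanged ...
  dns_name = "ci.marketplace.team" # was "ci2.marketplace.team"
}
```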