Upgrading backend services

As time passes and new versions of our backend services are released, we may find it necessary to upgrade the version of a given service that we use (e.g. for new functionality or to patch security issues).

Nginx

Our Nginx service (the router app) is run in a Docker container and deployed as a PaaS app, in the same way as our Flask apps (see deploying a new dockerised app).

The version of Nginx itself is set in the router app’s Dockerfile.

PostgreSQL

Our PostgreSQL database is a managed PaaS service.

Creating an upgraded (empty) service instance is straightforward enough, using the PaaS command cf create-service postgres <PLAN> <SERVICE_NAME>.

However, any upgrade must be managed carefully with a migration plan (TBC!), potentially with scheduled downtime to minimise any transactional data loss during the migration.

Elasticsearch

Note

We switched to OpenSearch in March 2022. Update this section if you work out how to do the same for OpenSearch.

Overview

Deploying a new Elasticsearch backend requires spinning up a separate ES backend and compatible search-api alongside the existing infrastructure. The existing indices can be re-indexed into the new backend and then, once ready, the new search-api can be deployed through the standard pipeline and will automatically connect to the new ES backend. The old ES backend can then be retired after the transition.

This process is relatively simple because our indices are fairly small and change infrequently. If we held multiple indices that changed often and had to remain up to date, a more complex setup would be required to replicate changes to both ES backends during the transition. As it was, we indexed into the new ES backend and then kicked off a re-index immediately after deploying the new search-api (connected to the new ES instance) to catch any service changes that may have happened between the initial index and the release.

Detailed steps

These steps are for the Preview environment - repeat for staging and production as appropriate.

  1. Make sure the current release of the Search API is up to date with master (and pause further releases on the app until the migration is complete).

  2. Create a new Elasticsearch service in the preview PaaS space, with a unique name:

     * cf create-service elasticsearch small-ha-5.x search_api_elasticsearch_new

  3. In digitalmarketplace-aws, generate a manifest file for the new search API instance:

     * APPLICATION_NAME=search-api STAGE=preview make generate-manifest

  4. Edit the manifest manually, setting the following values:

     * name: elasticsearch-migration
     * route: dm-elasticsearch-migration-preview.cloudapps.digital (a temporary route)
     * services: search_api_elasticsearch_new
     * Add a new env variable, DM_ELASTICSEARCH_SERVICE_NAME: search_api_elasticsearch_new

  5. Use the manifest to create a new PaaS app, based on the latest released image:

     * cf push -f new-search-api-manifest.yml -o digitalmarketplace/search-api:release-1234

  6. You now have two versions of the search-api running: the live version, and your new version with an upgraded ES backend.

  7. Re-create all required indices and settings in the new ES backend. Don’t forget to set up any required aliases for the indices (e.g. g-cloud-9 for the g-cloud-9-yyyy-mm-dd index you’ve created). In digitalmarketplace-scripts:

     1. Create G10 index: scripts/index-to-search-service.py services preview --frameworks=g-cloud-10 --index=g-cloud-10-2018-10-18 --search-api-url=https://dm-elasticsearch-migration-preview.cloudapps.digital --create-with-mapping=services-g-cloud-10

     2. Create G10 alias: scripts/update-index-alias.py --stage=preview g-cloud-10 g-cloud-10-2018-10-18 https://dm-elasticsearch-migration-preview.cloudapps.digital

     3. Create DOS index: scripts/index-to-search-service.py briefs preview --frameworks=digital-outcomes-and-specialists-3 --index=briefs-digital-outcomes-and-specialists-3-2018-10-18 --search-api-url=https://dm-elasticsearch-migration-preview.cloudapps.digital --create-with-mapping=briefs-digital-outcomes-and-specialists-2

     4. Create DOS alias: scripts/update-index-alias.py --stage=preview briefs-digital-outcomes-and-specialists briefs-digital-outcomes-and-specialists-3-2018-10-18 https://dm-elasticsearch-migration-preview.cloudapps.digital

  8. Check that all documents have been indexed correctly (with the right mappings) for both DOS and G-Cloud, by checking the new Search API app’s /_status endpoint.

  9. Map the new Search API app instances to the ‘live’ route:

     * cf map-route elasticsearch-migration cloudapps.digital --hostname dm-search-api-preview

  10. Both apps should now be serving requests from their respective ES services (which should be identical! You just indexed the new one, right?). Check the logs for each app to make sure they are both receiving traffic.

  11. Remove the old Search API app instances from the route:

      * cf unmap-route search-api cloudapps.digital --hostname dm-search-api-preview

  12. Remove the new Search API app instances from the temporary route:

      * cf unmap-route elasticsearch-migration cloudapps.digital --hostname dm-elasticsearch-migration-preview

  13. Rename the old ES service to search_api_elasticsearch_old:

      * cf rename-service search_api_elasticsearch search_api_elasticsearch_old

  14. Rename the new ES service to search_api_elasticsearch:

      * cf rename-service search_api_elasticsearch_new search_api_elasticsearch

  15. Rename the old Search API app:

      * cf rename search-api search-api-old

  16. Rename the new Search API app:

      * cf rename elasticsearch-migration search-api

  17. Remove the old Search API app:

      * cf delete search-api-old

  18. Re-release the Search API via the pipeline to pick up the new service/name changes.

  19. Reindex everything using Jenkins, to make sure anything added to the database during the migration period is indexed. For preview/staging, waiting for the overnight job to run is fine.

  20. Remove the old ES service:

      * cf delete-service search_api_elasticsearch_old
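Because the steps above are written for Preview and have to be repeated for other environments, it may help to generate the stage-specific commands rather than edit them by hand. The following is a minimal sketch only: the helper names cutover_commands and index_commands are hypothetical, and it assumes that hostnames in other stages follow the same dm-search-api-<stage> and dm-elasticsearch-migration-<stage> pattern as Preview, which should be checked before use. It builds the commands as lists and prints them; it does not run anything.

```python
from datetime import date
from typing import List, Optional

DOMAIN = "cloudapps.digital"


def index_commands(doc_type: str, framework: str, index_prefix: str, alias: str,
                   mapping: str, stage: str = "preview",
                   on: Optional[date] = None) -> List[List[str]]:
    """Build the step 7 index and alias commands for one date-stamped index."""
    on = on or date.today()
    index = f"{index_prefix}-{on.isoformat()}"  # e.g. g-cloud-10-2018-10-18
    # Assumption: the temporary route follows the Preview naming pattern.
    url = f"https://dm-elasticsearch-migration-{stage}.{DOMAIN}"
    return [
        ["scripts/index-to-search-service.py", doc_type, stage,
         f"--frameworks={framework}", f"--index={index}",
         f"--search-api-url={url}", f"--create-with-mapping={mapping}"],
        ["scripts/update-index-alias.py", f"--stage={stage}", alias, index, url],
    ]


def cutover_commands(stage: str = "preview") -> List[List[str]]:
    """Build the cf commands for steps 9 and 11 to 17, in order."""
    # Assumption: hostnames in other stages mirror the Preview ones.
    live_host = f"dm-search-api-{stage}"
    temp_host = f"dm-elasticsearch-migration-{stage}"
    new_app, old_app = "elasticsearch-migration", "search-api"
    service = "search_api_elasticsearch"
    return [
        ["cf", "map-route", new_app, DOMAIN, "--hostname", live_host],    # step 9
        ["cf", "unmap-route", old_app, DOMAIN, "--hostname", live_host],  # step 11
        ["cf", "unmap-route", new_app, DOMAIN, "--hostname", temp_host],  # step 12
        ["cf", "rename-service", service, f"{service}_old"],              # step 13
        ["cf", "rename-service", f"{service}_new", service],              # step 14
        ["cf", "rename", old_app, f"{old_app}-old"],                      # step 15
        ["cf", "rename", new_app, old_app],                               # step 16
        ["cf", "delete", f"{old_app}-old"],                               # step 17
    ]


if __name__ == "__main__":
    # Print the commands for review instead of executing them.
    for command in cutover_commands("preview"):
        print(" ".join(command))
```

Step 10 (checking the logs of both apps) falls between the first two cutover commands, so printing the list and running each command by hand is probably safer than executing the whole sequence in one go.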

Redis

We use Redis for user session management, so upgrading Redis will log out all users. Choose a time when this will not be too disruptive. The disruption is short enough that maintenance mode is unlikely to be helpful.

  1. Create a new uplevel Redis service of the same size: cf create-service redis medium-ha-6.x digitalmarketplace_redis-new.

  2. Rename the downlevel Redis service: cf rename-service digitalmarketplace_redis digitalmarketplace_redis-old.

  3. Rename the uplevel Redis service to have the canonical name: cf rename-service digitalmarketplace_redis-new digitalmarketplace_redis.

  4. Redeploy all apps in the environment using the Jenkins job. This step will log out all users and cause brief (<1 minute) downtime between the first and the last frontend app finishing its redeploy.

  5. Once the smoke and smoulder tests are happy, remove the downlevel service: cf delete-service digitalmarketplace_redis-old.
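The create-and-rename swap in the Redis steps can be sketched as a small command builder, which makes the ordering explicit and lets the plan be changed for a later upgrade. This is an illustrative sketch only: the function names are hypothetical, the defaults are the plan and service name quoted in the steps, and nothing is executed.

```python
from typing import List


def redis_swap_commands(plan: str = "medium-ha-6.x",
                        service: str = "digitalmarketplace_redis") -> List[List[str]]:
    """Build the cf commands for steps 1 to 3 (create, then swap names)."""
    new, old = f"{service}-new", f"{service}-old"
    return [
        ["cf", "create-service", "redis", plan, new],  # step 1: new service
        ["cf", "rename-service", service, old],        # step 2: park the old one
        ["cf", "rename-service", new, service],        # step 3: take canonical name
    ]


def redis_cleanup_command(service: str = "digitalmarketplace_redis") -> List[str]:
    """Build the step 5 command; only run it once the tests are passing."""
    return ["cf", "delete-service", f"{service}-old"]


if __name__ == "__main__":
    # Print the commands for review instead of executing them.
    for command in redis_swap_commands():
        print(" ".join(command))
    print(" ".join(redis_cleanup_command()))
```

Step 4 (the Jenkins redeploy) sits between the swap and the cleanup, which is why the cleanup command is kept in a separate function.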