From 6b4df7b1b5d13fc56379a5c3ee4bfded6c207b18 Mon Sep 17 00:00:00 2001
From: Samuel Young
Date: Thu, 11 Dec 2025 12:18:03 +0000
Subject: [PATCH 1/3] CLDC-NONE: Split deployment instructions into their own
 page

makes them easier to find, under the infrastructure tab is a little
unintuitive
---
 docs/deployments.md    | 66 ++++++++++++++++++++++++++++++++++++++++++
 docs/infrastructure.md | 63 ----------------------------------------
 2 files changed, 66 insertions(+), 63 deletions(-)
 create mode 100644 docs/deployments.md

diff --git a/docs/deployments.md b/docs/deployments.md
new file mode 100644
index 000000000..423fcd26d
--- /dev/null
+++ b/docs/deployments.md
@@ -0,0 +1,66 @@
+---
+nav_order: 6
+---
+
+# Deployments
+
+## Production Deployment
+
+The application is set up so that it can be deployed via GitHub Actions. We use Git tags to mark releases. The only prerequisite is that your GitHub account is added to our team.
+
+To deploy you need to:
+
+1. Determine the [previous version](https://github.com/communitiesuk/submit-social-housing-lettings-and-sales-data/tags), such as `v0.1.1`.
+2. Create a [new release](https://github.com/communitiesuk/submit-social-housing-lettings-and-sales-data/releases/new) with the subsequent version (e.g. `v0.1.2`). On this page, create a new tag with that version and generate release notes. Save as a draft.
+3. Post the release notes on Slack.
+4. Ensure that no other pipelines are running on the repo. If a staging deployment is running, it must complete before you can deploy to production.
+5. Publish the release. This will trigger the deployment pipeline.
+6. Monitor alerting, logging and Sentry.
+7. Post a success message on Slack.
+8. Tag tickets as ‘Released’ and move them to done on JIRA.
+
+## Staging Deployment
+
+When a commit is made to `main`, the following GitHub Actions jobs are triggered:
+
+1. **Test**: RSpec runs our test suite
+2. 
**AWS Deploy**: If the Test stage passes, this job will deploy the app to AWS
+
+When a pull request is opened to `main`, only the Test stage runs.
+
+## Review apps
+
+When a pull request is opened, a review app will be spun up. Each review app has its own ECS Fargate cluster and Redis instances (plus any infrastructure to enable this), while the rest is shared.
+
+The review app GitHub pipeline is independent of any test pipeline, so it will attempt to deploy regardless of the state the code is in.
+
+The usual seeding process takes place when the review app boots, so there will be some minimal data that can be used to log in with. 2FA has been disabled in the review apps for easier access.
+
+The app boots in a new environment called `development`. As such, this is the environment you should filter by for Sentry errors or to change any config.
+
+After a successful deployment, a comment will be added to the pull request with the URL to the review app for your convenience. When a pull request is updated (e.g. more code is added), it will re-deploy the new code.
+
+Once a pull request has been closed, the review app infrastructure will be torn down to save costs. Should you wish to re-open a closed pull request, the review app will be spun up again.
+
+### Review app deployment failures
+
+One reason a review app deployment might fail is that it is attempting to run migrations which conflict with data in the database. For example, you might have introduced a unique constraint, but the database associated with the review app has duplicate data that would violate this constraint, so the migration cannot be run.
+
+## Destroying/recreating infrastructure
+
+Things to watch out for when destroying/creating infra:
+
+- All resources
+  - The lifecycle meta-argument `prevent_destroy` will stop you destroying things. Best to set this to `false` before trying to destroy!
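+  - As a sketch, a resource guarded this way (the resource and names below are illustrative, not taken from our Terraform) cannot be destroyed until the flag is flipped:
+
+    ```terraform
+    resource "aws_s3_bucket" "example_logs" {
+      bucket = "example-logs-bucket" # placeholder name
+
+      lifecycle {
+        # `terraform destroy` will error while this is true; set it to
+        # false (and apply) before attempting to destroy the resource.
+        prevent_destroy = true
+      }
+    }
+    ```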
+- Database
+  - `skip_final_snapshot` being false will prevent you from destroying the db without creating a final snapshot.
+- Load Balancer
+  - Sometimes when creating infra, you may see the error message `failure configuring LB attributes: InvalidConfigurationRequest: Access Denied for bucket: . Please check S3bucket permission` during a `terraform apply`. To get around this you may have to wait a few minutes and try applying again to ensure everything is fully updated (the error shouldn’t appear on the second attempt). It’s unclear what the exact cause is, but as this is related to infra that enables load balancer access logging, it is suspected there might be a delay with the S3 bucket permissions being realised or the load balancer recognising it can access the bucket.
+- S3
+  - Terraform won’t let you delete buckets that have objects in them.
+- Secrets
+  - If you destroy secrets, they will actually be marked as ‘scheduled to delete’ which will take effect after a minimum of 7 days. You can’t recreate secrets with the same name during this period.
+  - You may need to manually re-enter secret values into Secrets Manager at some point. When you do, just paste the secret value as plain text (don’t enter a key name, or format it as JSON).
+- ECS
+  - Sometimes task definitions don’t get deleted. You may need to manually delete them.
+  - After destroying the db, you’ll need to make sure the ad hoc ECS task which seeds the database gets run in order to set up the database correctly.
diff --git a/docs/infrastructure.md b/docs/infrastructure.md
index fedf3cf96..b3f384558 100644
--- a/docs/infrastructure.md
+++ b/docs/infrastructure.md
@@ -57,68 +57,5 @@ Where to find the Infrastructure?
 
 The infrastructure is managed as code. In the terraform folder of the codebase, there will be dedicated sub-folders for each of the aforementioned environments, where all the infrastructure for them is defined.
 
-## Production Deployment - -The application is set up so that it can be deployed via GitHub actions. We use Git tags to mark releases. The only pre-requisite is that your GitHub account is added to our team. - -To deploy you need to: - -1. Determine [previous version](https://github.com/communitiesuk/submit-social-housing-lettings-and-sales-data/tags), such as `v0.1.1`. -2. Create a [new release](https://github.com/communitiesuk/submit-social-housing-lettings-and-sales-data/releases/new) with subsequent version (e.g., `v0.1.2`). On this page, create a new tag with that version and generate release notes. Save as draft. -3. Post release notes on Slack. -4. Ensure that there are no other pipelines running on the repo right now. If a staging deployment is running, it must complete before you can deploy to production. -5. Publish release. This will trigger the deployment pipeline. -6. Monitor alerting, logging and Sentry. -7. Post success message on Slack. -8. Tag tickets as ‘Released’ and move tickets to done on JIRA. - -## Staging Deployment - -When a commit is made to `main` the following GitHub action jobs are triggered: - -1. **Test**: RSpec runs our test suite -2. **AWS Deploy**: If the Test stage passes, this job will deploy the app to AWS - -When a pull request is opened to `main` only the Test stage runs. - -## Review apps - -When a pull request is opened a review app will be spun up. Each review app has its own ECS Fargate cluster and Redis instances (plus any infrastructure to enable this), while the rest is shared. - -The review app github pipeline is independent of any test pipeline and therefore it will attempt to deploy regardless of the state the code is in. - -The usual seeding process takes place when the review app boots so there will be some minimal data that can be used to login with. 2FA has been disabled in the review apps for easier access. - -The app boots in a new environment called `development`. 
As such this is the environment you should filter by for sentry errors or to change any config. - -After a sucessful deployment a comment will be added to the pull request with the URL to the review app for your convenience. When a pull request is updated e.g. more code is added it will re-deploy the new code. - -Once a pull request has been closed the review app infrastructure will be tore down to save on any costs. Should you wish to re-open a closed pull request the review app will be spun up again. - -### Review app deployment failures - -One reason a review app deployment might fail is that it is attempting to run migrations which conflict with data in the database. For example you might have introduced a unique constraint, but the database associated with the review app has duplicate data in it that would violate this constraint, and so the migration cannot be run. - -## Destroying/recreating infrastructure - -Things to watch out for when destroying/creating infra: - -- All resources - - The lifecycle meta-argument prevent_destroy will stop you destroying things. Best to set this to false before trying to destroy! -- Database - - skip_final_snapshot being false will prevent you from destroying the db without creating a final snapshot. -- Load Balancer - - Sometimes when creating infra, you may see the error message: failure configuring LB attributes: InvalidConfigurationRequest: Access Denied for bucket: . Please check S3bucket permission during a terraform apply. To get around this you may have wait a few minutes and try applying again to ensure everything is fully updated (the error shouldn’t appear on the second attempt). It’s unclear what the exact cause is, but as this is related to infra that enables load balancer access logging, it is suspected there might be a delay with the S3 bucket permissions being realised or the load balancer recognising it can access the bucket. -- S3 - - Terraform won’t let you delete buckets that have objects in them. 
-- Secrets - - If you destroy secrets, they will actually be marked as ‘scheduled to delete’ which will take effect after a minimum of 7 days. You can’t recreate secrets with the same name during this period. If you want to destroy immediately, you need to do it from the command line (using your staging developer role, rather than your MHCLG-wide role used to apply Terraform) with this command: aws secretsmanager delete-secret --force-delete-without-recovery --secret-id . (Note that if a secret is marked as scheduled to delete, you can undo this in the console to make it an ‘active’ secret again.) - - You may need to manually re-enter secret values into Secrets Manager at some point. When you do, just paste the secret value as plain text (don’t enter a key name, or format it as JSON). -- ECS - - Sometimes task definitions don’t get deleted. You may need to manually delete them. - - After destroying the db, you’ll need to make sure the ad hoc ECS task which seeds the database gets run in order to set up the database correctly. -- SNS - - When creating an email subscription in an environment, Terraform will look up the email to use as the subscription endpoint from Secrets Manager. If you haven’t already created this (e.g. by running terraform apply -target="module.monitoring" -var="create_secrets_first=true") then this will lead to the subscription creation erroring, because it can’t retrieve the value of the secret (because it doesn’t exist yet). If this happens, remember you’ll need to go to Secrets Manager in the console and enter the desired email (as plaintext, no quotation marks or anything else required) as the value of the secret (which is most likely called MONITORING_EMAIL). Then run another apply with Terraform and this time it should succeed. 
- ![Architecture Diagram](https://raw.githubusercontent.com/communitiesuk/submit-social-housing-lettings-and-sales-data/main/docs/images/architecture_diagram.png) ![Context Diagram](https://raw.githubusercontent.com/communitiesuk/submit-social-housing-lettings-and-sales-data/main/docs/images/context_diagram.png) From 63ed5073db529d5b7a8023523ec7a7f3fbed0eb4 Mon Sep 17 00:00:00 2001 From: Samuel Young Date: Thu, 11 Dec 2025 12:18:20 +0000 Subject: [PATCH 2/3] CLDC-NONE: Renumber nav_order --- docs/adr/index.md | 2 +- docs/app_api.md | 2 +- docs/bulk_upload.md | 2 +- docs/csv_downloads.md | 2 +- docs/documentation_website.md | 2 +- docs/exports.md | 2 +- docs/form/index.md | 2 +- docs/monitoring.md | 2 +- docs/rake.md | 2 +- 9 files changed, 9 insertions(+), 9 deletions(-) diff --git a/docs/adr/index.md b/docs/adr/index.md index 2f26ec915..b4ee4f8ce 100644 --- a/docs/adr/index.md +++ b/docs/adr/index.md @@ -1,6 +1,6 @@ --- has_children: true -nav_order: 13 +nav_order: 14 --- # Architecture decisions diff --git a/docs/app_api.md b/docs/app_api.md index 627be0ad7..eb3a2c577 100644 --- a/docs/app_api.md +++ b/docs/app_api.md @@ -1,5 +1,5 @@ --- -nav_order: 8 +nav_order: 9 --- # Using the App API diff --git a/docs/bulk_upload.md b/docs/bulk_upload.md index b0a06c32a..8c45993fb 100644 --- a/docs/bulk_upload.md +++ b/docs/bulk_upload.md @@ -1,5 +1,5 @@ --- -nav_order: 11 +nav_order: 12 --- # Bulk Upload diff --git a/docs/csv_downloads.md b/docs/csv_downloads.md index 7dd82fb34..d544c57f5 100644 --- a/docs/csv_downloads.md +++ b/docs/csv_downloads.md @@ -1,5 +1,5 @@ --- -nav_order: 10 +nav_order: 11 --- # CSV Downloads diff --git a/docs/documentation_website.md b/docs/documentation_website.md index 728038fc5..b306a6398 100644 --- a/docs/documentation_website.md +++ b/docs/documentation_website.md @@ -1,5 +1,5 @@ --- -nav_order: 14 +nav_order: 15 --- # This documentation website diff --git a/docs/exports.md b/docs/exports.md index a35d7632e..a6e9c0b9d 100644 --- 
a/docs/exports.md +++ b/docs/exports.md @@ -1,5 +1,5 @@ --- -nav_order: 7 +nav_order: 8 --- # Exporting to CDS diff --git a/docs/form/index.md b/docs/form/index.md index ed21e3b10..4c6724b0c 100644 --- a/docs/form/index.md +++ b/docs/form/index.md @@ -1,6 +1,6 @@ --- has_children: true -nav_order: 9 +nav_order: 10 --- # Generating forms diff --git a/docs/monitoring.md b/docs/monitoring.md index 7b25801f7..705ccac34 100644 --- a/docs/monitoring.md +++ b/docs/monitoring.md @@ -1,5 +1,5 @@ --- -nav_order: 6 +nav_order: 7 --- # Logs and Debugging diff --git a/docs/rake.md b/docs/rake.md index 0fc28eb5c..c0dbd2c44 100644 --- a/docs/rake.md +++ b/docs/rake.md @@ -1,5 +1,5 @@ --- -nav_order: 11 +nav_order: 13 --- # Running Rake Tasks From c4aa06a889892316bf71d3c56223f098e83c9cbc Mon Sep 17 00:00:00 2001 From: Samuel Young Date: Fri, 19 Dec 2025 14:03:36 +0000 Subject: [PATCH 3/3] CLDC-NONE: Reinstate immediate secret deletion --- docs/deployments.md | 1 + 1 file changed, 1 insertion(+) diff --git a/docs/deployments.md b/docs/deployments.md index 423fcd26d..3c0c60096 100644 --- a/docs/deployments.md +++ b/docs/deployments.md @@ -60,6 +60,7 @@ Things to watch out for when destroying/creating infra: - Terraform won’t let you delete buckets that have objects in them. - Secrets - If you destroy secrets, they will actually be marked as ‘scheduled to delete’ which will take effect after a minimum of 7 days. You can’t recreate secrets with the same name during this period. + - If you want to destroy immediately, you need to do it from the command line (using AWS CLI, see [here](https://github.com/communitiesuk/submit-social-housing-lettings-and-sales-data-infrastructure/blob/main/docs/development_setup.md#set-up-aws-vault--cli)) with this command: aws secretsmanager delete-secret --force-delete-without-recovery --secret-id . (Note that if a secret is marked as scheduled to delete, you can undo this in the console to make it an ‘active’ secret again.) 
- You may need to manually re-enter secret values into Secrets Manager at some point. When you do, just paste the secret value as plain text (don’t enter a key name, or format it as JSON). - ECS - Sometimes task definitions don’t get deleted. You may need to manually delete them.
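A minimal sketch of kicking off the ad hoc ECS seeding task (mentioned under ECS above) from the AWS CLI. All resource names here are placeholders for illustration, not our real cluster or task definition names:

```shell
#!/bin/sh
# Sketch only: compose the ECS run-task command and print it for review
# before running it by hand. Names below are placeholders.
CLUSTER="example-app-cluster"          # placeholder cluster name
TASK_DEF="example-ad-hoc-seed-task"    # placeholder task definition

CMD="aws ecs run-task --cluster $CLUSTER --task-definition $TASK_DEF --launch-type FARGATE"
echo "$CMD"
```

Printing the command first (rather than executing it directly) makes it easy to sanity-check the cluster and task definition names before anything runs against real infrastructure.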