Improve documentation (#696)

* Modularise documentation
* Add some background about the service
* Add more instructions for local dependencies
* Form builder docs
* Stimulus and asset pipeline sections
* Infrastructure setup
* Add monitoring and logging
* Init form runner
* Export init
* Testing
* Testing
* Update architecture image
* Domain docs
* Org relationships
3 years ago · 3ab21c1625
17 changed files with 516 additions and 248 deletions
--- a/README.md
+++ b/README.md
@ -3,257 +3,33 @@
 [![Production CI/CD Pipeline](https://github.com/communitiesuk/mhclg-data-collection-beta/actions/workflows/production_pipeline.yml/badge.svg)](https://github.com/communitiesuk/mhclg-data-collection-beta/actions/workflows/production_pipeline.yml)
 [![Staging CI/CD Pipeline](https://github.com/communitiesuk/mhclg-data-collection-beta/actions/workflows/staging_pipeline.yml/badge.svg)](https://github.com/communitiesuk/mhclg-data-collection-beta/actions/workflows/staging_pipeline.yml)
-Codebase for the Ruby on Rails app that handles the submission of lettings and sales of social housing data in England.
+Ruby on Rails app that handles the submission of lettings and sales of social housing data in England. Currently in private beta.
-## API documentation
+## Domain documentation
 API documentation can be found here: <https://communitiesuk.github.io/mhclg-data-collection-beta>. This is driven by [OpenAPI docs](docs/api/DLUHC-CORE-Data.v1.json)
 ## Required Setup
 Pre-requisites:
 - Ruby
 - Rails
 - Postgres
 ### Quick start
 1. Copy the `.env.example` to `.env` and replace the database credentials with your local postgres user credentials.
 2. Install the dependencies:\
  `bundle install`
 3. Create the database:\
  `rake db:create`
 4. Run the database migrations:\
  `rake db:migrate`
 5. Seed the database if required:\
 `rake db:seed`
 6. Seed the database with rent ranges if required (~7000 rows per year):\
 `rake "data_import:rent_ranges[<start_year>,<rent_ranges_path>]"`
    For 2021-2022 ranges run:\
    `rake "data_import:rent_ranges[2021,config/rent_range_data/2021.csv]"`
 7. Install the frontend dependencies:\
  `yarn install`
 8. Start the dev servers using foreman:\
  `./bin/dev`
  Or start them individually:\
  a. Rails:\
    `bundle exec rails s`
  b. JS (for hot reloading):\
    `yarn build --mode=development --watch`
 If you're not modifying front end assets you can bundle them as a one off task:\
  `yarn build --mode=development`
 Development mode will target the latest versions of Chrome, Firefox and Safari for transpilation while production mode will target older browsers.
 The Rails server will start on <http://localhost:3000>.
 Running the test suite (front end assets need to be built or server needs to be running):\
  `bundle exec rspec`
 ### Using Docker
 1. Build the image:\
 `docker-compose build`
 2. Run the database migrations:\
 `docker-compose run --rm app /bin/bash -c 'rake db:migrate'`
 3. Seed the database if required:\
 `docker-compose run --rm app /bin/bash -c 'rake db:seed'`
 4. To be able to debug with Pry run the app using:\
 `docker-compose run --service-ports app`
 If this is not needed you can run `docker-compose up` as normal
 The Rails server will start on <http://localhost:8080>.
 ## Infrastructure
 This application is running on [GOV.UK PaaS](https://www.cloud.service.gov.uk/). To deploy you need to:
 1. Contact your organisation manager to get an account in `dluhc-core` organization and in the relevant spaces (staging/production).
 2. [Install the Cloud Foundry CLI](https://docs.cloudfoundry.org/cf-cli/install-go-cli.html)
 3. Login:\
 `cf login -a api.london.cloud.service.gov.uk -u <your_username>`
-4. Set your deployment target (staging/production):\
+- [Service overview](docs/service_overview.md)
-`cf target -o dluhc-core -s <deploy_environment>`
+- [User roles](docs/user_roles.md)
 - [Schemes](docs/schemes.md)
 - [Organisation relationships (Parent/Child)](docs/organisation_relationships.md)
-5. Deploy:\
+## Technical Documentation
 `cf push dluhc-core --strategy rolling`. This will use the [manifest file](staging_manifest.yml)
-Once the app is deployed:
+- [Developer setup](docs/developer_setup.md)
 - [Form builder](docs/form_builder.md)
 - [Form runner](docs/form_runner.md)
 - [Infrastructure & CI/CD pipelines](docs/infrastructure.md)
 - [Monitoring, logging & alerting](docs/monitoring.md)
 - [Frontend](docs/frontend.md)
 - [Testing strategies and style guide](docs/testing.md)
 - [Export to CDS](docs/exports)
-1. Get a Rails console:\
+## API documentation
 `cf ssh dluhc-core-staging -t -c "/tmp/lifecycle/launcher /home/vcap/app 'rails console' ''"`
 2. Check logs:\
 `cf logs dluhc-core-staging --recent`
 ### Troubleshooting deployments
 A failed GitHub deployment action will occasionally leave a Cloud Foundry deployment in a broken state. As a result, all subsequent GitHub deployment actions will also fail with the message `Cannot update this process while a deployment is in flight`. To clear it, cancel the in-flight deployment:\
  `cf cancel-deployment dluhc-core`
 You'd then need to check the logs and fix the issue that caused the initial deployment to fail.
 ## CI/CD
 When a commit is made to `main` the following GitHub action jobs are triggered:
 1. **Test**: RSpec runs our test suite
 2. **Deploy**: If the Test stage passes, this job will deploy the app to our GOV.UK PaaS account using the Cloud Foundry CLI
 When a pull request is opened to `main` only the Test stage runs.
 ## Frontend
 ### GOV.UK Design System components
 This service follows the guidance and recommendations from the [GOV.UK Design System](https://design-system.service.gov.uk). This is achieved using the following libraries:
 - **GOV.UK Frontend** – CSS and JavaScript for all Design System components\
  [Documentation](https://frontend.design-system.service.gov.uk) ·
  [GitHub](https://github.com/alphagov/govuk-frontend)
 - **GOV.UK Components** – Rails view components for non-form related Design System components\
  [Documentation](https://govuk-components.netlify.app) ·
  [Github](https://github.com/DFE-Digital/govuk-components) ·
  [RubyDoc](https://www.rubydoc.info/gems/govuk-components)
 - **GOV.UK FormBuilder** – Rails form builder for form related Design System components\
  [Documentation](https://govuk-form-builder.netlify.app) ·
  [GitHub](https://github.com/DFE-Digital/govuk-formbuilder) ·
  [RubyDoc](https://www.rubydoc.info/gems/govuk_design_system_formbuilder)
 ### Service-specific components
 Service-specific components are built using the [ViewComponent](https://viewcomponent.org) framework, and can be found in `app/components`.
 Components use HTML class names that follow the BEM methodology. We use the `app-*` prefix to prevent collisions with components provided by the Design System (which uses `govuk-*`). See [Extending and modifying components in production](https://design-system.service.gov.uk/get-started/extending-and-modifying-components/).
 Stylesheets are written using [Sass](https://sass-lang.com) (and the SCSS syntax), using the mixins and helpers provided by [govuk-frontend](https://frontend.design-system.service.gov.uk/sass-api-reference/).
 Separate stylesheets are used for each component, with filenames that match the component’s namespace.
 Like the components provided by the Design System, components are progressively enhanced. We use [Stimulus](https://stimulus.hotwired.dev) to add any client-side JavaScript enhancements.
 ## Single log submission form configuration
 The form for this is driven by a JSON file in `/config/forms/{start_year}_{end_year}.json`
 The JSON should follow the structure:
 ```jsonc
 {
  "form_type": "lettings" / "sales",
  "start_year": Integer, // i.e. 2020
  "end_year": Integer, // i.e. 2021
  "sections": {
    "[snake_case_section_name_string]": {
      "label": String,
      "description": String,
      "subsections": {
        "[snake_case_subsection_name_string]": {
          "label": String,
          "pages": {
            "[snake_case_page_name_string]": {
              "header": String,
              "description": String,
              "questions": {
                "[snake_case_question_name_string]": {
                  "header": String,
                  "hint_text": String,
                  "check_answer_label": String,
                  "type": "text" / "numeric" / "radio" / "checkbox" / "date",
                  "min": Integer, // numeric only
                  "max": Integer, // numeric only
                  "step": Integer, // numeric only
                  "width": 2 / 3 / 4 / 5 / 10 / 20, // text and numeric only
                  "prefix": String, // numeric only
                  "suffix": String, //numeric only
                  "answer_options": { // checkbox and radio only
                    "0": String,
                    "1": String
                  },
                  "conditional_for": {
                    "[snake_case_question_to_enable_1_name_string]": ["condition-that-enables"],
                    "[snake_case_question_to_enable_2_name_string]": ["condition-that-enables"]
                  },
                  "inferred_answers": { "field_that_gets_inferred_from_current_field": { "is_that_field_inferred": true } },
                  "inferred_check_answers_value": {
                    "condition": { "field_name_for_inferred_check_answers_condition": "field_value_for_inferred_check_answers_condition" },
                    "value": "Inferred value that gets displayed if condition is met"
                  }
                }
              },
              "depends_on": [{ "question_key": "answer_value_required_for_this_page_to_be_shown" }]
            }
          }
        }
      }
    }
  }
 }
 ```
 Assumptions made by the format:
 - All forms have at least 1 section
 - All sections have at least 1 subsection
 - All subsections have at least 1 page
 - All pages have at least 1 question
 - The ActiveRecord case log model has a field for each question name (must match). In the case of checkbox questions it must have one field for every answer option (again names must match).
 - Text not required by a page/question such as a header or hint text should be passed as an empty string
 - For conditionally shown questions, conditions that have been implemented and can be used are:
  - Radio question answer option selected matches one of conditional e.g. ["answer-options-1-string", "answer-option-3-string"]
  - Numeric question value matches condition e.g. [">2"], ["<7"] or ["== 6"]
 - When the top level question is a radio button and the conditional question is a numeric, text or date field then the conditional question is shown inline
 - When the conditional question is a radio, checkbox or select field it should be displayed on its own page and "depends_on" should be used rather than "conditional_for"
  Page routing:
  - Form navigation works by stepping sequentially through every page defined in the JSON form definition for the given subsection. For every page it checks whether it has "depends_on" conditions. If it does, it evaluates them to determine whether that page should be shown or not.
  - In this way we can build up whole branches by having:
  ```jsonc
  "page_1": { "questions": { "question_1": { "answer_options": ["A", "B"] } } },
  "page_2": { "questions": { "question_2": { "answer_options": ["C", "D"] } }, "depends_on": [{ "question_1": "A" }] },
  "page_3": { "questions": { "question_3": { "answer_options": ["E", "F"] } }, "depends_on": [{ "question_1": "A" }] },
  "page_4": { "questions": { "question_4": { "answer_options": ["G", "H"] } }, "depends_on": [{ "question_1": "B" }] },
  ```
 ### JSON form validation against Schema
 To validate the form JSON against the schema you can run:\
  `rake form_definition:validate["config/forms/2021_2022.json"]`
-n.b. You may have to escape square brackets in zsh\
+API documentation can be found here: <https://communitiesuk.github.io/mhclg-data-collection-beta>. This is driven by [OpenAPI docs](docs/api/DLUHC-CORE-Data.v1.json)
  `rake form_definition:validate\["config/forms/2021_2022.json"\]`
 This will validate the given form definition against the schema in `config/forms/schema/generic.json`.
-You can also run:\
+System architecture:
-  `rake form_definition:validate_all`
+![View of system architecture](docs/images/architecture.png)
-This will validate all forms in directories = `["config/forms", "spec/fixtures/forms"]`
+View of the service frontend:
 ![View of the logs list](docs/images/logs_list.png)
--- a/docs/developer_setup.md
+++ b/docs/developer_setup.md
@ -0,0 +1,142 @@
 # **Developing locally on host machine**
 The most common way to run a development version of the application is to run it with local dependencies.
 Dependencies:
 - Ruby
 - Rails
 - PostgreSQL
 - NodeJS
 - Gecko driver (https://github.com/mozilla/geckodriver/releases) [for running Selenium tests]
 We recommend using RBenv to manage Ruby versions.
 1. Install PostgreSQL
  Mac OS:
  ```bash
  brew install postgresql
  brew services start postgresql
  ```
  Linux (Debian):
  ```bash
  sudo apt install -y postgresql postgresql-contrib libpq-dev
  sudo systemctl start postgresql
  ```
 2. Create a Postgres user
  ```bash
  sudo su - postgres -c "createuser <username> -P"
  ```
 3. Install RBenv & Ruby-build
  Mac OS:
  ```bash
  brew install rbenv
  rbenv init
  mkdir -p ~/.rbenv/plugins
  git clone https://github.com/rbenv/ruby-build.git ~/.rbenv/plugins/ruby-build
  ```
  Linux (Debian):
  ```bash
  sudo apt install -y rbenv
  echo 'export PATH="/usr/local/rbenv/bin:$PATH"' >> ~/.bashrc
  rbenv init
  echo "# Load RBenv" >> ~/.bashrc
  echo 'eval "$(rbenv init -)"' >> ~/.bashrc
  mkdir -p ~/.rbenv/plugins
  git clone https://github.com/rbenv/ruby-build.git ~/.rbenv/plugins/ruby-build
  ```
 4. Install Ruby & Bundler
  ```bash
  rbenv install 3.1.2
  rbenv global 3.1.2
  gem install bundler
  ```
 5. Install JavaScript dependencies
  Mac OS:
  ```bash
  brew install node
  brew install yarn
  ```
  Linux (Debian):
  ```bash
  curl -sL https://deb.nodesource.com/setup_16.x | sudo -E bash -
  sudo apt -y install nodejs
  mkdir -p "$HOME/.npm-packages"
  npm config set prefix "$HOME/.npm-packages"
  echo 'NPM_PACKAGES="$HOME/.npm-packages"' >> ~/.bashrc
  echo 'export PATH="$PATH:$NPM_PACKAGES/bin"' >> ~/.bashrc
  npm install --global yarn
  ```
 6. Clone the repo
  ```bash
  git clone git@github.com:communitiesuk/submit-social-housing-lettings-and-sales-data.git
  ```
 ## App setup (OS agnostic)
 1. Copy the `.env.example` to `.env` and replace the database credentials with your local postgres user credentials.
 2. Install the dependencies:\
  `bundle install && yarn install`
 3. Create the database & run migrations:\
  `rake db:create db:migrate`
 4. Seed the database if required:\
  `rake db:seed`
 5. Start the dev servers
  a. Using foreman:\
  `./bin/dev`
  b. Individually:\
    i. Rails:\
    `bundle exec rails s`
    ii. JS (for hot reloading):\
    `yarn build --mode=development --watch`
  If you're not modifying front end assets you can bundle them as a one off task:\
    `yarn build --mode=development`
 Development mode will target the latest versions of Chrome, Firefox and Safari for transpilation while production mode will target older browsers.
 The Rails server will start on <http://localhost:3000>.
 Running the test suite (front end assets need to be built or server needs to be running):\
  `bundle exec rspec`
 # **Using Docker**
 1. Build the image:\
 `docker-compose build`
 2. Run the database migrations:\
 `docker-compose run --rm app /bin/bash -c 'rake db:migrate'`
 3. Seed the database if required:\
 `docker-compose run --rm app /bin/bash -c 'rake db:seed'`
 4. To be able to debug with Pry run the app using:\
 `docker-compose run --service-ports app`
 If this is not needed you can run `docker-compose up` as normal
 The Rails server will start on <http://localhost:8080>.
--- a/docs/exports.md
+++ b/docs/exports.md
@ -0,0 +1,15 @@
 # CDS exports
 All data collected by the application needs to be exported to the Consolidated Data Store (CDS) which is a data warehouse based on MS SQL running in the DAP (Data Analytics Platform).
 This is done via XML exports saved in an S3 bucket located in the DAP VPC using dedicated credentials shared out of band. The data mapping for this export can be found in `app/services/exports/case_log_export_service.rb`. Initially the application database field names and field types were chosen to match the existing CDS data as closely as possible to minimise the amount of transformation needed. This has led to a less than optimal data model though and increasingly we should look to transform at the mapping layer where beneficial for our application.
 The export service is triggered nightly using [Gov PaaS tasks](https://docs.cloudfoundry.org/devguide/using-tasks.html). These tasks are triggered from a Github action, as Gov PaaS does not currently support the Cloud Foundry Task Scheduler.
 The S3 bucket is located in the DAP VPC rather than the application VPC as DAP runs in an AWS account directly so access to the S3 bucket can be restricted to only the IPs used by the application. This is not possible the other way around as Gov PaaS does not support restricting S3 access by IP (https://github.com/alphagov/paas-roadmap/issues/107).
 ## Other options previously considered:
 - CDC replication using a managed service such as [AWS DMS](https://aws.amazon.com/dms/)
  - Would require VPC peering which Gov PaaS does not currently support (https://github.com/alphagov/paas-roadmap/issues/105)
  - Would require CDS to make changes to their ingestion model
--- a/docs/form_builder.md
+++ b/docs/form_builder.md
@ -0,0 +1,151 @@
 ## Single log submission form configuration
 ### Background
 Lettings and sales of social housing data is collected in annual "collection windows" that run from 1st April to 1st April. During this window the form and questions generally stay constant. The form will generally change by small amounts between each collection window. Typical changes are adding new questions, adding or removing answer options from questions or tweaking question wording for clarity.
 A paper form is produced for guidance and to help data providers collect the data offline, and a bulk upload template is circulated, which needs to match the online form.
 Data is accepted for a collection window for up to 3 months after it has finished to allow for late data submission. This means that between April and July two versions of the form run simultaneously.
 Other considerations that went into our design are being able to re-use as much of this solution for other data collections, and possibly having the ability to generate the form and/or form changes from a UI.
 We haven't used micro-services, preferring to deploy a single application for CLDC but we have modelled the form itself as configuration in the form of a JSON structure that acts as a sort of DSL/form builder for the form. The idea is to decouple the code that creates the required routes, controller methods, views etc to display the form from the actual wording of questions or order of pages such that it becomes possible to make changes to the form with little or no code changes.
 This should also mean that in the future it could be possible to create a UI that can construct the JSON config, which would open up the ability to make form changes to a wider audience. Doing this fully would require generating and running the necessary migrations for data storage, generating the required ActiveRecord methods to validate the data server side, and generating/updating API endpoints and documentation. All of this is likely to be beyond the scope of initial MVP but could be looked at in the future.
 Since initially the JSON config will not create database migrations or ActiveRecord model validations, it will instead assume that these have been correctly created for the config provided. The reasoning for this is the following assumptions:
 - The form will be tweaked regularly (amending questions wording, changing the order of questions or the page a question is displayed on)
 - The actual data collected will change very infrequently. Time series continuity is very important to ADD (Analysis and Data Directorate) so the actual data collected should stay largely consistent i.e. in general we can change the question wording in ways that makes the intent clearer or easier to understand, but not in ways that would make the data provider give a different answer.
 A form parser class will parse this config into Ruby objects/methods that can be used as an API by the rest of the application, such that we could change the underlying config if needed (for example swap JSON for YAML or for database objects) without needing to change the rest of the application. We'll call this the "Form Runner" part of the application.
 ### Setup this log
 The setup this log section is treated slightly differently from the rest of the form. It is more accurately viewed as providing metadata about the form than as being part of the form itself. It also needs to know far more about the application specific context than other parts of the form such as who the current user is, what organisation they're part of and what role they have etc.
 As a result it's not modelled as part of the config but rather as code. It still uses the same "Form Runner" components though.
 ### Features the Form Config supports
 - Defining sections, subsections, pages and questions that fit the GovUK tasklist pattern
 - Auto-generated routes - urls are automatically created from dasherized page names
 - Data persistence requires a database field to exist which matches the name/id for each question (and answer option for checkbox questions)
 - Text, Numeric, Date, Radio, Select and Checkbox question types
 - Conditional questions (`conditional_for`) - Radio and Checkbox questions can support "conditional" text or numeric questions that show/hide on the same page when the triggering option is selected
 - Routing (`depends_on`) - all pages can specify conditions (attributes of the case log) that determine whether or not they're shown to the user
  - Methods can be chained (i.e. you can have conditions in the form `{ owning_organisation.provider_type: "local_authority" }`), which will call `case_log.owning_organisation.provider_type` and compare the result to the provided value.
  - Numeric questions support math expression depends_on conditions such as `{ age2: ">16" }`
 - By default questions on pages that are not routed to are assumed to be invalid and are cleared. This can be prevented by setting `derived: true` on a question.
 - Questions can be optionally hidden from the check answers page of each section by setting `hidden_in_check_answers: true`. This can also take a condition.
 - Questions can be set as being inferred from other answers. This is similar to derived with the difference being that derived questions can be derived from anything not just other form question answers, and inferred answers are cleared when the answers they depend on change, whereas derived questions aren't.
 - Soft validation interruption pages can be included
 - For complex HTML guidance, partials can be referenced
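The `depends_on` features above (chained methods and numeric expressions) can be illustrated with a minimal sketch. This is not the application's implementation: `Organisation`, `CaseLog`, `resolve`, `condition_met?` and `routed_to?` are hypothetical names, and the rule that a page is shown when any one hash in the array has all of its conditions met is an assumption made for the example.

```ruby
# Hypothetical stand-ins for the real models; only the attributes used here exist.
Organisation = Struct.new(:provider_type)
CaseLog = Struct.new(:owning_organisation, :age2, :needstype)

# Matches numeric expressions such as ">16", "<7" or "== 6".
NUMERIC_EXPRESSION = /\A\s*(>=|<=|==|>|<)\s*(-?\d+)\s*\z/

# Resolve a possibly chained key such as "owning_organisation.provider_type"
# by calling each method in turn, starting from the log.
def resolve(log, key)
  key.to_s.split(".").reduce(log) { |object, method_name| object&.public_send(method_name) }
end

# True when the log's value for `key` satisfies `expected`.
def condition_met?(log, key, expected)
  actual = resolve(log, key)
  match = expected.is_a?(String) && NUMERIC_EXPRESSION.match(expected)
  return actual == expected unless match
  return false if actual.nil?

  # e.g. 18.public_send(">", 16)
  actual.public_send(match[1], match[2].to_i)
end

# Assumption: a page is shown when ANY hash has ALL of its conditions met.
def routed_to?(log, depends_on)
  return true if depends_on.nil? || depends_on.empty?

  depends_on.any? do |conditions|
    conditions.all? { |key, expected| condition_met?(log, key, expected) }
  end
end
```

For example, with `log = CaseLog.new(Organisation.new("local_authority"), 18, 1)`, calling `routed_to?(log, [{ "age2" => ">16" }])` resolves `age2` on the log and compares it numerically rather than as a string.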
 ### JSON Config
 The form for this is driven by a JSON file in `/config/forms/{start_year}_{end_year}.json`
 The JSON should follow the structure:
 ```jsonc
 {
  "form_type": "lettings" / "sales",
  "start_year": Integer, // i.e. 2020
  "end_year": Integer, // i.e. 2021
  "sections": {
    "[snake_case_section_name_string]": {
      "label": String,
      "description": String,
      "subsections": {
        "[snake_case_subsection_name_string]": {
          "label": String,
          "pages": {
            "[snake_case_page_name_string]": {
              "header": String,
              "description": String,
              "questions": {
                "[snake_case_question_name_string]": {
                  "header": String,
                  "hint_text": String,
                  "check_answer_label": String,
                  "type": "text" / "numeric" / "radio" / "checkbox" / "date",
                  "min": Integer, // numeric only
                  "max": Integer, // numeric only
                  "step": Integer, // numeric only
                  "width": 2 / 3 / 4 / 5 / 10 / 20, // text and numeric only
                  "prefix": String, // numeric only
                  "suffix": String, //numeric only
                  "answer_options": { // checkbox and radio only
                    "0": String,
                    "1": String
                  },
                  "conditional_for": {
                    "[snake_case_question_to_enable_1_name_string]": ["condition-that-enables"],
                    "[snake_case_question_to_enable_2_name_string]": ["condition-that-enables"]
                  },
                  "inferred_answers": { "field_that_gets_inferred_from_current_field": { "is_that_field_inferred": true } },
                  "inferred_check_answers_value": {
                    "condition": { "field_name_for_inferred_check_answers_condition": "field_value_for_inferred_check_answers_condition" },
                    "value": "Inferred value that gets displayed if condition is met"
                  }
                }
              },
              "depends_on": [{ "question_key": "answer_value_required_for_this_page_to_be_shown" }]
            }
          }
        }
      }
    }
  }
 }
 ```
 Assumptions made by the format:
 - All forms have at least 1 section
 - All sections have at least 1 subsection
 - All subsections have at least 1 page
 - All pages have at least 1 question
 - The ActiveRecord case log model has a field for each question name (must match). In the case of checkbox questions it must have one field for every answer option (again names must match).
 - Text not required by a page/question such as a header or hint text should be passed as an empty string
 - For conditionally shown questions, conditions that have been implemented and can be used are:
  - Radio question answer option selected matches one of conditional e.g. ["answer-options-1-string", "answer-option-3-string"]
  - Numeric question value matches condition e.g. [">2"], ["<7"] or ["== 6"]
 - When the top level question is a radio button and the conditional question is a numeric, text or date field then the conditional question is shown inline
 - When the conditional question is a radio, checkbox or select field it should be displayed on its own page and "depends_on" should be used rather than "conditional_for"
  Page routing:
  - Form navigation works by stepping sequentially through every page defined in the JSON form definition for the given subsection. For every page it checks whether it has "depends_on" conditions. If it does, it evaluates them to determine whether that page should be shown or not.
  - In this way we can build up whole branches by having:
  ```jsonc
  "page_1": { "questions": { "question_1": { "answer_options": ["A", "B"] } } },
  "page_2": { "questions": { "question_2": { "answer_options": ["C", "D"] } }, "depends_on": [{ "question_1": "A" }] },
  "page_3": { "questions": { "question_3": { "answer_options": ["E", "F"] } }, "depends_on": [{ "question_1": "A" }] },
  "page_4": { "questions": { "question_4": { "answer_options": ["G", "H"] } }, "depends_on": [{ "question_1": "B" }] },
  ```
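The sequential stepping described under page routing could look roughly like the sketch below. It reuses the four-page branch example, with the pages reduced to their `depends_on` conditions. `PAGES`, `page_shown?` and `next_page` are hypothetical names for illustration only, and answers are held in a plain hash rather than a case log.

```ruby
# Hypothetical page list mirroring the branch example: page name => depends_on.
PAGES = {
  "page_1" => nil,
  "page_2" => [{ "question_1" => "A" }],
  "page_3" => [{ "question_1" => "A" }],
  "page_4" => [{ "question_1" => "B" }],
}.freeze

# A page without depends_on is always shown; otherwise at least one
# condition hash must be fully satisfied by the current answers.
def page_shown?(answers, depends_on)
  return true if depends_on.nil?

  depends_on.any? do |conditions|
    conditions.all? { |question, value| answers[question] == value }
  end
end

# Step forward from current_page in definition order, skipping pages whose
# conditions are not met. Returns nil when no later page applies.
def next_page(answers, current_page, pages = PAGES)
  names = pages.keys
  start = names.index(current_page)
  return nil if start.nil?

  names[(start + 1)..].find { |name| page_shown?(answers, pages[name]) }
end
```

Answering "A" to `question_1` walks through `page_2` and `page_3` and skips `page_4`, while answering "B" jumps from `page_1` straight to `page_4`.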
 ### JSON form validation against Schema
 To validate the form JSON against the schema you can run:\
  `rake form_definition:validate["config/forms/2021_2022.json"]`
 n.b. You may have to escape square brackets in zsh\
  `rake form_definition:validate\["config/forms/2021_2022.json"\]`
 This will validate the given form definition against the schema in `config/forms/schema/generic.json`.
 You can also run:\
  `rake form_definition:validate_all`
 This will validate all forms in directories = `["config/forms", "spec/fixtures/forms"]`
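Separately from the schema validation run by the rake tasks, the structural assumptions listed earlier (at least one section, subsection, page and question) can be checked with plain Ruby. The sketch below uses only the standard `json` library. `structural_errors` is a hypothetical helper, not part of the application, and the inline form definition is made up.

```ruby
require "json"

# Returns a list of human-readable problems. An empty list means the
# "at least one of each level" assumptions hold for the given form hash.
def structural_errors(form)
  errors = []
  sections = form.fetch("sections", {})
  errors << "form has no sections" if sections.empty?

  sections.each do |section_name, section|
    subsections = section.fetch("subsections", {})
    errors << "section #{section_name} has no subsections" if subsections.empty?

    subsections.each do |subsection_name, subsection|
      pages = subsection.fetch("pages", {})
      errors << "subsection #{subsection_name} has no pages" if pages.empty?

      pages.each do |page_name, page|
        questions = page.fetch("questions", {})
        errors << "page #{page_name} has no questions" if questions.empty?
      end
    end
  end

  errors
end

# Usage with an inline, made-up form definition whose only page has no questions.
form = JSON.parse('{"sections":{"household":{"subsections":{"people":{"pages":{"age":{"questions":{}}}}}}}}')
puts structural_errors(form)
```

A check like this only covers shape. Whether question names match database fields still has to be verified against the model.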
 ### Improvements that could be made
 - JSON schema definition could be expanded such that we can better automatically validate that a given config is valid and internally consistent
 - Generators could parse a given valid JSON form and generate the required database migrations to ensure all the expected fields exist and are of a compatible type
 - The parsed form could be visualised using something like GraphViz to help manually verify the coded config meets requirements
--- a/docs/form_runner.md
+++ b/docs/form_runner.md
@ -0,0 +1,19 @@
 # Form Runner
 The form runner is composed of:
 Ruby Classes:
 - A singleton form handler that instantiates an instance of each form definition (config file we have) combined with the "setup" section that is common to all forms. This is created at Rails boot time. (`app/models/form_handler.rb`)
 - A Form class that is the entry point for parsing a form definition and handles most of the associated logic (`app/models/form.rb`)
 - Section, Subsection, Page and Question classes (`app/models/form/`)
 - Setup subsection specific instances (subclasses) of Section, Subsection, Pages and Questions (`app/form/setup/`)
 ERB Templates:
 - The page view which is the main view for each form page (`app/views/form/page.html.erb`)
 - Partials for each question type (radio, checkbox, select, text, numeric, date) (`app/views/form/`)
 - Partials for specific question guidance (`app/views/form/guidance`)
 - The check answers page which is the view for the answer summary page of each section (`app/views/form/check_answers.html.erb`)
 Routes for each form page are generated by looping over each Page instance in each Form instance held by the Form Handler and defining a "Get" path. The corresponding controller method is also auto-generated with meta-programming via the same looping in `app/controllers/form_controller.rb`
 All form pages submit to the same controller method (`app/controllers/form_controller.rb#submit_form`) which validates and persists the data, and then redirects to the next form page that identifies as "routed_to" given the current case log state.
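The route and controller-method generation described above can be approximated in plain Ruby. This is only a sketch of the metaprogramming pattern, not the code in `app/controllers/form_controller.rb`: `PAGE_NAMES`, `SketchFormController`, `page_routes` and the `/logs/:id/` path prefix are all assumptions made for illustration.

```ruby
# Hypothetical page names; in the application these would come from the
# Page instances held by the form handler.
PAGE_NAMES = %w[tenant_code person_1_age].freeze

class SketchFormController
  PAGE_NAMES.each do |page_name|
    # One action per page (e.g. #tenant_code), defined when the class loads.
    define_method(page_name) { "rendering #{page_name}" }
  end
end

# Build one GET path per page from the dasherized page name,
# mapped to the action that would handle it.
def page_routes(page_names)
  page_names.to_h { |name| ["/logs/:id/#{name.tr('_', '-')}", name.to_sym] }
end
```

Because both the paths and the actions are derived from the same list of page names, adding a page to a form definition would not require a new route or controller method to be written by hand.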
--- a/docs/frontend.md
+++ b/docs/frontend.md
@ -0,0 +1,49 @@
 ## Frontend
 ### GOV.UK Design System components
 This service follows the guidance and recommendations from the [GOV.UK Design System](https://design-system.service.gov.uk). This is achieved using the following libraries:
 - **GOV.UK Frontend** – CSS and JavaScript for all Design System components\
  [Documentation](https://frontend.design-system.service.gov.uk) ·
  [GitHub](https://github.com/alphagov/govuk-frontend)
 - **GOV.UK Components** – Rails view components for non-form related Design System components\
  [Documentation](https://govuk-components.netlify.app) ·
  [Github](https://github.com/DFE-Digital/govuk-components) ·
  [RubyDoc](https://www.rubydoc.info/gems/govuk-components)
 - **GOV.UK FormBuilder** – Rails form builder for form related Design System components\
  [Documentation](https://govuk-form-builder.netlify.app) ·
  [GitHub](https://github.com/DFE-Digital/govuk-formbuilder) ·
  [RubyDoc](https://www.rubydoc.info/gems/govuk_design_system_formbuilder)
 ### Service-specific components
 Service-specific components are built using the [ViewComponent](https://viewcomponent.org) framework, and can be found in `app/components`.
 Components use HTML class names that follow the BEM methodology. We use the `app-*` prefix to prevent collisions with components provided by the Design System (which uses `govuk-*`). See [Extending and modifying components in production](https://design-system.service.gov.uk/get-started/extending-and-modifying-components/).
 Stylesheets are written using [Sass](https://sass-lang.com) (and the SCSS syntax), using the mixins and helpers provided by [govuk-frontend](https://frontend.design-system.service.gov.uk/sass-api-reference/).
 Separate stylesheets are used for each component, with filenames that match the component’s namespace.
 Like the components provided by the Design System, components are progressively enhanced. We use [Stimulus](https://stimulus.hotwired.dev) to add any client-side JavaScript enhancements.
 ### Stimulus
 We use [Stimulus](https://stimulus.hotwired.dev/) to add custom JavaScript to the application.
 The general pattern is:
 - Register a controller in `app/frontend/controllers/index.js` - be sure to use kebab case for the controller identifier
 - Create that controller in `app/frontend/controllers/` - be sure to use underscore case for the file name
 - Attach the controller to the HTML element that should trigger its functionality
 ### Asset bundling and compilation
 - We use [Webpack](https://webpack.js.org/) via [jsbundling-rails](https://github.com/rails/jsbundling-rails) to bundle JavaScript, CSS and images. The configuration can be found in `webpack.config.js`.
 - We use [Propshaft](https://github.com/rails/propshaft) as our asset pipeline to serve the assets bundled/compiled by Webpack
 - We use [Babel](https://babeljs.io/) to transpile JavaScript down to ES5 for Internet Explorer compatibility. The configuration can be found in `babel.config.js`
 - We use [browserslist](https://github.com/browserslist/browserslist) to specify the browsers we want to transpile for. The configuration can be found in `package.json`
 - We include a number of polyfills to support Internet Explorer. These can be found in `app/frontend/application.js`
--- a/docs/images/architecture.png
+++ b/docs/images/architecture.png
--- a/docs/images/logs_list.png
+++ b/docs/images/logs_list.png
--- a/docs/images/organisational_relationships.png
+++ b/docs/images/organisational_relationships.png
--- a/docs/images/user_log_permissions.png
+++ b/docs/images/user_log_permissions.png
--- a/infrastructure_setup.md
+++ b/infrastructure_setup.md
@ -1,4 +1,53 @@
-# Staging
+## Infrastructure
 This application is running on [GOV.UK PaaS](https://www.cloud.service.gov.uk/). To deploy you need to:
 1. Contact your organisation manager to get an account in the `dluhc-core` organisation and in the relevant spaces (staging/production).
 2. [Install the Cloud Foundry CLI](https://docs.cloudfoundry.org/cf-cli/install-go-cli.html)
 3. Login:\
 `cf login -a api.london.cloud.service.gov.uk -u <your_username>`
 4. Set your deployment target (staging/production):\
 `cf target -o dluhc-core -s <deploy_environment>`
 5. Deploy:\
 `cf push dluhc-core --strategy rolling`. This will use the [manifest file](staging_manifest.yml)
 Once the app is deployed:
 1. Get a Rails console:\
 `cf ssh dluhc-core-staging -t -c "/tmp/lifecycle/launcher /home/vcap/app 'rails console' ''"`
 2. Check logs:\
 `cf logs dluhc-core-staging --recent`
 ### Troubleshooting deployments
 A failed GitHub deployment action will occasionally leave a Cloud Foundry deployment in a broken state. As a result, all subsequent GitHub deployment actions will also fail with the message `Cannot update this process while a deployment is in flight`. To clear this, cancel the in-flight deployment:\
 `cf cancel-deployment dluhc-core`
 You then need to check the logs and fix the issue that caused the initial deployment to fail.
 ## CI/CD
 When a commit is made to `main` the following GitHub Actions jobs are triggered:
 1. **Test**: RSpec runs our test suite
 2. **Deploy**: If the Test stage passes, this job will deploy the app to our GOV.UK PaaS account using the Cloud Foundry CLI
 When a pull request is opened to `main` only the Test stage runs.
 ## Setting up Infrastructure for a new environment
 ### Staging
 1. Login:\
  `cf login -a api.london.cloud.service.gov.uk -u <your_username>`
@ -27,7 +76,7 @@
  `cf create-service-key dluhc-core-staging-export-bucket data-export -c '{"allow_external_access": true, "permissions": "read-only"}'`
-# Production
+### Production
 1. Login:\
  `cf login -a api.london.cloud.service.gov.uk -u <your_username>`
--- a/docs/monitoring.md
+++ b/docs/monitoring.md
@ -0,0 +1,13 @@
 # Infrastructure Metric monitoring
 We use self-hosted Prometheus and Grafana for monitoring infrastructure metrics. These are run in a dedicated GOV.UK PaaS space called "monitoring" and are deployed as Docker images using GitHub Actions pipelines. The repository for these, along with more information, is here: [dluhc-data-collection-monitoring](https://github.com/communitiesuk/dluhc-data-collection-monitoring).
 # Application & Performance monitoring & alerting
 For application error and performance monitoring we use managed [Sentry](https://sentry.io/organizations/dluhc-core). You will need to be added to the DLUHC account to access this. It triggers Slack notifications to the #team-data-collection-alerts channel for all application errors in staging and production, and for any controller endpoints that have a P95 transaction duration > 250ms over a 24 hour period.
 # Logs
 For log persistence we use a managed ELK (Elasticsearch, Logstash, Kibana) stack provided by [Logit](https://logit.io/). You will need to be added to the DLUHC account to access this. Logs are retained for 14 days with a daily limit of 2GB.
 Logs are also available from GOV.UK PaaS directly via the CLI: `cf logs <app-name> --recent`.
--- a/docs/organisation_relationships.md
+++ b/docs/organisation_relationships.md
@ -0,0 +1,23 @@
 # Definitions
 - **Stock owning organisation** (parent): an organisation that owns housing stock. It may manage the allocation of people in and out of its accommodation, or it may contract this out to a managing agent (child).
 - **Managing agent** (child): an organisation that is contracted to manage the stock and tenants of another organisation that owns the stock. ‘Managing agent’ means the same as ‘child’ and is the term more commonly used by data providing organisations. Parent/child is what we call them internally, but these terms should not be used with external customers. Managing agents are responsible for the allocation of people in and out of the accommodation, and/or for the services provided to support those people in the accommodation (in the case of Supported Housing).
 # Permissions
 ## Organisational relationships:
 Organisations that own stock can contract out the management of that stock to another organisation. This relationship is often referred to as a parent/child relationship. This is a useful analogy as a parent can have multiple children, and a child can have many parents. A child organisation can also be a parent, and a parent organisation can also be a child organisation:
 ![Organisational relationships](images/organisational_relationships.png)
 The case logs that a user can see depend on their role:
  - Customer Support users can access any case log
  - Data coordinators can access any case log for which the organisation they work for is ultimately responsible, meaning they can also see logs managed by a child organisation
  - Data providers can only access case logs that their organisation manages (or directly owns)
 Taking the relationships from the above diagram, and looking at which logs each user can access:
 ![User log access permissions](images/user_log_permissions.png)
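
The role rules above can be sketched in plain Ruby. This is a minimal illustration, not the application's actual authorisation code: the `Log` and `User` structs, the role symbols and the `can_access_log?` helper are all hypothetical names, and it assumes that "ultimately responsible" means the user's organisation either owns or manages the log.

```ruby
# Hypothetical, simplified stand-ins for the real models.
Log = Struct.new(:owning_organisation, :managing_organisation)
User = Struct.new(:role, :organisation)

# Returns true if the user may see the log, following the role rules above.
def can_access_log?(user, log)
  case user.role
  when :support
    # Customer Support users can access any case log.
    true
  when :data_coordinator
    # Assumption: the organisation is responsible for logs it owns or manages,
    # which includes logs managed on its behalf by a child organisation.
    [log.owning_organisation, log.managing_organisation].include?(user.organisation)
  when :data_provider
    # Only logs the user's own organisation manages (including stock it owns
    # and manages itself).
    log.managing_organisation == user.organisation
  else
    false
  end
end

# A log for stock owned by a parent and managed by a child organisation.
contracted_log = Log.new("Parent Housing", "Child Agent")

coordinator_at_parent = User.new(:data_coordinator, "Parent Housing")
provider_at_parent = User.new(:data_provider, "Parent Housing")
provider_at_child = User.new(:data_provider, "Child Agent")

puts can_access_log?(coordinator_at_parent, contracted_log)
puts can_access_log?(provider_at_parent, contracted_log)
puts can_access_log?(provider_at_child, contracted_log)
```

In this sketch the coordinator at the parent organisation and the provider at the child organisation can both see the contracted log, while a provider at the parent cannot, because the parent does not manage it.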
--- a/docs/schemes.md
+++ b/docs/schemes.md
@ -0,0 +1,5 @@
 # Supported housing schemes
 - **Schemes**: Groups of similar properties in the same location, intended for similar tenants with the same type of support needs, managed in the same way. As some of the information we need about a new tenancy is the same for all new tenancies in the ‘scheme’, users can set up a ‘scheme’ in the CORE system by completing the information once. In Supported Housing forms, the user just supplies the appropriate scheme. This means providers do not have to complete identical information multiple times in each CORE form. Effectively we model these as "templates" or "predefined answer sets".
 - **Management groups**: Schemes are often managed together as part of a ‘management group’. An organisation may have multiple management groups, and each management group may have multiple schemes. For Supported Housing logs, users must select the management group first, then select the scheme.
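
One way to picture a scheme as a "predefined answer set" is shown below. This is only a sketch under stated assumptions: the `Scheme` struct, the `apply_scheme` helper and every answer key are hypothetical and do not reflect the real data model, and letting the scheme's answers take precedence over log-specific ones is an assumption made for illustration.

```ruby
# Hypothetical model: a scheme holds the answers shared by all its tenancies.
Scheme = Struct.new(:name, :answers)

# Combine the answers entered on a single log with the scheme's shared answers.
# Assumption: where both supply a value, the scheme's answer wins.
def apply_scheme(scheme, log_answers)
  log_answers.merge(scheme.answers)
end

# Illustrative answer keys only.
scheme = Scheme.new("Example scheme", { support_type: "low level", client_group: "older people" })
log_answers = { tenancy_start_date: "2022-05-01" }

completed = apply_scheme(scheme, log_answers)
puts completed.keys.inspect
```

The user supplies only the tenancy-specific answers and the scheme, and the shared answers are filled in from the scheme rather than being re-entered on every log.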
--- a/docs/service_overview.md
+++ b/docs/service_overview.md
@ -0,0 +1,5 @@
 ## Service
 All lettings and sales of social housing in England need to be logged with the Department for Levelling Up, Housing and Communities (DLUHC). This is done by Local Authorities and Housing Associations, who are the primary users of this service. Data is collected via a form that runs on an annual data collection window basis. Form changes are made annually to add new questions, remove any that are no longer needed, or adjust wording or answer options. Each data collection window runs from 1st April to 1st April of the following year, plus an extra 3 months to allow for any late submissions. This means that between April and July two collection windows are open simultaneously and logs can be submitted for either.
 ADD (Analytics & Data Directorate) statisticians are the other primary users of the service. The data collected is transferred to DLUHC's data warehouse (CDS - consolidated data store) via nightly exports to XML, which are transferred to S3 and ingested from there. CDS ingests and transforms the data, ultimately storing it in an MS SQL database and exposing it to analysts and statisticians via Amazon WorkSpaces.
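
The overlapping collection windows described above can be sketched as follows. The helper names `collection_window` and `open_windows` are hypothetical, and the sketch assumes a window is identified by its start year and that the 3 month extension ends on 1st July (treated here as an exclusive end date); the real form definitions may set their dates differently.

```ruby
require "date"

# Range of dates on which logs can be submitted for the window starting in
# `start_year`: 1st April until 3 months after the following 1st April.
def collection_window(start_year)
  opens = Date.new(start_year, 4, 1)
  closes = Date.new(start_year + 1, 4, 1) >> 3 # 1st July of the following year
  opens...closes # exclusive end is an assumption
end

# Start years of every collection window that is open on the given date.
def open_windows(date)
  [date.year - 1, date.year].select { |year| collection_window(year).cover?(date) }
end

puts open_windows(Date.new(2022, 2, 1)).inspect
puts open_windows(Date.new(2022, 5, 15)).inspect
puts open_windows(Date.new(2022, 8, 1)).inspect
```

Under these assumptions a date in May falls inside both the previous and the current year's window, while dates in February or August fall inside only one.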
--- a/docs/testing.md
+++ b/docs/testing.md
@ -0,0 +1,8 @@
 # Testing strategy
 - We use [RSpec](https://rspec.info/) and [Capybara](https://teamcapybara.github.io/capybara/)
 - Capybara is used for our feature tests. These use the Rack driver by default (faster) or the Gecko driver (installation required) when the `js: true` option is passed for a test.
 - Capybara is configured to run in headless mode but this can be toggled by commenting out `app/spec/rails_helper.rb#L14`
 - Capybara is configured to use the Gecko driver for JS tests, as Chrome is more commonly used and so is naturally more likely to be well tested already. This can be switched to the Chrome driver by changing `app/spec/rails_helper.rb#L13`
 - Feature specs are generally written sparingly as they're also the slowest. Where possible a request spec is preferred, as this still tests a large surface area (route, controller, model, view) without the performance impact. Request specs are not suitable for tests that need to run JavaScript or test that a specific set of UI events triggers a specific set of requests (with high confidence).
 - Test data is created with [FactoryBot](https://github.com/thoughtbot/factory_bot) wherever possible
--- a/docs/user_roles.md
+++ b/docs/user_roles.md
@ -0,0 +1,13 @@
 # External Users
 The primary users of the system are external data providing organisations: Local Authorities and Private Registered Providers (Housing Associations). These have 2 main user types:
 - Data Coordinators - administrators for their own organisation, can also complete logs
 - Data Providers - complete the logs
 Additionally there are Data Protection Officers (DPOs). At some organisations this is a separate role, but in our codebase it is modelled as an attribute of the user (i.e. a data coordinator or provider can additionally be a DPO). They are responsible for ensuring the organisation has signed the data sharing agreement.
 # Internal users
 - Customer support (helpdesk) - can administer all organisations
 - ADD statisticians - primary consumers of the data collected via CDS/DAP