Generate website for technical docs (#729)

* Generate website for technical docs
* Add docs to .cfignore
3 years ago · bf8bd0bfef
44 changed files with 714 additions and 289 deletions
--- a/.cfignore
+++ b/.cfignore
@@ -0,0 +1 @@
+app/docs/*
--- a/.gitignore
+++ b/.gitignore
@@ -28,6 +28,12 @@
 # Ignore master key for decrypting credentials and more.
 /config/master.key

+# Ignore generated documentation files
+_site
+.sass-cache
+.jekyll-cache
+.jekyll-metadata
+
 /public/packs
 /public/packs-test
 /node_modules
--- a/README.md
+++ b/README.md
@@ -7,26 +7,27 @@ Ruby on Rails app that handles the submission of lettings and sales of social ho

 ## Domain documentation

-- [Service overview](docs/service_overview.md)
-- [Organisations](docs/organisations.md)
-- [Users and roles](docs/users.md)
-- [Supported housing schemes](docs/schemes.md)
-
-## Technical Documentation
-
-- [Developer setup](docs/developer_setup.md)
-- [Frontend](docs/frontend.md)
-- [Testing strategy](docs/testing.md)
-- [Form Builder](docs/form_builder.md)
-- [Form Runner](docs/form_runner.md)
-- [Infrastructure](docs/infrastructure.md)
-- [Monitoring](docs/monitoring.md)
-- [Exporting to CDS](docs/exports.md)
-- [Architecture decision records](docs/adr)
-
-## API documentation
-
-API documentation can be found here: <https://communitiesuk.github.io/submit-social-housing-lettings-and-sales-data>. This is driven by [OpenAPI docs](docs/api/DLUHC-CORE-Data.v1.json)
+* [Domain and technical documentation](https://communitiesuk.github.io/submit-social-housing-lettings-and-sales-data)
+  * [Local development setup](https://communitiesuk.github.io/submit-social-housing-lettings-and-sales-data/setup)
+  * [Architecture decision records](https://communitiesuk.github.io/submit-social-housing-lettings-and-sales-data/adr)
+* [API browser](https://communitiesuk.github.io/submit-social-housing-lettings-and-sales-data/api) (using this [OpenAPI specification](docs/api/v1.json))
+* [Design history](https://core-design-history.herokuapp.com)
+
+### Running documentation locally
+
+The documentation website can be generated and served locally using Jekyll.
+
+1. Change into the `/docs/` directory:\
+`cd docs`
+
+2. Install Jekyll and its dependencies:\
+`bundle install`
+
+3. Start the Jekyll server:\
+`bundle exec jekyll serve`
+
+4. View the website:\
+<http://localhost:4000>

 ## System architecture

--- a/docs/Gemfile
+++ b/docs/Gemfile
@@ -0,0 +1,9 @@
+source "https://rubygems.org"
+
+group :jekyll_plugins do
+  gem "github-pages"
+end
+
+group :development do
+  gem "webrick"
+end
--- a/docs/Gemfile.lock
+++ b/docs/Gemfile.lock
@@ -0,0 +1,263 @@
+GEM
+  remote: https://rubygems.org/
+  specs:
+    activesupport (6.0.5)
+      concurrent-ruby (~> 1.0, >= 1.0.2)
+      i18n (>= 0.7, < 2)
+      minitest (~> 5.1)
+      tzinfo (~> 1.1)
+      zeitwerk (~> 2.2, >= 2.2.2)
+    addressable (2.8.0)
+      public_suffix (>= 2.0.2, < 5.0)
+    coffee-script (2.4.1)
+      coffee-script-source
+      execjs
+    coffee-script-source (1.11.1)
+    colorator (1.1.0)
+    commonmarker (0.23.5)
+    concurrent-ruby (1.1.10)
+    dnsruby (1.61.9)
+      simpleidn (~> 0.1)
+    em-websocket (0.5.3)
+      eventmachine (>= 0.12.9)
+      http_parser.rb (~> 0)
+    ethon (0.15.0)
+      ffi (>= 1.15.0)
+    eventmachine (1.2.7)
+    execjs (2.8.1)
+    faraday (2.3.0)
+      faraday-net_http (~> 2.0)
+      ruby2_keywords (>= 0.0.4)
+    faraday-net_http (2.0.3)
+    ffi (1.15.5)
+    forwardable-extended (2.6.0)
+    gemoji (3.0.1)
+    github-pages (226)
+      github-pages-health-check (= 1.17.9)
+      jekyll (= 3.9.2)
+      jekyll-avatar (= 0.7.0)
+      jekyll-coffeescript (= 1.1.1)
+      jekyll-commonmark-ghpages (= 0.2.0)
+      jekyll-default-layout (= 0.1.4)
+      jekyll-feed (= 0.15.1)
+      jekyll-gist (= 1.5.0)
+      jekyll-github-metadata (= 2.13.0)
+      jekyll-include-cache (= 0.2.1)
+      jekyll-mentions (= 1.6.0)
+      jekyll-optional-front-matter (= 0.3.2)
+      jekyll-paginate (= 1.1.0)
+      jekyll-readme-index (= 0.3.0)
+      jekyll-redirect-from (= 0.16.0)
+      jekyll-relative-links (= 0.6.1)
+      jekyll-remote-theme (= 0.4.3)
+      jekyll-sass-converter (= 1.5.2)
+      jekyll-seo-tag (= 2.8.0)
+      jekyll-sitemap (= 1.4.0)
+      jekyll-swiss (= 1.0.0)
+      jekyll-theme-architect (= 0.2.0)
+      jekyll-theme-cayman (= 0.2.0)
+      jekyll-theme-dinky (= 0.2.0)
+      jekyll-theme-hacker (= 0.2.0)
+      jekyll-theme-leap-day (= 0.2.0)
+      jekyll-theme-merlot (= 0.2.0)
+      jekyll-theme-midnight (= 0.2.0)
+      jekyll-theme-minimal (= 0.2.0)
+      jekyll-theme-modernist (= 0.2.0)
+      jekyll-theme-primer (= 0.6.0)
+      jekyll-theme-slate (= 0.2.0)
+      jekyll-theme-tactile (= 0.2.0)
+      jekyll-theme-time-machine (= 0.2.0)
+      jekyll-titles-from-headings (= 0.5.3)
+      jemoji (= 0.12.0)
+      kramdown (= 2.3.2)
+      kramdown-parser-gfm (= 1.1.0)
+      liquid (= 4.0.3)
+      mercenary (~> 0.3)
+      minima (= 2.5.1)
+      nokogiri (>= 1.13.4, < 2.0)
+      rouge (= 3.26.0)
+      terminal-table (~> 1.4)
+    github-pages-health-check (1.17.9)
+      addressable (~> 2.3)
+      dnsruby (~> 1.60)
+      octokit (~> 4.0)
+      public_suffix (>= 3.0, < 5.0)
+      typhoeus (~> 1.3)
+    html-pipeline (2.14.2)
+      activesupport (>= 2)
+      nokogiri (>= 1.4)
+    http_parser.rb (0.8.0)
+    i18n (0.9.5)
+      concurrent-ruby (~> 1.0)
+    jekyll (3.9.2)
+      addressable (~> 2.4)
+      colorator (~> 1.0)
+      em-websocket (~> 0.5)
+      i18n (~> 0.7)
+      jekyll-sass-converter (~> 1.0)
+      jekyll-watch (~> 2.0)
+      kramdown (>= 1.17, < 3)
+      liquid (~> 4.0)
+      mercenary (~> 0.3.3)
+      pathutil (~> 0.9)
+      rouge (>= 1.7, < 4)
+      safe_yaml (~> 1.0)
+    jekyll-avatar (0.7.0)
+      jekyll (>= 3.0, < 5.0)
+    jekyll-coffeescript (1.1.1)
+      coffee-script (~> 2.2)
+      coffee-script-source (~> 1.11.1)
+    jekyll-commonmark (1.4.0)
+      commonmarker (~> 0.22)
+    jekyll-commonmark-ghpages (0.2.0)
+      commonmarker (~> 0.23.4)
+      jekyll (~> 3.9.0)
+      jekyll-commonmark (~> 1.4.0)
+      rouge (>= 2.0, < 4.0)
+    jekyll-default-layout (0.1.4)
+      jekyll (~> 3.0)
+    jekyll-feed (0.15.1)
+      jekyll (>= 3.7, < 5.0)
+    jekyll-gist (1.5.0)
+      octokit (~> 4.2)
+    jekyll-github-metadata (2.13.0)
+      jekyll (>= 3.4, < 5.0)
+      octokit (~> 4.0, != 4.4.0)
+    jekyll-include-cache (0.2.1)
+      jekyll (>= 3.7, < 5.0)
+    jekyll-mentions (1.6.0)
+      html-pipeline (~> 2.3)
+      jekyll (>= 3.7, < 5.0)
+    jekyll-optional-front-matter (0.3.2)
+      jekyll (>= 3.0, < 5.0)
+    jekyll-paginate (1.1.0)
+    jekyll-readme-index (0.3.0)
+      jekyll (>= 3.0, < 5.0)
+    jekyll-redirect-from (0.16.0)
+      jekyll (>= 3.3, < 5.0)
+    jekyll-relative-links (0.6.1)
+      jekyll (>= 3.3, < 5.0)
+    jekyll-remote-theme (0.4.3)
+      addressable (~> 2.0)
+      jekyll (>= 3.5, < 5.0)
+      jekyll-sass-converter (>= 1.0, <= 3.0.0, != 2.0.0)
+      rubyzip (>= 1.3.0, < 3.0)
+    jekyll-sass-converter (1.5.2)
+      sass (~> 3.4)
+    jekyll-seo-tag (2.8.0)
+      jekyll (>= 3.8, < 5.0)
+    jekyll-sitemap (1.4.0)
+      jekyll (>= 3.7, < 5.0)
+    jekyll-swiss (1.0.0)
+    jekyll-theme-architect (0.2.0)
+      jekyll (> 3.5, < 5.0)
+      jekyll-seo-tag (~> 2.0)
+    jekyll-theme-cayman (0.2.0)
+      jekyll (> 3.5, < 5.0)
+      jekyll-seo-tag (~> 2.0)
+    jekyll-theme-dinky (0.2.0)
+      jekyll (> 3.5, < 5.0)
+      jekyll-seo-tag (~> 2.0)
+    jekyll-theme-hacker (0.2.0)
+      jekyll (> 3.5, < 5.0)
+      jekyll-seo-tag (~> 2.0)
+    jekyll-theme-leap-day (0.2.0)
+      jekyll (> 3.5, < 5.0)
+      jekyll-seo-tag (~> 2.0)
+    jekyll-theme-merlot (0.2.0)
+      jekyll (> 3.5, < 5.0)
+      jekyll-seo-tag (~> 2.0)
+    jekyll-theme-midnight (0.2.0)
+      jekyll (> 3.5, < 5.0)
+      jekyll-seo-tag (~> 2.0)
+    jekyll-theme-minimal (0.2.0)
+      jekyll (> 3.5, < 5.0)
+      jekyll-seo-tag (~> 2.0)
+    jekyll-theme-modernist (0.2.0)
+      jekyll (> 3.5, < 5.0)
+      jekyll-seo-tag (~> 2.0)
+    jekyll-theme-primer (0.6.0)
+      jekyll (> 3.5, < 5.0)
+      jekyll-github-metadata (~> 2.9)
+      jekyll-seo-tag (~> 2.0)
+    jekyll-theme-slate (0.2.0)
+      jekyll (> 3.5, < 5.0)
+      jekyll-seo-tag (~> 2.0)
+    jekyll-theme-tactile (0.2.0)
+      jekyll (> 3.5, < 5.0)
+      jekyll-seo-tag (~> 2.0)
+    jekyll-theme-time-machine (0.2.0)
+      jekyll (> 3.5, < 5.0)
+      jekyll-seo-tag (~> 2.0)
+    jekyll-titles-from-headings (0.5.3)
+      jekyll (>= 3.3, < 5.0)
+    jekyll-watch (2.2.1)
+      listen (~> 3.0)
+    jemoji (0.12.0)
+      gemoji (~> 3.0)
+      html-pipeline (~> 2.2)
+      jekyll (>= 3.0, < 5.0)
+    kramdown (2.3.2)
+      rexml
+    kramdown-parser-gfm (1.1.0)
+      kramdown (~> 2.0)
+    liquid (4.0.3)
+    listen (3.7.1)
+      rb-fsevent (~> 0.10, >= 0.10.3)
+      rb-inotify (~> 0.9, >= 0.9.10)
+    mercenary (0.3.6)
+    minima (2.5.1)
+      jekyll (>= 3.5, < 5.0)
+      jekyll-feed (~> 0.9)
+      jekyll-seo-tag (~> 2.1)
+    minitest (5.16.2)
+    nokogiri (1.13.6-arm64-darwin)
+      racc (~> 1.4)
+    octokit (4.25.1)
+      faraday (>= 1, < 3)
+      sawyer (~> 0.9)
+    pathutil (0.16.2)
+      forwardable-extended (~> 2.6)
+    public_suffix (4.0.7)
+    racc (1.6.0)
+    rb-fsevent (0.11.1)
+    rb-inotify (0.10.1)
+      ffi (~> 1.0)
+    rexml (3.2.5)
+    rouge (3.26.0)
+    ruby2_keywords (0.0.5)
+    rubyzip (2.3.2)
+    safe_yaml (1.0.5)
+    sass (3.7.4)
+      sass-listen (~> 4.0.0)
+    sass-listen (4.0.0)
+      rb-fsevent (~> 0.9, >= 0.9.4)
+      rb-inotify (~> 0.9, >= 0.9.7)
+    sawyer (0.9.2)
+      addressable (>= 2.3.5)
+      faraday (>= 0.17.3, < 3)
+    simpleidn (0.2.1)
+      unf (~> 0.1.4)
+    terminal-table (1.8.0)
+      unicode-display_width (~> 1.1, >= 1.1.1)
+    thread_safe (0.3.6)
+    typhoeus (1.4.0)
+      ethon (>= 0.9.0)
+    tzinfo (1.2.9)
+      thread_safe (~> 0.1)
+    unf (0.1.4)
+      unf_ext
+    unf_ext (0.0.8.2)
+    unicode-display_width (1.8.0)
+    webrick (1.7.0)
+    zeitwerk (2.6.0)
+
+PLATFORMS
+  arm64-darwin-21
+
+DEPENDENCIES
+  github-pages
+  webrick
+
+BUNDLED WITH
+   2.3.4
--- a/docs/_config.yml
+++ b/docs/_config.yml
@@ -0,0 +1,12 @@
+title: "CORE Tech Docs"
+remote_theme: just-the-docs/just-the-docs
+lang: en-GB
+permalink: pretty
+color_scheme: govuk
+aux_links:
+  "API browser":
+    - /api
+  "Design history":
+    - https://core-design-history.herokuapp.com
+  "GitHub":
+    - https://github.com/communitiesuk/submit-social-housing-lettings-and-sales-data
--- a/docs/_sass/color_schemes/govuk.scss
+++ b/docs/_sass/color_schemes/govuk.scss
@@ -0,0 +1,3 @@
+$btn-primary-color: #00703c;
+$link-color: #1d70b8;
+$grey-lt-000: #f8f8f8;
--- a/docs/adr/adr-001-initial-architecture-decisions.md
+++ b/docs/adr/adr-001-initial-architecture-decisions.md
@@ -1,31 +1,34 @@
-### ADR - 001: Initial Architecture Decisions
+---
+parent: Architecture decisions
+---

-##### Application Framework
+# 001: Initial architecture decisions

-Ruby on Rails
-- Well established and commonly used within MHCLG and gov.uk in general
+## Ruby on Rails
+
+- Well established and commonly used within DLUHC and GOV.UK in general
 - Good ecosystem for common web app tasks, quick productivity
 - Matches team skill set
 - Analysis/RAP pipelines will sit in the DAP platform and not this application directly so optimising for web framework tasks makes sense.

-Testing
+## Testing
+
 - Rspec for unit testing
 - Capybara or Cypress-Rails for front end testing
 - TDD or ATDD approach
 - No specific code coverage target or deploy gate as we feel this leads to arbitrary metric chasing and is counter-productive

-Front end
+## Front end
+
 - In the same app codebase
 - ERB templates

-Code style and linting
-- Gov.uk Rubocop for Ruby style
-- .editorconfig for whitespace, newlines etc
-
+## Code style and linting

-<br />
+- GOV.UK Rubocop for Ruby style
+- `.editorconfig` for whitespace, newlines etc

-#### Ways of Working
+## Ways of working

 - Flexible approach to branching. Generally Trunk based CI (every TDD round results in a commit and push to master) when pairing, branches and PR when doing solo or more exploratory work.
 - Github actions for automated test, build, deploy pipeline
--- a/docs/adr/adr-002-repositories.md
+++ b/docs/adr/adr-002-repositories.md
@@ -1,6 +1,8 @@
-### ADR - 002: Initial Architecture Decisions
+---
+parent: Architecture decisions
+---

-#### Repositories
+# 002: Repositories

 There will be two git repositories for this project.

--- a/docs/adr/adr-003-form-submission-flow.md
+++ b/docs/adr/adr-003-form-submission-flow.md
@@ -1,19 +1,21 @@
-### ADR - 003: Form Submission Flow
+---
+parent: Architecture decisions
+---

-Turbo Frames (https://github.com/hotwired/turbo-rails) for form pages/questions with data saved (but not necessarily fully validated) to Active Record model on each submit.
+# 003: Form submission flow

+[Turbo Frames](https://github.com/hotwired/turbo-rails) for form pages/questions with data saved (but not necessarily fully validated) to Active Record model on each submit.

-#### Impact on Performance
+## Impact on performance

 Using Turbo Frames allows us to swap out just the question part of the page without needing full page refreshes as you go through the form and provides a "Single Page Application like" user experience. Each question still gets a unique URL that can be navigated to directly with the Case Log ID and the overall user experience is that form navigation feels faster.

-#### Impact on interrupted sessions
+## Impact on interrupted sessions

-We currently have a single Active Record model for Case Logs that contains all the question fields. Every time a question is submitted the answer will be saved in the Active Record model instance before the next frame is rendered. This model will need to be able to handle partial records and partial validation anyway since not all API users will have all the required data. Validation can occur based on the data already saved and/or once the form is finally submitted. Front end validation will still happen additionally as you go through the form to help make sure users don't get a long list of errors at the end. Using session data here and updating the model only once the form is completed would not seem to have any advantages over this approach.
+We currently have a single Active Record model for Case Logs that contains all the question fields. Every time a question is submitted the answer will be saved in the Active Record model instance before the next frame is rendered. This model will need to be able to handle partial records and partial validation anyway since not all API users will have all the required data. Validation can occur based on the data already saved and/or once the form is finally submitted. Front end validation will still happen additionally as you go through the form to help make sure users don’t get a long list of errors at the end. Using session data here and updating the model only once the form is completed would not seem to have any advantages over this approach.

 This means that when a user navigates away from the form or closes the tab etc, they can use the URL to navigate directly back to where they left off, or follow the form flow through again, and in both cases their submitted answers will still be there.

-
-#### Impact on API
+## Impact on API

 The API will still expect to take a JSON describing the case log, instantiate the model with the given fields, and run validations as if it had been submitted.
--- a/docs/adr/adr-004-gov-paas.md
+++ b/docs/adr/adr-004-gov-paas.md
@@ -1,8 +1,10 @@
-### ADR - 004: Infrastructure Switch
+---
+parent: Architecture decisions
+---

-#### Gov PaaS
+# 004: Infrastructure switch to GOV.UK PaaS

-The application infrastructure will be moved from the initial AWS set up to Gov PaaS. The initial expectation is to have a Gov PaaS account `dluhc-core` with 2 spaces `sandbox`, `production`.
+The application infrastructure will be moved from the initial AWS set up to GOV.UK PaaS. The initial expectation is to have a GOV.UK PaaS account `dluhc-core` with 2 spaces `sandbox`, `production`.

 Sandbox will consist of 2 small instances (512M) and 1 tiny-unencrypted-13 Postgres instance.

@@ -10,11 +12,11 @@ Production infrastructure sizing will be decided at a later time and once our ac

 The reasoning for this is:

-- Department policy is to use Gov PaaS whenever possible
-- DLUHC does not have a lot of internal dev ops skills/capacity so by leveraging Gov PaaS we can leverage having most of the monitoring, running, scaling and security already provided.
+- Department policy is to use GOV.UK PaaS whenever possible
+- DLUHC does not have a lot of internal dev ops skills/capacity so by leveraging GOV.UK PaaS we can leverage having most of the monitoring, running, scaling and security already provided.
 - We get a simpler infrastructure setup than the AWS setup we currently have
-- All of the infrastructure we currently require is well supported on Gov PaaS
+- All of the infrastructure we currently require is well supported on GOV.UK PaaS

 One potential downside is that data replication to CDS may be slightly more complicated as adding our database to a VPC requires the Gov PaaS support team to do that on our behalf.

-This also means the Github repository previously used for [Infrastructure](https://github.com/communitiesuk/mhclg-data-collection-beta-infrastructure) will be archived after this change goes in as it won't be needed anymore.
+This also means the GitHub repository previously used for [Infrastructure](https://github.com/communitiesuk/mhclg-data-collection-beta-infrastructure) will be archived after this change goes in as it won’t be needed any more.
--- a/docs/adr/adr-005-form-definition.md
+++ b/docs/adr/adr-005-form-definition.md
@@ -1,6 +1,8 @@
-### ADR - 005: Form Definition
+---
+parent: Architecture decisions
+---

-#### Config driven front-end
+# 005: Config driven frontend

 We will initially try to model the form as a JSON structure that should describe all the information needed to display the form to the user. That means it will need to describe the sections, subsections, pages, questions, answer options etc.

@@ -10,16 +12,16 @@ This should also mean that in the future it could be possible to create a UI tha

 Since initially the JSON config will not create database migrations or ActiveRecord model validations, it will instead assume that these have been correctly created for the config provided. The reasoning for this is the following assumptions:

-- The form will be tweaked regularly (amending questions wording, changing the order of questions or the page a question is displayed on)
+- The form will be tweaked regularly (amending questions wording, changing the order of questions or the page a question is displayed on).
 - The actual data collected will change very infrequently. Time series continuity is very important to ADD (Analysis and Data Directorate) so the actual data collected should stay largely consistent i.e. in general we can change the question wording in ways that makes the intent clearer or easier to understand, but not in ways that would make the data provider give a different answer.

 A form parser class will parse this config into ruby objects/methods that can be used as an API by the rest of the application, such that we could change the underlying config if needed (for example swap JSON for YAML or for DataBase objects) without needing to change the rest of the application.

-#### JSON Structure
+## JSON Structure

 First pass of a form definition

-```
+```json
 {
  form_type: [lettings/sales]
  start_year: yyyy
--- a/docs/adr/adr-006-saving-values.md
+++ b/docs/adr/adr-006-saving-values.md
@@ -1,8 +1,10 @@
-### ADR - 006: Saving values to the database
+---
+parent: Architecture decisions
+---

-We have opted to save values to the database directly instead of saving keys/numbers that need to be converted with enums in models using active record.
+# 006: Saving values to the database

-### Saving values to the database
+We have opted to save values to the database directly instead of saving keys/numbers that need to be converted with enums in models using active record.

 There are a few reasons we have opted to save the values directly, they are as follows

@@ -10,14 +12,12 @@ There are a few reasons we have opted to save the values directly, they are as f

 - Currently there is no need to abstract the data as the data should be safe from being accessed by anyone external to the project

-- It doesn't require additional dev work to map keys/numbers to values, we can just pull the values out directly and use them in the code, for example on the check answers page
-
-
+- It doesn’t require additional dev work to map keys/numbers to values, we can just pull the values out directly and use them in the code, for example on the check answers page

-### Drawbacks
+## Drawbacks

-- Changing the wording/casing of the answers could result in discrepancies in the database
+- Changing the wording/casing of the answers could result in discrepancies in the database.

-- There is a small risk that if the database is accessed by someone unauthorised they would have access to personally identifiable information if we were to collect Any. We  will be mitigating this risk by encrypting the production database 
+- There is a small risk that if the database is accessed by someone unauthorised they would have access to personally identifiable information if we were to collect any. We will be mitigating this risk by encrypting the production database.

 This decision is not too difficult to change and can be revisited in the future if there is sufficient reason to switch to storing keys/numbers and using enums and active record to convert those to the appropriate values.
--- a/docs/adr/adr-007-data-validations.md
+++ b/docs/adr/adr-007-data-validations.md
@@ -1,4 +1,8 @@
-### ADR - 007: Data Validations
+---
+parent: Architecture decisions
+---
+
+# 007: Data validations

 Data validations that happen in CORE at the point of data collection fall into two categories:

@@ -7,21 +11,20 @@ Data validations that happen in CORE at the point of data collection fall into t

 These are handled slightly differently:

-##### Validity checks
+## Validity checks

-These run for all submitted data. Every time a form page (in the UI) is submitted, the fields related to that form page will be checked to ensure that any responses given are valid. If they are not, an error message will be shown on screen, and it will not be possible to "Save and continue" until the response is fixed or removed.
+These run for all submitted data. Every time a form page (in the UI) is submitted, the fields related to that form page will be checked to ensure that any responses given are valid. If they are not, an error message will be shown on screen, and it will not be possible to ‘Save and continue’ until the response is fixed or removed.

 Similarly if an API request is made to create a case log with data that contains _invalid_ fields, that data will be rejected, and an error message will be returned.

+## Presence checks

-##### Presence checks
-
-These are not strictly error checks since it's possible to submit partial data. In the form UI it is possible to click "Save and continue" and move past questions that you might not know right now, and leave them to come back to later. We shouldn't prevent this workflow.
+These are not strictly error checks since it’s possible to submit partial data. In the form UI it is possible to click ‘Save and continue’ and move past questions that you might not know right now, and leave them to come back to later. We shouldn’t prevent this workflow.

 Similarly the API client (3rd party software system) may not have all the required data and may only be submitting a partial log. This is still a valid use case so we should not be enforcing presence checks and returning errors based on them for either submission type.

-Instead we determine the _status_ of the case log based the presence checks. Every time data is submitted (via a form page, bulk upload or API), before saving the data, the system will check whether all fields have been completed *and* pass validity checks. If so, the case log will be marked as *completed*, if not it will be marked as *in progress*.
+Instead we determine the _status_ of the case log based on the presence checks. Every time data is submitted (via a form page, bulk upload or API), before saving the data, the system will check whether all fields have been completed _and_ pass validity checks. If so, the case log will be marked as _completed_, if not it will be marked as _in progress_.

 By default all fields that a Case Log has will be assumed to be required unless explicitly marked as not required (for example as a result of other answers rendering a question inapplicable).

-On the form UI this will work by not allowing you to "submit" the form, until all presence checks have been satisfied, but all other navigation is allowed. On the API this will work by returning a Case Log that is "in progress" if you've submitted a partial log, or "completed" if you've submitted a full log, or "Errors" if you've submitted an invalid log.
+On the form UI this will work by not allowing you to submit the form, until all presence checks have been satisfied, but all other navigation is allowed. On the API this will work by returning a Case Log that is ‘in progress’ if you’ve submitted a partial log, or ‘completed’ if you’ve submitted a full log, or ‘errors’ if you’ve submitted an invalid log.
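The split between validity checks and presence checks in ADR 007 can be sketched in plain Ruby. This is a minimal illustration, not the application's code: `CaseLogSketch`, the `tenancy_code` and `age` fields, and the 0 to 120 age rule are all hypothetical names and values chosen for the example.

```ruby
# Illustrative sketch only: the class, field names and validity rule are
# assumptions, not the application's real model.
class CaseLogSketch
  # By default every field is treated as required.
  REQUIRED_FIELDS = %i[tenancy_code age].freeze

  # Validity rules run only against answers that were actually given.
  VALIDITY_RULES = {
    age: ->(value) { value.is_a?(Integer) && value.between?(0, 120) },
  }.freeze

  def initialize(answers, not_required: [])
    @answers = answers
    @not_required = not_required
  end

  # Validity check: fields that were answered but fail their rule.
  def invalid_fields
    @answers.select { |field, value|
      rule = VALIDITY_RULES[field]
      rule && !value.nil? && !rule.call(value)
    }.keys
  end

  # Presence check: required fields that have no answer yet.
  def missing_fields
    (REQUIRED_FIELDS - @not_required).select { |field| @answers[field].nil? }
  end

  # Invalid data is an error; missing data only affects the status.
  def status
    return :errors unless invalid_fields.empty?

    missing_fields.empty? ? :completed : :in_progress
  end
end
```

In this sketch a partial log such as `CaseLogSketch.new({ tenancy_code: "A1" })` is never rejected; it simply stays in progress until the remaining required fields are answered or marked as not required.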
--- a/docs/adr/adr-008-field-names.md
+++ b/docs/adr/adr-008-field-names.md
@@ -1,11 +1,11 @@
-### ADR - 008: Field Names
+---
+parent: Architecture decisions
+---

-We are changing the schema to reflect the way the data is stored in CORE. 
-This is due to the SPSS queries that are being performed by ADD and the complexity that would come with changing them.
+# 008: Field names

-The field names are saved lowercase as opposed to the uppercase versions we see in CORE.
-This is due to Ruby expecting the uppercase parameters to be constants and database fields are expected to be lower case.
-These fields could be mapped to their uppercase versions during the replication if needed. 
+We are changing the schema to reflect the way the data is stored in CORE. This is due to the SPSS queries that are being performed by ADD and the complexity that would come with changing them.

-A lot of the values are now also being stored as enums. 
-This gives as some validation by default as the values not defined in the enums will fail to save. 
+The field names are saved lowercase as opposed to the uppercase versions we see in CORE. This is due to Ruby expecting the uppercase parameters to be constants and database fields are expected to be lower case. These fields could be mapped to their uppercase versions during the replication if needed.
+
+A lot of the values are now also being stored as enums. This gives us some validation by default as the values not defined in the enums will fail to save.
--- a/docs/adr/adr-009-form-routing-logic.md
+++ b/docs/adr/adr-009-form-routing-logic.md
@@ -1,12 +1,16 @@
-### ADR - 009: Form Routing Logic
+---
+parent: Architecture decisions
+---
+
+# 009: Form routing logic

 There are 2 ways you can think about form (page) routing logic:

-1. Based on the answer you give to a page you are navigated to some point in the form, i.e. a "Jump to"
+1. Based on the answer you give to a page you are navigated to some point in the form, i.e. a ‘jump to’
 2. Each question is considered sequentially and independently and we evaluate whether it should be shown or not

 Our Form Definition DSL takes the second approach. This has a couple of advantages:

-- It makes the check answers pattern easier to code as you can ask each page directly: "Have the conditions for you to be shown been met?", with approach 1, you would effectively have to traverse the full route branch to see if a particular page was shown for each page/question which adds complexity.
+- It makes the check answers pattern easier to code as you can ask each page directly: “Have the conditions for you to be shown been met?”, with approach 1, you would effectively have to traverse the full route branch to see if a particular page was shown for each page/question which adds complexity.

 - It makes it easier to look at the JSON and see at a glance what conditions will show or hide a page, which is closer to how the business logic is discussed and is easier to reason about.
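The second approach in ADR 009, where each page is evaluated independently, might look roughly like the sketch below. `RoutedPage`, the `conditions` attribute and the example page ids are hypothetical; the real form definition DSL may name and shape these differently.

```ruby
# Illustrative sketch: RoutedPage and its `conditions` shape are assumptions.
# `conditions` is nil (always shown) or an array of hashes; the page is shown
# when every field in at least one hash matches the answers so far.
RoutedPage = Struct.new(:id, :conditions) do
  def routed_to?(answers)
    return true if conditions.nil? || conditions.empty?

    conditions.any? do |condition|
      condition.all? { |field, expected| answers[field] == expected }
    end
  end
end

PAGES = [
  RoutedPage.new("tenancy_type", nil),
  RoutedPage.new("tenancy_length", [{ "tenancy_type" => "Fixed term" }]),
  RoutedPage.new("household_size", nil),
].freeze

# Check answers only has to ask each page in turn; no route is traversed.
def shown_pages(pages, answers)
  pages.select { |page| page.routed_to?(answers) }
end
```

Because every page carries its own conditions, deciding what to list on a check answers page is a single pass over the pages rather than a walk along whichever "jump to" branch the user happened to follow.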
--- a/docs/adr/adr-010-admin-users-vs-users.md
+++ b/docs/adr/adr-010-admin-users-vs-users.md
@@ -1,9 +1,13 @@
-### ADR - 010: Admin Users vs Users
+---
+parent: Architecture decisions
+---

-#### Why do we have 2 User classes, AdminUser and User?
+# 010: Admin users vs Users

-This is modelling a real life split. `AdminUsers` are internal DLUHC users or helpdesk employees. While `Users` are external users working at data providing organisations. So local authority/housing association's "admin" users, i.e. Data Co-ordinators are a type of the User class. They have the ability to add or remove other users to or from their organisation, and to update their organisation details etc, but only through the designed UI. They do not get direct access to ActiveAdmin.
+## Why do we have 2 user classes, `AdminUser` and `User`?
+
+This is modelling a real life split. `AdminUsers` are internal DLUHC users or help desk employees. While `Users` are external users working at data providing organisations. So local authority/housing association’s "admin" users, i.e. Data Co-ordinators are a type of the User class. They have the ability to add or remove other users to or from their organisation, and to update their organisation details etc, but only through the designed UI. They do not get direct access to ActiveAdmin.

 AdminUsers on the other hand get direct access to ActiveAdmin. From there they can download entire datasets (via CSV, XML, JSON), view any log from any organisation, and add or remove users of any type including other Admin users. This means TDA will likely also require more stringent authentication for them using MFA (which users will likely not require). So the class split also helps there.

-A potential downside to this approach is that it does not currently allow for `AdminUsers` to sign into the application UI itself with their Admin credentials. However, we need to see if there's an actual use case for this and what it would be (since they aren't part of an organisation to be uploading data for, but could add or amend data or user or org details through ActiveAdmin anyway). If there is a strong use case for it this could be work around by either: providing them with two sets of credentials, or modifying the `authenticate_user` method to also check `AdminUser` credentials.
+A potential downside to this approach is that it does not currently allow for `AdminUsers` to sign into the application UI itself with their Admin credentials. However, we need to see if there’s an actual use case for this and what it would be (since they aren’t part of an organisation to be uploading data for, but could add or amend data or user or org details through ActiveAdmin anyway). If there is a strong use case for it this could be worked around by either: providing them with two sets of credentials, or modifying the `authenticate_user` method to also check `AdminUser` credentials.
--- a/docs/adr/adr-011-form-oop-refactor.md
+++ b/docs/adr/adr-011-form-oop-refactor.md
@@ -1,8 +1,14 @@
-### ADR - 011: Splitting the form parsing into objects
+---
+parent: Architecture decisions
+---

-Initially a single "Form" class handled the parsing of the form definition JSON as well as a lot of the logic around what different sections meant. This works fine but led to a lot of places in code where we're passing around arguments to determine whether a page or section should or shouldn't do something rather than being able to ask it directly. Refactoring this into smaller form domain object classes has several benefits:
+# 011: Splitting the form parsing into objects

- It's easier to compare the form definition JSON to the code classes and reason about what fields can be passed and what effect they'll have
+Initially a single `Form` class handled the parsing of the form definition JSON as well as a lot of the logic around what different sections meant. This works fine but led to a lot of places in code where we’re passing around arguments to determine whether a page or section should or shouldn’t do something rather than being able to ask it directly.
+
+Refactoring this into smaller form domain object classes has several benefits:
+
+- It’s easier to compare the form definition JSON to the code classes and reason about what fields can be passed and what effect they’ll have
 - It moves business logic out of the helpers and keeps them to just dealing with display logic
 - It makes it easier to unit test form functionality, and group that into smaller chunks
 - It allows for less passing of arguments. e.g. `page.routed_to?(case_log)` vs `form.was_page_routed_to?(page, case_log)`
--- a/docs/adr/adr-012-controller-http-return-statuses.md
+++ b/docs/adr/adr-012-controller-http-return-statuses.md
@ -1,4 +1,8 @@
-### ADR - 012: Controller HTTP return statuses
+---
+parent: Architecture decisions
+---
+
+# 012: Controller HTTP return statuses

 Controllers assess authentication by 3 criteria:

@ -6,7 +10,7 @@ Controllers assess authentication by 3 criteria:
 2. Are you signed in and requesting an action that your role/user type has access to?
 3. Are you signed in, requesting an action that your role/user type has access to and requesting a resource that your user has access to.

-When these aren't met they fail with the following response types:
+When these aren’t met they fail with the following response types:

 1. 401: Unauthorized. Redirect to sign-in page.
 2. 401: Unauthorized
--- a/docs/adr/adr-013-inferring-la-from-postcode.md
+++ b/docs/adr/adr-013-inferring-la-from-postcode.md
@ -1,12 +1,13 @@
-### ADR - 013: Inferring LA from postcode
+---
+parent: Architecture decisions
+---
+
+# 013: Inferring LA from postcode

 We use ONS data to infer local authority from postcode in the property information section.
-The Office for National Statistics (ONS) publishes the National Statistics
-Postcode Lookup (NSPL) and ONS Postcode Directory (ONSPD) datasets,
-which may be used to find a local authority district for a postcode when compiling statistics.

-We're using postcodes.io API with postcodes_io gem.
-Postcodes.io uses OS and ONS data which is updated as soon as new data becomes available.
+The Office for National Statistics (ONS) publishes the National Statistics Postcode Lookup (NSPL) and ONS Postcode Directory (ONSPD) datasets, which may be used to find a local authority district for a postcode when compiling statistics.
+
+We’re using the postcodes.io API with the postcodes_io gem. Postcodes.io uses OS and ONS data, which is updated as soon as new data becomes available.

-We are not using OS places API due to the lack of data.
-Closest datapoint to LA in OS places api is ADMINISTRATIVE_AREA which does not always match with local authority.
+We are not using the OS Places API due to the lack of data. The closest data point to LA in the OS Places API is ADMINISTRATIVE_AREA, which does not always match the local authority.
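
The lookup described in ADR 013 could be wrapped along the following lines. This is a minimal sketch, not the project's implementation: `LocalAuthorityLookup`, `FakeClient` and `FakeResult` are hypothetical names, and the client is assumed to respond to `lookup(postcode)` and return either `nil` or an object exposing `admin_district` (the field name postcodes.io uses for the local authority district). A stand-in client is used so the example runs without network access.

```ruby
# Hypothetical wrapper around a postcode lookup client (names are illustrative).
class LocalAuthorityLookup
  # client: any object responding to `lookup(postcode)`. It is assumed to return
  # nil for unknown postcodes, otherwise an object responding to `admin_district`.
  def initialize(client)
    @client = client
  end

  # Returns the local authority name, or nil if it cannot be inferred.
  def call(postcode)
    normalised = postcode.to_s.gsub(/\s+/, "").upcase
    return nil if normalised.empty?

    result = @client.lookup(normalised)
    result&.admin_district
  rescue StandardError
    # Treat API or network failures as "could not infer" rather than raising.
    nil
  end
end

# Stand-in client so the sketch runs offline. A real client would call the API.
FakeResult = Struct.new(:admin_district)

class FakeClient
  KNOWN = { "SW1A2AA" => "Westminster" }.freeze

  def lookup(postcode)
    name = KNOWN[postcode]
    name && FakeResult.new(name)
  end
end

lookup = LocalAuthorityLookup.new(FakeClient.new)
puts lookup.call("sw1a 2aa").inspect
puts lookup.call("ZZ99 9ZZ").inspect
```

Injecting the client keeps the inference logic testable without calling the external service, and returning `nil` on failure lets the form fall back to asking for the local authority directly.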
--- a/docs/adr/adr-014-annual-form-changes.md
+++ b/docs/adr/adr-014-annual-form-changes.md
@ -1,4 +1,8 @@
-### ADR - 014: Annual form changes
+---
+parent: Architecture decisions
+---
+
+# 014: Annual form changes

 Given that the data collection form changes annually and that the data collection windows overlap by several months to allow for late submissions of data from the previous year, we need to be able to run at least two different versions of a form concurrently. We can do this in one of at least two ways:

--- a/docs/adr/index.md
+++ b/docs/adr/index.md
@ -0,0 +1,8 @@
+---
+has_children: true
+nav_order: 9
+---
+
+# Architecture decisions
+
+A record of architectural decisions made on this project.
--- a/docs/api/index.html
+++ b/docs/api/index.html
@ -0,0 +1,20 @@
+<!DOCTYPE html>
+<html lang="en">
+<head>
+  <meta charset="UTF-8">
+  <meta name="viewport" content="width=device-width, initial-scale=1">
+  <link rel="stylesheet" type="text/css" href="https://cdnjs.cloudflare.com/ajax/libs/swagger-ui/4.12.0/swagger-ui.css">
+  <title>DLUHC CORE Data Collection API</title>
+</head>
+<body>
+  <div id="openapi"></div>
+  <script src="https://cdnjs.cloudflare.com/ajax/libs/swagger-ui/4.12.0/swagger-ui-bundle.min.js"></script>
+  <script>
+    window.onload = function () {
+      const ui = SwaggerUIBundle({
+        url: "v1.json",
+        dom_id: "#openapi"
+      })
+    }
+  </script>
+</body>
--- a/docs/api/DLUHC-CORE-Data.v1.json
+++ b/docs/api/DLUHC-CORE-Data.v1.json
@ -1,13 +1,13 @@
 {
  "openapi": "3.0.0",
  "info": {
-    "title": "DLUHC CORE Data",
+    "title": "DLUHC CORE Data Collection API",
    "version": "1.0",
-    "description": "Submit or Update CORE Case Log Data on Lettings and Sales of Social Housing in England"
+    "description": "Submit social housing lettings and sales data (CORE)"
  },
  "servers": [
    {
-      "url": "https://dluhc-core.london.cloudapps.digital",
+      "url": "https://dluhc-core-staging.london.cloudapps.digital/logs",
      "description": "Staging"
    }
  ],
--- a/docs/exports.md
+++ b/docs/exports.md
@ -1,3 +1,7 @@
+---
+nav_order: 7
+---
+
 # Exporting to CDS

 All data collected by the application needs to be exported to the Consolidated Data Store (CDS) which is a data warehouse based on MS SQL running in the DAP (Data Analytics Platform).
@ -6,7 +10,7 @@ This is done via XML exports saved in an S3 bucket located in the DAP VPC using

 Initially the application database field names and field types were chosen to match the existing CDS data as closely as possible to minimise the amount of transformation needed. This has led to a less than optimal data model though and increasingly we should look to transform at the mapping layer where beneficial for our application.

-The export service is triggered nightly using [Gov PaaS tasks](https://docs.cloudfoundry.org/devguide/using-tasks.html). These tasks are triggered from a Github action, as Gov PaaS does not currently support the Cloud Foundry Task Scheduler.
+The export service is triggered nightly using [Gov PaaS tasks](https://docs.cloudfoundry.org/devguide/using-tasks.html). These tasks are triggered from a GitHub action, as Gov PaaS does not currently support the Cloud Foundry Task Scheduler.

 The S3 bucket is located in the DAP VPC rather than the application VPC as DAP runs in an AWS account directly so access to the S3 bucket can be restricted to only the IPs used by the application. This is not possible the other way around as [Gov PaaS does not support restricting S3 access by IP](https://github.com/alphagov/paas-roadmap/issues/107).

--- a/docs/form/builder.md
+++ b/docs/form/builder.md
@ -1,36 +1,15 @@
-# Form Builder
+---
+parent: Generating forms
+nav_order: 1
+---

-## Background
-
-Social housing lettings and sales data is collected in annual collection windows that run from 1st April to 1st April.
-
-During this window the form and questions generally stay constant. The form will generally change by small amounts between each collection window. Typical changes are adding new questions, adding or removing answer options from questions or tweaking question wording for clarity.
-
-A paper form is produced for guidance and to help data providers collect the data offline, and a bulk upload template is circulated which need to match the online form.
-
-Data is accepted for a collection window for up to 3 months after it’s finished to allow for late data submission. This means that between April and July two version of the form run simultaneously.
-
-Other considerations that went into our design are being able to re-use as much of this solution for other data collections, and possibly having the ability to generate the form and/or form changes from a user interface.
-
-We haven’t used micro-services, preferring to deploy a single application but we have modelled the form itself as configuration in the form of a JSON structure that acts as a sort of DSL/form builder for the form.
-
-The idea is to decouple the code that creates the required routes, controller methods, views etc to display the form from the actual wording of questions or order of pages such that it becomes possible to make changes to the form with little or no code changes.
-
-This should also mean that in the future it could be possible to create an interface that can construct the JSON config, which would open up the ability to make form changes to a wider audience. Doing this fully would require generating and running the necessary migrations for data storage, generating the required ActiveRecord methods to validate the data server side, and generating/updating API endpoints and documentation. All of this is likely to be beyond the scope of initial MVP but could be looked at in the future.
-
-Since initially the JSON config will not create database migrations or ActiveRecord model validations, it will instead assume that these have been correctly created for the config provided. The reasoning for this is the following assumptions:
-
- The form will be tweaked regularly (amending questions wording, changing the order of questions or the page a question is displayed on)
-
- The actual data collected will change very infrequently. Time series continuity is very important to ADD (Analysis and Data Directorate) so the actual data collected should stay largely consistent i.e. in general we can change the question wording in ways that makes the intent clearer or easier to understand, but not in ways that would make the data provider give a different answer.
-
-A form parser class will parse this config into ruby objects/methods that can be used as an API by the rest of the application, such that we could change the underlying config if needed (for example swap JSON for YAML or for DataBase objects) without needing to change the rest of the application. We’ll call this the Form Runner part of the application.
+# Form builder

 ## Setup this log

 The setup this log section is treated slightly differently from the rest of the form. It is more accurately viewed as providing metadata about the form than as being part of the form itself. It also needs to know far more about the application specific context than other parts of the form such as who the current user is, what organisation they’re part of and what role they have etc.

-As a result it’s not modelled as part of the config but rather as code. It still uses the same Form Runner components though.
+As a result it’s not modelled as part of the config but rather as code. It still uses the same [Form Runner](runner) components though.

 ## Features the Form Config supports

@ -183,7 +162,7 @@ This will validate all forms in directories `["config/forms", "spec/fixtures/for

 ## Form models and definition

-For information about the form model and related models (section, subsection, page, question) and how these relate to each other follow [this link](/docs/form/form.md)
+For information about the form model and related models (section, subsection, page, question) and how these relate to each other see [form definition](/form/definition).

 ## Improvements that could be made

--- a/docs/form/definition.md
+++ b/docs/form/definition.md
@ -1,4 +1,10 @@
-## Form Definition
+---
+parent: Generating forms
+has_children: true
+nav_order: 3
+---
+
+# Form definition

 The current system is built around a form definition written in JSON. At the top level every form will expect to have the following attributes:

@ -8,7 +14,8 @@ The current system is built around a form definition written in JSON. At the top
 - Sections: the sections in the form, this block is where the bulk of the form definition will be.

 An example of this might look like the following:
-```JSON
+
+```json
 { 
  "form_type": "lettings",
  "start_date": "2021-04-01T00:00:00.000+01:00",
@ -21,50 +28,39 @@ An example of this might look like the following:

 Note that the end date of one form will overlap the start date of another to allow for late submissions. This means that every year there will be a period of time in which two forms are running simultaneously.

-### How is the form split up?
-
-A summary of how the form is split up is as follows:
-
- A form is divided up into one or more sections. 
- Each section can have one or more subsections. 
- Each subsection can have one or more pages. 
- Each page can have one or more questions.
-
-More information about these form elements can be found in the following links:
+A form is split up as follows:

- [Section](/docs/form/section.md)
- [Subsection](/docs/form/subsection.md)
- [Page](/docs/form/page.md)
- [Question](/docs/form/question.md)
+- A form is divided up into one or more [sections](section)
+- Each section can have one or more [subsections](subsection)
+- Each subsection can have one or more [pages](page)
+- Each page can have one or more [questions](question)

-### The Form Model, Views and Controller
+Rails uses the model, view, controller (MVC) pattern which we follow.

-Rails uses the Model, View, Controller (MVC) pattern which we follow.
-
-#### The Form Model
+## Form model

 There is no need to manually initialise a form object as this is handled by the FormHandler class at boot time. If a new form needs to be added then a JSON file containing the form definition should be added to `config/forms` where the FormHandler will be able to locate it and instantiate it.

 A form has the following attributes:

- name: The name of the form
- setup_sections: The setup section (this is not defined in the JSON, for more information see this)
- form_definition: The parsed form JSON
- form_sections: The sections found within the form definition JSON
- type: The type of form (this is used to indicate if the form is for a sale or a letting)
- sections: The combination of the setup section with those found in the JSON definition
- subsections: The subsections of the form (these live under the sections)
- pages: The pages of the form (these live under the subsections)
- questions: The questions of the form (these live under the pages)
- start_date: The start date of the form, in iso8601 format
- end_date: The end date of the form, in iso8601 format
-
+- `name`: The name of the form
+- `setup_sections`: The setup section (this is not defined in the JSON, for more information see this)
+- `form_definition`: The parsed form JSON
+- `form_sections`: The sections found within the form definition JSON
+- `type`: The type of form (this is used to indicate if the form is for a sale or a letting)
+- `sections`: The combination of the setup section with those found in the JSON definition
+- `subsections`: The subsections of the form (these live under the sections)
+- `pages`: The pages of the form (these live under the subsections)
+- `questions`: The questions of the form (these live under the pages)
+- `start_date`: The start date of the form, in ISO 8601 format
+- `end_date`: The end date of the form, in ISO 8601 format

-#### The Form Views
+## Form views

 The main view used for rendering the form is the `app/views/form/page.html.erb` view as the Form contains multiple pages (which live in subsections within sections). This page view then renders the appropriate partials for the question types of the questions on the current page.

 We currently have views for the following question types:
+
 - Numerical
 - Date
 - Checkbox
@ -76,11 +72,10 @@ We currently have views for the following question types:

 Interruption screen questions are radio questions used for soft validation of fields. They usually have yes and no options for a user to confirm a value is correct.

-#### The Form Controller
+## Form controller

 The form controller handles the form submission as well as the rendering of the check answers page and the review page.

-### The FormHandler helper class
+## FormHandler helper class

 The FormHandler helper is a helper that loads all of the defined forms and initialises them as Form objects. It can also be used to get specific forms if needed.
-
--- a/docs/form/index.md
+++ b/docs/form/index.md
@ -0,0 +1,30 @@
+---
+has_children: true
+nav_order: 8
+---
+
+# Generating forms
+
+Social housing lettings and sales data is collected in annual collection windows that run from 1 April to 1 April the following year.
+
+During this window the form and questions generally stay constant. The form will generally change by small amounts between each collection window. Typical changes are adding new questions, adding or removing answer options from questions or tweaking question wording for clarity.
+
+A paper form is produced for guidance and to help data providers collect the data offline, and a bulk upload template is circulated. Both need to match the online form.
+
+Data is accepted for a collection window for up to 3 months after it’s finished to allow for late data submission. This means that between April and July 2 versions of the form run simultaneously.
+
+Other considerations that went into our design are being able to re-use as much of this solution for other data collections, and possibly having the ability to generate the form and/or form changes from a user interface.
+
+We haven’t used micro-services, preferring to deploy a single application but we have modelled the form itself as configuration in the form of a JSON structure that acts as a sort of DSL/form builder for the form.
+
+The idea is to decouple the code that creates the required routes, controller methods, views etc to display the form from the actual wording of questions or order of pages such that it becomes possible to make changes to the form with little or no code changes.
+
+This should also mean that in the future it could be possible to create an interface that can construct the JSON config, which would open up the ability to make form changes to a wider audience. Doing this fully would require generating and running the necessary migrations for data storage, generating the required ActiveRecord methods to validate the data server side, and generating/updating API endpoints and documentation. All of this is likely to be beyond the scope of initial MVP but could be looked at in the future.
+
+Since initially the JSON config will not create database migrations or ActiveRecord model validations, it will instead assume that these have been correctly created for the config provided. The reasoning for this is the following assumptions:
+
+- The form will be tweaked regularly (amending question wording, changing the order of questions or the page a question is displayed on)
+
+- The actual data collected will change very infrequently. Time series continuity is very important to ADD (Analysis and Data Directorate) so the actual data collected should stay largely consistent i.e. in general we can change the question wording in ways that make the intent clearer or easier to understand, but not in ways that would make the data provider give a different answer.
+
+A form parser class will parse this config into Ruby objects/methods that can be used as an API by the rest of the application, such that we could change the underlying config if needed (for example swap JSON for YAML or for database objects) without needing to change the rest of the application. We’ll call this the Form Runner part of the application.
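
The overlapping collection windows described above can be illustrated with a short sketch. It assumes each form definition carries ISO 8601 `start_date` and `end_date` values; the `FormWindow` struct, the form names, the end dates and the choice to treat the end date as exclusive are all illustrative assumptions rather than details taken from the codebase.

```ruby
require "time"

# Hypothetical value object holding one form's collection window.
FormWindow = Struct.new(:name, :start_date, :end_date) do
  # Assumption: the window includes its start and excludes its end.
  def open_on?(time)
    start_date <= time && time < end_date
  end
end

# Build a window from a parsed definition hash with ISO 8601 date strings.
def build_window(definition)
  FormWindow.new(
    definition.fetch("name"),
    Time.iso8601(definition.fetch("start_date")),
    Time.iso8601(definition.fetch("end_date")),
  )
end

# Illustrative definitions: each window closes 3 months after the next one opens.
DEFINITIONS = [
  { "name" => "2021_2022", "start_date" => "2021-04-01T00:00:00.000+01:00", "end_date" => "2022-07-01T00:00:00.000+01:00" },
  { "name" => "2022_2023", "start_date" => "2022-04-01T00:00:00.000+01:00", "end_date" => "2023-07-01T00:00:00.000+01:00" },
].freeze

WINDOWS = DEFINITIONS.map { |definition| build_window(definition) }

# Names of every form accepting submissions at the given time.
def open_forms(windows, time)
  windows.select { |window| window.open_on?(time) }.map(&:name)
end

puts open_forms(WINDOWS, Time.iso8601("2022-05-15T12:00:00+01:00")).inspect
puts open_forms(WINDOWS, Time.iso8601("2022-08-01T12:00:00+01:00")).inspect
```

With these dates, a time in May 2022 falls inside both windows, while a time in August 2022 falls inside only the later one.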
--- a/docs/form/page.md
+++ b/docs/form/page.md
@ -1,8 +1,16 @@
-## Page
+---
+parent: Form definition
+grand_parent: Generating forms
+nav_order: 3
+---

-Pages are under the subsection level of the form definition. A example page might look something like this:
+# Page

-```JSON
+Pages sit below the [`Subsection`](subsection) level of a form definition.
+
+An example page might look something like this:
+
+```json
 "property_postcode": {
  "header": "",
  "description": "",
@ -23,6 +31,6 @@ The header is optional but if provided is used for the heading displayed on the

 The description is optional but if provided is used for a paragraph displayed under the page header.

-It's worth noting that like subsections a page can also have a `depends_on` which contains the set of conditions that must be met for the section to be accessibile to a data provider. If the conditions are not met then the page is not routed to as part of the form flow. The `depends_on` for a page will usually depend on answers given to questions, most likely to be questions in the setup section. In the above example the page is dependent on the answer to the `needstype` question being `1`, which corresponds to picking `General needs` on that question as displayed to the data provider.
+It’s worth noting that like subsections a page can also have a `depends_on` which contains the set of conditions that must be met for the page to be accessible to a data provider. If the conditions are not met then the page is not routed to as part of the form flow. The `depends_on` for a page will usually depend on answers given to questions, most likely to be questions in the setup section. In the above example the page is dependent on the answer to the `needstype` question being `1`, which corresponds to picking `General needs` on that question as displayed to the data provider.

-Pages can contain one or more questions.
+Pages can contain one or more [questions](question).
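
The `depends_on` routing described for pages might be evaluated roughly as follows. This is a sketch under stated assumptions: `depends_on` is taken to be a list of condition sets where any one set may match and every answer within a set must match, and the `Page` class here is a simplified stand-in that receives a plain hash of answers rather than a case log.

```ruby
# Simplified stand-in for a form page (not the application's actual class).
class Page
  attr_reader :id, :depends_on

  def initialize(id, definition)
    @id = id
    @depends_on = definition["depends_on"]
  end

  # answers: hash of question id => stored answer.
  # A page with no conditions is always part of the form flow.
  def routed_to?(answers)
    return true if depends_on.nil? || depends_on.empty?

    # Assumption: any one condition set may match (OR), and every
    # question/answer pair inside that set must match (AND).
    depends_on.any? do |conditions|
      conditions.all? { |question, expected| answers[question] == expected }
    end
  end
end

page = Page.new("property_postcode", { "depends_on" => [{ "needstype" => 1 }] })
puts page.routed_to?({ "needstype" => 1 })
puts page.routed_to?({ "needstype" => 2 })
```

Asking the page directly whether it is routed to keeps the condition logic next to the data it depends on, which is the benefit ADR 011 describes with `page.routed_to?(case_log)`.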
--- a/docs/form/question.md
+++ b/docs/form/question.md
@ -1,8 +1,16 @@
-## Question
+---
+parent: Form definition
+grand_parent: Generating forms
+nav_order: 4
+---

-Questions are under the page level of the form definition. A example question might look something like this:
+# Question

-```JSON
+Questions are under the page level of the form definition.
+
+An example question might look something like this:
+
+```json
 "postcode_known": {
  "check_answer_label": "Do you know the property postcode?",
  "header": "Do you know the property’s postcode?",
@ -37,7 +45,7 @@ The `conditional_for` contains the value needed to be selected by the data input

 The `hidden_in_check_answers` attribute is used to hide a value from displaying on the check answers page. You only need to provide this if you want to set it to true in order to hide the value for some reason e.g. it's one of two questions appearing on a page and the other question is displayed on the check answers page. It's also worth noting that you can declare this with a `depends_on` which can be useful for conditionally displaying values on the check answers page. For example:

-```JSON
+```json
 "hidden_in_check_answers": {
  "depends_on": [
    {
@ -54,7 +62,7 @@ Would mean the question the above is attached to would be hidden in the check an

 The answer the data inputter provides to some questions allows us to infer the values of other questions we might have asked in the form, allowing us to save the data inputters some time. An example of how this might look is as follows:

-```JSON
+```json
 "postcode_full": {
  "check_answer_label": "Postcode",
  "header": "What is the property’s postcode?",
--- a/docs/form/runner.md
+++ b/docs/form/runner.md
@ -1,12 +1,17 @@
-# Form Runner
+---
+parent: Generating forms
+nav_order: 2
+---

-The Form Runner is composed of:
+# Form runner

-Ruby Classes:
+The Form runner is composed of:
+
+Ruby classes:

 - A singleton form handler that instantiates an instance of each form definition (config file we have) combined with the setup section that is common to all forms. This is created at Rails boot time. (`app/models/form_handler.rb`)
 - A `Form` class that is the entry point for parsing a form definition and handles most of the associated logic (`app/models/form.rb`)
- `Section`, `Subsection`, `Page` and `Question` classes (`app/models/form/`)
+- [`Section`](section), [`Subsection`](subsection), [`Page`](page) and [`Question`](question) classes (`app/models/form/`)
 - Setup subsection specific instances (subclasses) of `Section`, `Subsection`, `Pages` and `Questions` (`app/form/setup/`)

 ERB templates:
@ -22,4 +27,4 @@ All form pages submit to the same controller method (`app/controllers/form_contr

 ## Form models and definition

-For information about the form model and related models (section, subsection, page, question) and how these relate to each other follow [this link](/docs/form/form.md)
+For information about the form model and related models (section, subsection, page, question) and how these relate to each other see [form definition](/form/definition).
--- a/docs/form/section.md
+++ b/docs/form/section.md
@ -1,8 +1,16 @@
-## Section
+---
+parent: Form definition
+grand_parent: Generating forms
+nav_order: 1
+---

-Sections are under the top level of the form definition. A example section might look something like this:
+# Section

-```JSON
+Sections sit at the top level of a form definition.
+
+An example section might look something like this:
+
+```json
 "sections": {
  "tenancy_and_property": {
    "label": "Property and tenancy information",
@ -23,4 +31,4 @@ In the above example the section id would be `tenancy_and_property` and its subs

 The label contains the text that users will see for that section in the task list page of a case log.

-Sections can contain one or more subsections.
+Sections can contain one or more [subsections](subsection).
--- a/docs/form/subsection.md
+++ b/docs/form/subsection.md
@ -1,8 +1,16 @@
-## Subsection
+---
+parent: Form definition
+grand_parent: Generating forms
+nav_order: 2
+---

-Subsections are under the section level of the form definition. A example subsection might look something like this:
+# Subsection

-```JSON
+Subsections sit below the [`Section`](section) level of a form definition.
+
+An example subsection might look something like this:
+
+```json
 "property_information": {
  "label": "Property information",
  "depends_on": [
@ -21,8 +29,10 @@ Subsections are under the section level of the form definition. A example subsec
 }
 ```

-In the above example the the subsection has the id `property_information`. The `depends_on` contains the set of conditions that must be met for the section to be accessibile to a data provider, in this example subsection depends on the completion of the setup section/subsection (note that this is a common condition as the answers provided to questions in the setup subsection often have an impact on what questions are asked of the data provider in later subsections of the form).
+In the above example the subsection has the id `property_information`. The `depends_on` contains the set of conditions that must be met for the subsection to be accessible to a data provider. In this example the subsection depends on the completion of the setup section/subsection (note that this is a common condition as the answers provided to questions in the setup subsection often have an impact on what questions are asked of the data provider in later subsections of the form).

 The label contains the text that users will see for that subsection in the task list page of a case log.

-The pages of the subsection in the example would be `property_postcode` and `property_local_authority`. Subsections can contain one or more pages.
+The pages of the subsection in the example would be `property_postcode` and `property_local_authority`.
+
+Subsections can contain one or more [pages](page).
--- a/docs/frontend.md
+++ b/docs/frontend.md
@ -1,3 +1,7 @@
+---
+nav_order: 2
+---
+
 # Frontend

 ## GOV.UK Design System components
--- a/docs/index.html
+++ b/docs/index.html
@ -1,17 +0,0 @@
-<!DOCTYPE html>
-<html lang="en">
-<head>
-<meta charset="UTF-8">
-<meta name="viewport" content="width=device-width, initial-scale=1">
-<link rel="stylesheet" type="text/css" href="https://cdnjs.cloudflare.com/ajax/libs/swagger-ui/4.12.0/swagger-ui.css">
-<title>OpenAPI DLUHC CORE Data Collection</title>
-<body><div id="openapi"><script src="https://cdnjs.cloudflare.com/ajax/libs/swagger-ui/4.12.0/swagger-ui-bundle.min.js"></script>
-<script>
-window.onload = function () {
-  const ui = SwaggerUIBundle({
-    url: "api/DLUHC-CORE-Data.v1.json",
-    dom_id: "#openapi"
-  })
-}
-</script>
-</body>
--- a/docs/index.md
+++ b/docs/index.md
@ -0,0 +1,67 @@
+---
+nav_order: 1
+---
+
+# Overview
+
+All lettings and sales of social housing in England need to be logged with the Department for Levelling Up, Housing and Communities (DLUHC). This is done by data providing organisations: Local Authorities and Private Registered Providers (PRPs, i.e. housing associations).
+
+Data is collected via a form that runs on an annual data collection window basis. Form changes are made annually to add new questions, remove any that are no longer needed, or adjust wording or answer options etc.
+
+Each data collection window runs from 1 April to 1 April the following year (plus an extra 3 months to allow for any late submissions). This means that between April and June, 2 collection windows are open simultaneously and logs can be submitted for either.
+
+ADD (Analytics & Data Directorate) statisticians are the other primary users of the service. The data collected is transferred to DLUHC’s consolidated data store (CDS) via nightly XML exports to an S3 bucket. CDS ingests and transforms this data, ultimately storing it in a MS SQL database and exposing it to analysts and statisticians via Amazon Workspaces.
+
+![Diagram of the CORE system architecture](../images/architecture.drawio.png)
+
+## Users
+
+External data providing organisations have 2 main user types:
+
+- **Data coordinators** are administrators for their organisation, but may also complete logs
+- **Data providers** complete the logs
+
+Additionally there are data protection officers (DPO). For some organisations this is a separate role, but in our codebase this is modelled as an attribute of a user (i.e. a data coordinator or provider can additionally be a DPO). They are responsible for ensuring the organisation has signed the data sharing agreement.
+
+There are also 2 internal user types:
+
+- **Customer support:** can administrate all organisations
+- **Statisticians:** primary consumers of the collected data
+
+## Organisations
+
+There are 2 types of organisation:
+
+- An **owning organisation** owns housing stock. It may manage the allocation of people in and out of its accommodation, or contract this function out to managing agents.
+
+- A **managing organisation** (or managing agent) is responsible for the allocation of people in and out of accommodation, and/or responsible for the services provided to support those people in the accommodation (in the case of supported housing).
+
+### Relationships between organisations
+
+Organisations that own stock can contract out the management of that stock to another organisation. This relationship is often referred to as a parent/child relationship.
+
+This is a useful analogy as a parent can have multiple children, and a child can have many parents. A child organisation can also be a parent, and a parent organisation can also be a child organisation:
+
+![Organisational relationships](../images/organisational_relationships.png)
+
+### User permissions within organisations
+
+The case logs that a user can see depend on their role:
+
+- Customer support users can access any case log
+
+- Data coordinators can access any case log for which the organisation they work for is ultimately responsible, meaning they can see logs managed by a child organisation
+
+- Data providers can only access case logs that their organisation manages (or directly owns)
+
+Taking the relationships from the above diagram, and looking at which logs each user can access:
+
+![User log access permissions](../images/user_log_permissions.png)
+
+## Supported housing schemes
+
+A supported housing scheme (or service) provides shared or self-contained housing for a particular client group, for example younger or vulnerable people. A scheme can be run at multiple locations, and a single location may contain multiple units (for example bedrooms in shared houses or a bungalow with 3 bedrooms).
+
+Logs for supported housing at the same location will share a number of characteristics. Additional data also needs to be collected about the supported housing scheme itself, such as the client groups served and the type of support provided.
+
+Asking these questions on every log would require data inputters to re-enter the same information repeatedly and answer more questions than those asked for general needs lettings. Schemes exist in CORE to reduce this burden, and effectively act as predefined answer sets.
--- a/docs/infrastructure.md
+++ b/docs/infrastructure.md
@ -1,3 +1,7 @@
+---
+nav_order: 5
+---
+
 # Infrastructure

 ## Deployment
@ -26,7 +30,7 @@ This application is running on [GOV.UK PaaS](https://www.cloud.service.gov.uk/).
    cf push dluhc-core --strategy rolling
    ```

-    This will use the [manifest file](staging_manifest.yml)
+    This will use the [manifest file](https://github.com/communitiesuk/submit-social-housing-lettings-and-sales-data/blob/main/manifest.yml)

 Once the app is deployed:

--- a/docs/monitoring.md
+++ b/docs/monitoring.md
@ -1,6 +1,10 @@
+---
+nav_order: 6
+---
+
 # Monitoring

-We use self-hosted Prometheus and Grafana for monitoring infrastructure metrics. These are run in a dedicated Gov PaaS space called "monitoring" and are deployed as Docker images using Github action pipelines. The repository for these and more information is here: [dluhc-data-collection-monitoring](https://github.com/communitiesuk/dluhc-data-collection-monitoring).
+We use self-hosted Prometheus and Grafana for monitoring infrastructure metrics. These run in a dedicated GOV.UK PaaS space called "monitoring" and are deployed as Docker images using GitHub Actions pipelines. The repository for these, along with more information, is here: [dluhc-data-collection-monitoring](https://github.com/communitiesuk/dluhc-data-collection-monitoring).

 ## Performance monitoring and alerting

--- a/docs/organisations.md
+++ b/docs/organisations.md
@ -1,25 +0,0 @@
-# Organisational relationships
-
-## Definitions
-
- **Stock owning organisation**: An organisation that owns housing stock. It may manage the allocation of people in and out of their accommodation, or it may contract this out to managing agents.
-
- **Managing agent**: In scenarios where one organisation owns stock and another organisation is contracted to manage the stock and tenants, the latter organisation is often called a ‘managing agent’. Managing agents are responsible for the allocation of people in and out of the accommodation, and/or responsible for the services provided to support those people in the accommodation (in the case of supported housing).
-
-## Permissions
-
-Organisations that own stock can contract out the management of that stock to another organisation. This relationship is often referred to as a parent/child relationship. This is a useful analogy as a parent can have multiple children, and a child can have many parents. A child organisation can also be a parent, and a parent organisation can also be a child organisation:
-
-![Organisational relationships](images/organisational_relationships.png)
-
-The case logs that a user can see depends on their role:
-
- Customer support users can access any case log
-
- Data coordinators can access any case log for which the organisation they work for is ultimately responsible for, meaning they can see logs managed by a child organisation
-
- Data providers can only access case logs for which their organisation manages (or directly owns)
-
-Taking the relationships from the above diagram, and looking at which logs each user can access:
-
-![User log access permissions](images/user_log_permissions.png)
--- a/docs/schemes.md
+++ b/docs/schemes.md
@ -1,7 +0,0 @@
-# Supported housing schemes
-
-A supported housing scheme (or service) provides shared or self-contained housing for a particular client group, for example younger or vulnerable people. A scheme can be run at multiple locations, and a single location may contain multiple units (for example bedrooms in shared houses or a bungalow with 3 bedrooms).
-
-Logs for supported housing will share a number of similar characteristics at this location. Additional data also needs to be collected specifically regarding the supported housing scheme, such as the type of client groups served and type of support provided.
-
-Asking these questions would require data inputters to re-enter the same information repeatedly and answer more questions than those asked for general needs lettings. Schemes exist in CORE to reduce this burden, and effectively act as predefined answer sets.
--- a/docs/service_overview.md
+++ b/docs/service_overview.md
@ -1,5 +0,0 @@
-# Service overview
-
-All lettings and and sales of social housing in England need to be logged with the Department for levelling up, housing and communities (DLUHC). This is done by Local Authorities and Housing Associations, who are the primary users of this service. Data is collected via a form that runs on an annual data collection window basis. Form changes are made annually to add new questions, remove any that are no longer needed, or adjust wording or answer options etc. Each data collection window runs from 1st April to 1st April + an extra 3 months to allow for any late submissions, meaning that between April and June, two collection windows are open simultaneously and logs can be submitted for either.
-
-ADD (Analytics & Data Directorate) statisticians are the other primary users of the service. The data collected is transferred to DLUHCs data warehouse (CDS - consolidated data store), via nightly exports to XML which are transferred to S3 and ingested from there. CDS ingests and transforms the data, ultimately storing it in a MS SQL database and exposing it to analysts and statisticians via Amazon Workspaces.
--- a/docs/developer_setup.md
+++ b/docs/developer_setup.md
@ -1,4 +1,8 @@
-# Developing locally on host machine
+---
+nav_order: 2
+---
+
+# Local development

 The most common way to run a development version of the application is to run it with local dependencies.

@ -8,7 +12,7 @@ Dependencies:
 - [Rails](https://rubyonrails.org/)
 - [PostgreSQL](https://www.postgresql.org/)
 - [NodeJS](https://nodejs.org/en/)
- [Gecko driver](https://github.com/mozilla/geckodriver/releases) [for running Selenium tests]
+- [Gecko driver](https://github.com/mozilla/geckodriver/releases) (for running Selenium tests)

 We recommend using [RBenv](https://github.com/rbenv/rbenv) to manage Ruby versions.

@ -34,7 +38,7 @@ We recommend using [RBenv](https://github.com/rbenv/rbenv) to manage Ruby versio
    sudo su - postgres -c "createuser <username> -s -P"
    ```

-3. Install RBenv & Ruby-build
+3. Install RBenv and Ruby-build

    macOS:

--- a/docs/testing.md
+++ b/docs/testing.md
@ -1,4 +1,8 @@
-# Testing strategy
+---
+nav_order: 4
+---
+
+# Testing

 - We use [RSpec](https://rspec.info/) and [Capybara](https://teamcapybara.github.io/capybara/)

--- a/docs/users.md
+++ b/docs/users.md
@ -1,15 +0,0 @@
-# User roles
-
-## External users
-
-The primary users of the system are external data providing organisations: Local Authorities and Private Registered Providers (Housing Associations). These have 2 main user types:
-
- Data coordinators – administrators for their own organisation, can also complete logs
- Data providers – complete the logs
-
-Additionally there are Data Protection Officers (DPO), which for some organisations is a separate role, but in our codebase is modelled as an attribute of the user (i.e. a data coordinator or provider can additionally be a DPO). They are responsible for ensuring the organisation has signed the data sharing agreement.
-
-## Internal users
-
- Customer support (help desk) – can administrate all organisations
- ADD statisticians – primary consumers of the data collected via CDS/DAP