
Generate website for technical docs (#729)

* Generate website for technical docs
* Add docs to .cfignore
pull/731/head
Paul Robert Lloyd 3 years ago committed by GitHub
parent
commit
bf8bd0bfef
No known key found for this signature in database
GPG Key ID: 4AEE18F83AFDEB23
  1. 1
      .cfignore
  2. 8
      .gitignore
  3. 41
      README.md
  4. 9
      docs/Gemfile
  5. 263
      docs/Gemfile.lock
  6. 12
      docs/_config.yml
  7. 3
      docs/_sass/color_schemes/govuk.scss
  8. 35
      docs/adr/adr-001-initial-architecture-decisions.md
  9. 6
      docs/adr/adr-002-repositories.md
  10. 16
      docs/adr/adr-003-form-submission-flow.md
  11. 16
      docs/adr/adr-004-gov-paas.md
  12. 14
      docs/adr/adr-005-form-definition.md
  13. 18
      docs/adr/adr-006-saving-values.md
  14. 19
      docs/adr/adr-007-data-validations.md
  15. 16
      docs/adr/adr-008-field-names.md
  16. 10
      docs/adr/adr-009-form-routing-logic.md
  17. 12
      docs/adr/adr-010-admin-users-vs-users.md
  18. 14
      docs/adr/adr-011-form-oop-refactor.md
  19. 8
      docs/adr/adr-012-controller-http-return-statuses.md
  20. 17
      docs/adr/adr-013-inferring-la-from-postcode.md
  21. 6
      docs/adr/adr-014-annual-form-changes.md
  22. 8
      docs/adr/index.md
  23. 20
      docs/api/index.html
  24. 6
      docs/api/v1.json
  25. 6
      docs/exports.md
  26. 35
      docs/form/builder.md
  27. 67
      docs/form/definition.md
  28. 30
      docs/form/index.md
  29. 18
      docs/form/page.md
  30. 24
      docs/form/question.md
  31. 15
      docs/form/runner.md
  32. 40
      docs/form/section.md
  33. 22
      docs/form/subsection.md
  34. 4
      docs/frontend.md
  35. 17
      docs/index.html
  36. 67
      docs/index.md
  37. 6
      docs/infrastructure.md
  38. 6
      docs/monitoring.md
  39. 25
      docs/organisations.md
  40. 7
      docs/schemes.md
  41. 5
      docs/service_overview.md
  42. 10
      docs/setup.md
  43. 6
      docs/testing.md
  44. 15
      docs/users.md

1
.cfignore

@ -0,0 +1 @@
app/docs/*

8
.gitignore

@ -28,6 +28,12 @@
# Ignore master key for decrypting credentials and more.
/config/master.key
# Ignore generated documentation files
_site
.sass-cache
.jekyll-cache
.jekyll-metadata
/public/packs
/public/packs-test
/node_modules
@ -40,7 +46,7 @@ yarn-debug.log*
# Code coverage results
/coverage
#IDE specific files
# IDE specific files
/.idea
/.idea/*
.DS_Store

41
README.md

@ -7,26 +7,27 @@ Ruby on Rails app that handles the submission of lettings and sales of social ho
## Domain documentation
- [Service overview](docs/service_overview.md)
- [Organisations](docs/organisations.md)
- [Users and roles](docs/users.md)
- [Supported housing schemes](docs/schemes.md)
## Technical Documentation
- [Developer setup](docs/developer_setup.md)
- [Frontend](docs/frontend.md)
- [Testing strategy](docs/testing.md)
- [Form Builder](docs/form_builder.md)
- [Form Runner](docs/form_runner.md)
- [Infrastructure](docs/infrastructure.md)
- [Monitoring](docs/monitoring.md)
- [Exporting to CDS](docs/exports.md)
- [Architecture decision records](docs/adr)
## API documentation
API documentation can be found here: <https://communitiesuk.github.io/submit-social-housing-lettings-and-sales-data>. This is driven by [OpenAPI docs](docs/api/DLUHC-CORE-Data.v1.json)
* [Domain and technical documentation](https://communitiesuk.github.io/submit-social-housing-lettings-and-sales-data)
* [Local development setup](https://communitiesuk.github.io/submit-social-housing-lettings-and-sales-data/setup)
* [Architecture decision records](https://communitiesuk.github.io/submit-social-housing-lettings-and-sales-data/adr)
* [API browser](https://communitiesuk.github.io/submit-social-housing-lettings-and-sales-data/api) (using this [OpenAPI specification](docs/api/v1.json))
* [Design history](https://core-design-history.herokuapp.com)
### Running documentation locally
The documentation website can be generated and served locally using Jekyll.
1. Change into the `/docs/` directory:\
`cd docs`
2. Install Jekyll and its dependencies:\
`bundle install`
3. Start the Jekyll server:\
`bundle exec jekyll serve`
4. View the website:\
<http://localhost:4000>
## System architecture

9
docs/Gemfile

@ -0,0 +1,9 @@
source "https://rubygems.org"
group :jekyll_plugins do
gem "github-pages"
end
group :development do
gem "webrick"
end

263
docs/Gemfile.lock

@ -0,0 +1,263 @@
GEM
remote: https://rubygems.org/
specs:
activesupport (6.0.5)
concurrent-ruby (~> 1.0, >= 1.0.2)
i18n (>= 0.7, < 2)
minitest (~> 5.1)
tzinfo (~> 1.1)
zeitwerk (~> 2.2, >= 2.2.2)
addressable (2.8.0)
public_suffix (>= 2.0.2, < 5.0)
coffee-script (2.4.1)
coffee-script-source
execjs
coffee-script-source (1.11.1)
colorator (1.1.0)
commonmarker (0.23.5)
concurrent-ruby (1.1.10)
dnsruby (1.61.9)
simpleidn (~> 0.1)
em-websocket (0.5.3)
eventmachine (>= 0.12.9)
http_parser.rb (~> 0)
ethon (0.15.0)
ffi (>= 1.15.0)
eventmachine (1.2.7)
execjs (2.8.1)
faraday (2.3.0)
faraday-net_http (~> 2.0)
ruby2_keywords (>= 0.0.4)
faraday-net_http (2.0.3)
ffi (1.15.5)
forwardable-extended (2.6.0)
gemoji (3.0.1)
github-pages (226)
github-pages-health-check (= 1.17.9)
jekyll (= 3.9.2)
jekyll-avatar (= 0.7.0)
jekyll-coffeescript (= 1.1.1)
jekyll-commonmark-ghpages (= 0.2.0)
jekyll-default-layout (= 0.1.4)
jekyll-feed (= 0.15.1)
jekyll-gist (= 1.5.0)
jekyll-github-metadata (= 2.13.0)
jekyll-include-cache (= 0.2.1)
jekyll-mentions (= 1.6.0)
jekyll-optional-front-matter (= 0.3.2)
jekyll-paginate (= 1.1.0)
jekyll-readme-index (= 0.3.0)
jekyll-redirect-from (= 0.16.0)
jekyll-relative-links (= 0.6.1)
jekyll-remote-theme (= 0.4.3)
jekyll-sass-converter (= 1.5.2)
jekyll-seo-tag (= 2.8.0)
jekyll-sitemap (= 1.4.0)
jekyll-swiss (= 1.0.0)
jekyll-theme-architect (= 0.2.0)
jekyll-theme-cayman (= 0.2.0)
jekyll-theme-dinky (= 0.2.0)
jekyll-theme-hacker (= 0.2.0)
jekyll-theme-leap-day (= 0.2.0)
jekyll-theme-merlot (= 0.2.0)
jekyll-theme-midnight (= 0.2.0)
jekyll-theme-minimal (= 0.2.0)
jekyll-theme-modernist (= 0.2.0)
jekyll-theme-primer (= 0.6.0)
jekyll-theme-slate (= 0.2.0)
jekyll-theme-tactile (= 0.2.0)
jekyll-theme-time-machine (= 0.2.0)
jekyll-titles-from-headings (= 0.5.3)
jemoji (= 0.12.0)
kramdown (= 2.3.2)
kramdown-parser-gfm (= 1.1.0)
liquid (= 4.0.3)
mercenary (~> 0.3)
minima (= 2.5.1)
nokogiri (>= 1.13.4, < 2.0)
rouge (= 3.26.0)
terminal-table (~> 1.4)
github-pages-health-check (1.17.9)
addressable (~> 2.3)
dnsruby (~> 1.60)
octokit (~> 4.0)
public_suffix (>= 3.0, < 5.0)
typhoeus (~> 1.3)
html-pipeline (2.14.2)
activesupport (>= 2)
nokogiri (>= 1.4)
http_parser.rb (0.8.0)
i18n (0.9.5)
concurrent-ruby (~> 1.0)
jekyll (3.9.2)
addressable (~> 2.4)
colorator (~> 1.0)
em-websocket (~> 0.5)
i18n (~> 0.7)
jekyll-sass-converter (~> 1.0)
jekyll-watch (~> 2.0)
kramdown (>= 1.17, < 3)
liquid (~> 4.0)
mercenary (~> 0.3.3)
pathutil (~> 0.9)
rouge (>= 1.7, < 4)
safe_yaml (~> 1.0)
jekyll-avatar (0.7.0)
jekyll (>= 3.0, < 5.0)
jekyll-coffeescript (1.1.1)
coffee-script (~> 2.2)
coffee-script-source (~> 1.11.1)
jekyll-commonmark (1.4.0)
commonmarker (~> 0.22)
jekyll-commonmark-ghpages (0.2.0)
commonmarker (~> 0.23.4)
jekyll (~> 3.9.0)
jekyll-commonmark (~> 1.4.0)
rouge (>= 2.0, < 4.0)
jekyll-default-layout (0.1.4)
jekyll (~> 3.0)
jekyll-feed (0.15.1)
jekyll (>= 3.7, < 5.0)
jekyll-gist (1.5.0)
octokit (~> 4.2)
jekyll-github-metadata (2.13.0)
jekyll (>= 3.4, < 5.0)
octokit (~> 4.0, != 4.4.0)
jekyll-include-cache (0.2.1)
jekyll (>= 3.7, < 5.0)
jekyll-mentions (1.6.0)
html-pipeline (~> 2.3)
jekyll (>= 3.7, < 5.0)
jekyll-optional-front-matter (0.3.2)
jekyll (>= 3.0, < 5.0)
jekyll-paginate (1.1.0)
jekyll-readme-index (0.3.0)
jekyll (>= 3.0, < 5.0)
jekyll-redirect-from (0.16.0)
jekyll (>= 3.3, < 5.0)
jekyll-relative-links (0.6.1)
jekyll (>= 3.3, < 5.0)
jekyll-remote-theme (0.4.3)
addressable (~> 2.0)
jekyll (>= 3.5, < 5.0)
jekyll-sass-converter (>= 1.0, <= 3.0.0, != 2.0.0)
rubyzip (>= 1.3.0, < 3.0)
jekyll-sass-converter (1.5.2)
sass (~> 3.4)
jekyll-seo-tag (2.8.0)
jekyll (>= 3.8, < 5.0)
jekyll-sitemap (1.4.0)
jekyll (>= 3.7, < 5.0)
jekyll-swiss (1.0.0)
jekyll-theme-architect (0.2.0)
jekyll (> 3.5, < 5.0)
jekyll-seo-tag (~> 2.0)
jekyll-theme-cayman (0.2.0)
jekyll (> 3.5, < 5.0)
jekyll-seo-tag (~> 2.0)
jekyll-theme-dinky (0.2.0)
jekyll (> 3.5, < 5.0)
jekyll-seo-tag (~> 2.0)
jekyll-theme-hacker (0.2.0)
jekyll (> 3.5, < 5.0)
jekyll-seo-tag (~> 2.0)
jekyll-theme-leap-day (0.2.0)
jekyll (> 3.5, < 5.0)
jekyll-seo-tag (~> 2.0)
jekyll-theme-merlot (0.2.0)
jekyll (> 3.5, < 5.0)
jekyll-seo-tag (~> 2.0)
jekyll-theme-midnight (0.2.0)
jekyll (> 3.5, < 5.0)
jekyll-seo-tag (~> 2.0)
jekyll-theme-minimal (0.2.0)
jekyll (> 3.5, < 5.0)
jekyll-seo-tag (~> 2.0)
jekyll-theme-modernist (0.2.0)
jekyll (> 3.5, < 5.0)
jekyll-seo-tag (~> 2.0)
jekyll-theme-primer (0.6.0)
jekyll (> 3.5, < 5.0)
jekyll-github-metadata (~> 2.9)
jekyll-seo-tag (~> 2.0)
jekyll-theme-slate (0.2.0)
jekyll (> 3.5, < 5.0)
jekyll-seo-tag (~> 2.0)
jekyll-theme-tactile (0.2.0)
jekyll (> 3.5, < 5.0)
jekyll-seo-tag (~> 2.0)
jekyll-theme-time-machine (0.2.0)
jekyll (> 3.5, < 5.0)
jekyll-seo-tag (~> 2.0)
jekyll-titles-from-headings (0.5.3)
jekyll (>= 3.3, < 5.0)
jekyll-watch (2.2.1)
listen (~> 3.0)
jemoji (0.12.0)
gemoji (~> 3.0)
html-pipeline (~> 2.2)
jekyll (>= 3.0, < 5.0)
kramdown (2.3.2)
rexml
kramdown-parser-gfm (1.1.0)
kramdown (~> 2.0)
liquid (4.0.3)
listen (3.7.1)
rb-fsevent (~> 0.10, >= 0.10.3)
rb-inotify (~> 0.9, >= 0.9.10)
mercenary (0.3.6)
minima (2.5.1)
jekyll (>= 3.5, < 5.0)
jekyll-feed (~> 0.9)
jekyll-seo-tag (~> 2.1)
minitest (5.16.2)
nokogiri (1.13.6-arm64-darwin)
racc (~> 1.4)
octokit (4.25.1)
faraday (>= 1, < 3)
sawyer (~> 0.9)
pathutil (0.16.2)
forwardable-extended (~> 2.6)
public_suffix (4.0.7)
racc (1.6.0)
rb-fsevent (0.11.1)
rb-inotify (0.10.1)
ffi (~> 1.0)
rexml (3.2.5)
rouge (3.26.0)
ruby2_keywords (0.0.5)
rubyzip (2.3.2)
safe_yaml (1.0.5)
sass (3.7.4)
sass-listen (~> 4.0.0)
sass-listen (4.0.0)
rb-fsevent (~> 0.9, >= 0.9.4)
rb-inotify (~> 0.9, >= 0.9.7)
sawyer (0.9.2)
addressable (>= 2.3.5)
faraday (>= 0.17.3, < 3)
simpleidn (0.2.1)
unf (~> 0.1.4)
terminal-table (1.8.0)
unicode-display_width (~> 1.1, >= 1.1.1)
thread_safe (0.3.6)
typhoeus (1.4.0)
ethon (>= 0.9.0)
tzinfo (1.2.9)
thread_safe (~> 0.1)
unf (0.1.4)
unf_ext
unf_ext (0.0.8.2)
unicode-display_width (1.8.0)
webrick (1.7.0)
zeitwerk (2.6.0)
PLATFORMS
arm64-darwin-21
DEPENDENCIES
github-pages
webrick
BUNDLED WITH
2.3.4

12
docs/_config.yml

@ -0,0 +1,12 @@
title: "CORE Tech Docs"
remote_theme: just-the-docs/just-the-docs
lang: en-GB
permalink: pretty
color_scheme: govuk
aux_links:
"API browser":
- /api
"Design history":
- https://core-design-history.herokuapp.com
"GitHub":
- https://github.com/communitiesuk/submit-social-housing-lettings-and-sales-data

3
docs/_sass/color_schemes/govuk.scss

@ -0,0 +1,3 @@
$btn-primary-color: #00703c;
$link-color: #1d70b8;
$grey-lt-000: #f8f8f8;

35
docs/adr/adr-001-initial-architecture-decisions.md

@ -1,31 +1,34 @@
### ADR - 001: Initial Architecture Decisions
---
parent: Architecture decisions
---
##### Application Framework
# 001: Initial architecture decisions
Ruby on Rails
- Well established and commonly used within MHCLG and gov.uk in general
## Ruby on Rails
- Well established and commonly used within DLUHC and GOV.UK in general
- Good ecosystem for common web app tasks, quick productivity
- Matches team skill set
- Analysis/RAP pipelines will sit in the DAP platform and not this application directly so optimising for web framework tasks makes sense.
Testing
- Rspec for unit testing
- Capybara or Cypress-Rails for front end testing
- TDD or ATDD approach
- No specific code coverage target or deploy gate as we feel this leads to arbitrary metric chasing and is counter-productive
## Testing
- Rspec for unit testing
- Capybara or Cypress-Rails for front end testing
- TDD or ATDD approach
- No specific code coverage target or deploy gate as we feel this leads to arbitrary metric chasing and is counter-productive
## Front end
Front end
- In the same app codebase
- ERB templates
Code style and linting
- Gov.uk Rubocop for Ruby style
- .editorconfig for whitespace, newlines etc
## Code style and linting
<br />
- GOV.UK Rubocop for Ruby style
- `.editorconfig` for whitespace, newlines etc
#### Ways of Working
## Ways of working
- Flexible approach to branching. Generally Trunk based CI (every TDD round results in a commit and push to master) when pairing, branches and PR when doing solo or more exploratory work.
- Github actions for automated test, build, deploy pipeline

6
docs/adr/adr-002-repositories.md

@ -1,6 +1,8 @@
### ADR - 002: Initial Architecture Decisions
---
parent: Architecture decisions
---
#### Repositories
# 002: Repositories
There will be two git repositories for this project.

16
docs/adr/adr-003-form-submission-flow.md

@ -1,19 +1,21 @@
### ADR - 003: Form Submission Flow
---
parent: Architecture decisions
---
Turbo Frames (https://github.com/hotwired/turbo-rails) for form pages/questions with data saved (but not necessarily fully validated) to Active Record model on each submit.
# 003: Form submission flow
[Turbo Frames](https://github.com/hotwired/turbo-rails) for form pages/questions with data saved (but not necessarily fully validated) to Active Record model on each submit.
#### Impact on Performance
## Impact on performance
Using Turbo Frames allows us to swap out just the question part of the page without needing full page refreshes as you go through the form and provides a "Single Page Application like" user experience. Each question still gets a unique URL that can be navigated to directly with the Case Log ID and the overall user experience is that form navigation feels faster.
#### Impact on interrupted sessions
## Impact on interrupted sessions
We currently have a single Active Record model for Case Logs that contains all the question fields. Every time a question is submitted the answer will be saved in the Active Record model instance before the next frame is rendered. This model will need to be able to handle partial records and partial validation anyway since not all API users will have all the required data. Validation can occur based on the data already saved and/or once the form is finally submitted. Front end validation will still happen additionally as you go through the form to help make sure users don't get a long list of errors at the end. Using session data here and updating the model only once the form is completed would not seem to have any advantages over this approach.
We currently have a single Active Record model for Case Logs that contains all the question fields. Every time a question is submitted the answer will be saved in the Active Record model instance before the next frame is rendered. This model will need to be able to handle partial records and partial validation anyway since not all API users will have all the required data. Validation can occur based on the data already saved and/or once the form is finally submitted. Front end validation will still happen additionally as you go through the form to help make sure users don’t get a long list of errors at the end. Using session data here and updating the model only once the form is completed would not seem to have any advantages over this approach.
This means that when a user navigates away from the form or closes the tab etc, they can use the URL to navigate directly back to where they left off, or follow the form flow through again, and in both cases their submitted answers will still be there.
#### Impact on API
## Impact on API
The API will still expect to take a JSON describing the case log, instantiate the model with the given fields, and run validations as if it had been submitted.

16
docs/adr/adr-004-gov-paas.md

@ -1,8 +1,10 @@
### ADR - 004: Infrastructure Switch
---
parent: Architecture decisions
---
#### Gov PaaS
# 004: Infrastructure switch to GOV.UK PaaS
The application infrastructure will be moved from the initial AWS set up to Gov PaaS. The initial expectation is to have a Gov PaaS account `dluhc-core` with 2 spaces `sandbox`, `production`.
The application infrastructure will be moved from the initial AWS set up to GOV.UK PaaS. The initial expectation is to have a GOV.UK PaaS account `dluhc-core` with 2 spaces `sandbox`, `production`.
Sandbox will consist of 2 small instances (512M) and 1 tiny-unencrypted-13 Postgres instance.
@ -10,11 +12,11 @@ Production infrastructure sizing will be decided at a later time and once our ac
The reasoning for this is:
- Department policy is to use Gov PaaS whenever possible
- DLUHC does not have a lot of internal dev ops skills/capacity so by leveraging Gov PaaS we can leverage having most of the monitoring, running, scaling and security already provided.
- Department policy is to use GOV.UK PaaS whenever possible
- DLUHC does not have a lot of internal dev ops skills/capacity so by leveraging GOV.UK PaaS we can leverage having most of the monitoring, running, scaling and security already provided.
- We get a simpler infrastructure setup than the AWS setup we currently have
- All of the infrastructure we currently require is well supported on Gov PaaS
- All of the infrastructure we currently require is well supported on GOV.UK PaaS
One potential downside is that data replication to CDS may be slightly more complicated as adding our database to a VPC requires the Gov PaaS support team to do that on our behalf.
This also means the Github repository previously used for [Infrastructure](https://github.com/communitiesuk/mhclg-data-collection-beta-infrastructure) will be archived after this change goes in as it won't be needed anymore.
This also means the GitHub repository previously used for [Infrastructure](https://github.com/communitiesuk/mhclg-data-collection-beta-infrastructure) will be archived after this change goes in as it won’t be needed any more.

14
docs/adr/adr-005-form-definition.md

@ -1,6 +1,8 @@
### ADR - 005: Form Definition
---
parent: Architecture decisions
---
#### Config driven front-end
# 005: Config driven frontend
We will initially try to model the form as a JSON structure that should describe all the information needed to display the form to the user. That means it will need to describe the sections, subsections, pages, questions, answer options etc.
@ -10,16 +12,16 @@ This should also mean that in the future it could be possible to create a UI tha
Since initially the JSON config will not create database migrations or ActiveRecord model validations, it will instead assume that these have been correctly created for the config provided. The reasoning for this is the following assumptions:
- The form will be tweaked regularly (amending questions wording, changing the order of questions or the page a question is displayed on)
- The actual data collected will change very infrequently. Time series continuity is very important to ADD (Analysis and Data Directorate) so the actual data collected should stay largely consistent i.e. in general we can change the question wording in ways that makes the intent clearer or easier to understand, but not in ways that would make the data provider give a different answer.
- The form will be tweaked regularly (amending questions wording, changing the order of questions or the page a question is displayed on).
- The actual data collected will change very infrequently. Time series continuity is very important to ADD (Analysis and Data Directorate) so the actual data collected should stay largely consistent i.e. in general we can change the question wording in ways that makes the intent clearer or easier to understand, but not in ways that would make the data provider give a different answer.
A form parser class will parse this config into ruby objects/methods that can be used as an API by the rest of the application, such that we could change the underlying config if needed (for example swap JSON for YAML or for DataBase objects) without needing to change the rest of the application.
#### JSON Structure
## JSON Structure
First pass of a form definition
```
```json
{
form_type: [lettings/sales]
start_year: yyyy

18
docs/adr/adr-006-saving-values.md

@ -1,8 +1,10 @@
### ADR - 006: Saving values to the database
---
parent: Architecture decisions
---
We have opted to save values to the database directly instead of saving keys/numbers that need to be converted with enums in models using active record.
# 006: Saving values to the database
### Saving values to the database
We have opted to save values to the database directly instead of saving keys/numbers that need to be converted with enums in models using active record.
There are a few reasons we have opted to save the values directly, they are as follows
@ -10,14 +12,12 @@ There are a few reasons we have opted to save the values directly, they are as f
- Currently there is no need to abstract the data as the data should be safe from being accessed by anyone external to the project
- It doesn't require additional dev work to map keys/numbers to values, we can just pull the values out directly and use them in the code, for example on the check answers page
- It doesn’t require additional dev work to map keys/numbers to values, we can just pull the values out directly and use them in the code, for example on the check answers page
### Drawbacks
## Drawbacks
- Changing the wording/casing of the answers could result in discrepancies in the database
- Changing the wording/casing of the answers could result in discrepancies in the database.
- There is a small risk that if the database is accessed by someone unauthorised they would have access to personally identifiable information if we were to collect Any. We will be mitigating this risk by encrypting the production database
- There is a small risk that if the database is accessed by someone unauthorised they would have access to personally identifiable information if we were to collect Any. We will be mitigating this risk by encrypting the production database.
This decision is not too difficult to change and can be revisited in the future if there is sufficient reason to switch to storing keys/numbers and using enums and active record to convert those to the appropriate values.

19
docs/adr/adr-007-data-validations.md

@ -1,4 +1,8 @@
### ADR - 007: Data Validations
---
parent: Architecture decisions
---
# 007: Data validations
Data validations that happen in CORE at the point of data collection fall into two categories:
@ -7,21 +11,20 @@ Data validations that happen in CORE at the point of data collection fall into t
These are handled slightly differently:
##### Validity checks
## Validity checks
These run for all submitted data. Every time a form page (in the UI) is submitted, the fields related to that form page will be checked to ensure that any responses given are valid. If they are not, an error message will be shown on screen, and it will not be possible to "Save and continue" until the response is fixed or removed.
These run for all submitted data. Every time a form page (in the UI) is submitted, the fields related to that form page will be checked to ensure that any responses given are valid. If they are not, an error message will be shown on screen, and it will not be possible to ‘Save and continue’ until the response is fixed or removed.
Similarly if an API request is made to create a case log with data that contains _invalid_ fields, that data will be rejected, and an error message will be returned.
## Presence checks
##### Presence checks
These are not strictly error checks since it's possible to submit partial data. In the form UI it is possible to click "Save and continue" and move past questions that you might not know right now, and leave them to come back to later. We shouldn't prevent this workflow.
These are not strictly error checks since it’s possible to submit partial data. In the form UI it is possible to click ‘Save and continue’ and move past questions that you might not know right now, and leave them to come back to later. We shouldn’t prevent this workflow.
Similarly the API client (3rd party software system) may not have all the required data and may only be submitting a partial log. This is still a valid use case so we should not be enforcing presence checks and returning errors based on them for either submission type.
Instead we determine the _status_ of the case log based the presence checks. Every time data is submitted (via a form page, bulk upload or API), before saving the data, the system will check whether all fields have been completed *and* pass validity checks. If so, the case log will be marked as *completed*, if not it will be marked as *in progress*.
Instead we determine the _status_ of the case log based the presence checks. Every time data is submitted (via a form page, bulk upload or API), before saving the data, the system will check whether all fields have been completed _and_ pass validity checks. If so, the case log will be marked as _completed_, if not it will be marked as _in progress_.
By default all fields that a Case Log has will be assumed to be required unless explicitly marked as not required (for example as a result of other answers rendering a question inapplicable).
On the form UI this will work by not allowing you to "submit" the form, until all presence checks have been satisfied, but all other navigation is allowed. On the API this will work by returning a Case Log that is "in progress" if you've submitted a partial log, or "completed" if you've submitted a full log, or "Errors" if you've submitted an invalid log.
On the form UI this will work by not allowing you to submit the form, until all presence checks have been satisfied, but all other navigation is allowed. On the API this will work by returning a Case Log that is ‘in progress’ if you’ve submitted a partial log, or ‘completed’ if you’ve submitted a full log, or ‘errors’ if you’ve submitted an invalid log.
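The status rule described in this ADR can be sketched in plain Ruby. This is a minimal illustration, not the application's implementation: the field names, the age rule, and the `log_status` and `validity_errors` helpers are all hypothetical.

```ruby
# Hypothetical list of fields a log needs before it counts as complete.
REQUIRED_FIELDS = %w[tenancy_code age].freeze

# Validity checks only look at values that are present (assumed example rule).
def validity_errors(log)
  errors = []
  age = log["age"]
  errors << "age must be between 16 and 120" if age && !(16..120).cover?(age)
  errors
end

# Invalid data is reported as :errors. Missing answers never raise an error,
# they only keep the log :in_progress until every required field is present.
def log_status(log, not_required: [])
  return :errors unless validity_errors(log).empty?

  missing = (REQUIRED_FIELDS - not_required).select { |field| log[field].nil? }
  missing.empty? ? :completed : :in_progress
end
```

The `not_required` argument stands in for fields that other answers have made inapplicable, so they no longer count towards the presence check.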

16
docs/adr/adr-008-field-names.md

@ -1,11 +1,11 @@
### ADR - 008: Field Names
---
parent: Architecture decisions
---
We are changing the schema to reflect the way the data is stored in CORE.
This is due to the SPSS queries that are being performed by ADD and the complexity that would come with changing them.
# 008: Field names
The field names are saved lowercase as opposed to the uppercase versions we see in CORE.
This is due to Ruby expecting the uppercase parameters to be constants and database fields are expected to be lower case.
These fields could be mapped to their uppercase versions during the replication if needed.
We are changing the schema to reflect the way the data is stored in CORE. This is due to the SPSS queries that are being performed by ADD and the complexity that would come with changing them.
A lot of the values are now also being stored as enums.
This gives as some validation by default as the values not defined in the enums will fail to save.
The field names are saved lowercase as opposed to the uppercase versions we see in CORE. This is due to Ruby expecting the uppercase parameters to be constants and database fields are expected to be lower case. These fields could be mapped to their uppercase versions during the replication if needed.
A lot of the values are now also being stored as enums. This gives as some validation by default as the values not defined in the enums will fail to save.
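As a rough sketch of the mapping mentioned above, lowercase field names could be converted to their uppercase CORE equivalents at replication time. The `to_core_field_names` helper and the field names used with it are assumptions for illustration only.

```ruby
# Hypothetical helper: converts lowercase field names, as stored by the
# application, to the uppercase names used in CORE. Values are left untouched.
def to_core_field_names(record)
  record.transform_keys { |name| name.to_s.upcase }
end
```

For example, `to_core_field_names({ "tenancy" => 1 })` returns a hash keyed by `"TENANCY"`.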

10
docs/adr/adr-009-form-routing-logic.md

@ -1,12 +1,16 @@
### ADR - 009: Form Routing Logic
---
parent: Architecture decisions
---
# 009: Form routing logic
There are 2 ways you can think about form (page) routing logic:
1. Based on the answer you give to a page you are navigated to some point in the form, i.e. a "Jump to"
1. Based on the answer you give to a page you are navigated to some point in the form, i.e. a ‘jump to’
2. Each question is considered sequentially and independently and we evaluate whether it should be shown or not
Our Form Definition DSL takes the second approach. This has a couple of advantages:
- It makes the check answers pattern easier to code as you can ask each page directly: "Have the conditions for you to be shown been met?", with approach 1, you would effectively have to traverse the full route branch to see if a particular page was shown for each page/question which adds complexity.
- It makes the check answers pattern easier to code as you can ask each page directly: “Have the conditions for you to be shown been met?”, with approach 1, you would effectively have to traverse the full route branch to see if a particular page was shown for each page/question which adds complexity.
- It makes it easier to look at the JSON and see at a glance what conditions will show or hide a page, which is closer to how the business logic is discussed and is easier to reason about.
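The second approach can be sketched as follows. This is a simplified illustration rather than the real form classes: the `depends_on` attribute, the `pages_to_check` helper, and the example page and field names are assumptions.

```ruby
# Hypothetical page object. `depends_on` is an assumed list of condition
# hashes; the page is shown if any one hash is fully matched by the log.
Page = Struct.new(:id, :depends_on) do
  def routed_to?(log)
    return true if depends_on.nil? || depends_on.empty?

    depends_on.any? do |conditions|
      conditions.all? { |field, expected| log[field] == expected }
    end
  end
end

# Check answers can ask each page directly, without walking a route branch.
def pages_to_check(pages, log)
  pages.select { |page| page.routed_to?(log) }
end
```

Because each page carries its own conditions, listing the pages for check answers is a single filter over the form's pages.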

12
docs/adr/adr-010-admin-users-vs-users.md

@ -1,9 +1,13 @@
### ADR - 010: Admin Users vs Users
---
parent: Architecture decisions
---
#### Why do we have 2 User classes, AdminUser and User?
# 010: Admin users vs Users
This is modelling a real life split. `AdminUsers` are internal DLUHC users or helpdesk employees. While `Users` are external users working at data providing organisations. So local authority/housing association's "admin" users, i.e. Data Co-ordinators are a type of the User class. They have the ability to add or remove other users to or from their organisation, and to update their organisation details etc, but only through the designed UI. They do not get direct access to ActiveAdmin.
## Why do we have 2 user classes, `AdminUser` and `User`?
This is modelling a real life split. `AdminUsers` are internal DLUHC users or help desk employees. While `Users` are external users working at data providing organisations. So local authority/housing association’s "admin" users, i.e. Data Co-ordinators are a type of the User class. They have the ability to add or remove other users to or from their organisation, and to update their organisation details etc, but only through the designed UI. They do not get direct access to ActiveAdmin.
AdminUsers on the other hand get direct access to ActiveAdmin. From there they can download entire datasets (via CSV, XML, JSON), view any log from any organisation, and add or remove users of any type including other Admin users. This means TDA will likely also require more stringent authentication for them using MFA (which users will likely not require). So the class split also helps there.
A potential downside to this approach is that it does not currently allow for `AdminUsers` to sign into the application UI itself with their Admin credentials. However, we need to see if there's an actual use case for this and what it would be (since they aren't part of an organisation to be uploading data for, but could add or amend data or user or org details through ActiveAdmin anyway). If there is a strong use case for it this could be work around by either: providing them with two sets of credentials, or modifying the `authenticate_user` method to also check `AdminUser` credentials.
A potential downside to this approach is that it does not currently allow for `AdminUsers` to sign into the application UI itself with their Admin credentials. However, we need to see if there’s an actual use case for this and what it would be (since they aren’t part of an organisation to be uploading data for, but could add or amend data or user or org details through ActiveAdmin anyway). If there is a strong use case for it this could be work around by either: providing them with two sets of credentials, or modifying the `authenticate_user` method to also check `AdminUser` credentials.

14
docs/adr/adr-011-form-oop-refactor.md

@ -1,10 +1,16 @@
### ADR - 011: Splitting the form parsing into objects
---
parent: Architecture decisions
---
Initially a single "Form" class handled the parsing of the form definition JSON as well as a lot of the logic around what different sections meant. This works fine but led to a lot of places in code where we're passing around arguments to determine whether a page or section should or shouldn't do something rather than being able to ask it directly. Refactoring this into smaller form domain object classes has several benefits:
# 011: Splitting the form parsing into objects
- It's easier to compare the form definition JSON to the code classes and reason about what fields can be passed and what effect they'll have
Initially a single `Form` class handled the parsing of the form definition JSON as well as a lot of the logic around what different sections meant. This works fine but led to a lot of places in code where we’re passing around arguments to determine whether a page or section should or shouldn’t do something rather than being able to ask it directly.
Refactoring this into smaller form domain object classes has several benefits:
- It’s easier to compare the form definition JSON to the code classes and reason about what fields can be passed and what effect they’ll have
- It moves business logic out of the helpers and keeps them to just dealing with display logic
- It makes it easier to unit test form functionality, and group that into smaller chunks
- It allows for less passing of arguments. e.g. `page.routed_to?(case_log)` vs `form.was_page_routed_to?(page, case_log)`
This abstraction is likely still not the best (the form vs case log split) but this seems like an improvement that can be iterated on.
This abstraction is likely still not the best (the form vs case log split) but this seems like an improvement that can be iterated on.
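As a minimal sketch of the last benefit, a page object that holds its own routing conditions can be asked directly whether it applies to a log. The `Page` and `CaseLog` classes below are hypothetical stand-ins rather than the application's actual classes, and the matching rule (any one set of conditions fully satisfied) is an assumption made for illustration.

```ruby
# Hypothetical stand-ins for illustration only; not the application's real classes.
CaseLog = Struct.new(:answers)

class Page
  attr_reader :id, :depends_on

  def initialize(id, depends_on: [])
    @id = id
    @depends_on = depends_on
  end

  # A page with no conditions is always routed to. Otherwise at least one
  # set of conditions must fully match the answers on the log (assumed rule).
  def routed_to?(case_log)
    return true if depends_on.empty?

    depends_on.any? do |conditions|
      conditions.all? { |question, value| case_log.answers[question] == value }
    end
  end
end

page = Page.new("property_postcode", depends_on: [{ "needstype" => 1 }])
general_needs_log = CaseLog.new({ "needstype" => 1 })
supported_housing_log = CaseLog.new({ "needstype" => 2 })

puts page.routed_to?(general_needs_log)
puts page.routed_to?(supported_housing_log)
```

The caller only needs the page and the log; no form object or extra arguments are passed around.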

8
docs/adr/adr-012-controller-http-return-statuses.md

@ -1,4 +1,8 @@
### ADR - 012: Controller HTTP return statuses
---
parent: Architecture decisions
---
# 012: Controller HTTP return statuses
Controllers assess authentication by 3 criteria:
@ -6,7 +10,7 @@ Controllers assess authentication by 3 criteria:
2. Are you signed in and requesting an action that your role/user type has access to?
3. Are you signed in, requesting an action that your role/user type has access to and requesting a resource that your user has access to.
When these aren't met they fail with the following response types:
When these aren’t met they fail with the following response types:
1. 401: Unauthorized. Redirect to sign-in page.
2. 401: Unauthorized

17
docs/adr/adr-013-inferring-la-from-postcode.md

@ -1,12 +1,13 @@
### ADR - 013: Inferring LA from postcode
---
parent: Architecture decisions
---
# 013: Inferring LA from postcode
We use ONS data to infer local authority from postcode in the property information section.
The Office for National Statistics (ONS) publishes the National Statistics
Postcode Lookup (NSPL) and ONS Postcode Directory (ONSPD) datasets,
which may be used to find a local authority district for a postcode when compiling statistics.
We're using postcodes.io API with postcodes_io gem.
Postcodes.io uses OS and ONS data which is updated as soon as new data becomes available.
The Office for National Statistics (ONS) publishes the National Statistics Postcode Lookup (NSPL) and ONS Postcode Directory (ONSPD) datasets, which may be used to find a local authority district for a postcode when compiling statistics.
We’re using postcodes.io API with postcodes_io gem. Postcodes.io uses OS and ONS data which is updated as soon as new data becomes available.
We are not using OS places API due to the lack of data.
Closest datapoint to LA in OS places api is ADMINISTRATIVE_AREA which does not always match with local authority.
We are not using OS places API due to the lack of data. Closest data point to LA in OS places api is ADMINISTRATIVE_AREA which does not always match with local authority.

6
docs/adr/adr-014-annual-form-changes.md

@ -1,4 +1,8 @@
### ADR - 014: Annual form changes
---
parent: Architecture decisions
---
# 014: Annual form changes
Given that the data collection form changes annually and that the data collection windows overlap by several months to allow for late submissions of data from the previous year, we need to be able to run at least two different versions of a form concurrently. We can do this in one of at least two ways:

8
docs/adr/index.md

@ -0,0 +1,8 @@
---
has_children: true
nav_order: 9
---
# Architecture decisions
A record of architectural decisions made on this project.

20
docs/api/index.html

@ -0,0 +1,20 @@
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="UTF-8">
<meta name="viewport" content="width=device-width, initial-scale=1">
<link rel="stylesheet" type="text/css" href="https://cdnjs.cloudflare.com/ajax/libs/swagger-ui/4.12.0/swagger-ui.css">
<title>DLUHC CORE Data Collection API</title>
</head>
<body>
<div id="openapi"></div>
<script src="https://cdnjs.cloudflare.com/ajax/libs/swagger-ui/4.12.0/swagger-ui-bundle.min.js"></script>
<script>
window.onload = function () {
const ui = SwaggerUIBundle({
url: "v1.json",
dom_id: "#openapi"
})
}
</script>
</body>

6
docs/api/DLUHC-CORE-Data.v1.json → docs/api/v1.json

@ -1,13 +1,13 @@
{
"openapi": "3.0.0",
"info": {
"title": "DLUHC CORE Data",
"title": "DLUHC CORE Data Collection API",
"version": "1.0",
"description": "Submit or Update CORE Case Log Data on Lettings and Sales of Social Housing in England"
"description": "Submit social housing lettings and sales data (CORE)"
},
"servers": [
{
"url": "https://dluhc-core.london.cloudapps.digital",
"url": "https://dluhc-core-staging.london.cloudapps.digital/logs",
"description": "Staging"
}
],

6
docs/exports.md

@ -1,3 +1,7 @@
---
nav_order: 7
---
# Exporting to CDS
All data collected by the application needs to be exported to the Consolidated Data Store (CDS) which is a data warehouse based on MS SQL running in the DAP (Data Analytics Platform).
@ -6,7 +10,7 @@ This is done via XML exports saved in an S3 bucket located in the DAP VPC using
Initially the application database field names and field types were chosen to match the existing CDS data as closely as possible to minimise the amount of transformation needed. This has led to a less than optimal data model though and increasingly we should look to transform at the mapping layer where beneficial for our application.
The export service is triggered nightly using [Gov PaaS tasks](https://docs.cloudfoundry.org/devguide/using-tasks.html). These tasks are triggered from a Github action, as Gov PaaS does not currently support the Cloud Foundry Task Scheduler.
The export service is triggered nightly using [Gov PaaS tasks](https://docs.cloudfoundry.org/devguide/using-tasks.html). These tasks are triggered from a GitHub action, as Gov PaaS does not currently support the Cloud Foundry Task Scheduler.
The S3 bucket is located in the DAP VPC rather than the application VPC as DAP runs in an AWS account directly so access to the S3 bucket can be restricted to only the IPs used by the application. This is not possible the other way around as [Gov PaaS does not support restricting S3 access by IP](https://github.com/alphagov/paas-roadmap/issues/107).

35
docs/form_builder.md → docs/form/builder.md

@ -1,36 +1,15 @@
# Form Builder
---
parent: Generating forms
nav_order: 1
---
## Background
Social housing lettings and sales data is collected in annual collection windows that run from 1st April to 1st April.
During this window the form and questions generally stay constant. The form will generally change by small amounts between each collection window. Typical changes are adding new questions, adding or removing answer options from questions or tweaking question wording for clarity.
A paper form is produced for guidance and to help data providers collect the data offline, and a bulk upload template is circulated which need to match the online form.
Data is accepted for a collection window for up to 3 months after it’s finished to allow for late data submission. This means that between April and July two version of the form run simultaneously.
Other considerations that went into our design are being able to re-use as much of this solution for other data collections, and possibly having the ability to generate the form and/or form changes from a user interface.
We haven’t used micro-services, preferring to deploy a single application but we have modelled the form itself as configuration in the form of a JSON structure that acts as a sort of DSL/form builder for the form.
The idea is to decouple the code that creates the required routes, controller methods, views etc to display the form from the actual wording of questions or order of pages such that it becomes possible to make changes to the form with little or no code changes.
This should also mean that in the future it could be possible to create an interface that can construct the JSON config, which would open up the ability to make form changes to a wider audience. Doing this fully would require generating and running the necessary migrations for data storage, generating the required ActiveRecord methods to validate the data server side, and generating/updating API endpoints and documentation. All of this is likely to be beyond the scope of initial MVP but could be looked at in the future.
Since initially the JSON config will not create database migrations or ActiveRecord model validations, it will instead assume that these have been correctly created for the config provided. The reasoning for this is the following assumptions:
- The form will be tweaked regularly (amending questions wording, changing the order of questions or the page a question is displayed on)
- The actual data collected will change very infrequently. Time series continuity is very important to ADD (Analysis and Data Directorate) so the actual data collected should stay largely consistent i.e. in general we can change the question wording in ways that makes the intent clearer or easier to understand, but not in ways that would make the data provider give a different answer.
A form parser class will parse this config into ruby objects/methods that can be used as an API by the rest of the application, such that we could change the underlying config if needed (for example swap JSON for YAML or for DataBase objects) without needing to change the rest of the application. We’ll call this the Form Runner part of the application.
# Form builder
## Setup this log
The setup this log section is treated slightly differently from the rest of the form. It is more accurately viewed as providing metadata about the form than as being part of the form itself. It also needs to know far more about the application specific context than other parts of the form such as who the current user is, what organisation they’re part of and what role they have etc.
As a result it’s not modelled as part of the config but rather as code. It still uses the same Form Runner components though.
As a result it’s not modelled as part of the config but rather as code. It still uses the same [Form Runner](runner) components though.
## Features the Form Config supports
@ -183,7 +162,7 @@ This will validate all forms in directories `["config/forms", "spec/fixtures/for
## Form models and definition
For information about the form model and related models (section, subsection, page, question) and how these relate to each other follow [this link](/docs/form/form.md)
For information about the form model and related models (section, subsection, page, question) and how these relate to each other see [form definition](/form/definition).
## Improvements that could be made

67
docs/form/form.md → docs/form/definition.md

@ -1,4 +1,10 @@
## Form Definition
---
parent: Generating forms
has_children: true
nav_order: 3
---
# Form definition
The current system is built around a form definition written in JSON. At the top level every form will expect to have the following attributes:
@ -8,7 +14,8 @@ The current system is built around a form definition written in JSON. At the top
- Sections: the sections in the form, this block is where the bulk of the form definition will be.
An example of this might look like the following:
```JSON
```json
{
"form_type": "lettings",
"start_date": "2021-04-01T00:00:00.000+01:00",
@ -21,50 +28,39 @@ An example of this might look like the following:
Note that the end date of one form will overlap the start date of another to allow for late submissions. This means that every year there will be a period of time in which two forms are running simultaneously.
### How is the form split up?
A summary of how the form is split up is as follows:
- A form is divided up into one or more sections.
- Each section can have one or more subsections.
- Each subsection can have one or more pages.
- Each page can have one or more questions.
More information about these form elements can be found in the following links:
A form is split up as follows:
- [Section](/docs/form/section.md)
- [Subsection](/docs/form/subsection.md)
- [Page](/docs/form/page.md)
- [Question](/docs/form/question.md)
- A form is divided up into one or more [sections](section)
- Each section can have one or more [subsections](subsection)
- Each subsection can have one or more [pages](page)
- Each page can have one or more [questions](question)
### The Form Model, Views and Controller
Rails uses the model, view, controller (MVC) pattern which we follow.
Rails uses the Model, View, Controller (MVC) pattern which we follow.
#### The Form Model
## Form model
There is no need to manually initialise a form object as this is handled by the FormHandler class at boot time. If a new form needs to be added then a JSON file containing the form definition should be added to `config/forms` where the FormHandler will be able to locate it and instantiate it.
A form has the following attributes:
- name: The name of the form
- setup_sections: The setup section (this is not defined in the JSON, for more information see this)
- form_definition: The parsed form JSON
- form_sections: The sections found within the form definition JSON
- type: The type of form (this is used to indicate if the form is for a sale or a letting)
- sections: The combination of the setup section with those found in the JSON definition
- subsections: The subsections of the form (these live under the sections)
- pages: The pages of the form (these live under the subsections)
- questions: The questions of the form (these live under the pages)
- start_date: The start date of the form, in iso8601 format
- end_date: The end date of the form, in iso8601 format
- `name`: The name of the form
- `setup_sections`: The setup section (this is not defined in the JSON, for more information see this)
- `form_definition`: The parsed form JSON
- `form_sections`: The sections found within the form definition JSON
- `type`: The type of form (this is used to indicate if the form is for a sale or a letting)
- `sections`: The combination of the setup section with those found in the JSON definition
- `subsections`: The subsections of the form (these live under the sections)
- `pages`: The pages of the form (these live under the subsections)
- `questions`: The questions of the form (these live under the pages)
- `start_date`: The start date of the form, in ISO 8601 format
- `end_date`: The end date of the form, in ISO 8601 format
#### The Form Views
## Form views
The main view used for rendering the form is the `app/views/form/page.html.erb` view as the Form contains multiple pages (which live in subsections within sections). This page view then renders the appropriate partials for the question types of the questions on the current page.
We currently have views for the following question types:
- Numerical
- Date
- Checkbox
@ -76,11 +72,10 @@ We currently have views for the following question types:
Interruption screen questions are radio questions used for soft validation of fields. They usually have yes and no options for a user to confirm a value is correct.
#### The Form Controller
## Form controller
The form controller handles the form submission as well as the rendering of the check answers page and the review page.
### The FormHandler helper class
## FormHandler helper class
The FormHandler helper is a helper that loads all of the defined forms and initialises them as Form objects. It can also be used to get specific forms if needed.
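As a rough sketch of that loading step, the snippet below reads every JSON file in a directory and keys the parsed definitions by file name. `SketchForm` and `load_forms` are hypothetical names used for illustration; the application's `FormHandler` and `Form` classes do more than this (for example combining each definition with the setup section).

```ruby
require "json"

# Hypothetical, simplified stand-in for the application's Form class.
SketchForm = Struct.new(:name, :definition)

# Reads every JSON file in the directory and returns a hash of forms
# keyed by file name (without the .json extension).
def load_forms(directory)
  Dir.glob(File.join(directory, "*.json")).sort.to_h do |path|
    name = File.basename(path, ".json")
    [name, SketchForm.new(name, JSON.parse(File.read(path)))]
  end
end
```

Calling `load_forms("config/forms")` would return one entry per definition file, which can then be looked up by name.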

30
docs/form/index.md

@ -0,0 +1,30 @@
---
has_children: true
nav_order: 8
---
# Generating forms
Social housing lettings and sales data is collected in annual collection windows that run from 1 April to 1 April the following year.
During this window the form and questions generally stay constant. The form will generally change by small amounts between each collection window. Typical changes are adding new questions, adding or removing answer options from questions or tweaking question wording for clarity.
A paper form is produced for guidance and to help data providers collect the data offline, and a bulk upload template is circulated which needs to match the online form.
Data is accepted for a collection window for up to 3 months after it’s finished to allow for late data submission. This means that between April and July 2 versions of the form run simultaneously.
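The overlap can be illustrated with a little date arithmetic. This is a sketch only: `open_collection_years` is a hypothetical helper, and treating the closing date as exclusive is an assumption rather than a documented rule.

```ruby
require "date"

# Returns the start years of the collection windows that accept submissions
# on the given date. A window starting on 1 April of a year accepts data
# until 1 April of the next year plus the grace period (exclusive, assumed).
def open_collection_years(date, grace_months: 3)
  [date.year - 1, date.year].select do |start_year|
    window_start = Date.new(start_year, 4, 1)
    submissions_close = Date.new(start_year + 1, 4, 1) >> grace_months
    date >= window_start && date < submissions_close
  end
end

p open_collection_years(Date.new(2022, 5, 15))
p open_collection_years(Date.new(2022, 8, 1))
```

For a date in May both the previous and the current window are returned, while a date in August falls only within the current window.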
Other considerations that went into our design are being able to re-use as much of this solution for other data collections, and possibly having the ability to generate the form and/or form changes from a user interface.
We haven’t used micro-services, preferring to deploy a single application but we have modelled the form itself as configuration in the form of a JSON structure that acts as a sort of DSL/form builder for the form.
The idea is to decouple the code that creates the required routes, controller methods, views etc to display the form from the actual wording of questions or order of pages such that it becomes possible to make changes to the form with little or no code changes.
This should also mean that in the future it could be possible to create an interface that can construct the JSON config, which would open up the ability to make form changes to a wider audience. Doing this fully would require generating and running the necessary migrations for data storage, generating the required ActiveRecord methods to validate the data server side, and generating/updating API endpoints and documentation. All of this is likely to be beyond the scope of initial MVP but could be looked at in the future.
Since initially the JSON config will not create database migrations or ActiveRecord model validations, it will instead assume that these have been correctly created for the config provided. The reasoning for this is the following assumptions:
- The form will be tweaked regularly (amending questions wording, changing the order of questions or the page a question is displayed on)
- The actual data collected will change very infrequently. Time series continuity is very important to ADD (Analysis and Data Directorate) so the actual data collected should stay largely consistent i.e. in general we can change the question wording in ways that makes the intent clearer or easier to understand, but not in ways that would make the data provider give a different answer.
A form parser class will parse this config into Ruby objects/methods that can be used as an API by the rest of the application, such that we could change the underlying config if needed (for example swap JSON for YAML or for database objects) without needing to change the rest of the application. We’ll call this the Form Runner part of the application.

18
docs/form/page.md

@ -1,8 +1,16 @@
## Page
---
parent: Form definition
grand_parent: Generating forms
nav_order: 3
---
Pages are under the subsection level of the form definition. A example page might look something like this:
# Page
```JSON
Pages sit below the [`Subsection`](subsection) level of a form definition.
An example page might look something like this:
```json
"property_postcode": {
"header": "",
"description": "",
@ -23,6 +31,6 @@ The header is optional but if provided is used for the heading displayed on the
The description is optional but if provided is used for a paragraph displayed under the page header.
It's worth noting that like subsections a page can also have a `depends_on` which contains the set of conditions that must be met for the section to be accessibile to a data provider. If the conditions are not met then the page is not routed to as part of the form flow. The `depends_on` for a page will usually depend on answers given to questions, most likely to be questions in the setup section. In the above example the page is dependent on the answer to the `needstype` question being `1`, which corresponds to picking `General needs` on that question as displayed to the data provider.
It’s worth noting that like subsections a page can also have a `depends_on` which contains the set of conditions that must be met for the page to be accessible to a data provider. If the conditions are not met then the page is not routed to as part of the form flow. The `depends_on` for a page will usually depend on answers given to questions, most likely to be questions in the setup section. In the above example the page is dependent on the answer to the `needstype` question being `1`, which corresponds to picking `General needs` on that question as displayed to the data provider.
Pages can contain one or more questions.
Pages can contain one or more [questions](question).

24
docs/form/question.md

@ -1,8 +1,16 @@
## Question
---
parent: Form definition
grand_parent: Generating forms
nav_order: 4
---
Questions are under the page level of the form definition. A example question might look something like this:
# Question
```JSON
Questions are under the page level of the form definition.
An example question might look something like this:
```json
"postcode_known": {
"check_answer_label": "Do you know the property postcode?",
"header": "Do you know the property’s postcode?",
@ -27,7 +35,7 @@ In the above example the the question has the id `postcode_known`.
The `check_answer_label` contains the text that will be displayed in the label of the table on the check answers page.
The header is text that is displayed for the question.
The header is text that is displayed for the question.
Hint text is optional, but if provided it sits under the header and is normally given to provide the data inputters with guidance when answering the question, for example it might inform them about terms used in the question.
@ -35,9 +43,9 @@ The type is question type, which is used to determine the view rendered for the
The `conditional_for` contains the value needed to be selected by the data inputter in order to display another question that appears on the same page. In the example above the `postcode_full` question depends on the answer to `postcode_known` being selected as `1` or `Yes`. This would then display `postcode_full` underneath the `Yes` option on the page, allowing the data inputter to provide the postcode if they have indicated they know it. If the user has JavaScript enabled then this realtime conditional display is handled by the `app/frontend/controllers/conditional_question_controller.js` file.
the `hidden_in_check_answers` is used to hide a value from displaying on the check answers page. You only need to provide this if you want to set it to true in order to hide the value for some reason e.g. it's one of two questions appearing on a page and the other question is displayed on the check answers page. It's also worth noting that you can declare this as a with a `depends_on` which can be useful for conditionally displaying values on the check answers page. For example:
The `hidden_in_check_answers` attribute is used to hide a value from displaying on the check answers page. You only need to provide this if you want to set it to true in order to hide the value for some reason, e.g. it's one of two questions appearing on a page and the other question is displayed on the check answers page. It's also worth noting that you can declare this with a `depends_on`, which can be useful for conditionally displaying values on the check answers page. For example:
```JSON
```json
"hidden_in_check_answers": {
"depends_on": [
{
@ -54,7 +62,7 @@ Would mean the question the above is attached to would be hidden in the check an
The answer the data inputter provides to some questions allows us to infer the values of other questions we might have asked in the form, allowing us to save the data inputters some time. An example of how this might look is as follows:
```JSON
```json
"postcode_full": {
"check_answer_label": "Postcode",
"header": "What is the property’s postcode?",
@ -79,4 +87,4 @@ In the above example the width is an optional attribute and can be provided for
The above example links to the first example as both of these questions would be on the same page. The `inferred_check_answers_value` is what should be displayed on the check answers page for this question if we infer it. If the value of `postcode_known` was given as `0` (which is a no), as seen in the condition part of `inferred_check_answers_value` then we can infer that the data inputter does not know the postcode and so we would display the value of `Not known` on the check answers page for the postcode.
In the above example the `inferred_answers` refers to a question where we can infer the answer based on the answer of this question. In this case the `la` question can be inferred from the postcode value given by the data inputter as we are able to lookup the local authority based on the postcode given. We then set a property on the case log `is_la_inferred` to true to indicate that this is an answer we've inferred.
In the above example the `inferred_answers` refers to a question where we can infer the answer based on the answer of this question. In this case the `la` question can be inferred from the postcode value given by the data inputter as we are able to lookup the local authority based on the postcode given. We then set a property on the case log `is_la_inferred` to true to indicate that this is an answer we've inferred.
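The check answers behaviour described above could be sketched roughly as follows. The hash shape (`"condition"` and `"value"` keys) and the `check_answers_value` helper are assumptions made for illustration; they are not taken from the application code.

```ruby
# Assumed shape of the attribute, for illustration only.
inferred_check_answers_value = {
  "condition" => { "postcode_known" => 0 },
  "value" => "Not known",
}

# Returns the text to show on the check answers page: the inferred value when
# every condition matches the log's answers, otherwise the answer itself.
def check_answers_value(answers, question_id, inferred)
  condition_met = inferred["condition"].all? { |id, expected| answers[id] == expected }
  condition_met ? inferred["value"] : answers[question_id]
end

unknown = { "postcode_known" => 0 }
known = { "postcode_known" => 1, "postcode_full" => "SW1A 2AA" }

puts check_answers_value(unknown, "postcode_full", inferred_check_answers_value)
puts check_answers_value(known, "postcode_full", inferred_check_answers_value)
```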

15
docs/form_runner.md → docs/form/runner.md

@ -1,12 +1,17 @@
# Form Runner
---
parent: Generating forms
nav_order: 2
---
The Form Runner is composed of:
# Form runner
Ruby Classes:
The Form runner is composed of:
Ruby classes:
- A singleton form handler that instantiates an instance of each form definition (config file we have) combined with the setup section that is common to all forms. This is created at Rails boot time. (`app/models/form_handler.rb`)
- A `Form` class that is the entry point for parsing a form definition and handles most of the associated logic (`app/models/form.rb`)
- `Section`, `Subsection`, `Page` and `Question` classes (`app/models/form/`)
- [`Section`](section), [`Subsection`](subsection), [`Page`](page) and [`Question`](question) classes (`app/models/form/`)
- Setup subsection specific instances (subclasses) of `Section`, `Subsection`, `Pages` and `Questions` (`app/form/setup/`)
ERB templates:
@ -22,4 +27,4 @@ All form pages submit to the same controller method (`app/controllers/form_contr
## Form models and definition
For information about the form model and related models (section, subsection, page, question) and how these relate to each other follow [this link](/docs/form/form.md)
For information about the form model and related models (section, subsection, page, question) and how these relate to each other see [form definition](/form/definition).

40
docs/form/section.md

@ -1,26 +1,34 @@
## Section
---
parent: Form definition
grand_parent: Generating forms
nav_order: 1
---
Sections are under the top level of the form definition. A example section might look something like this:
# Section
```JSON
Sections sit at the top level of a form definition.
An example section might look something like this:
```json
"sections": {
"tenancy_and_property": {
"label": "Property and tenancy information",
"subsections": {
"property_information": {
...
},
"tenancy_information": {
...
}
"tenancy_and_property": {
"label": "Property and tenancy information",
"subsections": {
"property_information": {
...
},
"tenancy_information": {
...
}
},
...
}
},
...
}
```
In the above example the section id would be `tenancy_and_property` and its subsections would be `property_information` and `tenancy_information`.
The label contains the text that users will see for that section in the tasklist page of a case log.
The label contains the text that users will see for that section in the task list page of a case log.
Sections can contain one or more subsections.
Sections can contain one or more [subsections](subsection).

22
docs/form/subsection.md

@ -1,8 +1,16 @@
## Subsection
---
parent: Form definition
grand_parent: Generating forms
nav_order: 2
---
Subsections are under the section level of the form definition. A example subsection might look something like this:
# Subsection
```JSON
Subsections sit below the [`Section`](section) level of a form definition.
An example subsection might look something like this:
```json
"property_information": {
"label": "Property information",
"depends_on": [
@ -21,8 +29,10 @@ Subsections are under the section level of the form definition. A example subsec
}
```
In the above example the the subsection has the id `property_information`. The `depends_on` contains the set of conditions that must be met for the section to be accessibile to a data provider, in this example subsection depends on the completion of the setup section/subsection (note that this is a common condition as the answers provided to questions in the setup subsection often have an impact on what questions are asked of the data provider in later subsections of the form).
In the above example the subsection has the id `property_information`. The `depends_on` contains the set of conditions that must be met for the subsection to be accessible to a data provider. In this example the subsection depends on the completion of the setup section/subsection (note that this is a common condition as the answers provided to questions in the setup subsection often have an impact on what questions are asked of the data provider in later subsections of the form).
The label contains the text that users will see for that subsection in the task list page of a case log.
The label contains the text that users will see for that subsection in the tasklist page of a case log.
The pages of the subsection in the example would be `property_postcode` and `property_local_authority`.
The pages of the subsection in the example would be `property_postcode` and `property_local_authority`. Subsections can contain one or more pages.
Subsections can contain one or more [pages](page).

4
docs/frontend.md

@ -1,3 +1,7 @@
---
nav_order: 2
---
# Frontend
## GOV.UK Design System components

17
docs/index.html

@ -1,17 +0,0 @@
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="UTF-8">
<meta name="viewport" content="width=device-width, initial-scale=1">
<link rel="stylesheet" type="text/css" href="https://cdnjs.cloudflare.com/ajax/libs/swagger-ui/4.12.0/swagger-ui.css">
<title>OpenAPI DLUHC CORE Data Collection</title>
<body><div id="openapi"><script src="https://cdnjs.cloudflare.com/ajax/libs/swagger-ui/4.12.0/swagger-ui-bundle.min.js"></script>
<script>
window.onload = function () {
const ui = SwaggerUIBundle({
url: "api/DLUHC-CORE-Data.v1.json",
dom_id: "#openapi"
})
}
</script>
</body>

67
docs/index.md

@ -0,0 +1,67 @@
---
nav_order: 1
---
# Overview
All lettings and sales of social housing in England need to be logged with the Department for Levelling Up, Housing and Communities (DLUHC). This is done by data providing organisations: Local Authorities and Private Registered Providers (PRPs, i.e. housing associations).
Data is collected via a form that runs on an annual data collection window basis. Form changes are made annually to add new questions, remove any that are no longer needed, or adjust wording or answer options etc.
Each data collection window runs from 1 April to 1 April the following year (plus an extra 3 months to allow for any late submissions). This means that between April and June, 2 collection windows are open simultaneously and logs can be submitted for either.
ADD (Analytics & Data Directorate) statisticians are the other primary users of the service. The data collected is transferred to DLUHC’s consolidated data store (CDS) via nightly XML exports to an S3 bucket. CDS ingests and transforms this data, ultimately storing it in an MS SQL database and exposing it to analysts and statisticians via Amazon Workspaces.
![Diagram of the CORE system architecture](../images/architecture.drawio.png)
## Users
External data providing organisations have 2 main user types:
- **Data coordinators** are administrators for their organisation, but may also complete logs
- **Data providers** complete the logs
Additionally there are data protection officers (DPO). For some organisations this is a separate role, but in our codebase this is modelled as an attribute of a user (i.e. a data coordinator or provider can additionally be a DPO). They are responsible for ensuring the organisation has signed the data sharing agreement.
There are also 2 internal user types:
- **Customer support:** can administrate all organisations
- **Statisticians:** primary consumers of the collected data
## Organisations
There are 2 types of organisation:
- An **owning organisation** owns housing stock. It may manage the allocation of people in and out of its accommodation, or contract this function out to managing agents.
- A **managing organisation** (or managing agent) is responsible for the allocation of people in and out of accommodation, and/or responsible for the services provided to support those people in the accommodation (in the case of supported housing).
### Relationships between organisations
Organisations that own stock can contract out the management of that stock to another organisation. This relationship is often referred to as a parent/child relationship.
This is a useful analogy as a parent can have multiple children, and a child can have many parents. A child organisation can also be a parent, and a parent organisation can also be a child organisation:
![Organisational relationships](../images/organisational_relationships.png)
### User permissions within organisations
The case logs that a user can see depend on their role:
- Customer support users can access any case log
- Data coordinators can access any case log for which their organisation is ultimately responsible, meaning they can see logs managed by a child organisation
- Data providers can only access case logs that their organisation manages (or directly owns)
Taking the relationships from the above diagram, and looking at which logs each user can access:
![User log access permissions](../images/user_log_permissions.png)
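The role rules above could be sketched in Ruby roughly as follows. This is an illustration only: the `Log` and `User` structs, the role symbols and `can_access?` are hypothetical names rather than the application's models. It assumes that a log an organisation owns and manages itself has the same owning and managing organisation, and that a data coordinator can also see the logs a data provider in the same organisation can see.

```ruby
# Hypothetical, simplified models for illustration only
Log = Struct.new(:owning_organisation, :managing_organisation)
User = Struct.new(:role, :organisation)

def can_access?(user, log)
  case user.role
  when :support
    # Customer support users can access any case log
    true
  when :data_coordinator
    # Logs the organisation owns (including those managed by a child
    # organisation), plus logs it manages itself (assumption)
    [log.owning_organisation, log.managing_organisation].include?(user.organisation)
  when :data_provider
    # Only logs the organisation manages; a directly owned log is
    # assumed to list the owner as its managing organisation too
    log.managing_organisation == user.organisation
  else
    false
  end
end

log = Log.new("Parent", "Child")
can_access?(User.new(:data_coordinator, "Parent"), log) # => true
can_access?(User.new(:data_provider, "Parent"), log)    # => false
```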
## Supported housing schemes
A supported housing scheme (or service) provides shared or self-contained housing for a particular client group, for example younger or vulnerable people. A scheme can be run at multiple locations, and a single location may contain multiple units (for example bedrooms in shared houses or a bungalow with 3 bedrooms).
Logs for supported housing will share a number of similar characteristics at this location. Additional data also needs to be collected specifically regarding the supported housing scheme, such as the type of client groups served and type of support provided.
Asking these questions would require data inputters to re-enter the same information repeatedly and answer more questions than those asked for general needs lettings. Schemes exist in CORE to reduce this burden, and effectively act as predefined answer sets.

6
docs/infrastructure.md

@ -1,3 +1,7 @@
---
nav_order: 5
---
# Infrastructure
## Deployment
@ -26,7 +30,7 @@ This application is running on [GOV.UK PaaS](https://www.cloud.service.gov.uk/).
cf push dluhc-core --strategy rolling
```
This will use the [manifest file](staging_manifest.yml)
This will use the [manifest file](https://github.com/communitiesuk/submit-social-housing-lettings-and-sales-data/blob/main/manifest.yml)
Once the app is deployed:

6
docs/monitoring.md

@ -1,6 +1,10 @@
---
nav_order: 6
---
# Monitoring
We use self-hosted Prometheus and Grafana for monitoring infrastructure metrics. These are run in a dedicated Gov PaaS space called "monitoring" and are deployed as Docker images using Github action pipelines. The repository for these and more information is here: [dluhc-data-collection-monitoring](https://github.com/communitiesuk/dluhc-data-collection-monitoring).
We use self-hosted Prometheus and Grafana for monitoring infrastructure metrics. These are run in a dedicated Gov PaaS space called "monitoring" and are deployed as Docker images using GitHub action pipelines. The repository for these and more information is here: [dluhc-data-collection-monitoring](https://github.com/communitiesuk/dluhc-data-collection-monitoring).
## Performance monitoring and alerting

25
docs/organisations.md

@ -1,25 +0,0 @@
# Organisational relationships
## Definitions
- **Stock owning organisation**: An organisation that owns housing stock. It may manage the allocation of people in and out of their accommodation, or it may contract this out to managing agents.
- **Managing agent**: In scenarios where one organisation owns stock and another organisation is contracted to manage the stock and tenants, the latter organisation is often called a ‘managing agent’. Managing agents are responsible for the allocation of people in and out of the accommodation, and/or responsible for the services provided to support those people in the accommodation (in the case of supported housing).
## Permissions
Organisations that own stock can contract out the management of that stock to another organisation. This relationship is often referred to as a parent/child relationship. This is a useful analogy as a parent can have multiple children, and a child can have many parents. A child organisation can also be a parent, and a parent organisation can also be a child organisation:
![Organisational relationships](images/organisational_relationships.png)
The case logs that a user can see depends on their role:
- Customer support users can access any case log
- Data coordinators can access any case log for which the organisation they work for is ultimately responsible for, meaning they can see logs managed by a child organisation
- Data providers can only access case logs for which their organisation manages (or directly owns)
Taking the relationships from the above diagram, and looking at which logs each user can access:
![User log access permissions](images/user_log_permissions.png)

7
docs/schemes.md

@ -1,7 +0,0 @@
# Supported housing schemes
A supported housing scheme (or service) provides shared or self-contained housing for a particular client group, for example younger or vulnerable people. A scheme can be run at multiple locations, and a single location may contain multiple units (for example bedrooms in shared houses or a bungalow with 3 bedrooms).
Logs for supported housing will share a number of similar characteristics at this location. Additional data also needs to be collected specifically regarding the supported housing scheme, such as the type of client groups served and type of support provided.
Asking these questions would require data inputters to re-enter the same information repeatedly and answer more questions than those asked for general needs lettings. Schemes exist in CORE to reduce this burden, and effectively act as predefined answer sets.

5
docs/service_overview.md

@ -1,5 +0,0 @@
# Service overview
All lettings and and sales of social housing in England need to be logged with the Department for levelling up, housing and communities (DLUHC). This is done by Local Authorities and Housing Associations, who are the primary users of this service. Data is collected via a form that runs on an annual data collection window basis. Form changes are made annually to add new questions, remove any that are no longer needed, or adjust wording or answer options etc. Each data collection window runs from 1st April to 1st April + an extra 3 months to allow for any late submissions, meaning that between April and June, two collection windows are open simultaneously and logs can be submitted for either.
ADD (Analytics & Data Directorate) statisticians are the other primary users of the service. The data collected is transferred to DLUHCs data warehouse (CDS - consolidated data store), via nightly exports to XML which are transferred to S3 and ingested from there. CDS ingests and transforms the data, ultimately storing it in a MS SQL database and exposing it to analysts and statisticians via Amazon Workspaces.

10
docs/developer_setup.md → docs/setup.md

@ -1,4 +1,8 @@
# Developing locally on host machine
---
nav_order: 2
---
# Local development
The most common way to run a development version of the application is run with local dependencies.
@ -8,7 +12,7 @@ Dependencies:
- [Rails](https://rubyonrails.org/)
- [PostgreSQL](https://www.postgresql.org/)
- [NodeJS](https://nodejs.org/en/)
- [Gecko driver](https://github.com/mozilla/geckodriver/releases) [for running Selenium tests]
- [Gecko driver](https://github.com/mozilla/geckodriver/releases) (for running Selenium tests)
We recommend using [RBenv](https://github.com/rbenv/rbenv) to manage Ruby versions.
@ -34,7 +38,7 @@ We recommend using [RBenv](https://github.com/rbenv/rbenv) to manage Ruby versio
sudo su - postgres -c "createuser <username> -s -P"
```
3. Install RBenv & Ruby-build
3. Install RBenv and Ruby-build
macOS:

6
docs/testing.md

@ -1,4 +1,8 @@
# Testing strategy
---
nav_order: 4
---
# Testing
- We use [RSpec](https://rspec.info/) and [Capybara](https://teamcapybara.github.io/capybara/)

15
docs/users.md

@ -1,15 +0,0 @@
# User roles
## External users
The primary users of the system are external data providing organisations: Local Authorities and Private Registered Providers (Housing Associations). These have 2 main user types:
- Data coordinators – administrators for their own organisation, can also complete logs
- Data providers – complete the logs
Additionally there are Data Protection Officers (DPO), which for some organisations is a separate role, but in our codebase is modelled as an attribute of the user (i.e. a data coordinator or provider can additionally be a DPO). They are responsible for ensuring the organisation has signed the data sharing agreement.
## Internal users
- Customer support (help desk) – can administrate all organisations
- ADD statisticians – primary consumers of the data collected via CDS/DAP