@ -12,4 +12,4 @@ In order to use the app as an API, you will need to configure requests to the AP
- `Content-Type = application/json`
- `Content-Type = application/json`
- `Accept = application/json` N.B. If you use `*/*` instead, the request won't be recognised as an API request
- `Accept = application/json` N.B. If you use `*/*` instead, the request won't be recognised as an API request
Currently only the logs controller is configured to accept and authenticate API requests, when the above API environment variables are set.
Currently, only the Logs Controller is configured to accept and authenticate API requests, provided that the specified API environment variables are set. Please note that the API has not been actively maintained for an extended period and may not function as expected. Additionally, the required environment variables are not configured on any of the environments deployed on AWS, rendering API requests to those environments non-functional.
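As a rough illustration of the header configuration above, the following sketch builds (but does not send) a request with Ruby's standard `net/http`. The host, the `/logs` path, the credentials and the use of HTTP Basic Auth are all assumptions made for the example, as is sending the media type in the standard `Accept` header; none of them are confirmed by this document.

```ruby
require "net/http"
require "uri"

# Builds a GET request carrying the JSON headers described above.
# Assumption: the API authenticates with HTTP Basic Auth.
def build_api_request(url, user:, key:)
  request = Net::HTTP::Get.new(URI(url))
  request.basic_auth(user, key)
  request["Content-Type"] = "application/json"
  # Net::HTTP defaults to "*/*", which would not be treated as an API request.
  request["Accept"] = "application/json"
  request
end

# Hypothetical URL and credentials, for illustration only.
request = build_api_request(
  "https://example.test/logs",
  user: "example-user",
  key: "example-key",
)
puts request["Accept"]
```

Sending the request would additionally need `Net::HTTP.start` with `use_ssl: true` against a real environment where the API environment variables are set.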
All data collected by the application needs to be exported to the Consolidated Data Store (CDS) which is a data warehouse based on MS SQL running in the DAP (Data Analytics Platform).
All data collected by the application needs to be exported to the Consolidated Data Store (CDS) which is a data warehouse based on MS SQL running in the DAP (Data Analytics Platform).
This is done via XML exports saved in an S3 bucket located in the DAP VPC using dedicated credentials shared out of band. The data mapping for this export can be found in `app/services/exports/lettings_log_export_service.rb`
This is done via XML exports saved in an S3 bucket.
We currently export lettings logs, users and organisations.
The data mapping for these exports can be found in:
Initially the application database field names and field types were chosen to match the existing CDS data as closely as possible to minimise the amount of transformation needed. This has led to a less than optimal data model though and increasingly we should look to transform at the mapping layer where beneficial for our application.
Initially the application database field names and field types were chosen to match the existing CDS data as closely as possible to minimise the amount of transformation needed. This has led to a less than optimal data model though and increasingly we should look to transform at the mapping layer where beneficial for our application.
We have a cron job triggering the export service daily at 5am.
We have a cron job triggering the export service daily at 5am.
The S3 bucket is located in the DAP VPC rather than the application VPC as DAP runs in an AWS account directly so access to the S3 bucket can be restricted to only the IPs used by the application. This is not possible the other way around as [Gov PaaS does not support restricting S3 access by IP](https://github.com/alphagov/paas-roadmap/issues/107).
## Other options previously considered
- CDC replication using a managed service such as [AWS DMS](https://aws.amazon.com/dms/)
- Would require VPC peering which [Gov PaaS does not currently support](https://github.com/alphagov/paas-roadmap/issues/105)
- Would require CDS to make changes to their ingestion model
The setup this log section is treated slightly differently from the rest of the form. It is more accurately viewed as providing metadata about the form than as being part of the form itself. It also needs to know far more about the application specific context than other parts of the form such as who the current user is, what organisation they’re part of and what role they have etc.
The setup this log section is treated slightly differently from the rest of the form. It is more accurately viewed as providing metadata about the form than as being part of the form itself. It also needs to know far more about the application specific context than other parts of the form such as who the current user is, what organisation they’re part of and what role they have etc.
As a result it’s not modelled as part of the config but rather as code. It still uses the same [Form Runner](/form/runner) components though.
## Features the Form supports
## Features the Form Config supports
- Defining sections, subsections, pages and questions that fit the GOV.UK task list pattern
- Defining sections, subsections, pages and questions that fit the GOV.UK task list pattern
- Auto-generated routes: URLs are automatically created from dasherized page names
- Auto-generated routes: URLs are automatically created from dasherized page names (ids)
- Data persistence requires a database field to exist which matches the name/id for each question (and answer option for checkbox questions)
- Data persistence requires a database field to exist which matches the name/id for each question (and answer option for checkbox questions)
@ -39,63 +37,84 @@ As a result it’s not modelled as part of the config but rather as code. It sti
- For complex HTML guidance partials can be referenced
- For complex HTML guidance partials can be referenced
## JSON Config
## Form definition
The form for this is driven by a JSON file in `/config/forms/{start_year}_{end_year}.json`
@ -127,47 +146,8 @@ Assumptions made by the format:
Form navigation works by stepping sequentially through every page defined in the JSON form definition for the given subsection. For every page it checks if it has "depends_on" conditions. If it does, it evaluates them to determine whether that page should be shown or not.
Form navigation works by stepping sequentially through every page defined in the JSON form definition for the given subsection. For every page it checks if it has "depends_on" conditions. If it does, it evaluates them to determine whether that page should be shown or not.
In this way we can build up whole branches by having:
We can also define custom `routed_to?` methods on pages for more complex routing logic.
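The navigation rule described above can be sketched roughly as follows. `Page` and `next_page` are hypothetical names, not the application's classes, and the sketch assumes that `depends_on` is a list of condition hashes where a page is shown if any one hash is fully satisfied by the log's answers.

```ruby
# Hypothetical minimal page: an id plus optional depends_on conditions.
Page = Struct.new(:id, :depends_on) do
  # A page with no conditions is always shown. Otherwise it is shown when
  # at least one condition hash matches the log on every key (assumption).
  def routed_to?(log)
    return true if depends_on.nil? || depends_on.empty?

    depends_on.any? do |conditions|
      conditions.all? { |question, value| log[question] == value }
    end
  end
end

# Steps sequentially through the pages after the current one and returns
# the first page that should be shown, or nil if there is none.
def next_page(pages, current_id, log)
  index = pages.index { |page| page.id == current_id }
  return nil if index.nil?

  pages.drop(index + 1).find { |page| page.routed_to?(log) }
end

pages = [
  Page.new("postcode_known", nil),
  Page.new("postcode_full", [{ "postcode_known" => 1 }]),
  Page.new("local_authority", [{ "postcode_known" => 0 }]),
  Page.new("property_type", nil),
]

puts next_page(pages, "postcode_known", { "postcode_known" => 0 }).id
```

With `postcode_known` answered as `0`, the `postcode_full` page is skipped and navigation lands on `local_authority`.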
This will validate the given form definition against the schema in `config/forms/schema/generic.json`.
You can also run:
```bash
rake form_definition:validate_all
```
This will validate all forms in directories `["config/forms", "spec/fixtures/forms"]`
## Form models and definition
## Form models and definition
For information about the form model and related models (section, subsection, page, question) and how these relate to each other see [form definition](/form/definition).
For information about the form model and related models (section, subsection, page, question) and how these relate to each other see [form definition](/form/definition).
## Improvements that could be made
- JSON schema definition could be expanded such that we can better automatically validate that a given config is valid and internally consistent
- Generators could parse a given valid JSON form and generate the required database migrations to ensure all the expected fields exist and are of a compatible type
- The parsed form could be visualised using something like GraphViz to help manually verify the coded config meets requirements
The current system is built around a form definition written in JSON. At the top level every form will expect to have the following attributes:
The current system is built around a form definition constructed from various Form subclasses. At the top level every form will expect to have the following attributes:
- Form type: this is to define whether the form is a lettings form or a sales form. The questions will differ between the types.
- Form type: this is to define whether the form is a lettings form or a sales form. The questions will differ between the types.
- Start date: the start of the collection window for the form, this will usually be in April.
- Start date: the start of the collection window for the form, this will usually be in April.
- End date: the end date of the collection window for the form, this will usually be in July, a year after the start date.
- Submission deadline: the official end date of the collection window for the form, this will usually be in July, a year after the start date.
- New logs end date: the end date for creating any new logs for this form
- Edit end date: the end date for editing any existing logs for this form
- Sections: the sections in the form, this block is where the bulk of the form definition will be.
- Sections: the sections in the form, this block is where the bulk of the form definition will be.
An example of this might look like the following:
```json
{
"form_type": "lettings",
"start_date": "2021-04-01T00:00:00.000+01:00",
"end_date": "2022-07-01T00:00:00.000+01:00",
"sections": {
...
}
}
```
Note that the end date of one form will overlap the start date of another to allow for late submissions. This means that every year there will be a period of time in which two forms are running simultaneously.
Note that the end date of one form will overlap the start date of another to allow for late submissions. This means that every year there will be a period of time in which two forms are running simultaneously.
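To make the overlap concrete, here is a small sketch that lists the forms open on a given day. `CollectionForm` and `open_forms` are hypothetical names; the first form's dates mirror the JSON example above and the second form's dates are invented for illustration.

```ruby
require "date"

# Hypothetical stand-in for a form with a collection window.
CollectionForm = Struct.new(:name, :start_date, :end_date)

# Returns the forms whose window contains the given date.
# Assumption: the window includes the start date and excludes the end date.
def open_forms(forms, on)
  forms.select { |form| form.start_date <= on && on < form.end_date }
end

forms = [
  CollectionForm.new("2021/22", Date.new(2021, 4, 1), Date.new(2022, 7, 1)),
  CollectionForm.new("2022/23", Date.new(2022, 4, 1), Date.new(2023, 7, 1)),
]

# In May 2022 both forms are running simultaneously.
puts open_forms(forms, Date.new(2022, 5, 1)).map(&:name).inspect
```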
A form is split up as follows:
A form is split up as follows:
@ -39,24 +28,24 @@ Rails uses the model, view, controller (MVC) pattern which we follow.
## Form model
## Form model
There is no need to manually initialise a form object as this is handled by the FormHandler class at boot time. If a new form needs to be added then a JSON file containing the form definition should be added to `config/forms` where the FormHandler will be able to locate it and instantiate it.
There is no need to manually initialise a form object as this is handled by the FormHandler class at boot time.
A form has the following attributes:
A form has the following attributes:
- `name`: The name of the form
- `name`: The name of the form
- `setup_sections`: The setup section (this is not defined in the JSON, for more information see this)
- `setup_sections`: The setup section
- `form_definition`: The parsed form JSON
- `form_sections`: The sections passed to form on init
- `form_sections`: The sections found within the form definition JSON
- `type`: The type of form (this is used to indicate if the form is for a sale or a letting)
- `type`: The type of form (this is used to indicate if the form is for a sale or a letting)
- `sections`: The combination of the setup section with those found in the JSON definition
- `sections`: The combination of the setup section with form sections
- `subsections`: The subsections of the form (these live under the sections)
- `subsections`: The subsections of the form (these live under the sections)
- `pages`: The pages of the form (these live under the subsections)
- `pages`: The pages of the form (these live under the subsections)
- `questions`: The questions of the form (these live under the pages)
- `questions`: The questions of the form (these live under the pages)
- `start_date`: The start date of the form, in ISO 8601 format
- `start_date`: The start date of the form, in ISO 8601 format
- `end_date`: The end date of the form, in ISO 8601 format
- `submission_deadline`: The official end date of the form, in ISO 8601 format
- `new_logs_end_date`: The new logs end date of the form, in ISO 8601 format
- `edit_end_date`: The edit end date of the form, in ISO 8601 format
Each form has an `end_date` which for JSON forms is defined in the form definition JSON file and for code defined forms it is set to 1st July, 1 year after the start year.
Logs with a form that has `edit_end_date` in the past can no longer be edited through the UI.
Logs with a form that has `end_date` in the past can no longer be edited through the UI.
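A minimal sketch of how the end dates might gate UI actions is shown below. `FormDates` is a hypothetical struct, the date values are invented, and treating the end date itself as still open is an assumption rather than the application's confirmed behaviour.

```ruby
require "date"

# Hypothetical stand-in for a form's date attributes. Only the attribute
# names come from the list above; the values below are illustrative.
FormDates = Struct.new(:new_logs_end_date, :edit_end_date, keyword_init: true) do
  # Assumption: an end date is "in the past" once today is after it.
  def new_logs_allowed?(today)
    today <= new_logs_end_date
  end

  def editable?(today)
    today <= edit_end_date
  end
end

form = FormDates.new(
  new_logs_end_date: Date.new(2024, 7, 1),
  edit_end_date: Date.new(2024, 9, 1),
)

# Between the two dates, existing logs can be edited but no new ones created.
puts form.new_logs_allowed?(Date.new(2024, 8, 1))
puts form.editable?(Date.new(2024, 8, 1))
```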
@ -13,18 +13,12 @@ A paper form is produced for guidance and to help data providers collect the dat
Data is accepted for a collection window for up to 3 months after it’s finished to allow for late data submission. This means that between April and July 2 versions of the form run simultaneously.
Data is accepted for a collection window for up to 3 months after it’s finished to allow for late data submission. This means that between April and July 2 versions of the form run simultaneously.
Other considerations that went into our design are being able to re-use as much of this solution for other data collections, and possibly having the ability to generate the form and/or form changes from a user interface.
Other initial considerations that went into our design are being able to re-use as much of this solution for other data collections, and possibly having the ability to generate the form and/or form changes from a user interface.
We haven’t used micro-services, preferring to deploy a single application but we have modelled the form itself as configuration in the form of a JSON structure that acts as a sort of DSL/form builder for the form.
Each form was historically defined as a JSON configuration, but this has since been replaced with subsection, page and question classes that construct the form in code, due to increased complexity.
The idea is to decouple the code that creates the required routes, controller methods, views etc to display the form from the actual wording of questions or order of pages such that it becomes possible to make changes to the form with little or no code changes.
To allow for easier content changes, the copy for questions has been extracted into translation files. The reasoning for this is the following assumptions:
This should also mean that in the future it could be possible to create an interface that can construct the JSON config, which would open up the ability to make form changes to a wider audience. Doing this fully would require generating and running the necessary migrations for data storage, generating the required ActiveRecord methods to validate the data server side, and generating/updating API endpoints and documentation. All of this is likely to be beyond the scope of initial MVP but could be looked at in the future.
Since initially the JSON config will not create database migrations or ActiveRecord model validations, it will instead assume that these have been correctly created for the config provided. The reasoning for this is the following assumptions:
- The form will be tweaked regularly (amending questions wording, changing the order of questions or the page a question is displayed on)
- The form will be tweaked regularly (amending questions wording, changing the order of questions or the page a question is displayed on)
- The actual data collected will change very infrequently. Time series continuity is very important to ADD (Analysis and Data Directorate) so the actual data collected should stay largely consistent i.e. in general we can change the question wording in ways that makes the intent clearer or easier to understand, but not in ways that would make the data provider give a different answer.
- The actual data collected will change very infrequently. Time series continuity is very important to ADD (Analysis and Data Directorate) so the actual data collected should stay largely consistent i.e. in general we can change the question wording in ways that makes the intent clearer or easier to understand, but not in ways that would make the data provider give a different answer.
A form parser class will parse this config into ruby objects/methods that can be used as an API by the rest of the application, such that we could change the underlying config if needed (for example swap JSON for YAML or for DataBase objects) without needing to change the rest of the application. We’ll call this the Form Runner part of the application.
In the above example the subsection has the id `property_postcode`. This id is used for the url of the web page, but the underscore is replaced with a hash, so the url for this page would be `[environment-url]/logs/[log-id]/property-postcode` e.g. on staging this url might look like the following: `https://dluhc-core-staging.london.cloudapps.digital/logs/1234/property-postcode`.
In the above example the subsection has the id `property_postcode`. This id is used for the url of the web page, but the underscore is replaced with a dash, so the url for this page would be `[environment-url]/logs/[log-id]/property-postcode` e.g. on staging this url might look like the following: `https://staging.submit-social-housing-data.communities.gov.uk/logs/1234/property-postcode`.
The header is optional but if provided is used for the heading displayed on the page.
The header is optional but if provided is used for the heading displayed on the page.
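The id-to-URL rule described above amounts to swapping underscores for dashes. A minimal sketch, where `page_path` is a hypothetical helper rather than a method from the application:

```ruby
# Builds the path for a page of a log by dasherizing the page id.
def page_path(log_id, page_id)
  "/logs/#{log_id}/#{page_id.tr('_', '-')}"
end

puts page_path(1234, "property_postcode") # prints "/logs/1234/property-postcode"
```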
@ -10,25 +10,25 @@ Questions are under the page level of the form definition.
An example question might look something like this:
An example question might look something like this:
```json
"postcode_known": {
  "check_answer_label": "Do you know the property postcode?",
  "header": "Do you know the property’s postcode?",
  "hint_text": "",
  "type": "radio",
  "answer_options": {
    "1": {
      "value": "Yes"
    },
    "0": {
      "value": "No"
    }
  },
  "conditional_for": {
    "postcode_full": [1]
  },
  "hidden_in_check_answers": true
}
```

```
class Form::Sales::Questions::PostcodeKnown < ::Form::Question
  def initialize(id, hsh, page)
    super
    @id = "postcode_known"
    @hint_text = ""
    @header = "Do you know the property postcode?"
    @check_answer_label = "Do you know the property postcode?"
    @type = "radio"
    @answer_options = {
      "1" => { "value" => "Yes" },
      "0" => { "value" => "No" }
    }
    @conditional_for = {
      "postcode_full" => [1]
    }
    @hidden_in_check_answers = true
  end
end
```
In the above example the question has the id `postcode_known`.
In the above example the question has the id `postcode_known`.
@ -45,15 +45,11 @@ The `conditional_for` contains the value needed to be selected by the data input
The `hidden_in_check_answers` attribute is used to hide a value from displaying on the check answers page. You only need to provide this if you want to set it to true in order to hide the value for some reason, e.g. it's one of two questions appearing on a page and the other question is displayed on the check answers page. It's also worth noting that you can declare this as a hash with a `depends_on` key, which can be useful for conditionally displaying values on the check answers page. For example:
The `hidden_in_check_answers` attribute is used to hide a value from displaying on the check answers page. You only need to provide this if you want to set it to true in order to hide the value for some reason, e.g. it's one of two questions appearing on a page and the other question is displayed on the check answers page. It's also worth noting that you can declare this as a hash with a `depends_on` key, which can be useful for conditionally displaying values on the check answers page. For example:
```json
"hidden_in_check_answers": {
  "depends_on": [
    {
      "age6_known": 0
    },
    {
      "age6_known": 1
    }
  ]
}
```

```
@hidden_in_check_answers = {
  "depends_on" => [
    { "age6_known" => 0 },
    { "age6_known" => 1 }
  ]
}
```
@ -62,25 +58,25 @@ Would mean the question the above is attached to would be hidden in the check an
The answer the data inputter provides to some questions allows us to infer the values of other questions we might have asked in the form, allowing us to save the data inputters some time. An example of how this might look is as follows:
The answer the data inputter provides to some questions allows us to infer the values of other questions we might have asked in the form, allowing us to save the data inputters some time. An example of how this might look is as follows:
```json
"postcode_full": {
  "check_answer_label": "Postcode",
  "header": "What is the property’s postcode?",
  "hint_text": "",
  "type": "text",
  "width": 5,
  "inferred_answers": {
    "la": {
      "is_la_inferred": true
    }
  },
  "inferred_check_answers_value": [{
    "condition": {
      "postcode_known": 0
    },
    "value": "Not known"
  }]
}
```

```
class Form::Sales::Questions::PostcodeFull < ::Form::Question
  def initialize(id, hsh, page)
    super
    @id = "postcode_full"
    @hint_text = ""
    @header = "What is the property’s postcode?"
    @check_answer_label = "Postcode"
    @type = "text"
    @width = 5
    @inferred_answers = {
      "la" => { "is_la_inferred" => true }
    }
    @inferred_check_answers_value = [{
      "condition" => { "postcode_known" => 0 },
      "value" => "Not known"
    }]
  end
end
```
In the above example the width is an optional attribute and can be provided for text type questions to determine the width of the text box on the page when the question is displayed to a user (this allows you to match the width of the text box on the page to that of the design for a question).
In the above example the width is an optional attribute and can be provided for text type questions to determine the width of the text box on the page when the question is displayed to a user (this allows you to match the width of the text box on the page to that of the design for a question).
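The `inferred_check_answers_value` rule from the example can be read as "if the condition matches the log, show the fixed value instead of the stored answer". The sketch below illustrates that reading; `check_answers_value` is a hypothetical helper and the matching semantics (every key in `condition` must equal the log's answer) are an assumption.

```ruby
# Returns the value to display on the check answers page for a question.
# Falls back to the stored answer when no inferred rule matches.
def check_answers_value(log, question_id, inferred_rules)
  rule = inferred_rules.find do |candidate|
    candidate["condition"].all? { |key, value| log[key] == value }
  end
  rule ? rule["value"] : log[question_id]
end

rules = [{
  "condition" => { "postcode_known" => 0 },
  "value" => "Not known",
}]

puts check_answers_value({ "postcode_known" => 0 }, "postcode_full", rules)
puts check_answers_value(
  { "postcode_known" => 1, "postcode_full" => "SW1A 1AA" },
  "postcode_full",
  rules,
)
```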
In the above example the subsection has the id `property_information`. The `depends_on` contains the set of conditions that must be met for the section to be accessible to a data provider; in this example the subsection depends on the completion of the setup section/subsection (note that this is a common condition as the answers provided to questions in the setup subsection often have an impact on what questions are asked of the data provider in later subsections of the form).
In the above example the subsection has the id `property_information`. The `depends_on` contains the set of conditions that must be met for the section to be accessible to a data provider; in this example the subsection depends on the completion of the setup section/subsection (note that this is a common condition as the answers provided to questions in the setup subsection often have an impact on what questions are asked of the data provider in later subsections of the form).
The label contains the text that users will see for that subsection in the task list page of a lettings log.
The label contains the text that users will see for that subsection in the task list page of a lettings log.
The pages of the subsection in the example would be `property_postcode` and `property_local_authority`.
The pages of the subsection in the example would be `PropertyPostcode` and `PropertyLocalAuthority`.
Subsections can contain one or more [pages](page).
Subsections can contain one or more [pages](page).