Created 3 years ago, updated 3 months ago

This house price per square metre dataset was created through address-based matching between the Land Registry’s Price Paid Data (LR-PPD) and property size information from the Domestic Energy Performance Certificates (EPC) data published by the Department for Levelling Up, Housing and Communities (DLUHC, formerly MHCLG). Details of the data linkage are published in UCL Open: Environment, and the related linkage code is available via the UK Data Service ReShare repository.

During this data linkage process, transactions assigned to category B (Additional Price Paid entry) and transactions of the Other property type are removed. Here we publish our latest limited attribute version of the uncorrected house price per square metre dataset for England and Wales, built from the LR-PPD data (1/1/1995-26/2/2021) and Domestic EPCs data (the sixth version, up to 20/9/2020), both downloaded on 1/4/2021 for non-commercial purposes. This uncorrected version of the house price per square metre dataset records over 18 million transactions in England and Wales since 1995, with 16 variables. Unlike in our published article, in this uncorrected version we have not removed transactions with improbable price per square metre values, i.e. where either the transaction price or the total floor area is null, 0 or too low to be realistic. This uncorrected version of the data offers the most flexibility for researchers.
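As a rough illustration of what removing improbable values could look like, the sketch below computes price per square metre and drops rows where the price or floor area is null, 0 or below a cut-off. The field names (price, tfarea) and both thresholds are assumptions made for this example; they are not the cut-offs used in the published article, which users should take from the authors' cleaning code.

```python
from typing import Optional

# Hypothetical thresholds chosen for illustration only.
MIN_PRICE = 1_000        # pounds
MIN_FLOOR_AREA = 10.0    # square metres


def price_per_sqm(price: Optional[float], floor_area: Optional[float]) -> Optional[float]:
    """Return price per square metre, or None when either input is unusable."""
    if price is None or floor_area is None:
        return None
    # Values of 0 also fall below the minimums, so they are rejected here.
    if price < MIN_PRICE or floor_area < MIN_FLOOR_AREA:
        return None
    return price / floor_area


# Made-up records, not taken from the dataset; field names are assumed.
records = [
    {"price": 250_000, "tfarea": 100.0},  # plausible
    {"price": 250_000, "tfarea": 0.0},    # zero floor area
    {"price": None, "tfarea": 85.0},      # missing price
    {"price": 180_000, "tfarea": 1.0},    # floor area too low
]

# Keep only rows for which a price per square metre can be computed.
cleaned = [r for r in records if price_per_sqm(r["price"], r["tfarea"]) is not None]
```

Only the first record survives with these thresholds; researchers would substitute limits suited to their own aims.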

We offer technical validation and data cleaning code via the UKDA ReShare repository to help users evaluate the representation of the linked data for a given time period. The data cleaning code shows our methods for cleaning up unlikely floor size records before using this data in analysis. Users can create their own rules and undertake this clean-up process based on their own experience and research aims.

This limited attribute version is published by local authority (2021 version). Details of the 16 variables are described in the explanation file. The National Statistics Postcode Lookup (NSPL, May 2021 version) is used to assign the local authority unit, so that users can produce area-based statistics. Users can account for historical changes in local authority boundaries by choosing appropriate aggregations using, for instance, the ONSPD and the postcode variable in our dataset.
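One possible area-based statistic is the median price per square metre in each local authority. The sketch below is only an illustration: the field names (lad, priceper) and the area codes are hypothetical, and the real variable names should be taken from the explanation file.

```python
from collections import defaultdict
from statistics import median


def median_by_area(rows, area_key="lad", value_key="priceper"):
    """Group rows by area and return the median value for each area."""
    groups = defaultdict(list)
    for row in rows:
        value = row.get(value_key)
        if value is not None:  # skip rows with no usable value
            groups[row[area_key]].append(value)
    return {area: median(values) for area, values in groups.items()}


# Made-up rows with placeholder area codes.
rows = [
    {"lad": "LA_A", "priceper": 9000.0},
    {"lad": "LA_A", "priceper": 11000.0},
    {"lad": "LA_B", "priceper": 2500.0},
    {"lad": "LA_B", "priceper": None},
]
area_medians = median_by_area(rows)
```

The same function could group on a different key, for example a historical boundary code attached through the postcode variable.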

An extended version of this dataset containing additional variables is available from the UK Data Service ReShare service. Users can directly access this full version of the dataset (tranall_link_01042021.zip) via the following link: https://reshare.ukdataservice.ac.uk/855033/ . The accompanying LR-PPD and EPC data are also supplied through the ReShare service. Users who would like to attach their own additional variables from the LR-PPD data are advised to use the transactionid variable to link to the LR-PPD (LRPPD_01042021.zip). Users who would like to attach additional variables from the EPC data are advised to use the id variable to link to the sixth version of the Domestic EPCs (epc6_id.zip).
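Attaching extra LR-PPD variables through transactionid amounts to a keyed lookup. The sketch below assumes both files have already been read into lists of dictionaries; the function name and the duration field are hypothetical, and only transactionid comes from the description above.

```python
def attach_lrppd_fields(linked_rows, lrppd_rows, fields):
    """Left-join selected LR-PPD fields onto linked rows via transactionid."""
    lookup = {row["transactionid"]: row for row in lrppd_rows}
    merged = []
    for row in linked_rows:
        source = lookup.get(row["transactionid"], {})
        # Fields with no matching LR-PPD record are filled with None.
        extra = {field: source.get(field) for field in fields}
        merged.append({**row, **extra})
    return merged


# Made-up identifiers and values for illustration.
linked = [
    {"transactionid": "T1", "priceper": 2500.0},
    {"transactionid": "T2", "priceper": 3100.0},
]
lrppd = [{"transactionid": "T1", "duration": "F"}]

result = attach_lrppd_fields(linked, lrppd, ["duration"])
```

The same pattern would apply to EPC variables with the id variable as the key, provided both files come from the same publication.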

The 2023 update

In 2023, a third updated version of the house price per square metre dataset was published on the London Datastore (hpm_la_2023.zip). This new version extends the data coverage to 2023, but keeps the same publication format as the first publication. The new version is the linked result based on LR-PPD data (1/1/1995-31/10/2023) and Domestic EPCs data (ending 31/10/2023) downloaded on 2/12/2023 for non-commercial purposes. This uncorrected version of the house price per square metre dataset records almost 21 million transactions in England and Wales since 1995.

The match rate of this linked data varies over time, so we suggest that users choose the time coverage they need based on their own research. The id variable has been newly generated for this new version of the Domestic EPC dataset published by DLUHC at the end of 2023, so it is not the same as the id variable in our previous publications (hpm_la.zip or hpm_la_2022.zip) and should not be used to link between the new and the previous publications.

The National Statistics Postcode Lookup (NSPL, November 2023 version) is used to assign the local authority unit for this third publication. An extended version of this dataset containing additional variables from LR-PPD and Domestic EPCs will be available in due course from the UK Data Service ReShare service.

The 2022 update

In 2022, a second updated version of the house price per square metre dataset was published on the London Datastore (hpm_la_2022.zip). This new version extends the data coverage to 2022, but keeps the same publication format as the first publication. The new version is the linked result based on LR-PPD data (1/1/1995-27/6/2022) and Domestic EPCs data (the twelfth version, ending 30/6/2022) downloaded on 13/8/2022 for non-commercial purposes. This uncorrected version of the house price per square metre dataset records almost 20 million transactions in England and Wales since 1995.

The match rate of this linked data is lower before 2008, so we suggest that users choose the time coverage they need based on their own research. The id variable has been newly generated for this new data from the twelfth version of the Domestic EPC dataset published by DLUHC, so it is not the same as the id variable in our previous publication (hpm_la.zip) and should not be used to link between the new and the previous publication.
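When choosing a time coverage, one simple check is the share of LR-PPD transactions in each year that appear in the linked data. The sketch below assumes the transaction year has already been extracted from both files; the function name, the sample years and the 0.7 cut-off are all hypothetical.

```python
from collections import Counter


def match_rate_by_year(linked_years, all_years):
    """Return linked/total transaction counts per year as a fraction."""
    linked = Counter(linked_years)
    total = Counter(all_years)
    return {year: linked[year] / total[year] for year in sorted(total)}


# Made-up years: one per transaction in each file.
rates = match_rate_by_year(
    linked_years=[2005, 2010, 2010, 2010],
    all_years=[2005, 2005, 2010, 2010, 2010, 2010],
)

# Keep only years whose match rate meets an illustrative cut-off.
usable_years = [year for year, rate in rates.items() if rate >= 0.7]
```

A threshold like this is a judgment call; the technical validation code on ReShare is the better guide to how representative each period is.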

The National Statistics Postcode Lookup (NSPL, August 2022 version) is used to assign the local authority unit for this second publication. An extended version of this dataset containing additional variables from LR-PPD and Domestic EPCs is available on the UK Data Service (https://reshare.ukdataservice.ac.uk/856204/).

Warning: Large File Size
