TEDS Data Dictionary

Postcode-Linked Measures

Introduction

This page describes measures that have been linked to TEDS participant postcodes, and which now form part of the TEDS dataset. The measures that have been linked from public data sources are available for sharing outside TEDS, in the same way as other phenotypic variables. The postcodes themselves are identifiers and are not available for sharing.

The table below summarises the linked measures. Each linkage was carried out using participant postcodes dating from specific years, as shown. The linked measures themselves are typically available in different dated versions, and the table indicates which version was chosen (or available) for linkage.

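| Measure | Date or version of measure | Date of participant postcodes |
| --- | --- | --- |
| UK Census variables | 2001 | 1st Contact study (roughly 1998) |
| UK Census variables | 2001 | 2008 |
| UK Census variables | 2001 | TEDS21 study (roughly 2018) |
| UK Census variables | 2011 | 1st Contact study (roughly 1998) |
| UK Census variables | 2011 | 2008 |
| UK Census variables | 2011 | TEDS21 study (roughly 2018) |
| Atmospheric pollution variables | 2001-2003 | 1st Contact study (roughly 1998) |
| Atmospheric pollution variables | 2008 | 2008 |
| Atmospheric pollution variables | 2018 | TEDS21 study (roughly 2018) |
| English Deprivation Indices | 2015 | 16 Year study (2011-12) |
| English Deprivation Indices | 2019 | TEDS21 study (roughly 2018) |
| Acorn classification | [unknown] | 1st Contact study (roughly 1998) |
| Acorn classification | 2003-2012 | 2008 |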

Data sharing

Most of the postcode-linked variables documented on this page are available for data sharing in the same way as phenotypic variables, via the TEDS data request mechanism.

The exception is the Acorn classification. Versions of this were linked to TEDS postcodes at two time points by a commercial company (CACI). The linked Acorn data were provided to TEDS under a commercial licence, which prevents us from sharing these variables with researchers outside TEDS.

Where the results of analysis of the linked measures are published, the sources of the measures should be appropriately cited.

The linkage of the census and pollution measures was carried out for TEDS by Jessye Maxwell. Before including these postcode-linked measures in a data request, researchers should contact Jessye Maxwell (jessye.maxwell@kcl.ac.uk), who should be considered as a potential collaborator and co-author.

Linkage of both census and pollution data made use of National Statistics data (census or UK Air) together with boundary shapefiles. The boundary shapefiles were used to map postcode areas to source data-coverage areas, and were downloaded from the OS (Ordnance Survey). The National Statistics and OS data were obtained free of charge but under licence, and are subject to copyright. Appropriate copyright statements are as follows (year refers to the dataset used for a given linkage):
Contains National Statistics data © Crown copyright and database right [year]
Contains OS data © Crown copyright [and database right] (year)

Census variables

The census measures, along with the pollution data, were linked to TEDS participant postcodes in 2020 by Jessye Maxwell. When publishing new results from these linked TEDS data, please acknowledge this paper: Maxwell et al (2022), Genetic and Geographical Associations With Six Dimensions of Psychotic Experiences in Adolescence, Schizophrenia Bulletin, DOI: 10.1093/schbul/sbac149 (https://doi.org/10.1093/schbul/sbac149).

The census data were downloaded from the Nomis web site, which is maintained by the University of Durham and operates on behalf of the Office for National Statistics (ONS). Terms and conditions of use of the data are given here: https://www.nomisweb.co.uk/home/terms.asp. Both Nomis and the ONS should be acknowledged in any publication of results from these data.

The raw census data were downloaded from this link: https://www.nomisweb.co.uk/query/select/getdatasetbytheme.asp.

Sets of participant postcodes dating from approximately 1998 (1st Contact study), 2008, and 2018 (TEDS21 study) were linked. The 1998 and 2008 postcodes were for parent addresses, and as such were each linked to one set of data per family. The 2018 postcodes were for twin addresses, and the data are linked to individual twins. However, in most families both twins still had the same family address and postcode, and in these cases the linked data are identical for both twins.

Two sets of census variables were downloaded and linked: from the 2001 census and from the 2011 census. Each of these two census datasets was linked to each of the three sets of TEDS participant postcodes, resulting in 6 linked datasets.

Variable names in the TEDS postcode-linked census datasets generally have this structure: censAAbbbCCdddd. AA is the census year, 01 or 11. bbb (three letters) is an abbreviation of the census theme or category, for example pop for resident population. CC is the linked postcode year, 98 or 08 or 18. dddd (often more than four letters) is an abbreviation of the specific variable's meaning, for example pfemale for percentage female in the population. Census variables linked to TEDS postcodes (pdf) provides documentation of all the available linked census variables in the dataset.
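The naming structure above can be unpacked programmatically, for example when selecting variables by census year or postcode year. The following is a minimal sketch in Python, not part of the TEDS documentation. The function name parse_census_name and the example name cens01pop98pfemale are illustrative only: the example is assembled from the components mentioned above and may not match an actual dataset variable. Because names only "generally" follow this structure, the function returns None for anything that does not fit.

```python
import re
from typing import NamedTuple, Optional

# censAAbbbCCdddd: AA = census year, bbb = theme, CC = postcode year, dddd = item
CENSUS_NAME = re.compile(r"^cens(01|11)([a-z]{3})(98|08|18)(\w+)$")
CENSUS_YEARS = {"01": 2001, "11": 2011}
POSTCODE_YEARS = {"98": 1998, "08": 2008, "18": 2018}


class CensusVar(NamedTuple):
    census_year: int
    theme: str
    postcode_year: int
    item: str


def parse_census_name(name: str) -> Optional[CensusVar]:
    """Split a census variable name into its parts, or return None if it does not fit."""
    match = CENSUS_NAME.match(name)
    if match is None:
        return None
    aa, bbb, cc, dddd = match.groups()
    return CensusVar(CENSUS_YEARS[aa], bbb, POSTCODE_YEARS[cc], dddd)


# Hypothetical name built from the examples in the text above
print(parse_census_name("cens01pop98pfemale"))
```

Names that do not follow the pattern should be looked up in the pdf documentation rather than parsed.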

In the raw data, most of the census variables are counts of people who gave specific responses to questions, within the relevant postcode area, in the given census. Such raw counts are not comparable between postcode areas and hence are not comparable between participants in the TEDS dataset. Therefore, for the purpose of the dataset, raw counts have been converted into proportions (or percentages). For example, the number of female inhabitants (raw variable) has been converted into % female (dataset variable).
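The conversion from a raw count to a percentage is simple arithmetic, sketched below in Python. This illustrates the idea rather than the code actually used for the linkage. The function name and the example numbers are hypothetical, and the handling of an empty area (returning None) is an assumption.

```python
from typing import Optional


def to_percentage(count: int, total: int) -> Optional[float]:
    """Express a raw census count as a percentage of the area's total population."""
    if total <= 0:
        # Assumption: an area with no recorded population yields a missing value
        return None
    return 100.0 * count / total


# Hypothetical area with 510 female inhabitants out of 1000 residents
print(to_percentage(510, 1000))
```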

In some cases, raw response categories were aggregated when deriving the dataset variables. For example, the numbers of inhabitants aged 0-4, 5-7, 8-9, 10-14 and 15 years were aggregated into the total number aged 0-15 then converted to % aged 0-15. There are several reasons for such aggregation. One is data reduction, in order to simplify the dataset and reduce the number of variables. Another is to ensure compatibility between the 2001 and 2011 census datasets, because categories differed for some measures between datasets. Another reason is that some subcategories had very small Ns unlikely to be useful in analysis.
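The age example above can be sketched as follows: the raw age-band counts are summed, then the total is converted to a percentage. This is an illustrative Python sketch only. The dictionary keys, function name and counts are hypothetical, and the raw census files may label the bands differently.

```python
from typing import Dict, Optional

# Raw age bands that are combined into the single 0-15 category
AGE_BANDS_0_15 = ["0-4", "5-7", "8-9", "10-14", "15"]


def percent_aged_0_15(band_counts: Dict[str, int], total_population: int) -> Optional[float]:
    """Aggregate the child age bands and express them as % of the area's population."""
    if total_population <= 0:
        return None
    aged_0_15 = sum(band_counts[band] for band in AGE_BANDS_0_15)
    return 100.0 * aged_0_15 / total_population


# Hypothetical counts for one area; bands outside 0-15 are ignored
counts = {"0-4": 50, "5-7": 30, "8-9": 20, "10-14": 40, "15": 10, "16-17": 25}
print(percent_aged_0_15(counts, 1000))
```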

Pollution variables

The pollution measures, along with the census data, were linked to TEDS participant postcodes in 2020 by Jessye Maxwell. When publishing new results from these linked TEDS data, please acknowledge this paper: Maxwell et al (2022), Genetic and Geographical Associations With Six Dimensions of Psychotic Experiences in Adolesence, Schizophrenia Bulletin, DOI: 10.1093/schbul/sbac149 (https://doi.org/10.1093/schbul/sbac149).

The pollution data were downloaded from UK Air, part of the UK Government's Defra web site (Department for Environment Food & Rural Affairs): https://uk-air.defra.gov.uk/data/pcm-data. They provide measures of common gaseous and particulate atmospheric pollutants in given locations during a given year (see below). UK Air should be acknowledged in any publication of results from these data.

All content is available under the Open Government Licence v3.0, except where otherwise stated on the web site. More details on terms and conditions can be found here: https://uk-air.defra.gov.uk/about-these-pages.

The sets of TEDS postcodes used to link the pollution data were the same as those used to link the census data (see above), dating from roughly 1998, 2008 and 2018.

Three sets of pollution variables were downloaded and linked, resulting in three TEDS datasets:

  • 2001-03 pollution data, linked to 1998 postcodes
  • 2008 pollution data, linked to 2008 postcodes
  • 2018 pollution data, linked to 2018 postcodes

Pollution data were not available from 1998, so instead the nearest available years were chosen for linking to 1998 postcodes; these were 2003 for benzene and ozone, 2002 for sulphur dioxide and for particulates less than 2.5 microns, and 2001 for the other pollutant measures.
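When reporting results, it may help to record which source year lies behind each pollutant for a given set of postcodes. The lookup below is a small Python sketch of the rule just described, not part of the TEDS dataset. The keys follow the variable-name suffixes listed in the table further down this section, and the so2 key is an assumed label for sulphur dioxide.

```python
# Source years of the pollution data linked to the 1998 postcodes.
# Keys are variable-name suffixes; "so2" is an assumed suffix for sulphur dioxide.
SOURCE_YEAR_FOR_1998_POSTCODES = {
    "pm10": 2001,
    "pm25": 2002,
    "no2": 2001,
    "nox": 2001,
    "so2": 2002,
    "benzene": 2003,
    "ozone": 2003,
}


def source_year(suffix: str, postcode_year: int) -> int:
    """Return the year of the pollution data linked to the given set of postcodes."""
    if postcode_year == 1998:
        return SOURCE_YEAR_FOR_1998_POSTCODES[suffix]
    if postcode_year in (2008, 2018):
        # For these postcodes the pollution data date from the same year
        return postcode_year
    raise ValueError(f"No pollution linkage for postcode year {postcode_year}")


print(source_year("benzene", 1998))
```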

The linked pollution variables are as shown in this table:

| Pollutant | Measurement | Units | Variable name (YYYY = postcode year) |
| --- | --- | --- | --- |
| Particulate matter with diameter less than 10 micrometres | Annual mean concentration | TEOM units | pollutionYYYYpm10 |
| Particulate matter with diameter less than 2.5 micrometres | Annual mean concentration | microgrammes per cubic metre | pollutionYYYYpm25 |
| Nitrogen dioxide (NO2) | Annual mean concentration | microgrammes per cubic metre | pollutionYYYYno2 |
| Nitrogen oxides (NOx) | Annual mean concentration | microgrammes per cubic metre | pollutionYYYYnox |
| Sulphur dioxide (SO2) | Annual mean concentration | microgrammes per cubic metre | pollutionYYYYso2 |
| Benzene | Annual mean concentration | microgrammes per cubic metre | pollutionYYYYbenzene |
| Ozone | Number of days in which the maximum 8-hour concentration exceeded 120 microgrammes per cubic metre | Number of days (per year) | pollutionYYYYozone |

Flag variables named pollutionYYYYdata are also in the datasets, to show the presence or absence of linked data for a given year (YYYY=1998, 2008 or 2018).

The 1998 and 2008 postcodes were for parent addresses, and as such were linked to one set of data per family. The 2018 postcodes were for twin addresses, and the data are linked to individual twins. However, in most families both twins still had the same family address and postcode and in these cases the linked data are identical for both twins. Flag variable pollution2018same indicates whether or not the two twins in a pair were linked to the same postcode.
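For scripting purposes, the pollution variable names for a given postcode year can be generated from the naming pattern described in this section. The Python sketch below is illustrative only: the function name is hypothetical, the so2 suffix for sulphur dioxide is an assumption, and the resulting names should be checked against the dataset before use.

```python
from typing import List

# Suffixes for the linked pollutants; "so2" is an assumed suffix for sulphur dioxide
POLLUTANT_SUFFIXES = ["pm10", "pm25", "no2", "nox", "so2", "benzene", "ozone"]
POSTCODE_YEARS = (1998, 2008, 2018)


def pollution_variables(postcode_year: int) -> List[str]:
    """List the expected pollution variable names for one set of postcodes."""
    if postcode_year not in POSTCODE_YEARS:
        raise ValueError(f"No pollution linkage for postcode year {postcode_year}")
    names = [f"pollution{postcode_year}data"]  # flag: linked data present or absent
    names += [f"pollution{postcode_year}{suffix}" for suffix in POLLUTANT_SUFFIXES]
    if postcode_year == 2018:
        # Only the 2018 postcodes were linked per twin, so only 2018 has this flag
        names.append("pollution2018same")
    return names


print(pollution_variables(2018))
```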

Deprivation Index variables

The English Deprivation Indices were linked to TEDS postcodes using the tools at this Government web site: https://www.gov.uk/government/collections/english-indices-of-deprivation. Users of the linked data should refer to this web site for further documentation of the measures and conditions of use. Two TEDS datasets were linked:

  1. 2019 English Indices of Deprivation (http://imd-by-postcode.opendatacommunities.org/imd/2019) were linked to postcodes dating from roughly 2018, when the TEDS21 study data were collected.
  2. 2015 English Indices of Deprivation (http://imd-by-postcode.opendatacommunities.org/imd/2015) were linked to postcodes dating from roughly 2012, when the 16 Year study data were collected.

Variables in the two datasets are described in the table below. Variables are named dep15pc2012zzzz for 2012 postcodes linked to 2015 deprivation indices, or dep19pc2018zzzz for 2018 postcodes linked to 2019 deprivation indices (where zzzz is an abbreviation of the variable's meaning). Hence, in the table below, XX is either 15 or 19 (deprivation index year) while YYYY is either 2012 or 2018 (postcode year) respectively.

| Variable name | Measurement | Values |
| --- | --- | --- |
| depXXpcYYYYdata | Data flag: are the linked data present | 1=Y, 0=N |
| depXXpcYYYYimddec | Multiple deprivation index decile | integers, 1 to 10 |
| depXXpcYYYYincdec | Income decile | integers, 1 to 10 |
| depXXpcYYYYempdec | Employment decile | integers, 1 to 10 |
| depXXpcYYYYedskdec | Education and Skills decile | integers, 1 to 10 |
| depXXpcYYYYhlthdec | Health and Disability decile | integers, 1 to 10 |
| depXXpcYYYYcrimedec | Crime decile | integers, 1 to 10 |
| depXXpcYYYYservdec | Barriers to Housing and Services decile | integers, 1 to 10 |
| depXXpcYYYYenvdec | Living Environment decile | integers, 1 to 10 |
| depXXpcYYYYidacidec | IDACI decile | integers, 1 to 10 |
| depXXpcYYYYidaopidec | IDAOPI decile | integers, 1 to 10 |
| depXXpcYYYYincsc | Income score | decimals, 0 to 1 |
| depXXpcYYYYempsc | Employment score | decimals, 0 to 1 |
| depXXpcYYYYidacisc | IDACI score | decimals, 0 to 1 |
| depXXpcYYYYidaopisc | IDAOPI score | decimals, 0 to 1 |

The 2012 postcodes were for parent addresses, and as such were linked to one set of data per family. The 2018 postcodes were for twin addresses, and the data are linked to individual twins. However, in most families both twins still had the same family address and postcode and in these cases the linked data are identical for both twins. Flag variable dep19pc2018same indicates whether or not the two twins in a pair were linked to the same postcode.
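The deprivation variable names and value ranges described in this section lend themselves to a simple consistency check. The Python sketch below is an illustration under stated assumptions, not TEDS code. The function names are hypothetical, and the range check relies on the observation that decile variables end in "dec" and score variables end in "sc". Only the two index and postcode year pairings that were actually linked are accepted.

```python
import re
from typing import Optional, Tuple

DEP_NAME = re.compile(r"^dep(15|19)pc(2012|2018)([a-z]+)$")
# Only these (index year, postcode year) pairings were linked
VALID_PAIRS = {("15", "2012"), ("19", "2018")}


def parse_dep_name(name: str) -> Optional[Tuple[int, int, str]]:
    """Return (index year, postcode year, suffix), or None if the name does not fit."""
    match = DEP_NAME.match(name)
    if match is None:
        return None
    xx, yyyy, suffix = match.groups()
    if (xx, yyyy) not in VALID_PAIRS:
        return None
    return 2000 + int(xx), int(yyyy), suffix


def value_is_plausible(suffix: str, value: float) -> bool:
    """Check a value against the ranges given in the table of deprivation variables."""
    if suffix.endswith("dec"):
        return value in range(1, 11)  # deciles: integers 1 to 10
    if suffix.endswith("sc"):
        return 0 <= value <= 1  # scores: decimals 0 to 1
    if suffix == "data":
        return value in (0, 1)  # data flag: 1=Y, 0=N
    return True  # other variables: coding not documented here, so not checked


print(parse_dep_name("dep19pc2018imddec"))
```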

Acorn codes

The Acorn code data were linked by commercial company CACI (see https://acorn.caci.co.uk/). The data were provided to TEDS under commercial licence. They are not available for sharing with researchers outside of TEDS.

Acorn data were linked at two specific time points: firstly, around the time of the 1st Contact study (roughly 1998); secondly, in 2008. On both occasions, the linkage was carried out for all available TEDS family UK postcodes. In both cases, these were family postcodes and so the data are identical for both twins of any given pair.

The Acorn coding differs between these time points, as outlined below. The version of Acorn coding used at 1st Contact is now unknown. The version used in 2008 was the 2003-2012 Acorn classification. Historical documentation of these versions is no longer available. However, in both cases, the codes have increasing integer values starting from 1, where 1 is the highest-SES category and higher values indicate lower SES.

| Linkage year | Variable | Meaning | Values |
| --- | --- | --- | --- |
| 1st Contact (1998) | acorn1data | Data flag | 1=Y, 0=N |
| 1st Contact (1998) | acorn1type | Acorn type code | 1 to 54 |
| 1st Contact (1998) | acorn1category | Acorn category code | 1 to 6 |
| 2008 | acorn08data | Data flag | 1=Y, 0=N |
| 2008 | acorn08type | Acorn type code | 1 to 56 |
| 2008 | acorn08category | Acorn category code | 1 to 5 |
| 2008 | acorn08group | Acorn group code | 1 to 17 (labelled with letters A to Q) |

The table below shows the correspondence of Acorn "category", "type" and "group" codes at the two linkage time points, compared with the current Acorn version (not linked in TEDS). For documentation of the current (2014) Acorn coding, see the published Acorn User Guide: https://acorn.caci.co.uk/downloads/Acorn-User-guide.pdf.

| Category code | Corresponding type codes (1998) | Corresponding type codes (2008) | Corresponding group codes (2008) | Corresponding type and group codes and descriptions (current) |
| --- | --- | --- | --- | --- |
| 1 | 1-9 | 1-12 | 1-3 (A-C) | 1-13: "Affluent achievers" (A-C) |
| 2 | 10-15 | 13-23 | 4-6 (D-F) | 14-20: "Rising prosperity" (D-E) |
| 3 | 16-25 | 24-36 | 7-10 (G-J) | 21-33: "Comfortable communities" (F-J) |
| 4 | 26-32 | 37-43 | 11-13 (K-M) | 34-48: "Financially stretched" (K-N) |
| 5 | 33-38 | 44-56 | 14-17 (N-Q) | 49-59: "Urban adversity" (O-Q) |
| 6 | 39-54 | - | - | 60-62: "Not private households" (R) |

These comparisons show an approximate correspondence, but with varying numbers of categories, between the 3 Acorn versions. In all versions, higher numbered categories indicate lower levels of SES.
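The correspondence between type codes and category codes at the two linkage time points can be expressed as a range lookup, which may be useful for checking that the type and category variables agree. The Python sketch below copies the ranges from the table above. The function name and the "1998" and "2008" keys are illustrative choices, and the dataset already contains the category variables, so this is a consistency aid rather than a derivation.

```python
from typing import Optional

# Type-code ranges per category, copied from the correspondence table above
ACORN_TYPE_RANGES = {
    "1998": {1: (1, 9), 2: (10, 15), 3: (16, 25), 4: (26, 32), 5: (33, 38), 6: (39, 54)},
    "2008": {1: (1, 12), 2: (13, 23), 3: (24, 36), 4: (37, 43), 5: (44, 56)},
}


def category_for_type(linkage: str, type_code: int) -> Optional[int]:
    """Return the Acorn category implied by a type code, or None if out of range."""
    for category, (low, high) in ACORN_TYPE_RANGES[linkage].items():
        if low <= type_code <= high:
            return category
    return None


# Type 10 falls in category 2 at 1st Contact but in category 1 in 2008
print(category_for_type("1998", 10), category_for_type("2008", 10))
```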

Limitations of the data

The postcode-linked variables do not directly describe participants. Instead, they measure some aggregated environmental property of the geographical area in which the participant's postcode is or was located.

Participant postcodes at each date originated from the TEDS admin system in which contact details have always been recorded. At any given time, this is an imperfect record of postcodes, because participants generally do not immediately inform TEDS when their addresses change, and a change of address is often discovered only when a TEDS mailing is returned undelivered. Where possible, postcodes known to be incorrect were removed prior to linkage. Overseas (non-UK) postcodes were also removed.

Participant contact details, including postcodes, are permanently deleted when a participant withdraws from TEDS. Furthermore, the TEDS admin system only records each participant's current address and postcode, and does not contain a log of every past or changed postcode. Therefore, the postcodes used for linkage at a given time represent a snapshot of current contact details, and it generally is not possible to recreate historical sets of postcodes used for linkage.

For many participants, whose addresses have rarely if ever changed, the postcode has remained unchanged across different linkage times. Therefore, the linked measures for postcodes at different dates may be very similar and may show only small longitudinal changes.

Until the TEDS twins became adults, all contact was made through the parent of each twin pair, and each linked postcode was the same for the two twins of each pair. Hence, for twins as children, the postcode-linked data are not twin-specific. The exceptions in these datasets are the postcodes from the time of the TEDS21 study (roughly 2018).

At the time of the TEDS21 study, TEDS twins were adults. By this time, many twins had moved away from their parents' homes and had their own postcodes. However, even at this time, at least 85% of twin pairs were still recorded as living at their parents' addresses. In many cases, where participants had not made contact with TEDS for several years, it was assumed that the twins still lived at their parents' addresses, when in fact this may not have been the case for many twins. Hence, while the TEDS21 (2018) postcodes were linked for individual twins, in many cases the data are the same for the two twins of a pair.

Because the linked data are identical for all twin pairs as children, and for a large majority of twin pairs as adults, such data may not be appropriate for twin modelling. For this reason, the variables have not been double-entered.

Each postcode-linked measure is available in the TEDS data only at the dates given on this page. The measures relate to a set of participant postcodes that were accessed at a specific point in time, for example postcodes that were current at the time of the TEDS21 study. The postcodes were then linked to measures that are also time-specific, for example census variables dating from either 2001 or 2011. These linkage "snapshots", as described on this page, are not likely to be repeated at other dates.