TEDS Data Dictionary

14 Year Study

Introduction

The 14 Year study data were collected in the following ways:

  • Twin web tests.
    Science and cognitive tests completed by twins over the Internet.
  • Parent booklets.
    Parent-reported data relating to the family and to the twins.
  • Twin booklets.
    Twin self-reported data.
  • Parent SLQ questionnaires.
    National Curriculum results, with a language questionnaire.
  • Twin telephone tests (cohort 1 only).
    Language tests, conducted by TEDS staff over the telephone.
  • Teacher questionnaires (cohort 1 only).
    Teacher-reported data relating to the twins.

The measures used in the study are described in full in a separate page.

The web and family booklet data were collected in two broad waves: wave 1 (school cohort 1) in 2007/08, and wave 2 (school cohorts 2, 3 and 4) in 2009/10. The SLQ data were collected year by year for cohorts 1, 2, 3 and 4 in 2008, 2009, 2010 and 2011 respectively. Teacher data, and telephone test data, were collected in wave 1 only.

The initial intention was to collect data from cohort 1 but not necessarily from the later cohorts, because of funding issues. The cohort 1 data collection, in 2007/08, was timed to coincide with the school year in which the twins reached their 14th birthdays, largely for the purpose of collecting teacher ratings. The parent-reported SLQ was initially designed to assess second language acquisition, but by the time of its administration in the summer of 2008 (for cohort 1) it had been adapted to include the end-of-KS3 assessments that schools were required to report to parents. This proved to be a more successful method of gathering teacher NC assessment data than the teacher questionnaire, which had given poor returns in cohort 1.

In the summer of 2009, it was decided to extend the SLQ data collection to cohort 2, and at the same time to ask parents and twins to complete the family booklets. By autumn 2009, decisions had also been made to ask cohort 2 twins to participate in the web study, and then to invite cohort 3 and 4 families to participate in both the booklet and web studies, but via the web only (for cohorts 3 and 4, the questionnaires were administered in a web version, with no paper booklets). These later parts of the study were carried out on a relatively small budget, without voucher rewards and with little use of calling. The use of web questionnaires reduced costs both of postage and of data entry. The teacher study was not continued, because returns had been poor in cohort 1 and because NC assessments were now to be collected by means of the parent-reported SLQ.

The SLQ data collection continued for cohorts 3 and 4 in the summers of 2010 and 2011 respectively, hence close to age 14 for all twins. In the booklet and web studies, however, twin ages varied from 12 to 14.

The sample

The sample for this study included all TEDS families except for the following: (a) those that had withdrawn from TEDS; (b) "inactive" families, who had not returned data in any previous studies; (c) families with address problems, that could not be traced; (d) some medical exclusions and other special cases, considered on a case-by-case basis - as a general rule, families of autistic twins were included, but other severe medical conditions were excluded.

Initial contact (by mail) was made with 11084 of the 13945 families from the original TEDS sample. Hence there were around 2860 families that were not contacted. Of these, roughly 960 had withdrawn; roughly 1500 had address problems and could not be contacted; and the remainder (roughly 400) were medical exclusions or other special cases.
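The figures quoted above can be cross-checked with a few lines of arithmetic. This is only a sketch using the numbers in this section; the variable names are illustrative and are not part of any TEDS file.

```python
# Figures quoted in the text above; variable names are illustrative only.
original_sample = 13945
contacted = 11084

# Exact number of families not contacted.
not_contacted = original_sample - contacted

# Rounded breakdown of the families not contacted, as quoted above.
approx_breakdown = {
    "withdrawn": 960,
    "address problems": 1500,
    "medical exclusions or other special cases": 400,
}
approx_total = sum(approx_breakdown.values())

print(not_contacted)  # 2861
print(approx_total)   # 2860
```

The rounded categories account for the exact difference to within one family, which is consistent with the "around 2860" quoted in the text.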

For the Language telephone tests (wave 1 only), which started at a later date than the booklet and web studies, the sample was further restricted to include only those families in which both twins had completed the language web tests at age 12 (see the 12 Year study for details of these tests). Furthermore, families were excluded from the Language study for the following reasons: (a) telephone problems, that could not be traced; (b) overseas families, because of telephoning difficulties; (c) families that had already opted out of the 14 Year study; (d) various categories of analysis exclusions (medical, perinatal and language exclusions). This left a sub-sample of 650 families for the Language study.

The data returns for the 14 Year study are summarised in a separate page. There are further pages comparing samples and returns for different TEDS studies.

Data collection

Summary table

Web and booklet data were collected in two 'waves', while SLQ data were collected in four school cohorts, as summarised below. Wave 1 comprised school cohort 1; wave 2 comprised school cohorts 2, 3 and 4.

  • Twin birth dates.
    Cohort 1: Jan-94 to Aug-94. Cohort 2: Sep-94 to Aug-95. Cohort 3: Sep-95 to Aug-96. Cohort 4: Sep-96 to Dec-96.
  • Number of families contacted.
    Cohort 1: 2565. Cohort 2: 3950. Cohort 3: 3493. Cohort 4: 1215.
  • Family booklet and SLQ data collection.
    Cohort 1: paper versions only. Cohort 2: paper versions sent initially; logins for on line versions sent as reminder. Cohorts 3 and 4: logins sent for on line versions.
  • Consent forms.
    Cohorts 1 and 2: paper consent forms sent with family booklets; on line consent for web tests. Cohorts 3 and 4: on line consent only, for booklets, SLQ and web tests.
  • Booklet and login mailings.
    Cohort 1: October 2007. Cohort 2: August 2009 (booklets, SLQ); October 2009 (web test logins). Cohorts 3 and 4: September 2009.
  • SLQ mailings.
    Cohort 1: July 2008. Cohort 2: August 2009 (with family booklets). Cohort 3: July 2010. Cohort 4: July 2011.
  • Teacher questionnaires.
    Wave 1: teachers contacted from February 2008. Wave 2: not applicable.
  • Language twin telephone tests.
    Wave 1: families called from April 2008. Wave 2: not applicable.
  • Vouchers.
    Wave 1: used for both the web tests and the language tests. Wave 2: no vouchers sent.
  • Phone calling.
    Wave 1: families called for teacher consent, to encourage web tests, and for the language study. Wave 2: no systematic calling of families.
  • Approx. average twin age (years) when booklet and web data collected.
    14, 13, 12.5 (from the earliest to the latest cohorts).
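The birth-date ranges in the summary above are enough to derive a twin's school cohort and wave. The following is a minimal sketch of that mapping; the function names `school_cohort` and `wave` are hypothetical and do not refer to any variable or routine in the TEDS dataset.

```python
from datetime import date

# Birth-date ranges per school cohort, taken from the summary above.
# Only year and month matter, so each bound is a (year, month) tuple.
COHORT_RANGES = {
    1: ((1994, 1), (1994, 8)),
    2: ((1994, 9), (1995, 8)),
    3: ((1995, 9), (1996, 8)),
    4: ((1996, 9), (1996, 12)),
}


def school_cohort(birth_date: date) -> int:
    """Return the school cohort (1 to 4) for a twin birth date."""
    key = (birth_date.year, birth_date.month)
    for cohort, (start, end) in COHORT_RANGES.items():
        if start <= key <= end:
            return cohort
    raise ValueError(f"birth date {birth_date} is outside the cohort ranges")


def wave(cohort: int) -> int:
    """Cohort 1 was collected in wave 1; cohorts 2, 3 and 4 in wave 2."""
    return 1 if cohort == 1 else 2


print(school_cohort(date(1995, 3, 14)))        # 2
print(wave(school_cohort(date(1995, 3, 14))))  # 2
```

Comparing (year, month) tuples avoids having to work out the last day of each month at the cohort boundaries.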

Data collection was preceded by paper-based pilots of early versions of the Science tests. These pilot studies are not documented in this data dictionary, but are described briefly in the pilot studies page.

Timing of data collection

SLQ data were collected at the end of the school year (July/August) in which twins had reached both the age of 14 and the end of school Key Stage 3. At this point, schools were obliged to report end-of-KS3 teacher assessment to parents (in England and Wales); the SLQ asked parents then to report these assessments to TEDS. Hence, the SLQ was administered each summer from 2008 to 2011 respectively for the four TEDS cohorts. Twin ages were therefore close to 14 for all twins at the time of SLQ data collection. As described below, both paper and on line versions of the SLQ were used, with some variations in administration between cohorts.

For similar reasons, collection of teacher data in cohort 1 was planned to coincide with the school year (September 2007 to June 2008) in which twins reached age 14. Data collection started with the collection of teacher details from parent consent forms, which had been sent with the booklets in October 2007. However, the teacher study was not extended to the later cohorts, partly because of poor returns in cohort 1, and partly because end-of-KS3 assessments were now being collected via the parent SLQ.

The booklet and web studies for cohort 1 were timed to coincide with the start of the teacher study, in October 2007. Families were sent consent forms (asking for teacher details), parent and twin booklets, and login details for the web study.

Before data collection started for cohort 2, a decision had been made to abandon the teacher study but to continue with the SLQ study; initially, it was also decided not to continue with the web study, largely because of a lack of funding for reward vouchers. Therefore, initial invitations were sent in August 2009, including parent and twin booklets, the SLQ, and a consent form (asking for consent in the booklet study but not asking for teacher details).

Soon afterwards, a decision was made to ask cohort 2 families to complete the web tests as well, albeit without a voucher incentive. So a separate mailing of web study login details was made to cohort 2 families in October of the same year.

At around the same time, a further decision was made to include cohorts 3 and 4 in both the web and booklet studies, but without the use of paper booklets: parents and twins were asked to log in to complete an on line version of the questionnaire instead. Hence, families in cohorts 3 and 4 were invited in September 2009, and they were sent login details both for the web study and for the questionnaire study (these were administered on different web sites but with the same parent and twin usernames and passwords). No paper consent form was sent, because on line consents were used both for web and questionnaire participants. Paper booklets were not sent.

Because cohorts 2, 3 and 4 were all invited at roughly the same time (August to October 2009) to participate in the web and questionnaire studies, these three cohorts together are referred to as 'wave 2', while the earlier cohort 1 data collection is referred to as 'wave 1'. In wave 2, twin ages varied from around 12 (cohort 4) to 14 (cohort 2) at the time of data collection; in wave 1, twin ages were close to 14.

In cohort 1, a paper consent form was sent along with the family booklets and the web login details. In the cohort 1 consent form (pdf), parents were asked to consent to (or opt out of) the web and telephone studies; they were asked to supply their current phone numbers and email addresses; and they were also asked for consent to contact the twins' teachers (providing teacher/school contact details). Up to two written reminders were sent to families that did not return the consent forms promptly; these also served as reminders to return the family booklets. As described below, some families were also telephoned for verbal consent for participation in the web and telephone tests.

In cohort 2, a shortened consent form was sent with the family booklets, simply asking families to consent to (or opt out of) the booklet study, and to supply their current phone numbers and email addresses. No consent form was sent with the web test logins in cohort 2 (an on line consent form was used instead). No reminders were sent for unreturned consent forms in cohort 2, and there was no calling (see below).

In cohorts 3 and 4, no consent form was included with the initial mailing. Instead, on line consent forms were used both for the web tests and for the on line questionnaires. The on line consent form for the questionnaires was similar in design to the on line consent form for the web tests.

Family booklets

The child booklet and parent booklet (pdfs) were made available both in paper and on line versions, at various stages of the study. The two versions contained identical items and responses, and the on line version was designed to follow the paper version closely in its appearance. During the course of the 14 Year study, there were changes in the way that the paper and on line versions were used.

In cohort 1, families were sent paper booklets only. These were sent at the same time as the consent form and the web login details. Families that had not returned consent forms and family booklets were sent up to 3 written reminders (the 3rd reminder was for the booklets only, not the consents).

In cohort 2, families were initially sent paper booklets, then later (by way of a reminder) they were sent login details for the on line versions.

In cohorts 3 and 4, families were sent login details for the on line versions, and were not sent paper booklets. In due course (with the 2010 newsletter), a general written reminder was sent to families asking them to complete both the on line questionnaires and the web tests.

Twin web tests

All cohorts of families were asked to attempt the twin web tests. There are separate pages describing the web login and consent, the organisation of the web tests, and the tests themselves (Science, Vocabulary and Ravens).

Voucher rewards, for completing the web tests, were used in wave 1 but not in wave 2. In wave 1, each twin was rewarded with a voucher (to the value of £5) on completion of the 3 web tests. In addition, for every wave 1 family in which the twins had attempted the tests, a £5 voucher was sent to the parents as a way of covering their internet connection costs.

In wave 1, the login details for the web tests were sent along with an information sheet (pdf). These were sent in the same pack as the consent form and family booklets, as described above.

In wave 2, the information sheet was not used; instead, essential information from the information sheet was incorporated into an expanded invitation letter. As described above, in cohort 2 this letter was sent separately from the family booklets and paper consent form. In cohorts 3/4 there were no paper booklets or consent form; however the web invitation letter was further expanded to ask families to complete the on line family questionnaires.

The progress of each family in their web activities was recorded on the web server. For administrative purposes (for example, in order to allocate families to web callers, and in order to organize mailings of vouchers), information had to be transferred from the web server to the TEDS administrative database. This was done by means of a "family status file", which was produced automatically every night by a program running on the web server. The family status file was a plain text file (in csv format), with a row of data for every family. The information recorded for each family included fields to show whether each twin had started or finished each of the web tests, together with dates, times, and other information.
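As an illustration of how a family status file of this kind might be used to organise voucher mailings, the sketch below reads a csv with one row per family and lists the twins who have finished all three web tests. The column names and status values are assumptions made for the example; the actual layout of the TEDS file is not documented on this page.

```python
import csv
import io

# Hypothetical extract of a family status file. The real column names and
# status codes are not documented here, so these are assumptions.
SAMPLE = """family_id,twin1_science,twin1_vocabulary,twin1_ravens,twin2_science,twin2_vocabulary,twin2_ravens
1001,finished,finished,finished,finished,finished,finished
1002,finished,finished,finished,finished,started,not started
1003,started,not started,not started,not started,not started,not started
"""

TESTS = ("science", "vocabulary", "ravens")


def twins_due_vouchers(status_file):
    """Return (family_id, twin_number) pairs for twins who finished all tests."""
    due = []
    for row in csv.DictReader(status_file):
        for twin in (1, 2):
            if all(row[f"twin{twin}_{test}"] == "finished" for test in TESTS):
                due.append((row["family_id"], twin))
    return due


print(twins_due_vouchers(io.StringIO(SAMPLE)))
# [('1001', 1), ('1001', 2), ('1002', 1)]
```

The same per-family rows could equally be filtered for families with unfinished tests when allocating web callers.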

Language questionnaire (SLQ)

The SLQ (National Curriculum assessment results and second language questionnaire) was designed to be sent to parents at the end of school Key Stage 3, in other words at the end of the school year in which twins had reached 14 years of age. The cohort 1 SLQ (pdf) asked for SAT results in addition to teacher assessments. SATs were abolished the same year, so SAT levels were dropped from the cohort 2 SLQ (pdf). This cohort 2 version was also used in cohorts 3 and 4, with only slight modifications (see below).

In cohort 1, the SLQ was sent at the end of the school year, in July, some months after the initial pack had been sent. In cohort 2, the SLQ was sent along with the family booklets after the end of the school year, in late August. In cohort 3, login details for the on line version of the SLQ were sent to families in July 2010; the same was done for cohort 4 in July 2011.

A written reminder and targeted telephone calls were used in cohort 1, for families that did not return the SLQ promptly. A written reminder (but no phone calls) was also used in each of cohorts 2, 3 and 4.

An on line version of the SLQ questionnaire, with identical items and responses to the paper version, was developed. This was set up after the initial SLQ mailing in cohort 1, so families were sent login details by way of a reminder. In cohort 2, the login details were sent along with the paper version, giving families the choice of returning data by either method from the start. In cohorts 3 and 4, families were required to complete the on line version, so login details but not paper questionnaires were sent.

The differences between the various versions of the SLQ questionnaire can be summarised as follows:

  • SAT results (for English, Maths, Science): included in the cohort 1 version, but not in later versions.
  • Teacher assessment - modern foreign language: the name of the language assessed was not collected in cohort 1, but was incorporated in the questionnaire from cohort 2 onwards. In the web version, space for two modern foreign languages was incorporated from cohort 3 onwards.
  • Question 1 (languages used at home): the cohort 1 version had space for two languages, while later versions had space for three languages.
  • Question 3 (languages studied at school): the cohort 1 version had space for just three languages, while later versions had space for five languages, with 'English' specified by default as the first in the list.

Where space was limited for recording languages, particularly in cohort 1, families often used ad hoc annotations on the questionnaire to record additional languages. Data entry procedures were adapted organically to incorporate these. Hence, for example, some cases in cohort 1 have more than three languages recorded for question 3, even though the questionnaire only provided space for recording three languages.
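When working with the language items, it can be useful to flag cases in which more languages were recorded than the questionnaire had space for. The sketch below encodes the slot counts listed above for questions 1 and 3; the dictionary keys and the function name are hypothetical and do not correspond to TEDS variable names.

```python
# Number of language spaces printed on each SLQ version, from the list above.
# Keys "q1_home" and "q3_school" are illustrative labels only.
LANGUAGE_SLOTS = {
    "q1_home": {1: 2, 2: 3, 3: 3, 4: 3},
    "q3_school": {1: 3, 2: 5, 3: 5, 4: 5},
}


def exceeds_form_space(question, cohort, languages):
    """True if more languages were recorded than the form had space for."""
    return len(languages) > LANGUAGE_SLOTS[question][cohort]


recorded = ["English", "French", "German", "Urdu"]
print(exceeds_form_space("q3_school", 1, recorded))  # True
print(exceeds_form_space("q3_school", 2, recorded))  # False
```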

Language telephone tests

The Language tests were for a sub-study: the data were collected by TEDS on behalf of collaborators at the University of York. The data from the Language tests are not part of the TEDS dataset.

Cohort 1 families selected for inclusion in the language study were allocated to callers, as described below. Each family was contacted by telephone by their allocated caller, who would carry out the language tests (pdf) over the telephone, with each twin in turn. The telephone call to the twins was recorded directly onto CD, using equipment provided by TEDS for the callers. The CDs were then returned to TEDS by post, together with any notes made by the callers. On completion of the tests, each twin was sent a voucher (to the value of £5). The CDs were then copied (in the TEDS office), and a copy of each CD recording sent to the University of York for coding.

Teacher questionnaires

The cohort 1 teacher study included all families that had given consent and provided contact details for the twins' teachers and schools. Families may have done this in writing via the consent form, or verbally via the callers (see below). The teacher questionnaire (pdf) was made available in two versions: an on line version, and a conventional paper version. The on line version was designed to be as similar as possible to the paper version, with identical wording used for items and responses.

Each twin's teacher was initially sent a username and password for logging in to the on line questionnaire. Each username and password was unique to a particular twin, and the username incorporated the twin's forename (to avoid confusion for a teacher of both twins). The on line questionnaire was hosted on the TEDS web server, and the data submitted by teachers were stored directly in a secure database in the TEDS office. As well as simplifying communication with teachers, the on line version had the advantage of by-passing the need for data entry.

Up to three written reminders were sent to teachers that had not completed the on line questionnaire. The third and final reminder included the paper version of the questionnaire. A telephone reminder was also used for teachers in some prioritized cases.

Telephone calling

In cohort 1 (but not in later cohorts), some families were contacted by telephone at various times. Because of the size of the study, only targeted groups of families were selected for calling. Families were given some time to return the teacher consent details and complete the web tests under their own steam, before teacher consent callers and web callers were allocated. Callers were TEDS employees, and they fell into three different categories:

  1. Language callers. These callers contacted families primarily to conduct the Language telephone tests. They had secondary roles of teacher consent calling and web calling, as described below. The language callers were allocated to the families selected for the Language study (see the sample above).
  2. Teacher consent callers. The main role of these callers was to collect teacher consent details. They were not expected to act as web callers, although they were asked to mention the web tests by way of a reminder. Teacher consent callers were allocated to just 270 targeted families. These were families that had not already returned written teacher consent details, that were not allocated to language callers, and that fell into any of the following categories: (1) they had returned the parent and twin questionnaires; (2) they had started or finished the web tests; (3) they had finished the first battery of web tests in the 12 Year study; (4) they had finished the web tests in the 10 Year study. Exclusions were made for families that had opted out or withdrawn, that had telephone problems, or that were medical exclusions or other special cases.
  3. Web callers. These callers had the job of encouraging families to complete the web tests. They did this using repeated telephone calls to the families. Web callers were allocated to roughly 450 targeted families. These were families that had not already completed the web tests, that were not allocated to language callers, and that fell into any of the following categories: (1) families that had already started the web tests; (2) families that had returned the parent and twin questionnaires; (3) families that had completed the first battery of web tests in the 12 Year study; (4) families that had returned DNA samples to TEDS during the previous year. Exclusions were made for families that had opted out or withdrawn, that had telephone problems, or that were medical exclusions or other special cases.
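The web caller criteria in the list above amount to a simple rule: apply the exclusions first, then require at least one of the four qualifying conditions. The sketch below expresses that rule as a predicate. It is an illustration only; the `Family` class and all of its field names are hypothetical, and the actual selection was made within the TEDS administrative system.

```python
from dataclasses import dataclass


@dataclass
class Family:
    # All field names are hypothetical; they mirror the criteria listed above.
    completed_web_tests: bool = False
    has_language_caller: bool = False
    started_web_tests: bool = False
    returned_questionnaires: bool = False
    finished_12yr_first_battery: bool = False
    returned_dna_last_year: bool = False
    opted_out_or_withdrawn: bool = False
    telephone_problem: bool = False
    medical_or_special_case: bool = False


def eligible_for_web_caller(f: Family) -> bool:
    """Apply the exclusions, then require any one qualifying category."""
    excluded = (
        f.opted_out_or_withdrawn
        or f.telephone_problem
        or f.medical_or_special_case
    )
    if excluded or f.completed_web_tests or f.has_language_caller:
        return False
    return any([
        f.started_web_tests,
        f.returned_questionnaires,
        f.finished_12yr_first_battery,
        f.returned_dna_last_year,
    ])


print(eligible_for_web_caller(Family(started_web_tests=True)))  # True
print(eligible_for_web_caller(Family(started_web_tests=True,
                                     telephone_problem=True)))  # False
```

The teacher consent caller rule has the same shape, with its own four qualifying categories.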

Data entry

General data entry and data cleaning issues (for all studies including the 14 Year) are described in a separate page.

In the 14 Year study, data from the twin web tests were effectively entered by the twins themselves. As they answered items on their computers, their responses were recorded on the TEDS web server.

The web server was programmed to produce, when required, "analysis files" containing the web test data. Each web test/activity had its own analysis file, and there were separate sets of analysis files for each of the two waves of data collection. Each analysis file was a plain text file, with comma-delimited variables, containing one row of data for each twin who completed the test. The analysis files were copied from the web server at the end of each wave. For each test/activity, the two analysis files from the two waves have been aggregated together, with unwanted identifying fields (other than ID) removed.
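The aggregation step described above can be sketched as follows: the per-wave files for one test are concatenated, and identifying fields other than the ID are dropped. The column names (`twin_id`, `forename`, `item1`) are invented for the example and are not the actual analysis file layout.

```python
import csv
import io


def aggregate_waves(wave_files, drop_fields):
    """Concatenate per-wave analysis files, removing unwanted fields."""
    rows = []
    for handle in wave_files:
        for row in csv.DictReader(handle):
            rows.append({k: v for k, v in row.items() if k not in drop_fields})
    return rows


# Hypothetical per-wave files for a single test, one row per twin.
WAVE1 = "twin_id,forename,item1\n101,Alex,1\n102,Sam,0\n"
WAVE2 = "twin_id,forename,item1\n201,Jo,1\n"

combined = aggregate_waves(
    [io.StringIO(WAVE1), io.StringIO(WAVE2)],
    drop_fields={"forename"},
)
for row in combined:
    print(row)
```

In this sketch the ID column is kept because it is simply not listed in `drop_fields`; the rows keep the order in which the wave files are supplied.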

In a similar way, data from the web versions of the parent booklet, parent SLQ, child booklet, and teacher questionnaire were entered by the parents, twins and teachers respectively. The data submitted this way were stored directly into a database on the TEDS database server. At the end of the study, these data were simply copied into the main 14 Year Access database for aggregation with the data from the paper versions of the questionnaires.

Data returned on paper were entered either by manual keying or by optical scanning. Manual keying was done by TEDS staff, using the Access databases. Manual data entry was used at age 14 for the paper versions of the twin booklet, teacher questionnaire and SLQ. Administrative data, such as return dates for booklets and reported problems with web tests, were entered manually in the TEDS admin system and have also now been transferred into the 14 Year study Access database.

Optical scanning was used to enter data from the paper version of the parent booklet. The booklets had been returned by mail directly to the TEDS office by families. They were then delivered in large batches to Group Sigma for optical scanning. After scanning each batch of booklets, Group Sigma returned the data in plain text files. The paper booklets and questionnaires were subsequently returned to the TEDS office.

The 14 Year study Access database now contains all the raw data, in cleaned and aggregated form, from the booklets (parent, twin and teacher) and the parent-reported SLQ, and admin data such as return dates. This Access database is now the master copy of all these data and the source of these data for constructing the dataset. Any booklets returned late were manually entered into this same database.

The twin web test files are too large for convenient storage in the Access database, and have been stored separately as csv text files (one file per test). The Access database and other data files are described in more detail on the 14 year data files page.

Raw data from the language sub-study (telephone tests) consisted of audio recordings saved onto CDs. Each CD contained recordings of the interviews for one pair of twins. The CD recordings were sent to the University of York for coding. The coded language data are not available within TEDS.