Data collection
Summary
Data were collected in two 'waves', summarized in the table below:
Wave: | Wave 1 | Wave 2 | Wave 2 | Wave 2 |
---|---|---|---|---|
School Cohort: | 1 | 2 | 3 | 4 |
Twin birth dates: | Jan-94 to Aug-94 | Sep-94 to Aug-95 | Sep-95 to Aug-96 | Sep-96 to Dec-96 |
Criteria for including families (based on recent data returns): | Gave consent in 7yr or 9yr or 10yr study; or returned the 8yr qnr or the Eating qnr | Completed at least one 10yr web test; or returned the 9yr family booklets or the Eating qnr | Returned 7yr parent booklet or the Eating qnr | Returned 7yr parent booklet or the Eating qnr |
Number of Families sent consents: | 2189 | 2697 | 2642 | 910 |
Consent Mailing date: | December 2005 | September 2006 | September 2006 | September 2006 |
Login Mailing date: | February 2006 | January 2007 | January 2007 | January 2007 |
Collection of TOWRE data: | From all families in the web study; collected by web callers. | Only from families that completed the web tests; collected separately by TOWRE callers. | Only from families that completed the web tests; collected separately by TOWRE callers. | Only from families that completed the web tests; collected separately by TOWRE callers. |
Processing of teacher data: | Paper questionnaires only. Data entered manually. | Teachers had the option of either paper or on line versions of the questionnaire. Data from the paper version were optically scanned. | Teachers had the option of either paper or on line versions of the questionnaire. Data from the paper version were optically scanned. | Teachers had the option of either paper or on line versions of the questionnaire. Data from the paper version were optically scanned. |
School phase: | Secondary school (Year 7) | Secondary school (Year 7) | Primary school (Year 6) | Primary school (Year 5) |
Approx. average twin age (years) when data collected: | 12 | 12 | 11 | 10.5 |
Initially, data collection periods were planned to be linked to the school year (September to August) in which twins reached the age of 12 years. The reason for this was to ensure that all twins were in the same school year when the teacher data were collected. However, these plans were changed for the second wave of the study: it was decided to collect data from school cohorts 2, 3 and 4 simultaneously, to avoid having to extend data collection for a further two years. Hence, in cohorts 1 and 2 families were contacted in the school year in which their twins reached the age of 12 years; but in cohorts 3 and 4 families were contacted in the school year in which their twins reached the ages of 11 and 10 years respectively. Thus, all families from cohorts 2, 3 and 4 were contacted at once, simplifying administration. Twin ages varied from roughly ten to twelve and a half years, at the time when the data were collected.
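The cohort boundaries in the summary table amount to a simple lookup on twin birth date. The following Python sketch is illustrative only and is not part of the TEDS systems: the function names `school_cohort` and `wave_for_cohort` are hypothetical, and it assumes that each month range in the table runs from the first day of its starting month to the last day of its ending month.

```python
from datetime import date

# Cohort boundaries taken from the summary table (birth month ranges).
# Assumption: each range runs from the first day of its starting month
# to the last day of its ending month.
COHORT_RANGES = [
    (1, date(1994, 1, 1), date(1994, 8, 31)),
    (2, date(1994, 9, 1), date(1995, 8, 31)),
    (3, date(1995, 9, 1), date(1996, 8, 31)),
    (4, date(1996, 9, 1), date(1996, 12, 31)),
]


def school_cohort(birth_date):
    """Return the school cohort (1 to 4), or None if the date is out of range."""
    for cohort, start, end in COHORT_RANGES:
        if start <= birth_date <= end:
            return cohort
    return None


def wave_for_cohort(cohort):
    """Cohort 1 was collected in wave 1; cohorts 2, 3 and 4 in wave 2."""
    return 1 if cohort == 1 else 2
```

For example, a twin born on 1 September 1994 falls into cohort 2 under these assumptions, and was therefore contacted in wave 2.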
Data collection was preceded by at least two pilot studies, including pilot versions of the booklets and some of the cognitive web tests. There was also a paper-based retest of the maths and reading web tests, for a small selection of twins. The retest is fully documented but the previous pilots are not. All are described briefly in the pilot studies page.
Contacting the families
Initial contact with families was made by sending them a pack including the consent form, an information sheet describing the study, the parent booklet and two copies of the twin booklet, and the TOWRE test stimulus sheets (pdfs). In the consent form, parents were asked to consent to (or opt out of) the web and phone studies; they were asked to supply their current phone numbers and email addresses; they were asked about their home internet connections (broadband or dial-up); and they were also asked for consent to contact the twins' teachers (providing teacher/school contact details). One written reminder was sent to families that did not return the consent forms promptly. At later dates, up to two written reminders were also sent to families that had not returned the parent and twin booklets.
Every family that had not explicitly opted out of the study was later sent a login pack, inviting the family to carry out the web activities. This pack contained the family's login details (username and password) for the web activities, and a web guidance sheet (pdf). The latter contained basic instructions for accessing the web tests. For further details, see the login/consent page and the summary of 12 Year web tests.
Families were subsequently contacted by telephone; the way this was done changed from wave 1 to wave 2. Callers were TEDS employees, and for this study they fell into two categories: "web callers", who contacted families to encourage completion of the web tests; and "TOWRE callers", who contacted families to carry out the TOWRE telephone test with the twins. In wave 1, all families that had been sent a login pack were allocated to callers, and each caller had the dual role of web caller and TOWRE caller.
In wave 2, after sending the login packs, families were initially given the opportunity to complete the web activities under their own steam, without any web calling. Then, web callers were allocated to targeted families, to encourage them to complete the web tests. Families were not phoned by web callers if they had already completed the web activities, or if they had already opted out of the study, or if they had completed the web activities in the 10 Year study (this applied to school cohort 2 only; it was thought that these families should not need the help of web callers to complete the activities).
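The wave 2 targeting rule for web calling can be written as a single condition. This is a minimal Python sketch of the rule as described above, not the procedure actually used by TEDS staff: the function name and parameter names are hypothetical, and it assumes that the 10 Year exclusion is only checked for school cohort 2.

```python
def needs_web_call(completed_web, opted_out, cohort, completed_10yr_web):
    """Return True if a wave 2 family would be allocated to a web caller.

    All parameter names are hypothetical; they stand for facts about the
    family that the text says were taken into account.
    """
    # Families that had finished the web activities, or had opted out,
    # were not phoned by web callers.
    if completed_web or opted_out:
        return False
    # Assumption: the 10 Year exclusion applies to school cohort 2 only.
    if cohort == 2 and completed_10yr_web:
        return False
    return True
```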
In wave 2, TOWRE calling took place separately from web calling, and different staff were used. TOWRE callers were only allocated to families that had completed the web tests. The instructions for carrying out the TOWRE test are recorded in a script (pdf) for the callers.
Parent-reported school attainment data (NC levels) were collected by means of a simple questionnaire (pdf), sent at the end of the school year. The questionnaires were sent in July 2006 for wave 1, and July 2007 for wave 2. The aim was to collect the teacher assessment levels as recorded on end-of-school-year school reports. In wave 1, these questionnaires were only sent to those families in which both twins had completed the first battery of web tests. In wave 2, the questionnaires were only sent to families in which at least one twin had started the web tests. Returns of the questionnaire were reasonably good in cohorts 1 and 2, where twins were in their first year of secondary school; but very poor in cohorts 3 and 4, where twins were still in primary school (perhaps because school reports did not often show NC levels).
During 2007, the SRS sub-study sent a parent questionnaire containing the CAST measure to 1873 TEDS families that had not been in touch in recent years, and were therefore not included in the TEDS 12 Year sample. 356 of these families completed and returned the questionnaire. As these families were contacted at the same time as the TEDS 12 Year study (so the twins were in the same age range), and because the SRS CAST questionnaire contained 30 items that were identical to the 30 CAST items in the TEDS 12 Year parent booklet, these data have been incorporated into the 12 Year analysis dataset. The item data from the SRS CAST questionnaire are aggregated together with the item data from the CAST measure in the 12 Year parent booklet. The dataset contains a flag variable to identify the cases involved.
Twin web tests
There is a separate page describing the organisation of the web tests, the tests themselves, and the roles of the web callers. Briefly, there were 15 web tests, and these were divided into two batteries, referred to as battery A and battery B. The order in which the test batteries were presented was A then B for some families, and B then A for others (with the aim of getting similar amounts of data for each test battery). The way that families were asked to complete the two batteries changed from wave 1 to wave 2.
In wave 1, families were all asked to complete their first battery (whether A or B), and to treat their second battery as an optional extra. Each twin was rewarded with a voucher (to the value of £5) on completion of the first battery; those that additionally completed the second battery were rewarded with a further £5 voucher. In addition, for every family in which the twins had attempted the tests, a £5 voucher was sent to the parents as a way of covering their internet connection costs. This approach was used because there were many tests (15 in all), and it was thought that completing both batteries of tests might be very difficult for families with slow dial-up internet connections, or without any home internet connection.
In wave 2, families were all asked to complete both batteries. Twins were not rewarded just for completing the first battery. Instead, each twin was sent a £10 voucher on completion of the second battery. As before, parents were also sent a £5 voucher. This change of approach in wave 2 was partly based on feedback received from wave 1, firstly that a large majority of families did in fact have home broadband connections, and secondly that more twins than initially expected were prepared to complete both batteries.
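The two reward schemes can be compared side by side in a short Python sketch. It is illustrative only: the function names are hypothetical, amounts are in pounds, and it assumes that the parent voucher in wave 2 was sent under the same condition as in wave 1 (the twins had attempted the tests).

```python
def twin_voucher_total(wave, batteries_completed):
    """Total voucher value (in pounds) sent to one twin.

    batteries_completed is 0, 1 or 2.
    """
    if batteries_completed not in (0, 1, 2):
        raise ValueError("batteries_completed must be 0, 1 or 2")
    if wave == 1:
        # Wave 1: 5 pounds for the first battery, a further 5 for the second.
        return 5 * batteries_completed
    if wave == 2:
        # Wave 2: nothing for the first battery alone; 10 pounds once the
        # second battery is completed.
        return 10 if batteries_completed == 2 else 0
    raise ValueError("wave must be 1 or 2")


def parent_voucher(twins_attempted_tests):
    """Parent voucher (in pounds) towards internet connection costs.

    Assumption: the same condition applied in both waves.
    """
    return 5 if twins_attempted_tests else 0
```

Under both schemes a twin completing both batteries received vouchers worth £10 in total; the difference is that a wave 2 twin who stopped after the first battery received nothing.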
The progress of each family in their web activities was recorded on the web server. For administrative purposes (for example, in order to allocate families to web callers, and in order to organize mailings of vouchers), information had to be transferred from the web server to the TEDS administrative database. This was done by means of a "family status file", which was produced automatically every night by a program running on the web server. The family status file was a plain text file (in csv format), with a row of data for every family. The information recorded for each family included fields to show whether each twin had started or finished each of the web tests, together with dates, times, and other information.
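As an illustration of how a file of this kind might be consumed on the administrative side, the following Python sketch parses a small csv sample and lists families with unfinished tests. The actual layout of the family status file is not documented here, so the column names (`family_id`, `t1_maths_started` and so on), the 0/1 coding and the sample rows are all hypothetical.

```python
import csv
import io

# Hypothetical sample: one row per family, with started/finished flags
# for each twin (t1, t2) on a single test. Real files had more fields.
SAMPLE = """family_id,t1_maths_started,t1_maths_finished,t2_maths_started,t2_maths_finished
1001,1,1,1,1
1002,1,0,0,0
1003,0,0,0,0
"""


def load_status(text):
    """Parse csv text into a dict of boolean flags keyed by family ID."""
    status = {}
    for row in csv.DictReader(io.StringIO(text)):
        family_id = row.pop("family_id")
        status[family_id] = {field: value == "1" for field, value in row.items()}
    return status


def unfinished_families(status):
    """Return IDs of families where any 'finished' flag is still false."""
    return [
        family_id
        for family_id, flags in status.items()
        if not all(v for k, v in flags.items() if k.endswith("_finished"))
    ]


status = load_status(SAMPLE)
```

With this sample, `unfinished_families(status)` picks out families 1002 and 1003, which is the kind of list that could be used to allocate families to web callers.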
A follow-up phase of web data collection took place in 2008, for some of the wave 2 families. This was related to a new large-scale DNA study of reading and mathematics ability (for more details, see the WTCCC study on the DNA studies page). Families were selected for this follow-up study if they had not completed the 12 Year reading and mathematics web tests during 2007, and if they had previously returned twin DNA samples to TEDS, and if they were not in any of the exclusion categories for the study. Families from wave 1 were not selected, because in 2008 they were already involved in the 14 Year web study. The total number of families selected was 960. The four relevant web tests (Mathematics, PIAT, Reading Fluency and Reading Comprehension) were resurrected on the web server in such a way that twins who had already started these tests would carry on from where they left off. Each twin completing these four tests was rewarded with a £10 voucher. The data from this phase of the study have been added to the analysis dataset alongside the earlier data.
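The selection criteria for the follow-up phase combine four conditions. A minimal Python sketch of that filter is shown below; the `Family` record and its field names are hypothetical, and the exclusion categories themselves are not spelled out here, so they are represented by a single boolean.

```python
from dataclasses import dataclass


@dataclass
class Family:
    # All field names are hypothetical stand-ins for admin database values.
    family_id: str
    wave: int
    completed_reading_maths_2007: bool
    dna_returned: bool
    excluded: bool


def eligible_for_followup(family):
    """Apply the follow-up selection criteria described in the text."""
    return (
        family.wave == 2  # wave 1 families were already in the 14 Year web study
        and not family.completed_reading_maths_2007
        and family.dna_returned
        and not family.excluded
    )
```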
Teacher questionnaires
The teacher study included all families that had given consent and provided contact details for the twins' teachers and schools. Families may have done this in writing via the consent form, or verbally via the web callers. Teacher questionnaires (pdf) were sent directly to teachers, and were not seen by the families themselves. Up to three written reminders were sent to teachers who had not returned their questionnaires promptly. A telephone reminder was also used for teachers in some prioritized cases.
In wave 2, teachers were given the option of completing the questionnaire on line instead of returning the paper version. Each twin's teacher was sent both the paper questionnaire and a username and password for logging in to the on line questionnaire. Each username and password was unique to a particular twin, and the username incorporated the twin's forename (to avoid confusion for a teacher of both twins). The on line questionnaire was hosted on the TEDS web server, and the data submitted by teachers were stored directly in a TEDS database. As well as simplifying communication with teachers, the on line version had the advantage of by-passing the need for data entry. The items in the on line version were identical to those in the paper version, and every attempt was made to keep the visual layout as similar as possible.
At the end of the school year, parents were contacted again by mail to ask them to supply National Curriculum (NC) levels, as reported to them by teachers on the twins' end-of-year school reports. Each family was sent a one-page NC level questionnaire (pdf) for this purpose. In wave 1, this was only sent to families in which both twins had completed the web tests (at least the first battery). In wave 2, this was broadened to include all families in which at least one twin had started the web tests.
Data entry
General data entry and data cleaning issues (for all studies including the 12 Year) are described on a separate page.
In the 12 Year study, data from the twin web tests were effectively entered by the twins themselves. As they answered items on their computers, their responses were recorded on the TEDS web server.
The web server was programmed to produce, when required, "analysis files" containing the web test data. Each web test/activity had its own analysis file, and there were separate sets of analysis files for each of the two waves of data collection. Each analysis file was a plain text file, with comma-delimited variables, containing one row of data for each twin who completed the test. The analysis files were copied from the web server at the end of each wave. For each test/activity, the two analysis files from the two waves (or three waves for some tests) have been aggregated together, with unwanted identifying fields (other than ID) removed.
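The aggregation step for one test/activity can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the procedure actually used: the field names (`twin_id`, `forename`, `score`) and the sample rows are hypothetical, and the files are assumed to share the same header row.

```python
import csv
import io

# Hypothetical analysis files for one test, one per wave.
WAVE1 = "twin_id,forename,score\n10011,Alex,12\n10012,Sam,15\n"
WAVE2 = "twin_id,forename,score\n20011,Jo,9\n"


def aggregate_analysis_files(file_texts, identifying_fields):
    """Stack the rows of several csv files, dropping identifying fields.

    The ID field is kept simply by leaving it out of identifying_fields.
    """
    combined = []
    for text in file_texts:
        for row in csv.DictReader(io.StringIO(text)):
            for field in identifying_fields:
                row.pop(field, None)
            combined.append(row)
    return combined


combined = aggregate_analysis_files([WAVE1, WAVE2], ["forename"])
```

Passing a third file in the list would handle the tests that have three sets of data.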
In a similar way, data from the on line version of the teacher questionnaire were effectively entered by the teachers themselves. The data submitted from web pages by teachers were stored directly into a database on the TEDS database server. At the end of wave 2, these data were simply copied into the main 12 Year Access database for aggregation with the data from the paper teacher questionnaires.
Data returned on paper were entered either by manual keying or by optical scanning. Manual keying was done by TEDS staff, using the Access database. Manual data entry was used at age 12 for the twin booklets, TOWRE test results and parent NC level questionnaires (both waves), and for teacher questionnaires (wave 1 only). Administrative data, such as return dates for booklets and reported problems with web tests, were entered manually in the TEDS admin system and have also now been transferred into the 12 Year study Access database.
Optical scanning was used to enter data from the parent booklet (in both waves) and the paper version of the teacher questionnaire (in wave 2). These had been returned by mail directly to the TEDS office by families and teachers respectively. They were then delivered in large batches to Group Sigma for optical scanning. After scanning each batch of booklets, Group Sigma returned the data to TEDS in plain text files. The paper booklets and questionnaires were subsequently returned to the TEDS office.
The 12 Year study Access database now contains all the raw data, in cleaned and aggregated form, from the booklets (parent, twin and teacher), the TOWRE test results, the parent-reported twin NC attainment levels, and admin data such as return dates. This Access database is now the master copy of all these data and the source of these data for constructing the dataset. Any booklets returned late were manually entered into this same database.
The twin web test files are too large for convenient storage in the Access database, and have been stored separately as csv text files (one file per test/activity). The Access database and other data files are described in more detail on the 12 Year data files page.