Science - Trends in Year 5 science achievement 1994 to 2006

Publication Details

This report describes the science achievement of Year 5 students in TIMSS 2006/07. Trends in New Zealand’s achievement over the 12 years from 1994 to 2006 are examined, along with comparisons with other countries. Analyses of achievement by sub-groupings (such as gender and ethnicity) and background information are also presented. It was originally published in December 2008 and revised in September 2009 due to the mislabelling of the content domains knowing and applying. The current version rectifies this error.

Author(s): Robyn Caygill [Ministry of Education]

Date Published: December 2008

Definitions and technical notes

This section gives a brief overview of the technical details and definitions applicable to this report. For a comprehensive description of the technical details pertaining to TIMSS, see the TIMSS 2007 technical report (Olson, Martin, & Mullis (Eds.), 2008).

Benchmarks

To describe more fully what achievement on the science scale means, the TIMSS international researchers have developed benchmarks. These benchmarks link student performance on the TIMSS science scale to performance on science questions and describe what students can typically do at set points on the science achievement scale. The international science benchmarks are four points on the science scale: the advanced benchmark (625), the high benchmark (550), the intermediate benchmark (475), and the low benchmark (400). The performance of students reaching each benchmark is described in relation to the types of questions they answered correctly.
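As an illustration only, the four cut points above can be turned into a simple lookup. The function name is hypothetical, and treating "reaching" a benchmark as scoring at or above its cut point is an assumption; this is a sketch rather than the procedure used by the TIMSS researchers.

```python
# International science benchmarks quoted in the text (scale-score cut points),
# ordered from highest to lowest.
BENCHMARKS = [
    ("advanced", 625),
    ("high", 550),
    ("intermediate", 475),
    ("low", 400),
]


def highest_benchmark_reached(score):
    """Return the name of the highest benchmark at or below `score`.

    Returns None when the score is below the low benchmark (400).
    Assumption: a benchmark is "reached" when score >= cut point.
    """
    for name, cut_point in BENCHMARKS:
        if score >= cut_point:
            return name
    return None
```

For example, a hypothetical score of 560 would be classed as reaching the high benchmark, and a score of 399 would reach none.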

Exclusions

Each country was permitted to exclude some students for whom the assessment was not appropriate or was difficult to administer. Countries were required to keep the number of excluded students as small as possible, with a guideline of 5 percent of the ‘target’ population as the maximum. Any countries that exceeded this value are indicated in the international exhibits. The target population in New Zealand was Year 5 students.

School-level exclusions in New Zealand consisted of very small schools (fewer than four Year 5 students), special education schools, Rudolf Steiner schools, the Correspondence School, and schools that provide more than 80 percent of their instruction in te reo Māori. Within-school exclusions consisted of special education classes, special needs students, students with insufficient instruction in English, and units within schools that provide more than 80 percent of their instruction in te reo Māori.

The New Zealand exclusion rate, at 5.4 percent, was one of the largest and was equivalent to the rates for Hong Kong SAR and Lithuania. Most of the other countries kept their exclusion rates below the 5 percent maximum, with only the United States and the benchmarking participants exceeding this level (see footnote 1).

Making Models

The models in this report were formulated using two different methods. Regression analyses were used for the model at the student level that combined ethnicity, speaking English at home, and immigrant status. Custom-written programs described in the TIMSS user guide for the international database (to be published in early 2009) were used for this analysis. Multi-level modelling techniques were applied using the MLwiN package for the analysis that examined school-, class-, and student-level variations in achievement. A range of background characteristics was included in the larger model initially and the model was then tested iteratively. At each iteration, any characteristics that were not statistically significant were removed until the model contained only variables with a significant influence on student achievement.

Means, medians, and averages

There are three common measures of central tendency (the mean, the median, and the mode), but only the mean and the median are used in this report.

The mean of a set of scores is the sum of the scores divided by the number of scores, and is also sometimes referred to as ‘the average’, particularly in the international reports. Note that for TIMSS, as with other large-scale studies, the means for a country are adjusted slightly (in technical terms ‘weighted’) to reflect the total population of Year 5 rather than just the sample.

The median is the middle score when all scores are put in order.
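The difference between a simple mean, a weighted mean, and a median can be sketched as follows. The scores and weights are invented for illustration, and the helper name `weighted_mean` is hypothetical; the actual TIMSS estimates are produced from the international database with its own sampling weights.

```python
from statistics import mean, median


def weighted_mean(scores, weights):
    """Sum of score * weight divided by the sum of the weights."""
    total_weight = sum(weights)
    return sum(s * w for s, w in zip(scores, weights)) / total_weight


# Hypothetical scores and sampling weights (not TIMSS data).
scores = [400.0, 500.0, 550.0, 600.0]
weights = [2.0, 1.0, 1.0, 1.0]

simple_mean = mean(scores)                      # every student counts equally
adjusted_mean = weighted_mean(scores, weights)  # first student represents more of the population
middle_score = median(scores)                   # midpoint of the two middle scores
```

Here the weighted mean is pulled below the simple mean because the lowest-scoring student carries the largest weight.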

In earlier cycles of TIMSS, an international mean was reported. However, as the number of countries participating changed, this mean shifted so that it was difficult to make comparisons across years. In TIMSS 2006/07 the TIMSS scale average is reported. This is the value to which the scores of each student are scaled (see later note on Scale score points for more details).

Minimum group size for reporting achievement data

In this report, student achievement data are not reported where the group contains fewer than 30 students or fewer than 10 schools. While achievement is reported in some cases for groups of 30 to 50 students, these results are annotated and should be treated with caution because the measurement is quite uncertain, as shown by the larger standard errors.
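The reporting rule above can be expressed as a small check. This is a sketch under stated assumptions: the function name and the three status labels are hypothetical, and the text says only that groups of 30 to 50 students are reported "in some cases", so the middle category here is a simplification.

```python
def reporting_status(n_students, n_schools):
    """Classify a group by the minimum-size rule described in the text.

    Assumption: every group of 30 to 50 students that passes the school
    threshold is flagged for caution; the report does this only in some cases.
    """
    if n_students < 30 or n_schools < 10:
        return "not reported"
    if n_students <= 50:
        return "reported with caution"
    return "reported"
```

For instance, a hypothetical group of 45 students drawn from 12 schools would be flagged for caution, while 45 students from 8 schools would not be reported at all.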

Percentile

The percentages of students performing below or above particular points on the scale can be used to describe the range of achievement. The lowest outer limit of achievement reported in ranges is the 5th percentile – the score at which only 5 percent of students achieved a lower score and 95 percent of students achieved a higher score. The highest outer limit is the 95th percentile – the score at which only 5 percent of students achieved a higher score and 95 percent of students a lower score. Therefore 90 percent of the Year 5 student scores lie between the 5th and 95th percentiles.
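A minimal sketch of the 5th and 95th percentiles, using Python's standard library on invented scores. It is unweighted, whereas the TIMSS estimates apply sampling weights, so it illustrates the idea rather than the method used in the report.

```python
from statistics import quantiles

# Hypothetical scores: every integer from 300 to 700 (not TIMSS data).
scores = list(range(300, 701))

# n=20 splits the data into 20 equal groups, giving 19 cut points at
# the 5th, 10th, ..., 95th percentiles.
cut_points = quantiles(scores, n=20, method="inclusive")
p5, p95 = cut_points[0], cut_points[-1]

# Share of scores lying between the two percentiles (inclusive).
share_between = sum(p5 <= s <= p95 for s in scores) / len(scores)
```

With these invented scores, about 90 percent of values fall between the two cut points, matching the interpretation given above.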

Sampling

Schools are sampled in TIMSS with a probability proportional to the number of Year 5 students. In order to improve the precision of sampling, the schools were ordered by decile, level of urbanisation, and size, so that the schools selected better represented the population of schools in New Zealand. Within each school, classes were sampled with equal probability and all Year 5 students within each class were selected.
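One common way to select units with probability proportional to size from an ordered list is systematic sampling along the cumulative size. The sketch below assumes that approach; the source states only that selection was proportional to the number of Year 5 students and that schools were ordered first. The function name, school names, and roll sizes are all hypothetical.

```python
import random


def pps_systematic_sample(schools, n_sample, seed=0):
    """Select schools with probability proportional to size.

    `schools` is a list of (name, year5_count) pairs, assumed to be already
    ordered by the stratification variables (e.g. decile, urbanisation, size).
    A school larger than the sampling interval could be selected more than once.
    """
    total = sum(size for _, size in schools)
    interval = total / n_sample
    start = random.Random(seed).uniform(0, interval)  # random start in the first interval
    targets = [start + k * interval for k in range(n_sample)]

    selected = []
    cumulative = 0.0
    next_target = 0
    for name, size in schools:
        cumulative += size
        # Select this school for every target that falls within its size range.
        while next_target < n_sample and targets[next_target] < cumulative:
            selected.append(name)
            next_target += 1
    return selected


# Hypothetical ordered school list with Year 5 roll sizes.
schools = [("A", 20), ("B", 60), ("C", 40), ("D", 80), ("E", 30), ("F", 70)]
selected = pps_systematic_sample(schools, n_sample=3)
```

Because the list is ordered before selection, stepping through it at a fixed interval spreads the sample across the ordering variables, which is the sense in which ordering improves how well the selected schools represent the population.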

Scale score points

The design of TIMSS allows for a large number of questions to be used in mathematics and science; each student answers only a portion of these questions. TIMSS employs techniques to enable population estimates of achievement to be produced for each country even though a sample of students responded to differing selections of questions. These techniques result in scaled scores that are on a scale with a mean of 500 and a standard deviation of 100.

Significance tests

In this report, all the comparisons that have been made are tested for statistical significance using the t statistic, with the probability of making an incorrect inference set at 5 percent. To compare the means of two groups of students, the test statistic computed in this report is:
    $$t = \frac{\bar{x}_1 - \bar{x}_2}{se_{diff}} \qquad (1)$$
where $\bar{x}_1$ and $\bar{x}_2$ are the means of the two groups. The calculation of $se_{diff}$, the standard error of the difference, varies depending on whether the groups were sampled independently or not. If the means for two groups that were sampled independently are being compared, for example, boys’ achievement in 1994 and 2006, then the standard error of the difference is calculated as the square root of the sum of the squared standard errors of each mean:
    $$se_{diff} = \sqrt{se_1^2 + se_2^2} \qquad (2)$$
where $se_1$ and $se_2$ are the standard errors of the two means. For most of the comparisons, this formula was not applicable, and so $se_{diff}$ was computed more accurately by combining variances using custom-written SAS programs. However, as a rough estimate, the above formula will give a similar result.

Note that in all calculations, unrounded figures are used in these tests, which may account for some results appearing to be inconsistent.
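The independent-samples case described above can be sketched in a few lines. The means and standard errors are invented, the function names are hypothetical, and comparing the statistic with 1.96 is an assumption based on the large-sample approximation at the 5 percent level; the report's own comparisons mostly used custom-written SAS programs instead.

```python
import math


def se_diff_independent(se_1, se_2):
    """Standard error of the difference for independently sampled groups."""
    return math.sqrt(se_1 ** 2 + se_2 ** 2)


def t_statistic(mean_1, mean_2, se_diff):
    """Difference between two means divided by its standard error."""
    return (mean_1 - mean_2) / se_diff


# Hypothetical group means and standard errors (not TIMSS results).
mean_a, se_a = 510.0, 3.0
mean_b, se_b = 498.0, 4.0

se_d = se_diff_independent(se_a, se_b)
t = t_statistic(mean_a, mean_b, se_d)

# Assumed critical value: 1.96 for a two-sided test at the 5 percent level.
is_significant = abs(t) > 1.96
```

In this invented example the 12-point difference has a standard error of 5 points, so the difference would be judged statistically significant.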

Standard error

Because of the technical nature of TIMSS, the calculation of statistics such as means and proportions has some uncertainty due to:

(i)    generalising from the sample to the total Year 5 school population; and
(ii)    inferring each student’s proficiency from their performance on a subset of questions.

The standard errors provide a measure of this uncertainty. In general, we can be 95 percent confident that the true population value lies within an interval of 1.96 standard errors either side of the given statistic. This confidence interval is represented in graphs by the lines extending in either direction from the points.
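The interval described above can be computed directly from an estimate and its standard error. The function name and the example values are hypothetical.

```python
def confidence_interval_95(estimate, standard_error):
    """Return (lower, upper): 1.96 standard errors either side of the estimate."""
    margin = 1.96 * standard_error
    return estimate - margin, estimate + margin


# Hypothetical mean of 504 scale score points with a standard error of 2.5.
lower, upper = confidence_interval_95(504.0, 2.5)
```

For these invented values the interval runs from roughly 499.1 to 508.9 scale score points.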

Statistically significant

To determine whether a difference between two means is real rather than a result of sampling and measurement uncertainty, it is usual to undertake tests of significance. These tests take into account the means and the error associated with them. If a result is reported as not being statistically significant, then, although the means might be slightly different, we do not have sufficient evidence to infer that they are different. All tests of statistical significance referred to in this report are at the 95 percent confidence level.

Weighting

Due to the use of sampling, weights need to be applied when analysing the TIMSS data. Weighting ensures that any information presented more closely reflects the total population of Year 5 students rather than just the sample. The TIMSS weighting takes into account school, class, and student level information and the overall sampling weight is a product of the school, class, and student weights.
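A sketch of the overall sampling weight as the product of its three components. Treating each component as the inverse of a selection probability is a common convention and an assumption here, not something stated in the text; the probabilities and names are invented, and real TIMSS weights also include adjustments not shown.

```python
def overall_sampling_weight(school_weight, class_weight, student_weight):
    """Overall weight as the product of school, class, and student weights."""
    return school_weight * class_weight * student_weight


# Hypothetical selection probabilities (assumption: weight = 1 / probability).
school_weight = 1 / 0.08   # school selected with probability 0.08
class_weight = 1 / 0.5     # one of two Year 5 classes selected
student_weight = 1 / 1.0   # all students in the selected class take part

weight = overall_sampling_weight(school_weight, class_weight, student_weight)
```

Under these invented probabilities, each sampled student would stand for about 25 students in the population.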

Footnote

  1. See Martin, Mullis, & Foy (2008), Exhibit A.4, for this information.
