How much difference does it make? Notes on understanding, using, and calculating effect sizes for schools

Publication Details

A good way of presenting differences between groups or changes over time in test scores or other measures is by ‘effect sizes’, which allow us to compare things happening in different classes, schools or subjects regardless of how they are measured. This booklet is designed to help school staff to understand and use effect sizes, and includes handy tips and warnings as well as useful tables to calculate effect size values from change scores on standardised tests.

Author(s): Ian Schagen, Research Division, Ministry of Education and Edith Hodgen, New Zealand Council for Educational Research.

Date Published: March 2009


Section 6: How big is big enough?

A frequent question is "How big should an effect size be to be educationally significant?" This is a bit like "How long is a bit of string?", as the answer depends a lot on circumstances and context. Some people have set up categories for effect sizes: e.g., below 0.2 is "small", around 0.4 is "medium" and above 0.6 is "large". But these can be misleading if taken too literally.

Suppose you teach a class a new topic, so that initially they pretty well all rate zero on your 8-point assessment scale. You would expect that most of them would reach at least the mid-point of the scale afterwards, with some doing even better. An effect size of 4.0/2.0 = 2.0 (mid-point of scale = 4; nominal standard deviation = 2) would not be an unreasonable expectation, but this would only be "large" within a restricted context. Similarly, if you took a small group with a limited attainment range and managed to raise their scores, you could get quite big effect sizes; but these might not be transferable to a larger population.
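The arithmetic in this example can be sketched in a few lines of Python. This is only an illustration: the function name `effect_size` and its parameter names are invented here, and it assumes the simplest definition used in the paragraph above, namely the mean change divided by a nominal standard deviation.

```python
def effect_size(mean_change: float, standard_deviation: float) -> float:
    """Return the mean change expressed in standard deviation units.

    Hypothetical helper for illustration; the names are not from the booklet.
    """
    if standard_deviation <= 0:
        raise ValueError("standard deviation must be positive")
    return mean_change / standard_deviation


# New-topic example from the text: the class moves from about 0 to the
# mid-point (4) of an 8-point scale, with a nominal standard deviation of 2.
new_topic_effect = effect_size(mean_change=4.0, standard_deviation=2.0)
print(new_topic_effect)  # 2.0
```

A value of 2.0 looks enormous next to the "small/medium/large" labels, which is exactly why those labels should not be applied without thinking about the context.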

On the other hand, if you managed to raise the mathematics aptitude of the whole school population of New Zealand by an amount equivalent to an effect size of 0.1, this would raise our scores on international studies like PISA and TIMSS by 10 points (the scales for these studies are constructed with a standard deviation of 100), a result which would gain loud applause all round.
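The conversion from an effect size to scale points can be sketched as below. The function name `points_gained` and the constant `SCALE_SD` are illustrative names only, and the code assumes a scale with a standard deviation of 100, as used by the international studies; the starting score of 532 is the 2006 PIRLS average quoted in the note at the end of this section.

```python
# Assumption: the international study scale has a standard deviation of 100.
SCALE_SD = 100.0


def points_gained(effect_size: float, scale_sd: float = SCALE_SD) -> float:
    """Convert an effect size into points on a scale with the given SD."""
    return effect_size * scale_sd


# A population-wide effect size of 0.1 on a scale with SD 100.
gain = points_gained(0.1)

# Applied to New Zealand's 2006 PIRLS average of 532 (from the note below).
new_average = 532 + gain
print(round(gain), round(new_average))
```

The same small effect size that would be unremarkable for a single class translates into a substantial shift when it applies to a whole national population.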


  1. It would move New Zealand's average PIRLS score in 2006 from 532 to 542, above England and the USA.
