How Much Difference Does It Make? Notes on Understanding, Using, and Calculating Effect Sizes for Schools
Publication Details
A good way of presenting differences between groups or changes over time in test scores or other measures is by ‘effect sizes’, which allow us to compare things happening in different classes, schools or subjects regardless of how they are measured. This booklet is designed to help school staff to understand and use effect sizes, and includes handy tips and warnings as well as useful tables to calculate effect size values from change scores on standardised tests.
Author(s): Ian Schagen, Research Division [Ministry of Education], Edith Hodgen [NZCER]
Date Published: March 2009
How big is big enough?
A frequent question is “How big should an effect size be to be educationally significant?” This is a bit like “How long is a bit of string?”, as the answer depends a lot on circumstances and context. Some people have set up categories for effect sizes: e.g., below 0.2 is “small”, around 0.4 is “medium” and above 0.6 is “large”. But these can be misleading if taken too literally.
Suppose you teach a class a new topic, so that initially they pretty well all rate zero on your 8-point assessment scale. You would expect that most of them would reach at least the mid-point of the scale afterwards, with some doing even better. An effect size of (4.0 - 0.0)/2.0 = 2.0 (an average gain from zero to the mid-point of the scale, 4, divided by a nominal standard deviation of 2) would not be an unreasonable expectation, but this would only be “large” within a restricted context. Similarly, if you took a small group with a limited attainment range and managed to raise their scores, you could get quite big effect sizes; but these might not be transferable to a larger population.
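The arithmetic in the new-topic example can be sketched in a few lines of Python. This is only an illustration, assuming the effect size is taken as the change in mean score divided by a standard deviation; the function name `effect_size` and its parameter names are made up for this sketch and are not part of the booklet.

```python
def effect_size(mean_before, mean_after, sd):
    """Change in mean score, expressed in standard deviation units."""
    if sd <= 0:
        raise ValueError("standard deviation must be positive")
    return (mean_after - mean_before) / sd


# New-topic example from the text: the class starts at 0 on an 8-point scale,
# reaches the mid-point (4) on average, and the nominal standard deviation is 2.
new_topic = effect_size(mean_before=0.0, mean_after=4.0, sd=2.0)
print(new_topic)  # 2.0
```

The same function gives a much smaller value for the same 4-point gain if the scores are more spread out (a larger `sd`), which is one reason an effect size has to be read in its context.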
On the other hand, if you managed to raise the mathematics aptitude of the whole school population of New Zealand by an amount equivalent to an effect size of 0.1, this would raise our scores on international studies like PISA and TIMSS by 10 points, because those scales are constructed with a standard deviation of about 100. That result would gain loud applause all round. [4]
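The conversion from an effect size to scale points can be sketched as follows. This is a rough illustration, assuming the international scale has a standard deviation of about 100 points; the constant `ASSUMED_SCALE_SD` and the function `points_gain` are hypothetical names introduced here. The starting score of 532 is the figure quoted in the footnote.

```python
# Assumption: the international scale has a standard deviation of about 100 points.
ASSUMED_SCALE_SD = 100.0


def points_gain(effect, scale_sd=ASSUMED_SCALE_SD):
    """Convert an effect size into a gain in scale points."""
    return effect * scale_sd


gain = points_gain(0.1)
print(round(gain, 1))     # 10.0
print(round(532 + gain))  # 542, the shift described in the footnote
```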
Footnote
- It would move New Zealand’s average PIRLS score in 2006 from 532 to 542, above England and the USA.
Sections
- Acknowledgements
- Introduction
- Getting a standard deviation
- Possible comparisons using effect sizes
- Uncertainty in effect sizes
- How do we know effect sizes are real?
- How big is big enough?
- Cautions, caveats, and Heffalump traps for the unwary
- How easy is it to calculate effect sizes for New Zealand standardised tests?
Contact Us
For more publication-related information, please email: information.officer@minedu.govt.nz