Assessing Assessment: A Testing Journey

When I began at my current school, testing in KS3 history was piecemeal. Sometimes students were assessed on extended writing pieces on selected topics from the curriculum. These pieces were open book, with students allowed their exercise books and any other material they’d used in lessons. In other assessments students weren’t expected to write at all; in Y9, for example, students took part in a debate on whether the “Lions led by Donkeys” interpretation of generals and soldiers in World War One was fair, and teachers assessed them on their verbal contributions. All KS3 assessments were levelled using the now extinct National Curriculum Levels.

This system was damaging planning, teaching and learning. With no formal tests or proper examinations in KS3, the curriculum was presenting students with ‘Theme Park’ history: they dipped into various topics but were not supported in making meaningful links between them, or in retaining what they had learned over time.

This generated enormous problems in KS4, where the assessment system was markedly different. From the beginning of Y10, students were given half-termly mock GCSE exams based on the content they’d covered in the previous six weeks. The disconnect between the two systems was devastating. Students spent Years 7-9 feeling they were improving at one thing, then chose the GCSE course and found that what they’d got better at was of almost no use to them. Pupils were climbing one ladder, then being told that ladder no longer existed and being presented with a new one they knew nothing about. Many quickly became despondent.

A consequence of this inherently flawed system was dramatic underperformance in final GCSE examinations, which baffled SLT, who quite understandably could not see how students assessed at Level 6 in Y9 were getting E and F grades at the end of Y11.

To address this, we began by bringing KS3 assessment in line with KS4. At both key stages, students were given knowledge organisers and assessment preparation sheets before each exam, which formed the basis for revision. All KS3 assessments became half-termly exams with questions worded in the same way as those on GCSE papers. We also revised KS3 Schemes of Work and planning to ensure that the focus of each lesson was the second-order concepts identified as GCSE Assessment Objectives.

These changes improved results: last year’s GCSE cohort, the first to have studied our revised curriculum (albeit only in Y9), did significantly better than the cohorts that preceded them. Despite this improvement, results were still well below national averages. There is still a lot of work to do.

Last year I think I managed to work out what was going wrong. Although our revised KS3 curriculum gave students more practice at answering historical questions, its modular nature meant that they were not tested on their ability to recall knowledge over longer periods of time. Students appeared to be doing better than they really were, because test scores reflected only the material most recently covered. This problem wasn’t isolated to KS3; KS4 students were being tested in the same way.

This year, we refined our system again by adopting a model I first saw blogged about by Michael Fordham.  Now all students are assessed, every half term, on everything they’ve covered up to the point of the test.  This means that students are now expected to revise all content, not just the content they’ve covered most recently.

We’ve just entered our second set of data and results are already interesting.

Our first half term of data looked largely as it did under our previous system. This rang some alarm bells with SLT, because some students, especially the most able, scored considerably higher than they did in other subjects. This was because they were being tested on material they’d learned perhaps only a week before, so had no trouble recalling it.

Fortunately I work in a school with a very understanding, open and reflective SLT, and they were willing to indulge me on the strength of a prediction: in history, under the new assessment system, test scores would dip as students were tested on knowledge covering a longer period of time, then steadily rise as they came to understand the importance of revision and got better at it.

Data 2 has just gone in and I’ve had a dig into the scores of my classes. I’ve found that the first part of my prediction (that scores would initially go down) was sort of right in some cases, but mostly wrong.

Firstly, as predicted, the scores of lower achievers did dip, and their exam papers show that it was the questions on older material on which they did worst. Although this was initially a bit depressing, it’s important to remember that while scores might have got worse, achievement really hasn’t. Our new system has helpfully exposed the weak long-term memory of our lower achievers, and now we can be sure this is something we need to work on.

My second finding was more unexpected. The scores of mid-achievers didn’t dip very much and in some cases actually improved slightly. I’ll need to look into why this was the case, but I’ve already got some lines of enquiry I’d like to pursue. The first is the simplest: our mid-level students have better memories than we thought, and were more responsive to our explanations of how much content they needed to revise than we expected them to be. If this is true, our expectations have helpfully been raised. The second is that my teaching has changed and now places more emphasis on long-term recall. I’m sure there’s something in this; knowing that my classes are going to be assessed on a greater body of knowledge has resulted in more low-stakes testing and more frequent references to previously covered topics. I’d be very surprised if this hasn’t made a difference.

My final finding has been rather thrilling. The scores of our most able students have improved significantly. Their answers show that their greater knowledge base is allowing them to form connections and meaningful comparisons between different periods; they have deeper wells from which to draw water. In one particularly good answer, a very able Y8 student compared Elizabeth’s success as a monarch with that of Henry VII, whom we had covered at the beginning of the last half term.

Many of the conclusions I’ve drawn probably look very obvious, and I do feel slightly foolish for taking so long to realise that implementing a knowledge-based curriculum and a summative assessment system would be beneficial. Without wanting to sound presumptuous, I’d urge any history department, or indeed any teacher, to at least trial this method with one or two classes. The results really do look dramatic, and it frustrates me that it took me so long to arrive at this system. Of course, there are reasons for that, and some of them infuriate me.

But that’s not the focus of this post, which I want to keep positive.

Of course we haven’t cracked it.  But we’re closer.
