Principles of assessment

Reliability

If a particular assessment were totally reliable, assessors acting independently using the same criteria and mark scheme would come to exactly the same judgment about a given piece of work. Complete objectivity is impossible to achieve, but in the interests of quality assurance, standards and fairness it is a goal worth aiming for in summative assessment. To this end, what has been described as the 'connoisseur' approach to assessment (like a wine-taster or tea-blender of many years' experience, not able to describe exactly what they are looking for but 'knowing it when they find it') is no longer acceptable. Explicitness in terms of learning outcomes and assessment criteria is vitally important in attempting to achieve reliability. The outcomes and criteria should be explicit to the students when the task is set, and where there are multiple markers they should be discussed, and preferably tried out on some sample cases, before being used 'for real'.

Validity

Just as important as reliability is the question of validity. Does the assessed task actually assess what you want it to? Just because an exam question includes the instruction 'analyse and evaluate' does not actually mean that the skills of analysis and evaluation are going to be assessed. They may be, if the student is presented with a case study scenario and data they have never seen before. But if they can answer perfectly adequately by regurgitating the notes they took from the lecture you gave on the subject then little more may be being assessed than the ability to memorise. There is an argument that all too often in British higher education we assess the things which are easy to assess, which tend to be basic factual knowledge and comprehension rather than the higher order objectives of analysis, synthesis and evaluation.

Relevance and transferability

There is much evidence that human beings do not find it easy to transfer skills from one context to another, and there is in fact a debate as to whether transferability is in itself a separate skill which needs to be taught and learnt. Whatever the outcome of that debate, the transfer of skills is certainly more likely to be successful when the contexts in which they are developed and used are similar. It is also true to say that academic assessment has traditionally been based on a fairly narrow range of tasks, with arguably an emphasis on knowing rather than doing; it has therefore tended to develop a fairly narrow range of skills. For these two reasons, when devising an assessment task it is important both that it addresses the skills you want the student to develop and that, as far as possible, it puts them into a recognisable context, with a sense of 'real purpose' behind why the task would be undertaken and a sense of a 'real audience', beyond the tutor, for whom the task would be done.

Criterion-referenced v norm-referenced assessment

In criterion-referenced assessment, particular abilities, skills or behaviours are each specified as a criterion which must be reached. The driving test is the classic example of a criterion-referenced test. The examiner has a list of criteria, each of which must be satisfactorily demonstrated in order to pass - completing a three-point turn without hitting either kerb, for example. The important thing is that failure in one criterion cannot be compensated for by above-average performance in others; neither can you fail despite meeting every criterion simply because everybody else that day surpassed the criteria and was better than you.
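The no-compensation rule can be sketched in a few lines of Python. This is an illustration only: the `passes` function and the criterion names are hypothetical and are not taken from any real driving test specification.

```python
def passes(results: dict[str, bool]) -> bool:
    """Criterion-referenced pass: every criterion must be met.

    A strong performance on one criterion cannot offset a failure on
    another, and other candidates' results play no part in the decision.
    """
    return all(results.values())


# Hypothetical criteria, for illustration only.
candidate = {
    "three_point_turn": True,
    "emergency_stop": True,
    "use_of_mirrors": False,
}

print(passes(candidate))  # False: one unmet criterion fails the test
```

Note that the function takes no information about the rest of the cohort, which is exactly what distinguishes this approach from norm referencing.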

Norm-referenced assessment makes judgments on how well the individual did in relation to others who took the test. Often used in conjunction with this is the curve of 'normal distribution', which assumes that a few will do exceptionally well, a few will do badly, and the majority will cluster in the middle as average. A cohort may not fit this assumption for any number of reasons: it may have been a poor intake, or a very good one; it may have been taught well, or badly; or, in introductory courses in particular, half may have done it all before and half may be just starting the subject, giving a bimodal distribution. Despite this, there are even some assessment systems which require results to be manipulated to fit.
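A small sketch may make the consequence of norm referencing concrete. It assumes a deliberately crude, hypothetical scheme in which the top 20% of a cohort by rank receive an 'A' and everyone else a 'B'; the function name, the share and the marks are all invented for illustration.

```python
def norm_referenced_grades(marks: dict[str, float], top_share: float = 0.2) -> dict[str, str]:
    """Award 'A' to the top share of the cohort by rank and 'B' to the rest.

    The grade depends only on rank within the cohort, not on the mark itself.
    """
    ranked = sorted(marks, key=marks.get, reverse=True)
    n_top = max(1, round(len(ranked) * top_share))
    return {name: ("A" if i < n_top else "B") for i, name in enumerate(ranked)}


# Ann produces identical work (a mark of 70) in two different cohorts.
weak_cohort = {"Ann": 70, "Ben": 55, "Cat": 50, "Dan": 45, "Eve": 40}
strong_cohort = {"Ann": 70, "Ben": 85, "Cat": 80, "Dan": 75, "Eve": 72}

print(norm_referenced_grades(weak_cohort)["Ann"])    # A
print(norm_referenced_grades(strong_cohort)["Ann"])  # B
```

The same piece of work earns a different grade purely because of who else sat the test, which is the objection raised above.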

The logic of a model of course design built on learning outcomes is that the assessment should be criterion-referenced at least to the extent that sufficiently meeting each outcome becomes a 'threshold' minimum to passing the course. If grades and marks have to be generated, a more complex system than pass/fail can be devised by defining the criteria for each grade either holistically grade by grade, or grade by grade for each criterion (see below).

Writing and using assessment criteria

Assessment criteria describe how well a student has to be able to achieve the learning outcome, either in order to pass (in a simple pass/fail system) or in order to be awarded a particular grade; essentially they describe standards. Most importantly they need to be more than a set of headings. Use of theory, for example, is not on its own a criterion. Criteria about theory must describe what aspects of the use of theory are being looked for. You may value any one of the following: the students' ability to make an appropriate choice of theory to address a particular problem, or to give an accurate summary of that theory as it applies to the problem, or to apply it correctly, or imaginatively, or with originality, or to critique the theory, or to compare and contrast it with other theories. And remember, as soon as you have more than one assessment criterion you will also have to make decisions about their relative importance (or weighting).
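As a rough illustration of what a weighting decision means in practice, the sketch below combines per-criterion marks into one overall mark. The criterion names, the weights and the `weighted_mark` function are assumptions made for the example, not a recommended scheme.

```python
def weighted_mark(scores: dict[str, float], weights: dict[str, float]) -> float:
    """Combine per-criterion scores into one mark using relative weights."""
    if set(scores) != set(weights):
        raise ValueError("scores and weights must cover the same criteria")
    total_weight = sum(weights.values())
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive number")
    return sum(scores[c] * weights[c] for c in scores) / total_weight


# Hypothetical weighting: choice of theory counts double.
weights = {"choice_of_theory": 2, "accuracy_of_summary": 1, "critique": 1}
scores = {"choice_of_theory": 60, "accuracy_of_summary": 70, "critique": 50}

print(weighted_mark(scores, weights))  # 60.0
```

Changing the weights changes which students come out ahead, so the weighting is itself a statement about what the task values.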

Graded criteria are criteria related to a particular band of marks or honours classification or grade framework such as Pass, Merit, Distinction. If you write these, be very careful about the statement at the 'pass' level. Preferably start writing at this level and work upwards. The danger in starting from, e.g., first-class honours is that as you move downwards the criteria become more and more negative. Once the criteria are drafted, ask yourself whether you would be happy for someone meeting the standard expressed for pass, or third class, to receive an award from your institution. Where possible, discuss draft assessment activities, and particularly criteria, with colleagues before issuing them.

Once decided, the criteria and weightings should be given to the students at the time the task is set, and preferably some time should be spent discussing and clarifying what they mean. Apart from the argument of fairness, this hopefully then gives the student a clear idea of the standard they should aim for and increases the chances they will produce a better piece of work (and hence have learnt what you wanted them to). And feedback to the student on the work produced should be explicitly in terms of the extent to which each criterion has been met.