
Assessment


Assessment makes teaching into teaching. Mere presentation, without assessment of what the learners have made of what you have offered them, is not teaching. So assessment is not a discrete process, but is integral to every stage of teaching, from minute to minute as much as from module to module.

And informal assessment (or evaluation) is going on all the time. Every time a student answers a question, or asks one, or starts looking out of the window, or cracks a joke, they are providing you with feedback about whether learning is taking place. It's more an evaluation of the teaching session than of their learning, but the two are inextricable.

Assessment “reaches back” into the rest of teaching: in particular, poorly designed formal assessment regimes can severely hinder student learning and distort the process and subject matter.

All assessment is ultimately subjective: there is no such thing as an “objective test”. Even when there is a high degree of standardisation, the judgement of what things are tested and what constitutes a criterion of satisfactory performance is in the hands of the assessor.

However, we can still make every effort to ensure that assessment is valid, reliable and fair.

Validity

A valid form of assessment is one which measures what it is supposed to measure.
  • It does not assess memory, when it is supposed to be assessing problem-solving (and vice versa).
  • It does not grade someone on the quality of their writing, when writing skills are not relevant to the topic being assessed, but it does when they are. 
  • It does seek to cover as much of the assessable material as practicable, not relying on inference from a small and arbitrary sample (and here it spills over into reliability).

Unfortunately, no assessment is completely valid.

Reliability

Or "replicability". A reliable assessment will produce the same results on re-test, and will produce similar results with a similar cohort of students, so it is consistent in its methods and criteria.

Fairness

This is really an aspect of validity, but important enough to note in its own right. Fairness ensures that everyone has an equal chance of getting a good assessment. This may include (where appropriate) anonymity of submitted material, so that extraneous considerations (such as the quality of contributions in seminars, if they are not part of the assessment scheme) cannot influence the final result.

Purposes of Assessment

The traditional distinction is between summative and formative assessment.

Summative assessment is what students tend to focus on. It is the assessment, usually on completion of a course or module, which says whether or not you have "passed". It is—or should be—undertaken with reference to all the objectives or outcomes of the course, and is usually fairly formal.

Considerations of security—ensuring that the student who gets the credit is the person who did the work—assume considerable importance in summative assessment, which may push in the direction of using conservative approaches such as examinations, which are not necessarily highly valid.

Note that all summative assessment can also be formative, if the feedback offered is sufficient.

Formative assessment is going on all the time. Its purpose is to provide feedback on what students are learning:

  • to the student: to identify achievement and areas for further work
  • to the teacher: to evaluate the effectiveness of teaching to date, and to focus future plans.

While grades or marks may assume primary importance in summative assessment, their role in formative assessment is simply to contribute to the feedback process: marks against specific criteria (such as "use of sources", "presentation of argument") may be much more use than global judgements.

One more distinction

It is also possible to distinguish between norm-, criterion- and even ipsative-referenced assessment schemes.

Norm-referencing is basically competitive: it is a ranking exercise. Out of any given group, the top 5% get "A"s, the next 10% get "B"s, and so on, and the bottom 50% fail. (The figures are, of course, arbitrary.) This may be fair enough when the purpose is to select for a fixed and limited number of positions, such as jobs or places on a course or a sports team. The quality, however, can vary widely from one group of candidates to another. It may reassure the public in sensitive areas, because a fixed proportion of candidates is always rejected, but it can be grossly unfair. It also effectively demands a test in which less able candidates are progressively rejected, like a high-jump competition in which the bar is progressively raised until competitors fail to jump it (or a reality-TV show from which contestants are progressively voted off). IQ tests tend to be structured like this, and of course the IQ is a norm-referenced measure.
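To make the ranking idea concrete, here is a minimal sketch in Python. The function name, the candidate names and the 35% "C" band are assumptions made for illustration; only the 5%, 10% and 50% figures come from the paragraph above, and they are arbitrary there too.

```python
def norm_referenced_grades(scores, bands=(("A", 5), ("B", 10), ("C", 35))):
    """Rank candidates and award grades by fixed percentages of the group.

    `bands` lists (grade, percent of the group) from the top down.
    The "C" band is an assumed filler; everyone below the bands fails.
    """
    ranked = sorted(scores, key=scores.get, reverse=True)  # best score first
    grades = {}
    start = 0
    cumulative_percent = 0
    for grade, percent in bands:
        cumulative_percent += percent
        end = cumulative_percent * len(ranked) // 100  # cut-off position in the ranking
        for name in ranked[start:end]:
            grades[name] = grade
        start = end
    for name in ranked[start:]:
        grades[name] = "Fail"
    return grades


# Two hypothetical cohorts of 20: one scores 80-99, the other 10-29.
strong_cohort = {f"candidate_{i}": 80 + i for i in range(20)}
weak_cohort = {f"candidate_{i}": 10 + i for i in range(20)}

strong_grades = norm_referenced_grades(strong_cohort)
weak_grades = norm_referenced_grades(weak_cohort)
```

Both cohorts end up with exactly the same spread of grades, and half of each fails, although every member of the first group outscored every member of the second. Only the position in the ranking counts.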

Criterion-referencing is the term used for assessment against fixed criteria. [Personal beef here: "criterion" is the singular, "criteria" the plural: I heard someone refer to "criterias" the other day!] Theoretically, it can mean that everyone who undertakes a given assessment may pass it, or no-one might. Even norm-referencing requires reference to criteria, of course, but full criterion-referencing ignores the statistical implications of the assessment profile: it is thus inherently fairer, as long as the criteria are determined in advance, and they are valid and reliable.
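A comparable sketch for criterion-referencing, again in Python and again only illustrative: the criterion names are borrowed from the marking examples earlier on this page, while the pass marks of 50, the function name and the two cohorts are assumptions.

```python
# Pass marks fixed in advance (hypothetical values).
PASS_MARKS = {"use of sources": 50, "presentation of argument": 50}


def criterion_referenced_pass(marks, pass_marks=PASS_MARKS):
    """A candidate passes only by meeting every predetermined criterion."""
    return all(marks[criterion] >= threshold
               for criterion, threshold in pass_marks.items())


# Two hypothetical cohorts: the result depends on the criteria,
# not on how anyone else in the group performed.
able_cohort = {
    "Ann": {"use of sources": 72, "presentation of argument": 65},
    "Ben": {"use of sources": 58, "presentation of argument": 51},
}
struggling_cohort = {
    "Cat": {"use of sources": 45, "presentation of argument": 60},
    "Dan": {"use of sources": 30, "presentation of argument": 38},
}

able_results = {name: criterion_referenced_pass(m) for name, m in able_cohort.items()}
struggling_results = {name: criterion_referenced_pass(m) for name, m in struggling_cohort.items()}
```

In the first cohort everyone passes and in the second no-one does, which is exactly the outcome a norm-referenced scheme can never produce.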

And then there is ipsative assessment, which is assessment against yourself, or more particularly against your own "personal best" performance. It is more relevant to performance coaching, special needs education and therapy than to most mainstream teaching.
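Ipsative referencing can be sketched in the same way. The function below is hypothetical and assumes that a higher score is better; it compares a learner's latest result only with that learner's own previous best.

```python
def ipsative_feedback(history, latest):
    """Compare the latest result with the learner's own previous best.

    `history` holds the learner's earlier scores; higher is assumed better.
    """
    if not history:
        return "first attempt: nothing to compare against yet"
    personal_best = max(history)
    difference = latest - personal_best
    if difference > 0:
        return f"new personal best, up {difference} on {personal_best}"
    if difference == 0:
        return f"equals personal best of {personal_best}"
    return f"{-difference} below personal best of {personal_best}"


feedback = ipsative_feedback([40, 55, 52], 58)
```

No other candidate and no external pass mark enters into the judgement, which is why the approach suits coaching better than certification.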

The story goes that at a college of one of our ancient universities, the rowing crew desperately needed the services of a certain undergraduate, who was not noted for his academic prowess, and who was in danger of being thrown out if he did not pass his end-of-year history exams, which took the form of a viva.

He had to score a minimum of 50% to pass and retain his place. The examiner first asked him when the New Poor Law was passed. He guessed at 1650. This was incorrect. [Look it up!]

The examiners were getting desperate—the college's reputation was at stake—so the examiner said: "Now listen carefully, and do not guess. Do you know what significant event took place in 1776?"

The undergraduate thought for a moment and then said, regretfully, "No". This was obviously correct, so he passed and the college was saved.

Valid? Reliable? Fair? Norm-referenced? Criterion-referenced? Ipsative? Discuss!


 

To reference this page
copy and paste the text below:

ATHERTON J S (2005) Teaching and Learning:    [On-line] UK: Available:  Accessed:

Original material © James Atherton: last up-dated 15 August, 2005
