Standard setting simplified - Angoff

What is Angoff?

Simply put, Angoff is a method in which a group of experts judges how difficult each item in an exam is in order to determine the cut-off score. The cut-off score (or mark) is like a line in the sand that divides students into two groups: those below the cut-off and those above it. Below the cut-off may indicate a fail and above the cut-off may indicate a pass.

The Angoff method calculates a cut-off mark based on the performance of candidates in relation to a defined standard (absolute) as opposed to how they perform in relation to their peers (relative). It involves a judgement being made on exam items (test-centred) as opposed to exam candidates (examinee-centred), and is widely used to standard set high stakes examinations. You will find there are different variations of the Angoff method, but this blog will bring you up to speed with the process that is used by most organisations.

How is Angoff calculated?

In Angoff, a group of subject experts is asked “What percentage of borderline candidates would answer this item correctly?” Before making a judgement, the experts must agree on the definition of a ‘borderline’ candidate. All judges must share the same definition of a borderline candidate for the Angoff cut-off score to be reliable. This requires a good understanding of what makes a candidate just competent enough to pass, which is why the judges must be subject experts.

The experts’ judgements for an item should be the same or within a close, defined range of each other (around 10%). The mean of all the judges’ estimates is calculated for each item; this is often referred to as the ‘predicted difficulty’. The predicted difficulties (means) for all items are then added together and divided by the total number of items in the exam to get the cut-off percentage. Applied to the total marks for the exam, this percentage gives the cut-off mark.
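The calculation described above can be sketched in a few lines of Python. This is a minimal illustration rather than a definitive implementation: the ratings, the item names and the function names are all hypothetical, and it assumes every item carries the same number of marks.

```python
from statistics import mean

# Hypothetical ratings (not real exam data): each list holds four judges'
# estimates of the percentage of borderline candidates who would answer
# that item correctly.
ratings = {
    "item_1": [60, 65, 55, 60],
    "item_2": [40, 45, 50, 45],
    "item_3": [70, 75, 70, 65],
}

def predicted_difficulties(ratings):
    """Mean of the judges' estimates for each item."""
    return {item: mean(judgements) for item, judgements in ratings.items()}

def cut_off_percentage(ratings):
    """Sum of the predicted difficulties divided by the number of items."""
    return mean(list(predicted_difficulties(ratings).values()))

def cut_off_mark(percentage, total_marks):
    """Cut-off expressed in marks (assumes equally weighted items)."""
    return percentage * total_marks / 100

difficulties = predicted_difficulties(ratings)
cut_off = cut_off_percentage(ratings)

print(difficulties)
print(f"Cut-off: {cut_off:.1f}%")
print(f"Cut-off mark out of 60: {cut_off_mark(cut_off, 60):.1f}")
```

How the final figure is rounded (up, down or to the nearest mark) is a policy decision for the organisation, so the sketch leaves the value unrounded.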

What if the experts disagree?

If the experts’ judgements for an item fall outside the agreed range, they discuss how they came to their decisions in an effort to reach agreement. The Angoff is then recalculated based on the new judgements. This process may be repeated.
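One way to spot the items that need discussion is to check the spread of the judges' estimates. The sketch below is only an illustration under stated assumptions: it treats the agreed range as the gap between the highest and lowest estimate, uses a hypothetical limit of 10 percentage points, and works on made-up ratings. Your organisation may define the range differently.

```python
# Hypothetical ratings (not real exam data): percentage estimates from
# four judges for each item.
ratings = {
    "item_1": [60, 65, 55, 60],
    "item_2": [30, 45, 60, 45],
    "item_3": [70, 75, 70, 65],
}

def items_needing_discussion(ratings, max_spread=10):
    """Return items whose highest and lowest estimates differ by more
    than max_spread percentage points, together with that spread.

    max_spread is an assumed threshold, not a fixed part of the method.
    """
    flagged = {}
    for item, judgements in ratings.items():
        spread = max(judgements) - min(judgements)
        if spread > max_spread:
            flagged[item] = spread
    return flagged

flagged = items_needing_discussion(ratings)
print(flagged)
```

Here only item_2 is flagged, so the judges would discuss that item, revise their estimates and the cut-off would be recalculated.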

The following table shows how a cut-off mark may be calculated using the Angoff method:

In this example, the total average percentage is 56.7. This can be rounded to 57, giving a cut-off percentage of 57%. If the test were out of 100 marks, a borderline candidate would be expected to get 57/100 marks.

Backing up your Angoff

In addition to the judges’ predictions of difficulty, there are a number of other factors that can be considered to ensure the validity and reliability of the cut score. A psychometrician will generally take a sample of past marks and candidates’ expected results to reinforce this method. Another way of supporting the Angoff method is to use another standard setting method, such as borderline regression, to provide results based on real candidate data for comparison. Also, if the final candidate results don’t reflect the standard that would be expected of the students taking the exam, the standard setting method can be re-evaluated.

Should you use Angoff?

Angoff is a well-established method of standard setting. It’s most commonly used in high stakes exams such as OSCEs and MCQs and is most reliable when supported by another standard setting method.

Advantages of using Angoff

Holds up in court - Angoff is the most widely used, formal method of standard setting. There are many published works on Angoff and it is justifiable for use in high stakes examinations. If questioned, the Angoff method would hold up in court.

Reflects the difficulty of the content - Angoff focuses on just the content of the exam and the level at which candidates should be performing to meet a certain standard.

Simple when you know how - This is a fairly straightforward process once all judges are trained.

Recyclable - If an item is reused in another exam with the same context (same year group), the Angoff ‘predicted difficulty’ can be re-used so subject experts have fewer items to judge.

Disadvantages of using Angoff

Needs back-up - It doesn’t use real exam data to estimate a cut-off mark, so it is considered more accurate and reliable if backed up by an examinee-centred method that does use real candidate data, e.g. borderline regression.

Long process - The process can be time consuming and labour intensive as the judges must look at every test item. This can lead to the judges becoming fatigued and impatient, and can encourage them to rush through the items.

Confidence is key - Judges must be experts in their field. This method relies on the judges being confident and consistent with their definition of a ‘borderline’ candidate, and not just assuming an ‘average’ candidate.

Time and a place - You need a large sample of judges for accuracy and reliability, as well as a good range of different ages, genders, ethnicities and levels of seniority. It can be difficult to collect a large sample with such specifications. Digital exam software like Maxexam makes it easier to collect experts’ judgements as they can complete them remotely, so you don’t need everyone in one place at one time.

To summarise, Angoff is a standard setting method in which subject experts judge how difficult each item in an exam is by predicting the percentage of borderline candidates that would answer it correctly. It is widely used in high stakes exams and holds up in court if challenged.

