## Check Your Last Multiple Choice Exam

Would you like to check how likely it is that a student could guess their way to a passing grade on your most recent multiple choice exam? Enter the key details of that exam into the calculator: the number of students in your class, the number of multiple choice questions on the exam, the passing grade, and the average number of answer options per question. Cell G24 will show an estimate of how many students in your class could have passed by purely random guessing. Was your number 0? If not, read on to find out how to make your exams more effective! If so, read on to find out whether you could save time in writing your exams!

## Building An Effective Exam

Designing an effective exam is one of the most important and time-consuming tasks any instructor faces when teaching a group of students. Once the number of students exceeds the largest group for which an instructor can sustain focused one-on-one mentoring, which might be as few as two students, we must turn to some form of assessment to gauge a student’s progress.

Two main forms of assessment are commonly identified: formative and summative. Each has its own purpose. Formative assessment is primarily an opportunity for a student to self-evaluate knowledge and to practice skills. Summative assessment is primarily a yardstick to evaluate the student’s current understanding of a topic. In an ideal world, a student would have access to as much formative assessment as they desire, and still face an effective and discriminating summative assessment that appropriately identifies strengths and weaknesses of the student’s understanding.

However, time is an important factor, and instructors and other assessment designers need to find a realistic balance between the ideal of near-infinite, high-quality questions and the ineffectiveness of too few questions. One common approach is designing assessments partially or completely composed of multiple choice questions (MCQ). To assist in this balance, I would like to present some statistical considerations for MCQ assessment, so that instructors can focus their effort to meet the needs of their students. My analyses are based on my custom-built spreadsheet, which is free to explore and use.

## How Many Questions Do You Need?

Answering this question typically involves many considerations: the number of students in a class, the amount of time students have to complete the exam, and the breadth of content knowledge that needs to be evaluated, to name a few. It also depends on the balance between time and effectiveness mentioned above. In my analysis, I examine the impact of the following generic MCQ characteristics on that question: the number of answer options, including one right answer among one or more distractors; the number of questions posed during the exam; the number of available variations in each of those question slots; the number of students in the class; and the bar for success on the exam.

With values for these few characteristics, we can find the probability that a student will pass an exam by random guessing. We can also consider how luck is distributed among guessers and estimate how many students in a given class are likely to guess 1, 2, or 3 standard deviations better than the average student. Finally, we can estimate what percentage of the exam questions a successful student would have to know in order to pass the exam, again accounting for the luckier guessers.

This last insight is perhaps the most enlightening, and the effect of the average number of answer options on the result is remarkable. For example, a very lucky guesser (two standard deviations above average, occurring about once per 46 guessing students) on a 20-question exam with five answer options per question will need to know roughly 70% of the answers, apart from guessing, in order to reach an 80% passing grade. All other considerations being the same, that student will only need to know 28% of the answers if the exam is true/false or another two-option design. Even a student of average luck would only need to know 60% of the answers on the true/false exam, versus 75% on the five-option exam. Note also that the lower the bar for passing an exam, the more guessing reduces the fraction of answers a student must actually know.
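The percentages in this example can be reproduced with a short calculation. The sketch below is a reconstruction, not the spreadsheet's confirmed formula. It assumes a guesser's success rate is the chance rate plus `z` standard deviations of the binomial fraction correct, computed over the whole exam, and that the student guesses every question they do not know. The function names are hypothetical.

```python
import math

def guess_rate(options: int, questions: int, z: float = 0.0) -> float:
    """Fraction of guessed questions answered correctly by a guesser who is
    z standard deviations luckier than average (assumed binomial model)."""
    p = 1.0 / options
    return p + z * math.sqrt(p * (1.0 - p) / questions)

def required_knowledge(options: int, questions: int,
                       pass_fraction: float, z: float = 0.0) -> float:
    """Fraction of answers a student must actually know to reach the pass bar.

    Solves known + (1 - known) * g = pass_fraction for known, where g is the
    student's guessing success rate.
    """
    g = guess_rate(options, questions, z)
    if g >= pass_fraction:
        return 0.0  # guessing alone already reaches the bar
    return (pass_fraction - g) / (1.0 - g)

for options in (5, 2):
    for z in (0, 2):
        known = required_knowledge(options, 20, 0.80, z)
        print(f"{options} options, z={z}: must know {known:.0%}")
# 5 options, z=0: must know 75%
# 5 options, z=2: must know 68%
# 2 options, z=0: must know 60%
# 2 options, z=2: must know 28%
```

Under these assumptions the output agrees with the figures quoted above (68% rounds to "roughly 70%"). Other conventions, such as taking the standard deviation over only the guessed questions, would give somewhat different numbers.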

Formative assessment serves a different purpose, but the balance of time versus effectiveness remains relevant. By exploring a variety of combinations, we can see that formative quizzes are more reusable when effort is focused on building extra answer options and question variations that can randomly be substituted into question slots for each attempt by the student. One rough estimate of the relative amount of work needed to create a quiz can be made by multiplying the number of question slots by the number of question variations per slot and the average number of answer options per question.

In this framework, a 10-question quiz with two alternating variations per question slot, each with four answer options, takes roughly the same work effort as a four-question quiz with four alternating variations per slot, each with five answer options. In the first scenario, however, students will start to see questions repeated often enough to guarantee correct answers without knowledge beginning around their fourth attempt, and can reach a passing grade that way around the eighth attempt. In the second scenario, with just four questions, they would only start to guarantee correct answers without knowledge at the 10th attempt!
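The work estimate behind this comparison is a simple product, sketched below. The function name `quiz_work` is hypothetical, and the result is only a relative measure (the total number of answer options to be written), not a measure of time.

```python
def quiz_work(slots: int, variations_per_slot: int, options_per_question: int) -> int:
    """Rough relative effort: question slots x variations per slot x answer options."""
    return slots * variations_per_slot * options_per_question

ten_slot_quiz = quiz_work(slots=10, variations_per_slot=2, options_per_question=4)
four_slot_quiz = quiz_work(slots=4, variations_per_slot=4, options_per_question=5)
print(ten_slot_quiz, four_slot_quiz)  # 80 80
```

Both designs come to 80 answer options, which is why the two quizzes are treated as comparable in effort.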

## Examples

We can also investigate the impact of class size on the size of an effective quiz or exam. With a modest class of 25 students, it is possible to proctor an exam just ten questions long, with three answer options per question and a 70% passing grade, with a low likelihood of any student reaching the passing score of seven by random guessing. In a larger course, perhaps an intro course of 450 students, we should expect about nine students to guess seven or more correct answers. Taking it to the modern extreme, an online course with 100,000 students would likely have 1,966 students answer seven or more questions correctly solely by guessing.
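These class-size figures follow from a binomial tail probability. The sketch below assumes every student guesses every question independently and uniformly at random. The function name is hypothetical, and the spreadsheet may arrive at its estimate differently.

```python
from math import comb

def guess_pass_probability(questions: int, options: int, min_correct: int) -> float:
    """Probability that a pure guesser gets at least `min_correct` answers right."""
    p = 1 / options
    return sum(
        comb(questions, k) * p**k * (1 - p) ** (questions - k)
        for k in range(min_correct, questions + 1)
    )

# Ten questions, three options each, seven correct needed to pass.
p_pass = guess_pass_probability(questions=10, options=3, min_correct=7)
print(f"chance one guesser passes: {p_pass:.4f}")

for class_size in (25, 450, 100_000):
    expected_passers = class_size * p_pass
    print(f"{class_size:>7} students: about {expected_passers:.1f} expected to pass by guessing")
```

Under this model a single guesser passes a little under 2% of the time, which gives roughly 0.5 expected lucky passers in a class of 25, about 8.8 in a class of 450, and about 1,966 in a course of 100,000.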

In order to reduce the effectiveness of lucky guessers, the online course instructor would need to create an exam of 40 questions (with three answer options and pass bar of 70%), an exam of 17 questions (with three answer options) but an increased pass bar of 90%, or an exam of 20 questions with five answer options and a pass bar of 70%. The intro course instructor has an easier task. She can double the number of questions, or increase the pass bar to 90%, or increase the average number of answer options up to five; any of those modifications reduces the odds of random guessing success enough that it is unlikely that any one student would manage it.
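The alternatives listed here can be compared on the same footing by computing the expected number of lucky passers for each design. This is a sketch under the same pure-guessing binomial assumption, with the pass mark rounded up to a whole number of questions. The function name and the tuple layout are hypothetical.

```python
from math import comb

def expected_lucky_passers(students: int, questions: int,
                           options: int, pass_percent: int) -> float:
    """Expected number of pure guessers who reach the pass bar (binomial model)."""
    min_correct = -(-questions * pass_percent // 100)  # integer ceiling of the pass mark
    p = 1 / options
    tail = sum(
        comb(questions, k) * p**k * (1 - p) ** (questions - k)
        for k in range(min_correct, questions + 1)
    )
    return students * tail

# (questions, answer options, pass bar in percent), as listed in the text
online_designs = [(40, 3, 70), (17, 3, 90), (20, 5, 70)]
intro_designs = [(20, 3, 70), (10, 3, 90), (10, 5, 70)]

results = {}
for students, designs in ((100_000, online_designs), (450, intro_designs)):
    for questions, options, bar in designs:
        expected = expected_lucky_passers(students, questions, options, bar)
        results[(students, questions, options, bar)] = expected
        print(f"{students:>7} students, {questions} questions, "
              f"{options} options, {bar}% bar: {expected:.3f} expected lucky passers")
```

Each of the six designs brings the expected number of lucky passers below one student under this model, compared with roughly 1,966 and 9 for the original ten-question exam.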

We can now see how the statistics inform our choice of exam design. The easiest exam to write would be the second option, but not all instructors may be comfortable with the higher success rate that students would need to achieve, so the third option may be the most appealing, and it provides a clear recommended target for the size and style of the MCQ exam. If no single alternative best fits an instructor’s needs, they can use the calculator to find a combination that works for them.

## Conclusion

There are many considerations when designing an assessment, including class size and the volume and arrangement of question items. Knowing the exam size and format necessary to accomplish your goals can be a useful step towards effectiveness while being mindful of your own time. I hope that you find this calculator and analysis helpful when considering your own course assessments.