Adaptive testing, or computerized adaptive testing (CAT), is slowly making its way into educational systems all over the world. A CAT is a computer-based test that estimates the examinee’s level of knowledge in the topic being tested from their answers, and adjusts the difficulty of the exam accordingly. CAT has both advantages and disadvantages. This short blog gives a brief overview of the system: how it works, what its benefits are, and what overheads and problems come with it.
Computerized adaptive testing is a system made up of multiple modules. Together they make the whole system capable of estimating the examinee’s knowledge level, selecting a suitable question from a calibrated pool of questions, and determining when to end the exam. With that in mind, the following diagram should make sense:
The main parts of the system are:
• A calibrated pool of questions, which includes many questions suitable for every examinee level. Question parameters and their calibration are a topic for another blog.
• A selection algorithm that selects the most suitable question from a pool of questions based on the examinee’s previous answers and their current level of knowledge.
• An estimation algorithm that determines the level of the examinee given their previous answers and the question level along with other question parameters.
• A stopper module that decides whether to continue and present another question or to end the exam.
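As a rough illustration of the pool, the selector, and the stopper, here is a minimal Python sketch. It assumes a one-parameter logistic (Rasch) model, which this blog does not prescribe, and every name in it (`Question`, `p_correct`, `information`, `select_question`, `should_stop`) and both default thresholds are hypothetical. The estimator is left out of this sketch.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Question:
    qid: str
    difficulty: float  # calibrated difficulty, on the same scale as ability


def p_correct(ability: float, q: Question) -> float:
    """Assumed Rasch model: probability of a correct answer to q."""
    return 1.0 / (1.0 + math.exp(-(ability - q.difficulty)))


def information(ability: float, q: Question) -> float:
    """Fisher information of q at this ability under the Rasch model.

    It peaks when the question's difficulty equals the ability.
    """
    p = p_correct(ability, q)
    return p * (1.0 - p)


def select_question(ability: float, pool: list, asked: set) -> Question:
    """Selector: the most informative question that has not been asked yet.

    Raises ValueError if every question in the pool was already asked.
    """
    candidates = [q for q in pool if q.qid not in asked]
    return max(candidates, key=lambda q: information(ability, q))


def should_stop(n_asked: int, standard_error: float,
                max_questions: int = 20, target_se: float = 0.4) -> bool:
    """Stopper: end when the estimate is precise enough or the exam is long enough.

    Both thresholds are illustrative defaults, not recommended values.
    """
    return n_asked >= max_questions or standard_error <= target_se
```

Under this assumed model, the selector simply prefers the question whose difficulty is closest to the current ability estimate, which is one way to read "most suitable question".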
Those together typically work in the following scenario:
An examiner builds the question bank and sets the required parameters. One of them is the initial level of knowledge, which is the level every examinee starts with. Next, the selector picks some starting questions so that the estimation algorithm can get to know the examinee. The examinee receives the questions and answers them, and the answers are recorded. The estimator then takes the parameters of the questions given so far, along with the examinee’s answers, and estimates the examinee’s level of knowledge. The stopper then decides whether the exam should continue or end. If the exam continues, the estimated level is passed back to the selector, which selects another question, and the cycle repeats. If the exam ends, the examinee’s score is calculated.
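The cycle described above can be sketched in a few lines of Python. This is only an illustration under assumptions the blog does not make: a Rasch (one-parameter logistic) response model, a grid-search maximum-likelihood estimator, a stopper based on the standard error of the estimate, and one starting question instead of several. The names `p_correct`, `estimate_ability`, and `run_cat`, and all default values, are hypothetical.

```python
import math
import random


def p_correct(ability, difficulty):
    """Assumed Rasch model: probability of answering correctly."""
    return 1.0 / (1.0 + math.exp(-(ability - difficulty)))


def estimate_ability(difficulties, answers, lo=-4.0, hi=4.0, steps=161):
    """Estimator: grid-search maximum likelihood, plus a standard error."""
    grid = [lo + i * (hi - lo) / (steps - 1) for i in range(steps)]

    def log_likelihood(theta):
        total = 0.0
        for b, correct in zip(difficulties, answers):
            p = p_correct(theta, b)
            total += math.log(p if correct else 1.0 - p)
        return total

    theta = max(grid, key=log_likelihood)
    # Test information at the estimate; its inverse square root is the SE.
    info = sum(p_correct(theta, b) * (1.0 - p_correct(theta, b))
               for b in difficulties)
    return theta, 1.0 / math.sqrt(info)


def run_cat(pool, answer_fn, initial_ability=0.0, max_questions=20, target_se=0.5):
    """pool maps question id -> calibrated difficulty; answer_fn(qid) -> bool."""
    ability, se = initial_ability, float("inf")
    asked, answers = [], []
    # Stopper: continue while questions remain, the exam is short enough,
    # and the estimate is still too uncertain.
    while len(asked) < min(max_questions, len(pool)) and se > target_se:
        remaining = [q for q in pool if q not in asked]
        # Selector: the question whose difficulty is closest to the estimate.
        qid = min(remaining, key=lambda q: abs(pool[q] - ability))
        asked.append(qid)
        answers.append(answer_fn(qid))  # the examinee's answer is recorded
        ability, se = estimate_ability([pool[q] for q in asked], answers)
    return ability, se, asked


# Usage with made-up data: a simulated examinee whose true ability is 1.0.
rng = random.Random(0)
pool = {f"q{i}": -3.0 + 0.25 * i for i in range(25)}
ability, se, asked = run_cat(pool, lambda qid: rng.random() < p_correct(1.0, pool[qid]))
print(f"estimate={ability:.2f}, se={se:.2f}, questions={len(asked)}")
```

The estimate is bounded by the grid, so an examinee who answers everything correctly ends at the top of the scale rather than at infinity. A real system would likely use a more careful estimator and a richer question model.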
Seeing all that, it seems like a complicated system with a lot of parameter tuning and overhead. So what benefits does this system offer? And what are the problems?
Starting with the advantages, adaptive tests are usually fairer for everyone, as the questions are selected to match the level of the examinee. This cannot be done in regular tests, which are designed to target the average examinee: a higher-level examinee can pass easily, while a lower-level examinee may struggle. Another advantage is that adaptive tests are usually shorter, as questions are directed towards the examinee’s level and the exam stops once it reaches a certain level of certainty about that level. In contrast, regular exams give the same questions to all examinees with the same time allowance. This last point leads to one more important advantage: adaptive tests reduce the exposure rate of each question. Each exam is therefore effectively unique, which makes it more secure against cheating.
Suppose someone tries to game the system by cheating. The questions get harder and the time allowance gets shorter (as a result of the increasing question difficulty), so it becomes far less likely that this examinee can keep up the same rate of correct answers by cheating alone. Add to that the uniqueness of each exam, and it becomes almost impossible for any examinee to reach a level higher than their real one.
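To make "exposure rate" concrete, here is a small Python sketch that computes, for each question, the fraction of administered exams in which it appeared. The function name `exposure_rates` and the sample data are hypothetical, and the function assumes at least one exam has been administered.

```python
from collections import Counter


def exposure_rates(exams, pool_ids):
    """Fraction of exams in which each question appeared.

    exams: non-empty list of exams, each a list of question ids.
    pool_ids: ids of every question in the pool.
    """
    # Count each question at most once per exam.
    counts = Counter(qid for exam in exams for qid in set(exam))
    return {qid: counts[qid] / len(exams) for qid in pool_ids}


# Made-up data: a fixed exam shows the same questions to everyone,
# while adaptive exams draw different questions from the pool.
pool_ids = ["a", "b", "c", "d"]
fixed_exams = [["a", "b"], ["a", "b"], ["a", "b"]]
adaptive_exams = [["a", "b"], ["b", "c"], ["c", "d"]]
print(exposure_rates(fixed_exams, pool_ids))
print(exposure_rates(adaptive_exams, pool_ids))
```

In the fixed case every question that is used is seen by every examinee, whereas in the adaptive case the same number of exams spreads exposure across more of the pool.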
On the other hand, adaptive tests have their downsides too. As with any sophisticated system, the number of parameters that have to be tuned is a real challenge. Building a good question bank with calibrated parameters is one of the hardest tasks in this system. Also, if the system is not properly designed and calibrated, some people manage to trick it into giving higher scores. One such method is as follows: the examinee starts by deliberately selecting wrong answers, which leads to easier questions, and then answers those correctly. This can result in a much higher score than trying to answer every question correctly from the beginning, which may lead to harder questions and thus a lower score.
To summarize, this blog has discussed what a CAT is, what components it is made of, how they work together, and what the pros and cons of using this system are. It may be a little hard for some to shift their mindset to this approach, but it gives very fruitful results when it is properly designed.