When interpreting a test score, it helps to remember first that it is just a test score. In fact, a TABE® score is best read as a range within which a student’s true score might fall. A scale score is only a description of an examinee’s performance on a particular test at a particular point in time. It therefore makes sense to be cautious about placing too much emphasis on any one test score. Prior test scores should be considered in concert with all other available information about the examinee in order to come to the best understanding of that person’s true ability. For this reason, the Standard Error of Measurement (SEM) should be noted and considered for every achievement test performance.
Every norm-referenced score carries some measure of statistical error, and the SEM represents the estimated error associated with each possible scale score. The magnitude of the standard error varies by test and along the scale of each test. Scores near the top and bottom of each test’s scale have a higher estimated statistical error. The smaller the SEM, the more accurate the score; that is, the closer the performance description is to the examinee’s true abilities. Therefore, to obtain “optimal information” from an examination, the test-taker should score in the midrange (or “valid and reliable range”) by answering 40-75 percent of the items correctly. This range roughly corresponds to the trough in the U-shaped standard error curve below.
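The 40-75 percent guideline can be checked with simple arithmetic. The following is a minimal sketch in Python, not part of any TABE® scoring software; the function names and the 50-item test used in the example are hypothetical, and the range is treated as inclusive at both ends, which is an assumption.

```python
def percent_correct(number_correct, total_items):
    """Return the percentage of items answered correctly."""
    if total_items <= 0:
        raise ValueError("total_items must be positive")
    if not 0 <= number_correct <= total_items:
        raise ValueError("number_correct must be between 0 and total_items")
    return 100.0 * number_correct / total_items


def in_optimal_range(number_correct, total_items, low=40.0, high=75.0):
    """True when percent correct falls in the 40-75 percent midrange.

    Treating both endpoints as inside the range is an assumption.
    """
    return low <= percent_correct(number_correct, total_items) <= high


# Hypothetical 50-item test: 30 correct is 60 percent, inside the midrange;
# 45 correct is 90 percent, above it.
print(in_optimal_range(30, 50))
print(in_optimal_range(45, 50))
```

A result outside the midrange does not by itself invalidate a score; it only signals that the score probably sits where the standard error is larger.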
In the graphic below, the most accurate scores that could be associated with an assessment using Complete Battery Language Level D fall along the trough of the curve. These scores are associated with an SEM of around 20 (see the TABE® 9&10 Technical Report or the TABE 9&10 Norms Book).
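To illustrate reading a score as a range, the sketch below computes the band of one SEM on either side of a scale score. The function name and the scale score of 560 are hypothetical; the SEM of 20 is the approximate value quoted above for the trough of the curve. Under the usual assumption of normally distributed measurement error, a band of plus or minus one SEM would be expected to contain the true score roughly two times out of three.

```python
def score_band(scale_score, sem, n_sems=1):
    """Return (low, high) bounds of scale_score +/- n_sems * sem."""
    if sem < 0:
        raise ValueError("sem must not be negative")
    margin = n_sems * sem
    return scale_score - margin, scale_score + margin


# Hypothetical scale score of 560 with an SEM of about 20.
low, high = score_band(560, 20)
print(f"True score plausibly between {low} and {high}")
```

The actual SEM for a given scale score should be looked up in the publisher’s tables rather than assumed, since it grows toward both ends of the scale.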
Administrators and instructors should not hesitate to discard a suspect test score. A score may be suspect because it falls outside the “valid and reliable range,” because it is associated with a high SEM, or because of what was observed of the student during the test. A score is also suspect when it varies significantly from previous results, from other indications of student progress in the classroom, and/or from the teacher’s expectations. Scores in the chance range are those that could be obtained by guessing, i.e., scores in the 25% number-correct range. These scores are also suspect, and could indicate that the test level is too high. An engaged test administrator may also observe guessing directly.
Certain behaviors, such as wandering eyes and finishing unusually quickly, can indicate an inaccurate assessment due to lack of effort by the examinee. It is in the best interest of administrators to discard such scores. Perfect scores are another example of an inaccurate description of student achievement, since they don’t account for all of the student’s abilities. A perfect score indicates that the examinee should be retested with a higher-level test. These scores should also be discarded or interpreted with caution.
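The two number-correct warning signs described above, chance-range scores and perfect scores, can be screened for mechanically. This is a rough sketch only: the function name, the wording of the flags, and the choice to treat anything at or below 25 percent correct as the chance range are assumptions made for illustration. Behavioral observations and comparisons with prior results still require human judgment.

```python
def screening_flags(number_correct, total_items, chance_pct=25.0):
    """Return a list of reasons a number-correct score may be suspect."""
    if total_items <= 0:
        raise ValueError("total_items must be positive")
    if not 0 <= number_correct <= total_items:
        raise ValueError("number_correct must be between 0 and total_items")

    flags = []
    pct = 100.0 * number_correct / total_items

    # Assumption: "chance range" means at or below 25 percent correct.
    if pct <= chance_pct:
        flags.append("chance range: test level may be too high")

    # A perfect score cannot show the upper limit of the student's ability.
    if number_correct == total_items:
        flags.append("perfect score: retest with a higher-level test")

    return flags


# Hypothetical 50-item test.
print(screening_flags(12, 50))
print(screening_flags(50, 50))
print(screening_flags(30, 50))
```

An empty list means only that neither of these two checks fired, not that the score is necessarily trustworthy.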
As we have seen, there are many different ways to describe an individual’s test performance, in addition to a raw number-correct score. In order to make use of these different descriptions, it is important to understand what is meant by them. A good understanding of how to interpret TABE® test scores provides an accurate view of a student’s abilities, which translates into an optimal instructional plan for that individual. Understanding how a student’s test performance describes a student’s abilities is therefore the first step toward creating success for each student.
Reference: Guide to Administering TABE 9&10, Copyright © 2004 CTB/McGraw-Hill LLC. TABE® materials are Copyright 2016 by CTB/DRC (Data Recognition Corporation).