Standardized tests: Test reliability

As with other research procedures and tools, reliability and validity are major considerations when using standardized tests and inventories.

Test reliability

The reliability of a test refers to the stability of its measurements over time. If a person's data entry skills are measured on two occasions (with no special training in between), the two sets of scores should be similar. Reliability is often expressed as a reliability coefficient, which is simply the correlation between the scores of people who have taken the test on two occasions (X = score the first time, Y = score the second time) - see the correlation module for review.
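The reliability coefficient can be sketched in a few lines of Python. The scores below are invented for illustration, and the Pearson correlation is computed directly from its definition:

```python
# Hypothetical data entry scores for five people, tested on two occasions
# with no special training in between.
first = [52, 61, 70, 48, 66]   # X: scores the first time
second = [55, 59, 72, 50, 64]  # Y: scores the second time

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length score lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5

r = pearson_r(first, second)
print(round(r, 3))
```

Here the two testings produce a high positive coefficient, which is what a reliable test should show; scores near +1.0 indicate that people kept roughly the same rank order across the two occasions.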

There are three ways to measure the reliability of a test or inventory: Test-retest, Split-half, and Alternate forms.

Test-retest - the same test is given to the same people on two occasions. The two sets of scores are correlated; if the reliability coefficient is high and positive, the test is reliable.
Split-half - after a sample has taken the test, the items are divided into two halves (e.g., the odd-numbered versus the even-numbered items), and each person's scores on the two halves are correlated. If the test is reliable, the two halves should show a high positive reliability coefficient (correlation).
Alternate forms - two versions of the test are constructed and given to the same people on two occasions. Scores on the two forms should show a high positive reliability coefficient (correlation).


It is also possible to build a reliability check into an inventory. For example, the Edwards Personal Preference Schedule (a personality inventory) has 210 pairs of items, plus 15 that are repeated. Comparing the responses to the 15 repeated items reveals whether the person's responses are consistent (reliable).
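A built-in check of this kind amounts to comparing each repeated item's answer with the answer given the first time it appeared. The sketch below is loosely modeled on that idea; the item positions and responses are entirely invented, not taken from the actual inventory:

```python
# Maps a repeated item's position to the position of its first appearance
# (hypothetical positions, for illustration only).
repeated_pairs = {151: 3, 172: 18, 199: 40}

# One respondent's answers, keyed by item position (forced choice "A" or "B").
responses = {3: "A", 18: "B", 40: "A", 151: "A", 172: "A", 199: "A"}

# Count how many repeated items were answered the same way both times.
matches = sum(responses[rep] == responses[orig]
              for rep, orig in repeated_pairs.items())
print(f"{matches} of {len(repeated_pairs)} repeated items answered consistently")
```

A respondent who answers most repeated items the same way both times is responding consistently; many mismatches suggest the results should be interpreted with caution.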

Next section: Test validity