Reliability and Validity

For research data to be of value, they must be both reliable and valid.


Reliability refers to the repeatability of findings. If the study were to be done a second time, would it yield the same results? If so, the data are reliable. If more than one person is observing behavior or some event, all observers should agree on what is being recorded in order to claim that the data are reliable.
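
Agreement between observers can be put into numbers. The sketch below is one possible illustration, not a prescribed procedure: it computes simple percent agreement and Cohen's kappa (a standard statistic that corrects agreement for chance) for two observers coding the same events. The function names, the category labels "play" and "rest", and the ratings themselves are hypothetical and were made up for this example.

```python
from collections import Counter


def percent_agreement(ratings_a, ratings_b):
    """Fraction of events on which the two observers recorded the same code."""
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("need two non-empty rating lists of equal length")
    matches = sum(a == b for a, b in zip(ratings_a, ratings_b))
    return matches / len(ratings_a)


def cohens_kappa(ratings_a, ratings_b):
    """Agreement corrected for the agreement expected by chance alone."""
    n = len(ratings_a)
    observed = percent_agreement(ratings_a, ratings_b)
    counts_a = Counter(ratings_a)
    counts_b = Counter(ratings_b)
    # Chance agreement: probability that both observers pick the same
    # category if each simply followed their own category frequencies.
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)
    if expected == 1:
        raise ValueError("kappa is undefined when chance agreement is 1")
    return (observed - expected) / (1 - expected)


# Hypothetical codes from two observers watching the same eight events.
observer_1 = ["play", "play", "rest", "play", "rest", "rest", "play", "rest"]
observer_2 = ["play", "play", "rest", "rest", "rest", "rest", "play", "rest"]

agreement = percent_agreement(observer_1, observer_2)
kappa = cohens_kappa(observer_1, observer_2)
print(f"percent agreement: {agreement:.3f}, kappa: {kappa:.3f}")
```

Values near 1 indicate that the observers agree on what is being recorded. How high the value must be before the data are called reliable is a judgment call that this sketch does not settle.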

Reliability also applies to individual measures. When people take a vocabulary test two times, their scores on the two occasions should be very similar. If so, the test can then be described as reliable. To be reliable, an inventory measuring self-esteem should give the same result if given twice to the same person within a short period of time. IQ tests should not give different results over time (as intelligence is assumed to be a stable characteristic).
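
One common way to check whether scores on two occasions are "very similar" is to correlate them (test-retest reliability). The following is a minimal sketch under stated assumptions: the five pairs of vocabulary scores are invented for illustration, and the helper name pearson_r is hypothetical rather than taken from any particular library.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length lists with at least 2 scores")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Sum of cross-products and sums of squares of deviations from the means.
    cross = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    if ss_x == 0 or ss_y == 0:
        raise ValueError("correlation is undefined when scores do not vary")
    return cross / sqrt(ss_x * ss_y)


# Hypothetical vocabulary scores for five people tested on two occasions.
first_session = [12, 18, 25, 30, 22]
second_session = [13, 17, 26, 29, 23]

test_retest_r = pearson_r(first_session, second_session)
print(f"test-retest correlation: {test_retest_r:.2f}")
```

A correlation close to 1 means people keep roughly the same standing from one occasion to the next, which is what a reliable test should show. A low correlation would suggest the scores are too unstable to be useful.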


Validity refers to the credibility or believability of the research. Are the findings genuine? Is hand strength a valid measure of intelligence? Almost certainly the answer is "No, it is not." Is an SAT score a valid predictor of GPA during the first year of college? The answer depends on the amount of research support for such a relationship.

There are two aspects of validity:

Internal validity - the instruments or procedures used in the research measured what they were supposed to measure. Example: As part of a stress experiment, people are shown photos of war atrocities. After the study, they are asked how the pictures made them feel, and they respond that the pictures were very upsetting. In this study, the photos have good internal validity as stress producers.

External validity - the results can be generalized beyond the immediate study. In order to have external validity, the claim that spaced study (studying in several sessions ahead of time) is better than cramming for exams should apply to more than one subject (e.g., to math as well as history). It should also apply to people beyond the sample in the study.

Different methods vary with regard to these two aspects of validity. Experiments, because they tend to be structured and controlled, are often high on internal validity. However, this same structure and control may result in low external validity: the results may be so tied to the experimental setting that they cannot be generalized to other situations. In contrast, observational research may have high external validity (generalizability) because it has taken place in the real world. However, the presence of so many uncontrolled variables may lead to low internal validity, in that we can't be sure which variables are affecting the observed behaviors.

Relationship between reliability and validity

If data are valid, they must be reliable. If people receive very different scores on a test every time they take it, the test is not likely to predict anything. However, if a test is reliable, that does not mean that it is valid. For example, we can measure strength of grip very reliably, but that does not make it a valid measure of intelligence or even of mechanical ability. Reliability is a necessary, but not sufficient, condition for validity.

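To summarize, reliability concerns whether findings and measures are repeatable, and validity concerns whether they are genuine and can be generalized. Reliable data are not necessarily valid, but valid data must be reliable.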