Chapter 2: Psychological Research
Reliability and validity
Reliability and validity are two important considerations in any type of data collection. Reliability refers to the ability to consistently produce a given result. In the context of psychological research, this means that any instruments or tools used to collect data do so in consistent, reproducible ways.
Unfortunately, being consistent in measurement does not necessarily mean that you have measured something correctly. To illustrate this concept, consider a kitchen scale used to weigh the cereal that you eat in the morning. If the scale is not properly calibrated, it may consistently under- or overestimate the amount of cereal being measured. While the scale is highly reliable in producing consistent results (e.g., the same amount of cereal poured onto the scale produces the same reading each time), those results are incorrect. This is where validity comes into play. Validity refers to the extent to which a given instrument or tool accurately measures what it’s supposed to measure. While any valid measure is by necessity reliable, the reverse is not necessarily true. Researchers strive to use instruments that are both highly reliable and valid.
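The kitchen scale example can be made concrete with a short simulation. The sketch below is illustrative only: the function name `simulate_scale`, the 40-gram serving, the 5-gram calibration error, and the small amount of random noise are all assumed values chosen for the example, not measurements from any study. It shows a scale whose readings cluster tightly together (high reliability) while sitting well away from the true weight (low validity).

```python
import random
import statistics


def simulate_scale(true_weight_g, bias_g, noise_sd_g, n_readings, seed=0):
    """Return simulated readings from a hypothetical kitchen scale.

    bias_g     -- constant calibration error added to every reading
    noise_sd_g -- standard deviation of random reading-to-reading noise
    """
    rng = random.Random(seed)  # fixed seed so the example is reproducible
    return [
        true_weight_g + bias_g + rng.gauss(0, noise_sd_g)
        for _ in range(n_readings)
    ]


# Assumed values for illustration: a 40 g serving weighed 20 times on a
# scale that reads about 5 g too high but is very consistent.
TRUE_WEIGHT_G = 40.0
readings = simulate_scale(TRUE_WEIGHT_G, bias_g=5.0, noise_sd_g=0.1, n_readings=20)

# Reliability: how tightly the repeated readings agree with one another.
spread = statistics.stdev(readings)

# Validity: how far the average reading is from the true weight.
error = statistics.mean(readings) - TRUE_WEIGHT_G

print(f"Spread of readings (g): {spread:.2f}")
print(f"Average error (g): {error:.2f}")
```

In this sketch the spread stays small while the average error stays close to the assumed 5-gram bias, which mirrors the point above: consistent readings alone do not show that the scale is measuring correctly.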
Video 2.3. Introduction to Reliability and Validity.
Everyday Connection: How Valid Is the SAT?
Standardized tests like the SAT are supposed to measure an individual’s aptitude for a college education, but how reliable and valid are such tests? Research conducted by the College Board suggests that scores on the SAT have high predictive validity for first-year college students’ GPA (Kobrin et al., 2008). In this context, predictive validity refers to the test’s ability to effectively predict the GPA of college freshmen. More recent studies show a positive correlation between SAT scores and first-year GPA, specifically when high school GPA is also taken into account (Marini et al., 2019). However, the SAT is most predictive of students’ performance in courses such as English and math, the areas that the SAT itself measures (Westrick et al., 2020). Additionally, the SAT is differentially predictive for different groups of students, with higher correlations for females than for males, for Asian and White students, and for students with higher education levels (Marini et al., 2019).
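Predictive validity of this kind is typically summarized with a correlation coefficient between test scores and the later outcome. The sketch below computes a Pearson correlation by hand for a small set of made-up numbers. The scores and GPAs are hypothetical values invented for illustration and are not data from the College Board or from any of the studies cited above; the helper name `pearson_r` is likewise an assumption of this example.

```python
import math


def pearson_r(xs, ys):
    """Return the Pearson correlation coefficient between two equal-length lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of equal length with at least 2 values")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of cross-products of deviations from each mean.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Sums of squared deviations for each variable.
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(ss_x * ss_y)


# Hypothetical data for six students (invented for this example only).
sat_scores = [1000, 1100, 1200, 1300, 1400, 1500]
first_year_gpa = [2.6, 3.0, 2.9, 3.3, 3.4, 3.8]

r = pearson_r(sat_scores, first_year_gpa)
print(f"Correlation between test score and first-year GPA: {r:.2f}")
```

A value of r near +1 would indicate that higher test scores go along with higher first-year GPAs, a value near 0 would indicate little relationship, and a negative value would indicate the opposite pattern. A correlation describes how well scores predict the outcome across a group; it does not show that the test causes the outcome.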
Check Your Understanding
Reliability: the ability to consistently produce a given result
Validity: the extent to which a given instrument or tool accurately measures what it’s supposed to measure