Chapter 1: Foundations

This chapter provides a foundational description of key elements (terms) relevant to educational and psychological testing. This material represents core concepts that are important across all types and domains of psychological testing. If you perform psychological testing, whether in a research or practice setting, you must understand these concepts to function competently.

The selection and ordering of material are based on two premises. First, measurement in any scientific domain can be evaluated by its power, that is, its ability to detect and observe phenomena of interest. Second, the power of a scientific field’s measurement methods largely depends upon the history and maturity of that field. At the start of a scientific area, theory, research, and measurement methods all exist in a primitive state. As sciences mature, improvements in measurement methodologies should lead to advances in theory and research.

The history of astronomical observation offers examples of this progression. Figure 1 shows Saturn as drawn by 17th-century observers, based on what they saw through primitive telescopes. Some astronomers interpreted what they saw as evidence that Saturn had ears. Figure 2 displays a photo taken in 2004 by the Cassini spacecraft. This image provides considerably more information about Saturn’s rings, including their shape and number.

Figure 1

Early Drawings of Saturn

Early drawings of Saturn and its rings. White circles and elliptical figures on a black background.

Note. Early drawings of Saturn and its rings, around 1610-1644. Observers using early telescopes could barely detect Saturn’s rings. Source is Alexander (1962), cited in Fletcher (2013).

Figure 2

Cassini 2004 Photo

Note. Cassini space probe 2004 photo of Saturn’s rings. Source is Jet Propulsion Laboratory, National Aeronautics and Space Administration. For more photos, go to https://www.nasa.gov/mission_pages/cassini/images/index.html.

Early astronomers probably saw ears on Saturn for several reasons. The most obvious reason is likely the primitive state of the optics; the telescopes were better than the naked eye, but could not yet produce images with enough resolution to show the rings clearly. Astronomers who saw these ambiguous shapes naturally interpreted them in terms of more familiar images, such as ears on a head. Contemporary educational and psychological measures are more advanced than these primitive images, perhaps comparable to the image in Figure 3. That is, most current tests can detect phenomena of interest, but can reliably classify them into only a few quantitative categories (such as high, medium, and low).

Figure 3

Hooker 1943 Telescope Photo

Note. Saturn and rings photographed in 1943 by ground-based Hooker telescope, Mount Wilson Observatory. Copyright 1962 by Carnegie Observatories, reprinted by permission.

Efforts to observe educational and psychological phenomena systematically have occurred for little more than a century. Consequently, measurement technology in this field remains relatively underdeveloped, at least compared to more mature sciences. This state is complicated by the fact that commercial interests, rather than nonprofit considerations, have frequently driven the creation of educational and psychological tests. One result has been an emphasis on selection tests that measure psychological traits relevant to performance in academic and vocational settings. A developer constructing a test to select student applicants for college admission, for example, would identify items and scales whose scores provide high estimates of reliability (e.g., temporal stability, to demonstrate that measured traits endure over time, and internal consistency) and validity (e.g., the ability to predict future outcomes such as GPA and college graduation rates). These selection criteria have been adopted as the central paradigm for evaluating tests, although other criteria can be important for creating tests useful for other purposes.
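
To make the reliability and validity estimates mentioned above more concrete, the following is a minimal sketch of how they might be computed. It rests on several assumptions that do not come from this chapter: the data are entirely hypothetical (six applicants, four items), internal consistency is estimated with Cronbach's alpha, and both temporal stability and predictive validity are estimated with a Pearson correlation. Other indices could reasonably be substituted for each.

```python
from math import sqrt
from statistics import mean, variance


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def cronbach_alpha(item_scores):
    """Cronbach's alpha; item_scores has one row per person, one column per item."""
    k = len(item_scores[0])                       # number of items
    columns = list(zip(*item_scores))             # scores grouped by item
    sum_item_var = sum(variance(col) for col in columns)
    total_var = variance([sum(row) for row in item_scores])
    return k / (k - 1) * (1 - sum_item_var / total_var)


# Hypothetical data (illustration only): item responses at the first testing,
# total scores at a later retest, and first-year GPA for the same six people.
time1_items = [
    [4, 5, 4, 5],
    [2, 3, 2, 2],
    [3, 3, 4, 3],
    [5, 4, 5, 5],
    [1, 2, 1, 2],
    [3, 4, 3, 3],
]
time1_totals = [sum(row) for row in time1_items]
time2_totals = [17, 10, 12, 19, 7, 14]
gpa = [3.6, 2.7, 3.0, 3.8, 2.4, 3.1]

alpha = cronbach_alpha(time1_items)               # internal consistency
stability = pearson_r(time1_totals, time2_totals)  # temporal stability (test-retest)
validity = pearson_r(time1_totals, gpa)           # predictive validity

print(f"alpha={alpha:.2f}, test-retest r={stability:.2f}, validity r={validity:.2f}")
```

In this sketch, a developer working within the selection paradigm would favor the items and scales that push all three values upward. The made-up data were chosen to produce high values; real test data rarely behave so cleanly.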