Observational Units
In Forming Connections [1C], you learned that observational units are the individuals about whom we are asking a question, or the entities about which we want to measure some characteristic. Observational units can be humans, but they can also be any individual of interest. We may wish to collect data about humans, animals, or even non-living items like books, math tests, or U.S. states.
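One way to picture observational units is as the rows of a data table, with each measured characteristic in its own column. The sketch below is a minimal illustration in Python; the book titles and values are invented placeholders, not real data.

```python
# Hypothetical data: each dictionary is one row, and each row describes one
# observational unit (here, a book). Titles and values are made up.
books = [
    {"title": "Book A", "pages": 210, "genre": "mystery"},
    {"title": "Book B", "pages": 345, "genre": "biography"},
    {"title": "Book C", "pages": 128, "genre": "mystery"},
]

# The observational units are the books, so counting rows counts units.
number_of_units = len(books)

# The characteristics measured about each unit are the remaining columns;
# "title" only identifies the unit, so it is left out.
characteristics = [name for name in books[0] if name != "title"]

print(number_of_units)   # 3
print(characteristics)   # ['pages', 'genre']
```

Here the books are the observational units, while the page count and genre are characteristics recorded about each book.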
observational units
[Perspective video: a 3-instructor video offering a couple of examples of good statistical questions, each with a short list of possible observational units to choose from to help answer the question. Use a human example, a non-human living example, and a non-living example. Emphasize that in a situation like “Does active learning tend to increase student success?” the observational units would be “test scores” or “course grades” and not “students.” This can lead into a brief mention of ethics and the practice of analyzing anonymous or de-identified data.]
Note: this video could reference questions in a technical style similar to that of the questions in the example below.
See the example below before answering Question 2.
example
This example could be a shorter version of the pick-your-dataset examples we used in Module 2. For instance, one option could be 3 questions about a particular issue of social justice, another could be 3 questions about a particular issue of inclusion, and a third could be 3 questions about identity. Each option should follow the technical style of the 3 questions below: brief and simple.
Identify the observational units in the data used to answer each of the questions below.
- Which species of fish tends to contain more mercury?
- Are the observational units a) species of fish, b) mercury, or c) waterways?
- Which city has the longest commute time for workers per year?
- Are the observational units a) workers, b) commute times, or c) cities?
- What climate tends to attract more tourists?
- Are the observational units a) tourists, b) modes of transportation, or c) climates?
Now it’s your turn. Return to the question, “Which U.S. state has the worst drivers?” to answer Question 2.
Question 2
Suppose we wanted to try to answer the question, “Which U.S. state has the worst drivers?” Since we’re asking a question that begins “Which U.S. state…,” which of the following should be the observational units in the data we use to answer this question?
- Drivers
- Vehicles
- Car accidents
- U.S. states
Variables
A good statistical question anticipates variability in the data collected to answer it. That means the variables (the characteristics we measure about our observational units) are expected to take different values from one observational unit to another. Understanding which kinds of questions anticipate variability helps us understand what kinds of variables can be used to explore a statistical question. See the video below for a demonstration of how to identify variables in data, then answer the remaining questions.
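To make the idea of variability concrete, the following minimal sketch checks whether a variable takes more than one distinct value across the observational units. The test labels and numbers are invented placeholders, and `varies` is a hypothetical helper name used only for illustration.

```python
# Hypothetical data: one row per observational unit (a math test).
# The labels and values are invented for illustration only.
tests = [
    {"test": "Test A", "score": 72, "points_possible": 100},
    {"test": "Test B", "score": 88, "points_possible": 100},
    {"test": "Test C", "score": 95, "points_possible": 100},
]

def varies(rows, variable):
    """Return True if the variable takes more than one distinct value."""
    distinct_values = {row[variable] for row in rows}
    return len(distinct_values) > 1

print(varies(tests, "score"))            # True
print(varies(tests, "points_possible"))  # False
```

In this made-up table, a question about `score` anticipates variability because the values differ from test to test, while a question about `points_possible` has a single answer for every test.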
Qualities of good statistical questions
[Worked example: a 3-instructor video that follows the themes of the perspective video above but provides a worked example for Question 3 and Question 4 below.]
Hopefully you are feeling more confident about identifying questions that anticipate variability and about identifying the variables present in the data. Now it’s your turn to assess your understanding by answering the remaining questions.
Question 3
Which of the following questions anticipate variability in the data required to answer them? Select all that apply. There may be more than one correct answer.
- a) Which states have the most automobile accidents per year?
- b) Which states tend to have stricter cell phone laws for drivers?
- c) Does New York have a state-wide hands-free cell phone law?
- d) Which state has the fewest drivers on the road per day?
- e) How many speeding tickets were given in the United States in 2019?
- f) What time of day has the most traffic in Alabama?
The final question requires you to pull information from an article in which statistical data is used to answer a relevant and interesting question (one of the qualities of a good statistical question). Don’t skip the article when trying to answer the question!
Question 4
In the FiveThirtyEight article “Dear Mona, Which State Has The Worst Drivers?” the author, Mona Chalabi, attempts to answer the title question. Read the article and identify the variables that the author uses to explore this question.
https://fivethirtyeight.com/features/which-state-has-the-worst-drivers/
Which of the following variables does the author use to answer the question? Select all that apply. There may be more than one correct answer.
- a) Number of registered vehicles per state
- b) Number of drivers on the state’s roads per day
- c) Number of drivers involved in fatal collisions per billion miles traveled
- d) Number of fatalities due to automobile wrecks per year
- e) Percentage of drivers involved in fatal collisions who were not distracted
- f) Whether or not the state has a hands-free cell phone law
- g) Percentage of drivers involved in fatal collisions who were not involved in previous accidents
- h) Percentage of fatal collisions where a driver was speeding
- i) Number of speeding tickets given per year
- j) Percentage of fatal collisions that occurred on a road with a speed limit over 60 miles per hour
- k) Percentage of fatal collisions in which alcohol impairment was involved
- l) Average combined car insurance premium