Who Is Killed by Police? (Chi-squared Goodness of Fit)
- Due May 10, 2024 by 11:59pm
- Points 5
- Available until May 17, 2024 at 11:59pm
Race and Ethnicity: Who Is Killed by Police?
Several websites (such as the Washington Post, KilledbyPolice.net, and FatalEncounters.org) contain data on fatal shootings by police officers in the United States. Over 30,000 such deaths have been recorded.
When we study the data, we don't use their names. But let's remember that every person on these lists was an individual with family, friends, and a personal story.
1. Go to KilledbyPolice.net and scan the list of names. Are any of these familiar to you? Choose one that is NOT familiar, and write down (or tell the others in your group) what is reported about this person. NOTE: SKIP THIS QUESTION. THE WEBSITE KILLEDBYPOLICE.NET WAS TAKEN OVER BY A GUN SALES COMPANY.
A Random Sample from a Population
In this assignment, we will look at a random sample of deaths taken from a database of 8,263 deaths recorded from January 2013 to June 2020.
A random-integer generator from Random.org was used to generate three lists of 100 integers each, drawn from the 8,263 numbered records. (The records are actually numbered from 2 to 8,264.)
The records selected using the first list were classified by race (White non-Hispanic, Black, Hispanic, Asian/Pacific Islander/Native American, Unknown), age, and gender. The race/gender results are summarized in the following table:
Race/Ethnicity | Male | Female | Total |
---|---|---|---|
White (non-Hispanic) | 40 | 5 | 45 |
Black | 30 | 1 | 31 |
Hispanic | 13 | 0 | 13 |
Asian or Pacific Islander or Native American | 2 | 0 | 2 |
Listed as "Unknown" | 9 | 0 | 9 |
2. What do you notice about this table? What questions do you have about these results?
One thing we might want to know is how the distribution of deaths at the hands of police would compare to the proportions of each group in the total US population. Remember, we are interested in the population of deaths by police, and the table only records a random sample. So we might also want to know whether the random sample is a "good" representation of the population of deaths.
Comparing the Sample to the Population
First, let's compare the sample (n = 100) to the population (N=8263):
Race/Ethnicity | Population (all deaths recorded in table) | Proportion | Expected out of 100 | Actual in sample |
---|---|---|---|---|
White-non-Hispanic | 3615 | 0.437 | 43.7 | 45 |
Black | 2075 | 0.251 | 25.1 | 31 |
Hispanic | 1418 | 0.172 | 17.2 | 13 |
Unknown or Asian or Pacific Islander or Native American* | 1155 | 0.140 | 14.0 | 11 |
TOTAL | 8263 | 1.000 | 100 | 100 |
*Note: Asians, Pacific Islanders, and Native Americans are grouped together with Unknown because the chi-squared test requires every expected value to be at least 5. If we separated these groups, the expected values would not all meet that requirement.
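The "expected out of 100" column can be reproduced with a few lines of Python. This is only a sketch: the variable names (`population_counts`, `sample_size`, `expected`) are made up for illustration, and the counts are copied from the table above.

```python
# Population counts by category, copied from the table above.
population_counts = {
    "White-non-Hispanic": 3615,
    "Black": 2075,
    "Hispanic": 1418,
    "Unknown/Asian/Pacific Islander/Native American": 1155,
}
sample_size = 100  # n, the size of the random sample

total = sum(population_counts.values())  # N, the number of records

# Expected count = (category's share of the population) x (sample size).
expected = {}
for group, count in population_counts.items():
    proportion = count / total
    expected[group] = proportion * sample_size
    print(f"{group}: proportion = {proportion:.3f}, expected = {expected[group]:.1f}")

# The chi-squared test needs every expected count to be at least 5.
all_at_least_5 = all(value >= 5 for value in expected.values())
print("All expected counts at least 5:", all_at_least_5)
```

The same loop works for any other set of category counts, so you can use it to check whether a different grouping would still keep every expected value at 5 or more.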
Notice that the "actual" distribution in the sample is not equal to the "expected" distribution. This should not be surprising, as every random sample (n=100) taken from the population will have a different distribution. Our question here is really: What is the probability of randomly selecting a sample this different from the population? If it is "very low" then we would wonder whether we were just unlucky, or whether there might have been something wrong with the sampling method.
Null hypothesis: Sample data fits the given (population) distribution
Alternative hypothesis: Data does not fit the distribution
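We can use the usual alpha = .05. The distribution is the Chi-Squared distribution. The test statistic Chi-squared is calculated as

$$\chi^2 = \sum \frac{(O - E)^2}{E}$$

where O is the observed count and E is the expected count in each category. The degrees of freedom = k - 1, where k is the number of categories (the rows in the table, not counting the TOTAL row).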
3. Using your calculator, calculate the value of Chi-Squared. Review the videos if necessary. If you have a "ChiSq GoF" test function, use it.
Race/Ethnicity | Population (all deaths recorded in table) | Proportion | EXPECTED out of a sample of 100 | OBSERVED in actual sample | (O - E)^2 / E |
---|---|---|---|---|---|
White-non-Hispanic | 3615 | 0.437 | 43.7 | 45 | 0.039 |
Black | 2075 | 0.251 | 25.1 | 31 | 1.387 |
Hispanic | 1418 | 0.172 | 17.2 | 13 | 1.026 |
Unknown or Asian or Pacific Islander or Native American* | 1155 | 0.140 | 14 | 11 | 0.643 |
TOTAL | 8263 | 1.000 | 100 | 100 | 3.095 |
So we have chi-squared = 3.095 with d.f. = 4 - 1 = 3. A chi-squared goodness-of-fit test is always a right-tailed test. From a table we find that the critical value for a .05 level of significance with 3 degrees of freedom is 7.81. Because 3.095 is less than 7.81, the test statistic is NOT in the critical region, so we do NOT reject the null hypothesis. There is no reason to doubt that the sample fits the population distribution. If you used a TI-84 or any other automated ChiSq GOF test, it would automatically generate a p-value (about 0.38 here, which is much more than alpha).
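If you would rather check the arithmetic on a computer than on a calculator, the sketch below repeats the worked example in plain Python. The function names `chi_squared_statistic` and `p_value_df3` are invented for this illustration. The p-value formula is a closed form that holds only for exactly 3 degrees of freedom, so it is not a general-purpose routine. The critical value 7.81 is taken from the text rather than computed.

```python
import math

# Observed and expected counts from the worked table above.
observed = [45, 31, 13, 11]
expected = [43.7, 25.1, 17.2, 14.0]


def chi_squared_statistic(observed, expected):
    """Sum of (O - E)^2 / E over all categories."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


def p_value_df3(chi_sq):
    """Right-tail area under the chi-squared curve.

    Assumption: this closed form is valid ONLY for 3 degrees of freedom.
    """
    return (math.erfc(math.sqrt(chi_sq / 2))
            + math.sqrt(2 * chi_sq / math.pi) * math.exp(-chi_sq / 2))


chi_sq = chi_squared_statistic(observed, expected)
df = len(observed) - 1   # number of categories minus 1
critical_value = 7.81    # from a chi-squared table: alpha = .05, d.f. = 3
p_value = p_value_df3(chi_sq)

# Right-tailed test: reject only if the statistic exceeds the critical value.
reject_null = chi_sq > critical_value

print(f"chi-squared = {chi_sq:.3f}, d.f. = {df}, p-value = {p_value:.3f}")
print("Reject the null hypothesis?", reject_null)
```

The unrounded total may differ from 3.095 in the third decimal place, because the table adds up terms that were already rounded. Either way, the statistic is far below 7.81 and the conclusion does not change.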
Comparing the sample to the race/ethnicity distribution of the US
Race/Ethnicity | US POPULATION (2016 data), % | EXPECTED out of a sample of 100 | OBSERVED in actual sample | (O - E)^2 / E |
---|---|---|---|---|
White-non-Hispanic | 61.0% | 61 | 45 | |
Black | 13.4% | 13.4 | 31 | |
Hispanic | 18.0% | 18 | 13 | |
Unknown or Asian or Pacific Islander or Native American* | 7.6% | 7.6 | 11 | |
TOTAL | 100% | 100 | 100 | |
4. Conduct a Chi-Square Goodness-of-Fit test on the data. Include the null and alternative hypotheses. You may use either the critical-region method or (if your calculator automatically provides a p-value) the p-value method. What do you conclude?
Post comments, answers to questions, etc. on this assignment in the Who Is Killed by Police? (Chi-squared Goodness of Fit) Discussion.