Trial by Jury                            Statistical Hypothesis Testing
---------------------------------------  ---------------------------------------
Prosecutor                               Statistician
Trial                                    Collection of data
Jury decides on the verdict              Statistical test
Assume defendant is innocent             Assume the null hypothesis is true
Weigh the evidence provided by           Assess the evidence provided by the
  testimony and exhibits, assuming         data (as summarized in the test
  defendant is innocent                    statistic), assuming null hypothesis
                                           is true
Evidence against the defendant,          Calculate a p-value for the test
  assuming defendant is innocent           statistic, assuming null hypothesis
                                           is true
Defendant found guilty if beyond a       Reject the null hypothesis if p-value
  reasonable doubt                         less than the significance level
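The steps in the right-hand column can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the null hypothesis is that a coin is fair, the test statistic is the number of heads, and the data (9 heads in 10 tosses) are hypothetical. The function names `binomial_p_value` and `reject_null` are made up for this sketch.

```python
from math import comb


def binomial_p_value(heads: int, tosses: int, p_null: float = 0.5) -> float:
    """One-sided p-value: P(at least `heads` heads in `tosses` tosses),
    computed assuming the null hypothesis (P(heads) = p_null) is true."""
    return sum(
        comb(tosses, k) * p_null**k * (1 - p_null) ** (tosses - k)
        for k in range(heads, tosses + 1)
    )


def reject_null(p_value: float, alpha: float = 0.05) -> bool:
    """Reject the null hypothesis if the p-value is less than the
    significance level alpha (the 'beyond a reasonable doubt' threshold)."""
    return p_value < alpha


# Hypothetical data: 9 heads in 10 tosses.
p = binomial_p_value(9, 10)
print(f"p-value = {p:.4f}, reject null at 5%: {reject_null(p)}")
# prints: p-value = 0.0107, reject null at 5%: True
```

The significance level of 0.05 is a conventional choice, not one taken from the table; a stricter level plays the role of a stricter standard of proof.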
$$\frac{4+2+0+1+2}{5} = \frac{9}{5} = 1.8 \text{ heads}$$
The Law of Large Numbers states that as the sample size (number of
observations) increases, the sample mean approaches the true mean of the
distribution being sampled.
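A small simulation can make the Law of Large Numbers concrete. The sketch below assumes a fair coin (true mean 0.5 when heads is scored as 1 and tails as 0) and tracks the running proportion of heads; the function name and the fixed seed are choices made for this illustration only.

```python
import random


def running_proportion_of_heads(n_tosses: int, seed: int = 0) -> list[float]:
    """Simulate fair-coin tosses and return the sample mean (proportion of
    heads) after each toss."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    heads = 0
    proportions = []
    for i in range(1, n_tosses + 1):
        heads += rng.random() < 0.5  # True counts as 1, False as 0
        proportions.append(heads / i)
    return proportions


props = running_proportion_of_heads(100_000)
for n in (10, 100, 1_000, 10_000, 100_000):
    print(f"n = {n:>6}: sample mean = {props[n - 1]:.4f}")
```

For small n the sample mean can be far from 0.5; by the last line it should sit very close to it.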
For the sample mean

$$\bar{x} = \frac{x_1 + x_2 + \cdots + x_n}{n},$$

the Central Limit Theorem states that as the sample size increases, the distribution of $\bar{x}$ becomes closer to a normal distribution. Also, the distribution of the sum of the random observations, $\sum_{i=1}^{n} x_i$, becomes closer to a normal distribution.
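The Central Limit Theorem can be checked roughly by simulation. The sketch below assumes the observations are rolls of a fair die (mean 3.5, variance 35/12), which is an example chosen for this illustration. It draws many samples of size n and reports what fraction of the sample means fall within 1.96 standard errors of 3.5; for a normal distribution that fraction is about 0.95.

```python
import random
import statistics

MU = 3.5                  # mean of a single fair die roll
SIGMA = (35 / 12) ** 0.5  # standard deviation of a single fair die roll


def sample_means(n: int, reps: int, seed: int = 0) -> list[float]:
    """Means of `reps` independent samples, each of `n` fair die rolls."""
    rng = random.Random(seed)
    return [
        statistics.fmean(rng.randint(1, 6) for _ in range(n))
        for _ in range(reps)
    ]


def fraction_within_1_96_se(n: int, reps: int, seed: int = 0) -> float:
    """Fraction of sample means within 1.96 standard errors of MU."""
    se = SIGMA / n**0.5
    means = sample_means(n, reps, seed)
    return sum(abs(m - MU) <= 1.96 * se for m in means) / reps


for n in (1, 5, 50):
    frac = fraction_within_1_96_se(n, 5_000)
    print(f"n = {n:>2}: fraction within 1.96 SE = {frac:.3f}")
```

With n = 1 every roll lies within 1.96 standard deviations of 3.5, so the fraction is 1 and the distribution is clearly not normal; as n grows the fraction should settle near 0.95.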
                     Preferred Newspaper
Gender    Globe and Mail   Toronto Star   Toronto Sun
Male      O_(1,1)          O_(1,2)        O_(1,3)
Female    O_(2,1)          O_(2,2)        O_(2,3)
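Let n = $\sum_i \sum_j O_{(i,j)}$ be the total count in the table. Assume that there is no relationship between gender and newspaper preference. Then applying our fact from probability theory (the probabilities of independent events multiply), the probability of a male preferring the Globe and Mail is the proportion of males times the proportion of Globe readers; the expected number of male Globe readers is n times that. Call this expected count in category (i,j): E_(i,j). In terms of the table,

$$E_{(i,j)} = n \cdot \frac{\sum_{j'} O_{(i,j')}}{n} \cdot \frac{\sum_{i'} O_{(i',j)}}{n} = \frac{(\text{row } i \text{ total}) \times (\text{column } j \text{ total})}{n}.$$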
$$X^2 = \sum_i \sum_j \frac{\left(O_{(i,j)} - E_{(i,j)}\right)^2}{E_{(i,j)}}.$$
If gender and newspaper preference are truly independent, X^2 has (approximately, for large n) a chi^2 distribution on
rc - 1 - (r-1) - (c-1) = (r-1)(c-1) degrees of freedom, where r is the number of rows in our table and c is the number of columns.
Note 1: We lose a degree of freedom each time we treat something as fixed, for example, the total number of males, the total number of Sun readers, etc.
Note 2: The distribution of X^2 follows from the above distribution theory, plus some calculation. See, for example, Mathematical Statistics with Applications, by Mendenhall, Wackerly, and Scheaffer.
The distribution of the test statistic assuming the null hypothesis is true: chi^2 with (r-1)(c-1) degrees of freedom.
The conclusion: If the probability of getting an X^2 that is as large or larger than what we got is small, we have evidence that our null hypothesis is false.
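The whole procedure (expected counts, X^2, degrees of freedom, upper-tail probability) can be sketched with the Python standard library alone. The function names are invented for this sketch. The tail probability uses the standard closed forms for 1 and 2 degrees of freedom plus a recurrence that adds two degrees of freedom at a time, so it only handles whole-number degrees of freedom. The example table is made up so that its rows are exactly proportional, which should give X^2 = 0.

```python
import math


def expected_counts(observed):
    """E_(i,j) = (row i total) * (column j total) / n under independence."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]


def chi_square_statistic(observed):
    """X^2 = sum over all cells of (O - E)^2 / E."""
    expected = expected_counts(observed)
    return sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(observed, expected)
        for o, e in zip(obs_row, exp_row)
    )


def chi2_upper_tail(x, df):
    """P(chi^2 with `df` degrees of freedom >= x), for a positive integer df."""
    if x <= 0:
        return 1.0
    half = x / 2
    if df % 2 == 0:
        q, nu = math.exp(-half), 2               # closed form for df = 2
    else:
        q, nu = math.erfc(math.sqrt(half)), 1    # closed form for df = 1
    while nu < df:                               # step from nu to nu + 2
        q += half ** (nu / 2) * math.exp(-half) / math.gamma(nu / 2 + 1)
        nu += 2
    return q


def independence_test(observed):
    """Return (X^2, degrees of freedom, p-value) for an r-by-c table."""
    r, c = len(observed), len(observed[0])
    df = (r - 1) * (c - 1)
    x2 = chi_square_statistic(observed)
    return x2, df, chi2_upper_tail(x2, df)


# Made-up 2-by-3 table whose second row is exactly twice the first.
x2, df, p = independence_test([[10, 20, 30], [20, 40, 60]])
print(f"X^2 = {x2:.3f}, df = {df}, p-value = {p:.3f}")
# prints: X^2 = 0.000, df = 2, p-value = 1.000
```

A table that departs from proportional rows gives a larger X^2 and a smaller p-value; the approximation is usually considered unreliable when expected counts are very small.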
               Newspaper
Gender    Globe    Star    Sun
Male
Female
Words         Sense and Sensibility   Emma   Sanditon I   Sanditon II
a PB such                        14     16            8             2
a NPB such                      133    180           93            81
and FB I                         12     14           12             1
and NFB I                       241    285          139           153
the PB on                        11      6            8            17
the NPB on                      259    265          221           204

(PB = preceded by, NPB = not preceded by, FB = followed by, NFB = not followed by.)
                  Hit    No hit
Regular season   2584      7280
World Series       35        63
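As a worked example, the chi^2 test of independence can be applied to the hit / no-hit table above. The sketch assumes the question of interest is whether getting a hit is independent of the setting (regular season or World Series). It uses the shortcut form of X^2 for a 2-by-2 table, which is algebraically the same as summing (O - E)^2 / E over the four cells, and applies no continuity correction.

```python
import math

# Counts copied from the table above: (hit, no hit) for each row.
observed = {
    "Regular season": (2584, 7280),
    "World Series": (35, 63),
}

(a, b), (c, d) = observed["Regular season"], observed["World Series"]
n = a + b + c + d

# Shortcut for a 2-by-2 table:
#   X^2 = n (ad - bc)^2 / (row1 total * row2 total * col1 total * col2 total)
x2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))

# Degrees of freedom: (r-1)(c-1) = (2-1)(2-1) = 1.
# A chi^2 variable with 1 df is the square of a standard normal, so
#   P(chi^2_1 >= x) = erfc(sqrt(x / 2)).
p_value = math.erfc(math.sqrt(x2 / 2))

print(f"X^2 = {x2:.2f} on 1 degree of freedom, p-value = {p_value:.3f}")
```

A small p-value here would count as evidence against independence, in the sense of the conclusion stated earlier; whether it is small enough depends on the significance level chosen.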