Probability, Statistics & Modeling II
Bayesian statistics
Rewind:
Null hypothesis significance testing
NULL hypothesis testing
Directed hypotheses:
Purpose:
In fact: all we can ever say is whether \(H_0\) was rejected or not!
So: we desperately need hypotheses, but NHST is weak
Rooted in two ideas of probability:
Frequentist vs Bayesian
I have misplaced my phone somewhere in the home. I can use the phone locator on the base of the instrument to locate the phone and when I press the phone locator the phone starts beeping.
Problem: Which area of my home should I search?
From this SO post
Frequentist Reasoning
I can hear the phone beeping. I also have a mental model which helps me identify the area from which the sound is coming. Therefore, upon hearing the beep, I infer the area of my home I must search to locate the phone.
Bayesian Reasoning
I can hear the phone beeping. Now, apart from a mental model which helps me identify the area from which the sound is coming, I also know the locations where I have misplaced the phone in the past. So, I combine my inferences using the beeps and my prior information about the locations I have misplaced the phone in the past to identify an area I must search to locate the phone.
Remember?
\(P(A|B) = \frac{P(B|A)*P(A)}{P(B)}\)
\(P(terrorist|alarm) = \frac{P(alarm|terrorist)*P(terrorist)}{P(alarm)}\)
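A minimal sketch of the alarm example in R. All three input probabilities are hypothetical values chosen only for illustration (they are not from the lecture); the point is to show how \(P(alarm)\) is built from the prior and the two conditional probabilities.

```r
# Hypothetical inputs (assumptions for illustration only)
p_terrorist             <- 1 / 10000  # prior: P(terrorist)
p_alarm_given_terrorist <- 0.99       # likelihood: P(alarm | terrorist)
p_alarm_given_innocent  <- 0.01       # false alarm rate: P(alarm | not terrorist)

# Marginal P(alarm) via the law of total probability
p_alarm <- p_alarm_given_terrorist * p_terrorist +
  p_alarm_given_innocent * (1 - p_terrorist)

# Bayes' rule: P(terrorist | alarm)
p_terrorist_given_alarm <- p_alarm_given_terrorist * p_terrorist / p_alarm
print(p_terrorist_given_alarm)
```

With these assumed numbers the posterior stays below 1% even though the alarm is quite accurate, because the prior is so small.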
We want to know:
\(P(H|D)\) : prob. of the hyp. given the data (posterior)
\(P(H|D) = \frac{P(D|H)*P(H)}{P(D)}\)
\(posterior = \frac{likelihood*prior}{marginal}\)
Since: \(P(D)\) does not involve the hypothesis, …
\(P(H|D) \propto P(D|H)*P(H)\)
\(posterior \propto likelihood*prior\)
Bayesian inference: about updating beliefs with the data.
If for any H:
\(P(H|D) \propto P(D|H)*P(H)\)
… then maybe we can compare the evidence \(P(H_0|D)\) with the evidence \(P(H_A|D)\)?
Important: no special status for \(H_0\)!
Suppose we have not seen the data, then:
\(odds_{0A} = \frac{P(H_0)}{P(H_A)}\)
or:
\(odds_{prior} = \frac{prior_{H_0}}{prior_{H_A}}\)
What we want for two hypotheses \(H_0\) and \(H_A\) is:
\(\frac{P(H_A|D)}{P(H_0|D)} = \frac{P(D|H_A)}{P(D|H_0)}*\frac{P(H_A)}{P(H_0)}\)
The first factor, \(\frac{P(D|H_A)}{P(D|H_0)}\), says how much more likely the data are under \(H_A\) compared to \(H_0\).
It is called the Bayes Factor \(BF_{A0}\); the second factor is the prior odds.
The evidence in the data favors one hypothesis, relative to another, exactly to the degree that the hypothesis predicts the observed data better than the other.
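The relation posterior odds = Bayes factor * prior odds can be sketched in a few lines of R. The helper names `posterior_odds` and `odds_to_prob`, the Bayes factor of 5 and the three prior odds are all hypothetical and only serve as illustration.

```r
# Posterior odds = Bayes factor * prior odds
posterior_odds <- function(bf, prior_odds = 1) bf * prior_odds

# Convert odds in favour of a hypothesis into a probability
odds_to_prob <- function(odds) odds / (1 + odds)

bf_A0 <- 5  # hypothetical Bayes factor in favour of H_A

# Three hypothetical prior odds for H_A over H_0
prior_odds_A0 <- c(sceptical = 1 / 4, neutral = 1, optimistic = 4)

post_odds <- posterior_odds(bf_A0, prior_odds_A0)
post_prob <- odds_to_prob(post_odds)

print(post_odds)
print(round(post_prob, 3))
```

The same data (same Bayes factor) shift every starting belief by the same multiplicative factor, but the resulting posterior probability still depends on where one started.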
Suppose we have two lines of thought re. successful re-integration after prison:
Optimists say that 65% of offenders can be re-integrated in society; skeptics say it’s 40%.
Data: 100 offenders and their outcome (successful vs fail)
Closer to the optimists, but how much?
How much does the evidence change our beliefs?
Plausibility of the hypotheses \(H_{opt.} = 0.65\) and \(H_{skept.} = 0.40\) changes according to Bayes’ rule!
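58 successes. Probability of these data under each hypothesis:
for \(H_{opt.} = 0.65\): \(P(D|H_{opt.}) = 0.0284\)
for \(H_{skept.} = 0.40\): \(P(D|H_{skept.}) \approx 0.0001\)
So: \(\frac{P(D|H_{opt.})}{P(D|H_{skept.})} = 250.03\) (computed from the unrounded probabilities; the rounded values shown here would give 284)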
\(BF = \frac{P(D|H_{opt.})}{P(D|H_{skept.})} = 250.03\)
The data are 250 times more likely under \(H_{opt.}\) than under \(H_{skept.}\)
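A short sketch of how these numbers could be obtained in R, assuming a binomial likelihood for 58 successes out of 100. The equal prior odds used in the last step are an assumption added here for illustration, not something stated on the slides.

```r
n <- 100  # offenders
k <- 58   # successful re-integrations

# Likelihood of the data under each point hypothesis (binomial model assumed)
lik_opt   <- dbinom(k, size = n, prob = 0.65)
lik_skept <- dbinom(k, size = n, prob = 0.40)

# Bayes factor: optimists relative to skeptics
bf_opt_skept <- lik_opt / lik_skept
print(round(c(lik_opt = lik_opt, lik_skept = lik_skept), 4))
print(bf_opt_skept)

# Assumption: both camps equally plausible beforehand (prior odds = 1)
post_prob_opt <- bf_opt_skept / (1 + bf_opt_skept)
print(post_prob_opt)
```

Note that the binomial coefficient cancels in the ratio, so the Bayes factor depends only on the two success probabilities and the counts.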
Bayesian estimation can handle this.
It can solve the sh** \(H_0\) problem!!!!!
Now we can quantify relative evidence:
\(BF_{01} = \frac{P(D|H_0)}{P(D|H_1)}\)
Relative evidence of \(H_0\) over \(H_1\)
Two approaches in PSM2:
tapply(mydata$score, mydata$group, mean)
## A B
## 101.2419 100.6370
t.test(score ~ group
, data = mydata
, var.equal = TRUE)
##
## Two Sample t-test
##
## data: score by group
## t = 0.90114, df = 1998, p-value = 0.3676
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
## -0.7115953 1.9214737
## sample estimates:
## mean in group A mean in group B
## 101.2419 100.6370
Cohen’s d
# d = t * sqrt(1/n1 + 1/n2), with t rounded to 0.90 and n1 = n2 = 1000
d = 0.90*(sqrt(1/1000 + 1/1000))
d
## [1] 0.04024922
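To see why the shortcut via the t statistic works, here is a sketch that computes Cohen's d both ways. Since `mydata` is not available here, the data frame `sim` is simulated; its means, SD and seed are assumptions made only for this illustration.

```r
set.seed(1)
n <- 1000

# Simulated stand-in for mydata (hypothetical values)
sim <- data.frame(
  score = c(rnorm(n, mean = 101, sd = 15), rnorm(n, mean = 100, sd = 15)),
  group = factor(rep(c("A", "B"), each = n))
)

m   <- tapply(sim$score, sim$group, mean)
v   <- tapply(sim$score, sim$group, var)
n_g <- tapply(sim$score, sim$group, length)

# Direct definition: mean difference divided by the pooled SD
sd_pooled <- sqrt(((n_g["A"] - 1) * v["A"] + (n_g["B"] - 1) * v["B"]) /
                    (sum(n_g) - 2))
d_direct <- unname((m["A"] - m["B"]) / sd_pooled)

# Shortcut: d = t * sqrt(1/n1 + 1/n2), using the equal-variance t-test
t_stat   <- unname(t.test(score ~ group, data = sim, var.equal = TRUE)$statistic)
d_from_t <- unname(t_stat * sqrt(1 / n_g["A"] + 1 / n_g["B"]))

print(c(d_direct = d_direct, d_from_t = d_from_t))
```

Both routes give the same value because the equal-variance t statistic is the mean difference divided by the pooled SD times sqrt(1/n1 + 1/n2).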
library(BayesFactor)
ttestBF(formula = score ~ group
, data = mydata)
## Bayes factor analysis
## --------------
## [1] Alt., r=0.707 : 0.07520633 ±0%
##
## Against denominator:
## Null, mu1-mu2 = 0
## ---
## Bayes factor type: BFindepSample, JZS
\(BF_{10} = 0.075\), so:
\(BF_{01} = 1/0.075 = 13.33\)
→ The data are about 13 times more likely under \(H_0\) than under the alternative: evidence quantified for both hypotheses!
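The inversion can be checked with the unrounded value from the output above. The step to a posterior probability assumes equal prior odds, which is an assumption added here and not part of the slides.

```r
bf10 <- 0.07520633  # Bayes factor for the alternative, as reported by ttestBF

# Evidence for the null relative to the alternative
bf01 <- 1 / bf10
print(bf01)

# Assumption: prior odds of 1 (H_0 and the alternative equally plausible beforehand)
post_prob_h0 <- bf01 / (1 + bf01)
print(post_prob_h0)
```

Using the unrounded Bayes factor gives a value slightly below 13.33, since 13.33 comes from inverting the rounded 0.075.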