Probability, Statistics & Modeling II
Behind the problem:
What is the claim?
chance_day_1 = 0.5
chance_day_2 = 0.5
chance_day_3 = 0.5
#...
Probability for correct prediction?
P(prediction == 1) = p_correct = 0.5
… on 10 consecutive days?
p_correct * p_correct * p_correct ...
p_correct = 0.5
# for d = 10 days
d = 10
#Formal:
p_correct ^ d
##  0.0009765625
Equivalent to: 1/2^10 = 1/1024
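The same calculation can be wrapped in a small helper so that other streak lengths can be tried. This is a minimal sketch: `p_streak` is a hypothetical name, and it assumes that the days are independent and that the chance of a correct prediction is the same every day.

```r
# Probability of d correct predictions in a row.
# Assumes independent days with the same chance of success (hypothetical helper).
p_streak = function(p_correct, d) {
  p_correct ^ d
}

streak_10 = p_streak(p_correct = 0.5, d = 10)
streak_10

# Check the equivalence with 1/2^10 = 1/1024
all.equal(streak_10, 1 / 1024)
```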
Even very, very rare events happen…
You need probability theory to tell the lucky from the likely.
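To see why a rare streak is not surprising once many people are guessing, the sketch below counts how many pure guessers would be expected to get 10 days right. The number of guessers (`n_guessers = 10000`) is a made-up value for illustration, not a figure from the lecture.

```r
p_correct = 0.5
d = 10
n_guessers = 10000  # hypothetical number of people guessing at random

# Chance that one guesser is right on all d days
p_streak = p_correct ^ d

# Expected number of guessers with a perfect streak
expected_lucky = n_guessers * p_streak
expected_lucky

# Chance that at least one guesser has a perfect streak
p_at_least_one = 1 - (1 - p_streak) ^ n_guessers
p_at_least_one
```

With enough guessers, someone is almost certain to look like an expert by luck alone.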
(and proper statistical notation)
Maria is 26 years old, single, outspoken, and very bright. She majored in law. As a student, she was deeply concerned with issues of discrimination and miscarriage of justice, and also participated in animal-rights demonstrations.
Adapted from Tversky & Kahneman (1983)
There’s something special about P(B)
B = A + "something else"
B contains two ‘events’: A and ‘pro bono work’
Let ‘pro bono work’ be C
B = A and C
P(B) = P(A and C)
Prob_A = 0.4
Prob_C = 0.3
P(A and C) = P(A)*P(C) (assuming A and C are independent)
(Prob_A_and_C = Prob_A * Prob_C)
##  0.12
P(X) ≥ P(X and Y)
P(‘M is a lawyer’) ≥ P(‘M is a lawyer’ and ‘pro-bono work’)
P(EVENT_A AND EVENT_B) = P(EVENT_A)*P(EVENT_B)
The probability of two independent events both occurring is never larger than the probability of either event alone.
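The conjunction rule can be checked numerically. The sketch below reuses the example values 0.4 and 0.3 and then tests the inequality on a coarse grid of probabilities; the grid and the names `p`, `grid` and `rule_holds` are illustrative choices, and independence is assumed throughout.

```r
Prob_A = 0.4
Prob_C = 0.3
Prob_A_and_C = Prob_A * Prob_C  # product rule, assuming independence

# The conjunction is never more probable than either single event
Prob_A_and_C <= Prob_A
Prob_A_and_C <= Prob_C

# Same check for every pair of probabilities on a grid from 0 to 1
p = (0:10) / 10
grid = expand.grid(p_x = p, p_y = p)
rule_holds = all(grid$p_x * grid$p_y <= pmin(grid$p_x, grid$p_y))
rule_holds
```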
What are the chances that this man is a terrorist?
Probability of TERRORIST given that there is an ALARM
P(terrorist GIVEN alarm)
P(terrorist|alarm) = 950/5900 = 16.10%
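The counts 950 and 5900 can be reproduced with natural frequencies. This sketch assumes a population of 100,000 people (a value inferred from the counts, not stated on the slide), a 1% base rate of terrorists, and a scanner that is 95% accurate for terrorists and non-terrorists alike.

```r
population = 100000  # assumed population size
baserate = 0.01      # share of terrorists
accuracy = 0.95      # scanner accuracy, assumed equal for both groups

terrorists = population * baserate
non_terrorists = population - terrorists

true_alarms = terrorists * accuracy             # terrorists who trigger the alarm
false_alarms = non_terrorists * (1 - accuracy)  # non-terrorists who trigger the alarm
all_alarms = true_alarms + false_alarms

# Share of alarms that are caused by an actual terrorist
true_alarms / all_alarms
```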
Setting the stage:
accuracy = 0.95 #P(A|T)
baserate = 0.01 #P(T)
P(T|A) = ( P(A|T) * P(T) ) / P(A)
P(A) –> probability of any alarm???
P(A) = P(A|T) * P(T) + P(A|notT) * P(notT)
(Prob_notT = 1 - baserate) #P(notT) = 1 - P(T)
##  0.99
(Prob_A_given_notT = 1 - accuracy) #P(A|notT) = 1 - P(A|T), assuming the same accuracy for non-terrorists
##  0.05
Putting it together:
#Bayes' rule:
Prob_A = accuracy * baserate + Prob_A_given_notT * Prob_notT #P(A) = P(A|T) * P(T) + P(A|notT) * P(notT)
Prob_A
##  0.059
Prob_T_given_A = (accuracy * baserate) / Prob_A #P(T|A) = ( P(A|T) * P(T) ) / P(A)
Prob_T_given_A
##  0.1610169
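The steps above can be collected into one function. This is a sketch, not a standard R function: `bayes_posterior` and its argument names are hypothetical, and the false alarm rate is passed separately so that it does not have to equal 1 minus the hit rate.

```r
# Posterior P(H|E) from a prior P(H) and the two likelihoods
# P(E|H) and P(E|notH). Hypothetical helper for illustration.
bayes_posterior = function(prior, p_e_given_h, p_e_given_not_h) {
  # Total probability of the evidence: P(E)
  p_e = p_e_given_h * prior + p_e_given_not_h * (1 - prior)
  # Bayes' rule: P(H|E) = P(E|H) * P(H) / P(E)
  (p_e_given_h * prior) / p_e
}

# Terrorist example: P(T) = 0.01, P(A|T) = 0.95, P(A|notT) = 0.05
posterior = bayes_posterior(prior = 0.01, p_e_given_h = 0.95, p_e_given_not_h = 0.05)
posterior
```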
! Revise this rule here
P(EVENT_A GIVEN EVENT_B) = P(EVENT_A|EVENT_B)
Probability of one event given that another event is true.
BEWARE OF THE BASE RATE FALLACY
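To see how strongly the answer depends on the base rate, the sketch below repeats the Bayes calculation for several base rates while keeping the scanner accuracy at 0.95. The base rates in `baserates` are illustrative values, and the false alarm rate is again assumed to be 1 minus the accuracy.

```r
accuracy = 0.95
false_alarm_rate = 1 - accuracy  # assumed, as in the terrorist example

# Illustrative base rates, from very rare to as common as not
baserates = c(0.0001, 0.001, 0.01, 0.1, 0.5)

# Vectorised Bayes' rule: one posterior per base rate
Prob_A = accuracy * baserates + false_alarm_rate * (1 - baserates)
Prob_T_given_A = (accuracy * baserates) / Prob_A

data.frame(baserate = baserates, Prob_T_given_A = round(Prob_T_given_A, 4))
```

The posterior only reaches the scanner's accuracy of 0.95 when the base rate is 0.5; for rare events it stays far below it.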
Solving gang crime
Problem: gang crime in London
Mayor proposes two programmes:
100 gang members per programme, spread across two areas.
Outcome measure: number of gang members who disengaged
Mayor has GBP 5m to invest in one programme.
|Area||Programme A||Programme B|
|Camden||63/90 = 70%||8/10 = 80%|
|Lambeth||4/10 = 40%||45/90 = 50%|
|Total||67/100 = 67%||53/100 = 53%|
Simpson's paradox: "[a] phenomenon wherein an association or a trend observed in the data at the level of the entire population disappears or even reverses when data is disaggregated by its underlying subgroups" (Alipourfard et al., 2018)
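The reversal in the gang programme numbers can be reproduced in a few lines. This sketch enters the counts from the table by hand; the data frame and column names (`gangs`, `disengaged`, `total`, `overall`) are illustrative choices.

```r
# Counts from the table: disengaged gang members out of those enrolled
gangs = data.frame(
  area       = c("Camden", "Lambeth", "Camden", "Lambeth"),
  programme  = c("A", "A", "B", "B"),
  disengaged = c(63, 4, 8, 45),
  total      = c(90, 10, 10, 90)
)

# Success rate within each area: Programme B is ahead in both areas
gangs$rate = gangs$disengaged / gangs$total
gangs

# Success rate after pooling the areas: Programme A is ahead
overall = aggregate(cbind(disengaged, total) ~ programme, data = gangs, FUN = sum)
overall$rate = overall$disengaged / overall$total
overall
```

The reversal arises because Programme A enrolled most of its members in Camden, where disengagement was more common under either programme.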
BEWARE OF THE CONTEXT OF YOUR DATA
10 min. break
More on learning outcomes in the module handbook
Teaching assistant: Isabelle van der Vegt
Homework for today:
Tutorial + lecture
Tutorial: Refresher of PSM I with R + GLM tutorial