Peter Ralph
8 January 2018 – Advanced Biological Statistics
Course structure and introductions
Probability: review and notation
Bayesian analysis
Application: Beta distributed coins
Suppose I have two trick coins: coin A comes up heads 75% of the time, and coin B comes up heads only 25% of the time.
But I lost one, and I don't know which coin I have left! So I flip it 10 times and get 6 heads. Which coin is it, and how sure are you?
Er, probably coin A?
Well, \[\begin{aligned} \P\{ \text{6 H in 10 flips} \given \text{coin A} \} &= \binom{10}{6} (.75)^6 (.25)^4 \\ &= 0.146 \end{aligned}\] and \[\begin{aligned} \P\{ \text{6 H in 10 flips} \given \text{coin B} \} &= \binom{10}{6} (.25)^6 (.75)^4 \\ &= 0.016 \end{aligned}\] … so, probably coin A?
For a precise answer…
Before flipping, each coin seems equally likely. Then
\[\begin{aligned} \P\{ \text{coin A} \given \text{6 H in 10 flips} \} &= \frac{ \frac{1}{2} \times 0.146 }{ \frac{1}{2} \times 0.146 + \frac{1}{2} \times 0.016 } \\ &= 0.9 \end{aligned}\]
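To check this arithmetic in R, here is a minimal sketch (using dbinom for the binomial likelihoods; the variable names are mine):

```r
lik_A <- dbinom(6, size = 10, prob = 0.75)   # about 0.146
lik_B <- dbinom(6, size = 10, prob = 0.25)   # about 0.016
# posterior probability of coin A, with equal (1/2) prior weights
(0.5 * lik_A) / (0.5 * lik_A + 0.5 * lik_B)  # about 0.9
```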
Probabilities are proportions: \(\hspace{2em} 0 \le \P\{A\} \le 1\)
Everything: \(\hspace{2em} \P\{ \Omega \} = 1\)
Complements: \(\hspace{2em} \P\{ \text{not } A\} = 1 - \P\{A\}\)
Disjoint events: If \(\hspace{2em} \P\{A \text{ and } B\} = 0\) then \(\hspace{2em} \P\{A \text{ or } B\} = \P\{A\} + \P\{B\}\).
Independence: \(A\) and \(B\) are independent iff \(\P\{A \text{ and } B\} = \P\{A\} \P\{B\}\).
Conditional probability: \[\P\{A \given B\} = \frac{\P\{A \text{ and } B\}}{ \P\{B\} }\]
A consequence is
\[\P\{B \given A\} = \frac{\P\{B\} \P\{A \given B\}}{ \P\{A\} } .\]
In “Bayesian statistics”, \(B\) plays the role of the hypothesis and \(A\) the data, so this says: \[\text{posterior} \propto \text{prior} \times \text{likelihood} .\]
Suppose instead I had 9 coins, with probabilities of heads 10%, 20%, …, 80%, 90%; as before, I flipped one 10 times and got 6 heads. For each \(\theta\) in \(0.1, 0.2, \ldots, 0.8, 0.9\), find \[\P\{ \text{coin has prob $\theta$} \given \text{6 H in 10 flips} \}\] and plot these as a bar plot.
Question: which coin(s) is it, and how sure are you? (And what do you mean when you say how sure you are?)
Time allowing, do it again with 99 coins, and with 999 coins. Does your answer change? (One way to set up the computation is sketched below.)
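A sketch of one possible setup in R (the variable names are mine):

```r
theta <- seq(0.1, 0.9, by = 0.1)                 # the nine possible coins
prior <- rep(1 / length(theta), length(theta))   # equally likely beforehand
lik <- dbinom(6, size = 10, prob = theta)        # P(6 H in 10 flips | theta)
posterior <- prior * lik / sum(prior * lik)
barplot(posterior, names.arg = theta,
        xlab = expression(theta), ylab = "posterior probability")
```

For 99 or 999 coins, only the first line changes (e.g., seq(0.01, 0.99, by = 0.01)).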
[Figure: for each possible \(\theta\), a series of panels showing prior \(\times\) likelihood \(\propto\) posterior.]
If \[P \sim \text{Beta}(a,b)\] then \(P\) has probability density \[p(\theta) = \frac{ \theta^{a-1} (1 - \theta)^{b-1} }{ B(a,b) } . \]
Takes values between 0 and 1.
If \(U_{(1)} < U_{(2)} < \cdots < U_{(n)}\) are sorted, independent \(\text{Unif}[0,1]\) then \(U_{(k)} \sim \text{Beta}(k, n-k+1)\).
Mean: \(a/(a+b)\).
Larger \(a+b\) means more tightly concentrated: the width scales like \(1/\sqrt{a+b}\) (illustrated below).
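For intuition, a quick sketch (my parameter choices) plotting Beta densities with the mean fixed at \(3/4\) and increasing \(a+b\):

```r
theta <- seq(0, 1, length.out = 501)
plot(0, type = "n", xlim = c(0, 1), ylim = c(0, 8),
     xlab = expression(theta), ylab = "density")
for (n in c(4, 16, 64)) {   # n = a + b, with a = 0.75 n and b = 0.25 n
  lines(theta, dbeta(theta, shape1 = 0.75 * n, shape2 = 0.25 * n))
}
```

The curves pile up around \(3/4\) as \(a+b\) grows, illustrating the \(1/\sqrt{a+b}\) concentration.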
Recall: there were nine possible values of \(\theta\).
Which coin is it, and how sure are you?
Possible types of answer:
a single best guess (e.g., the coin with the highest posterior probability);
a set of coins that together account for most of the posterior probability;
the entire posterior distribution.
Give examples of when each type of answer is the right one.
Now suppose we want to estimate the probability of heads for a coin without knowing the possible values in advance (or a disease incidence, or the error rate in an experiment, …).
We flip it \(n\) times and get \(z\) Heads.
The likelihood of this, given the prob-of-heads \(\theta\), is: \[p(z \given \theta) = \binom{n}{z}\theta^z (1-\theta)^{n-z} . \]
How to weight the possible \(\theta\)? Need a flexible set of weighting functions, i.e., prior distributions on \([0,1]\).
What would we use if: (some candidate choices are sketched after this list)
the coin is probably close to fair.
the disease is probably quite rare.
no idea whatsoever.
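For each of these, one candidate Beta prior (the particular parameter values are my illustrative picks, not prescriptions):

```r
theta <- seq(0, 1, length.out = 501)
plot(theta, dbeta(theta, 10, 10), type = "l", ylim = c(0, 20),
     xlab = expression(theta), ylab = "prior density")  # probably close to fair
lines(theta, dbeta(theta, 1, 20), lty = 2)              # probably quite rare
lines(theta, dbeta(theta, 1, 1), lty = 3)               # no idea: uniform on [0, 1]
legend("topright", lty = 1:3,
       legend = c("Beta(10, 10)", "Beta(1, 20)", "Beta(1, 1)"))
```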
If \[\begin{aligned} P &\sim \text{Beta}(a,b) \\ Z &\sim \text{Binom}(n,P) , \end{aligned}\] then by Bayes’ rule: \[\begin{aligned} \P\{ P = \theta \given Z = z\} &= \frac{\P\{Z = z \given P = \theta \} \P\{P = \theta\}}{\P\{Z = z\}} \\ &= \frac{ \binom{n}{z}\theta^z (1-\theta)^{n-z} \times \frac{\theta^{a-1}(1-\theta)^{b-1}}{B(a,b)} }{ \text{(something)} } \\ &= \text{(something else)} \times \theta^{a + z - 1} (1-\theta)^{b + n - z - 1} . \end{aligned}\]
“Miraculously” (the Beta is the conjugate prior for the Binomial), \[\begin{aligned} P \given Z = z \sim \text{Beta}(a+z, b+n-z) . \end{aligned}\]
Simulate \(10^6\) coin probabilities, called \(\theta\), from Beta(5,5). (rbeta)
For each coin, simulate 10 flips. (rbinom)
Make a histogram of the true probabilities (values of \(\theta\)) of only those coins having 3 of 10 heads.
Compare the distribution to Beta(\(a\),\(b\)) – with what \(a\) and \(b\)? (dbeta)
Explain what happened. (A minimal implementation is sketched below.)
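A minimal sketch of that simulation (variable names are mine; the overlaid curve uses the conjugacy result above):

```r
theta <- rbeta(1e6, 5, 5)                  # 10^6 coin probabilities
z <- rbinom(1e6, size = 10, prob = theta)  # 10 flips of each coin
kept <- theta[z == 3]                      # coins giving 3 heads out of 10
hist(kept, breaks = 50, freq = FALSE,
     xlab = expression(theta), main = "coins with 3 of 10 heads")
# by conjugacy, these should follow Beta(5 + 3, 5 + 10 - 3) = Beta(8, 12)
curve(dbeta(x, 8, 12), add = TRUE, lwd = 2)
```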
We flip an odd-looking coin 100 times, and get 65 heads. What is its true* probability of heads? (One analysis is sketched after the questions below.)
True = ??
What prior to use?
Is it reasonable that \(\theta = 1/2\)?
Best guess at \(\theta\)?
How far off are we, probably?
How much does the answer depend on the prior?
Does our procedure work on simulated data?
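A hedged sketch of one such analysis, assuming a uniform Beta(1,1) prior (any of the priors above could be swapped in):

```r
a <- 1; b <- 1           # uniform Beta(1, 1) prior; try others to compare
n <- 100; z <- 65
pa <- a + z; pb <- b + n - z            # posterior is Beta(a + z, b + n - z)
pa / (pa + pb)                          # posterior mean: a best guess at theta
qbeta(c(0.025, 0.975), pa, pb)          # a 95% credible interval
pbeta(0.5, pa, pb)                      # posterior P(theta <= 1/2)
```

Rerunning with different \(a\) and \(b\) addresses prior sensitivity, and rerunning on data simulated with a known \(\theta\) (via rbinom) checks that the procedure recovers it.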