Section 3. Multidimensional Posterior
2021-09-02
3.1 Resources
3.2 Notes
3.2.1 Reading instructions
- the trace of a square matrix \(tr(A)\) is the sum of its diagonal elements
- the following cyclic property of the trace is used in the derivation of 3.11: \(tr(ABC) = tr(CAB) = tr(BCA)\)
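The cyclic property above can be checked numerically; a minimal sketch with NumPy on random (non-square) factors whose products are square:

```python
import numpy as np

# Verify the cyclic trace property tr(ABC) = tr(CAB) = tr(BCA)
# on random matrices; shapes are chosen so each product is square.
rng = np.random.default_rng(0)
A = rng.standard_normal((2, 3))
B = rng.standard_normal((3, 4))
C = rng.standard_normal((4, 2))

t1 = np.trace(A @ B @ C)  # tr(ABC): a 2x2 product
t2 = np.trace(C @ A @ B)  # tr(CAB): a 4x4 product
t3 = np.trace(B @ C @ A)  # tr(BCA): a 3x3 product

assert np.allclose(t1, t2) and np.allclose(t2, t3)
```

Note the three products have different sizes, yet their traces agree, which is what makes the property useful in derivations.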
3.2.2 Chapter 3. Introduction to multiparameter models
Averaging over ‘nuisance parameters’
- suppose the unknown variable \(\theta\) is a vector of length two: \(\theta= (\theta_1, \theta_2)\)
- may only care about one of the variables, but the other is still required for a good model
- example model: \(y | \mu, \sigma^2 \sim N(\mu, \sigma^2)\)
- here, \(\theta\) would be the unknown values \(\mu (=\theta_1)\) and \(\sigma^2 (=\theta_2)\), but we really only care about \(\mu\)
- we want \(p(\theta_1|y)\)
- derive it from the joint posterior density: \(p(\theta_1, \theta_2 | y) \propto p(y|\theta_1, \theta_2) p(\theta_1, \theta_2)\)
- by averaging over \(\theta_2\): \(p(\theta_1|y) = \int p(\theta_1, \theta_2| y) d\theta_2\)
- “integrate over the uncertainty in \(\theta_2\)”
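In practice this integral is often done by simulation: draw \((\theta_1, \theta_2)\) jointly and keep only the \(\theta_1\) draws. A sketch for the example model, assuming the noninformative prior \(p(\mu, \sigma^2) \propto 1/\sigma^2\) (under which \(\sigma^2 | y\) is scaled inverse-\(\chi^2\) and \(\mu | \sigma^2, y\) is normal); the data here are made up:

```python
import numpy as np

# Monte Carlo marginalization over a nuisance parameter.
# Assumed model: y_i ~ N(mu, sigma^2) with prior p(mu, sigma^2) ∝ 1/sigma^2.
# Then sigma^2 | y ~ scaled-Inv-chi^2(n-1, s^2) and
# mu | sigma^2, y ~ N(ybar, sigma^2 / n).
rng = np.random.default_rng(1)
y = rng.normal(loc=5.0, scale=2.0, size=50)   # simulated data for illustration
n, ybar, s2 = len(y), y.mean(), y.var(ddof=1)

S = 10_000
# sigma^2 draws: scaled inverse-chi^2 via (n-1) * s^2 / chi2_{n-1}
sigma2 = (n - 1) * s2 / rng.chisquare(df=n - 1, size=S)
# mu draws, one per sigma^2 draw
mu = rng.normal(loc=ybar, scale=np.sqrt(sigma2 / n))

# mu now holds draws from the marginal p(mu | y); sigma^2 was "averaged over"
print(mu.mean(), np.quantile(mu, [0.025, 0.975]))
```

Keeping only `mu` is exactly the integral \(\int p(\mu, \sigma^2 | y)\, d\sigma^2\) done by sampling: each `mu` draw carries the uncertainty in `sigma2` because it was generated conditional on a different \(\sigma^2\) value.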
Summary of elementary modeling and computation
- the following is an outline of a simple Bayesian analysis
- it will change when we get to more complex models whose posteriors are estimated by more complex sampling processes
- write the likelihood: \(p(y|\theta)\)
- write the posterior density: \(p(\theta|y) \propto p(\theta) p(y|\theta)\)
- estimate the parameters \(\theta\) (e.g. using MLE)
- draw simulations \(\theta^1, \dots, \theta^S\) from the posterior distribution (using the estimates from the previous step as a starting point); use the samples to compute any other functions of \(\theta\) that are of interest
- if any predictive quantities \(\tilde{y}\) are of interest, simulate \(\tilde{y}^1, \dots, \tilde{y}^S\), drawing each \(\tilde{y}^s\) from \(p(\tilde{y} | \theta^s)\)
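The outline above can be sketched end to end for the same assumed normal model (likelihood \(N(\mu, \sigma^2)\), prior \(p(\mu, \sigma^2) \propto 1/\sigma^2\)); the data are again made up:

```python
import numpy as np

# End-to-end sketch of the outline: posterior draws of theta = (mu, sigma^2),
# then one predictive draw y_tilde^s from p(y_tilde | theta^s) per posterior draw.
rng = np.random.default_rng(2)
y = rng.normal(loc=0.0, scale=1.0, size=30)   # simulated data for illustration
n, ybar, s2 = len(y), y.mean(), y.var(ddof=1)

S = 5_000
sigma2 = (n - 1) * s2 / rng.chisquare(df=n - 1, size=S)   # sigma^2 | y
mu = rng.normal(ybar, np.sqrt(sigma2 / n))                # mu | sigma^2, y

# predictive simulation: y_tilde^s ~ N(mu^s, sigma^2_s)
y_tilde = rng.normal(mu, np.sqrt(sigma2))

print(y_tilde.mean(), y_tilde.std())
```

Because each \(\tilde{y}^s\) is drawn with a different \((\mu^s, \sigma^{2,s})\), the spread of `y_tilde` reflects both sampling variability and parameter uncertainty, so it is wider than a plug-in \(N(\bar{y}, s^2)\) would suggest.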
3.2.3 Lecture notes
(No extra notes were taken — some comments added directly to slides.)