17 Section 19. Notes on ‘Ch 21. Gaussian process models’
2022-01-13
::opts_chunk$set(echo = TRUE, dpi = 300, comment = "#>")
knitr
library(glue)
library(ggtext)
library(tidyverse)
theme_set(
theme_bw() +
theme(axis.ticks = element_blank(), strip.background = element_blank())
)
These are just notes on a single chapter of BDA3 that were not part of the course.
17.1 Chapter 21. Gaussian process models
- Gaussian process (GP): “flexible class of models for which any finite-dimensional marginal distribution is Gaussian” (pg. 501)
- “can be viewed as a potentially infinite-dimensional generalization of Gaussian distribution” (pg. 501)
21.1 Gaussian process regression
- realizations from a GP correspond to random functions
- good prior for an unknown regression function \(\mu(x)\)
- \(\mu \sim \text{GP}(m,k)\)
- \(m\): mean function
- \(k\): covariance function
- \(\mu\) is a random function (“stochastic process”) where the values at any \(n\) pooints \(x_1, \dots, x_n\) are drawn from the \(n-dimensional\) normal distribution
- with mean \(m\) and covariance \(K\):
\[ \mu(x_1), \dots, \mu(x_n) \sim \text{N}((m(x_1), \dots, m(x_n)), K(x_1, \dots, x_n)) \]
- the GP \(\mu \sim \text{GP}(m,k)\) is nonparametric with infinitely many parameters
- the mean function \(m\) represents an inital guess at the regression function
- the covariance function \(k\) represents the covariance between the process at any two points
- controls the smoothness of realizations from the GP and degree of shrinkage towards the mean
- below is an example of realizations from a GP with mean function 0 and the squared exponential (a.k.a. exponentiated quadratic, Gaussian) covariance function with different parameters
\[ k(x, x^\prime) = \tau^2 \exp(-\frac{|x-x^\prime|^2}{2l^2}) \]
<- function(x, tau, l) {
squared_exponential_cov <- length(x)
n <- matrix(0, nrow = n, ncol = n)
k <- 2 * (l^2)
denom for (i in 1:n) {
for (j in 1:n) {
<- x[i]
a <- x[j]
b <- tau^2 * exp(-(abs(a - b)^2) / (denom))
k[i, j]
}
}return(k)
}
<- function(x, tau, l, n = 3) {
my_gaussian_process <- rep(0, length(x))
m <- squared_exponential_cov(x = x, tau = tau, l = l)
k <- mvtnorm::rmvnorm(n = n, mean = m, sigma = k)
gp_samples return(gp_samples)
}
<- function(x, tau, l, n = 3) {
tidy_gp my_gaussian_process(x = x, tau = tau, l = l, n = n) %>%
as.data.frame() %>%
as_tibble() %>%
set_names(x) %>%
mutate(sample_idx = as.character(1:n())) %>%
pivot_longer(-sample_idx, names_to = "x", values_to = "y") %>%
mutate(x = as.numeric(x))
}
set.seed(0)
<- seq(0, 10, by = 0.1)
x <- tibble(tau = c(0.25, 0.5, 0.25, 0.5), l = c(0.5, 0.5, 2, 2)) %>%
gp_samples mutate(samples = purrr::map2(tau, l, ~ tidy_gp(x = x, tau = .x, l = .y, n = 3))) %>%
unnest(samples)
%>%
gp_samples mutate(grp = glue("\u03C4 = {tau}, \u2113 = {l}")) %>%
ggplot(aes(x = x, y = y)) +
facet_wrap(vars(grp), nrow = 2) +
geom_line(aes(color = sample_idx)) +
scale_x_continuous(expand = expansion(c(0, 0))) +
scale_y_continuous(expand = expansion(c(0.02, 0.02))) +
scale_color_brewer(type = "qual", palette = "Set1") +
theme(legend.position = "none", axis.text.y = element_markdown()) +
labs(x = "x", y = "\u03BC(x)")
Covariance functions
- “Different covariance functions can be used to add structural prior assumptions like smoothness, nonstationarity, periodicity, and multi-scale or hierarchical structures.” (pg. 502)
- sums and products of GPs are also GPs so can combine them in the same model
- can also use “anisotropic” GPs covariance functions for multiple predictors
Inference
- computing the mean and covariance in the \(n\)-variate normal conditional posterior for \(\tilde{\mu}\) involves a matrix inversion that requires \(O(n^3)\) computation
- this needs to be repeated for each MCMC step
- limits the size of the data set and number of covariates in a model
Covariance function approximations
- there are approximations to the GP that can speed up computation
- generally work by reducing the matrix inversion burden
21.3 Latent Gaussian process models
- with non-Gaussian likelihoods, the GP prior becomes a latent function \(f\) which determines the likelihood \(p(y|f,\phi)\) through a link function
21.4 Functional data analysis
- functional data analysis: considers responses and predictors not a scalar/vector-valued random variables but as random functions with infinitely-many points
- GPs fit this need well with little modification
21.5 Density estimation and regression
- can get more flexibility by modeling the conditional observation model as a nonparametric GP
- so far have used a GP as a prior for a function controlling location or shape of a parametric observation model
- one solution is the logistic Gaussian process (LGP) or a Dirichlet process (covered in a later chapter)
Density estimation
- LGP generates a random surface from a GP and then transforms the surface to the space of probability densities
- with 1D, the surface is just a curve
- use the continuous logistic transformation to constrain to non-negative and integrate to 1
- there is illustrative example in the book on page 513
Density regression
- generalize the LPG to density regression by putting a prior on the collection of conditional densities
Latent-variable regression
- an alternative to LPG
sessionInfo()
#> R version 4.1.2 (2021-11-01)
#> Platform: x86_64-apple-darwin17.0 (64-bit)
#> Running under: macOS Big Sur 10.16
#>
#> Matrix products: default
#> BLAS: /Library/Frameworks/R.framework/Versions/4.1/Resources/lib/libRblas.0.dylib
#> LAPACK: /Library/Frameworks/R.framework/Versions/4.1/Resources/lib/libRlapack.dylib
#>
#> locale:
#> [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
#>
#> attached base packages:
#> [1] stats graphics grDevices datasets utils methods base
#>
#> other attached packages:
#> [1] forcats_0.5.1 stringr_1.4.0 dplyr_1.0.7 purrr_0.3.4
#> [5] readr_2.0.1 tidyr_1.1.3 tibble_3.1.3 ggplot2_3.3.5
#> [9] tidyverse_1.3.1 ggtext_0.1.1 glue_1.4.2
#>
#> loaded via a namespace (and not attached):
#> [1] Rcpp_1.0.7 lubridate_1.7.10 mvtnorm_1.1-2 clisymbols_1.2.0
#> [5] assertthat_0.2.1 digest_0.6.27 utf8_1.2.2 R6_2.5.0
#> [9] cellranger_1.1.0 backports_1.2.1 reprex_2.0.1 evaluate_0.14
#> [13] highr_0.9 httr_1.4.2 pillar_1.6.2 rlang_0.4.11
#> [17] readxl_1.3.1 rstudioapi_0.13 jquerylib_0.1.4 rmarkdown_2.10
#> [21] labeling_0.4.2 munsell_0.5.0 gridtext_0.1.4 broom_0.7.9
#> [25] compiler_4.1.2 modelr_0.1.8 xfun_0.25 pkgconfig_2.0.3
#> [29] htmltools_0.5.1.1 tidyselect_1.1.1 bookdown_0.24 fansi_0.5.0
#> [33] crayon_1.4.1 tzdb_0.1.2 dbplyr_2.1.1 withr_2.4.2
#> [37] grid_4.1.2 jsonlite_1.7.2 gtable_0.3.0 lifecycle_1.0.0
#> [41] DBI_1.1.1 magrittr_2.0.1 scales_1.1.1 cli_3.0.1
#> [45] stringi_1.7.3 farver_2.1.0 renv_0.14.0 fs_1.5.0
#> [49] xml2_1.3.2 bslib_0.2.5.1 ellipsis_0.3.2 generics_0.1.0
#> [53] vctrs_0.3.8 RColorBrewer_1.1-2 tools_4.1.2 markdown_1.1
#> [57] hms_1.1.0 yaml_2.2.1 colorspace_2.0-2 rvest_1.0.1
#> [61] knitr_1.33 haven_2.4.3 sass_0.4.0