rapid_models.gp_diagnostics.cv
Module Contents
Functions

multifold(K, Y_train, folds[, noise_variance, check_args]) – Compute multifold CV residuals for GP regression with noiseless (noise_variance = 0) or fixed variance iid Gaussian noise.

multifold_cholesky(L, Y_train, folds[, check_args]) – Compute multifold CV residuals from the Cholesky factor L of the observation covariance matrix and the training data Y_train.

loo(K, Y_train[, noise_variance, check_args]) – Compute Leave-One-Out (LOO) residuals for GP regression with noiseless (noise_variance = 0) or fixed variance iid Gaussian noise.

loo_cholesky(L, Y_train[, check_args]) – Compute Leave-One-Out (LOO) residuals from the Cholesky factor L of the observation covariance matrix and the training data Y_train.

check_folds_indices(folds, n_max) – Check that the list of index subsets (list of lists) is valid.

check_lower_triangular(arr[, argname]) – Check that the argument is a 2d numpy array which is lower triangular.

check_numeric_array(arr, dim[, argname]) – Check that the argument is a numpy array of correct dimension.

_multifold_inv(K, Y_train, folds) – Compute multifold CV residuals using matrix inverse (for testing).
 rapid_models.gp_diagnostics.cv.multifold(K, Y_train, folds, noise_variance=0, check_args=True)
Compute multifold CV residuals for GP regression with noiseless (noise_variance = 0) or fixed variance iid Gaussian noise. (residual = observed - predicted)
 Parameters
K (2d array) – GP prior covariance matrix
Y_train (array) – training observations
folds (list of lists) – The index subsets
noise_variance – variance of the observational noise. Set noise_variance = 0 for noiseless observations
check_args (bool) – Check (assert) that arguments are well-specified before computation
 Returns
mean – Mean of CV residuals
cov – Covariance of CV residuals
residuals_transformed – The residuals transformed to the standard normal space
This function simply calls multifold_cholesky() with the Cholesky factor of the observation covariance matrix (K + noise_variance * I). It is based on the formulation derived in:
[D. Ginsbourger and C. Schaerer (2021). Fast calculation of Gaussian Process multiple-fold cross-validation residuals and their covariances. arXiv:2101.03108]
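To illustrate what the multifold residuals represent, the following minimal sketch computes them by brute force with NumPy: each fold is held out, the GP is conditioned on the remaining observations, and the held-out observations are compared with their predictive mean. It assumes a zero-mean GP prior, which is not stated above. The function `brute_force_multifold` and the toy squared-exponential kernel are hypothetical and are not part of `rapid_models`.

```python
import numpy as np


def brute_force_multifold(K, Y_train, folds, noise_variance=0.0):
    """Reference multifold CV residuals, one GP conditioning per fold.

    Assumes a zero-mean GP prior (an assumption of this sketch).
    Returns a list with one (residual, covariance) pair per fold.
    """
    n = len(Y_train)
    C = K + noise_variance * np.eye(n)  # covariance of the observations
    results = []
    for fold in folds:
        test = np.asarray(fold)
        train = np.setdiff1d(np.arange(n), test)
        # Prediction weights C[test, train] @ inv(C[train, train])
        W = np.linalg.solve(C[np.ix_(train, train)], C[np.ix_(train, test)]).T
        predicted = W @ Y_train[train]
        residual = Y_train[test] - predicted  # residual = observed - predicted
        cov = C[np.ix_(test, test)] - W @ C[np.ix_(train, test)]
        results.append((residual, cov))
    return results


# Toy data: squared-exponential kernel on six 1-d inputs
x = np.linspace(0.0, 1.0, 6)
K = np.exp(-0.5 * (x[:, None] - x[None, :]) ** 2 / 0.3**2)
Y_train = np.sin(2.0 * np.pi * x)
folds = [[0, 1], [2, 3], [4, 5]]
results = brute_force_multifold(K, Y_train, folds, noise_variance=1e-2)
```

This approach solves one linear system per fold. The formulation cited above obtains the same quantities from a single factorization of the full covariance matrix.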
 rapid_models.gp_diagnostics.cv.multifold_cholesky(L, Y_train, folds, check_args=True)
Compute multifold CV residuals from the Cholesky factor L of the observation covariance matrix and the training data Y_train (residual = observed - predicted)
 Parameters
L (2d array) – lower triangular Cholesky factor of covariance matrix (L L.T = covariance matrix)
Y_train (array) – training observations
folds (list of lists) – The index subsets
check_args (bool) – Check (assert) that arguments are well-specified before computation
 Returns
mean – Mean of CV residuals
cov – Covariance of CV residuals
residuals_transformed – The residuals transformed to the standard normal space
Note:
* The matrix L L.T is the covariance matrix of the observations Y_train.
* For noiseless observations, L L.T = K. For observations including Gaussian noise with fixed variance v, L L.T = K + v*I. Here K[i, j] is the prior covariance of the latent GP between the i-th and j-th training locations.
This implementation uses the Cholesky factor instead of the inverse covariance matrix, but is otherwise equivalent to the formulas derived in
[D. Ginsbourger and C. Schaerer (2021). Fast calculation of Gaussian Process multiple-fold cross-validation residuals and their covariances. arXiv:2101.03108]
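As a rough illustration of the closed-form approach, the sketch below computes all fold residuals and their joint covariance from a Cholesky factor. It assumes a zero-mean GP prior and folds that partition all indices. For readability it forms the inverse covariance matrix Q explicitly, which the library implementation may well avoid. The function `multifold_from_cholesky` is a hypothetical name, not the library's code.

```python
import numpy as np


def multifold_from_cholesky(L, Y_train, folds):
    """Sketch: fold residuals from the Cholesky factor L (L @ L.T = covariance).

    Assumes a zero-mean GP prior and that `folds` partitions all indices.
    Returns the stacked residuals (ordered as the concatenated folds) and
    their joint covariance matrix.
    """
    n = L.shape[0]
    L_inv = np.linalg.solve(L, np.eye(n))  # generic solver, kept for brevity
    Q = L_inv.T @ L_inv  # inverse of L @ L.T
    alpha = Q @ Y_train
    order = [i for fold in folds for i in fold]
    # Block-diagonal matrix whose blocks are inv(Q[fold, fold])
    D = np.zeros((n, n))
    start = 0
    for fold in folds:
        stop = start + len(fold)
        D[start:stop, start:stop] = np.linalg.inv(Q[np.ix_(fold, fold)])
        start = stop
    mean = D @ alpha[order]  # residual of fold I: inv(Q_II) @ (Q y)_I
    cov = D @ Q[np.ix_(order, order)] @ D
    return mean, cov


# Toy data with noise variance v, so that L @ L.T = K + v*I
x = np.linspace(0.0, 1.0, 6)
K = np.exp(-0.5 * (x[:, None] - x[None, :]) ** 2 / 0.3**2)
v = 1e-2
L = np.linalg.cholesky(K + v * np.eye(6))
Y_train = np.sin(2.0 * np.pi * x)
folds = [[0, 1], [2, 3], [4, 5]]
mean, cov = multifold_from_cholesky(L, Y_train, folds)
```

A production implementation would typically use triangular solves rather than the generic `np.linalg.solve` and explicit inverses used here.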
 rapid_models.gp_diagnostics.cv.loo(K, Y_train, noise_variance=0, check_args=True)
Compute Leave-One-Out (LOO) residuals for GP regression with noiseless (noise_variance = 0) or fixed variance iid Gaussian noise. (residual = observed - predicted) This function simply calls loo_cholesky() with the Cholesky factor of the observation covariance matrix (K + noise_variance * I).
 Parameters
K (2d array) – GP prior covariance matrix
Y_train (array) – training observations
noise_variance – variance of the observational noise. Set noise_variance = 0 for noiseless observations
check_args (bool) – Check (assert) that arguments are well-specified before computation
 Returns
mean – Mean of LOO residuals
cov – Covariance of LOO residuals
residuals_transformed – The residuals transformed to the standard normal space
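For the leave-one-out case, the residuals have a well-known closed form in terms of the inverse covariance matrix Q: the i-th residual is (Q y)_i / Q_ii and its variance is 1 / Q_ii. The sketch below shows this with NumPy, assuming a zero-mean GP prior. The function `loo_residuals` is a hypothetical name and is not the library's implementation.

```python
import numpy as np


def loo_residuals(K, Y_train, noise_variance=0.0):
    """Sketch of closed-form LOO residuals for a zero-mean GP (assumption)."""
    n = len(Y_train)
    Q = np.linalg.inv(K + noise_variance * np.eye(n))
    d = np.diag(Q)
    mean = (Q @ Y_train) / d  # e_i = (Q y)_i / Q_ii
    cov = Q / np.outer(d, d)  # Cov(e_i, e_j) = Q_ij / (Q_ii * Q_jj)
    return mean, cov


# Toy data: squared-exponential kernel on six 1-d inputs
x = np.linspace(0.0, 1.0, 6)
K = np.exp(-0.5 * (x[:, None] - x[None, :]) ** 2 / 0.3**2)
Y_train = np.sin(2.0 * np.pi * x)
mean, cov = loo_residuals(K, Y_train, noise_variance=1e-2)
```

The diagonal of `cov` holds the LOO predictive variances, and the off-diagonal entries describe how strongly the residuals are correlated with each other.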
 rapid_models.gp_diagnostics.cv.loo_cholesky(L, Y_train, check_args=True)
Compute Leave-One-Out (LOO) residuals from the Cholesky factor L of the observation covariance matrix and the training data Y_train (residual = observed - predicted)
 Parameters
L (2d array) – lower triangular Cholesky factor of covariance matrix (L L.T = covariance matrix)
Y_train (array) – training observations
check_args (bool) – Check (assert) that arguments are well-specified before computation
 Returns
mean – Mean of LOO residuals
cov – Covariance of LOO residuals
residuals_transformed – The residuals transformed to the standard normal space
Note:
* The matrix L L.T is the covariance matrix of the observations Y_train.
* For noiseless observations, L L.T = K. For observations including Gaussian noise with fixed variance v, L L.T = K + v*I. Here K[i, j] is the prior covariance of the latent GP between the i-th and j-th training locations.
This implementation uses the Cholesky factor instead of the inverse covariance matrix, but is otherwise equivalent to the formulas derived in
[O. Dubrule. Cross validation of kriging in a unique neighborhood. Journal of the International Association for Mathematical Geology, 15(6):687-699, 1983.]
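The returned residuals_transformed are described as the residuals mapped to the standard normal space. One common way to do this, shown here as an assumption rather than as the library's confirmed method, is to whiten the residuals with the Cholesky factor of their covariance. The helper `whiten` and the numbers below are purely illustrative.

```python
import numpy as np


def whiten(mean, cov):
    """Decorrelate residuals by solving L_c z = mean, with L_c @ L_c.T = cov.

    If the GP model is correct, the entries of z behave like independent
    standard normal variables.
    """
    L_c = np.linalg.cholesky(cov)
    return np.linalg.solve(L_c, mean)


# Illustrative residuals and covariance (made-up numbers)
cov = np.array([[2.0, 0.6], [0.6, 1.0]])
mean = np.array([1.0, -0.5])
z = whiten(mean, cov)
```

The transformed values can then be compared against a standard normal distribution, for example in a Q-Q plot, to judge whether the GP's uncertainty estimates are plausible.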
 rapid_models.gp_diagnostics.cv.check_folds_indices(folds, n_max)
Check that the list of index subsets (list of lists) is valid
 Parameters
folds (list of lists) – The index subsets.
n_max (int) – Total number of indices.
 Raises
AssertionError – if ‘folds’ does not represent the range [0:n_max-1] of n_max indices split into non-overlapping subsets
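The validity condition can be sketched in a few lines: the folds, taken together, must contain every index from 0 to n_max - 1 exactly once. The helper `folds_are_valid` below is hypothetical and returns a boolean, whereas the documented function raises an AssertionError.

```python
def folds_are_valid(folds, n_max):
    """Return True if `folds` splits 0, ..., n_max - 1 into disjoint subsets."""
    flat = [i for fold in folds for i in fold]
    # Sorting exposes both duplicates (overlap) and missing indices
    return sorted(flat) == list(range(n_max))


ok = folds_are_valid([[0, 2], [1, 3]], 4)       # a proper partition
overlap = folds_are_valid([[0, 1], [1, 2]], 3)  # index 1 appears twice
missing = folds_are_valid([[0, 1]], 3)          # index 2 is missing
```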
 rapid_models.gp_diagnostics.cv.check_lower_triangular(arr, argname='arr')
Check that the argument is a 2d numpy array which is lower triangular
 Parameters
arr – the object to check
 Raises
AssertionError – if ‘arr’ does not represent a lower triangular matrix
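A check of this kind might look like the following sketch, which compares the array with its own lower triangle. The helper `is_lower_triangular` is hypothetical and returns a boolean instead of raising an AssertionError.

```python
import numpy as np


def is_lower_triangular(arr):
    """Return True if `arr` is a 2d numpy array with zeros above the diagonal."""
    return (
        isinstance(arr, np.ndarray)
        and arr.ndim == 2
        and np.allclose(arr, np.tril(arr))
    )


lower = np.array([[1.0, 0.0], [0.5, 2.0]])
full = np.array([[1.0, 0.3], [0.5, 2.0]])
```

Here `is_lower_triangular(lower)` holds, while `is_lower_triangular(full)` does not because of the non-zero entry above the diagonal.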
 rapid_models.gp_diagnostics.cv.check_numeric_array(arr, dim, argname='arr')
Check that the argument is a numpy array of correct dimension
 Parameters
arr – the object to check
 Raises
AssertionError – if ‘arr’ does not represent a ‘dim’-dimensional numpy array
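A minimal sketch of such a check is shown below. It tests the type, the number of dimensions, and whether the dtype is numeric. The exact conditions used by the library are not documented here, so treat the helper `is_numeric_array` as a hypothetical illustration.

```python
import numpy as np


def is_numeric_array(arr, dim):
    """Return True if `arr` is a numeric numpy array with `dim` dimensions."""
    return (
        isinstance(arr, np.ndarray)
        and arr.ndim == dim
        and np.issubdtype(arr.dtype, np.number)
    )


matrix = np.zeros((2, 3))
vector = np.zeros(3)
labels = np.array(["a", "b"])
```

With these inputs, `is_numeric_array(matrix, 2)` and `is_numeric_array(vector, 1)` hold, while `is_numeric_array(vector, 2)` and `is_numeric_array(labels, 1)` do not.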
 rapid_models.gp_diagnostics.cv._multifold_inv(K, Y_train, folds)
Compute multifold CV residuals using matrix inverse (for testing) (residual = observed - predicted)
 Parameters
K (2d array) – covariance matrix
Y_train (array) – training observations
folds (list of lists) – The index subsets.
 Returns
mean – Mean of CV residuals
cov – Covariance of CV residuals
residuals_transformed – The residuals transformed to the standard normal space
[D. Ginsbourger and C. Schaerer (2021). Fast calculation of Gaussian Process multiple-fold cross-validation residuals and their covariances. arXiv:2101.03108]