GPy.models package¶

Introduction¶

This package principally contains classes that ultimately inherit from GPy.core.gp.GP and are intended as models for end-user consumption; much of GPy.core.gp.GP itself is not intended to be called directly. The general form of a “model” is a function that takes some data, a kernel (see GPy.kern) and other parameters, and returns an object representation.

Several models directly inherit GPy.core.gp.GP:

Inheritance diagram of GPy.models.gp_classification, GPy.models.gp_coregionalized_regression, GPy.models.gp_heteroscedastic_regression, GPy.models.gp_offset_regression, GPy.models.gp_regression, GPy.models.gp_var_gauss, GPy.models.gplvm, GPy.models.input_warped_gp, GPy.models.multioutput_gp

Some models fall into conceptually related groups (e.g. those built on GPy.core.sparse_gp or GPy.core.sparse_gp_mpi):

Inheritance diagram of GPy.models.bayesian_gplvm, GPy.models.bayesian_gplvm_minibatch, GPy.models.gp_multiout_regression, GPy.models.gp_multiout_regression_md, GPy.models.ibp_lfm.IBPLFM, GPy.models.sparse_gp_coregionalized_regression, GPy.models.sparse_gp_minibatch, GPy.models.sparse_gp_regression, GPy.models.sparse_gp_regression_md, GPy.models.sparse_gplvm

In some cases one end-user model inherits from another, e.g.:

Inheritance diagram of GPy.models.bayesian_gplvm_minibatch

Submodules¶

GPy.models.bayesian_gplvm module¶

class BayesianGPLVM(Y, input_dim, X=None, X_variance=None, init='PCA', num_inducing=10, Z=None, kernel=None, inference_method=None, likelihood=None, name='bayesian gplvm', mpi_comm=None, normalizer=None, missing_data=False, stochastic=False, batchsize=1, Y_metadata=None)[source]¶

Bases: GPy.core.sparse_gp_mpi.SparseGP_MPI

Bayesian Gaussian Process Latent Variable Model

Parameters:

Y (np.ndarray | GPy.likelihood instance) – observed data (np.ndarray) or GPy.likelihood
input_dim (int) – latent dimensionality
init ('PCA' | 'random') – initialisation method for the latent space

get_X_gradients(X)[source]¶: Get the gradients of the posterior distribution of X in its specific form.

parameters_changed()[source]¶: Method that is called upon any changes to Param variables within the model. In particular in the GP class this method re-performs inference, recalculating the posterior and log marginal likelihood and gradients of the model

Warning

This method is not designed to be called manually. The framework calls it automatically whenever parameters change; if you call it yourself, there may be unexpected consequences.

plot_inducing(which_indices=None, legend=False, plot_limits=None, marker=None, projection='2d', **kwargs)¶

Plot a scatter plot of the inducing inputs.

Parameters:

which_indices ([int]) – which input dimensions to plot against each other
legend (bool) – whether to plot the legend on the figure
plot_limits ((xmin, xmax, ymin, ymax) or ((xmin, xmax), (ymin, ymax))) – the plot limits for the plot
marker (str) – marker to use [default is custom arrow like]
kwargs – the kwargs for the scatter plots
projection (str) – for now 2d or 3d projection (other projections can be implemented, see developer documentation)

plot_latent(labels=None, which_indices=None, resolution=60, legend=True, plot_limits=None, updates=False, kern=None, marker='<>^vsd', num_samples=1000, projection='2d', scatter_kwargs=None, **imshow_kwargs)¶

Plot the latent space of the GP on the inputs. This is the density of the GP posterior as a grey scale and the scatter plot of the input dimensions selected by which_indices.

Parameters:

labels (array-like) – a label for each data point (row) of the inputs
which_indices ((int, int)) – which input dimensions to plot against each other
resolution (int) – the resolution at which we predict the magnification factor
legend (bool) – whether to plot the legend on the figure
plot_limits ((xmin, xmax, ymin, ymax) or ((xmin, xmax), (ymin, ymax))) – the plot limits for the plot
updates (bool) – if possible, make interactive updates using the specific library you are using
kern (Kern) – the kernel to use for prediction
marker (str) – markers to use; cycles if more labels than markers are given
num_samples (int) – the maximum number of samples to plot. If the number of samples (in X) is higher than num_samples, we take a stratified subsample based on the labels.
imshow_kwargs – the kwargs for the imshow (magnification factor)
scatter_kwargs – the kwargs for the scatter plots

plot_scatter(labels=None, which_indices=None, legend=True, plot_limits=None, marker='<>^vsd', num_samples=1000, projection='2d', **kwargs)¶

Plot a scatter plot of the latent space.

Parameters:

labels (array-like) – a label for each data point (row) of the inputs
which_indices ((int, int)) – which input dimensions to plot against each other
legend (bool) – whether to plot the legend on the figure
plot_limits ((xmin, xmax, ymin, ymax) or ((xmin, xmax), (ymin, ymax))) – the plot limits for the plot
marker (str) – markers to use; cycles if more labels than markers are given
kwargs – the kwargs for the scatter plots

plot_steepest_gradient_map(output_labels=None, data_labels=None, which_indices=None, resolution=15, legend=True, plot_limits=None, updates=False, kern=None, marker='<>^vsd', num_samples=1000, annotation_kwargs=None, scatter_kwargs=None, **imshow_kwargs)¶

Plot the latent space of the GP on the inputs. This is the density of the GP posterior as a grey scale and the scatter plot of the input dimensions selected by which_indices.

Parameters:

labels (array-like) – a label for each data point (row) of the inputs
which_indices ((int, int)) – which input dimensions to plot against each other
resolution (int) – the resolution at which we predict the magnification factor
legend (bool) – whether to plot the legend on the figure; if an int, the number of legend columns
plot_limits ((xmin, xmax, ymin, ymax) or ((xmin, xmax), (ymin, ymax))) – the plot limits for the plot
updates (bool) – if possible, make interactive updates using the specific library you are using
kern (Kern) – the kernel to use for prediction
marker (str) – markers to use; cycles if more labels than markers are given
num_samples (int) – the maximum number of samples to plot. If the number of samples (in X) is higher than num_samples, we take a stratified subsample based on the labels.
imshow_kwargs – the kwargs for the imshow (magnification factor)
annotation_kwargs – the kwargs for the annotation plot
scatter_kwargs – the kwargs for the scatter plots

set_X_gradients(X, X_grad)[source]¶: Set the gradients of the posterior distribution of X in its specific form.

GPy.models.bayesian_gplvm_minibatch module¶

class BayesianGPLVMMiniBatch(Y, input_dim, X=None, X_variance=None, init='PCA', num_inducing=10, Z=None, kernel=None, inference_method=None, likelihood=None, name='bayesian gplvm', normalizer=None, missing_data=False, stochastic=False, batchsize=1)[source]¶

Bases: GPy.models.sparse_gp_minibatch.SparseGPMiniBatch

Bayesian Gaussian Process Latent Variable Model

Parameters:

Y (np.ndarray | GPy.likelihood instance) – observed data (np.ndarray) or GPy.likelihood
input_dim (int) – latent dimensionality
init ('PCA' | 'random') – initialisation method for the latent space

parameters_changed()[source]¶: Method that is called upon any changes to Param variables within the model. In particular in the GP class this method re-performs inference, recalculating the posterior and log marginal likelihood and gradients of the model

Warning

This method is not designed to be called manually. The framework calls it automatically whenever parameters change; if you call it yourself, there may be unexpected consequences.

plot_inducing(which_indices=None, legend=False, plot_limits=None, marker=None, projection='2d', **kwargs)¶

Plot a scatter plot of the inducing inputs.

Parameters:

which_indices ([int]) – which input dimensions to plot against each other
legend (bool) – whether to plot the legend on the figure
plot_limits ((xmin, xmax, ymin, ymax) or ((xmin, xmax), (ymin, ymax))) – the plot limits for the plot
marker (str) – marker to use [default is custom arrow like]
kwargs – the kwargs for the scatter plots
projection (str) – for now 2d or 3d projection (other projections can be implemented, see developer documentation)

plot_latent(labels=None, which_indices=None, resolution=60, legend=True, plot_limits=None, updates=False, kern=None, marker='<>^vsd', num_samples=1000, projection='2d', scatter_kwargs=None, **imshow_kwargs)¶

Plot the latent space of the GP on the inputs. This is the density of the GP posterior as a grey scale and the scatter plot of the input dimensions selected by which_indices.

Parameters:

labels (array-like) – a label for each data point (row) of the inputs
which_indices ((int, int)) – which input dimensions to plot against each other
resolution (int) – the resolution at which we predict the magnification factor
legend (bool) – whether to plot the legend on the figure
plot_limits ((xmin, xmax, ymin, ymax) or ((xmin, xmax), (ymin, ymax))) – the plot limits for the plot
updates (bool) – if possible, make interactive updates using the specific library you are using
kern (Kern) – the kernel to use for prediction
marker (str) – markers to use; cycles if more labels than markers are given
num_samples (int) – the maximum number of samples to plot. If the number of samples (in X) is higher than num_samples, we take a stratified subsample based on the labels.
imshow_kwargs – the kwargs for the imshow (magnification factor)
scatter_kwargs – the kwargs for the scatter plots

plot_scatter(labels=None, which_indices=None, legend=True, plot_limits=None, marker='<>^vsd', num_samples=1000, projection='2d', **kwargs)¶

Plot a scatter plot of the latent space.

Parameters:

labels (array-like) – a label for each data point (row) of the inputs
which_indices ((int, int)) – which input dimensions to plot against each other
legend (bool) – whether to plot the legend on the figure
plot_limits ((xmin, xmax, ymin, ymax) or ((xmin, xmax), (ymin, ymax))) – the plot limits for the plot
marker (str) – markers to use; cycles if more labels than markers are given
kwargs – the kwargs for the scatter plots

plot_steepest_gradient_map(output_labels=None, data_labels=None, which_indices=None, resolution=15, legend=True, plot_limits=None, updates=False, kern=None, marker='<>^vsd', num_samples=1000, annotation_kwargs=None, scatter_kwargs=None, **imshow_kwargs)¶

Plot the latent space of the GP on the inputs. This is the density of the GP posterior as a grey scale and the scatter plot of the input dimensions selected by which_indices.

Parameters:

labels (array-like) – a label for each data point (row) of the inputs
which_indices ((int, int)) – which input dimensions to plot against each other
resolution (int) – the resolution at which we predict the magnification factor
legend (bool) – whether to plot the legend on the figure; if an int, the number of legend columns
plot_limits ((xmin, xmax, ymin, ymax) or ((xmin, xmax), (ymin, ymax))) – the plot limits for the plot
updates (bool) – if possible, make interactive updates using the specific library you are using
kern (Kern) – the kernel to use for prediction
marker (str) – markers to use; cycles if more labels than markers are given
num_samples (int) – the maximum number of samples to plot. If the number of samples (in X) is higher than num_samples, we take a stratified subsample based on the labels.
imshow_kwargs – the kwargs for the imshow (magnification factor)
annotation_kwargs – the kwargs for the annotation plot
scatter_kwargs – the kwargs for the scatter plots

GPy.models.bcgplvm module¶

class BCGPLVM(Y, input_dim, kernel=None, mapping=None)[source]¶

Bases: GPy.models.gplvm.GPLVM

Back constrained Gaussian Process Latent Variable Model

Parameters:

Y (np.ndarray) – observed data
input_dim (int) – latent dimensionality
mapping (GPy.core.Mapping object) – mapping for back constraint

parameters_changed()[source]¶: Method that is called upon any changes to Param variables within the model. In particular in the GP class this method re-performs inference, recalculating the posterior and log marginal likelihood and gradients of the model

Warning

This method is not designed to be called manually. The framework calls it automatically whenever parameters change; if you call it yourself, there may be unexpected consequences.

GPy.models.dpgplvm module¶

class DPBayesianGPLVM(Y, input_dim, X_prior, X=None, X_variance=None, init='PCA', num_inducing=10, Z=None, kernel=None, inference_method=None, likelihood=None, name='bayesian gplvm', mpi_comm=None, normalizer=None, missing_data=False, stochastic=False, batchsize=1)[source]¶

Bases: GPy.models.bayesian_gplvm.BayesianGPLVM

Bayesian Gaussian Process Latent Variable Model with Discriminative prior

GPy.models.gp_classification module¶

class GPClassification(X, Y, kernel=None, Y_metadata=None, mean_function=None, inference_method=None, likelihood=None, normalizer=False)[source]¶

Bases: GPy.core.gp.GP

Gaussian Process classification

This is a thin wrapper around the models.GP class, with a set of sensible defaults

Parameters:

X – input observations
Y – observed values, can be None if likelihood is not None
kernel – a GPy kernel, defaults to rbf
likelihood – a GPy likelihood, defaults to Bernoulli
inference_method (GPy.inference.latent_function_inference.LatentFunctionInference) – Latent function inference to use, defaults to EP

Note

Multiple independent outputs are allowed using columns of Y

static from_dict(input_dict, data=None)[source]¶

Instantiate an object of a derived class using the information in input_dict (built by the to_dict method of the derived class). More specifically, after reading the derived class from input_dict, it calls the method _build_from_input_dict of the derived class. Note: This method should not be overridden in the derived class. If needed, override _build_from_input_dict instead.

Parameters:	input_dict (dict) – Dictionary with all the information needed to instantiate the object.

static from_gp(gp)[source]¶

save_model(output_filename, compress=True, save_data=True)[source]¶

to_dict(save_data=True)[source]¶

Convert the object into a json serializable dictionary. Note: It uses the private method _save_to_input_dict of the parent.

Parameters:	save_data (boolean) – if true, it adds the training data self.X and self.Y to the dictionary
Return dict:	json serializable dictionary containing the needed information to instantiate the object

GPy.models.gp_coregionalized_regression module¶

class GPCoregionalizedRegression(X_list, Y_list, kernel=None, likelihoods_list=None, name='GPCR', W_rank=1, kernel_name='coreg')[source]¶

Bases: GPy.core.gp.GP

Gaussian Process model for heteroscedastic multioutput regression

This is a thin wrapper around the models.GP class, with a set of sensible defaults

Parameters:

X_list (list of numpy arrays) – list of input observations corresponding to each output
Y_list (list of numpy arrays) – list of observed values related to the different noise models
kernel (None | GPy.kernel defaults) – a GPy kernel ** Coregionalized, defaults to RBF ** Coregionalized
name (string) – model name
W_rank (integer) – number of tuples of the coregionalization parameters ‘W’ (see coregionalize kernel documentation)
kernel_name (string) – name of the kernel
likelihoods_list – a list of likelihoods, defaults to a list of Gaussian likelihoods

GPy.models.gp_grid_regression module¶

class GPRegressionGrid(X, Y, kernel=None, Y_metadata=None, normalizer=None)[source]¶

Bases: GPy.core.gp_grid.GpGrid

Gaussian Process model for grid inputs using Kronecker products

This is a thin wrapper around the models.GpGrid class, with a set of sensible defaults

Parameters:

X – input observations
Y – observed values
kernel – a GPy kernel, defaults to the kron variation of SqExp
normalizer (Norm) – [False] Normalize Y with the norm given. If normalizer is False, no normalization will be done. If it is None, we use GaussianNorm(alization).

Note

Multiple independent outputs are allowed using columns of Y

GPy.models.gp_heteroscedastic_regression module¶

class GPHeteroscedasticRegression(X, Y, kernel=None, Y_metadata=None)[source]¶

Bases: GPy.core.gp.GP

Gaussian Process model for heteroscedastic regression

This is a thin wrapper around the models.GP class, with a set of sensible defaults

Parameters:

X – input observations
Y – observed values
kernel – a GPy kernel, defaults to rbf

NB: This model does not make inference on the noise outside the training set

GPy.models.gp_kronecker_gaussian_regression module¶

class GPKroneckerGaussianRegression(X1, X2, Y, kern1, kern2, noise_var=1.0, name='KGPR')[source]¶

Bases: GPy.core.model.Model

Kronecker GP regression

Take two kernels computed on separate spaces K1(X1), K2(X2), and a data matrix Y which is of size (N1, N2).

The effective covariance is np.kron(K2, K1). The effective data is vec(Y) = Y.flatten(order='F').

The noise must be iid Gaussian.
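The relationship between the two kernels and the effective covariance can be checked numerically. The following is a minimal NumPy sketch, not GPy's implementation: the `rbf` helper, the array sizes and the random data are assumptions made for illustration. It builds np.kron(K2, K1) and vec(Y) as described above and confirms the standard identity (K2 ⊗ K1) vec(Y) = vec(K1 Y K2ᵀ), which is what makes Kronecker-structured models cheap to work with.

```python
import numpy as np

def rbf(X, lengthscale=1.0):
    """Squared-exponential kernel matrix; a stand-in for K1(X1) and K2(X2)."""
    sq_dist = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
    return np.exp(-0.5 * sq_dist / lengthscale ** 2)

rng = np.random.default_rng(0)
N1, N2 = 4, 3
X1 = rng.standard_normal((N1, 2))   # inputs of the first space
X2 = rng.standard_normal((N2, 1))   # inputs of the second space
Y = rng.standard_normal((N1, N2))   # data matrix of size (N1, N2)

K1, K2 = rbf(X1), rbf(X2)

# Effective covariance and effective data, as described in the text.
K_full = np.kron(K2, K1)            # shape (N1*N2, N1*N2)
y_vec = Y.flatten(order='F')        # vec(Y): the columns of Y stacked

# (K2 kron K1) vec(Y) equals vec(K1 Y K2^T), so K_full never has to be formed.
dense = K_full @ y_vec
structured = (K1 @ Y @ K2.T).flatten(order='F')
kron_identity_holds = np.allclose(dense, structured)
```

The structured product only touches matrices of size N1 x N1 and N2 x N2, whereas the dense product needs the full (N1*N2) x (N1*N2) matrix.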

See [stegle_et_al_2011].

References

[stegle_et_al_2011]

Stegle, O.; Lippert, C.; Mooij, J.M.; Lawrence, N.D.; Borgwardt, K.:Efficient inference in matrix-variate Gaussian models with iid observation noise. In: Advances in Neural Information Processing Systems, 2011, Pages 630-638

log_likelihood()[source]¶

parameters_changed()[source]¶: This method gets called when parameters have changed. Another way of listening to param changes is to add self as a listener to the param, such that updates get passed through. See paramz.param.Observable.add_observer.

predict(X1new, X2new)[source]¶

Return the predictive mean and variance at a series of new points X1new, X2new. Only returns the diagonal of the predictive variance, for now.

Parameters:

X1new (np.ndarray, Nnew x self.input_dim1) – The points at which to make a prediction
X2new (np.ndarray, Nnew x self.input_dim2) – The points at which to make a prediction

GPy.models.gp_multiout_regression module¶

class GPMultioutRegression(X, Y, Xr_dim, kernel=None, kernel_row=None, Z=None, Z_row=None, X_row=None, Xvariance_row=None, num_inducing=(10, 10), qU_var_r_W_dim=None, qU_var_c_W_dim=None, init='GP', name='GPMR')[source]¶

Bases: GPy.core.sparse_gp.SparseGP

Gaussian Process model for multi-output regression without missing data

This is an implementation of Latent Variable Multiple Output Gaussian Processes (LVMOGP) in [Dai_et_al_2017].

References

[Dai_et_al_2017]

Dai, Z.; Alvarez, M.A.; Lawrence, N.D: Efficient Modeling of Latent Information in Supervised Learning using Gaussian Processes. In NIPS, 2017.

Parameters:

X (numpy.ndarray) – input observations.
Y (numpy.ndarray) – output observations, each column corresponding to an output dimension.
Xr_dim (int) – the dimensionality of the latent space in which the output dimensions are embedded
kernel (GPy.kern.Kern or None) – a GPy kernel for GP of individual output dimensions ** defaults to RBF **
kernel_row (GPy.kern.Kern or None) – a GPy kernel for the GP of the latent space ** defaults to RBF **
Z (numpy.ndarray or None) – inducing inputs
Z_row (numpy.ndarray or None) – inducing inputs for the latent space
X_row (numpy.ndarray or None) – the initial value of the mean of the variational posterior distribution of points in the latent space
Xvariance_row (numpy.ndarray or None) – the initial value of the variance of the variational posterior distribution of points in the latent space
num_inducing ((int, int)) – a tuple (M, Mr). M is the number of inducing points for GP of individual output dimensions. Mr is the number of inducing points for the latent space.
qU_var_r_W_dim (int) – the dimensionality of the covariance of q(U) for the latent space. If it is smaller than the number of inducing points, it represents a low-rank parameterization of the covariance matrix.
qU_var_c_W_dim (int) – the dimensionality of the covariance of q(U) for the GP regression. If it is smaller than the number of inducing points, it represents a low-rank parameterization of the covariance matrix.
init (str) – the choice of initialization: ‘GP’ or ‘rand’. With ‘rand’, the model is initialized randomly. With ‘GP’, the model is initialized through a protocol as follows: (1) fits a sparse GP (2) fits a BGPLVM based on the outcome of sparse GP (3) initialize the model based on the outcome of the BGPLVM.
name (str) – the name of the model
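The low-rank parameterization mentioned for qU_var_r_W_dim and qU_var_c_W_dim can be pictured with a small NumPy sketch. It assumes a low-rank-plus-diagonal form, W Wᵀ + diag(d); this is one common way to parameterize such a covariance and is an assumption here, since the parameter descriptions above do not state the exact form GPy uses. The names `W`, `diag` and `W_dim` are illustrative only.

```python
import numpy as np

rng = np.random.default_rng(0)
M, W_dim = 10, 3                      # number of inducing points, low-rank dimension

W = rng.standard_normal((M, W_dim))   # low-rank factor (assumed form)
diag = rng.uniform(0.1, 1.0, size=M)  # positive diagonal term (assumed form)

# Low-rank-plus-diagonal covariance: symmetric and positive definite.
covariance = W @ W.T + np.diag(diag)  # shape (M, M)

free_parameters = W.size + diag.size  # parameters in this form
full_parameters = M * (M + 1) // 2    # parameters of an unconstrained symmetric matrix
```

With W_dim smaller than M, the number of free parameters grows linearly in M rather than quadratically, which is the usual motivation for such a parameterization.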

optimize_auto(max_iters=10000, verbose=True)[source]¶

Optimize the model parameters through a pre-defined protocol.

Parameters:

max_iters (int) – the maximum number of iterations.
verbose (boolean) – print the progress of optimization or not.

parameters_changed()[source]¶: Method that is called upon any changes to Param variables within the model. In particular in the GP class this method re-performs inference, recalculating the posterior and log marginal likelihood and gradients of the model

Warning

This method is not designed to be called manually. The framework calls it automatically whenever parameters change; if you call it yourself, there may be unexpected consequences.

GPy.models.gp_multiout_regression_md module¶

class GPMultioutRegressionMD(X, Y, indexD, Xr_dim, kernel=None, kernel_row=None, Z=None, Z_row=None, X_row=None, Xvariance_row=None, num_inducing=(10, 10), qU_var_r_W_dim=None, qU_var_c_W_dim=None, init='GP', heter_noise=False, name='GPMRMD')[source]¶

Bases: GPy.core.sparse_gp.SparseGP

Gaussian Process model for multi-output regression with missing data

This is an implementation of Latent Variable Multiple Output Gaussian Processes (LVMOGP) in [Dai_et_al_2017]. This model targets the use case in which each output dimension is observed at a different set of inputs. The model takes a different data format: the input and output observations of all the output dimensions are stacked together correspondingly into two matrices. An extra array indicates the index of the output dimension for each data point. The output dimensions are indexed using integers from 0 to D-1, assuming there are D output dimensions.
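The stacked data format can be illustrated with plain NumPy. This is a sketch of how such arrays might be assembled, not code taken from GPy: the per-output datasets, their sizes and the single-column shape of Y are assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical per-output data: D = 3 outputs, each observed at its own inputs.
X_per_output = [rng.uniform(0, 1, size=(n, 1)) for n in (5, 8, 3)]
Y_per_output = [np.sin(6 * X) + 0.1 * rng.standard_normal(X.shape)
                for X in X_per_output]

# Stack inputs and outputs into two matrices whose rows correspond.
X = np.vstack(X_per_output)   # shape (16, 1)
Y = np.vstack(Y_per_output)   # shape (16, 1)

# One integer in 0..D-1 per row, naming the output dimension of that row.
indexD = np.concatenate(
    [np.full(len(Xd), d, dtype=int) for d, Xd in enumerate(X_per_output)]
)
```

Arrays built this way correspond to the X, Y and indexD arguments in the class signature; row i of X and Y belongs to output dimension indexD[i].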

References

[Dai_et_al_2017]

Dai, Z.; Alvarez, M.A.; Lawrence, N.D: Efficient Modeling of Latent Information in Supervised Learning using Gaussian Processes. In NIPS, 2017.

Parameters:

X (numpy.ndarray) – input observations.
Y (numpy.ndarray) – output observations, each column corresponding to an output dimension.
indexD (numpy.ndarray) – the array containing the index of output dimension for each data point
Xr_dim (int) – the dimensionality of the latent space in which the output dimensions are embedded
kernel (GPy.kern.Kern or None) – a GPy kernel for GP of individual output dimensions ** defaults to RBF **
kernel_row (GPy.kern.Kern or None) – a GPy kernel for the GP of the latent space ** defaults to RBF **
Z (numpy.ndarray or None) – inducing inputs
Z_row (numpy.ndarray or None) – inducing inputs for the latent space
X_row (numpy.ndarray or None) – the initial value of the mean of the variational posterior distribution of points in the latent space
Xvariance_row (numpy.ndarray or None) – the initial value of the variance of the variational posterior distribution of points in the latent space
num_inducing ((int, int)) – a tuple (M, Mr). M is the number of inducing points for GP of individual output dimensions. Mr is the number of inducing points for the latent space.
qU_var_r_W_dim (int) – the dimensionality of the covariance of q(U) for the latent space. If it is smaller than the number of inducing points, it represents a low-rank parameterization of the covariance matrix.
qU_var_c_W_dim (int) – the dimensionality of the covariance of q(U) for the GP regression. If it is smaller than the number of inducing points, it represents a low-rank parameterization of the covariance matrix.
init (str) – the choice of initialization: ‘GP’ or ‘rand’. With ‘rand’, the model is initialized randomly. With ‘GP’, the model is initialized through a protocol as follows: (1) fits a sparse GP (2) fits a BGPLVM based on the outcome of sparse GP (3) initialize the model based on the outcome of the BGPLVM.
heter_noise (boolean) – whether to assume heteroscedastic noise in the model
name (str) – the name of the model

optimize_auto(max_iters=10000, verbose=True)[source]¶

Optimize the model parameters through a pre-defined protocol.

Parameters:

max_iters (int) – the maximum number of iterations.
verbose (boolean) – print the progress of optimization or not.

parameters_changed()[source]¶: Method that is called upon any changes to Param variables within the model. In particular in the GP class this method re-performs inference, recalculating the posterior and log marginal likelihood and gradients of the model

Warning

This method is not designed to be called manually. The framework calls it automatically whenever parameters change; if you call it yourself, there may be unexpected consequences.

GPy.models.gp_offset_regression module¶

class GPOffsetRegression(X, Y, kernel=None, Y_metadata=None, normalizer=None, noise_var=1.0, mean_function=None)[source]¶

Bases: GPy.core.gp.GP

Gaussian Process model for offset regression

Parameters:

X – input observations, we assume for this class that this has one dimension of actual inputs and the last dimension should be the index of the cluster (so X should be Nx2)
Y – observed values (Nx1?)
kernel – a GPy kernel, defaults to rbf
normalizer (Norm) – [False] Normalize Y with the norm given. If normalizer is False, no normalization will be done. If it is None, we use GaussianNorm(alization).
noise_var – the noise variance for the Gaussian likelihood, defaults to 1.

Note

Multiple independent outputs are allowed using columns of Y
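The input layout described for X (one column of actual inputs followed by a column holding the cluster index) can be assembled as in the NumPy sketch below. The two clusters, their sizes and the placeholder observations are assumptions made for illustration; only the N x 2 layout comes from the parameter description.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical data: two clusters of 1-D inputs (for example, two time series).
x_cluster0 = np.sort(rng.uniform(0, 10, size=6))
x_cluster1 = np.sort(rng.uniform(0, 10, size=4))

inputs = np.concatenate([x_cluster0, x_cluster1])
cluster = np.concatenate([np.zeros(6), np.ones(4)])

# First column: the actual input; last column: the index of the cluster.
X = np.column_stack([inputs, cluster])   # shape (10, 2)
Y = np.sin(inputs)[:, None]              # shape (10, 1), placeholder observations
```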

dr_doffset(X, sel, delta)[source]¶

parameters_changed()[source]¶: Method that is called upon any changes to Param variables within the model. In particular in the GP class this method re-performs inference, recalculating the posterior and log marginal likelihood and gradients of the model

Warning

This method is not designed to be called manually. The framework calls it automatically whenever parameters change; if you call it yourself, there may be unexpected consequences.

GPy.models.gp_regression module¶

class GPRegression(X, Y, kernel=None, Y_metadata=None, normalizer=None, noise_var=1.0, mean_function=None)[source]¶

Bases: GPy.core.gp.GP

Gaussian Process model for regression

This is a thin wrapper around the models.GP class, with a set of sensible defaults

Parameters:

X – input observations
Y – observed values
kernel – a GPy kernel, defaults to rbf
normalizer (Norm) – [False] Normalize Y with the norm given. If normalizer is False, no normalization will be done. If it is None, we use GaussianNorm(alization).
noise_var – the noise variance for the Gaussian likelihood, defaults to 1.

Note

Multiple independent outputs are allowed using columns of Y

static from_gp(gp)[source]¶

save_model(output_filename, compress=True, save_data=True)[source]¶

to_dict(save_data=True)[source]¶

Convert the object into a json serializable dictionary. Note: It uses the private method _save_to_input_dict of the parent.

Parameters:	save_data (boolean) – if true, it adds the training data self.X and self.Y to the dictionary
Return dict:	json serializable dictionary containing the needed information to instantiate the object

GPy.models.gp_var_gauss module¶

class GPVariationalGaussianApproximation(X, Y, kernel, likelihood, Y_metadata=None)[source]¶

Bases: GPy.core.gp.GP

The Variational Gaussian Approximation revisited

References

[opper_archambeau_2009]

Opper, M.; Archambeau, C.; The Variational Gaussian Approximation Revisited. Neural Comput. 2009, pages 786-792.

GPy.models.gplvm module¶

class GPLVM(Y, input_dim, init='PCA', X=None, kernel=None, name='gplvm', Y_metadata=None, normalizer=False)[source]¶

Bases: GPy.core.gp.GP

Gaussian Process Latent Variable Model

Parameters:

Y (np.ndarray) – observed data
input_dim (int) – latent dimensionality
init ('PCA' | 'random') – initialisation method for the latent space
normalizer (bool) – normalize the outputs Y. If normalizer is True, we will normalize using Standardize. If normalizer is False (the default), no normalization will be done.

parameters_changed()[source]¶: Method that is called upon any changes to Param variables within the model. In particular in the GP class this method re-performs inference, recalculating the posterior and log marginal likelihood and gradients of the model

Warning

This method is not designed to be called manually. The framework calls it automatically whenever parameters change; if you call it yourself, there may be unexpected consequences.

plot_inducing(which_indices=None, legend=False, plot_limits=None, marker=None, projection='2d', **kwargs)¶

Plot a scatter plot of the inducing inputs.

Parameters:

which_indices ([int]) – which input dimensions to plot against each other
legend (bool) – whether to plot the legend on the figure
plot_limits ((xmin, xmax, ymin, ymax) or ((xmin, xmax), (ymin, ymax))) – the plot limits for the plot
marker (str) – marker to use [default is custom arrow like]
kwargs – the kwargs for the scatter plots
projection (str) – for now 2d or 3d projection (other projections can be implemented, see developer documentation)

plot_latent(labels=None, which_indices=None, resolution=60, legend=True, plot_limits=None, updates=False, kern=None, marker='<>^vsd', num_samples=1000, projection='2d', scatter_kwargs=None, **imshow_kwargs)¶

Plot the latent space of the GP on the inputs. This is the density of the GP posterior as a grey scale and the scatter plot of the input dimensions selected by which_indices.

Parameters:

labels (array-like) – a label for each data point (row) of the inputs
which_indices ((int, int)) – which input dimensions to plot against each other
resolution (int) – the resolution at which we predict the magnification factor
legend (bool) – whether to plot the legend on the figure
plot_limits ((xmin, xmax, ymin, ymax) or ((xmin, xmax), (ymin, ymax))) – the plot limits for the plot
updates (bool) – if possible, make interactive updates using the specific library you are using
kern (Kern) – the kernel to use for prediction
marker (str) – markers to use; cycles if more labels than markers are given
num_samples (int) – the maximum number of samples to plot. If the number of samples (in X) is higher than num_samples, we take a stratified subsample based on the labels.
imshow_kwargs – the kwargs for the imshow (magnification factor)
scatter_kwargs – the kwargs for the scatter plots

plot_scatter(labels=None, which_indices=None, legend=True, plot_limits=None, marker='<>^vsd', num_samples=1000, projection='2d', **kwargs)¶

Plot a scatter plot of the latent space.

Parameters:

labels (array-like) – a label for each data point (row) of the inputs
which_indices ((int, int)) – which input dimensions to plot against each other
legend (bool) – whether to plot the legend on the figure
plot_limits ((xmin, xmax, ymin, ymax) or ((xmin, xmax), (ymin, ymax))) – the plot limits for the plot
marker (str) – markers to use; cycles if more labels than markers are given
kwargs – the kwargs for the scatter plots

plot_steepest_gradient_map(output_labels=None, data_labels=None, which_indices=None, resolution=15, legend=True, plot_limits=None, updates=False, kern=None, marker='<>^vsd', num_samples=1000, annotation_kwargs=None, scatter_kwargs=None, **imshow_kwargs)¶

Plot the latent space of the GP on the inputs. This is the density of the GP posterior as a grey scale and the scatter plot of the input dimensions selected by which_indices.

Parameters:

labels (array-like) – a label for each data point (row) of the inputs
which_indices ((int, int)) – which input dimensions to plot against each other
resolution (int) – the resolution at which we predict the magnification factor
legend (bool) – whether to plot the legend on the figure, if int plot legend columns on legend
plot_limits ((xmin, xmax, ymin, ymax) or ((xmin, xmax), (ymin, ymax))) – the plot limits for the plot
updates (bool) – if possible, make interactive updates using the specific library you are using
kern (Kern) – the kernel to use for prediction
marker (str) – markers to use - cycle if more labels than markers are given
num_samples (int) – the maximum number of samples to plot. We do a stratified subsample from the labels if the number of samples (in X) is higher than num_samples.
imshow_kwargs – the kwargs for the imshow (magnification factor)
annotation_kwargs – the kwargs for the annotation plot
scatter_kwargs – the kwargs for the scatter plots
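
The stratified subsample mentioned for num_samples can be pictured with a small NumPy sketch. The helper name stratified_subsample is hypothetical and this is not GPy's implementation; it only illustrates keeping each label's share roughly proportional when X has more rows than num_samples.

```python
import numpy as np

def stratified_subsample(labels, num_samples, seed=0):
    """Return sorted row indices, roughly num_samples of them, keeping label proportions.

    Hypothetical helper for illustration only; not part of GPy.
    """
    labels = np.asarray(labels)
    n = labels.shape[0]
    if n <= num_samples:
        return np.arange(n)  # nothing to subsample
    rng = np.random.default_rng(seed)
    keep = []
    for lab in np.unique(labels):
        idx = np.flatnonzero(labels == lab)
        # Budget per label is proportional to its frequency (at least one point).
        k = max(1, int(np.floor(num_samples * idx.size / n)))
        keep.append(rng.choice(idx, size=k, replace=False))
    return np.sort(np.concatenate(keep))

labels = np.array([0] * 800 + [1] * 200)
idx = stratified_subsample(labels, num_samples=100)
```

Here the 80/20 split of the labels is preserved in the 100 retained rows, so rare labels do not vanish from the scatter plot.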

GPy.models.gradient_checker module¶

class GradientChecker(f, df, x0, names=None, *args, **kwargs)[source]¶

Bases: GPy.core.model.Model

Parameters:

f – Function to check gradient for
df – Gradient of function to check
x0 ([array-like] | array-like | float | int) – Initial guess for inputs x (if it has a shape (a,b) this will be reflected in the parameter names). Can be a list of arrays, if f takes a list of arrays. This list will be passed to f and df in the same order as given here. If there is only one argument, make sure not to pass a list!
names – Names to print, when performing gradcheck. If a list was passed to x0 a list of names with the same length is expected.
args – Arguments passed as f(x, *args, **kwargs) and df(x, *args, **kwargs)

Examples

Initialisation:

import numpy
import GPy
from GPy.models import GradientChecker
N, M, Q = 10, 5, 3

Sinusoid:

X = numpy.random.rand(N, Q)
grad = GradientChecker(numpy.sin,numpy.cos,X,'x')
grad.checkgrad(verbose=1)

Using GPy:

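# Assumes a recent GPy release, where the kernels are spelled Linear and RBF
# and the gradient with respect to X is exposed as gradients_X(dL_dK, X).
X, Z = numpy.random.randn(N,Q), numpy.random.randn(M,Q)
kern = GPy.kern.Linear(Q, ARD=True) + GPy.kern.RBF(Q, ARD=True)
# The checked objective is sum(kern.K(x)), so dL_dK is a matrix of ones.
grad = GradientChecker(kern.K,
                        lambda x: kern.gradients_X(numpy.ones((N,N)), x),
                        x0 = X.copy(),
                        names='X')
grad.checkgrad(verbose=1)
grad.randomize()
grad.checkgrad(verbose=1)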
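
What a gradient check compares can be sketched without GPy: a central finite-difference estimate of the gradient against the analytical one. numeric_gradient is a hypothetical helper, and the step and tolerance values are assumptions chosen for illustration; the library's own implementation may differ.

```python
import numpy as np

def numeric_gradient(f, x, step=1e-6):
    """Central-difference gradient of the scalar sum(f(x)) with respect to x."""
    x = np.asarray(x, dtype=float)
    grad = np.zeros_like(x)
    for i in np.ndindex(x.shape):
        xp, xm = x.copy(), x.copy()
        xp[i] += step
        xm[i] -= step
        grad[i] = (np.sum(f(xp)) - np.sum(f(xm))) / (2 * step)
    return grad

rng = np.random.default_rng(0)
X = rng.random((10, 3))              # entries in [0, 1), so cos(X) is never zero
analytic = np.cos(X)                 # analytical gradient of sum(sin(X))
numeric = numeric_gradient(np.sin, X)

# The check passes when the ratio of the two gradients is close to one.
ratio = analytic / numeric
gradient_ok = bool(np.all(np.abs(1.0 - ratio) < 1e-3))
```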

log_likelihood()[source]¶

class HessianChecker(f, df, ddf, x0, names=None, *args, **kwargs)[source]¶

Bases: GPy.models.gradient_checker.GradientChecker

Parameters:

f – Function (only used for the numerical Hessian gradient)
df – Gradient of function to check
ddf – Analytical gradient function (second derivative, i.e. the Hessian)
x0 ([array-like] | array-like | float | int) – Initial guess for inputs x (if it has a shape (a,b) this will be reflected in the parameter names). Can be a list of arrays, if f takes a list of arrays. This list will be passed to f and df in the same order as given here. If there is only one argument, make sure not to pass a list!
names – Names to print, when performing gradcheck. If a list was passed to x0 a list of names with the same length is expected.
args – Arguments passed as f(x, *args, **kwargs) and df(x, *args, **kwargs)

checkgrad(target_param=None, verbose=False, step=1e-06, tolerance=0.001, block_indices=None, plot=False)[source]¶

Overrides the checkgrad method to check the whole block at once instead of looping through it.

Shows diagnostics using matshow instead.

Parameters:

verbose (bool) – if True, print a “full” check of each parameter
step (float (default 1e-6)) – the size of the step around which to linearise the objective
tolerance (float (default 1e-3)) – the tolerance allowed (see note)

Note: The gradient is considered correct if the ratio of the analytical and numerical gradients is within <tolerance> of unity.
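
The block-wise check can be sketched in plain NumPy: build a numerical Hessian by differencing the gradient df, then compare the whole matrix against the analytical ddf using the ratio criterion from the note. The test function, the matrix A and the helper numeric_hessian are assumptions made up for this illustration, not part of GPy.

```python
import numpy as np

# Symmetric matrix with no zero entries, so every ratio below is well defined.
A = np.array([[2.0, 0.5, 0.3],
              [0.5, 1.0, 0.2],
              [0.3, 0.2, 1.5]])

def df(x):
    """Gradient of f(x) = 0.5 * x^T A x + sum(sin(x))."""
    return A @ x + np.cos(x)

def ddf(x):
    """Analytical Hessian of the same f."""
    return A - np.diag(np.sin(x))

def numeric_hessian(df, x, step=1e-6):
    """Central differences of the gradient, one column per input dimension."""
    n = x.size
    H = np.zeros((n, n))
    for j in range(n):
        e = np.zeros(n)
        e[j] = step
        H[:, j] = (df(x + e) - df(x - e)) / (2 * step)
    return H

x0 = np.array([0.1, 0.4, 0.7])
analytic_hess = ddf(x0)
numeric_hess = numeric_hessian(df, x0)

# Whole-block criterion: every ratio must lie within the tolerance of unity.
ratio = analytic_hess / numeric_hess
block_ok = bool(np.all(np.abs(1.0 - ratio) < 1e-3))
```

A ratio test like this breaks down where the true entry is zero, which is why the example avoids zero Hessian entries; a difference-based test would be needed in that case.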

checkgrad_block(analytic_hess, numeric_hess, verbose=False, step=1e-06, tolerance=0.001, block_indices=None, plot=False)[source]¶: Checkgrad a block matrix

class SkewChecker(df, ddf, dddf, x0, names=None, *args, **kwargs)[source]¶

Bases: GPy.models.gradient_checker.HessianChecker

Parameters:

df – gradient of function
ddf – Gradient of function to check (hessian)
dddf – Analytical gradient function (third derivative)
x0 ([array-like] | array-like | float | int) – Initial guess for inputs x (if it has a shape (a,b) this will be reflected in the parameter names). Can be a list of arrays, if f takes a list of arrays. This list will be passed to f and df in the same order as given here. If there is only one argument, make sure not to pass a list!
names – Names to print, when performing gradcheck. If a list was passed to x0 a list of names with the same length is expected.
args – Arguments passed as f(x, *args, **kwargs) and df(x, *args, **kwargs)

checkgrad(target_param=None, verbose=False, step=1e-06, tolerance=0.001, block_indices=None, plot=False, super_plot=False)[source]¶

Gradient checker that just checks each hessian individually

super_plot will plot the hessian wrt every parameter, plot will just do the first one

at_least_one_element(x)[source]¶

flatten_if_needed(x)[source]¶

get_shape(x)[source]¶

GPy.models.ibp_lfm module¶

class IBPLFM(X, Y, input_dim=2, output_dim=1, rank=1, Gamma=None, num_inducing=10, Z=None, kernel=None, inference_method=None, likelihood=None, name='IBP for LFM', alpha=2.0, beta=2.0, connM=None, tau=None, mpi_comm=None, normalizer=False, variational_prior=None, **kwargs)[source]¶

Bases: GPy.core.sparse_gp_mpi.SparseGP_MPI

Indian Buffet Process for Latent Force Models

Parameters:

Y (np.ndarray | GPy.likelihood instance) – observed data (np.ndarray) or GPy.likelihood
X (np.ndarray) – input data (np.ndarray) [X:values, X:index], index refers to the number of the output
input_dim (int) – latent dimensionality
rank – number of latent functions

get_Zp_gradients(Zp)[source]¶: Get the gradients of the posterior distribution of Zp in its specific form.

parameters_changed()[source]¶: Method that is called upon any changes to Param variables within the model. In particular in the GP class this method re-performs inference, recalculating the posterior and log marginal likelihood and gradients of the model

Warning

This method is not designed to be called manually. The framework is set up to call it automatically upon changes to parameters; if you call it yourself, there may be unexpected consequences.

set_Zp_gradients(Zp, Zp_grad)[source]¶: Set the gradients of the posterior distribution of Zp in its specific form.

class IBPPosterior(binary_prob, tau=None, name='Sensitivity space', *a, **kw)[source]¶

Bases: GPy.core.parameterization.parameterized.Parameterized

The IBP distribution for variational approximations.

binary_prob : the probability of including a latent function over an output.

set_gradients(grad)[source]¶

class IBPPrior(rank, alpha=2.0, name='IBPPrior', **kw)[source]¶

Bases: GPy.core.parameterization.variational.VariationalPrior

KL_divergence(variational_posterior)[source]¶

update_gradients_KL(variational_posterior)[source]¶: updates the gradients for mean and variance in place

class VarDTC_minibatch_IBPLFM(batchsize=None, limit=3, mpi_comm=None)[source]¶

Bases: GPy.inference.latent_function_inference.var_dtc_parallel.VarDTC_minibatch

Modifications of VarDTC_minibatch for IBP LFM

gatherPsiStat(kern, X, Z, Y, beta, Zp)[source]¶

inference_likelihood(kern, X, Z, likelihood, Y, Zp)[source]¶

The first phase of inference. Computes: log-likelihood, dL_dKmm

Cached intermediate results: Kmm, KmmInv

inference_minibatch(kern, X, Z, likelihood, Y, Zp)[source]¶: The second phase of inference: computing the derivatives over a minibatch of Y. Computes dL_dpsi0, dL_dpsi1, dL_dpsi2 and dL_dthetaL, and returns a flag (isEnd) showing whether it reached the end of Y.

update_gradients(model, mpi_comm=None)[source]¶

GPy.models.input_warped_gp module¶

class InputWarpedGP(X, Y, kernel=None, normalizer=False, warping_function=None, warping_indices=None, Xmin=None, Xmax=None, epsilon=None)[source]¶

Bases: GPy.core.gp.GP

Input Warped GP

This defines a GP model that applies a warping function to the Input. By default, it uses Kumar Warping (CDF of Kumaraswamy distribution)

X : array_like, shape = (n_samples, n_features) for input data

Y : array_like, shape = (n_samples, 1) for output data

kernel : object, optional: An instance of a kernel function defined in GPy.kern. Defaults to Matern 32.
warping_function : object, optional: An instance of a warping function defined in GPy.util.input_warping_functions. Defaults to KumarWarping.
warping_indices : list of int, optional: A list of indices of the features in X that should be warped. It is used in the Kumar warping function.
normalizer : bool, optional: A bool variable indicating whether to normalize the output.
Xmin : list of float, optional: The min values for every feature in X. It is used in the Kumar warping function.
Xmax : list of float, optional: The max values for every feature in X. It is used in the Kumar warping function.
epsilon : float, optional: We normalize X to [0+e, 1-e]. If not given, the default value defined in the KumarWarping function is used.

X_untransformed : array_like, shape = (n_samples, n_features): A copy of original input X
X_warped : array_like, shape = (n_samples, n_features): Input data after warping
warping_function : object, optional: An instance of a warping function defined in GPy.util.input_warping_functions. Defaults to KumarWarping.

Kumar warping uses the CDF of Kumaraswamy distribution. More on the Kumaraswamy distribution can be found at the wiki page: https://en.wikipedia.org/wiki/Kumaraswamy_distribution

Snoek, J.; Swersky, K.; Zemel, R. S. & Adams, R. P. Input Warping for Bayesian Optimization of Non-stationary Functions preprint arXiv:1402.0929, 2014
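
As a rough illustration of the warping described above, the Kumaraswamy CDF is F(x; a, b) = 1 - (1 - x^a)^b on the unit interval. The sketch below first rescales each feature to [epsilon, 1 - epsilon] using Xmin and Xmax and then applies the CDF. The function name kumar_warp, the fixed shape parameters a and b, and the default epsilon are assumptions for this example; in the model the shape parameters would be learned, and the library's own default epsilon may differ.

```python
import numpy as np

def kumar_warp(X, a, b, Xmin, Xmax, epsilon=1e-6):
    """Warp each feature of X with the Kumaraswamy CDF (illustrative sketch)."""
    X = np.asarray(X, dtype=float)
    Xmin = np.asarray(Xmin, dtype=float)
    Xmax = np.asarray(Xmax, dtype=float)
    # Rescale every feature to [epsilon, 1 - epsilon].
    unit = (X - Xmin) / (Xmax - Xmin)
    unit = epsilon + unit * (1.0 - 2.0 * epsilon)
    # Kumaraswamy CDF: F(x; a, b) = 1 - (1 - x**a)**b
    return 1.0 - (1.0 - unit ** a) ** b

X = np.linspace(0.0, 10.0, 5).reshape(-1, 1)
X_warped = kumar_warp(X, a=2.0, b=0.5, Xmin=[0.0], Xmax=[10.0])
# With a = b = 1 the CDF is the identity on the rescaled inputs.
identity = kumar_warp(X, a=1.0, b=1.0, Xmin=[0.0], Xmax=[10.0])
```

Because the CDF is monotone, the warping preserves the ordering of the inputs while stretching or compressing regions of the input space.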

log_likelihood()[source]¶

Compute the marginal log likelihood

For input warping, just use the normal GP log likelihood

parameters_changed()[source]¶

Update the gradients of parameters for warping function

This method is called whenever the parameters of the warping function, the kernel, or any other parameter of a normal GP take new values.

predict(Xnew)[source]¶

Prediction on the new data

Xnew : array_like, shape = (n_samples, n_features): The test data.

mean : array_like, shape = (n_samples, output.dim): Posterior mean at the location of Xnew
var : array_like, shape = (n_samples, 1): Posterior variance at the location of Xnew

transform_data(X, test_data=False)[source]¶

Apply warping_function to some Input data

X : array_like, shape = (n_samples, n_features)

test_data: bool, optional: Default to False, should set to True when transforming test data

GPy.models.mrd module¶

class MRD(Ylist, input_dim, X=None, X_variance=None, initx='PCA', initz='permute', num_inducing=10, Z=None, kernel=None, inference_method=None, likelihoods=None, name='mrd', Ynames=None, normalizer=False, stochastic=False, batchsize=10)[source]¶

Bases: GPy.models.bayesian_gplvm_minibatch.BayesianGPLVMMiniBatch

!WARNING: This is bleeding edge code and still in development. Functionality may change fundamentally during development!

Apply MRD to all given datasets Y in Ylist.

Y_i in [n x p_i]

If Ylist is a dictionary, the keys of the dictionary are the names, and the values are the different datasets to compare.

The samples n in the datasets need to match up, whereas the dimensionality p_i can differ.

Parameters:

Ylist ([array-like]) – List of datasets to apply MRD on
input_dim (int) – latent dimensionality
X (array-like) – mean of starting latent space q in [n x q]
X_variance (array-like) – variance of starting latent space q in [n x q]
initx (['concat'|'single'|'random']) – initialisation method for the latent space:
- 'concat' - PCA on concatenation of all datasets
- 'single' - Concatenation of PCA on datasets, respectively
- 'random' - Random draw from a Normal(0,1)
initz ('permute'|'random') – initialisation method for inducing inputs
num_inducing – number of inducing inputs to use
Z – initial inducing inputs
kernel ([GPy.kernels.kernels] | GPy.kernels.kernels | None (default)) – list of kernels or kernel to copy for each output

inference_method (GPy.inference.latent_function_inference) – InferenceMethodList of inferences, or one inference method for all
likelihoods – the likelihoods to use
name (str) – the name of this model
Ynames ([str]) – the names for the datasets given, must be of equal length as Ylist or None
normalizer (bool|Norm) – how to normalize the data
stochastic (bool) – whether this model should use stochastic gradient descent over the dimensions
batchsize (bool|[bool]) – either one batchsize for all, or one batchsize per dataset

factorize_space(threshold=0.005, printOut=False, views=None)[source]¶: Given a trained MRD model, this function looks at the optimized ARD weights (lengthscales) and decides which part of the latent space is shared across views or private, according to a threshold. The threshold is applied after all weights are normalized so that the maximum value is 1.
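
The thresholding idea can be sketched with NumPy. The helper split_latent_space is hypothetical, and two choices in it are assumptions rather than documented behaviour: weights are normalized per view, and a dimension counts as shared when it is above the threshold in every view and as private to a view otherwise.

```python
import numpy as np

def split_latent_space(ard_weights, threshold=0.005):
    """Split latent dimensions into shared and per-view private sets.

    ard_weights: array of shape (num_views, input_dim) with non-negative
    ARD weights, one row per view (hypothetical input format).
    """
    W = np.asarray(ard_weights, dtype=float)
    W = W / W.max(axis=1, keepdims=True)       # largest weight in each view becomes 1
    active = W > threshold                     # dimensions each view actually uses
    shared_mask = active.all(axis=0)
    shared = np.flatnonzero(shared_mask)
    private = [np.flatnonzero(row & ~shared_mask) for row in active]
    return shared, private

weights = np.array([[1.0, 0.8, 0.001, 0.0],
                    [0.5, 0.0005, 0.4, 0.0]])
shared, private = split_latent_space(weights)
```

In this toy input, dimension 0 is used by both views, dimension 1 only by the first, dimension 2 only by the second, and dimension 3 by neither.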

log_likelihood()[source]¶: The log marginal likelihood of the model, $p(\mathbf{y})$. This is the objective function of the model being optimised.

parameters_changed()[source]¶: Method that is called upon any changes to Param variables within the model. In particular in the GP class this method re-performs inference, recalculating the posterior and log marginal likelihood and gradients of the model

Warning

This method is not designed to be called manually. The framework is set up to call it automatically upon changes to parameters; if you call it yourself, there may be unexpected consequences.

plot_latent(labels=None, which_indices=None, resolution=60, legend=True, plot_limits=None, updates=False, kern=None, marker='<>^vsd', num_samples=1000, projection='2d', predict_kwargs={}, scatter_kwargs=None, **imshow_kwargs)[source]¶: See plotting.matplot_dep.dim_reduction_plots.plot_latent. If predict_kwargs is None, this plots the latent space for the 0th dataset (and kernel); otherwise give predict_kwargs=dict(Yindex='index') to plot only the latent space of the dataset with that index.

plot_scales(titles=None, fig_kwargs={}, **kwargs)[source]¶

Plot input sensitivity for all datasets, to see which input dimensions are significant for which dataset.

Parameters:	titles – titles for axes of datasets

kwargs go into plot_ARD for each kernel.

predict(Xnew, full_cov=False, Y_metadata=None, kern=None, Yindex=0)[source]¶: Prediction for data set Yindex[default=0]. This predicts the output mean and variance for the dataset given in Ylist[Yindex]

GPy.models.multioutput_gp module¶

class MultioutputGP(X_list, Y_list, kernel_list, likelihood_list, name='multioutputgp', kernel_cross_covariances={}, inference_method=None)[source]¶

Bases: GPy.core.gp.GP

Gaussian process model for using observations from multiple likelihoods and different kernels

Parameters:

X_list – input observations in a list for each likelihood
Y_list – output observations in a list for each likelihood
kernel_list – kernels in a list for each likelihood
likelihood_list – likelihoods in a list
kernel_cross_covariances – cross covariances between different likelihoods. See class MultioutputKern for more
inference_method – the LatentFunctionInference inference method to use for this GP

log_predictive_density(x_test, y_test, Y_metadata=None)[source]¶

Calculation of the log predictive density

Parameters:

x_test ((Nx1) array) – test locations (x_{*})
y_test ((Nx1) array) – test observations (y_{*})
Y_metadata – metadata associated with the test points

predict(Xnew, full_cov=False, Y_metadata=None, kern=None, likelihood=None, include_likelihood=True)[source]¶

Predict the function(s) at the new point(s) Xnew. This includes the likelihood variance added to the predicted underlying function (usually referred to as f).

In order to predict without adding in the likelihood give include_likelihood=False, or refer to self.predict_noiseless().

Parameters:

Xnew (np.ndarray (Nnew x self.input_dim)) – The points at which to make a prediction
full_cov (bool) – whether to return the full covariance matrix, or just the diagonal
Y_metadata – metadata about the predicting point to pass to the likelihood
kern – The kernel to use for prediction (defaults to the model kern). this is useful for examining e.g. subprocesses.
include_likelihood (bool) – Whether or not to add likelihood noise to the predicted underlying latent function f.

Returns:

(mean, var):
mean: posterior mean, a Numpy array, Nnew x self.input_dim
var: posterior variance, a Numpy array, Nnew x 1 if full_cov=False, Nnew x Nnew otherwise

If full_cov and self.input_dim > 1, the return shape of var is Nnew x Nnew x self.input_dim. If self.input_dim == 1, the return shape is Nnew x Nnew. This is to allow for different normalizations of the output dimensions.

Note: If you want the predictive quantiles (e.g. 95% confidence interval) use predict_quantiles.

predict_noiseless(Xnew, full_cov=False, Y_metadata=None, kern=None)[source]¶

Convenience function to predict the underlying function of the GP (often referred to as f) without adding the likelihood variance on the prediction function.

This is most likely what you want to use for your predictions.

Parameters:

Xnew (np.ndarray (Nnew x self.input_dim)) – The points at which to make a prediction
full_cov (bool) – whether to return the full covariance matrix, or just the diagonal
Y_metadata – metadata about the predicting point to pass to the likelihood
kern – The kernel to use for prediction (defaults to the model kern). this is useful for examining e.g. subprocesses.

Returns:

(mean, var):
mean: posterior mean, a Numpy array, Nnew x self.input_dim
var: posterior variance, a Numpy array, Nnew x 1 if full_cov=False, Nnew x Nnew otherwise

If full_cov and self.input_dim > 1, the return shape of var is Nnew x Nnew x self.input_dim. If self.input_dim == 1, the return shape is Nnew x Nnew. This is to allow for different normalizations of the output dimensions.

Note: If you want the predictive quantiles (e.g. 95% confidence interval) use predict_quantiles.
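
The difference between predicting the latent function f and predicting noisy observations can be seen in a few lines of NumPy. This is a from-scratch sketch of exact GP regression with an RBF kernel and a Gaussian likelihood, not GPy code; rbf, gp_predict and the fixed hyperparameters are assumptions for the example.

```python
import numpy as np

def rbf(A, B, variance=1.0, lengthscale=1.0):
    """Squared-exponential kernel matrix between the rows of A and B."""
    d2 = np.sum(A**2, 1)[:, None] + np.sum(B**2, 1)[None, :] - 2 * A @ B.T
    return variance * np.exp(-0.5 * np.maximum(d2, 0.0) / lengthscale**2)

def gp_predict(X, Y, Xnew, noise_var=0.1, include_likelihood=True):
    """Posterior mean and marginal variance at Xnew for a Gaussian likelihood."""
    K = rbf(X, X) + noise_var * np.eye(X.shape[0])
    Ks = rbf(X, Xnew)
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, Y))
    mean = Ks.T @ alpha
    v = np.linalg.solve(L, Ks)
    var = (np.diag(rbf(Xnew, Xnew)) - np.sum(v**2, axis=0))[:, None]
    if include_likelihood:
        var = var + noise_var        # add the observation noise on top of f
    return mean, var

rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, (20, 1))
Y = np.sin(X) + 0.1 * rng.standard_normal((20, 1))
Xnew = np.linspace(-3, 3, 5)[:, None]

mean_y, var_y = gp_predict(X, Y, Xnew, include_likelihood=True)
mean_f, var_f = gp_predict(X, Y, Xnew, include_likelihood=False)
```

For a Gaussian likelihood the two predictions share the same mean and differ only by the noise variance, which is why the noiseless variant is usually the one to plot as the fitted function.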

predict_quantiles(X, quantiles=(2.5, 97.5), Y_metadata=None, kern=None, likelihood=None)[source]¶

Get the predictive quantiles around the prediction at X

Parameters:

X (np.ndarray (Xnew x self.input_dim)) – The points at which to make a prediction
quantiles (tuple) – tuple of quantiles, default is (2.5, 97.5) which is the 95% interval
kern – optional kernel to use for prediction

Returns:	list of quantiles for each X and predictive quantiles for interval combination
Return type:	[np.ndarray (Xnew x self.output_dim), np.ndarray (Xnew x self.output_dim)]
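
For a Gaussian predictive distribution, quantiles follow directly from the predictive mean and variance. The sketch below uses only NumPy and the standard library; gaussian_quantiles is a hypothetical helper, and non-Gaussian likelihoods would need a different computation.

```python
import numpy as np
from statistics import NormalDist

def gaussian_quantiles(mean, var, quantiles=(2.5, 97.5)):
    """Return one array per requested percentile, assuming a Gaussian predictive."""
    std = np.sqrt(var)
    return [mean + NormalDist().inv_cdf(q / 100.0) * std for q in quantiles]

mean = np.array([[0.0], [1.0]])
var = np.array([[1.0], [4.0]])
lower, upper = gaussian_quantiles(mean, var)
```

The default pair (2.5, 97.5) gives an interval of about 1.96 standard deviations either side of the mean, which is the 95% interval mentioned above.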

predictive_gradients(Xnew, kern=None)[source]¶

Compute the derivatives of the predicted latent function with respect to X*.

Given a set of points at which to predict X* (size [N*,Q]), compute the derivatives of the mean and variance. Resulting arrays are sized:

dmu_dX* – [N*, Q, D], where D is the number of outputs in this GP (usually one).
dv_dX* – [N*, Q] (since all outputs have the same variance)

Note that this is not the same as computing the mean and variance of the derivative of the function!

Parameters:	X (np.ndarray (Xnew x self.input_dim)) – The points at which to get the predictive gradients
Returns:	dmu_dX, dv_dX
Return type:	[np.ndarray (N, Q ,D), np.ndarray (N,Q) ]

set_XY(X=None, Y=None)[source]¶

Set the input / output data of the model This is useful if we wish to change our existing data but maintain the same model

Parameters:

X (np.ndarray) – input observations
Y (np.ndarray) – output observations

GPy.models.one_vs_all_classification module¶

class OneVsAllClassification(X, Y, kernel=None, Y_metadata=None, messages=True)[source]¶

Bases: object

Gaussian Process classification: One vs all

This is a thin wrapper around the models.GPClassification class, with a set of sensible defaults

Parameters:

X – input observations
Y – observed values, can be None if likelihood is not None
kernel – a GPy kernel, defaults to rbf

Note

Multiple independent outputs are not allowed
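
The one-vs-all scheme turns a single multi-class label column into one binary problem per class. The NumPy sketch below shows only that label construction; one_vs_all_targets is a hypothetical helper, and fitting a binary classifier to each target is left out.

```python
import numpy as np

def one_vs_all_targets(Y):
    """Map each class label to an (N, 1) binary target: 1 for that class, 0 otherwise."""
    Y = np.asarray(Y).reshape(-1, 1)
    return {c.item(): (Y == c).astype(int) for c in np.unique(Y)}

Y = np.array([0, 2, 1, 2, 0])
targets = one_vs_all_targets(Y)
```

Each data point is positive in exactly one of the binary problems, so the per-class targets sum to a column of ones.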

GPy.models.one_vs_all_sparse_classification module¶

class OneVsAllSparseClassification(X, Y, kernel=None, Y_metadata=None, messages=True, num_inducing=10)[source]¶

Bases: object

Gaussian Process classification: One vs all

This is a thin wrapper around the models.GPClassification class, with a set of sensible defaults

Parameters:

X – input observations
Y – observed values, can be None if likelihood is not None
kernel – a GPy kernel, defaults to rbf

Note

Multiple independent outputs are not allowed

GPy.models.sparse_gp_classification module¶

class SparseGPClassification(X, Y=None, likelihood=None, kernel=None, Z=None, num_inducing=10, Y_metadata=None, mean_function=None, inference_method=None, normalizer=False)[source]¶

Bases: GPy.core.sparse_gp.SparseGP

Sparse Gaussian Process model for classification

This is a thin wrapper around the sparse_GP class, with a set of sensible defaults

Parameters:

X – input observations
Y – observed values
likelihood – a GPy likelihood, defaults to Bernoulli
kernel – a GPy kernel, defaults to rbf+white
inference_method (GPy.inference.latent_function_inference.LatentFunctionInference) – Latent function inference to use, defaults to EPDTC
normalize_X (False|True) – whether to normalize the input data before computing (predictions will be in original scales)
normalize_Y (False|True) – whether to normalize the output data before computing (predictions will be in original scales)

Return type:

model object

static from_dict(input_dict, data=None)[source]¶

Instantiate a SparseGPClassification object using the information in input_dict (built by the to_dict method).

Parameters:	data (tuple(`np.ndarray`, `np.ndarray`)) – It is used to provide X and Y for the case when the model was saved using save_data=False in to_dict method.

static from_sparse_gp(sparse_gp)[source]¶

save_model(output_filename, compress=True, save_data=True)[source]¶

Method to serialize the model.

Parameters:

output_filename (string) – Output file
compress (boolean) – if true, compress the file using zip
save_data (boolean) – if true, it serializes the training data (self.X and self.Y)

to_dict(save_data=True)[source]¶

Store the object into a json serializable dictionary

Parameters:	save_data (boolean) – if true, it adds the data self.X and self.Y to the dictionary
Return dict:	json serializable dictionary containing the needed information to instantiate the object

class SparseGPClassificationUncertainInput(X, X_variance, Y, kernel=None, Z=None, num_inducing=10, Y_metadata=None, normalizer=None)[source]¶

Bases: GPy.core.sparse_gp.SparseGP

Sparse Gaussian Process model for classification with uncertain inputs.

This is a thin wrapper around the sparse_GP class, with a set of sensible defaults

Parameters:

X (np.ndarray (num_data x input_dim)) – input observations
X_variance (np.ndarray (num_data x input_dim)) – The uncertainty in the measurements of X (Gaussian variance, optional)
Y – observed values
kernel – a GPy kernel, defaults to rbf+white
Z (np.ndarray (num_inducing x input_dim) | None) – inducing inputs (optional, see note)
num_inducing (int) – number of inducing points (ignored if Z is passed, see note)

Return type:

model object

Note

If no Z array is passed, num_inducing (default 10) points are selected from the data. Otherwise num_inducing is ignored.

Note

Multiple independent outputs are allowed using columns of Y

parameters_changed()[source]¶: Method that is called upon any changes to Param variables within the model. In particular in the GP class this method re-performs inference, recalculating the posterior and log marginal likelihood and gradients of the model

Warning

This method is not designed to be called manually. The framework is set up to call it automatically upon changes to parameters; if you call it yourself, there may be unexpected consequences.

GPy.models.sparse_gp_coregionalized_regression module¶

class SparseGPCoregionalizedRegression(X_list, Y_list, Z_list=[], kernel=None, likelihoods_list=None, num_inducing=10, X_variance=None, name='SGPCR', W_rank=1, kernel_name='coreg')[source]¶

Bases: GPy.core.sparse_gp.SparseGP

Sparse Gaussian Process model for heteroscedastic multioutput regression

This is a thin wrapper around the SparseGP class, with a set of sensible defaults

Parameters:

X_list (list of numpy arrays) – list of input observations corresponding to each output
Y_list (list of numpy arrays) – list of observed values related to the different noise models
Z_list (empty list | list of numpy arrays) – list of inducing inputs (optional)
kernel (None | GPy.kernel defaults) – a GPy kernel ** Coregionalized, defaults to RBF ** Coregionalized
num_inducing (integer | list of integers) – number of inducing inputs, defaults to 10 per output (ignored if Z_list is not empty)
name (string) – model name
W_rank (integer) – number tuples of the coregionalization parameters ‘W’ (see coregionalize kernel documentation)
kernel_name (string) – name of the kernel

likelihoods_list – a list of likelihoods, defaults to list of Gaussian likelihoods
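
One common way to hand list-valued data to a single multi-output kernel is to stack the inputs and append a column holding the output index, mirroring the [X:values, X:index] layout described for IBPLFM on this page. The helper below is a hypothetical NumPy stand-in for illustration; whether this model builds its inputs exactly this way is an assumption.

```python
import numpy as np

def stack_with_output_index(X_list, Y_list):
    """Stack per-output data and append the output index as the last input column."""
    X = np.vstack([
        np.hstack([Xi, np.full((Xi.shape[0], 1), i)])
        for i, Xi in enumerate(X_list)
    ])
    Y = np.vstack(Y_list)
    index = X[:, -1].astype(int)
    return X, Y, index

X1 = np.array([[0.0], [1.0], [2.0]])
X2 = np.array([[0.5], [1.5]])
Y1 = np.sin(X1)
Y2 = np.cos(X2)
X, Y, index = stack_with_output_index([X1, X2], [Y1, Y2])
```

Each output may be observed at a different number of inputs; the index column is what tells the kernel and the per-output noise models which rows belong together.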

GPy.models.sparse_gp_minibatch module¶

class SparseGPMiniBatch(X, Y, Z, kernel, likelihood, inference_method=None, name='sparse gp', Y_metadata=None, normalizer=False, missing_data=False, stochastic=False, batchsize=1)[source]¶

Bases: GPy.core.sparse_gp.SparseGP

A general purpose Sparse GP model, allowing missing data and stochastics across dimensions.

This model allows (approximate) inference using variational DTC or FITC (Gaussian likelihoods) as well as non-conjugate sparse methods based on these.

Parameters:

X (np.ndarray (num_data x input_dim)) – inputs
likelihood (GPy.likelihood.(Gaussian | EP | Laplace)) – a likelihood instance, containing the observed data
kernel (a GPy.kern.kern instance) – the kernel (covariance function). See link kernels
X_variance (np.ndarray (num_data x input_dim) | None) – The uncertainty in the measurements of X (Gaussian variance)
Z (np.ndarray (num_inducing x input_dim)) – inducing inputs
num_inducing (int) – Number of inducing points (optional, default 10. Ignored if Z is not None)

has_uncertain_inputs()[source]¶

optimize(optimizer=None, start=None, **kwargs)[source]¶

Optimize the model using self.log_likelihood and self.log_likelihood_gradient, as well as self.priors. kwargs are passed to the optimizer. They can be:

Parameters:

max_iters (int) – maximum number of function evaluations
messages (bool) – whether to display during optimisation
optimizer (string) – which optimizer to use (defaults to self.preferred optimizer). A range of optimisers can be found in GPy.inference.optimization; they include ‘scg’, ‘lbfgs’, ‘tnc’.
ipython_notebook (bool) – whether to use ipython notebook widgets or not.
clear_after_finish (bool) – if in ipython notebook, we can clear the widgets after optimization.

parameters_changed()[source]¶: Method that is called upon any changes to Param variables within the model. In particular in the GP class this method re-performs inference, recalculating the posterior and log marginal likelihood and gradients of the model

Warning

This method is not designed to be called manually. The framework is set up to call it automatically upon changes to parameters; if you call it yourself, there may be unexpected consequences.

GPy.models.sparse_gp_regression module¶

class SparseGPRegression(X, Y, kernel=None, Z=None, num_inducing=10, X_variance=None, mean_function=None, normalizer=None, mpi_comm=None, name='sparse_gp')[source]¶

Bases: GPy.core.sparse_gp_mpi.SparseGP_MPI

Gaussian Process model for regression

This is a thin wrapper around the SparseGP class, with a set of sensible defaults

Parameters:

X – input observations
X_variance – input uncertainties, one per input X
Y – observed values
kernel – a GPy kernel, defaults to rbf+white
Z (np.ndarray (num_inducing x input_dim) | None) – inducing inputs (optional, see note)
num_inducing (int) – number of inducing points (ignored if Z is passed, see note)

Return type:	model object

Note

If no Z array is passed, num_inducing (default 10) points are selected from the data. Otherwise num_inducing is ignored.

Note

Multiple independent outputs are allowed using columns of Y
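
The first note above can be sketched as follows: when Z is missing, pick num_inducing rows of X as the initial inducing inputs. Choosing them uniformly at random is an assumption of this sketch, and init_inducing_inputs is a hypothetical helper, not a GPy function.

```python
import numpy as np

def init_inducing_inputs(X, Z=None, num_inducing=10, seed=0):
    """Return Z if given, otherwise a random subset of the rows of X."""
    if Z is not None:
        return np.asarray(Z)         # num_inducing is ignored in this case
    rng = np.random.default_rng(seed)
    chosen = rng.permutation(X.shape[0])[:num_inducing]
    return X[chosen].copy()

X = np.random.default_rng(1).standard_normal((50, 2))
Z = init_inducing_inputs(X)
```

Copying the selected rows keeps later updates to the inducing inputs from modifying the training data.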

parameters_changed()[source]¶: Method that is called upon any changes to Param variables within the model. In particular in the GP class this method re-performs inference, recalculating the posterior and log marginal likelihood and gradients of the model

Warning

This method is not designed to be called manually. The framework is set up to call it automatically upon changes to parameters; if you call it yourself, there may be unexpected consequences.

GPy.models.sparse_gp_regression_md module¶

class SparseGPRegressionMD(X, Y, indexD, kernel=None, Z=None, num_inducing=10, normalizer=None, mpi_comm=None, individual_Y_noise=False, name='sparse_gp')[source]¶

Bases: GPy.core.sparse_gp_mpi.SparseGP_MPI

Sparse Gaussian Process Regression with Missing Data

This model targets the use case in which there are multiple output dimensions (different dimensions are assumed to be independent, following the same GP prior) and each output dimension is observed at a different set of inputs. The model takes a different data format: the input and output observations of all the output dimensions are stacked together, correspondingly, into two matrices. An extra array indicates the index of the output dimension for each data point. The output dimensions are indexed using integers from 0 to D-1, assuming there are D output dimensions.

Parameters:

X (numpy.ndarray) – input observations.
Y (numpy.ndarray) – output observations, each column corresponding to an output dimension.
indexD (numpy.ndarray) – the array containing the index of output dimension for each data point
kernel (GPy.kern.Kern or None) – a GPy kernel for GP of individual output dimensions ** defaults to RBF **
Z (numpy.ndarray or None) – inducing inputs
num_inducing ((int, int)) – a tuple (M, Mr). M is the number of inducing points for GP of individual output dimensions. Mr is the number of inducing points for the latent space.
individual_Y_noise (boolean) – whether individual output dimensions have their own noise variance or not, boolean
name (str) – the name of the model
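
The stacked data format described above can be assembled with NumPy as below. The per-output arrays are made-up example data, and treating Y as a single stacked column is an assumption based on the class description.

```python
import numpy as np

# Two output dimensions observed at different inputs (made-up data).
X_per_output = [np.array([[0.0], [1.0], [2.0]]), np.array([[0.5], [1.5]])]
Y_per_output = [np.array([[0.1], [0.9], [2.1]]), np.array([[-0.4], [-1.6]])]

# Stack inputs and outputs in the same order.
X = np.vstack(X_per_output)
Y = np.vstack(Y_per_output)

# indexD holds, for every stacked row, the output dimension (0 .. D-1) it belongs to.
indexD = np.concatenate([
    np.full(len(Xd), d, dtype=int) for d, Xd in enumerate(X_per_output)
])
```

Selecting the rows of one output dimension is then a matter of masking, for example Y[indexD == 1].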

parameters_changed()[source]¶: Method that is called upon any changes to Param variables within the model. In particular in the GP class this method re-performs inference, recalculating the posterior and log marginal likelihood and gradients of the model

Warning

This method is not designed to be called manually. The framework is set up to call it automatically upon changes to parameters; if you call it yourself, there may be unexpected consequences.

GPy.models.sparse_gplvm module¶

class SparseGPLVM(Y, input_dim, X=None, kernel=None, init='PCA', num_inducing=10)[source]¶

Bases: GPy.models.sparse_gp_regression.SparseGPRegression

Sparse Gaussian Process Latent Variable Model

Parameters:

Y (np.ndarray) – observed data
input_dim (int) – latent dimensionality
init ('PCA'|'random') – initialisation method for the latent space

parameters_changed()[source]¶: Method that is called upon any changes to Param variables within the model. In particular in the GP class this method re-performs inference, recalculating the posterior and log marginal likelihood and gradients of the model

Warning

This method is not designed to be called manually. The framework is set up to call it automatically upon changes to parameters; if you call it yourself, there may be unexpected consequences.

plot_latent(labels=None, which_indices=None, resolution=50, ax=None, marker='o', s=40, fignum=None, plot_inducing=True, legend=True, plot_limits=None, aspect='auto', updates=False, predict_kwargs={}, imshow_kwargs={})[source]¶

Convenience function for plotting the fit of a GP. This is the same as plot, except it plots the latent function fit of the GP!

If you want fine-grained control, use the specific plotting functions supplied in the model.

You can deactivate the legend for this one plot by supplying None to label.

Give the Y_metadata in the predict_kw if you need it.

Parameters:

plot_limits (np.array) – The limits of the plot. If 1D [xmin,xmax], if 2D [[xmin,ymin],[xmax,ymax]]. Defaults to data limits
fixed_inputs (a list of tuples) – a list of tuple [(i,v), (i,v)…], specifying that input dimension i should be set to value v.
resolution (int) – The resolution of the prediction [default:200]
apply_link (bool) – whether to apply the link function of the GP to the raw prediction.
which_data_ycols ('all' or a list of integers) – when the data has several columns (independent outputs), only plot these
which_data_rows ('all' or a slice object to slice self.X, self.Y) – which of the training data to plot (default all)
visible_dims (array-like) – an array specifying the input dimensions to plot (maximum two)
levels (int) – the number of levels in the density (number bigger than 1, where 35 is smooth and 1 is the same as plot_confidence). You can go higher than 50 if the result is not smooth enough for you.
samples (int) – the number of samples to draw from the GP and plot into the plot. These will always be samples from the latent function.
lower (float) – the lower percentile to plot
upper (float) – the upper percentile to plot
plot_data (bool) – plot the data into the plot?
plot_inducing (bool) – plot inducing inputs?
plot_density (bool) – plot density instead of the confidence interval?
predict_kw (dict) – the keyword arguments for the prediction. If you want to plot a specific kernel give dict(kern=<specific kernel>) in here
error_kwargs (dict) – kwargs for the error plot for the plotting library you are using
plot_kwargs (kwargs) – kwargs for the data plot for the plotting library you are using

GPy.models.ss_gplvm module¶

class IBPPosterior(means, variances, binary_prob, tau=None, sharedX=False, name='latent space')[source]¶

Bases: GPy.core.parameterization.variational.SpikeAndSlabPosterior

The SpikeAndSlab distribution for variational approximations.

binary_prob : the probability of the distribution on the slab part.

set_gradients(grad)[source]¶

class IBPPrior(input_dim, alpha=2.0, name='IBPPrior', **kw)[source]¶

Bases: GPy.core.parameterization.variational.VariationalPrior

KL_divergence(variational_posterior)[source]¶

update_gradients_KL(variational_posterior)[source]¶: updates the gradients for mean and variance in place

class SLVMPosterior(means, variances, binary_prob, tau=None, name='latent space')[source]¶

Bases: GPy.core.parameterization.variational.SpikeAndSlabPosterior

The SpikeAndSlab distribution for variational approximations.

binary_prob : the probability of the distribution on the slab part.

set_gradients(grad)[source]¶

class SLVMPrior(input_dim, alpha=1.0, beta=1.0, Z=None, name='SLVMPrior', **kw)[source]¶

Bases: GPy.core.parameterization.variational.VariationalPrior

KL_divergence(variational_posterior)[source]¶

update_gradients_KL(variational_posterior)[source]¶: updates the gradients for mean and variance in place

class SSGPLVM(Y, input_dim, X=None, X_variance=None, Gamma=None, init='PCA', num_inducing=10, Z=None, kernel=None, inference_method=None, likelihood=None, name='Spike_and_Slab GPLVM', group_spike=False, IBP=False, SLVM=False, alpha=2.0, beta=2.0, connM=None, tau=None, mpi_comm=None, pi=None, learnPi=False, normalizer=False, sharedX=False, variational_prior=None, **kwargs)[source]¶

Bases: GPy.core.sparse_gp_mpi.SparseGP_MPI

Spike-and-Slab Gaussian Process Latent Variable Model

Parameters:	Y (np.ndarray\| GPy.likelihood instance) – observed data (np.ndarray) or GPy.likelihood input_dim (int) – latent dimensionality init ('PCA'\|'random') – initialisation method for the latent space

get_X_gradients(X)[source]¶: Get the gradients of the posterior distribution of X in its specific form.

input_sensitivity()[source]¶: Returns the sensitivity for each dimension of this model

parameters_changed()[source]¶: Method that is called upon any changes to Param variables within the model. In particular, in the GP class this method re-performs inference, recalculating the posterior, log marginal likelihood and gradients of the model.

Warning

This method is not designed to be called manually. The framework is set up to call it automatically upon changes to parameters; if you call this method yourself, there may be unexpected consequences.

plot_inducing(which_indices=None, legend=False, plot_limits=None, marker=None, projection='2d', **kwargs)¶

Plot a scatter plot of the inducing inputs.

Parameters:

which_indices ([int]) – which input dimensions to plot against each other
legend (bool) – whether to plot the legend on the figure
plot_limits ((xmin, xmax, ymin, ymax) or ((xmin, xmax), (ymin, ymax))) – the plot limits for the plot
marker (str) – marker to use [default is custom arrow like]
kwargs – the kwargs for the scatter plots
projection (str) – for now 2d or 3d projection (other projections can be implemented, see developer documentation)

plot_latent(labels=None, which_indices=None, resolution=60, legend=True, plot_limits=None, updates=False, kern=None, marker='<>^vsd', num_samples=1000, projection='2d', scatter_kwargs=None, **imshow_kwargs)¶

Plot the latent space of the GP on the inputs. This is the density of the GP posterior as a grey scale and the scatter plot of the input dimensions selected by which_indices.

Parameters:

labels (array-like) – a label for each data point (row) of the inputs
which_indices ((int, int)) – which input dimensions to plot against each other
resolution (int) – the resolution at which we predict the magnification factor
legend (bool) – whether to plot the legend on the figure
plot_limits ((xmin, xmax, ymin, ymax) or ((xmin, xmax), (ymin, ymax))) – the plot limits for the plot
updates (bool) – if possible, make interactive updates using the specific library you are using
kern (Kern) – the kernel to use for prediction
marker (str) – markers to use - cycle if more labels than markers are given
num_samples (int) – the maximum number of samples to plot. We do a stratified subsample from the labels if the number of samples (in X) is higher than num_samples.
imshow_kwargs – the kwargs for the imshow (magnification factor)
scatter_kwargs – the kwargs for the scatter plots
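
The stratified subsample mentioned for num_samples can be sketched in plain Python. This is only an illustration of the idea (keep roughly the same label proportions while capping the number of plotted points); the helper name stratified_subsample and its details are hypothetical and are not taken from GPy's plotting code.

```python
import random
from collections import defaultdict


def stratified_subsample(labels, num_samples, seed=0):
    """Return sorted row indices, keeping label proportions roughly intact.

    Hypothetical helper for illustration only; not GPy's implementation.
    """
    n = len(labels)
    if n <= num_samples:
        # Nothing to do: all points can be plotted.
        return list(range(n))
    rng = random.Random(seed)
    by_label = defaultdict(list)
    for i, lab in enumerate(labels):
        by_label[lab].append(i)
    chosen = []
    for idx in by_label.values():
        # Proportional share of the budget, but at least one point per label.
        k = max(1, (len(idx) * num_samples) // n)
        chosen.extend(rng.sample(idx, k))
    return sorted(chosen)


labels = ["a"] * 90 + ["b"] * 10
picked = stratified_subsample(labels, 10)
```

Because every label keeps at least one point, the result can slightly exceed num_samples when there are many rare labels; a stricter cap would need an extra trimming step.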

plot_scatter(labels=None, which_indices=None, legend=True, plot_limits=None, marker='<>^vsd', num_samples=1000, projection='2d', **kwargs)¶

Plot a scatter plot of the latent space.

Parameters:

labels (array-like) – a label for each data point (row) of the inputs
which_indices ((int, int)) – which input dimensions to plot against each other
legend (bool) – whether to plot the legend on the figure
plot_limits ((xmin, xmax, ymin, ymax) or ((xmin, xmax), (ymin, ymax))) – the plot limits for the plot
marker (str) – markers to use - cycle if more labels than markers are given
kwargs – the kwargs for the scatter plots

plot_steepest_gradient_map(output_labels=None, data_labels=None, which_indices=None, resolution=15, legend=True, plot_limits=None, updates=False, kern=None, marker='<>^vsd', num_samples=1000, annotation_kwargs=None, scatter_kwargs=None, **imshow_kwargs)¶

Plot the latent space of the GP on the inputs. This is the density of the GP posterior as a grey scale and the scatter plot of the input dimensions selected by which_indices.

Parameters:

labels (array-like) – a label for each data point (row) of the inputs
which_indices ((int, int)) – which input dimensions to plot against each other
resolution (int) – the resolution at which we predict the magnification factor
legend (bool) – whether to plot the legend on the figure, if int plot legend columns on legend
plot_limits ((xmin, xmax, ymin, ymax) or ((xmin, xmax), (ymin, ymax))) – the plot limits for the plot
updates (bool) – if possible, make interactive updates using the specific library you are using
kern (Kern) – the kernel to use for prediction
marker (str) – markers to use - cycle if more labels than markers are given
num_samples (int) – the maximum number of samples to plot. We do a stratified subsample from the labels if the number of samples (in X) is higher than num_samples.
imshow_kwargs – the kwargs for the imshow (magnification factor)
annotation_kwargs – the kwargs for the annotation plot
scatter_kwargs – the kwargs for the scatter plots

sample_W(nSamples, raw_samples=False)[source]¶: Sample the loading matrix if the kernel is linear.

set_X_gradients(X, X_grad)[source]¶: Set the gradients of the posterior distribution of X in its specific form.

GPy.models.ss_mrd module¶

The Manifold Relevance Determination model with the spike-and-slab prior

class IBPPrior_SSMRD(nModels, input_dim, alpha=2.0, tau=None, name='IBPPrior', **kw)[source]¶

Bases: GPy.core.parameterization.variational.VariationalPrior

KL_divergence(variational_posterior)[source]¶

update_gradients_KL(variational_posterior)[source]¶: updates the gradients for mean and variance in place

class SSMRD(Ylist, input_dim, X=None, X_variance=None, Gammas=None, initx='PCA_concat', initz='permute', num_inducing=10, Zs=None, kernels=None, inference_methods=None, likelihoods=None, group_spike=True, pi=0.5, name='ss_mrd', Ynames=None, mpi_comm=None, IBP=False, alpha=2.0, taus=None)[source]¶

Bases: GPy.core.model.Model

log_likelihood()[source]¶

optimize(optimizer=None, start=None, **kwargs)[source]¶

Optimize the model using self.log_likelihood and self.log_likelihood_gradient, as well as self.priors.

kwargs are passed to the optimizer. They can be:

Parameters:	max_iters (int) – maximum number of function evaluations optimizer (string) – which optimizer to use (defaults to self.preferred optimizer)
Messages:	True: Display messages during optimisation, “ipython_notebook”:

Valid optimizers are:

‘scg’: scaled conjugate gradient method, recommended for stability.

See also GPy.inference.optimization.scg
‘fmin_tnc’: truncated Newton method (see scipy.optimize.fmin_tnc)
‘simplex’: the Nelder-Mead simplex method (see scipy.optimize.fmin),
‘lbfgsb’: the l-bfgs-b method (see scipy.optimize.fmin_l_bfgs_b),
‘lbfgs’: the bfgs method (see scipy.optimize.fmin_bfgs),
‘sgd’: stochastic gradient descent (see scipy.optimize.sgd). For experts only!

parameters_changed()[source]¶: This method gets called when parameters have changed. Another way of listening to param changes is to add self as a listener to the param, such that updates get passed through. See paramz.param.Observable.add_observer.
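
The listener mechanism described above can be illustrated with a toy observer pattern. The classes ToyParam and ToyModel below are hypothetical and deliberately simplified; in particular, the toy add_observer takes only a callback and does not reproduce the signature of the paramz method of the same name.

```python
class ToyParam:
    """A value that notifies registered callbacks whenever it changes."""

    def __init__(self, value):
        self._value = value
        self._observers = []

    def add_observer(self, callback):
        # Toy signature: the real paramz API is not reproduced here.
        self._observers.append(callback)

    @property
    def value(self):
        return self._value

    @value.setter
    def value(self, new_value):
        self._value = new_value
        for callback in self._observers:
            callback()


class ToyModel:
    """Recomputes a cached quantity each time its parameter changes."""

    def __init__(self):
        self.variance = ToyParam(1.0)
        self.n_updates = 0
        self.cached = None
        self.variance.add_observer(self.parameters_changed)
        self.parameters_changed()  # initial computation

    def parameters_changed(self):
        self.n_updates += 1
        self.cached = 2.0 * self.variance.value  # stand-in for inference


model = ToyModel()
model.variance.value = 3.0  # triggers parameters_changed automatically
```

The user never calls parameters_changed directly in this sketch; assigning to the parameter is enough to refresh the cached quantity.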

optimizer_array¶

Array for the optimizer to work on. This array always lives in the space for the optimizer. Thus, it is untransformed, going from Transformations.

Setting this array will make sure the transformed parameters for this model are set accordingly. It has to be set with an array retrieved from this method, as e.g. fixing will resize the array.

The optimizer should only interact with this array, so that transformations are secured.
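
To make the distinction between the optimizer space and the model's parameter space concrete, here is a minimal sketch with a single positive parameter. The softplus mapping and the function names to_constrained and to_unconstrained are assumptions chosen for illustration; the transformations actually attached to a given model may differ.

```python
import math


def to_constrained(x):
    """Map an unconstrained optimizer value to a positive parameter (softplus)."""
    return math.log1p(math.exp(x))


def to_unconstrained(theta):
    """Inverse of to_constrained, valid for theta > 0."""
    return math.log(math.expm1(theta))


variance = 2.0                      # value as seen by the model
x = to_unconstrained(variance)      # value as seen by the optimizer
x_new = x - 5.0                     # any step is allowed in optimizer space
variance_new = to_constrained(x_new)  # still a valid, positive variance
```

Whatever step the optimizer takes in the unconstrained space, the mapped parameter stays positive, which is why the optimizer should only ever touch the untransformed array.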

class SpikeAndSlabPrior_SSMRD(nModels, pi=0.5, learnPi=False, group_spike=True, variance=1.0, name='SSMRDPrior', **kw)[source]¶

Bases: GPy.core.parameterization.variational.SpikeAndSlabPrior

KL_divergence(variational_posterior)[source]¶

update_gradients_KL(variational_posterior)[source]¶: updates the gradients for mean and variance in place

GPy.models.state_space module¶

GPy.models.state_space_cython module¶

GPy.models.state_space_main module¶

Main functionality for state-space inference.

class AddMethodToClass(func=None, tp='staticmethod')[source]¶

Bases: object

func: function to add tp: string Type of the method: normal, staticmethod, classmethod

class ContDescrStateSpace[source]¶

Bases: GPy.models.state_space_main.DescreteStateSpace

Class for continuous-discrete Kalman filter. State equation is continuous while measurement equation is discrete.

d x(t)/dt = F x(t) + L q; where q ~ N(0, Qc)
y_{t_k} = H_{k} x_{t_k} + r_{k}; r_{k-1} ~ N(0, R_{k})

class AQcompute_batch_Python(F, L, Qc, dt, compute_derivatives=False, grad_params_no=None, P_inf=None, dP_inf=None, dF=None, dQc=None)[source]¶

Bases: GPy.models.state_space_main.Q_handling_Python

Class for calculating matrices A, Q, dA, dQ of the discrete Kalman Filter from the matrices F, L, Qc, P_inf, dF, dQc, dP_inf of the continuous state equation. dt - time steps.

It has the same interface as AQcompute_once.

It computes matrices for all time steps. This object is used when there are not many different time steps (the limit is controlled by an internal variable) and storing all the matrices does not take too much memory.

Since all the matrices are computed together, this object can be used in the smoother without repeating the computations.

Constructor. All necessary parameters are passed here and stored in the object.

F, L, Qc, P_inf: matrices: Parameters of the corresponding continuous state model.
dt: array: All time steps.
compute_derivatives: bool: Whether to calculate derivatives.
dP_inf, dF, dQc: 3D array: Derivatives, if they are required.

Returns: Nothing

Ak(k, m, P)[source]¶: Function (k, m, P) that returns the Jacobian of the dynamic function; it is passed into p_a.

k: iteration number (starts at 0). m: point where the Jacobian is evaluated. P: parameter for the Jacobian, usually the covariance matrix.

Q_inverse(k, p_largest_cond_num, p_regularization_type)[source]¶

Function inverts Q matrix and regularizes the inverse. Regularization is useful when original matrix is badly conditioned. Function is currently used only in SparseGP code.

k: int Iteration number.

p_largest_cond_num: float Largest condition number for the inverted matrix. If the condition number is smaller than that, no regularization happens.

p_regularization_type: int (1 or 2) Regularization type.

type 1: 1/(S[k] + regularizer), where the regularizer is computed internally
type 2: S[k]/(S^2[k] + regularizer), where the regularizer is computed internally
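
The two regularization formulas can be compared on a small example. The sketch below assumes that S holds the singular values of Q and takes the regularizer as an explicit argument; how the regularizer is actually derived from p_largest_cond_num is not shown in this documentation, so that step is left out. The function name regularized_inverse_values is hypothetical.

```python
def regularized_inverse_values(S, regularizer, reg_type):
    """Apply the type 1 or type 2 formula to each (assumed) singular value."""
    if reg_type == 1:
        return [1.0 / (s + regularizer) for s in S]
    if reg_type == 2:
        return [s / (s * s + regularizer) for s in S]
    raise ValueError("reg_type must be 1 or 2")


S = [4.0, 1e-8]                      # badly conditioned: ratio of 4e8
plain = [1.0 / s for s in S]         # unregularized inverse explodes
type1 = regularized_inverse_values(S, 1e-2, 1)
type2 = regularized_inverse_values(S, 1e-2, 2)
```

Both variants leave the large singular value almost untouched. Type 1 caps the contribution of the tiny singular value near 1/regularizer, while type 2 damps it towards zero.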

Q_srk(k)[source]¶: Square root of the noise matrix Q

Qk(k)[source]¶: Function (k). Returns the noise matrix of the dynamic model on iteration k. k: iteration number, starts at 0.

dAk(k)[source]¶: Function (k). Returns the derivative of A on iteration k. k: iteration number, starts at 0.

dQk(k)[source]¶: Function (k). Returns the derivative of Q on iteration k. k: iteration number, starts at 0.

f_a(k, m, A)[source]¶: Dynamic model

reset(compute_derivatives=False)[source]¶: For reusing this object e.g. in smoother computation. It makes sense because the necessary matrices have already been computed for all time steps.

return_last()[source]¶: Function returns last available matrices.

class AQcompute_once(F, L, Qc, dt, compute_derivatives=False, grad_params_no=None, P_inf=None, dP_inf=None, dF=None, dQc=None)[source]¶

Bases: GPy.models.state_space_main.Q_handling_Python

Class for calculating matrices A, Q, dA, dQ of the discrete Kalman Filter from the matrices F, L, Qc, P_inf, dF, dQc, dP_inf of the continuous state equation. dt - time steps.

It has the same interface as AQcompute_batch.

It computes matrices for only one time step. This object is used when there are many different time steps and storing matrices for each of them would take too much memory.

Constructor. All necessary parameters are passed here and stored in the object.

F, L, Qc, P_inf: matrices: Parameters of the corresponding continuous state model.
dt: array: All time steps.
compute_derivatives: bool: Whether to calculate derivatives.
dP_inf, dF, dQc: 3D array: Derivatives, if they are required.

Returns: Nothing

Ak(k, m, P)[source]¶: Function (k, m, P) that returns the Jacobian of the dynamic function; it is passed into p_a.

k: iteration number (starts at 0). m: point where the Jacobian is evaluated. P: parameter for the Jacobian, usually the covariance matrix.

Q_inverse(k, p_largest_cond_num, p_regularization_type)[source]¶

Function inverts Q matrix and regularizes the inverse. Regularization is useful when original matrix is badly conditioned. Function is currently used only in SparseGP code.

k: int Iteration number.

p_largest_cond_num: float Largest condition number for the inverted matrix. If the condition number is smaller than that, no regularization happens.

p_regularization_type: int (1 or 2) Regularization type.

type 1: 1/(S[k] + regularizer), where the regularizer is computed internally
type 2: S[k]/(S^2[k] + regularizer), where the regularizer is computed internally

Q_srk(k)[source]¶: Square root of the noise matrix Q. Note: the square root should be checked; rewriting it for spectral decomposition may be needed.

Qk(k)[source]¶: Function (k). Returns the noise matrix of the dynamic model on iteration k. k: iteration number, starts at 0.

dAk(k)[source]¶: Function (k). Returns the derivative of A on iteration k. k: iteration number, starts at 0.

dQk(k)[source]¶: Function (k). Returns the derivative of Q on iteration k. k: iteration number, starts at 0.

f_a(k, m, A)[source]¶: Dynamic model

reset(compute_derivatives)[source]¶: For reusing this object e.g. in smoother computation. Actually, this object can not be reused because it computes the matrices on every iteration. But this method is written for keeping the same interface with the class AQcompute_batch.

return_last()[source]¶: Function returns last computed matrices.

classmethod cont_discr_kalman_filter(F, L, Qc, p_H, p_R, P_inf, X, Y, index=None, m_init=None, P_init=None, p_kalman_filter_type='regular', calc_log_likelihood=False, calc_grad_log_likelihood=False, grad_params_no=0, grad_calc_params=None)[source]¶

This function implements the continuous-discrete Kalman Filter algorithm. These notations for the State-Space model are assumed:

d/dt x(t) = F * x(t) + L * w(t); w(t) ~ N(0, Qc)
y_{k} = H_{k} * x_{k} + r_{k}; r_{k-1} ~ N(0, R_{k})

Returns estimated filter distributions x_{k} ~ N(m_{k}, P(k))

1) The function generally does not modify the passed parameters. If it happens then it is an error. There are several exceptions: scalars can be modified into a matrix, and in some rare cases shapes of the derivative matrices may be changed; this is ignored for now.

2) Copies of F, L, Qc are created in memory because they may be used later in the smoother. References to the copies are kept in the “AQcomp” object return parameter.

3) The function supports “multiple time series mode”, which means that exactly the same State-Space model is used to filter several sets of measurements. In this case the third dimension of Y should include these state-space measurements. Log_likelihood and Grad_log_likelihood then have the corresponding dimensions.

4) Calculation of Grad_log_likelihood is not supported if matrices H or R change over time (with index k). (later may be changed)

5) Measurements may include missing values. In this case the update step is not done for this measurement. (later may be changed)

F: (state_dim, state_dim) matrix: F in the model.
L: (state_dim, noise_dim) matrix: L in the model.
Qc: (noise_dim, noise_dim) matrix: Q_c in the model.
p_H: scalar, matrix (measurement_dim, state_dim) , 3D array: H_{k} in the model. If matrix then H_{k} = H - constant. If it is 3D array then H_{k} = p_H[:,:, index[0,k]]
p_R: scalar, square symmetric matrix, 3D array: R_{k} in the model. If matrix then R_{k} = R - constant. If it is 3D array then R_{k} = p_R[:,:, index[1,k]]
P_inf: (state_dim, state_dim) matrix: State variance matrix at infinity.
X: 1D array: Time points of measurements. Needed for converting the continuous problem to the discrete one.
Y: matrix or vector or 3D array: Data. If Y is matrix then samples are along 0-th dimension and features along the 1-st. If 3D array then the third dimension corresponds to “multiple time series mode”.
index: vector: Which indices (on the 3-rd dimension) from arrays p_H, p_R to use on every time step. If this parameter is None then it is assumed that p_H, p_R do not change over time and indices are not needed. index[0,:] - corresponds to H, index[1,:] - corresponds to R. If index.shape[0] == 1, it is assumed that indices for all matrices are the same.
m_init: vector or matrix: Initial distribution mean. If None it is assumed to be zero. For “multiple time series mode” it is a matrix, the second dimension of which corresponds to different time series. In the regular case (“one time series mode”) it is a vector.
P_init: square symmetric matrix or scalar: Initial covariance of the states. If the parameter is scalar then it is assumed that the initial covariance matrix is the unit matrix multiplied by this scalar. If None the unit matrix is used instead. “multiple time series mode” does not affect it, since it does not affect anything related to state variances.
p_kalman_filter_type: string, one of (‘regular’, ‘svd’): Which Kalman Filter is used. Regular or SVD. SVD is more numerically stable; in particular, covariance matrices are guaranteed to be positive semi-definite. However, ‘svd’ works slower, especially for small data, due to SVD call overhead.
calc_log_likelihood: boolean: Whether to calculate marginal likelihood of the state-space model.
calc_grad_log_likelihood: boolean: Whether to calculate gradient of the marginal likelihood of the state-space model. If true then “grad_calc_params” parameter must provide the extra parameters for gradient calculation.
grad_params_no: int: If the previous parameter is true, then this parameter gives the total number of parameters in the gradient.
grad_calc_params: dictionary: Dictionary with derivatives of model matrices with respect to parameters “dF”, “dL”, “dQc”, “dH”, “dR”, “dm_init”, “dP_init”. They can be None; in this case zero matrices (no dependence on parameters) are assumed. If there is only one parameter then the third dimension is automatically added.

M: (no_steps+1,state_dim) matrix or (no_steps+1,state_dim, time_series_no) 3D array: Filter estimates of the state means. In the extra step the initial value is included. In the “multiple time series mode” the third dimension corresponds to different time series.
P: (no_steps+1, state_dim, state_dim) 3D array: Filter estimates of the state covariances. In the extra step the initial value is included.
log_likelihood: double or (1, time_series_no) 3D array: If the parameter calc_log_likelihood was set to true, return the logarithm of the marginal likelihood of the state-space model. If the parameter was false, return None. In the “multiple time series mode” it is a vector providing log_likelihood for each time series.
grad_log_likelihood: column vector or (grad_params_no, time_series_no) matrix: If calc_grad_log_likelihood is true, return the gradient of the log likelihood with respect to parameters. It is returned column-wise, so in “multiple time series mode” the gradients for each time series are in the corresponding column.
AQcomp: object: Contains some pre-computed values for converting the continuous model into the discrete one. It can be used later in the smoothing phase.

classmethod cont_discr_rts_smoother(state_dim, filter_means, filter_covars, p_dynamic_callables=None, X=None, F=None, L=None, Qc=None)[source]¶

Continuous-discrete Rauch–Tung–Striebel (RTS) smoother.

This function implements the Rauch–Tung–Striebel (RTS) smoother algorithm based on the results of _cont_discr_kalman_filter_raw.

Model:
d/dt x(t) = F * x(t) + L * w(t); w(t) ~ N(0, Qc)
y_{k} = H_{k} * x_{k} + r_{k}; r_{k-1} ~ N(0, R_{k})

Returns estimated smoother distributions x_{k} ~ N(m_{k}, P(k))

filter_means: (no_steps+1,state_dim) matrix or (no_steps+1,state_dim, time_series_no) 3D array: Results of the Kalman Filter means estimation.
filter_covars: (no_steps+1, state_dim, state_dim) 3D array: Results of the Kalman Filter covariance estimation.
p_dynamic_callables: object or None: Object from the filter phase which provides functions for computing A, Q, dA, dQ for the discrete model from the continuous model.
X, F, L, Qc: matrices: If p_dynamic_callables is None, these matrices are used to create this object from scratch.

M: (no_steps+1,state_dim) matrix: Smoothed estimates of the state means
P: (no_steps+1,state_dim, state_dim) 3D array: Smoothed estimates of the state covariances
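
As an illustration of the backward recursion behind an RTS smoother, here is a minimal scalar sketch. It assumes a one-dimensional state and constant discrete-time A and Q; the continuous-discrete case would instead use a different A and Q for each time gap. The function name rts_smoother_scalar is hypothetical and this is not GPy's implementation.

```python
def rts_smoother_scalar(M, P, A, Q):
    """Scalar RTS smoother over filter means M and variances P.

    M and P have no_steps + 1 entries (the initial value is included).
    """
    n = len(M) - 1
    Ms, Ps = list(M), list(P)      # the last entries are already smoothed
    for k in range(n - 1, -1, -1):
        P_pred = A * P[k] * A + Q  # one-step prediction variance
        G = P[k] * A / P_pred      # smoother gain
        Ms[k] = M[k] + G * (Ms[k + 1] - A * M[k])
        Ps[k] = P[k] + G * (Ps[k + 1] - P_pred) * G
    return Ms, Ps


# Filter output for a constant state (A=1, Q=0) after one measurement.
Ms, Ps = rts_smoother_scalar([0.0, 0.5], [1.0, 0.5], A=1.0, Q=0.0)
```

With a constant state and no process noise, the smoothed initial estimate simply inherits the final filtered estimate, which is a quick sanity check for the recursion.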

static lti_sde_to_descrete(F, L, Qc, dt, compute_derivatives=False, grad_params_no=None, P_inf=None, dP_inf=None, dF=None, dQc=None)[source]¶

Linear Time-Invariant Stochastic Differential Equation (LTI SDE):

dx(t) = F x(t) dt + L d eta, where

x(t): (vector) stochastic process
eta: (vector) Brownian motion process
F, L: (time invariant) matrices of corresponding dimensions
Qc: covariance of noise.

This function rewrites it into the corresponding state-space form:

x_{k} = A_{k} * x_{k-1} + q_{k-1}; q_{k-1} ~ N(0, Q_{k-1})

TODO: this function can be redone to “preprocess dataset”, when close time points are handled properly (with a rounding parameter) and values are averaged accordingly.

F,L: LTI SDE matrices of corresponding dimensions

Qc: matrix (n,n): Covariance between different dimensions of noise eta. n is the dimensionality of the noise.
dt: double or iterable: Time difference used on this iteration. If dt is iterable, then A and Q_noise are computed for every unique dt
compute_derivatives: boolean: Whether derivatives of A and Q are required.
grad_params_no: int: Number of gradient parameters

P_inf: (state_dim, state_dim) matrix

dP_inf

dF: 3D array: Derivatives of F
dQc: 3D array: Derivatives of Qc
dR: 3D array: Derivatives of R

A: matrix: A_{k}. Because we have an LTI SDE, only dt can make the matrix differ for different k.
Q_noise: matrix: Covariance matrix of (vector) q_{k-1}. Only dt can make the matrix differ for different k.
reconstruct_index: array: If dt was iterable, three-dimensional arrays A and Q_noise are returned. The third dimension of these arrays corresponds to the unique dt’s. reconstruct_index contains the indices of the original dt’s in the unique dt sequence. A[:,:, reconstruct_index[5]] is matrix A of the 6-th (indices start from zero) dt in the original sequence.
dA: 3D array: Derivatives of A
dQ: 3D array: Derivatives of Q
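
For a one-dimensional state the discretization has a closed form, which makes the roles of A, Q_noise and reconstruct_index easy to see. The sketch below assumes scalar F (nonzero), L and Qc and uses A = exp(F dt) and Q = L Qc L (exp(2 F dt) - 1) / (2 F); it illustrates the standard result rather than the matrix code in GPy, and the function name is hypothetical.

```python
import math


def discretize_scalar_lti_sde(F, L, Qc, dts):
    """Discretize dx = F x dt + L d eta for every unique time step in dts.

    Scalar sketch only; F must be nonzero.
    """
    unique_dts = sorted(set(dts))
    A = [math.exp(F * dt) for dt in unique_dts]
    Q = [L * Qc * L * math.expm1(2.0 * F * dt) / (2.0 * F) for dt in unique_dts]
    # Position of each original dt within the unique dt sequence.
    reconstruct_index = [unique_dts.index(dt) for dt in dts]
    return A, Q, reconstruct_index


A, Q, idx = discretize_scalar_lti_sde(F=-0.5, L=1.0, Qc=2.0, dts=[1.0, 2.0, 1.0, 1.0])
# A[idx[k]] and Q[idx[k]] are the transition and noise terms for the k-th step.
```

Only two distinct time steps occur here, so only two (A, Q) pairs are computed and the index list maps each of the four original steps back to them.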

class DescreteStateSpace[source]¶

Bases: object

This class implements state-space inference for linear and non-linear state-space models. Linear models are:
x_{k} = A_{k} * x_{k-1} + q_{k-1}; q_{k-1} ~ N(0, Q_{k-1})
y_{k} = H_{k} * x_{k} + r_{k}; r_{k-1} ~ N(0, R_{k})

Nonlinear:
x_{k} = f_a(k, x_{k-1}, A_{k}) + q_{k-1}; q_{k-1} ~ N(0, Q_{k-1})
y_{k} = f_h(k, x_{k}, H_{k}) + r_{k}; r_{k-1} ~ N(0, R_{k})
Here f_a and f_h are some functions of k (iteration number), x_{k-1} or x_{k} (state value on certain iteration), A_{k} and H_{k} - Jacobian matrices of f_a and f_h respectively. In the linear case they are exactly A_{k} and H_{k}.

Currently two nonlinear Gaussian filter algorithms are implemented: the Extended Kalman Filter (EKF) and the Statistically Linearized Filter (SLF), whose implementations are very similar.

classmethod extended_kalman_filter(p_state_dim, p_a, p_f_A, p_f_Q, p_h, p_f_H, p_f_R, Y, m_init=None, P_init=None, calc_log_likelihood=False)[source]¶

Extended Kalman Filter

p_state_dim: integer

p_a: if None - the function from the linear model is assumed. No nonlinearity in the dynamic is assumed.

function (k, x_{k-1}, A_{k}). Dynamic function. k: (iteration number), x_{k-1}: (previous state), A_{k}: Jacobian matrices of f_a. In the linear case it is exactly A_{k}.

p_f_A: matrix - in this case a function which returns this matrix is assumed.

Look at this parameter description in kalman_filter function.

function (k, m, P) returns the Jacobian of the dynamic function; it is passed into p_a.

k: (iteration number), m: point where the Jacobian is evaluated, P: parameter for the Jacobian, usually the covariance matrix.

p_f_Q: matrix. In this case a function which returns this matrix is assumed.

Look at this parameter description in kalman_filter function.

function (k). Returns noise matrix of dynamic model on iteration k. k: (iteration number).

p_h: if None - the function from the linear measurement model is assumed.

No nonlinearity in the measurement is assumed.

function (k, x_{k}, H_{k}). Measurement function. k: (iteration number), x_{k}: (current state) H_{k}: Jacobian matrices of f_h. In the linear case it is exactly H_{k}.

p_f_H: matrix - in this case a function which returns this matrix is assumed.

function (k, m, P) returns the Jacobian of the measurement function; it is passed into p_h. k: (iteration number), m: point where the Jacobian is evaluated, P: parameter for the Jacobian, usually the covariance matrix.

p_f_R: matrix. In this case a function which returns this matrix is assumed.

function (k). Returns noise matrix of measurement equation on iteration k. k: (iteration number).

Y: matrix or vector

Data. If Y is matrix then samples are along 0-th dimension and features along the 1-st. May have missing values.

m_init: vector

Initial distribution mean. If None it is assumed to be zero

P_init: square symmetric matrix or scalar

Initial covariance of the states. If the parameter is scalar then it is assumed that initial covariance matrix is unit matrix multiplied by this scalar. If None the unit matrix is used instead.

calc_log_likelihood: boolean

Whether to calculate marginal likelihood of the state-space model.
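
A single predict/update cycle of an extended Kalman filter can be sketched for a scalar state as follows. The callables f, df, h and dh (dynamic function, its derivative, measurement function, its derivative) use a simplified one-argument signature chosen for this illustration; they are not the (k, x, A) callables described above, and the function name ekf_step_scalar is hypothetical.

```python
import math


def ekf_step_scalar(m, P, y, f, df, h, dh, Q, R):
    """One EKF predict/update cycle for a scalar state."""
    # Prediction: linearize the dynamics around the current mean.
    A = df(m)
    m_pred = f(m)
    P_pred = A * P * A + Q
    # Update: linearize the measurement around the predicted mean.
    H = dh(m_pred)
    v = y - h(m_pred)              # innovation
    S = H * P_pred * H + R         # innovation variance
    K = P_pred * H / S             # Kalman gain
    return m_pred + K * v, P_pred - K * S * K


# Nonlinear example: sinusoidal dynamics, cubic measurement.
m1, P1 = ekf_step_scalar(0.5, 1.0, 0.3, math.sin, math.cos,
                         lambda x: x ** 3 + x, lambda x: 3 * x * x + 1,
                         Q=0.1, R=0.5)

# Linear example: the step reduces to the ordinary Kalman filter.
m_lin, P_lin = ekf_step_scalar(0.0, 1.0, 1.0,
                               lambda x: x, lambda x: 1.0,
                               lambda x: x, lambda x: 1.0,
                               Q=0.0, R=1.0)
```

The linear call shows the point made in the class description: when the functions are linear, their Jacobians are exactly A_{k} and H_{k} and the EKF coincides with the basic filter.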

classmethod kalman_filter(p_A, p_Q, p_H, p_R, Y, index=None, m_init=None, P_init=None, p_kalman_filter_type='regular', calc_log_likelihood=False, calc_grad_log_likelihood=False, grad_params_no=None, grad_calc_params=None)[source]¶

This function implements the basic Kalman Filter algorithm. These notations for the State-Space model are assumed:

x_{k} = A_{k} * x_{k-1} + q_{k-1}; q_{k-1} ~ N(0, Q_{k-1})
y_{k} = H_{k} * x_{k} + r_{k}; r_{k-1} ~ N(0, R_{k})

Returns estimated filter distributions x_{k} ~ N(m_{k}, P(k))

1) The function generally does not modify the passed parameters. If it happens then it is an error. There are several exceptions: scalars can be modified into a matrix, and in some rare cases shapes of the derivative matrices may be changed; this is ignored for now.

2) Copies of p_A, p_Q, index are created in memory to be used later in the smoother. References to the copies are kept in the “matrs_for_smoother” return parameter.

3) The function supports “multiple time series mode”, which means that exactly the same State-Space model is used to filter several sets of measurements. In this case the third dimension of Y should include these state-space measurements. Log_likelihood and Grad_log_likelihood then have the corresponding dimensions.

4) Calculation of Grad_log_likelihood is not supported if matrices A, Q, H, or R change over time. (later may be changed)

5) Measurements may include missing values. In this case the update step is not done for this measurement. (later may be changed)

p_A: scalar, square matrix, 3D array: A_{k} in the model. If matrix then A_{k} = A - constant. If it is 3D array then A_{k} = p_A[:,:, index[0,k]]
p_Q: scalar, square symmetric matrix, 3D array: Q_{k-1} in the model. If matrix then Q_{k-1} = Q - constant. If it is 3D array then Q_{k-1} = p_Q[:,:, index[1,k]]
p_H: scalar, matrix (measurement_dim, state_dim) , 3D array: H_{k} in the model. If matrix then H_{k} = H - constant. If it is 3D array then H_{k} = p_H[:,:, index[2,k]]
p_R: scalar, square symmetric matrix, 3D array: R_{k} in the model. If matrix then R_{k} = R - constant. If it is 3D array then R_{k} = p_R[:,:, index[3,k]]
Y: matrix or vector or 3D array: Data. If Y is matrix then samples are along 0-th dimension and features along the 1-st. If 3D array then the third dimension corresponds to “multiple time series mode”.
index: vector: Which indices (on the 3-rd dimension) from arrays p_A, p_Q, p_H, p_R to use on every time step. If this parameter is None then it is assumed that p_A, p_Q, p_H, p_R do not change over time and indices are not needed. index[0,:] - corresponds to A, index[1,:] - corresponds to Q, index[2,:] - corresponds to H, index[3,:] - corresponds to R. If index.shape[0] == 1, it is assumed that indices for all matrices are the same.
m_init: vector or matrix: Initial distribution mean. If None it is assumed to be zero. For “multiple time series mode” it is a matrix, the second dimension of which corresponds to different time series. In the regular case (“one time series mode”) it is a vector.
P_init: square symmetric matrix or scalar: Initial covariance of the states. If the parameter is scalar then it is assumed that the initial covariance matrix is the unit matrix multiplied by this scalar. If None the unit matrix is used instead. “multiple time series mode” does not affect it, since it does not affect anything related to state variances.
calc_log_likelihood: boolean: Whether to calculate marginal likelihood of the state-space model.
calc_grad_log_likelihood: boolean: Whether to calculate gradient of the marginal likelihood of the state-space model. If true then “grad_calc_params” parameter must provide the extra parameters for gradient calculation.
grad_params_no: int: If the previous parameter is true, then this parameter gives the total number of parameters in the gradient.
grad_calc_params: dictionary: Dictionary with derivatives of model matrices with respect to parameters “dA”, “dQ”, “dH”, “dR”, “dm_init”, “dP_init”. They can be None, in which case zero matrices (no dependence on parameters) are assumed. If there is only one parameter then the third dimension is added automatically.
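
The index convention above can be illustrated with a small sketch. It uses plain Python lists in place of 3D arrays (each list holds the candidate matrices, here scalars for brevity), and row r of index picks which candidate is used at step k. The helper name select_params and the list-of-candidates layout are assumptions made for this example; they are not part of GPy.

```python
def select_params(stacks, index, k):
    """Return (A_k, Q_k, H_k, R_k) for time step k.

    stacks: dict with keys "A", "Q", "H", "R"; each value is a list of
            candidates (a stand-in for the 3rd axis of p_A, p_Q, p_H, p_R).
    index:  None, a single row, or four rows in (A, Q, H, R) order.
    """
    names = ("A", "Q", "H", "R")
    if index is None:
        # Time-invariant model: always use the first (only) candidate.
        return tuple(stacks[n][0] for n in names)
    if len(index) == 1:
        # One shared row of indices for all four matrices.
        rows = [index[0]] * 4
    else:
        rows = index
    return tuple(stacks[n][rows[r][k]] for r, n in enumerate(names))
```

With four rows, each matrix can switch independently between its candidates; with a single row, all four switch together.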

M: (no_steps+1,state_dim) matrix or (no_steps+1,state_dim, time_series_no) 3D array: Filter estimates of the state means. In the extra step the initial value is included. In the “multiple time series mode” the third dimension corresponds to different time series.
P: (no_steps+1, state_dim, state_dim) 3D array: Filter estimates of the state covariances. In the extra step the initial value is included.
log_likelihood: double or (1, time_series_no) array: If the parameter calc_log_likelihood was set to true, return the logarithm of the marginal likelihood of the state-space model. If the parameter was false, return None. In the “multiple time series mode” it is a vector providing the log_likelihood for each time series.
grad_log_likelihood: column vector or (grad_params_no, time_series_no) matrix: If calc_grad_log_likelihood is true, return the gradient of the log likelihood with respect to parameters. It is returned column-wise, so in “multiple time series mode” the gradient for each time series is in the corresponding column.
matrs_for_smoother: dict: Dictionary with model functions for the smoother. The intrinsic model functions are computed in this function and are returned for use in the smoother for convenience. They are: ‘p_a’, ‘p_f_A’, ‘p_f_Q’. The dictionary contains the same fields.
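
To make the recursion concrete, here is a minimal sketch of a linear Kalman filter for the scalar case (state_dim = measurement_dim = 1) with time-invariant A, Q, H and R. It follows the conventions described above: the initial value is stored as an extra first entry, and the update step is skipped for missing measurements. The function name and the use of None or NaN to mark missing values are assumptions for this illustration; it is not GPy's implementation.

```python
import math


def kalman_filter_scalar(A, Q, H, R, Y, m_init=0.0, P_init=1.0):
    """Scalar linear Kalman filter.

    Y may contain None (or NaN) for missing measurements; for those steps
    only the prediction is performed and the update is skipped.
    Returns (M, P, log_likelihood); M and P have len(Y) + 1 entries because
    the initial value is stored at position 0.
    """
    M, P = [m_init], [P_init]
    log_lik = 0.0
    for y in Y:
        # Prediction step: x_k = A x_{k-1} + q_{k-1}
        m_pred = A * M[-1]
        P_pred = A * P[-1] * A + Q
        if y is None or math.isnan(y):
            M.append(m_pred)
            P.append(P_pred)
            continue
        # Update step: y_k = H x_k + r_k
        v = y - H * m_pred        # innovation
        S = H * P_pred * H + R    # innovation variance
        K = P_pred * H / S        # Kalman gain
        M.append(m_pred + K * v)
        P.append(P_pred - K * S * K)
        # Gaussian log density of the innovation
        log_lik += -0.5 * (math.log(2.0 * math.pi * S) + v * v / S)
    return M, P, log_lik
```

The accumulated log density of the innovations is the log marginal likelihood of the observed (non-missing) measurements.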

classmethod rts_smoother(state_dim, p_dynamic_callables, filter_means, filter_covars)[source]¶

This function implements the Rauch–Tung–Striebel (RTS) smoother algorithm based on the results of kalman_filter_raw. The notation is the same:

x_{k} = A_{k} * x_{k-1} + q_{k-1}; q_{k-1} ~ N(0, Q_{k-1})
y_{k} = H_{k} * x_{k} + r_{k}; r_{k} ~ N(0, R_{k})

Returns estimated smoother distributions x_{k} ~ N(m_{k}, P_{k})

p_a: function (k, x_{k-1}, A_{k}): Dynamic function. k: iteration number, starts at 0. x_{k-1}: state from the previous step. A_{k}: Jacobian matrix of f_a; in the linear case it is exactly A_{k}.
p_f_A: function (k, m, P): Returns the Jacobian of the dynamic function; it is passed into p_a. k: iteration number, starts at 0. m: point where the Jacobian is evaluated. P: parameter for the Jacobian, usually the covariance matrix.
p_f_Q: function (k): Returns the noise matrix of the dynamic model on iteration k. k: iteration number, starts at 0.
filter_means: (no_steps+1,state_dim) matrix or (no_steps+1,state_dim, time_series_no) 3D array: Results of the Kalman Filter means estimation.
filter_covars: (no_steps+1, state_dim, state_dim) 3D array: Results of the Kalman Filter covariance estimation.

M: (no_steps+1, state_dim) matrix: Smoothed estimates of the state means
P: (no_steps+1, state_dim, state_dim) 3D array: Smoothed estimates of the state covariances
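
The backward pass can be sketched for the scalar, time-invariant case as follows. The inputs use the same layout as the filter outputs (no_steps+1 entries, initial value first). The function name and the restriction to constant scalar A and Q are assumptions for this illustration, not GPy's implementation.

```python
def rts_smoother_scalar(A, Q, filter_means, filter_covars):
    """Scalar RTS smoother for x_k = A x_{k-1} + q_{k-1}, q ~ N(0, Q).

    filter_means / filter_covars have no_steps + 1 entries (initial value
    first), matching the layout returned by the filter.
    """
    M = list(filter_means)
    P = list(filter_covars)
    # Backward pass: the last filtered estimate is already the smoothed one.
    for k in range(len(M) - 2, -1, -1):
        m_pred = A * filter_means[k]
        P_pred = A * filter_covars[k] * A + Q
        G = filter_covars[k] * A / P_pred  # smoother gain
        M[k] = filter_means[k] + G * (M[k + 1] - m_pred)
        P[k] = filter_covars[k] + G * (P[k + 1] - P_pred) * G
    return M, P
```

With A = 1 and Q = 0 the state is constant, so every smoothed estimate equals the final filtered estimate, which is a convenient sanity check.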

class DescreteStateSpaceMeta[source]¶

Bases: type

Substitute necessary methods from cython.

After this method the class object is created

Dynamic_Callables_Class¶: alias of GPy.models.state_space_main.Dynamic_Callables_Python

class Dynamic_Callables_Python[source]¶

Bases: object

Ak(k, m, P)[source]¶: Returns the Jacobian of the dynamic function; it is passed into p_a.

k: iteration number, starts at 0. m: point where the Jacobian is evaluated. P: parameter for the Jacobian, usually the covariance matrix.

Q_srk(k)[source]¶

Returns the square root of the noise matrix of the dynamic model on iteration k.

k: iteration number, starts at 0.

This function is implemented to use the SVD prediction step.

Qk(k)[source]¶

Returns the noise matrix of the dynamic model on iteration k. k: iteration number, starts at 0.

dAk(k)[source]¶

Returns the derivative of A on iteration k. k: iteration number, starts at 0.

dQk(k)[source]¶

Returns the derivative of Q on iteration k. k: iteration number, starts at 0.

f_a(k, m, A)[source]¶

Dynamic function with arguments (k, x_{k-1}, A_{k}). k: iteration number, starts at 0. x_{k-1}: state from the previous step. A_{k}: Jacobian matrix of f_a; in the linear case it is exactly A_{k}.

reset(compute_derivatives=False)[source]¶: Return the state of this object to the beginning of iteration (to k eq. 0).
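
As a rough sketch of how such a callable interface might be used, the example below defines a hypothetical stand-in class with f_a, Ak and Qk for a scalar linear model, plus one prediction step written purely against those three methods. The class LinearDynamicCallables and the function predict_step are invented for this illustration; they do not subclass or call anything in GPy.

```python
class LinearDynamicCallables:
    """Hypothetical stand-in exposing f_a, Ak and Qk for a linear model."""

    def __init__(self, A_seq, Q_seq):
        self.A_seq = A_seq  # A_k for k = 0, 1, ...
        self.Q_seq = Q_seq  # Q_k for k = 0, 1, ...

    def f_a(self, k, m, A):
        # Dynamic function; in the linear case it is just A_k * m.
        return A * m

    def Ak(self, k, m, P):
        # Jacobian of f_a at m; it does not depend on m for a linear model.
        return self.A_seq[k]

    def Qk(self, k):
        # Noise variance of the dynamic model on iteration k.
        return self.Q_seq[k]


def predict_step(k, m, P, callables):
    """One prediction step written only against the callable interface."""
    A = callables.Ak(k, m, P)
    m_pred = callables.f_a(k, m, A)
    P_pred = A * P * A + callables.Qk(k)
    return m_pred, P_pred
```

Keeping the filter code dependent only on these methods is what allows the same loop to serve time-varying or nonlinear models by swapping the callable object.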

Measurement_Callables_Class¶: alias of GPy.models.state_space_main.Measurement_Callables_Python

class Measurement_Callables_Python[source]¶

Bases: object

Hk(k, m_pred, P_pred)[source]¶

Returns the Jacobian of the measurement function; it is passed into p_h. k: iteration number, starts at 0. m: point where the Jacobian is evaluated. P: parameter for the Jacobian, usually the covariance matrix.

R_isrk(k)[source]¶

Returns the inverse square root of the noise matrix of the measurement equation on iteration k. k: iteration number, starts at 0.

This function is implemented to use the SVD prediction step.

Rk(k)[source]¶

Returns the noise matrix of the measurement equation on iteration k. k: iteration number, starts at 0.

dHk(k)[source]¶

Returns the derivative of H on iteration k. k: iteration number, starts at 0.

dRk(k)[source]¶

Returns the derivative of R on iteration k. k: iteration number, starts at 0.

f_h(k, m_pred, Hk)[source]¶

Measurement function with arguments (k, x_{k}, H_{k}). k: iteration number, starts at 0. x_{k}: state. H_{k}: Jacobian matrix of f_h; in the linear case it is exactly H_{k}.

reset(compute_derivatives=False)[source]¶: Return the state of this object to the beginning of iteration (to k eq. 0)

Q_handling_Class¶: alias of GPy.models.state_space_main.Q_handling_Python

class Q_handling_Python(Q, index, Q_time_var_index, unique_Q_number, dQ=None)[source]¶

Bases: GPy.models.state_space_main.Dynamic_Callables_Python

Q - array with noise on various steps. The result of preprocessing the noise input.
index - for each step of the Kalman filter, contains the corresponding index in the array.
Q_time_var_index - another index in the array Q. Computed earlier and passed here.
unique_Q_number - number of unique noise matrices below which square roots are cached and above which they are computed each time.
dQ: 3D array[:, :, param_num]: derivative of Q. Derivative is supported only when Q does not change over time

Object which has two necessary functions:: f_R(k) inv_R_square_root(k)

Q_srk(k)[source]¶

Returns the square root of the noise matrix of the dynamic model on iteration k.

k: iteration number, starts at 0.

This function is implemented to use the SVD prediction step.

Qk(k)[source]¶

Returns the noise matrix of the dynamic model on iteration k. k: iteration number, starts at 0.

dQk(k)[source]¶: Returns the derivative of Q on iteration k. k: iteration number, starts at 0.

R_handling_Class¶: alias of GPy.models.state_space_main.R_handling_Python

class R_handling_Python(R, index, R_time_var_index, unique_R_number, dR=None)[source]¶

Bases: GPy.models.state_space_main.Measurement_Callables_Python

The class handles noise matrix R.

R - array with noise on various steps. The result of preprocessing the noise input.
index - for each step of the Kalman filter, contains the corresponding index in the array.
R_time_var_index - another index in the array R. Computed earlier and passed here.
unique_R_number - number of unique noise matrices below which square roots are cached and above which they are computed each time.
dR: 3D array[:, :, param_num]: derivative of R. Derivative is supported only when R does not change over time

Object which has two necessary functions: f_R(k), inv_R_square_root(k)

R_isrk(k)[source]¶: Function returns the inverse square root of R matrix on step k.

Rk(k)[source]¶: Returns the noise matrix of the measurement equation on iteration k. k: iteration number, starts at 0.

dRk(k)[source]¶: Returns the derivative of R on iteration k. k: iteration number, starts at 0.
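
The caching rule described for unique_R_number can be sketched as below for scalar noise variances. The class name, the strict "fewer than" comparison and the computed counter are assumptions made for this illustration; GPy's actual class works on matrices and may apply the threshold differently.

```python
import math


class CachedInverseSqrt:
    """Hypothetical scalar analogue of the caching described above."""

    def __init__(self, R, index, unique_R_number):
        self.R = R          # list of distinct noise variances
        self.index = index  # index[k] -> position in R used at step k
        # Cache only when the number of distinct values is small enough.
        self.use_cache = len(R) < unique_R_number
        self.cache = {}
        self.computed = 0   # counts actual square-root evaluations

    def Rk(self, k):
        # Noise variance used on iteration k.
        return self.R[self.index[k]]

    def R_isrk(self, k):
        # Inverse square root of the noise variance on iteration k.
        i = self.index[k]
        if self.use_cache and i in self.cache:
            return self.cache[i]
        self.computed += 1
        value = 1.0 / math.sqrt(self.R[i])
        if self.use_cache:
            self.cache[i] = value
        return value
```

When only a few distinct noise values recur over many time steps, caching avoids repeating the same factorisation; when almost every step has its own value, the cache would only consume memory.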

Std_Dynamic_Callables_Class¶: alias of GPy.models.state_space_main.Std_Dynamic_Callables_Python

class Std_Dynamic_Callables_Python(A, A_time_var_index, Q, index, Q_time_var_index, unique_Q_number, dA=None, dQ=None)[source]¶

Bases: GPy.models.state_space_main.Q_handling_Python

Ak(k, m_pred, P_pred)[source]¶

Returns the Jacobian of the dynamic function; it is passed into p_a. k: iteration number, starts at 0. m: point where the Jacobian is evaluated. P: parameter for the Jacobian, usually the covariance matrix.

dAk(k)[source]¶: Returns the derivative of A on iteration k. k: iteration number, starts at 0.

f_a(k, m, A)[source]¶: Dynamic function with arguments (k, x_{k-1}, A_{k}). k: iteration number, starts at 0. x_{k-1}: state from the previous step. A_{k}: Jacobian matrix of f_a; in the linear case it is exactly A_{k}.

reset(compute_derivatives=False)[source]¶: Return the state of this object to the beginning of iteration (to k eq. 0)

Std_Measurement_Callables_Class¶: alias of GPy.models.state_space_main.Std_Measurement_Callables_Python

class Std_Measurement_Callables_Python(H, H_time_var_index, R, index, R_time_var_index, unique_R_number, dH=None, dR=None)[source]¶

Bases: GPy.models.state_space_main.R_handling_Python

Hk(k, m_pred, P_pred)[source]¶

Returns the Jacobian of the measurement function; it is passed into p_h. k: iteration number, starts at 0. m: point where the Jacobian is evaluated. P: parameter for the Jacobian, usually the covariance matrix.

dHk(k)[source]¶: Returns the derivative of H on iteration k. k: iteration number, starts at 0.

f_h(k, m, H)[source]¶

Measurement function with arguments (k, x_{k}, H_{k}). k: iteration number, starts at 0. x_{k}: state. H_{k}: Jacobian matrix of f_h; in the linear case it is exactly H_{k}.

class Struct[source]¶: Bases: object

balance_matrix(A)[source]¶

Balances a matrix, i.e. finds a similarity transformation of the original matrix A, A = T * bA * T^{-1}, such that the norms of the columns of bA and of the rows of bA are as close as possible. It is usually used as a preprocessing step in eigenvalue calculation routines. It is also useful for state-space models.

See also:

[1] Beresford N. Parlett and Christian Reinsch (1969). Balancing a matrix for calculation of eigenvalues and eigenvectors. Numerische Mathematik, 13(4): 293-304.

A: square matrix: Matrix to be balanced

bA: matrix: Balanced matrix
T: matrix: Left part of the similarity transformation
T_inv: matrix: Right part of the similarity transformation.
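
The idea of balancing can be sketched with a simplified diagonal scaling by powers of two, in the spirit of the cited Parlett and Reinsch paper. This is an illustration under stated assumptions, not GPy's implementation: it uses nested Python lists, omits the permutation step of the full algorithm, and returns T and T_inv as lists of diagonal entries rather than as matrices. The function name is hypothetical.

```python
def balance_matrix_sketch(A, radix=2.0, max_sweeps=100):
    """Diagonal balancing: returns (bA, T, T_inv) with A = T * bA * T_inv.

    T is diagonal and is returned as a list of its diagonal entries.
    """
    n = len(A)
    B = [list(row) for row in A]
    d = [1.0] * n
    for _ in range(max_sweeps):
        changed = False
        for i in range(n):
            # Off-diagonal column and row 1-norms for index i.
            c = sum(abs(B[j][i]) for j in range(n) if j != i)
            r = sum(abs(B[i][j]) for j in range(n) if j != i)
            if c == 0.0 or r == 0.0:
                continue
            s, f = c + r, 1.0
            # Find a power of the radix that brings c and r close together.
            while c < r / radix:
                f *= radix
                c *= radix * radix
            while c > r * radix:
                f /= radix
                c /= radix * radix
            # Apply the scaling only if it reduces the norms noticeably.
            if (c + r) / f < 0.95 * s:
                changed = True
                d[i] *= f
                for j in range(n):
                    B[i][j] /= f  # scale row i by 1/f
                for j in range(n):
                    B[j][i] *= f  # scale column i by f
        if not changed:
            break
    return B, d, [1.0 / x for x in d]
```

Because every scaling factor is a power of two, the transformation introduces no rounding error, and A can be recovered exactly as T * bA * T_inv.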

balance_ss_model(F, L, Qc, H, Pinf, P0, dF=None, dQc=None, dPinf=None, dP0=None)[source]¶

Balances the state-space model for better numerical stability.

This is based on the following:

dx/dt = F x + L w

y = H x

Let T z = x, which gives

dz/dt = inv(T) F T z + inv(T) L w

y = H T z
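
For reference, the substitution behind these equations can be written out step by step. This is a short derivation assuming T is a constant invertible matrix; the covariance line is the standard consequence of z = T^{-1} x and is added here for illustration.

```latex
\begin{aligned}
x = T z \;&\Rightarrow\; T \frac{dz}{dt} = F T z + L w \\
\frac{dz}{dt} &= T^{-1} F T \, z + T^{-1} L \, w \\
y &= H x = H T z \\
\operatorname{Cov}(z) &= T^{-1} \operatorname{Cov}(x) \, T^{-\top}
\end{aligned}
```

So the balanced model uses T^{-1} F T, T^{-1} L and H T in place of F, L and H, while the output y is unchanged.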

matrix_exponent(M)[source]¶: Computes the matrix exponential and handles some special cases.

GPy.models.state_space_model module¶

class StateSpace(X, Y, kernel=None, noise_var=1.0, kalman_filter_type='regular', use_cython=False, balance=False, name='StateSpace')[source]¶

Bases: GPy.core.model.Model

balance (bool): whether to balance the model as a whole.

log_likelihood()[source]¶

parameters_changed()[source]¶: Parameters have now changed

plot(plot_limits=None, fixed_inputs=None, resolution=None, plot_raw=False, apply_link=False, which_data_ycols='all', which_data_rows='all', visible_dims=None, levels=20, samples=0, samples_likelihood=0, lower=2.5, upper=97.5, plot_data=True, plot_inducing=True, plot_density=False, predict_kw=None, projection='2d', legend=True, **kwargs)¶

Convenience function for plotting the fit of a GP.

You can deactivate the legend for this one plot by supplying None to label.

Give the Y_metadata in the predict_kw if you need it.

If you want fine-grained control, use the specific plotting functions supplied in the model.

Parameters:

plot_limits (np.array) – The limits of the plot. If 1D [xmin,xmax], if 2D [[xmin,ymin],[xmax,ymax]]. Defaults to data limits
fixed_inputs (a list of tuples) – a list of tuple [(i,v), (i,v)…], specifying that input dimension i should be set to value v.
resolution (int) – The resolution of the prediction [default:200]
plot_raw (bool) – plot the latent function (usually denoted f) only?
apply_link (bool) – whether to apply the link function of the GP to the raw prediction.
which_data_ycols ('all' or a list of integers) – when the data has several columns (independent outputs), only plot these
which_data_rows ('all' or a slice object to slice self.X, self.Y) – which of the training data to plot (default all)
visible_dims (array-like) – which columns of the input X (!) to plot (array-like or list of ints)
levels (int) – the number of levels in the density (a number greater than 1, where 35 is smooth and 1 is the same as plot_confidence). You can go higher than 50 if the result is not smooth enough for you.
samples (int) – the number of samples to draw from the GP and plot into the plot. These will always be samples from the latent function.
samples_likelihood (int) – the number of samples to draw from the GP and apply the likelihood noise. This is usually not what you want!
lower (float) – the lower percentile to plot
upper (float) – the upper percentile to plot
plot_data (bool) – plot the data into the plot?
plot_inducing (bool) – plot inducing inputs?
plot_density (bool) – plot density instead of the confidence interval?
predict_kw (dict) – the keyword arguments for the prediction. If you want to plot a specific kernel give dict(kern=<specific kernel>) in here
projection ({2d|3d}) – plot in 2d or 3d?
legend (bool) – convenience, whether to put a legend on the plot or not.
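
The fixed_inputs convention can be illustrated with a small sketch that builds the inputs for a 1D slice through a higher-dimensional input space: one dimension varies over the plot limits while the dimensions named in fixed_inputs are held at their given values. The helper build_prediction_grid and its arguments are assumptions for this example and are not part of GPy's plotting code.

```python
def build_prediction_grid(input_dim, free_dim, limits, fixed_inputs, resolution=5):
    """Build rows of inputs where one dimension varies and the rest are fixed.

    fixed_inputs: list of (i, v) tuples, meaning input dimension i is set to v.
    limits:       (xmin, xmax) for the free dimension.
    """
    xmin, xmax = limits
    step = (xmax - xmin) / (resolution - 1)
    grid = []
    for n in range(resolution):
        row = [0.0] * input_dim
        row[free_dim] = xmin + n * step
        for i, v in fixed_inputs:
            row[i] = v
        grid.append(row)
    return grid
```

For a model with three input dimensions, fixed_inputs=[(1, 2.0), (2, -1.0)] leaves only dimension 0 free, so the resulting plot is one-dimensional.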

plot_confidence(lower=2.5, upper=97.5, plot_limits=None, fixed_inputs=None, resolution=None, plot_raw=False, apply_link=False, visible_dims=None, which_data_ycols='all', label='gp confidence', predict_kw=None, **kwargs)¶

Plot the confidence interval between the percentiles lower and upper. E.g. the 95% confidence interval is $2.5, 97.5$. Note: Only implemented for one dimension!

You can deactivate the legend for this one plot by supplying None to label.

Give the Y_metadata in the predict_kw if you need it.

Parameters:

lower (float) – the lower percentile to plot
upper (float) – the upper percentile to plot
plot_limits (np.array) – The limits of the plot. If 1D [xmin,xmax], if 2D [[xmin,ymin],[xmax,ymax]]. Defaults to data limits
fixed_inputs (a list of tuples) – a list of tuple [(i,v), (i,v)…], specifying that input dimension i should be set to value v.
resolution (int) – The resolution of the prediction [default:200]
plot_raw (bool) – plot the latent function (usually denoted f) only?
apply_link (bool) – whether to apply the link function of the GP to the raw prediction.
visible_dims (array-like) – which columns of the input X (!) to plot (array-like or list of ints)
which_data_ycols (array-like) – which columns of the output y (!) to plot (array-like or list of ints)
predict_kw (dict) – the keyword arguments for the prediction. If you want to plot a specific kernel give dict(kern=<specific kernel>) in here
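
To show how the lower and upper percentiles relate to an interval, the sketch below computes the bounds for a Gaussian predictive distribution with a given mean and variance, using only the Python standard library. The function name is hypothetical, and the Gaussian assumption is made for this example only; for non-Gaussian likelihoods the quantiles would be obtained differently.

```python
from statistics import NormalDist


def gaussian_interval(mean, variance, lower=2.5, upper=97.5):
    """Return the (lower, upper) percentile bounds of N(mean, variance)."""
    dist = NormalDist(mu=mean, sigma=variance ** 0.5)
    return dist.inv_cdf(lower / 100.0), dist.inv_cdf(upper / 100.0)
```

With the default percentiles the bounds lie roughly 1.96 standard deviations either side of the mean, which is the usual 95% interval.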

plot_data(which_data_rows='all', which_data_ycols='all', visible_dims=None, projection='2d', label=None, **plot_kwargs)¶

Plot the training data

For higher dimensions than two, use fixed_inputs to plot the data points with some of the inputs fixed.

Can plot only part of the data using which_data_rows and which_data_ycols.

Parameters:

which_data_rows ('all' or a slice object to slice self.X, self.Y) – which of the training data to plot (default all)
which_data_ycols ('all' or a list of integers) – when the data has several columns (independent outputs), only plot these
visible_dims (a numpy array) – an array specifying the input dimensions to plot (maximum two)
projection ({'2d','3d'}) – whether to plot in 2d or 3d. This only applies when plotting two dimensional inputs!
label (str) – the label for the plot
plot_kwargs (kwargs) – kwargs for the data plot for the plotting library you are using

Returns list:

of plots created.

plot_data_error(which_data_rows='all', which_data_ycols='all', visible_dims=None, projection='2d', label=None, **error_kwargs)¶

Plot the training data input error.

For higher dimensions than two, use fixed_inputs to plot the data points with some of the inputs fixed.

Can plot only part of the data using which_data_rows and which_data_ycols.

Parameters:

which_data_rows ('all' or a slice object to slice self.X, self.Y) – which of the training data to plot (default all)
which_data_ycols ('all' or a list of integers) – when the data has several columns (independent outputs), only plot these
visible_dims (a numpy array) – an array specifying the input dimensions to plot (maximum two)
projection ({'2d','3d'}) – whether to plot in 2d or 3d. This only applies when plotting two dimensional inputs!
error_kwargs (dict) – kwargs for the error plot for the plotting library you are using
label (str) – the label for the plot
plot_kwargs (kwargs) – kwargs for the data plot for the plotting library you are using

Returns list:

of plots created.

plot_density(plot_limits=None, fixed_inputs=None, resolution=None, plot_raw=False, apply_link=False, visible_dims=None, which_data_ycols='all', levels=35, label='gp density', predict_kw=None, **kwargs)¶

Plot the density of the GP's predictive distribution, drawn as a number of nested percentile intervals set by levels (with levels=1 this is the same as plot_confidence). Note: Only implemented for one dimension!

You can deactivate the legend for this one plot by supplying None to label.

Give the Y_metadata in the predict_kw if you need it.

Parameters:

plot_limits (np.array) – The limits of the plot. If 1D [xmin,xmax], if 2D [[xmin,ymin],[xmax,ymax]]. Defaults to data limits
fixed_inputs (a list of tuples) – a list of tuple [(i,v), (i,v)…], specifying that input dimension i should be set to value v.
resolution (int) – The resolution of the prediction [default:200]
plot_raw (bool) – plot the latent function (usually denoted f) only?
apply_link (bool) – whether to apply the link function of the GP to the raw prediction.
visible_dims (array-like) – which columns of the input X (!) to plot (array-like or list of ints)
which_data_ycols (array-like) – which columns of y to plot (array-like or list of ints)
levels (int) – the number of levels in the density (a number greater than 1, where 35 is smooth and 1 is the same as plot_confidence). You can go higher than 50 if the result is not smooth enough for you.
predict_kw (dict) – the keyword arguments for the prediction. If you want to plot a specific kernel give dict(kern=<specific kernel>) in here

plot_errorbars_trainset(which_data_rows='all', which_data_ycols='all', fixed_inputs=None, plot_raw=False, apply_link=False, label=None, projection='2d', predict_kw=None, **plot_kwargs)¶

Plot the errorbars of the GP likelihood on the training data. These are the errorbars after the appropriate approximations according to the likelihood are done.

This also works for heteroscedastic likelihoods.

Give the Y_metadata in the predict_kw if you need it.

Parameters:

which_data_rows ('all' or a slice object to slice self.X, self.Y) – which of the training data to plot (default all)
which_data_ycols – when the data has several columns (independent outputs), only plot these
fixed_inputs (a list of tuples) – a list of tuple [(i,v), (i,v)…], specifying that input dimension i should be set to value v.
predict_kwargs (dict) – kwargs for the prediction used to predict the right quantiles.
plot_kwargs (kwargs) – kwargs for the data plot for the plotting library you are using

plot_f(plot_limits=None, fixed_inputs=None, resolution=None, apply_link=False, which_data_ycols='all', which_data_rows='all', visible_dims=None, levels=20, samples=0, lower=2.5, upper=97.5, plot_density=False, plot_data=True, plot_inducing=True, projection='2d', legend=True, predict_kw=None, **kwargs)¶

Convenience function for plotting the fit of a GP. This is the same as plot, except it plots the latent function fit of the GP!

If you want fine-grained control, use the specific plotting functions supplied in the model.

You can deactivate the legend for this one plot by supplying None to label.

Give the Y_metadata in the predict_kw if you need it.

Parameters:

plot_limits (np.array) – The limits of the plot. If 1D [xmin,xmax], if 2D [[xmin,ymin],[xmax,ymax]]. Defaults to data limits
fixed_inputs (a list of tuples) – a list of tuple [(i,v), (i,v)…], specifying that input dimension i should be set to value v.
resolution (int) – The resolution of the prediction [default:200]
apply_link (bool) – whether to apply the link function of the GP to the raw prediction.
which_data_ycols ('all' or a list of integers) – when the data has several columns (independent outputs), only plot these
which_data_rows ('all' or a slice object to slice self.X, self.Y) – which of the training data to plot (default all)
visible_dims (array-like) – an array specifying the input dimensions to plot (maximum two)
levels (int) – the number of levels in the density (a number greater than 1, where 35 is smooth and 1 is the same as plot_confidence). You can go higher than 50 if the result is not smooth enough for you.
samples (int) – the number of samples to draw from the GP and plot into the plot. These will always be samples from the latent function.
lower (float) – the lower percentile to plot
upper (float) – the upper percentile to plot
plot_data (bool) – plot the data into the plot?
plot_inducing (bool) – plot inducing inputs?
plot_density (bool) – plot density instead of the confidence interval?
predict_kw (dict) – the keyword arguments for the prediction. If you want to plot a specific kernel give dict(kern=<specific kernel>) in here
error_kwargs (dict) – kwargs for the error plot for the plotting library you are using
plot_kwargs (kwargs) – kwargs for the data plot for the plotting library you are using

plot_latent(plot_limits=None, fixed_inputs=None, resolution=None, apply_link=False, which_data_ycols='all', which_data_rows='all', visible_dims=None, levels=20, samples=0, lower=2.5, upper=97.5, plot_density=False, plot_data=True, plot_inducing=True, projection='2d', legend=True, predict_kw=None, **kwargs)¶

Convenience function for plotting the fit of a GP. This is the same as plot, except it plots the latent function fit of the GP!

If you want fine-grained control, use the specific plotting functions supplied in the model.

You can deactivate the legend for this one plot by supplying None to label.

Give the Y_metadata in the predict_kw if you need it.

Parameters:

plot_limits (np.array) – The limits of the plot. If 1D [xmin,xmax], if 2D [[xmin,ymin],[xmax,ymax]]. Defaults to data limits
fixed_inputs (a list of tuples) – a list of tuple [(i,v), (i,v)…], specifying that input dimension i should be set to value v.
resolution (int) – The resolution of the prediction [default:200]
apply_link (bool) – whether to apply the link function of the GP to the raw prediction.
which_data_ycols ('all' or a list of integers) – when the data has several columns (independent outputs), only plot these
which_data_rows ('all' or a slice object to slice self.X, self.Y) – which of the training data to plot (default all)
visible_dims (array-like) – an array specifying the input dimensions to plot (maximum two)
levels (int) – the number of levels in the density (a number greater than 1, where 35 is smooth and 1 is the same as plot_confidence). You can go higher than 50 if the result is not smooth enough for you.
samples (int) – the number of samples to draw from the GP and plot into the plot. These will always be samples from the latent function.
lower (float) – the lower percentile to plot
upper (float) – the upper percentile to plot
plot_data (bool) – plot the data into the plot?
plot_inducing (bool) – plot inducing inputs?
plot_density (bool) – plot density instead of the confidence interval?
predict_kw (dict) – the keyword arguments for the prediction. If you want to plot a specific kernel give dict(kern=<specific kernel>) in here
error_kwargs (dict) – kwargs for the error plot for the plotting library you are using
plot_kwargs (kwargs) – kwargs for the data plot for the plotting library you are using

plot_mean(plot_limits=None, fixed_inputs=None, resolution=None, plot_raw=False, apply_link=False, visible_dims=None, which_data_ycols='all', levels=20, projection='2d', label='gp mean', predict_kw=None, **kwargs)¶

Plot the mean of the GP.

You can deactivate the legend for this one plot by supplying None to label.

Give the Y_metadata in the predict_kw if you need it.

Parameters:

plot_limits (np.array) – The limits of the plot. If 1D [xmin,xmax], if 2D [[xmin,ymin],[xmax,ymax]]. Defaults to data limits
fixed_inputs (a list of tuples) – a list of tuple [(i,v), (i,v)…], specifying that input dimension i should be set to value v.
resolution (int) – The resolution of the prediction [defaults are 1D:200, 2D:50]
plot_raw (bool) – plot the latent function (usually denoted f) only?
apply_link (bool) – whether to apply the link function of the GP to the raw prediction.
which_data_ycols (array-like) – which columns of y to plot (array-like or list of ints)
levels (int) – for 2D plotting, the number of contour levels to use is
projection ({'2d','3d'}) – whether to plot in 2d or 3d. This only applies when plotting two dimensional inputs!
label (str) – the label for the plot.
predict_kw (dict) – the keyword arguments for the prediction. If you want to plot a specific kernel give dict(kern=<specific kernel>) in here

plot_noiseless(plot_limits=None, fixed_inputs=None, resolution=None, apply_link=False, which_data_ycols='all', which_data_rows='all', visible_dims=None, levels=20, samples=0, lower=2.5, upper=97.5, plot_density=False, plot_data=True, plot_inducing=True, projection='2d', legend=True, predict_kw=None, **kwargs)¶

Convenience function for plotting the fit of a GP. This is the same as plot, except it plots the latent function fit of the GP!

If you want fine-grained control, use the specific plotting functions supplied in the model.

You can deactivate the legend for this one plot by supplying None to label.

Give the Y_metadata in the predict_kw if you need it.

Parameters:

plot_limits (np.array) – The limits of the plot. If 1D [xmin,xmax], if 2D [[xmin,ymin],[xmax,ymax]]. Defaults to data limits
fixed_inputs (a list of tuples) – a list of tuple [(i,v), (i,v)…], specifying that input dimension i should be set to value v.
resolution (int) – The resolution of the prediction [default:200]
apply_link (bool) – whether to apply the link function of the GP to the raw prediction.
which_data_ycols ('all' or a list of integers) – when the data has several columns (independent outputs), only plot these
which_data_rows ('all' or a slice object to slice self.X, self.Y) – which of the training data to plot (default all)
visible_dims (array-like) – an array specifying the input dimensions to plot (maximum two)
levels (int) – the number of levels in the density (a number greater than 1, where 35 is smooth and 1 is the same as plot_confidence). You can go higher than 50 if the result is not smooth enough for you.
samples (int) – the number of samples to draw from the GP and plot into the plot. These will always be samples from the latent function.
lower (float) – the lower percentile to plot
upper (float) – the upper percentile to plot
plot_data (bool) – plot the data into the plot?
plot_inducing (bool) – plot inducing inputs?
plot_density (bool) – plot density instead of the confidence interval?
predict_kw (dict) – the keyword arguments for the prediction. If you want to plot a specific kernel give dict(kern=<specific kernel>) in here
error_kwargs (dict) – kwargs for the error plot for the plotting library you are using
plot_kwargs (kwargs) – kwargs for the data plot for the plotting library you are using

plot_samples(plot_limits=None, fixed_inputs=None, resolution=None, plot_raw=True, apply_link=False, visible_dims=None, which_data_ycols='all', samples=3, projection='2d', label='gp_samples', predict_kw=None, **kwargs)¶

Plot samples drawn from the GP.

You can deactivate the legend for this one plot by supplying None to label.

Give the Y_metadata in the predict_kw if you need it.

Parameters:

plot_limits (np.array) – The limits of the plot. If 1D [xmin,xmax], if 2D [[xmin,ymin],[xmax,ymax]]. Defaults to data limits
fixed_inputs (a list of tuples) – a list of tuple [(i,v), (i,v)…], specifying that input dimension i should be set to value v.
resolution (int) – The resolution of the prediction [defaults are 1D:200, 2D:50]
plot_raw (bool) – plot the latent function (usually denoted f) only? This is usually what you want!
apply_link (bool) – whether to apply the link function of the GP to the raw prediction.
visible_dims (array-like) – which columns of the input X (!) to plot (array-like or list of ints)
which_data_ycols (array-like) – which columns of y to plot (array-like or list of ints)
predict_kw (dict) – the keyword arguments for the prediction. If you want to plot a specific kernel give dict(kern=<specific kernel>) in here
levels (int) – for 2D plotting, the number of contour levels to use is

predict(Xnew=None, filteronly=False, include_likelihood=True, balance=None, **kw)[source]¶: balance (bool): whether to balance the model as a whole.

predict_quantiles(Xnew=None, quantiles=(2.5, 97.5), balance=None, **kw)[source]¶: balance (bool): whether to balance the model as a whole.

GPy.models.state_space_setup module¶

This module is intended for the setup of the state_space_main module. It is needed because of the way the state_space_main module is connected with the Cython code.

GPy.models.tp_regression module¶

class TPRegression(X, Y, kernel=None, deg_free=5.0, normalizer=None, mean_function=None, name='TP regression')[source]¶

Bases: GPy.core.model.Model

Student-t Process model for regression, as presented in

Shah, A., Wilson, A. and Ghahramani, Z., 2014, April. Student-t processes as alternatives to Gaussian processes. In Artificial Intelligence and Statistics (pp. 877-885).

Parameters:

X – input observations
Y – observed values
kernel – a GPy kernel, defaults to rbf
deg_free – initial value for the degrees of freedom hyperparameter
normalizer (Norm) – [False] Normalize Y with the norm given. If normalizer is False, no normalization will be done. If it is None, we use GaussianNorm(alization).

Note

Multiple independent outputs are allowed using columns of Y

log_likelihood()[source]¶: The log marginal likelihood of the model, $p(\mathbf{y})$. This is the objective function of the model being optimised.
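
As an illustration of what a Student-t log marginal likelihood looks like, the sketch below evaluates the log density of a multivariate Student-t distribution with a diagonal covariance, using the parameterisation in which K is the covariance and the degrees of freedom exceed 2. The function name, the diagonal restriction and the choice of parameterisation are assumptions for this example; the model itself works with a full kernel matrix, and its internal computation is not shown here.

```python
import math


def student_t_log_density(y, mean, k_diag, deg_free):
    """Log density of a multivariate Student-t with diagonal covariance.

    y, mean, k_diag: equal-length lists; k_diag holds the diagonal of K.
    deg_free: degrees of freedom, must exceed 2 in this parameterisation.
    """
    n = len(y)
    # Squared Mahalanobis distance and log-determinant for a diagonal K.
    quad = sum((yi - mi) ** 2 / ki for yi, mi, ki in zip(y, mean, k_diag))
    log_det = sum(math.log(ki) for ki in k_diag)
    return (
        math.lgamma((deg_free + n) / 2.0)
        - math.lgamma(deg_free / 2.0)
        - 0.5 * n * math.log((deg_free - 2.0) * math.pi)
        - 0.5 * log_det
        - 0.5 * (deg_free + n) * math.log1p(quad / (deg_free - 2.0))
    )
```

As deg_free grows, this density approaches the Gaussian one with the same mean and covariance, which is why the degrees of freedom act as a tail-heaviness hyperparameter.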

parameters_changed()[source]¶: Method that is called upon any changes to Param variables within the model. In particular in this class this method re-performs inference, recalculating the posterior, log marginal likelihood and gradients of the model

Warning

This method is not designed to be called manually; the framework is set up to call it automatically upon changes to parameters. If you call this method yourself, there may be unexpected consequences.

plot(plot_limits=None, fixed_inputs=None, resolution=None, plot_raw=False, apply_link=False, which_data_ycols='all', which_data_rows='all', visible_dims=None, levels=20, samples=0, samples_likelihood=0, lower=2.5, upper=97.5, plot_data=True, plot_inducing=True, plot_density=False, predict_kw=None, projection='2d', legend=True, **kwargs)¶

Convenience function for plotting the fit of a GP.

You can deactivate the legend for this one plot by supplying None to label.

Give the Y_metadata in the predict_kw if you need it.

If you want fine-grained control, use the specific plotting functions supplied in the model.

Parameters:

plot_limits (np.array) – The limits of the plot. If 1D [xmin,xmax], if 2D [[xmin,ymin],[xmax,ymax]]. Defaults to data limits
fixed_inputs (a list of tuples) – a list of tuples [(i,v), (i,v)…], specifying that input dimension i should be set to value v.
resolution (int) – The resolution of the prediction [default:200]
plot_raw (bool) – plot the latent function (usually denoted f) only?
apply_link (bool) – whether to apply the link function of the GP to the raw prediction.
which_data_ycols ('all' or a list of integers) – when the data has several columns (independent outputs), only plot these
which_data_rows ('all' or a slice object to slice self.X, self.Y) – which of the training data to plot (default all)
visible_dims (array-like) – which columns of the input X (!) to plot (array-like or list of ints)
levels (int) – the number of levels in the density (a number bigger than 1, where 35 is smooth and 1 is the same as plot_confidence). You can go higher than 50 if the result is not smooth enough for you.
samples (int) – the number of samples to draw from the GP and plot into the plot. These will always be samples from the latent function.
samples_likelihood (int) – the number of samples to draw from the GP and apply the likelihood noise. This is usually not what you want!
lower (float) – the lower percentile to plot
upper (float) – the upper percentile to plot
plot_data (bool) – plot the data into the plot?
plot_inducing (bool) – plot inducing inputs?
plot_density (bool) – plot density instead of the confidence interval?
predict_kw (dict) – the keyword arguments for the prediction. If you want to plot a specific kernel give dict(kern=<specific kernel>) in here
projection ({2d|3d}) – plot in 2d or 3d?
legend (bool) – convenience, whether to put a legend on the plot or not.
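
The interaction of plot_limits, fixed_inputs and resolution can be pictured as building a grid of prediction inputs: the free input dimension is swept across the plot limits while every fixed dimension is held at its given value. The following is a minimal sketch of that idea for a single free dimension; make_prediction_grid is a hypothetical helper, not a GPy function, and GPy's own grid construction may differ in details such as padding of the limits.

```python
import numpy as np


def make_prediction_grid(X, plot_limits=None, fixed_inputs=None, resolution=200):
    """Build a grid with one free input dimension and the others held fixed."""
    X = np.asarray(X, dtype=float)
    fixed = dict(fixed_inputs or [])            # {dimension index: value}
    free_dims = [d for d in range(X.shape[1]) if d not in fixed]
    if len(free_dims) != 1:
        raise ValueError("this sketch handles exactly one free input dimension")
    free = free_dims[0]
    if plot_limits is None:
        # Assumption: fall back to the range of the training inputs.
        xmin, xmax = X[:, free].min(), X[:, free].max()
    else:
        xmin, xmax = plot_limits
    grid = np.empty((resolution, X.shape[1]))
    grid[:, free] = np.linspace(xmin, xmax, resolution)
    for dim, value in fixed.items():
        grid[:, dim] = value                    # input dimension i set to value v
    return grid


X_train = np.array([[0.0, 5.0], [1.0, 6.0], [2.0, 7.0]])
grid = make_prediction_grid(X_train, fixed_inputs=[(1, 6.0)], resolution=5)
```

Here the second input is pinned to 6.0 and the first is swept over five evenly spaced values between the smallest and largest training input.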

plot_confidence(lower=2.5, upper=97.5, plot_limits=None, fixed_inputs=None, resolution=None, plot_raw=False, apply_link=False, visible_dims=None, which_data_ycols='all', label='gp confidence', predict_kw=None, **kwargs)¶

Plot the confidence interval between the percentiles lower and upper. E.g. the 95% confidence interval is $2.5, 97.5$. Note: Only implemented for one dimension!

You can deactivate the legend for this one plot by supplying None to label.

Give the Y_metadata in the predict_kw if you need it.

Parameters:

lower (float) – the lower percentile to plot
upper (float) – the upper percentile to plot
plot_limits (np.array) – The limits of the plot. If 1D [xmin,xmax], if 2D [[xmin,ymin],[xmax,ymax]]. Defaults to data limits
fixed_inputs (a list of tuples) – a list of tuples [(i,v), (i,v)…], specifying that input dimension i should be set to value v.
resolution (int) – The resolution of the prediction [default:200]
plot_raw (bool) – plot the latent function (usually denoted f) only?
apply_link (bool) – whether to apply the link function of the GP to the raw prediction.
visible_dims (array-like) – which columns of the input X (!) to plot (array-like or list of ints)
which_data_ycols (array-like) – which columns of the output y (!) to plot (array-like or list of ints)
predict_kw (dict) – the keyword arguments for the prediction. If you want to plot a specific kernel give dict(kern=<specific kernel>) in here
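
The lower and upper arguments are percentiles, so the defaults 2.5 and 97.5 bound a central 95% interval. As a minimal sketch of how percentiles translate into interval bounds, the example below assumes a Gaussian predictive distribution with a given mean and variance; gaussian_interval is a hypothetical helper, and a Student-t predictive distribution would use Student-t quantiles instead.

```python
from statistics import NormalDist


def gaussian_interval(mean, variance, lower=2.5, upper=97.5):
    """Return the (lower, upper) percentile bounds of a Gaussian prediction."""
    dist = NormalDist(mu=mean, sigma=variance ** 0.5)
    # Percentiles are given on a 0-100 scale, inv_cdf expects probabilities.
    return dist.inv_cdf(lower / 100.0), dist.inv_cdf(upper / 100.0)


# Bounds of the central 95% interval for mean 1 and variance 4.
interval_low, interval_high = gaussian_interval(mean=1.0, variance=4.0)
```

With the default percentiles the bounds sit roughly 1.96 standard deviations either side of the mean.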

plot_data(which_data_rows='all', which_data_ycols='all', visible_dims=None, projection='2d', label=None, **plot_kwargs)¶

Plot the training data

For higher dimensions than two, use fixed_inputs to plot the data points with some of the inputs fixed.

Can plot only part of the data using which_data_rows and which_data_ycols.

Parameters:

which_data_rows ('all' or a slice object to slice self.X, self.Y) – which of the training data to plot (default all)
which_data_ycols ('all' or a list of integers) – when the data has several columns (independent outputs), only plot these
visible_dims (a numpy array) – an array specifying the input dimensions to plot (maximum two)
projection ({'2d','3d'}) – whether to plot in 2d or 3d. This only applies when plotting two dimensional inputs!
label (str) – the label for the plot
plot_kwargs (kwargs) – kwargs for the data plot for the plotting library you are using

Returns:

list of plots created.
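
The selection performed by which_data_rows and which_data_ycols amounts to ordinary NumPy indexing of the training arrays. The sketch below shows one way to express it; select_training_data is a hypothetical helper written for illustration, not part of GPy.

```python
import numpy as np


def select_training_data(X, Y, which_data_rows="all", which_data_ycols="all"):
    """Return the rows of X and the rows/columns of Y selected for plotting."""
    X = np.asarray(X)
    Y = np.asarray(Y)
    # 'all' keeps every row; otherwise a slice (or index array) picks a subset.
    rows = slice(None) if isinstance(which_data_rows, str) else which_data_rows
    # 'all' keeps every output column; otherwise use the listed column indices.
    if isinstance(which_data_ycols, str):
        cols = list(range(Y.shape[1]))
    else:
        cols = list(which_data_ycols)
    return X[rows], Y[rows][:, cols]


X_train = np.arange(10.0).reshape(5, 2)   # 5 points, 2 input dimensions
Y_train = np.arange(15.0).reshape(5, 3)   # 5 points, 3 independent outputs
X_sel, Y_sel = select_training_data(X_train, Y_train, slice(0, 3), [0, 2])
```

The call keeps the first three training points and the first and third output columns.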

plot_data_error(which_data_rows='all', which_data_ycols='all', visible_dims=None, projection='2d', label=None, **error_kwargs)¶

Plot the training data input error.

For higher dimensions than two, use fixed_inputs to plot the data points with some of the inputs fixed.

Can plot only part of the data using which_data_rows and which_data_ycols.

Parameters:

which_data_rows ('all' or a slice object to slice self.X, self.Y) – which of the training data to plot (default all)
which_data_ycols ('all' or a list of integers) – when the data has several columns (independent outputs), only plot these
visible_dims (a numpy array) – an array specifying the input dimensions to plot (maximum two)
projection ({'2d','3d'}) – whether to plot in 2d or 3d. This only applies when plotting two dimensional inputs!
error_kwargs (dict) – kwargs for the error plot for the plotting library you are using
label (str) – the label for the plot
plot_kwargs (kwargs) – kwargs for the data plot for the plotting library you are using

Returns:

list of plots created.

plot_density(plot_limits=None, fixed_inputs=None, resolution=None, plot_raw=False, apply_link=False, visible_dims=None, which_data_ycols='all', levels=35, label='gp density', predict_kw=None, **kwargs)¶

Plot the predictive density of the GP as a set of shaded levels (see levels). Note: Only implemented for one dimension!

You can deactivate the legend for this one plot by supplying None to label.

Give the Y_metadata in the predict_kw if you need it.

Parameters:

plot_limits (np.array) – The limits of the plot. If 1D [xmin,xmax], if 2D [[xmin,ymin],[xmax,ymax]]. Defaults to data limits
fixed_inputs (a list of tuples) – a list of tuples [(i,v), (i,v)…], specifying that input dimension i should be set to value v.
resolution (int) – The resolution of the prediction [default:200]
plot_raw (bool) – plot the latent function (usually denoted f) only?
apply_link (bool) – whether to apply the link function of the GP to the raw prediction.
visible_dims (array-like) – which columns of the input X (!) to plot (array-like or list of ints)
which_data_ycols (array-like) – which columns of y to plot (array-like or list of ints)
levels (int) – the number of levels in the density (a number bigger than 1, where 35 is smooth and 1 is the same as plot_confidence). You can go higher than 50 if the result is not smooth enough for you.
predict_kw (dict) – the keyword arguments for the prediction. If you want to plot a specific kernel give dict(kern=<specific kernel>) in here

plot_errorbars_trainset(which_data_rows='all', which_data_ycols='all', fixed_inputs=None, plot_raw=False, apply_link=False, label=None, projection='2d', predict_kw=None, **plot_kwargs)¶

Plot the errorbars of the GP likelihood on the training data. These are the errorbars after the appropriate approximations according to the likelihood are done.

This also works for heteroscedastic likelihoods.

Give the Y_metadata in the predict_kw if you need it.

Parameters:

which_data_rows ('all' or a slice object to slice self.X, self.Y) – which of the training data to plot (default all)
which_data_ycols – when the data has several columns (independent outputs), only plot these
fixed_inputs (a list of tuples) – a list of tuples [(i,v), (i,v)…], specifying that input dimension i should be set to value v.
predict_kw (dict) – kwargs for the prediction used to predict the right quantiles.
plot_kwargs (kwargs) – kwargs for the data plot for the plotting library you are using

plot_f(plot_limits=None, fixed_inputs=None, resolution=None, apply_link=False, which_data_ycols='all', which_data_rows='all', visible_dims=None, levels=20, samples=0, lower=2.5, upper=97.5, plot_density=False, plot_data=True, plot_inducing=True, projection='2d', legend=True, predict_kw=None, **kwargs)¶

Convenience function for plotting the fit of a GP. This is the same as plot, except it plots the latent function fit of the GP!

If you want fine-grained control, use the specific plotting functions supplied in the model.

You can deactivate the legend for this one plot by supplying None to label.

Give the Y_metadata in the predict_kw if you need it.

Parameters:

plot_limits (np.array) – The limits of the plot. If 1D [xmin,xmax], if 2D [[xmin,ymin],[xmax,ymax]]. Defaults to data limits
fixed_inputs (a list of tuples) – a list of tuples [(i,v), (i,v)…], specifying that input dimension i should be set to value v.
resolution (int) – The resolution of the prediction [default:200]
apply_link (bool) – whether to apply the link function of the GP to the raw prediction.
which_data_ycols ('all' or a list of integers) – when the data has several columns (independent outputs), only plot these
which_data_rows ('all' or a slice object to slice self.X, self.Y) – which of the training data to plot (default all)
visible_dims (array-like) – an array specifying the input dimensions to plot (maximum two)
levels (int) – the number of levels in the density (a number bigger than 1, where 35 is smooth and 1 is the same as plot_confidence). You can go higher than 50 if the result is not smooth enough for you.
samples (int) – the number of samples to draw from the GP and plot into the plot. These will always be samples from the latent function.
lower (float) – the lower percentile to plot
upper (float) – the upper percentile to plot
plot_data (bool) – plot the data into the plot?
plot_inducing (bool) – plot inducing inputs?
plot_density (bool) – plot density instead of the confidence interval?
predict_kw (dict) – the keyword arguments for the prediction. If you want to plot a specific kernel give dict(kern=<specific kernel>) in here
error_kwargs (dict) – kwargs for the error plot for the plotting library you are using
plot_kwargs (kwargs) – kwargs for the data plot for the plotting library you are using

plot_latent(plot_limits=None, fixed_inputs=None, resolution=None, apply_link=False, which_data_ycols='all', which_data_rows='all', visible_dims=None, levels=20, samples=0, lower=2.5, upper=97.5, plot_density=False, plot_data=True, plot_inducing=True, projection='2d', legend=True, predict_kw=None, **kwargs)¶

Convenience function for plotting the fit of a GP. This is the same as plot, except it plots the latent function fit of the GP!

If you want fine-grained control, use the specific plotting functions supplied in the model.

You can deactivate the legend for this one plot by supplying None to label.

Give the Y_metadata in the predict_kw if you need it.

Parameters:

plot_limits (np.array) – The limits of the plot. If 1D [xmin,xmax], if 2D [[xmin,ymin],[xmax,ymax]]. Defaults to data limits
fixed_inputs (a list of tuples) – a list of tuples [(i,v), (i,v)…], specifying that input dimension i should be set to value v.
resolution (int) – The resolution of the prediction [default:200]
apply_link (bool) – whether to apply the link function of the GP to the raw prediction.
which_data_ycols ('all' or a list of integers) – when the data has several columns (independent outputs), only plot these
which_data_rows ('all' or a slice object to slice self.X, self.Y) – which of the training data to plot (default all)
visible_dims (array-like) – an array specifying the input dimensions to plot (maximum two)
levels (int) – the number of levels in the density (a number bigger than 1, where 35 is smooth and 1 is the same as plot_confidence). You can go higher than 50 if the result is not smooth enough for you.
samples (int) – the number of samples to draw from the GP and plot into the plot. These will always be samples from the latent function.
lower (float) – the lower percentile to plot
upper (float) – the upper percentile to plot
plot_data (bool) – plot the data into the plot?
plot_inducing (bool) – plot inducing inputs?
plot_density (bool) – plot density instead of the confidence interval?
predict_kw (dict) – the keyword arguments for the prediction. If you want to plot a specific kernel give dict(kern=<specific kernel>) in here
error_kwargs (dict) – kwargs for the error plot for the plotting library you are using
plot_kwargs (kwargs) – kwargs for the data plot for the plotting library you are using

plot_mean(plot_limits=None, fixed_inputs=None, resolution=None, plot_raw=False, apply_link=False, visible_dims=None, which_data_ycols='all', levels=20, projection='2d', label='gp mean', predict_kw=None, **kwargs)¶

Plot the mean of the GP.

You can deactivate the legend for this one plot by supplying None to label.

Give the Y_metadata in the predict_kw if you need it.

Parameters:

plot_limits (np.array) – The limits of the plot. If 1D [xmin,xmax], if 2D [[xmin,ymin],[xmax,ymax]]. Defaults to data limits
fixed_inputs (a list of tuples) – a list of tuples [(i,v), (i,v)…], specifying that input dimension i should be set to value v.
resolution (int) – The resolution of the prediction [defaults are 1D:200, 2D:50]
plot_raw (bool) – plot the latent function (usually denoted f) only?
apply_link (bool) – whether to apply the link function of the GP to the raw prediction.
which_data_ycols (array-like) – which columns of y to plot (array-like or list of ints)
levels (int) – for 2D plotting, the number of contour levels to use
projection ({'2d','3d'}) – whether to plot in 2d or 3d. This only applies when plotting two dimensional inputs!
label (str) – the label for the plot.
predict_kw (dict) – the keyword arguments for the prediction. If you want to plot a specific kernel give dict(kern=<specific kernel>) in here

plot_noiseless(plot_limits=None, fixed_inputs=None, resolution=None, apply_link=False, which_data_ycols='all', which_data_rows='all', visible_dims=None, levels=20, samples=0, lower=2.5, upper=97.5, plot_density=False, plot_data=True, plot_inducing=True, projection='2d', legend=True, predict_kw=None, **kwargs)¶

Convenience function for plotting the fit of a GP. This is the same as plot, except it plots the latent function fit of the GP!

If you want fine-grained control, use the specific plotting functions supplied in the model.

You can deactivate the legend for this one plot by supplying None to label.

Give the Y_metadata in the predict_kw if you need it.

Parameters:

plot_limits (np.array) – The limits of the plot. If 1D [xmin,xmax], if 2D [[xmin,ymin],[xmax,ymax]]. Defaults to data limits
fixed_inputs (a list of tuples) – a list of tuples [(i,v), (i,v)…], specifying that input dimension i should be set to value v.
resolution (int) – The resolution of the prediction [default:200]
apply_link (bool) – whether to apply the link function of the GP to the raw prediction.
which_data_ycols ('all' or a list of integers) – when the data has several columns (independent outputs), only plot these
which_data_rows ('all' or a slice object to slice self.X, self.Y) – which of the training data to plot (default all)
visible_dims (array-like) – an array specifying the input dimensions to plot (maximum two)
levels (int) – the number of levels in the density (a number bigger than 1, where 35 is smooth and 1 is the same as plot_confidence). You can go higher than 50 if the result is not smooth enough for you.
samples (int) – the number of samples to draw from the GP and plot into the plot. These will always be samples from the latent function.
lower (float) – the lower percentile to plot
upper (float) – the upper percentile to plot
plot_data (bool) – plot the data into the plot?
plot_inducing (bool) – plot inducing inputs?
plot_density (bool) – plot density instead of the confidence interval?
predict_kw (dict) – the keyword arguments for the prediction. If you want to plot a specific kernel give dict(kern=<specific kernel>) in here
error_kwargs (dict) – kwargs for the error plot for the plotting library you are using
plot_kwargs (kwargs) – kwargs for the data plot for the plotting library you are using

plot_samples(plot_limits=None, fixed_inputs=None, resolution=None, plot_raw=True, apply_link=False, visible_dims=None, which_data_ycols='all', samples=3, projection='2d', label='gp_samples', predict_kw=None, **kwargs)¶

Plot samples drawn from the GP.

You can deactivate the legend for this one plot by supplying None to label.

Give the Y_metadata in the predict_kw if you need it.

Parameters:

plot_limits (np.array) – The limits of the plot. If 1D [xmin,xmax], if 2D [[xmin,ymin],[xmax,ymax]]. Defaults to data limits
fixed_inputs (a list of tuples) – a list of tuples [(i,v), (i,v)…], specifying that input dimension i should be set to value v.
resolution (int) – The resolution of the prediction [defaults are 1D:200, 2D:50]
plot_raw (bool) – plot the latent function (usually denoted f) only? This is usually what you want!
apply_link (bool) – whether to apply the link function of the GP to the raw prediction.
visible_dims (array-like) – which columns of the input X (!) to plot (array-like or list of ints)
which_data_ycols (array-like) – which columns of y to plot (array-like or list of ints)
predict_kw (dict) – the keyword arguments for the prediction. If you want to plot a specific kernel give dict(kern=<specific kernel>) in here
levels (int) – for 2D plotting, the number of contour levels to use

posterior_samples(X, size=10, full_cov=False, Y_metadata=None, likelihood=None, **predict_kwargs)[source]¶: Samples the posterior GP at the points X, equivalent to posterior_samples_f due to the absence of a likelihood.

posterior_samples_f(X, size=10, full_cov=True, **predict_kwargs)[source]¶

Samples the posterior TP at the points X.

Parameters:

X (np.ndarray (Nnew x self.input_dim)) – The points at which to take the samples.
size (int) – the number of a posteriori samples.
full_cov (bool) – whether to return the full covariance matrix, or just the diagonal.

Returns:

fsim: set of simulations

Return type:

np.ndarray (D x N x samples) (if D==1 we flatten out the first dimension)
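
Drawing from a multivariate Student-t distribution can be done by rescaling Gaussian draws with a chi-squared variate. The sketch below illustrates that construction for a given mean vector and covariance matrix; it is not GPy's implementation, sample_multivariate_t is a hypothetical name, and nu (greater than 2) stands for the degrees of freedom of the distribution being sampled. It returns an array of shape N x size for a single output.

```python
import numpy as np


def sample_multivariate_t(mean, cov, nu, size=10, rng=None):
    """Draw `size` samples from a Student-t with the given mean and covariance."""
    rng = np.random.default_rng() if rng is None else rng
    mean = np.asarray(mean, dtype=float).reshape(-1, 1)    # (N, 1)
    cov = np.asarray(cov, dtype=float)
    n = mean.shape[0]
    # Small jitter keeps the Cholesky factorisation numerically stable.
    L = np.linalg.cholesky(cov + 1e-10 * np.eye(n))
    z = rng.standard_normal((n, size))                     # Gaussian draws
    u = rng.chisquare(nu, size=size)                       # one scale per sample
    # The (nu - 2) factor makes `cov` the covariance rather than the scale matrix.
    scale = np.sqrt((nu - 2.0) / u)
    return mean + (L @ z) * scale                          # (N, size)
```

Each column of the result is one joint sample over the N points, so plotting the columns against X gives sample paths.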

predict(Xnew, full_cov=False, kern=None, **kwargs)[source]¶: Predict the function(s) at the new point(s) Xnew. For Student-t processes, this method is equivalent to predict_noiseless as no likelihood is included in the model.

predict_noiseless(Xnew, full_cov=False, kern=None)[source]¶

Predict the underlying function f at the new point(s) Xnew.

Parameters:

Xnew (np.ndarray (Nnew x self.input_dim)) – The points at which to make a prediction
full_cov (bool) – whether to return the full covariance matrix, or just the diagonal
kern – The kernel to use for prediction (defaults to the model kern).

Returns:

(mean, var):

mean: posterior mean, a Numpy array, Nnew x self.output_dim
var: posterior variance, a Numpy array, Nnew x 1 if full_cov=False, Nnew x Nnew otherwise

If full_cov and self.output_dim > 1, the return shape of var is Nnew x Nnew x self.output_dim. If self.output_dim == 1, the return shape is Nnew x Nnew. This is to allow for different normalizations of the output dimensions.
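
The relationship between the two return shapes of var can be seen in a small standalone computation. The sketch below follows the Student-t process predictive equations of Shah et al. (2014), where the mean matches the GP mean and the covariance is rescaled by a data-dependent factor. It is an illustration under stated assumptions, not GPy's code: rbf and tp_predict are hypothetical helpers, a fixed RBF kernel with unit variance and lengthscale is used, only a single output column is handled, and nu stands for the degrees of freedom.

```python
import numpy as np


def rbf(A, B, variance=1.0, lengthscale=1.0):
    """Squared-exponential kernel between the rows of A and B."""
    sq = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return variance * np.exp(-0.5 * sq / lengthscale ** 2)


def tp_predict(X, Y, Xnew, nu, full_cov=False, jitter=1e-6):
    """Student-t process predictive mean and (co)variance for one output."""
    n = X.shape[0]
    K = rbf(X, X) + jitter * np.eye(n)
    L = np.linalg.cholesky(K)
    A = np.linalg.solve(L, rbf(X, Xnew))       # L^{-1} K_*
    alpha = np.linalg.solve(L, Y)              # L^{-1} y
    mean = A.T @ alpha                         # K_*^T K^{-1} y, shape (Nnew, 1)
    beta = float((alpha ** 2).sum())           # y^T K^{-1} y
    scale = (nu + beta - 2.0) / (nu + n - 2.0)
    cov = scale * (rbf(Xnew, Xnew) - A.T @ A)  # shape (Nnew, Nnew)
    if full_cov:
        return mean, cov
    return mean, np.diag(cov)[:, None]         # diagonal only, shape (Nnew, 1)


X_train = np.linspace(0.0, 4.0, 5)[:, None]
Y_train = np.sin(X_train)
X_test = np.array([[0.5], [1.5], [2.5]])
mean, var = tp_predict(X_train, Y_train, X_test, nu=5.0)
_, cov = tp_predict(X_train, Y_train, X_test, nu=5.0, full_cov=True)
```

With full_cov=False the variance is simply the diagonal of the full covariance, returned as a column.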

predict_quantiles(X, quantiles=(2.5, 97.5), kern=None, **kwargs)[source]¶

Get the predictive quantiles around the prediction at X

Parameters:	X (np.ndarray (Xnew x self.input_dim)) – The points at which to make a prediction quantiles (tuple) – tuple of quantiles, default is (2.5, 97.5) which is the 95% interval kern – optional kernel to use for prediction
Returns:	list of quantiles for each X and predictive quantiles for interval combination
Return type:	[np.ndarray (Xnew x self.output_dim), np.ndarray (Xnew x self.output_dim)]

set_X(X)[source]¶

Set the input data of the model

Parameters:	X (np.ndarray) – input observations

set_XY(X, Y)[source]¶

Set the input / output data of the model. This is useful if we wish to change our existing data but maintain the same model.

Parameters:	X (np.ndarray) – input observations Y (np.ndarray or ObsAr) – output observations

set_Y(Y)[source]¶

Set the output data of the model

Parameters:	Y (np.ndarray or ObsAr) – output observations

GPy.models.warped_gp module¶

class WarpedGP(X, Y, kernel=None, warping_function=None, warping_terms=3, normalizer=False)[source]¶

Bases: GPy.core.gp.GP

This defines a GP Regression model that applies a warping function to the output.

log_likelihood()[source]¶: Notice we add the Jacobian of the warping function here.
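
The Jacobian term comes from the change of variables: if the warped outputs f(y) are Gaussian with covariance K, then the log density of y itself is the Gaussian log density of f(y) plus the sum of log f'(y_i). The sketch below shows this for a generic monotonic warping. It is an illustration only: warped_log_likelihood is a hypothetical helper, K stands for the full covariance of the warped outputs, and the logarithm is used purely as an example warping function rather than the one WarpedGP uses.

```python
import math

import numpy as np


def warped_log_likelihood(y, K, warp, warp_grad):
    """Gaussian log density of warp(y) plus the log-Jacobian of the warping."""
    y = np.asarray(y, dtype=float).reshape(-1)
    z = warp(y)                                   # outputs in the warped space
    n = y.shape[0]
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L, z)
    gauss = (
        -0.5 * float(alpha @ alpha)
        - np.sum(np.log(np.diag(L)))              # equals -0.5 * log |K|
        - 0.5 * n * math.log(2.0 * math.pi)
    )
    jacobian = np.sum(np.log(warp_grad(y)))       # sum_i log f'(y_i)
    return float(gauss + jacobian)


# Example warping: f(y) = log(y), so f'(y) = 1 / y (valid for y > 0).
value = warped_log_likelihood([2.0], np.array([[1.0]]), np.log, lambda t: 1.0 / t)
```

With the log warping and a single observation this reduces to the log density of a log-normal distribution, which is a convenient sanity check.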

log_predictive_density(x_test, y_test, Y_metadata=None)[source]¶

Calculation of the log predictive density. Notice we add the Jacobian of the warping function here.

Parameters:

x_test ((Nx1) array) – test locations ($x_{*}$)
y_test ((Nx1) array) – test observations ($y_{*}$)
Y_metadata – metadata associated with the test points

parameters_changed()[source]¶: Notice that we update the warping function gradients here.

plot_warping()[source]¶

predict(Xnew, kern=None, pred_init=None, Y_metadata=None, median=False, deg_gauss_hermite=20, likelihood=None)[source]¶

Prediction results depend on:

- The value of the self.predict_in_warped_space flag
- The median flag passed as argument

The likelihood keyword is never used; it is only there to follow the plotting API.

predict_quantiles(X, quantiles=(2.5, 97.5), Y_metadata=None, likelihood=None, kern=None)[source]¶

Get the predictive quantiles around the prediction at X

Parameters:	X (np.ndarray (Xnew x self.input_dim)) – The points at which to make a prediction quantiles (tuple) – tuple of quantiles, default is (2.5, 97.5) which is the 95% interval
Returns:	list of quantiles for each X and predictive quantiles for interval combination
Return type:	[np.ndarray (Xnew x self.output_dim), np.ndarray (Xnew x self.output_dim)]
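
Because a warping function is monotonic, quantiles carry over between the latent Gaussian space and the observation space: a quantile of the latent Gaussian maps through the inverse warping to the same quantile of the output. The sketch below illustrates this property with the standard library only. warped_quantiles is a hypothetical helper, and the exponential is used as an example inverse warping (that is, a log warping), which is an assumption made for illustration.

```python
import math
from statistics import NormalDist


def warped_quantiles(mu, var, inverse_warp, quantiles=(2.5, 97.5)):
    """Map Gaussian quantiles in the warped space back to the output space."""
    latent = NormalDist(mu=mu, sigma=math.sqrt(var))
    # Quantiles are given as percentiles, inv_cdf expects probabilities.
    return [inverse_warp(latent.inv_cdf(q / 100.0)) for q in quantiles]


# Latent prediction with mean 0 and variance 1, mapped back through exp.
q_low, q_median, q_high = warped_quantiles(0.0, 1.0, math.exp, quantiles=(2.5, 50.0, 97.5))
```

The resulting interval is asymmetric around the median in the output space even though it is symmetric in the latent space.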

set_XY(X=None, Y=None)[source]¶

Set the input / output data of the model. This is useful if we wish to change our existing data but maintain the same model.

Parameters:	X (np.ndarray) – input observations Y (np.ndarray) – output observations

transform_data()[source]¶