GPy.models package
Introduction
This package principally contains classes that ultimately inherit from GPy.core.gp.GP and are intended as models for end-user consumption; much of GPy.core.gp.GP is not intended to be called directly. The general form of a “model” is a function that takes some data, a kernel (see GPy.kern) and other parameters, and returns an object representation.
Several models inherit directly from GPy.core.gp.GP, for example GPRegression, GPClassification and GPLVM.
Some models fall into conceptually related groups (e.g. those built on GPy.core.sparse_gp or GPy.core.sparse_gp_mpi), such as GPMultioutRegression and BayesianGPLVM.
In some cases one end-user model inherits from another, e.g. BCGPLVM inherits from GPLVM, and DPBayesianGPLVM inherits from BayesianGPLVM.
Submodules
GPy.models.bayesian_gplvm module

class BayesianGPLVM(Y, input_dim, X=None, X_variance=None, init='PCA', num_inducing=10, Z=None, kernel=None, inference_method=None, likelihood=None, name='bayesian gplvm', mpi_comm=None, normalizer=None, missing_data=False, stochastic=False, batchsize=1, Y_metadata=None)
Bases: GPy.core.sparse_gp_mpi.SparseGP_MPI
Bayesian Gaussian Process Latent Variable Model
Parameters:  Y (np.ndarray | GPy.likelihood instance) – observed data (np.ndarray) or GPy.likelihood
 input_dim (int) – latent dimensionality
 init ('PCA' | 'random') – initialisation method for the latent space
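The init='PCA' option seeds the latent space from a principal component projection of Y. The NumPy-only sketch below shows the general idea. The helper name `pca_init` is hypothetical, and GPy's own initialisation routine may differ in scaling and other details.

```python
import numpy as np

def pca_init(Y, input_dim):
    """Project Y (N x D) onto its first input_dim principal components."""
    Yc = Y - Y.mean(axis=0)                      # centre each output column
    # SVD of the centred data: rows of Vt are the principal directions
    U, s, Vt = np.linalg.svd(Yc, full_matrices=False)
    X = Yc @ Vt[:input_dim].T                    # N x input_dim latent coordinates
    return X / X.std(axis=0)                     # rescale each latent dimension to unit variance

rng = np.random.default_rng(0)
Y = rng.standard_normal((20, 5))                 # toy data: 20 points, 5 outputs
X0 = pca_init(Y, input_dim=2)
print(X0.shape)
```

An array such as X0 plays the role of the initial latent coordinates; init='random' would instead start from random values.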

get_X_gradients(X)
Get the gradients of the posterior distribution of X in its specific form.

parameters_changed()
Method that is called upon any changes to Param variables within the model. In particular, in the GP class this method re-performs inference, recalculating the posterior, the log marginal likelihood and the gradients of the model.
Warning
This method is not designed to be called manually. The framework is set up to call it automatically whenever parameters change; if you call it yourself, there may be unexpected consequences.

plot_inducing(which_indices=None, legend=False, plot_limits=None, marker=None, projection='2d', **kwargs)
Plot a scatter plot of the inducing inputs.
Parameters:  which_indices ([int]) – which input dimensions to plot against each other
 legend (bool) – whether to plot the legend on the figure
 plot_limits ((xmin, xmax, ymin, ymax) or ((xmin, xmax), (ymin, ymax))) – the plot limits for the plot
 marker (str) – marker to use [default is a custom arrow-like marker]
 kwargs – the kwargs for the scatter plots
 projection (str) – for now 2d or 3d projection (other projections can be implemented, see developer documentation)

plot_latent(labels=None, which_indices=None, resolution=60, legend=True, plot_limits=None, updates=False, kern=None, marker='<>^vsd', num_samples=1000, projection='2d', scatter_kwargs=None, **imshow_kwargs)
Plot the latent space of the GP on the inputs. This is the density of the GP posterior as a grey scale and the scatter plot of the input dimensions selected by which_indices.
Parameters:  labels (array-like) – a label for each data point (row) of the inputs
 which_indices ((int, int)) – which input dimensions to plot against each other
 resolution (int) – the resolution at which we predict the magnification factor
 legend (bool) – whether to plot the legend on the figure
 plot_limits ((xmin, xmax, ymin, ymax) or ((xmin, xmax), (ymin, ymax))) – the plot limits for the plot
 updates (bool) – if possible, make interactive updates using the specific library you are using
 kern (Kern) – the kernel to use for prediction
 marker (str) – markers to use; cycles if more labels than markers are given
 num_samples (int) – the maximum number of samples to plot. A stratified subsample is drawn from the labels if the number of samples (in X) is higher than num_samples.
 imshow_kwargs – the kwargs for the imshow (magnification factor)
 scatter_kwargs – the kwargs for the scatter plots
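The num_samples description mentions a stratified subsample drawn from the labels. A minimal sketch of one way to do that with NumPy follows. The helper `stratified_subsample` is hypothetical and is not GPy's plotting code; the proportional per-label quota is an assumption.

```python
import numpy as np

def stratified_subsample(labels, num_samples, seed=0):
    """Return sorted row indices, keeping roughly the same share of each label."""
    labels = np.asarray(labels)
    n = labels.shape[0]
    if n <= num_samples:
        return np.arange(n)                      # nothing to thin out
    rng = np.random.default_rng(seed)
    keep = []
    for lab in np.unique(labels):
        idx = np.flatnonzero(labels == lab)
        # quota proportional to the label's frequency, at least one point
        quota = max(1, int(np.floor(num_samples * idx.size / n)))
        keep.append(rng.choice(idx, size=quota, replace=False))
    return np.sort(np.concatenate(keep))

labels = np.array([0] * 600 + [1] * 300 + [2] * 100)
picked = stratified_subsample(labels, num_samples=100)
print(np.bincount(labels[picked]))
```

Because every label keeps at least one point, the total can slightly exceed num_samples when there are many rare labels.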

plot_scatter(labels=None, which_indices=None, legend=True, plot_limits=None, marker='<>^vsd', num_samples=1000, projection='2d', **kwargs)
Plot a scatter plot of the latent space.
Parameters:  labels (array-like) – a label for each data point (row) of the inputs
 which_indices ((int, int)) – which input dimensions to plot against each other
 legend (bool) – whether to plot the legend on the figure
 plot_limits ((xmin, xmax, ymin, ymax) or ((xmin, xmax), (ymin, ymax))) – the plot limits for the plot
 marker (str) – markers to use; cycles if more labels than markers are given
 kwargs – the kwargs for the scatter plots

plot_steepest_gradient_map(output_labels=None, data_labels=None, which_indices=None, resolution=15, legend=True, plot_limits=None, updates=False, kern=None, marker='<>^vsd', num_samples=1000, annotation_kwargs=None, scatter_kwargs=None, **imshow_kwargs)
Plot the latent space of the GP on the inputs. This is the density of the GP posterior as a grey scale and the scatter plot of the input dimensions selected by which_indices.
Parameters:  labels (array-like) – a label for each data point (row) of the inputs
 which_indices ((int, int)) – which input dimensions to plot against each other
 resolution (int) – the resolution at which we predict the magnification factor
 legend (bool) – whether to plot the legend on the figure; if an int, plot that many legend columns
 plot_limits ((xmin, xmax, ymin, ymax) or ((xmin, xmax), (ymin, ymax))) – the plot limits for the plot
 updates (bool) – if possible, make interactive updates using the specific library you are using
 kern (Kern) – the kernel to use for prediction
 marker (str) – markers to use; cycles if more labels than markers are given
 num_samples (int) – the maximum number of samples to plot. A stratified subsample is drawn from the labels if the number of samples (in X) is higher than num_samples.
 imshow_kwargs – the kwargs for the imshow (magnification factor)
 annotation_kwargs – the kwargs for the annotation plot
 scatter_kwargs – the kwargs for the scatter plots
GPy.models.bayesian_gplvm_minibatch module

class BayesianGPLVMMiniBatch(Y, input_dim, X=None, X_variance=None, init='PCA', num_inducing=10, Z=None, kernel=None, inference_method=None, likelihood=None, name='bayesian gplvm', normalizer=None, missing_data=False, stochastic=False, batchsize=1)
Bases: GPy.models.sparse_gp_minibatch.SparseGPMiniBatch
Bayesian Gaussian Process Latent Variable Model
Parameters:  Y (np.ndarray | GPy.likelihood instance) – observed data (np.ndarray) or GPy.likelihood
 input_dim (int) – latent dimensionality
 init ('PCA' | 'random') – initialisation method for the latent space

parameters_changed()
Method that is called upon any changes to Param variables within the model. In particular, in the GP class this method re-performs inference, recalculating the posterior, the log marginal likelihood and the gradients of the model.
Warning
This method is not designed to be called manually. The framework is set up to call it automatically whenever parameters change; if you call it yourself, there may be unexpected consequences.

plot_inducing(which_indices=None, legend=False, plot_limits=None, marker=None, projection='2d', **kwargs)
Plot a scatter plot of the inducing inputs.
Parameters:  which_indices ([int]) – which input dimensions to plot against each other
 legend (bool) – whether to plot the legend on the figure
 plot_limits ((xmin, xmax, ymin, ymax) or ((xmin, xmax), (ymin, ymax))) – the plot limits for the plot
 marker (str) – marker to use [default is a custom arrow-like marker]
 kwargs – the kwargs for the scatter plots
 projection (str) – for now 2d or 3d projection (other projections can be implemented, see developer documentation)

plot_latent(labels=None, which_indices=None, resolution=60, legend=True, plot_limits=None, updates=False, kern=None, marker='<>^vsd', num_samples=1000, projection='2d', scatter_kwargs=None, **imshow_kwargs)
Plot the latent space of the GP on the inputs. This is the density of the GP posterior as a grey scale and the scatter plot of the input dimensions selected by which_indices.
Parameters:  labels (array-like) – a label for each data point (row) of the inputs
 which_indices ((int, int)) – which input dimensions to plot against each other
 resolution (int) – the resolution at which we predict the magnification factor
 legend (bool) – whether to plot the legend on the figure
 plot_limits ((xmin, xmax, ymin, ymax) or ((xmin, xmax), (ymin, ymax))) – the plot limits for the plot
 updates (bool) – if possible, make interactive updates using the specific library you are using
 kern (Kern) – the kernel to use for prediction
 marker (str) – markers to use; cycles if more labels than markers are given
 num_samples (int) – the maximum number of samples to plot. A stratified subsample is drawn from the labels if the number of samples (in X) is higher than num_samples.
 imshow_kwargs – the kwargs for the imshow (magnification factor)
 scatter_kwargs – the kwargs for the scatter plots

plot_scatter(labels=None, which_indices=None, legend=True, plot_limits=None, marker='<>^vsd', num_samples=1000, projection='2d', **kwargs)
Plot a scatter plot of the latent space.
Parameters:  labels (array-like) – a label for each data point (row) of the inputs
 which_indices ((int, int)) – which input dimensions to plot against each other
 legend (bool) – whether to plot the legend on the figure
 plot_limits ((xmin, xmax, ymin, ymax) or ((xmin, xmax), (ymin, ymax))) – the plot limits for the plot
 marker (str) – markers to use; cycles if more labels than markers are given
 kwargs – the kwargs for the scatter plots

plot_steepest_gradient_map(output_labels=None, data_labels=None, which_indices=None, resolution=15, legend=True, plot_limits=None, updates=False, kern=None, marker='<>^vsd', num_samples=1000, annotation_kwargs=None, scatter_kwargs=None, **imshow_kwargs)
Plot the latent space of the GP on the inputs. This is the density of the GP posterior as a grey scale and the scatter plot of the input dimensions selected by which_indices.
Parameters:  labels (array-like) – a label for each data point (row) of the inputs
 which_indices ((int, int)) – which input dimensions to plot against each other
 resolution (int) – the resolution at which we predict the magnification factor
 legend (bool) – whether to plot the legend on the figure; if an int, plot that many legend columns
 plot_limits ((xmin, xmax, ymin, ymax) or ((xmin, xmax), (ymin, ymax))) – the plot limits for the plot
 updates (bool) – if possible, make interactive updates using the specific library you are using
 kern (Kern) – the kernel to use for prediction
 marker (str) – markers to use; cycles if more labels than markers are given
 num_samples (int) – the maximum number of samples to plot. A stratified subsample is drawn from the labels if the number of samples (in X) is higher than num_samples.
 imshow_kwargs – the kwargs for the imshow (magnification factor)
 annotation_kwargs – the kwargs for the annotation plot
 scatter_kwargs – the kwargs for the scatter plots
GPy.models.bcgplvm module

class BCGPLVM(Y, input_dim, kernel=None, mapping=None)
Bases: GPy.models.gplvm.GPLVM
Back-constrained Gaussian Process Latent Variable Model
Parameters:  Y (np.ndarray) – observed data
 input_dim (int) – latent dimensionality
 mapping (GPy.core.Mapping object) – mapping for back constraint

parameters_changed()
Method that is called upon any changes to Param variables within the model. In particular, in the GP class this method re-performs inference, recalculating the posterior, the log marginal likelihood and the gradients of the model.
Warning
This method is not designed to be called manually. The framework is set up to call it automatically whenever parameters change; if you call it yourself, there may be unexpected consequences.
GPy.models.dpgplvm module

class DPBayesianGPLVM(Y, input_dim, X_prior, X=None, X_variance=None, init='PCA', num_inducing=10, Z=None, kernel=None, inference_method=None, likelihood=None, name='bayesian gplvm', mpi_comm=None, normalizer=None, missing_data=False, stochastic=False, batchsize=1)
Bases: GPy.models.bayesian_gplvm.BayesianGPLVM
Bayesian Gaussian Process Latent Variable Model with discriminative prior
GPy.models.gp_classification module

class GPClassification(X, Y, kernel=None, Y_metadata=None, mean_function=None, inference_method=None, likelihood=None, normalizer=False)
Bases: GPy.core.gp.GP
Gaussian Process classification
This is a thin wrapper around the models.GP class, with a set of sensible defaults.
Parameters:  X – input observations
 Y – observed values, can be None if likelihood is not None
 kernel – a GPy kernel, defaults to rbf
 likelihood – a GPy likelihood, defaults to Bernoulli
 inference_method (GPy.inference.latent_function_inference.LatentFunctionInference) – Latent function inference to use, defaults to EP
Note
Multiple independent outputs are allowed using columns of Y

static from_dict(input_dict, data=None)
Instantiate an object of a derived class using the information in input_dict (built by the to_dict method of the derived class). More specifically, after reading the derived class from input_dict, it calls the _build_from_input_dict method of the derived class.
Note: This method should not be overridden in the derived class. If customisation is needed, override _build_from_input_dict instead.
Parameters: input_dict (dict) – Dictionary with all the information needed to instantiate the object.

to_dict(save_data=True)
Convert the object into a JSON-serializable dictionary. Note: It uses the private method _save_to_input_dict of the parent.
Parameters: save_data (boolean) – if True, the training data self.X and self.Y are added to the dictionary
Return dict: JSON-serializable dictionary containing the information needed to instantiate the object
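The to_dict / from_dict pair describes a round trip through a JSON-serializable dictionary. The sketch below illustrates that dispatch pattern with toy classes. `ToyBase`, `ToyModel`, the registry and the "class" key are hypothetical stand-ins and do not reproduce GPy's actual dictionary layout; only the method names to_dict, from_dict, _save_to_input_dict and _build_from_input_dict come from the text above.

```python
import json

class ToyBase:
    registry = {}                                # maps class names to classes

    def __init_subclass__(cls, **kwargs):
        super().__init_subclass__(**kwargs)
        ToyBase.registry[cls.__name__] = cls

    def _save_to_input_dict(self):
        # the parent records which derived class to rebuild later
        return {"class": type(self).__name__}

    @staticmethod
    def from_dict(input_dict, data=None):
        input_dict = dict(input_dict)            # do not mutate the caller's dict
        cls = ToyBase.registry[input_dict.pop("class")]
        return cls._build_from_input_dict(input_dict, data)


class ToyModel(ToyBase):
    def __init__(self, X, Y, noise_var=1.0):
        self.X, self.Y, self.noise_var = X, Y, noise_var

    def to_dict(self, save_data=True):
        d = self._save_to_input_dict()
        d["noise_var"] = self.noise_var
        if save_data:                            # training data is optional
            d["X"], d["Y"] = self.X, self.Y
        return d

    @staticmethod
    def _build_from_input_dict(input_dict, data=None):
        if data is not None:                     # data supplied separately
            input_dict["X"], input_dict["Y"] = data
        return ToyModel(**input_dict)


m = ToyModel([[0.0], [1.0]], [[0.5], [1.5]], noise_var=0.1)
payload = json.dumps(m.to_dict())                # plain lists keep it JSON-friendly
m2 = ToyBase.from_dict(json.loads(payload))
# without saved data, the training data is passed back in explicitly
m3 = ToyBase.from_dict(m.to_dict(save_data=False), data=(m.X, m.Y))
```

Keeping from_dict on the base class and the reconstruction logic in _build_from_input_dict is what lets derived classes customise loading without overriding from_dict itself.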
GPy.models.gp_coregionalized_regression module

class GPCoregionalizedRegression(X_list, Y_list, kernel=None, likelihoods_list=None, name='GPCR', W_rank=1, kernel_name='coreg')
Bases: GPy.core.gp.GP
Gaussian Process model for heteroscedastic multi-output regression
This is a thin wrapper around the models.GP class, with a set of sensible defaults.
Parameters:  X_list (list of numpy arrays) – list of input observations corresponding to each output
 Y_list (list of numpy arrays) – list of observed values related to the different noise models
 kernel (None | GPy.kernel defaults) – a GPy kernel ** Coregionalized, defaults to RBF ** Coregionalized
 name (string) – model name
 W_rank (integer) – number of tuples of the coregionalization parameters ‘W’ (see coregionalize kernel documentation)
 kernel_name (string) – name of the kernel
 likelihoods_list – a list of likelihoods, defaults to a list of Gaussian likelihoods
GPy.models.gp_grid_regression module

class GPRegressionGrid(X, Y, kernel=None, Y_metadata=None, normalizer=None)
Bases: GPy.core.gp_grid.GpGrid
Gaussian Process model for grid inputs using Kronecker products
This is a thin wrapper around the models.GpGrid class, with a set of sensible defaults.
Parameters:  X – input observations
 Y – observed values
 kernel – a GPy kernel, defaults to the kron variation of SqExp
 normalizer (Norm) – [False] Normalize Y with the norm given. If normalizer is False, no normalization will be done. If it is None, we use GaussianNorm(alization).
Note
Multiple independent outputs are allowed using columns of Y
GPy.models.gp_heteroscedastic_regression module

class GPHeteroscedasticRegression(X, Y, kernel=None, Y_metadata=None)
Bases: GPy.core.gp.GP
Gaussian Process model for heteroscedastic regression
This is a thin wrapper around the models.GP class, with a set of sensible defaults.
Parameters:  X – input observations
 Y – observed values
 kernel – a GPy kernel, defaults to rbf
NB: This model does not make inference on the noise outside the training set
GPy.models.gp_kronecker_gaussian_regression module

class GPKroneckerGaussianRegression(X1, X2, Y, kern1, kern2, noise_var=1.0, name='KGPR')
Bases: GPy.core.model.Model
Kronecker GP regression
Take two kernels computed on separate spaces K1(X1), K2(X2), and a data matrix Y which is of size (N1, N2).
The effective covariance is np.kron(K2, K1).
The effective data is vec(Y) = Y.flatten(order='F').
The noise must be iid Gaussian.
See [stegle_et_al_2011].
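The Kronecker structure rests on the identity (K2 ⊗ K1) vec(Y) = vec(K1 Y K2ᵀ), which lets products with the (N1·N2) x (N1·N2) covariance be computed from the two small kernel matrices. The NumPy check below is a sketch of that identity only: the random positive-definite matrices stand in for K1(X1) and K2(X2) and are not produced by GPy kernels.

```python
import numpy as np

rng = np.random.default_rng(0)
N1, N2 = 4, 3
A1 = rng.standard_normal((N1, N1))
A2 = rng.standard_normal((N2, N2))
K1 = A1 @ A1.T + N1 * np.eye(N1)                 # stand-in for K1(X1), positive definite
K2 = A2 @ A2.T + N2 * np.eye(N2)                 # stand-in for K2(X2), positive definite
Y = rng.standard_normal((N1, N2))

y = Y.flatten(order='F')                         # vec(Y): stack the columns of Y
K = np.kron(K2, K1)                              # effective (N1*N2) x (N1*N2) covariance

dense = K @ y                                    # direct product with the big matrix
fast = (K1 @ Y @ K2.T).flatten(order='F')        # same product using only the small matrices
print(np.allclose(dense, fast))
```

The column-major flatten (order='F') is what makes the ordering np.kron(K2, K1) the matching one.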
References
[stegle_et_al_2011] Stegle, O.; Lippert, C.; Mooij, J.M.; Lawrence, N.D.; Borgwardt, K.: Efficient inference in matrix-variate Gaussian models with iid observation noise. In: Advances in Neural Information Processing Systems, 2011, pages 630-638.

parameters_changed()
This method gets called when parameters have changed. Another way of listening to param changes is to add self as a listener to the param, such that updates get passed through. See paramz.param.Observable.add_observer.

predict(X1new, X2new)
Return the predictive mean and variance at a series of new points X1new, X2new. Only returns the diagonal of the predictive variance, for now.
Parameters:  X1new (np.ndarray, Nnew x self.input_dim1) – The points at which to make a prediction
 X2new (np.ndarray, Nnew x self.input_dim2) – The points at which to make a prediction

GPy.models.gp_multiout_regression module

class GPMultioutRegression(X, Y, Xr_dim, kernel=None, kernel_row=None, Z=None, Z_row=None, X_row=None, Xvariance_row=None, num_inducing=(10, 10), qU_var_r_W_dim=None, qU_var_c_W_dim=None, init='GP', name='GPMR')
Bases: GPy.core.sparse_gp.SparseGP
Gaussian Process model for multi-output regression without missing data
This is an implementation of Latent Variable Multiple Output Gaussian Processes (LVMOGP) in [Dai_et_al_2017].
References
[Dai_et_al_2017] Dai, Z.; Alvarez, M.A.; Lawrence, N.D.: Efficient Modeling of Latent Information in Supervised Learning using Gaussian Processes. In: NIPS, 2017.
Parameters:  X (numpy.ndarray) – input observations.
 Y (numpy.ndarray) – output observations, each column corresponding to an output dimension.
 Xr_dim (int) – the dimensionality of the latent space in which the output dimensions are embedded
 kernel (GPy.kern.Kern or None) – a GPy kernel for GP of individual output dimensions ** defaults to RBF **
 kernel_row (GPy.kern.Kern or None) – a GPy kernel for the GP of the latent space ** defaults to RBF **
 Z (numpy.ndarray or None) – inducing inputs
 Z_row (numpy.ndarray or None) – inducing inputs for the latent space
 X_row (numpy.ndarray or None) – the initial value of the mean of the variational posterior distribution of points in the latent space
 Xvariance_row (numpy.ndarray or None) – the initial value of the variance of the variational posterior distribution of points in the latent space
 num_inducing ((int, int)) – a tuple (M, Mr). M is the number of inducing points for GP of individual output dimensions. Mr is the number of inducing points for the latent space.
 qU_var_r_W_dim (int) – the dimensionality of the covariance of q(U) for the latent space. If it is smaller than the number of inducing points, it represents a low-rank parameterization of the covariance matrix.
 qU_var_c_W_dim (int) – the dimensionality of the covariance of q(U) for the GP regression. If it is smaller than the number of inducing points, it represents a low-rank parameterization of the covariance matrix.
 init (str) – the choice of initialization: ‘GP’ or ‘rand’. With ‘rand’, the model is initialized randomly. With ‘GP’, the model is initialized through the following protocol: (1) fit a sparse GP, (2) fit a BGPLVM based on the outcome of the sparse GP, (3) initialize the model based on the outcome of the BGPLVM.
 name (str) – the name of the model
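The qU_var_r_W_dim and qU_var_c_W_dim parameters control a low-rank parameterization of a covariance matrix. One standard form, shown here as an assumption rather than as GPy's exact implementation, is S = W Wᵀ + diag(d), with W of shape (M, r) and r smaller than the number of inducing points M. The variable names are illustrative.

```python
import numpy as np

rng = np.random.default_rng(1)
M, r = 10, 3                                     # M inducing points, rank-r factor
W = rng.standard_normal((M, r))
d = rng.uniform(0.1, 1.0, size=M)                # positive diagonal keeps S full rank

S = W @ W.T + np.diag(d)                         # M x M covariance built from M*r + M numbers

n_free_lowrank = W.size + d.size                 # parameters in the low-rank form
n_free_full = M * (M + 1) // 2                   # free entries of a full symmetric matrix
print(n_free_lowrank, n_free_full)
```

The saving grows with M: the low-rank form scales linearly in the number of inducing points for fixed r, while a full covariance scales quadratically.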

optimize_auto(max_iters=10000, verbose=True)
Optimize the model parameters through a predefined protocol.
Parameters:  max_iters (int) – the maximum number of iterations.
 verbose (boolean) – print the progress of optimization or not.

parameters_changed()
Method that is called upon any changes to Param variables within the model. In particular, in the GP class this method re-performs inference, recalculating the posterior, the log marginal likelihood and the gradients of the model.
Warning
This method is not designed to be called manually. The framework is set up to call it automatically whenever parameters change; if you call it yourself, there may be unexpected consequences.
GPy.models.gp_multiout_regression_md module

class GPMultioutRegressionMD(X, Y, indexD, Xr_dim, kernel=None, kernel_row=None, Z=None, Z_row=None, X_row=None, Xvariance_row=None, num_inducing=(10, 10), qU_var_r_W_dim=None, qU_var_c_W_dim=None, init='GP', heter_noise=False, name='GPMRMD')
Bases: GPy.core.sparse_gp.SparseGP
Gaussian Process model for multi-output regression with missing data
This is an implementation of Latent Variable Multiple Output Gaussian Processes (LVMOGP) in [Dai_et_al_2017]. This model targets the use case in which each output dimension is observed at a different set of inputs. The model takes a different data format: the input and output observations of all the output dimensions are stacked together, correspondingly, into two matrices. An extra array indicates the index of the output dimension for each data point. The output dimensions are indexed using integers from 0 to D-1, assuming there are D output dimensions.
References
[Dai_et_al_2017] Dai, Z.; Alvarez, M.A.; Lawrence, N.D.: Efficient Modeling of Latent Information in Supervised Learning using Gaussian Processes. In: NIPS, 2017.
Parameters:  X (numpy.ndarray) – input observations.
 Y (numpy.ndarray) – output observations, each column corresponding to an output dimension.
 indexD (numpy.ndarray) – the array containing the index of output dimension for each data point
 Xr_dim (int) – the dimensionality of the latent space in which the output dimensions are embedded
 kernel (GPy.kern.Kern or None) – a GPy kernel for GP of individual output dimensions ** defaults to RBF **
 kernel_row (GPy.kern.Kern or None) – a GPy kernel for the GP of the latent space ** defaults to RBF **
 Z (numpy.ndarray or None) – inducing inputs
 Z_row (numpy.ndarray or None) – inducing inputs for the latent space
 X_row (numpy.ndarray or None) – the initial value of the mean of the variational posterior distribution of points in the latent space
 Xvariance_row (numpy.ndarray or None) – the initial value of the variance of the variational posterior distribution of points in the latent space
 num_inducing ((int, int)) – a tuple (M, Mr). M is the number of inducing points for GP of individual output dimensions. Mr is the number of inducing points for the latent space.
 qU_var_r_W_dim (int) – the dimensionality of the covariance of q(U) for the latent space. If it is smaller than the number of inducing points, it represents a low-rank parameterization of the covariance matrix.
 qU_var_c_W_dim (int) – the dimensionality of the covariance of q(U) for the GP regression. If it is smaller than the number of inducing points, it represents a low-rank parameterization of the covariance matrix.
 init (str) – the choice of initialization: ‘GP’ or ‘rand’. With ‘rand’, the model is initialized randomly. With ‘GP’, the model is initialized through the following protocol: (1) fit a sparse GP, (2) fit a BGPLVM based on the outcome of the sparse GP, (3) initialize the model based on the outcome of the BGPLVM.
 heter_noise (boolean) – whether to assume heteroscedastic noise in the model
 name (str) – the name of the model
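The stacked data format described for this model can be sketched as follows. The toy per-output datasets and every variable name other than X, Y and indexD are illustrative, and the model call itself is left out, so the snippet only shows how the three arrays line up.

```python
import numpy as np

# Each output dimension d is observed at its own set of inputs.
per_output = [
    (np.array([[0.0], [0.5], [1.0]]), np.array([[1.0], [1.2], [0.9]])),   # output 0
    (np.array([[0.2], [0.8]]),        np.array([[2.1], [1.8]])),          # output 1
    (np.array([[0.1], [0.4], [0.6], [0.9]]),
     np.array([[0.3], [0.2], [0.5], [0.4]])),                             # output 2
]

X = np.vstack([x for x, _ in per_output])        # all inputs stacked, shape (9, 1)
Y = np.vstack([y for _, y in per_output])        # matching outputs, shape (9, 1)
# indexD[i] is the output dimension (0 .. D-1) that row i belongs to
indexD = np.concatenate([np.full(x.shape[0], d) for d, (x, _) in enumerate(per_output)])
print(indexD)
```

Row i of X, row i of Y and entry i of indexD always describe the same observation, which is why the outputs may have different numbers of data points.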

optimize_auto(max_iters=10000, verbose=True)
Optimize the model parameters through a predefined protocol.
Parameters:  max_iters (int) – the maximum number of iterations.
 verbose (boolean) – print the progress of optimization or not.

parameters_changed()
Method that is called upon any changes to Param variables within the model. In particular, in the GP class this method re-performs inference, recalculating the posterior, the log marginal likelihood and the gradients of the model.
Warning
This method is not designed to be called manually. The framework is set up to call it automatically whenever parameters change; if you call it yourself, there may be unexpected consequences.
GPy.models.gp_offset_regression module

class GPOffsetRegression(X, Y, kernel=None, Y_metadata=None, normalizer=None, noise_var=1.0, mean_function=None)
Bases: GPy.core.gp.GP
Gaussian Process model for offset regression
Parameters:  X – input observations, we assume for this class that this has one dimension of actual inputs and the last dimension should be the index of the cluster (so X should be Nx2)
 Y – observed values (Nx1?)
 kernel – a GPy kernel, defaults to rbf
 normalizer (Norm) – [False] Normalize Y with the norm given. If normalizer is False, no normalization will be done. If it is None, we use GaussianNorm(alization).
 noise_var – the noise variance for the Gaussian likelihood, defaults to 1.
Note
Multiple independent outputs are allowed using columns of Y

parameters_changed()
Method that is called upon any changes to Param variables within the model. In particular, in the GP class this method re-performs inference, recalculating the posterior, the log marginal likelihood and the gradients of the model.
Warning
This method is not designed to be called manually. The framework is set up to call it automatically whenever parameters change; if you call it yourself, there may be unexpected consequences.
GPy.models.gp_regression module

class GPRegression(X, Y, kernel=None, Y_metadata=None, normalizer=None, noise_var=1.0, mean_function=None)
Bases: GPy.core.gp.GP
Gaussian Process model for regression
This is a thin wrapper around the models.GP class, with a set of sensible defaults.
Parameters:  X – input observations
 Y – observed values
 kernel – a GPy kernel, defaults to rbf
 normalizer (Norm) – [False] Normalize Y with the norm given. If normalizer is False, no normalization will be done. If it is None, we use GaussianNorm(alization).
 noise_var – the noise variance for the Gaussian likelihood, defaults to 1.
Note
Multiple independent outputs are allowed using columns of Y
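For orientation, the computation behind a GP regression model with an RBF kernel and Gaussian noise can be sketched in plain NumPy. This is a sketch of the textbook equations with fixed hyperparameters, not GPy's implementation: the function names `rbf` and `gp_predict` are hypothetical, and no hyperparameter optimisation is performed.

```python
import numpy as np

def rbf(A, B, variance=1.0, lengthscale=1.0):
    """Squared-exponential kernel matrix between the rows of A and B."""
    sq = np.sum(A**2, 1)[:, None] + np.sum(B**2, 1)[None, :] - 2.0 * A @ B.T
    return variance * np.exp(-0.5 * np.maximum(sq, 0.0) / lengthscale**2)

def gp_predict(X, Y, Xnew, noise_var=1.0):
    """Posterior mean and marginal variance of the noisy outputs at Xnew."""
    K = rbf(X, X) + noise_var * np.eye(X.shape[0])
    L = np.linalg.cholesky(K)                                # K = L L^T
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, Y))      # K^{-1} Y
    Ks = rbf(X, Xnew)
    mean = Ks.T @ alpha
    v = np.linalg.solve(L, Ks)
    var = np.diag(rbf(Xnew, Xnew)) - np.sum(v**2, axis=0) + noise_var
    return mean, var[:, None]

X = np.linspace(0.0, 5.0, 20)[:, None]
Y = np.sin(X)                                    # one column of Y per output
mean, var = gp_predict(X, Y, X, noise_var=1e-4)
print(mean.shape, var.shape)
```

Because Y enters only through the solve for alpha, extra columns of Y are handled independently with the same kernel, which is the sense in which multiple independent outputs are allowed.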

to_dict(save_data=True)
Convert the object into a JSON-serializable dictionary. Note: It uses the private method _save_to_input_dict of the parent.
Parameters: save_data (boolean) – if True, the training data self.X and self.Y are added to the dictionary
Return dict: JSON-serializable dictionary containing the information needed to instantiate the object
GPy.models.gp_var_gauss module

class GPVariationalGaussianApproximation(X, Y, kernel, likelihood, Y_metadata=None)
Bases: GPy.core.gp.GP
The Variational Gaussian Approximation revisited
References
[opper_archambeau_2009] Opper, M.; Archambeau, C.: The Variational Gaussian Approximation Revisited. Neural Comput. 2009, pages 786-792.
GPy.models.gplvm module

class GPLVM(Y, input_dim, init='PCA', X=None, kernel=None, name='gplvm', Y_metadata=None, normalizer=False)
Bases: GPy.core.gp.GP
Gaussian Process Latent Variable Model
Parameters:  Y (np.ndarray) – observed data
 input_dim (int) – latent dimensionality
 init ('PCA' | 'random') – initialisation method for the latent space
 normalizer (bool) – normalize the outputs Y. If normalizer is True, we will normalize using Standardize. If normalizer is False (the default), no normalization will be done.

parameters_changed()
Method that is called upon any changes to Param variables within the model. In particular, in the GP class this method re-performs inference, recalculating the posterior, the log marginal likelihood and the gradients of the model.
Warning
This method is not designed to be called manually. The framework is set up to call it automatically whenever parameters change; if you call it yourself, there may be unexpected consequences.

plot_inducing(which_indices=None, legend=False, plot_limits=None, marker=None, projection='2d', **kwargs)
Plot a scatter plot of the inducing inputs.
Parameters:  which_indices ([int]) – which input dimensions to plot against each other
 legend (bool) – whether to plot the legend on the figure
 plot_limits ((xmin, xmax, ymin, ymax) or ((xmin, xmax), (ymin, ymax))) – the plot limits for the plot
 marker (str) – marker to use [default is a custom arrow-like marker]
 kwargs – the kwargs for the scatter plots
 projection (str) – for now 2d or 3d projection (other projections can be implemented, see developer documentation)

plot_latent(labels=None, which_indices=None, resolution=60, legend=True, plot_limits=None, updates=False, kern=None, marker='<>^vsd', num_samples=1000, projection='2d', scatter_kwargs=None, **imshow_kwargs)
Plot the latent space of the GP on the inputs. This is the density of the GP posterior as a grey scale and the scatter plot of the input dimensions selected by which_indices.
Parameters:  labels (array-like) – a label for each data point (row) of the inputs
 which_indices ((int, int)) – which input dimensions to plot against each other
 resolution (int) – the resolution at which we predict the magnification factor
 legend (bool) – whether to plot the legend on the figure
 plot_limits ((xmin, xmax, ymin, ymax) or ((xmin, xmax), (ymin, ymax))) – the plot limits for the plot
 updates (bool) – if possible, make interactive updates using the specific library you are using
 kern (Kern) – the kernel to use for prediction
 marker (str) – markers to use; cycles if more labels than markers are given
 num_samples (int) – the maximum number of samples to plot. A stratified subsample is drawn from the labels if the number of samples (in X) is higher than num_samples.
 imshow_kwargs – the kwargs for the imshow (magnification factor)
 scatter_kwargs – the kwargs for the scatter plots

plot_scatter(labels=None, which_indices=None, legend=True, plot_limits=None, marker='<>^vsd', num_samples=1000, projection='2d', **kwargs)
Plot a scatter plot of the latent space.
Parameters:  labels (array-like) – a label for each data point (row) of the inputs
 which_indices ((int, int)) – which input dimensions to plot against each other
 legend (bool) – whether to plot the legend on the figure
 plot_limits ((xmin, xmax, ymin, ymax) or ((xmin, xmax), (ymin, ymax))) – the plot limits for the plot
 marker (str) – markers to use; cycles if more labels than markers are given
 kwargs – the kwargs for the scatter plots

plot_steepest_gradient_map(output_labels=None, data_labels=None, which_indices=None, resolution=15, legend=True, plot_limits=None, updates=False, kern=None, marker='<>^vsd', num_samples=1000, annotation_kwargs=None, scatter_kwargs=None, **imshow_kwargs)
Plot the latent space of the GP on the inputs. This is the density of the GP posterior as a grey scale and the scatter plot of the input dimensions selected by which_indices.
Parameters:  labels (array-like) – a label for each data point (row) of the inputs
 which_indices ((int, int)) – which input dimensions to plot against each other
 resolution (int) – the resolution at which we predict the magnification factor
 legend (bool) – whether to plot the legend on the figure; if an int, plot that many legend columns
 plot_limits ((xmin, xmax, ymin, ymax) or ((xmin, xmax), (ymin, ymax))) – the plot limits for the plot
 updates (bool) – if possible, make interactive updates using the specific library you are using
 kern (Kern) – the kernel to use for prediction
 marker (str) – markers to use; cycles if more labels than markers are given
 num_samples (int) – the maximum number of samples to plot. A stratified subsample is drawn from the labels if the number of samples (in X) is higher than num_samples.
 imshow_kwargs – the kwargs for the imshow (magnification factor)
 annotation_kwargs – the kwargs for the annotation plot
 scatter_kwargs – the kwargs for the scatter plots
GPy.models.gradient_checker module

class GradientChecker(f, df, x0, names=None, *args, **kwargs)
Bases: GPy.core.model.Model
Parameters:  f – Function to check gradient for
 df – Gradient of function to check
 x0 ([array-like] | array-like | float | int) – Initial guess for inputs x (if it has a shape (a,b) this will be reflected in the parameter names). Can be a list of arrays, if f takes a list of arrays. This list will be passed to f and df in the same order as given here. If f takes only one argument, make sure not to pass a list.
 names – Names to print, when performing gradcheck. If a list was passed to x0 a list of names with the same length is expected.
 args – Arguments passed as f(x, *args, **kwargs) and df(x, *args, **kwargs)
Examples
Initialisation:
    import numpy
    import GPy
    from GPy.models import GradientChecker
    N, M, Q = 10, 5, 3
Sinusoid:
    X = numpy.random.rand(N, Q)
    grad = GradientChecker(numpy.sin, numpy.cos, X, 'x')
    grad.checkgrad(verbose=1)
Using GPy:
    X, Z = numpy.random.randn(N, Q), numpy.random.randn(M, Q)
    kern = GPy.kern.Linear(Q, ARD=True) + GPy.kern.RBF(Q, ARD=True)
    # The checker sums kern.K(x), so the matching gradient uses dL_dK = ones((N, N)).
    grad = GradientChecker(kern.K, lambda x: kern.gradients_X(numpy.ones((N, N)), x), x0=X.copy(), names='X')
    grad.checkgrad(verbose=1)
    grad.randomize()
    grad.checkgrad(verbose=1)

class
HessianChecker
(f, df, ddf, x0, names=None, *args, **kwargs)[source]¶ Bases:
GPy.models.gradient_checker.GradientChecker
Parameters:  f – Function (only used for numerical hessian gradient)
 df – Gradient of function to check
 ddf – Analytical gradient function
 x0 ([array-like] | array-like | float | int) – Initial guess for inputs x (if it has a shape (a,b) this will be reflected in the parameter names). Can be a list of arrays, if f takes a list of arrays. This list will be passed to f and df in the same order as given here. If there is only one argument, make sure not to pass a list!
 names – Names to print, when performing gradcheck. If a list was passed to x0 a list of names with the same length is expected.
 args – Arguments passed as f(x, *args, **kwargs) and df(x, *args, **kwargs)

checkgrad
(target_param=None, verbose=False, step=1e-06, tolerance=0.001, block_indices=None, plot=False)[source]¶ Overrides the checkgrad method to check the whole block instead of looping through it.
Diagnostics are shown using matshow instead.
Parameters:  verbose (bool) – If True, print a “full” checking of each parameter
 step (float (default 1e-6)) – The size of the step around which to linearise the objective
 tolerance (float (default 1e-3)) – the tolerance allowed (see note)
 Note:
 The gradient is considered correct if the ratio of the analytical and numerical gradients is within <tolerance> of unity.
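The tolerance criterion in the note can be illustrated on a scalar function. The sketch below does not call GPy: `check_derivative` is a hypothetical helper that compares an analytical derivative with a central finite difference, using the same default `step` and `tolerance` values as the signature above.

```python
import math


def check_derivative(f, df, x, step=1e-6, tolerance=1e-3):
    """Compare df(x) with a central finite difference of f around x.

    Returns (passed, ratio), where ratio = analytical / numerical.
    """
    numerical = (f(x + step) - f(x - step)) / (2.0 * step)
    analytical = df(x)
    ratio = analytical / numerical
    # Accept the derivative if the ratio is within `tolerance` of unity.
    return abs(1.0 - ratio) < tolerance, ratio


# Correct pair: d/dx sin(x) = cos(x).
passed, ratio = check_derivative(math.sin, math.cos, 0.7)
# Wrong pair: sin is not its own derivative, so the check should fail.
wrong_passed, wrong_ratio = check_derivative(math.sin, math.sin, 0.7)
```

The same comparison applies one level up for a Hessian check: pass the gradient as `f` and the second derivative as `df`.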

class
SkewChecker
(df, ddf, dddf, x0, names=None, *args, **kwargs)[source]¶ Bases:
GPy.models.gradient_checker.HessianChecker
Parameters:  df – gradient of function
 ddf – Gradient of function to check (hessian)
 dddf – Analytical gradient function (third derivative)
 x0 ([array-like] | array-like | float | int) – Initial guess for inputs x (if it has a shape (a,b) this will be reflected in the parameter names). Can be a list of arrays, if f takes a list of arrays. This list will be passed to f and df in the same order as given here. If there is only one argument, make sure not to pass a list!
 names – Names to print, when performing gradcheck. If a list was passed to x0 a list of names with the same length is expected.
 args – Arguments passed as f(x, *args, **kwargs) and df(x, *args, **kwargs)
GPy.models.ibp_lfm module¶

class
IBPLFM
(X, Y, input_dim=2, output_dim=1, rank=1, Gamma=None, num_inducing=10, Z=None, kernel=None, inference_method=None, likelihood=None, name='IBP for LFM', alpha=2.0, beta=2.0, connM=None, tau=None, mpi_comm=None, normalizer=False, variational_prior=None, **kwargs)[source]¶ Bases:
GPy.core.sparse_gp_mpi.SparseGP_MPI
Indian Buffet Process for Latent Force Models
Parameters:  Y (np.ndarray | GPy.likelihood instance) – observed data (np.ndarray) or GPy.likelihood
 X (np.ndarray) – input data (np.ndarray) [X:values, X:index], index refers to the number of the output
 input_dim (int) – latent dimensionality
 rank – number of latent functions

get_Zp_gradients
(Zp)[source]¶ Get the gradients of the posterior distribution of Zp in its specific form.

parameters_changed
()[source]¶ Method that is called upon any changes to
Param
variables within the model. In particular, in the GP class this method re-performs inference, recalculating the posterior, the log marginal likelihood and the gradients of the model.
Warning
This method is not designed to be called manually. The framework is set up to call it automatically upon changes to parameters; if you call it yourself, there may be unexpected consequences.

class
IBPPosterior
(binary_prob, tau=None, name='Sensitivity space', *a, **kw)[source]¶ Bases:
GPy.core.parameterization.parameterized.Parameterized
The IBP distribution for variational approximations.
binary_prob : the probability of including a latent function over an output.

class
IBPPrior
(rank, alpha=2.0, name='IBPPrior', **kw)[source]¶ Bases:
GPy.core.parameterization.variational.VariationalPrior

class
VarDTC_minibatch_IBPLFM
(batchsize=None, limit=3, mpi_comm=None)[source]¶ Bases:
GPy.inference.latent_function_inference.var_dtc_parallel.VarDTC_minibatch
Modifications of VarDTC_minibatch for IBP LFM
GPy.models.input_warped_gp module¶

class
InputWarpedGP
(X, Y, kernel=None, normalizer=False, warping_function=None, warping_indices=None, Xmin=None, Xmax=None, epsilon=None)[source]¶ Bases:
GPy.core.gp.GP
Input Warped GP
This defines a GP model that applies a warping function to the Input. By default, it uses Kumar Warping (CDF of Kumaraswamy distribution)
X : array_like, shape = (n_samples, n_features) for input data
Y : array_like, shape = (n_samples, 1) for output data
 kernel : object, optional
 An instance of a kernel function defined in GPy.kern. Defaults to Matern 3/2.
 warping_function : object, optional
 An instance of a warping function defined in GPy.util.input_warping_functions. Defaults to KumarWarping.
 warping_indices : list of int, optional
 A list of indices of the features in X that should be warped. It is used in the Kumar warping function.
 normalizer : bool, optional
 Whether to normalize the output.
 Xmin : list of float, optional
 The minimum values for every feature in X. It is used in the Kumar warping function.
 Xmax : list of float, optional
 The maximum values for every feature in X. It is used in the Kumar warping function.
 epsilon : float, optional
 X is normalized to [0+e, 1-e]. If not given, the default value defined in the KumarWarping function is used.
 X_untransformed : array_like, shape = (n_samples, n_features)
 A copy of the original input X.
 X_warped : array_like, shape = (n_samples, n_features)
 Input data after warping.
 warping_function : object, optional
 An instance of a warping function defined in GPy.util.input_warping_functions. Defaults to KumarWarping.
Kumar warping uses the CDF of Kumaraswamy distribution. More on the Kumaraswamy distribution can be found at the wiki page: https://en.wikipedia.org/wiki/Kumaraswamy_distribution
Snoek, J.; Swersky, K.; Zemel, R. S. & Adams, R. P. Input Warping for Bayesian Optimization of Nonstationary Functions. Preprint arXiv:1402.0929, 2014.
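As an illustration of the warping described above, the following sketch rescales one feature to [0+e, 1-e] and then applies the Kumaraswamy CDF, F(u) = 1 - (1 - u^a)^b. It does not use GPy: the helper name `kumar_warp`, the shape parameter names `a` and `b`, and the exact rescaling are assumptions, and GPy's own KumarWarping class may differ in detail.

```python
def kumar_warp(x, xmin, xmax, a=1.0, b=1.0, epsilon=1e-6):
    """Warp a scalar x in [xmin, xmax] with the Kumaraswamy CDF (illustrative)."""
    # Rescale x to the unit interval, then shrink it to [epsilon, 1 - epsilon].
    unit = (x - xmin) / (xmax - xmin)
    unit = epsilon + unit * (1.0 - 2.0 * epsilon)
    # Kumaraswamy CDF with shape parameters a and b.
    return 1.0 - (1.0 - unit ** a) ** b


# With a = b = 1 the CDF is the identity on the rescaled input.
identity_mid = kumar_warp(5.0, 0.0, 10.0)
# Other shape parameters bend the input axis while staying monotone.
warped = [kumar_warp(x, 0.0, 10.0, a=2.0, b=3.0) for x in (0.0, 5.0, 10.0)]
```

In the model, the shape parameters would be learned together with the kernel parameters; here they are fixed by hand.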

log_likelihood
()[source]¶ Compute the marginal log likelihood
For input warping, just use the normal GP log likelihood

parameters_changed
()[source]¶ Update the gradients of parameters for warping function
This method is called when having new values of parameters for warping function, kernels and other parameters in a normal GP
GPy.models.mrd module¶

class
MRD
(Ylist, input_dim, X=None, X_variance=None, initx='PCA', initz='permute', num_inducing=10, Z=None, kernel=None, inference_method=None, likelihoods=None, name='mrd', Ynames=None, normalizer=False, stochastic=False, batchsize=10)[source]¶ Bases:
GPy.models.bayesian_gplvm_minibatch.BayesianGPLVMMiniBatch
!WARNING: This is bleeding edge code and still in development. Functionality may change fundamentally during development!
Apply MRD to all given datasets Y in Ylist.
Y_i in [n x p_i]
If Ylist is a dictionary, the keys of the dictionary are the names, and the values are the different datasets to compare.
The number of samples n must match across the datasets, whereas the dimensionality p_i can differ.
Parameters:  Ylist ([array-like]) – list of datasets to apply MRD on
 input_dim (int) – latent dimensionality
 X (array-like) – mean of starting latent space q in [n x q]
 X_variance (array-like) – variance of starting latent space q in [n x q]
 initx (['concat' | 'single' | 'random']) – initialisation method for the latent space:
 'concat': PCA on the concatenation of all datasets
 'single': concatenation of PCA on each dataset separately
 'random': random draw from a Normal(0, 1)
 initz ('permute' | 'random') – initialisation method for inducing inputs
 num_inducing – number of inducing inputs to use
 Z – initial inducing inputs
 kernel ([GPy.kernels.kernels] | GPy.kernels.kernels | None (default)) – list of kernels or kernel to copy for each output
 inference_method (GPy.inference.latent_function_inference) – InferenceMethodList of inference methods, or one inference method for all
 likelihoods – the likelihoods to use
 name (str) – the name of this model
 Ynames ([str]) – the names for the datasets given; must be of equal length as Ylist, or None
 normalizer (bool | Norm) – how to normalize the data
 stochastic (bool) – whether this model should use stochastic gradient descent over the dimensions
 batchsize (int | [int]) – either one batch size for all datasets, or one batch size per dataset

factorize_space
(threshold=0.005, printOut=False, views=None)[source]¶ Given a trained MRD model, this function looks at the optimized ARD weights (lengthscales) and decides which part of the latent space is shared across views or private, according to a threshold. The threshold is applied after all weights are normalized so that the maximum value is 1.
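The thresholding step described above can be sketched without GPy. The helper `factorize` below is hypothetical: it assumes each view's ARD weights are normalized by that view's own maximum, and it treats a latent dimension as shared when it is active in every view. The library's implementation may differ, for example in how it handles more than two views.

```python
def factorize(weights_per_view, threshold=0.005):
    """Split latent dimensions into shared and private sets (illustrative).

    weights_per_view: one list of ARD weights per view, all of equal length.
    """
    active = []
    for weights in weights_per_view:
        top = max(weights)
        # Normalize so the largest weight is 1, then apply the threshold.
        active.append({q for q, w in enumerate(weights) if w / top > threshold})
    shared = set.intersection(*active)
    private = [dims - shared for dims in active]
    return shared, private


# Two views over three latent dimensions.
shared_dims, private_dims = factorize([[1.0, 0.5, 0.001], [2.0, 0.001, 1.0]])
```

In this example dimension 0 is used by both views, dimension 1 only by the first view and dimension 2 only by the second.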

log_likelihood
()[source]¶ The log marginal likelihood of the model, \(p(\mathbf{y})\), this is the objective function of the model being optimised

parameters_changed
()[source]¶ Method that is called upon any changes to
Param
variables within the model. In particular, in the GP class this method re-performs inference, recalculating the posterior, the log marginal likelihood and the gradients of the model.
Warning
This method is not designed to be called manually. The framework is set up to call it automatically upon changes to parameters; if you call it yourself, there may be unexpected consequences.

plot_latent
(labels=None, which_indices=None, resolution=60, legend=True, plot_limits=None, updates=False, kern=None, marker='<>^vsd', num_samples=1000, projection='2d', predict_kwargs={}, scatter_kwargs=None, **imshow_kwargs)[source]¶ See plotting.matplot_dep.dim_reduction_plots.plot_latent. If predict_kwargs is None, this plots the latent space for the 0th dataset (and kernel); otherwise give predict_kwargs=dict(Yindex='index') to plot only the latent space of the dataset with index 'index'.
GPy.models.multioutput_gp module¶

class
MultioutputGP
(X_list, Y_list, kernel_list, likelihood_list, name='multioutputgp', kernel_cross_covariances={}, inference_method=None)[source]¶ Bases:
GPy.core.gp.GP
Gaussian process model for using observations from multiple likelihoods and different kernels.
Parameters:  X_list – input observations in a list for each likelihood
 Y_list – output observations in a list for each likelihood
 kernel_list – kernels in a list for each likelihood
 likelihood_list – likelihoods in a list
 kernel_cross_covariances – cross covariances between different likelihoods. See class MultioutputKern for more
 inference_method – the LatentFunctionInference inference method to use for this GP

log_predictive_density
(x_test, y_test, Y_metadata=None)[source]¶ Calculation of the log predictive density
Parameters:  x_test ((Nx1) array) – test locations (x_{*})
 y_test ((Nx1) array) – test observations (y_{*})
 Y_metadata – metadata associated with the test points

predict
(Xnew, full_cov=False, Y_metadata=None, kern=None, likelihood=None, include_likelihood=True)[source]¶ Predict the function(s) at the new point(s) Xnew. This includes the likelihood variance added to the predicted underlying function (usually referred to as f).
In order to predict without adding in the likelihood give include_likelihood=False, or refer to self.predict_noiseless().
Parameters:  Xnew (np.ndarray (Nnew x self.input_dim)) – The points at which to make a prediction
 full_cov (bool) – whether to return the full covariance matrix, or just the diagonal
 Y_metadata – metadata about the predicting point to pass to the likelihood
 kern – The kernel to use for prediction (defaults to the model kern). this is useful for examining e.g. subprocesses.
 include_likelihood (bool) – Whether or not to add likelihood noise to the predicted underlying latent function f.
Returns: (mean, var):
mean: posterior mean, a Numpy array, Nnew x self.input_dim
var: posterior variance, a Numpy array, Nnew x 1 if full_cov=False, Nnew x Nnew otherwise
If full_cov and self.input_dim > 1, the return shape of var is Nnew x Nnew x self.input_dim. If self.input_dim == 1, the return shape is Nnew x Nnew. This is to allow for different normalizations of the output dimensions.
Note: If you want the predictive quantiles (e.g. 95% confidence interval) use predict_quantiles.

predict_noiseless
(Xnew, full_cov=False, Y_metadata=None, kern=None)[source]¶ Convenience function to predict the underlying function of the GP (often referred to as f) without adding the likelihood variance on the prediction function.
This is most likely what you want to use for your predictions.
Parameters:  Xnew (np.ndarray (Nnew x self.input_dim)) – The points at which to make a prediction
 full_cov (bool) – whether to return the full covariance matrix, or just the diagonal
 Y_metadata – metadata about the predicting point to pass to the likelihood
 kern – The kernel to use for prediction (defaults to the model kern). this is useful for examining e.g. subprocesses.
Returns: (mean, var):
mean: posterior mean, a Numpy array, Nnew x self.input_dim
var: posterior variance, a Numpy array, Nnew x 1 if full_cov=False, Nnew x Nnew otherwise
If full_cov and self.input_dim > 1, the return shape of var is Nnew x Nnew x self.input_dim. If self.input_dim == 1, the return shape is Nnew x Nnew. This is to allow for different normalizations of the output dimensions.
Note: If you want the predictive quantiles (e.g. 95% confidence interval) use predict_quantiles.

predict_quantiles
(X, quantiles=(2.5, 97.5), Y_metadata=None, kern=None, likelihood=None)[source]¶ Get the predictive quantiles around the prediction at X
Parameters:  X (np.ndarray (Xnew x self.input_dim)) – The points at which to make a prediction
 quantiles (tuple) – tuple of quantiles, default is (2.5, 97.5) which is the 95% interval
 kern – optional kernel to use for prediction
Returns: list of quantiles for each X and predictive quantiles for interval combination
Return type: [np.ndarray (Xnew x self.output_dim), np.ndarray (Xnew x self.output_dim)]
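For a Gaussian predictive distribution, the default quantiles (2.5, 97.5) correspond to roughly the mean plus or minus 1.96 standard deviations. The sketch below shows that arithmetic for a single test point using only the standard library. `gaussian_quantiles` is a hypothetical helper, not part of GPy, and it assumes a Gaussian likelihood; other likelihoods need a different computation.

```python
from statistics import NormalDist


def gaussian_quantiles(mean, variance, quantiles=(2.5, 97.5)):
    """Quantiles (given in percent) of a Gaussian with this mean and variance."""
    std = variance ** 0.5
    standard_normal = NormalDist()
    return [mean + standard_normal.inv_cdf(q / 100.0) * std for q in quantiles]


# Predictive mean 1.0 and predictive variance 4.0 at one test point.
lower, upper = gaussian_quantiles(mean=1.0, variance=4.0)
```

The interval is symmetric around the mean, and its half-width scales with the predictive standard deviation.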

predictive_gradients
(Xnew, kern=None)[source]¶ Compute the derivatives of the predicted latent function with respect to X* Given a set of points at which to predict X* (size [N*,Q]), compute the derivatives of the mean and variance. Resulting arrays are sized:
 dmu_dX* – [N*, Q, D], where D is the number of outputs in this GP (usually one). Note that this is not the same as computing the mean and variance of the derivative of the function!
 dv_dX* – [N*, Q] (since all outputs have the same variance)
Parameters: X (np.ndarray (Xnew x self.input_dim)) – The points at which to get the predictive gradients
Returns: dmu_dX, dv_dX
Return type: [np.ndarray (N*, Q, D), np.ndarray (N*, Q)]
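To make "derivative of the predicted mean" concrete, here is a toy case with one training point, one input dimension and an RBF kernel with unit variance and lengthscale, so N* = Q = D = 1. All names and the noise value are illustrative assumptions, and nothing here calls GPy. The analytical gradient of the posterior mean is compared with a finite difference.

```python
import math


def rbf(x1, x2):
    """RBF kernel with unit variance and unit lengthscale."""
    return math.exp(-0.5 * (x1 - x2) ** 2)


def posterior_mean(x_star, x_train, y_train, noise=0.1):
    """GP posterior mean at x_star given a single training pair."""
    return rbf(x_star, x_train) / (rbf(x_train, x_train) + noise) * y_train


def posterior_mean_gradient(x_star, x_train, y_train, noise=0.1):
    """Derivative of the posterior mean with respect to x_star."""
    dk_dx = -(x_star - x_train) * rbf(x_star, x_train)
    return dk_dx / (rbf(x_train, x_train) + noise) * y_train


x_star, x_train, y_train, h = 0.3, 0.0, 2.0, 1e-6
analytical = posterior_mean_gradient(x_star, x_train, y_train)
numerical = (posterior_mean(x_star + h, x_train, y_train)
             - posterior_mean(x_star - h, x_train, y_train)) / (2.0 * h)
```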

GPy.models.one_vs_all_classification module¶

class
OneVsAllClassification
(X, Y, kernel=None, Y_metadata=None, messages=True)[source]¶ Bases:
object
Gaussian Process classification: One vs all
This is a thin wrapper around the models.GPClassification class, with a set of sensible defaults
Parameters:  X – input observations
 Y – observed values, can be None if likelihood is not None
 kernel – a GPy kernel, defaults to rbf
Note
Multiple independent outputs are not allowed
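The one-vs-all scheme trains one binary classifier per class, each separating that class from all others. The sketch below shows only the label bookkeeping, not the GP classifiers themselves. `one_vs_all_targets` and `predict_class` are hypothetical helpers, and picking the class with the highest probability is one common way to combine the per-class outputs, assumed here rather than taken from the library.

```python
def one_vs_all_targets(labels):
    """Build one binary target vector per class: 1 for that class, 0 otherwise."""
    classes = sorted(set(labels))
    return {c: [1 if label == c else 0 for label in labels] for c in classes}


def predict_class(probabilities):
    """Pick the class whose binary classifier reports the highest probability."""
    return max(probabilities, key=probabilities.get)


labels = ["cat", "dog", "cat", "bird"]
targets = one_vs_all_targets(labels)
# Per-class probabilities for one test point, as the binary models might return them.
winner = predict_class({"bird": 0.1, "cat": 0.7, "dog": 0.4})
```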
GPy.models.one_vs_all_sparse_classification module¶

class
OneVsAllSparseClassification
(X, Y, kernel=None, Y_metadata=None, messages=True, num_inducing=10)[source]¶ Bases:
object
Gaussian Process classification: One vs all
This is a thin wrapper around the models.GPClassification class, with a set of sensible defaults
Parameters:  X – input observations
 Y – observed values, can be None if likelihood is not None
 kernel – a GPy kernel, defaults to rbf
Note
Multiple independent outputs are not allowed
GPy.models.sparse_gp_classification module¶

class
SparseGPClassification
(X, Y=None, likelihood=None, kernel=None, Z=None, num_inducing=10, Y_metadata=None, mean_function=None, inference_method=None, normalizer=False)[source]¶ Bases:
GPy.core.sparse_gp.SparseGP
Sparse Gaussian Process model for classification
This is a thin wrapper around the sparse_GP class, with a set of sensible defaults
Parameters:  X – input observations
 Y – observed values
 likelihood – a GPy likelihood, defaults to Bernoulli
 kernel – a GPy kernel, defaults to rbf+white
 inference_method (
GPy.inference.latent_function_inference.LatentFunctionInference
) – Latent function inference to use, defaults to EPDTC
 normalize_X (False | True) – whether to normalize the input data before computing (predictions will be in original scales)
 normalize_Y (False | True) – whether to normalize the output data before computing (predictions will be in original scales)
Return type: model object

static
from_dict
(input_dict, data=None)[source]¶ Instantiate a SparseGPClassification object using the information in input_dict (built by the to_dict method).
Parameters: data (tuple(np.ndarray, np.ndarray)) – provides X and Y for the case when the model was saved using save_data=False in the to_dict method.

class
SparseGPClassificationUncertainInput
(X, X_variance, Y, kernel=None, Z=None, num_inducing=10, Y_metadata=None, normalizer=None)[source]¶ Bases:
GPy.core.sparse_gp.SparseGP
Sparse Gaussian Process model for classification with uncertain inputs.
This is a thin wrapper around the sparse_GP class, with a set of sensible defaults
Parameters:  X (np.ndarray (num_data x input_dim)) – input observations
 X_variance (np.ndarray (num_data x input_dim)) – The uncertainty in the measurements of X (Gaussian variance, optional)
 Y – observed values
 kernel – a GPy kernel, defaults to rbf+white
 Z (np.ndarray (num_inducing x input_dim) | None) – inducing inputs (optional, see note)
 num_inducing (int) – number of inducing points (ignored if Z is passed, see note)
Return type: model object
Note
If no Z array is passed, num_inducing (default 10) points are selected from the data. Otherwise num_inducing is ignored.
Note
Multiple independent outputs are allowed using columns of Y

parameters_changed
()[source]¶ Method that is called upon any changes to
Param
variables within the model. In particular, in the GP class this method re-performs inference, recalculating the posterior, the log marginal likelihood and the gradients of the model.
Warning
This method is not designed to be called manually. The framework is set up to call it automatically upon changes to parameters; if you call it yourself, there may be unexpected consequences.
GPy.models.sparse_gp_coregionalized_regression module¶

class
SparseGPCoregionalizedRegression
(X_list, Y_list, Z_list=[], kernel=None, likelihoods_list=None, num_inducing=10, X_variance=None, name='SGPCR', W_rank=1, kernel_name='coreg')[source]¶ Bases:
GPy.core.sparse_gp.SparseGP
Sparse Gaussian Process model for heteroscedastic multioutput regression
This is a thin wrapper around the SparseGP class, with a set of sensible defaults
Parameters:  X_list (list of numpy arrays) – list of input observations corresponding to each output
 Y_list (list of numpy arrays) – list of observed values related to the different noise models
 Z_list (empty list | list of numpy arrays) – list of inducing inputs (optional)
 kernel (None | GPy.kernel defaults) – a GPy kernel ** Coregionalized, defaults to RBF ** Coregionalized
 num_inducing (integer | list of integers) – number of inducing inputs, defaults to 10 per output (ignored if Z_list is not empty)
 name (string) – model name
 W_rank (integer) – number of columns of the coregionalization parameters 'W' (see coregionalize kernel documentation)
 kernel_name (string) – name of the kernel
 likelihoods_list – a list of likelihoods, defaults to list of Gaussian likelihoods
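The role of W_rank can be illustrated with the usual coregionalization matrix B = W Wᵀ + diag(kappa), where W has one row per output and W_rank columns. The sketch below builds B with plain lists. The helper name and the example values are assumptions, and the exact parameterization should be checked against the coregionalize kernel documentation.

```python
def coregionalization_matrix(W, kappa):
    """Return B = W W^T + diag(kappa) for W given as a list of rows."""
    n = len(W)
    B = []
    for i in range(n):
        row = []
        for j in range(n):
            # Inner product of the i-th and j-th rows of W.
            value = sum(wi * wj for wi, wj in zip(W[i], W[j]))
            if i == j:
                value += kappa[i]
            row.append(value)
        B.append(row)
    return B


# Two outputs, W_rank = 1: W is a 2 x 1 matrix.
B = coregionalization_matrix([[1.0], [2.0]], kappa=[0.5, 0.5])
```

A larger W_rank gives W more columns and therefore lets B express richer correlations between outputs.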
GPy.models.sparse_gp_minibatch module¶

class
SparseGPMiniBatch
(X, Y, Z, kernel, likelihood, inference_method=None, name='sparse gp', Y_metadata=None, normalizer=False, missing_data=False, stochastic=False, batchsize=1)[source]¶ Bases:
GPy.core.sparse_gp.SparseGP
A general purpose Sparse GP model, allowing missing data and stochastics across dimensions.
This model allows (approximate) inference using variational DTC or FITC (Gaussian likelihoods) as well as nonconjugate sparse methods based on these.
Parameters:  X (np.ndarray (num_data x input_dim)) – inputs
 likelihood (GPy.likelihood.(Gaussian | EP | Laplace)) – a likelihood instance, containing the observed data
 kernel (a GPy.kern.kern instance) – the kernel (covariance function). See link kernels
 X_variance (np.ndarray (num_data x input_dim) | None) – The uncertainty in the measurements of X (Gaussian variance)
 Z (np.ndarray (num_inducing x input_dim)) – inducing inputs
 num_inducing (int) – Number of inducing points (optional, default 10. Ignored if Z is not None)

optimize
(optimizer=None, start=None, **kwargs)[source]¶ Optimize the model using self.log_likelihood and self.log_likelihood_gradient, as well as self.priors. kwargs are passed to the optimizer. They can be:
Parameters:  max_iters (int) – maximum number of function evaluations
 messages (bool) – whether to display during optimisation
 optimizer (string) – which optimizer to use (defaults to self.preferred optimizer). A range of optimisers can be found in GPy.inference.optimization; they include 'scg', 'lbfgs' and 'tnc'.
 ipython_notebook (bool) – whether to use ipython notebook widgets or not.
 clear_after_finish (bool) – if in ipython notebook, we can clear the widgets after optimization.

parameters_changed
()[source]¶ Method that is called upon any changes to
Param
variables within the model. In particular, in the GP class this method re-performs inference, recalculating the posterior, the log marginal likelihood and the gradients of the model.
Warning
This method is not designed to be called manually. The framework is set up to call it automatically upon changes to parameters; if you call it yourself, there may be unexpected consequences.
GPy.models.sparse_gp_regression module¶

class
SparseGPRegression
(X, Y, kernel=None, Z=None, num_inducing=10, X_variance=None, mean_function=None, normalizer=None, mpi_comm=None, name='sparse_gp')[source]¶ Bases:
GPy.core.sparse_gp_mpi.SparseGP_MPI
Gaussian Process model for regression
This is a thin wrapper around the SparseGP class, with a set of sensible defaults
Parameters:  X – input observations
 X_variance – input uncertainties, one per input X
 Y – observed values
 kernel – a GPy kernel, defaults to rbf+white
 Z (np.ndarray (num_inducing x input_dim) | None) – inducing inputs (optional, see note)
 num_inducing (int) – number of inducing points (ignored if Z is passed, see note)
Return type: model object
Note
If no Z array is passed, num_inducing (default 10) points are selected from the data. Otherwise num_inducing is ignored.
Note
Multiple independent outputs are allowed using columns of Y
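The first note says that inducing inputs are taken from the data when Z is not given. A minimal sketch of such a selection follows, assuming a uniform random subset without replacement. The helper name and the seeding are illustrative, and the library's actual selection rule may differ.

```python
import random


def select_inducing_inputs(X, num_inducing=10, seed=0):
    """Pick up to num_inducing rows of X at random, without replacement."""
    rng = random.Random(seed)
    count = min(num_inducing, len(X))
    # Copy the chosen rows so that later changes to Z do not alter X.
    return [list(row) for row in rng.sample(X, count)]


X = [[float(i)] for i in range(25)]
Z = select_inducing_inputs(X)
# With fewer data points than num_inducing, every point is used.
Z_small = select_inducing_inputs(X[:3])
```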

parameters_changed
()[source]¶ Method that is called upon any changes to
Param
variables within the model. In particular, in the GP class this method re-performs inference, recalculating the posterior, the log marginal likelihood and the gradients of the model.
Warning
This method is not designed to be called manually. The framework is set up to call it automatically upon changes to parameters; if you call it yourself, there may be unexpected consequences.
GPy.models.sparse_gp_regression_md module¶

class
SparseGPRegressionMD
(X, Y, indexD, kernel=None, Z=None, num_inducing=10, normalizer=None, mpi_comm=None, individual_Y_noise=False, name='sparse_gp')[source]¶ Bases:
GPy.core.sparse_gp_mpi.SparseGP_MPI
Sparse Gaussian Process Regression with Missing Data
This model targets the use case in which there are multiple output dimensions (different dimensions are assumed to be independent and to follow the same GP prior) and each output dimension is observed at a different set of inputs. The model takes a different data format: the input and output observations of all the output dimensions are stacked together into two corresponding matrices. An extra array indicates the index of the output dimension for each data point. The output dimensions are indexed using integers from 0 to D-1, assuming there are D output dimensions.
Parameters:  X (numpy.ndarray) – input observations.
 Y (numpy.ndarray) – output observations, each column corresponding to an output dimension.
 indexD (numpy.ndarray) – the array containing the index of output dimension for each data point
 kernel (GPy.kern.Kern or None) – a GPy kernel for GP of individual output dimensions ** defaults to RBF **
 Z (numpy.ndarray or None) – inducing inputs
 num_inducing ((int, int)) – a tuple (M, Mr). M is the number of inducing points for GP of individual output dimensions. Mr is the number of inducing points for the latent space.
 individual_Y_noise (boolean) – whether individual output dimensions have their own noise variance or not, boolean
 name (str) – the name of the model
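The stacked data format can be sketched as follows: each output dimension has its own inputs and observations, and these are concatenated while an index array records which output each row belongs to. `stack_outputs` is a hypothetical helper using plain lists. With NumPy the same result could be obtained by vertically stacking the arrays.

```python
def stack_outputs(datasets):
    """Stack per-output (inputs, outputs) pairs into X, Y and indexD.

    datasets: list of (inputs, outputs) pairs, one pair per output dimension.
    """
    X, Y, indexD = [], [], []
    for d, (inputs, outputs) in enumerate(datasets):
        X.extend(inputs)
        Y.extend(outputs)
        # Every row of this block belongs to output dimension d.
        indexD.extend([d] * len(inputs))
    return X, Y, indexD


# Output 0 is observed at two inputs, output 1 at a single, different input.
X, Y, indexD = stack_outputs([
    ([[0.0], [1.0]], [[0.5], [0.7]]),
    ([[0.5]], [[1.2]]),
])
```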

parameters_changed
()[source]¶ Method that is called upon any changes to
Param
variables within the model. In particular, in the GP class this method re-performs inference, recalculating the posterior, the log marginal likelihood and the gradients of the model.
Warning
This method is not designed to be called manually. The framework is set up to call it automatically upon changes to parameters; if you call it yourself, there may be unexpected consequences.
GPy.models.sparse_gplvm module¶

class
SparseGPLVM
(Y, input_dim, X=None, kernel=None, init='PCA', num_inducing=10)[source]¶ Bases:
GPy.models.sparse_gp_regression.SparseGPRegression
Sparse Gaussian Process Latent Variable Model
Parameters:  Y (np.ndarray) – observed data
 input_dim (int) – latent dimensionality
 init ('PCA' | 'random') – initialisation method for the latent space

parameters_changed
()[source]¶ Method that is called upon any changes to
Param
variables within the model. In particular, in the GP class this method re-performs inference, recalculating the posterior, the log marginal likelihood and the gradients of the model.
Warning
This method is not designed to be called manually. The framework is set up to call it automatically upon changes to parameters; if you call it yourself, there may be unexpected consequences.

plot_latent
(labels=None, which_indices=None, resolution=50, ax=None, marker='o', s=40, fignum=None, plot_inducing=True, legend=True, plot_limits=None, aspect='auto', updates=False, predict_kwargs={}, imshow_kwargs={})[source]¶ Convenience function for plotting the fit of a GP. This is the same as plot, except it plots the latent function fit of the GP!
If you want fine-grained control, use the specific plotting functions supplied in the model.
You can deactivate the legend for this one plot by supplying None to label.
Give the Y_metadata in the predict_kw if you need it.
Parameters:  plot_limits (np.array) – The limits of the plot. If 1D [xmin,xmax], if 2D [[xmin,ymin],[xmax,ymax]]. Defaults to data limits
 fixed_inputs (a list of tuples) – a list of tuple [(i,v), (i,v)…], specifying that input dimension i should be set to value v.
 resolution (int) – The resolution of the prediction [default:200]
 apply_link (bool) – whether to apply the link function of the GP to the raw prediction.
 which_data_ycols ('all' or a list of integers) – when the data has several columns (independent outputs), only plot these
 which_data_rows ('all' or a slice object to slice self.X, self.Y) – which of the training data to plot (default all)
 visible_dims (arraylike) – an array specifying the input dimensions to plot (maximum two)
 levels (int) – the number of levels in the density (number bigger than 1, where 35 is smooth and 1 is the same as plot_confidence). You can go higher than 50 if the result is not smooth enough for you.
 samples (int) – the number of samples to draw from the GP and plot into the plot. This will always be samples from the latent function.
 lower (float) – the lower percentile to plot
 upper (float) – the upper percentile to plot
 plot_data (bool) – plot the data into the plot?
 plot_inducing (bool) – plot inducing inputs?
 plot_density (bool) – plot density instead of the confidence interval?
 predict_kw (dict) – the keyword arguments for the prediction. If you want to plot a specific kernel give dict(kern=<specific kernel>) in here
 error_kwargs (dict) – kwargs for the error plot for the plotting library you are using
 plot_kwargs (kwargs) – kwargs for the data plot for the plotting library you are using
GPy.models.ss_gplvm module¶

class
IBPPosterior
(means, variances, binary_prob, tau=None, sharedX=False, name='latent space')[source]¶ Bases:
GPy.core.parameterization.variational.SpikeAndSlabPosterior
The SpikeAndSlab distribution for variational approximations.
binary_prob : the probability of the distribution on the slab part.

class
IBPPrior
(input_dim, alpha=2.0, name='IBPPrior', **kw)[source]¶ Bases:
GPy.core.parameterization.variational.VariationalPrior

class
SLVMPosterior
(means, variances, binary_prob, tau=None, name='latent space')[source]¶ Bases:
GPy.core.parameterization.variational.SpikeAndSlabPosterior
The SpikeAndSlab distribution for variational approximations.
binary_prob : the probability of the distribution on the slab part.

class
SLVMPrior
(input_dim, alpha=1.0, beta=1.0, Z=None, name='SLVMPrior', **kw)[source]¶ Bases:
GPy.core.parameterization.variational.VariationalPrior

class
SSGPLVM
(Y, input_dim, X=None, X_variance=None, Gamma=None, init='PCA', num_inducing=10, Z=None, kernel=None, inference_method=None, likelihood=None, name='Spike_and_Slab GPLVM', group_spike=False, IBP=False, SLVM=False, alpha=2.0, beta=2.0, connM=None, tau=None, mpi_comm=None, pi=None, learnPi=False, normalizer=False, sharedX=False, variational_prior=None, **kwargs)[source]¶ Bases:
GPy.core.sparse_gp_mpi.SparseGP_MPI
Spike-and-Slab Gaussian Process Latent Variable Model
Parameters:  Y (np.ndarray | GPy.likelihood instance) – observed data (np.ndarray) or GPy.likelihood
 input_dim (int) – latent dimensionality
 init ('PCA' | 'random') – initialisation method for the latent space

get_X_gradients
(X)[source]¶ Get the gradients of the posterior distribution of X in its specific form.

parameters_changed
()[source]¶ Method that is called upon any changes to
Param
variables within the model. In particular, in the GP class this method re-performs inference, recalculating the posterior, the log marginal likelihood and the gradients of the model.
Warning
This method is not designed to be called manually. The framework is set up to call it automatically upon changes to parameters; if you call it yourself, there may be unexpected consequences.

plot_inducing
(which_indices=None, legend=False, plot_limits=None, marker=None, projection='2d', **kwargs)¶ Plot a scatter plot of the inducing inputs.
Parameters:  which_indices ([int]) – which input dimensions to plot against each other
 legend (bool) – whether to plot the legend on the figure
 plot_limits ((xmin, xmax, ymin, ymax) or ((xmin, xmax), (ymin, ymax))) – the plot limits for the plot
 marker (str) – marker to use [default is custom arrow like]
 kwargs – the kwargs for the scatter plots
 projection (str) – for now 2d or 3d projection (other projections can be implemented, see developer documentation)

plot_latent
(labels=None, which_indices=None, resolution=60, legend=True, plot_limits=None, updates=False, kern=None, marker='<>^vsd', num_samples=1000, projection='2d', scatter_kwargs=None, **imshow_kwargs)¶ Plot the latent space of the GP on the inputs. This is the density of the GP posterior as a grey scale and the scatter plot of the input dimensions selected by which_indices.
Parameters:  labels (array-like) – a label for each data point (row) of the inputs
 which_indices ((int, int)) – which input dimensions to plot against each other
 resolution (int) – the resolution at which we predict the magnification factor
 legend (bool) – whether to plot the legend on the figure
 plot_limits ((xmin, xmax, ymin, ymax) or ((xmin, xmax), (ymin, ymax))) – the plot limits for the plot
 updates (bool) – if possible, make interactive updates using the specific library you are using
 kern (
Kern
) – the kernel to use for prediction
 marker (str) – markers to use; they are cycled if more labels than markers are given
 num_samples (int) – the maximum number of samples to plot. A stratified subsample is drawn from the labels if the number of samples (in X) is higher than num_samples.
 imshow_kwargs – the kwargs for the imshow (magnification factor)
 scatter_kwargs – the kwargs for the scatter plots

plot_scatter
(labels=None, which_indices=None, legend=True, plot_limits=None, marker='<>^vsd', num_samples=1000, projection='2d', **kwargs)¶ Plot a scatter plot of the latent space.
Parameters:  labels (array-like) – a label for each data point (row) of the inputs
 which_indices ((int, int)) – which input dimensions to plot against each other
 legend (bool) – whether to plot the legend on the figure
 plot_limits ((xmin, xmax, ymin, ymax) or ((xmin, xmax), (ymin, ymax))) – the plot limits for the plot
 marker (str) – markers to use; they are cycled if more labels than markers are given
 kwargs – the kwargs for the scatter plots

plot_steepest_gradient_map
(output_labels=None, data_labels=None, which_indices=None, resolution=15, legend=True, plot_limits=None, updates=False, kern=None, marker='<>^vsd', num_samples=1000, annotation_kwargs=None, scatter_kwargs=None, **imshow_kwargs)¶ Plot the latent space of the GP on the inputs. This is the density of the GP posterior as a grey scale and the scatter plot of the input dimensions selected by which_indices.
Parameters:  labels (array-like) – a label for each data point (row) of the inputs
 which_indices ((int, int)) – which input dimensions to plot against each other
 resolution (int) – the resolution at which we predict the magnification factor
 legend (bool) – whether to plot the legend on the figure; if an int, the number of legend columns
 plot_limits ((xmin, xmax, ymin, ymax) or ((xmin, xmax), (ymin, ymax))) – the plot limits for the plot
 updates (bool) – if possible, make interactive updates using the specific library you are using
 kern (
Kern
) – the kernel to use for prediction
 marker (str) – markers to use; they are cycled if more labels than markers are given
 num_samples (int) – the maximum number of samples to plot. A stratified subsample is drawn from the labels if the number of samples (in X) is higher than num_samples.
 imshow_kwargs – the kwargs for the imshow (magnification factor)
 annotation_kwargs – the kwargs for the annotation plot
 scatter_kwargs – the kwargs for the scatter plots
GPy.models.ss_mrd module¶
The Manifold Relevance Determination model with the spike-and-slab prior

class
IBPPrior_SSMRD
(nModels, input_dim, alpha=2.0, tau=None, name='IBPPrior', **kw)[source]¶ Bases:
GPy.core.parameterization.variational.VariationalPrior

class
SSMRD
(Ylist, input_dim, X=None, X_variance=None, Gammas=None, initx='PCA_concat', initz='permute', num_inducing=10, Zs=None, kernels=None, inference_methods=None, likelihoods=None, group_spike=True, pi=0.5, name='ss_mrd', Ynames=None, mpi_comm=None, IBP=False, alpha=2.0, taus=None)[source]¶ Bases:
GPy.core.model.Model

optimize
(optimizer=None, start=None, **kwargs)[source]¶ Optimize the model using self.log_likelihood and self.log_likelihood_gradient, as well as self.priors.
kwargs are passed to the optimizer. They can be:
Parameters:  max_iters (int) – maximum number of function evaluations
 optimizer (string) – which optimizer to use (defaults to self.preferred optimizer)
Messages: True: Display messages during optimisation, “ipython_notebook”:
 Valid optimizers are:
 ‘scg’: scaled conjugate gradient method, recommended for stability.
 See also GPy.inference.optimization.scg
 ‘fmin_tnc’: truncated Newton method (see scipy.optimize.fmin_tnc)
 ‘simplex’: the Nelder-Mead simplex method (see scipy.optimize.fmin),
 ‘lbfgsb’: the L-BFGS-B method (see scipy.optimize.fmin_l_bfgs_b),
 ‘lbfgs’: the BFGS method (see scipy.optimize.fmin_bfgs),
 ‘sgd’: stochastic gradient descent (see scipy.optimize.sgd). For experts only!

parameters_changed
()[source]¶ This method gets called when parameters have changed. Another way of listening to param changes is to add self as a listener to the param, such that updates get passed through. See
paramz.param.Observable.add_observer

optimizer_array
¶ Array for the optimizer to work on. This array always lives in the space of the optimizer; it is therefore untransformed, i.e. mapped back through the Transformations.
Setting this array makes sure the transformed parameters of this model are set accordingly. It has to be set with an array retrieved from this property, because e.g. fixing parameters resizes the array.
The optimizer should only interact with this array, so that transformations are respected.
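The relationship between the optimizer's array and a constrained parameter can be sketched as follows. This is only an illustration of the idea: the class PositiveParam is hypothetical, and the plain log/exp mapping is an assumption (the library's own transformation classes may use a different mapping).

```python
import numpy as np

class PositiveParam:
    """Hypothetical stand-in for a model with one positive-constrained parameter."""

    def __init__(self, value):
        self.value = float(value)  # constrained (model) space, must stay > 0

    @property
    def optimizer_array(self):
        # The optimizer sees the unconstrained value log(value).
        return np.array([np.log(self.value)])

    @optimizer_array.setter
    def optimizer_array(self, x):
        # Writing the optimizer's array updates the constrained parameter.
        self.value = float(np.exp(x[0]))

m = PositiveParam(2.0)
x = m.optimizer_array   # retrieve the array first, then modify it and write it back
x = x - 5.0             # any real number is valid in the optimizer's space
m.optimizer_array = x
```

Whatever value the optimizer writes, the parameter seen by the model remains positive, which is why the optimizer is meant to touch only this array.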


class
SpikeAndSlabPrior_SSMRD
(nModels, pi=0.5, learnPi=False, group_spike=True, variance=1.0, name='SSMRDPrior', **kw)[source]¶ Bases:
GPy.core.parameterization.variational.SpikeAndSlabPrior
GPy.models.state_space module¶
GPy.models.state_space_cython module¶
GPy.models.state_space_main module¶
Main functionality for state-space inference.

class
AddMethodToClass
(func=None, tp='staticmethod')[source]¶ Bases:
object
func: function to add
tp: string. Type of the method: normal, staticmethod, classmethod

class
ContDescrStateSpace
[source]¶ Bases:
GPy.models.state_space_main.DescreteStateSpace
Class for the continuous-discrete Kalman filter. The state equation is continuous while the measurement equation is discrete.
d x(t)/dt = F x(t) + L q; where q ~ N(0, Qc)
y_{t_k} = H_{k} x_{t_k} + r_{k}; r_{k} ~ N(0, R_{k})
class
AQcompute_batch_Python
(F, L, Qc, dt, compute_derivatives=False, grad_params_no=None, P_inf=None, dP_inf=None, dF=None, dQc=None)[source]¶ Bases:
GPy.models.state_space_main.Q_handling_Python
Class for calculating the matrices A, Q, dA, dQ of the discrete Kalman filter from the matrices F, L, Qc, P_inf, dF, dQc, dP_inf of the continuous state equation. dt: time steps.
It has the same interface as AQcompute_once.
It computes the matrices for all time steps. This object is used when there are not too many different time steps (the threshold is controlled by an internal variable) and storing all the matrices does not take too much memory.
Since all the matrices are computed together, this object can be used in the smoother without repeating the computations.
Constructor. All necessary parameters are passed here and stored in the object.
 F, L, Qc, P_inf : matrices
 Parameters of corresponding continuous state model
 dt: array
 All time steps
 compute_derivatives: bool
 Whether to calculate derivatives
 dP_inf, dF, dQc: 3D array
 Derivatives if they are required
Output: Nothing

Ak
(k, m, P)[source]¶ Returns the Jacobian of the dynamic function; it is passed into p_a.
k: iteration number, starts at 0. m: point where the Jacobian is evaluated. P: parameter for the Jacobian, usually the covariance matrix.

Q_inverse
(k, p_largest_cond_num, p_regularization_type)[source]¶ Inverts the Q matrix and regularizes the inverse. Regularization is useful when the original matrix is badly conditioned. The function is currently used only in SparseGP code.
k: int. Iteration number.
p_largest_cond_num: float. Largest condition number allowed for the inverted matrix. If the condition number is smaller than this, no regularization happens.
p_regularization_type: int (1 or 2). Regularization type.
type 1: 1/(S[k] + regularizer), where the regularizer is computed
type 2: S[k]/(S^2[k] + regularizer), where the regularizer is computed
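The two regularization formulas can be illustrated on the singular values of Q. The sketch below is not the library's code: the function name regularized_inverse is hypothetical, and the regularizer is passed in by the caller as an assumption, since the text only states that it "is computed".

```python
import numpy as np

def regularized_inverse(Q, regularizer, reg_type):
    """Invert Q through its SVD while damping small singular values.

    `regularizer` is supplied by the caller (an assumption of this sketch).
    """
    U, S, Vt = np.linalg.svd(Q)
    if reg_type == 1:
        inv_S = 1.0 / (S + regularizer)        # type 1
    elif reg_type == 2:
        inv_S = S / (S**2 + regularizer)       # type 2
    else:
        raise ValueError("reg_type must be 1 or 2")
    # Q = U diag(S) Vt, so the regularized inverse is V diag(inv_S) U^T.
    return Vt.T @ np.diag(inv_S) @ U.T

Q = np.diag([4.0, 1e-8])                       # badly conditioned example
Q_inv_1 = regularized_inverse(Q, 1e-3, 1)
Q_inv_2 = regularized_inverse(Q, 1e-3, 2)
```

In both variants a tiny singular value no longer produces a huge entry in the inverse, so the result is much better conditioned than the exact inverse would be.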

Qk
(k)[source]¶ Returns the noise matrix of the dynamic model on iteration k. k: iteration number, starts at 0.

dAk
(k)[source]¶ Returns the derivative of A on iteration k. k: iteration number, starts at 0.

dQk
(k)[source]¶ Returns the derivative of Q on iteration k. k: iteration number, starts at 0.

class
AQcompute_once
(F, L, Qc, dt, compute_derivatives=False, grad_params_no=None, P_inf=None, dP_inf=None, dF=None, dQc=None)[source]¶ Bases:
GPy.models.state_space_main.Q_handling_Python
Class for calculating the matrices A, Q, dA, dQ of the discrete Kalman filter from the matrices F, L, Qc, P_inf, dF, dQc, dP_inf of the continuous state equation. dt: time steps.
It has the same interface as AQcompute_batch.
It computes the matrices for only one time step. This object is used when there are many different time steps and storing the matrices for each of them would take too much memory.
Constructor. All necessary parameters are passed here and stored in the object.
 F, L, Qc, P_inf : matrices
 Parameters of corresponding continuous state model
 dt: array
 All time steps
 compute_derivatives: bool
 Whether to calculate derivatives
 dP_inf, dF, dQc: 3D array
 Derivatives if they are required
Output: Nothing

Ak
(k, m, P)[source]¶ Returns the Jacobian of the dynamic function; it is passed into p_a.
k: iteration number, starts at 0. m: point where the Jacobian is evaluated. P: parameter for the Jacobian, usually the covariance matrix.

Q_inverse
(k, p_largest_cond_num, p_regularization_type)[source]¶ Inverts the Q matrix and regularizes the inverse. Regularization is useful when the original matrix is badly conditioned. The function is currently used only in SparseGP code.
k: int. Iteration number.
p_largest_cond_num: float. Largest condition number allowed for the inverted matrix. If the condition number is smaller than this, no regularization happens.
p_regularization_type: int (1 or 2). Regularization type.
type 1: 1/(S[k] + regularizer), where the regularizer is computed
type 2: S[k]/(S^2[k] + regularizer), where the regularizer is computed

Q_srk
(k)[source]¶ Check square root, maybe rewriting for Spectral decomposition is needed. Square root of the noise matrix Q

Qk
(k)[source]¶ Returns the noise matrix of the dynamic model on iteration k. k: iteration number, starts at 0.

dAk
(k)[source]¶ Returns the derivative of A on iteration k. k: iteration number, starts at 0.

dQk
(k)[source]¶ Returns the derivative of Q on iteration k. k: iteration number, starts at 0.

classmethod
cont_discr_kalman_filter
(F, L, Qc, p_H, p_R, P_inf, X, Y, index=None, m_init=None, P_init=None, p_kalman_filter_type='regular', calc_log_likelihood=False, calc_grad_log_likelihood=False, grad_params_no=0, grad_calc_params=None)[source]¶ This function implements the continuous-discrete Kalman filter algorithm. The following notation for the state-space model is assumed:
d/dt x(t) = F * x(t) + L * w(t); w(t) ~ N(0, Qc)
y_{k} = H_{k} * x_{k} + r_{k}; r_{k} ~ N(0, R_{k})
Returns estimated filter distributions x_{k} ~ N(m_{k}, P(k))
1) The function generally does not modify the passed parameters; if it does, that is an error. There are several exceptions: scalars can be converted into a matrix, and in some rare cases the shapes of the derivative matrices may be changed. This is ignored for now.
2) Copies of F, L, Qc are created in memory because they may be used later in the smoother. References to the copies are kept in the “AQcomp” object return parameter.
3) The function supports “multiple time series mode”, which means that exactly the same state-space model is used to filter several sets of measurements. In this case the third dimension of Y should contain these state-space measurements. Log_likelihood and Grad_log_likelihood then have the corresponding dimensions.
4) Calculation of Grad_log_likelihood is not supported if the matrices H or R change over time (with index k). (This may be changed later.)
5) Measurements may include missing values. In this case the update step is not done for that measurement. (This may be changed later.)
 F: (state_dim, state_dim) matrix
 F in the model.
 L: (state_dim, noise_dim) matrix
 L in the model.
 Qc: (noise_dim, noise_dim) matrix
 Q_c in the model.
 p_H: scalar, matrix (measurement_dim, state_dim) , 3D array
 H_{k} in the model. If matrix then H_{k} = H - constant. If it is 3D array then H_{k} = p_H[:,:, index[0,k]]
 p_R: scalar, square symmetric matrix, 3D array
 R_{k} in the model. If matrix then R_{k} = R - constant. If it is 3D array then R_{k} = p_R[:,:, index[1,k]]
 P_inf: (state_dim, state_dim) matrix
 State variance matrix at infinity.
 X: 1D array
 Time points of measurements. Needed for converting the continuous problem to the discrete one.
 Y: matrix or vector or 3D array
 Data. If Y is a matrix then samples are along the 0th dimension and features along the 1st. If it is a 3D array then the third dimension corresponds to “multiple time series mode”.
 index: vector
 Which indices (on the 3rd dimension) from arrays p_H, p_R to use on every time step. If this parameter is None then it is assumed that p_H, p_R do not change over time and indices are not needed. index[0,:] corresponds to H, index[1,:] corresponds to R. If index.shape[0] == 1, it is assumed that the indices for all matrices are the same.
 m_init: vector or matrix
 Initial distribution mean. If None it is assumed to be zero. For “multiple time series mode” it is a matrix, the second dimension of which corresponds to different time series. In the regular case (“one time series mode”) it is a vector.
 P_init: square symmetric matrix or scalar
 Initial covariance of the states. If the parameter is a scalar then the initial covariance matrix is assumed to be the unit matrix multiplied by this scalar. If None the unit matrix is used instead. “Multiple time series mode” does not affect it, since it does not affect anything related to state variances.
 p_kalman_filter_type: string, one of (‘regular’, ‘svd’)
 Which Kalman filter is used: regular or SVD. SVD is more numerically stable; in particular, covariance matrices are guaranteed to be positive semi-definite. However, ‘svd’ is slower, especially for small data, due to SVD call overhead.
 calc_log_likelihood: boolean
 Whether to calculate the marginal likelihood of the state-space model.
 calc_grad_log_likelihood: boolean
 Whether to calculate the gradient of the marginal likelihood of the state-space model. If true then the “grad_calc_params” parameter must provide the extra parameters for gradient calculation.
 grad_params_no: int
 If the previous parameter is true, then this parameter gives the total number of parameters in the gradient.
 grad_calc_params: dictionary
 Dictionary with derivatives of model matrices with respect to parameters: “dF”, “dL”, “dQc”, “dH”, “dR”, “dm_init”, “dP_init”. They can be None, in which case zero matrices (no dependence on parameters) are assumed. If there is only one parameter then the third dimension is added automatically.
 M: (no_steps+1, state_dim) matrix or (no_steps+1, state_dim, time_series_no) 3D array
 Filter estimates of the state means. The extra step holds the initial value. In the “multiple time series mode” the third dimension corresponds to different time series.
 P: (no_steps+1, state_dim, state_dim) 3D array
 Filter estimates of the state covariances. The extra step holds the initial value.
 log_likelihood: double or (1, time_series_no) 3D array
 If the parameter calc_log_likelihood was set to true, the logarithm of the marginal likelihood of the state-space model is returned. If the parameter was false, None is returned. In the “multiple time series mode” it is a vector providing the log_likelihood for each time series.
 grad_log_likelihood: column vector or (grad_params_no, time_series_no) matrix
 If calc_grad_log_likelihood is true, the gradient of the log likelihood with respect to the parameters is returned. It is returned column-wise, so in “multiple time series mode” the gradients for each time series are in the corresponding column.
 AQcomp: object
 Contains some precomputed values for converting the continuous model into the discrete one. It can be used later in the smoothing phase.

classmethod
cont_discr_rts_smoother
(state_dim, filter_means, filter_covars, p_dynamic_callables=None, X=None, F=None, L=None, Qc=None)[source]¶ Continuous-discrete Rauch–Tung–Striebel (RTS) smoother.
This function implements the Rauch–Tung–Striebel (RTS) smoother algorithm based on the results of _cont_discr_kalman_filter_raw.
 Model:
 d/dt x(t) = F * x(t) + L * w(t); w(t) ~ N(0, Qc)
 y_{k} = H_{k} * x_{k} + r_{k}; r_{k} ~ N(0, R_{k})
Returns estimated smoother distributions x_{k} ~ N(m_{k}, P(k))
 filter_means: (no_steps+1,state_dim) matrix or (no_steps+1,state_dim, time_series_no) 3D array
 Results of the Kalman Filter means estimation.
 filter_covars: (no_steps+1, state_dim, state_dim) 3D array
 Results of the Kalman Filter covariance estimation.
 p_dynamic_callables: object or None
 Object from the filter phase which provides functions for computing A, Q, dA, dQ for the discrete model from the continuous model.
 X, F, L, Qc: matrices
 If p_dynamic_callables is None, these matrices are used to create this object from scratch.
 M: (no_steps+1,state_dim) matrix
 Smoothed estimates of the state means
 P: (no_steps+1,state_dim, state_dim) 3D array
 Smoothed estimates of the state covariances

static
lti_sde_to_descrete
(F, L, Qc, dt, compute_derivatives=False, grad_params_no=None, P_inf=None, dP_inf=None, dF=None, dQc=None)[source]¶ Linear Time-Invariant Stochastic Differential Equation (LTI SDE):
dx(t) = F x(t) dt + L d eta, where
x(t): (vector) stochastic process
eta: (vector) Brownian motion process
F, L: (time-invariant) matrices of corresponding dimensions
Qc: covariance of noise.
This function rewrites it into the corresponding state-space form:
x_{k} = A_{k} * x_{k-1} + q_{k-1}; q_{k-1} ~ N(0, Q_{k-1})
TODO: this function can be redone to “preprocess dataset”, when close time points are handled properly (with rounding parameter) and values are averaged accordingly.
F,L: LTI SDE matrices of corresponding dimensions
 Qc: matrix (n,n)
 Covariance between different dimensions of noise eta. n is the dimensionality of the noise.
 dt: double or iterable
 Time difference used on this iteration. If dt is iterable, then A and Q_noise are computed for every unique dt
 compute_derivatives: boolean
 Whether derivatives of A and Q are required.
 grad_params_no: int
 Number of gradient parameters
P_inf: (state_dim, state_dim) matrix
dP_inf
 dF: 3D array
 Derivatives of F
 dQc: 3D array
 Derivatives of Qc
 dR: 3D array
 Derivatives of R
 A: matrix
 A_{k}. Because the SDE is LTI, only dt can make the matrix differ between different k.
 Q_noise: matrix
 Covariance matrix of (vector) q_{k-1}. Only dt can make the matrix differ between different k.
 reconstruct_index: array
 If dt was iterable, three-dimensional arrays A and Q_noise are returned. The third dimension of these arrays corresponds to the unique dt’s. reconstruct_index contains the indices of the original dt’s in the unique dt sequence. A[:,:, reconstruct_index[5]] is the matrix A of the 6th (indices start from zero) dt in the original sequence.
 dA: 3D array
 Derivatives of A
 dQ: 3D array
 Derivatives of Q
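The role of the unique time steps and of reconstruct_index can be sketched with plain NumPy. This is an illustration under stated assumptions, not the library's implementation: the helper names expm_taylor and discretize are hypothetical, the matrix exponential uses a simple truncated Taylor series, and Q is obtained from the stationary identity Q = P_inf - A P_inf A^T, which holds only for stationary models.

```python
import numpy as np

def expm_taylor(M, terms=30):
    """Truncated Taylor series for exp(M); adequate for small, well-scaled M."""
    result = np.eye(M.shape[0])
    term = np.eye(M.shape[0])
    for n in range(1, terms):
        term = term @ M / n
        result = result + term
    return result

def discretize(F, P_inf, dt):
    """Return A and Q_noise (one slice per unique dt) plus reconstruct_index."""
    unique_dt, reconstruct_index = np.unique(np.atleast_1d(dt), return_inverse=True)
    n = F.shape[0]
    A = np.empty((n, n, unique_dt.size))
    Q_noise = np.empty((n, n, unique_dt.size))
    for j, step in enumerate(unique_dt):
        A[:, :, j] = expm_taylor(F * step)
        # Stationary-model identity (assumed here): Q = P_inf - A P_inf A^T
        Q_noise[:, :, j] = P_inf - A[:, :, j] @ P_inf @ A[:, :, j].T
    return A, Q_noise, reconstruct_index

# Ornstein-Uhlenbeck process: dx = -lam * x dt + noise, stationary variance sigma2
lam, sigma2 = 2.0, 1.5
F = np.array([[-lam]])
P_inf = np.array([[sigma2]])
dt = np.array([0.1, 0.2, 0.1, 0.1, 0.2, 0.3])
A, Q_noise, reconstruct_index = discretize(F, P_inf, dt)
A_6th = A[:, :, reconstruct_index[5]]   # matrix A for the 6th dt of the original sequence
```

Six time steps give only three unique values, so only three matrices are stored, and reconstruct_index maps every original step back to its slice.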

class
DescreteStateSpace
[source]¶ Bases:
object
This class implements state-space inference for linear and nonlinear state-space models. Linear models are:
x_{k} = A_{k} * x_{k-1} + q_{k-1}; q_{k-1} ~ N(0, Q_{k-1})
y_{k} = H_{k} * x_{k} + r_{k}; r_{k} ~ N(0, R_{k})
Nonlinear:
x_{k} = f_a(k, x_{k-1}, A_{k}) + q_{k-1}; q_{k-1} ~ N(0, Q_{k-1})
y_{k} = f_h(k, x_{k}, H_{k}) + r_{k}; r_{k} ~ N(0, R_{k})
Here f_a and f_h are some functions of k (iteration number), of x_{k-1} or x_{k} (state value on a certain iteration), and of A_{k} and H_{k}, the Jacobian matrices of f_a and f_h respectively. In the linear case they are exactly A_{k} and H_{k}.
Currently two nonlinear Gaussian filter algorithms are implemented: the Extended Kalman Filter (EKF) and the Statistically Linearized Filter (SLF), whose implementations are very similar.

classmethod
extended_kalman_filter
(p_state_dim, p_a, p_f_A, p_f_Q, p_h, p_f_H, p_f_R, Y, m_init=None, P_init=None, calc_log_likelihood=False)[source]¶ Extended Kalman Filter
p_state_dim: integer
 p_a: if None, the function from the linear model is assumed, i.e. no nonlinearity in the dynamics.
 Otherwise function (k, x_{k-1}, A_{k}): the dynamic function. k: iteration number, x_{k-1}: previous state, A_{k}: Jacobian matrix of f_a. In the linear case it is exactly A_{k}.
 p_f_A: if a matrix, a function which returns this matrix is assumed. See the description of this parameter in the kalman_filter function.
 Otherwise function (k, m, P) which returns the Jacobian of the dynamic function; it is passed into p_a. k: iteration number, m: point where the Jacobian is evaluated, P: parameter for the Jacobian, usually the covariance matrix.
 p_f_Q: if a matrix, a function which returns this matrix is assumed. See the description of this parameter in the kalman_filter function.
 Otherwise function (k) which returns the noise matrix of the dynamic model on iteration k. k: iteration number.
 p_h: if None, the function from the linear measurement model is assumed, i.e. no nonlinearity in the measurement.
 Otherwise function (k, x_{k}, H_{k}): the measurement function. k: iteration number, x_{k}: current state, H_{k}: Jacobian matrix of f_h. In the linear case it is exactly H_{k}.
 p_f_H: if a matrix, a function which returns this matrix is assumed.
 Otherwise function (k, m, P) which returns the Jacobian of the measurement function; it is passed into p_h. k: iteration number, m: point where the Jacobian is evaluated, P: parameter for the Jacobian, usually the covariance matrix.
 p_f_R: if a matrix, a function which returns this matrix is assumed.
 Otherwise function (k) which returns the noise matrix of the measurement equation on iteration k. k: iteration number.
 Y: matrix or vector
 Data. If Y is a matrix then samples are along the 0th dimension and features along the 1st. May have missing values.
 m_init: vector
 Initial distribution mean. If None it is assumed to be zero.
 P_init: square symmetric matrix or scalar
 Initial covariance of the states. If the parameter is a scalar then the initial covariance matrix is assumed to be the unit matrix multiplied by this scalar. If None the unit matrix is used instead.
 calc_log_likelihood: boolean
 Whether to calculate the marginal likelihood of the state-space model.
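How the six callables fit together in one filter iteration can be sketched as follows. The function ekf_step and the scalar example model are hypothetical and written only for illustration; the callables follow the argument conventions described in the parameter list, and the code is the textbook EKF recursion rather than the library's implementation.

```python
import numpy as np

def ekf_step(k, m, P, y, p_a, p_f_A, p_f_Q, p_h, p_f_H, p_f_R):
    """One predict/update cycle of an extended Kalman filter (illustrative)."""
    # Prediction: linearize the dynamics around the previous mean.
    A = p_f_A(k, m, P)
    m_pred = p_a(k, m, A)
    P_pred = A @ P @ A.T + p_f_Q(k)
    # Update: linearize the measurement function around the predicted mean.
    H = p_f_H(k, m_pred, P_pred)
    S = H @ P_pred @ H.T + p_f_R(k)          # innovation covariance
    K = np.linalg.solve(S, H @ P_pred).T     # Kalman gain P_pred H^T S^{-1}
    m_new = m_pred + K @ (y - p_h(k, m_pred, H))
    P_new = P_pred - K @ S @ K.T
    return m_new, P_new

# Hypothetical scalar model: x_k = sin(x_{k-1}) + q,  y_k = x_k**2 + r
callables = dict(
    p_a=lambda k, x, A: np.sin(x),
    p_f_A=lambda k, m, P: np.array([[np.cos(m[0])]]),
    p_f_Q=lambda k: np.array([[0.01]]),
    p_h=lambda k, x, H: x**2,
    p_f_H=lambda k, m, P: np.array([[2.0 * m[0]]]),
    p_f_R=lambda k: np.array([[0.1]]),
)
m1, P1 = ekf_step(0, np.array([1.0]), np.eye(1), np.array([0.5]), **callables)
```

With linear p_a and p_h and constant Jacobians, the same step reduces to the ordinary Kalman filter update.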

classmethod
kalman_filter
(p_A, p_Q, p_H, p_R, Y, index=None, m_init=None, P_init=None, p_kalman_filter_type='regular', calc_log_likelihood=False, calc_grad_log_likelihood=False, grad_params_no=None, grad_calc_params=None)[source]¶ This function implements the basic Kalman filter algorithm. The following notation for the state-space model is assumed:
x_{k} = A_{k} * x_{k-1} + q_{k-1}; q_{k-1} ~ N(0, Q_{k-1})
y_{k} = H_{k} * x_{k} + r_{k}; r_{k} ~ N(0, R_{k})
Returns estimated filter distributions x_{k} ~ N(m_{k}, P(k))
1) The function generally does not modify the passed parameters; if it does, that is an error. There are several exceptions: scalars can be converted into a matrix, and in some rare cases the shapes of the derivative matrices may be changed. This is ignored for now.
2) Copies of p_A, p_Q, index are created in memory to be used later in the smoother. References to the copies are kept in the “matrs_for_smoother” return parameter.
3) The function supports “multiple time series mode”, which means that exactly the same state-space model is used to filter several sets of measurements. In this case the third dimension of Y should contain these state-space measurements. Log_likelihood and Grad_log_likelihood then have the corresponding dimensions.
4) Calculation of Grad_log_likelihood is not supported if the matrices A, Q, H, or R change over time. (This may be changed later.)
5) Measurements may include missing values. In this case the update step is not done for that measurement. (This may be changed later.)
 p_A: scalar, square matrix, 3D array
 A_{k} in the model. If matrix then A_{k} = A - constant. If it is 3D array then A_{k} = p_A[:,:, index[0,k]]
 p_Q: scalar, square symmetric matrix, 3D array
 Q_{k-1} in the model. If matrix then Q_{k-1} = Q - constant. If it is 3D array then Q_{k-1} = p_Q[:,:, index[1,k]]
 p_H: scalar, matrix (measurement_dim, state_dim), 3D array
 H_{k} in the model. If matrix then H_{k} = H - constant. If it is 3D array then H_{k} = p_H[:,:, index[2,k]]
 p_R: scalar, square symmetric matrix, 3D array
 R_{k} in the model. If matrix then R_{k} = R - constant. If it is 3D array then R_{k} = p_R[:,:, index[3,k]]
 Y: matrix or vector or 3D array
 Data. If Y is a matrix then samples are along the 0th dimension and features along the 1st. If it is a 3D array then the third dimension corresponds to “multiple time series mode”.
 index: vector
 Which indices (on the 3rd dimension) from arrays p_A, p_Q, p_H, p_R to use on every time step. If this parameter is None then it is assumed that p_A, p_Q, p_H, p_R do not change over time and indices are not needed. index[0,:] corresponds to A, index[1,:] corresponds to Q, index[2,:] corresponds to H, index[3,:] corresponds to R. If index.shape[0] == 1, it is assumed that the indices for all matrices are the same.
 m_init: vector or matrix
 Initial distribution mean. If None it is assumed to be zero. For “multiple time series mode” it is a matrix, the second dimension of which corresponds to different time series. In the regular case (“one time series mode”) it is a vector.
 P_init: square symmetric matrix or scalar
 Initial covariance of the states. If the parameter is a scalar then the initial covariance matrix is assumed to be the unit matrix multiplied by this scalar. If None the unit matrix is used instead. “Multiple time series mode” does not affect it, since it does not affect anything related to state variances.
 calc_log_likelihood: boolean
 Whether to calculate the marginal likelihood of the state-space model.
 calc_grad_log_likelihood: boolean
 Whether to calculate the gradient of the marginal likelihood of the state-space model. If true then the “grad_calc_params” parameter must provide the extra parameters for gradient calculation.
 grad_params_no: int
 If the previous parameter is true, then this parameter gives the total number of parameters in the gradient.
 grad_calc_params: dictionary
 Dictionary with derivatives of model matrices with respect to parameters: “dA”, “dQ”, “dH”, “dR”, “dm_init”, “dP_init”. They can be None, in which case zero matrices (no dependence on parameters) are assumed. If there is only one parameter then the third dimension is added automatically.
 M: (no_steps+1, state_dim) matrix or (no_steps+1, state_dim, time_series_no) 3D array
 Filter estimates of the state means. The extra step holds the initial value. In the “multiple time series mode” the third dimension corresponds to different time series.
 P: (no_steps+1, state_dim, state_dim) 3D array
 Filter estimates of the state covariances. The extra step holds the initial value.
 log_likelihood: double or (1, time_series_no) 3D array
 If the parameter calc_log_likelihood was set to true, the logarithm of the marginal likelihood of the state-space model is returned. If the parameter was false, None is returned. In the “multiple time series mode” it is a vector providing the log_likelihood for each time series.
 grad_log_likelihood: column vector or (grad_params_no, time_series_no) matrix
 If calc_grad_log_likelihood is true, the gradient of the log likelihood with respect to the parameters is returned. It is returned column-wise, so in “multiple time series mode” the gradients for each time series are in the corresponding column.
 matrs_for_smoother: dict
 Dictionary with model functions for the smoother. The intrinsic model functions are computed in this function and returned for convenient use in the smoother. They are: ‘p_a’, ‘p_f_A’, ‘p_f_Q’. The dictionary contains fields with the same names.
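The recursion behind the documented outputs M and P, including the extra initial row and the skipped update for missing measurements, can be sketched in a few lines of NumPy. This is a minimal standalone illustration with constant A, Q, H, R; the function below is hypothetical and does not call or reproduce the library's kalman_filter.

```python
import numpy as np

def kalman_filter(A, Q, H, R, Y, m_init=None, P_init=None):
    """Basic Kalman filter with constant A, Q, H, R (illustrative sketch)."""
    state_dim = A.shape[0]
    m = np.zeros(state_dim) if m_init is None else np.asarray(m_init, dtype=float)
    P = np.eye(state_dim) if P_init is None else np.asarray(P_init, dtype=float)
    if P.ndim == 0:                       # scalar: unit matrix times the scalar
        P = float(P) * np.eye(state_dim)
    M, Ps, log_likelihood = [m], [P], 0.0
    for y in Y:
        # Prediction step
        m = A @ m
        P = A @ P @ A.T + Q
        if not np.any(np.isnan(y)):       # missing measurement: skip the update
            v = y - H @ m                 # innovation
            S = H @ P @ H.T + R           # innovation covariance
            K = np.linalg.solve(S, H @ P).T
            m = m + K @ v
            P = P - K @ S @ K.T
            log_likelihood -= 0.5 * (v @ np.linalg.solve(S, v)
                                     + np.linalg.slogdet(S)[1]
                                     + v.size * np.log(2.0 * np.pi))
        M.append(m)
        Ps.append(P)
    return np.array(M), np.array(Ps), log_likelihood

A = np.array([[1.0]])
Q = np.array([[0.0]])
H = np.array([[1.0]])
R = np.array([[1.0]])
Y = np.array([[1.0], [np.nan], [1.0]])    # the second measurement is missing
M, P, log_likelihood = kalman_filter(A, Q, H, R, Y)
```

Three measurements produce four rows in M and P because the initial distribution is stored as the first row.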

classmethod
rts_smoother
(state_dim, p_dynamic_callables, filter_means, filter_covars)[source]¶ This function implements the Rauch–Tung–Striebel (RTS) smoother algorithm based on the results of kalman_filter_raw. The notation is the same:
x_{k} = A_{k} * x_{k-1} + q_{k-1}; q_{k-1} ~ N(0, Q_{k-1})
y_{k} = H_{k} * x_{k} + r_{k}; r_{k} ~ N(0, R_{k})
Returns estimated smoother distributions x_{k} ~ N(m_{k}, P(k))
 p_a: function (k, x_{k-1}, A_{k}). Dynamic function.
 k: iteration number, starts at 0. x_{k-1}: state from the previous step. A_{k}: Jacobian matrix of f_a. In the linear case it is exactly A_{k}.
 p_f_A: function (k, m, P). Returns the Jacobian of the dynamic function; it is passed into p_a.
 k: iteration number, starts at 0. m: point where the Jacobian is evaluated. P: parameter for the Jacobian, usually the covariance matrix.
 p_f_Q: function (k). Returns the noise matrix of the dynamic model on iteration k.
 k: iteration number, starts at 0.
 filter_means: (no_steps+1,state_dim) matrix or (no_steps+1,state_dim, time_series_no) 3D array
 Results of the Kalman Filter means estimation.
 filter_covars: (no_steps+1, state_dim, state_dim) 3D array
 Results of the Kalman Filter covariance estimation.
 M: (no_steps+1, state_dim) matrix
 Smoothed estimates of the state means
 P: (no_steps+1, state_dim, state_dim) 3D array
 Smoothed estimates of the state covariances
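The backward pass that turns filter estimates into smoothed estimates can be sketched as below. It is a standalone illustration of the standard RTS recursion for constant A and Q; the function is hypothetical and takes the matrices directly rather than the callables object used by the library.

```python
import numpy as np

def rts_smoother(A, Q, filter_means, filter_covars):
    """RTS backward pass for a linear model with constant A and Q (sketch)."""
    M = np.array(filter_means, dtype=float)    # copies, overwritten from the end
    P = np.array(filter_covars, dtype=float)
    for k in range(M.shape[0] - 2, -1, -1):
        # M[k], P[k] still hold filter values here; M[k+1], P[k+1] are smoothed.
        m_pred = A @ M[k]
        P_pred = A @ P[k] @ A.T + Q
        G = np.linalg.solve(P_pred, A @ P[k]).T   # smoother gain P[k] A^T P_pred^{-1}
        M[k] = M[k] + G @ (M[k + 1] - m_pred)
        P[k] = P[k] + G @ (P[k + 1] - P_pred) @ G.T
    return M, P

# Filter output for a constant scalar state (A = 1, Q = 0) observed twice
A = np.array([[1.0]])
Q = np.array([[0.0]])
filter_means = np.array([[0.0], [0.5], [2.0 / 3.0]])
filter_covars = np.array([[[1.0]], [[0.5]], [[1.0 / 3.0]]])
M_s, P_s = rts_smoother(A, Q, filter_means, filter_covars)
```

Because the state is constant in this example, every smoothed estimate equals the final filter estimate.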

class
DescreteStateSpaceMeta
[source]¶ Bases:
type
Substitute necessary methods from cython.
After this method the class object is created

Dynamic_Callables_Class
¶ alias of
GPy.models.state_space_main.Dynamic_Callables_Python

class
Dynamic_Callables_Python
[source]¶ Bases:
object

Ak
(k, m, P)[source]¶ Returns the Jacobian of the dynamic function; it is passed into p_a.
k: iteration number, starts at 0. m: point where the Jacobian is evaluated. P: parameter for the Jacobian, usually the covariance matrix.

Q_srk
(k)[source]¶ Returns the square root of the noise matrix of the dynamic model on iteration k.
k: iteration number, starts at 0.
This function is implemented to use the SVD prediction step.

Qk
(k)[source]¶ Returns the noise matrix of the dynamic model on iteration k.
 k: iteration number, starts at 0.

dAk
(k)[source]¶ Returns the derivative of A on iteration k.
 k: iteration number, starts at 0.

dQk
(k)[source]¶ Returns the derivative of Q on iteration k.
 k: iteration number, starts at 0.


Measurement_Callables_Class
¶ alias of
GPy.models.state_space_main.Measurement_Callables_Python

class
Measurement_Callables_Python
[source]¶ Bases:
object

Hk
(k, m_pred, P_pred)[source]¶ Returns the Jacobian of the measurement function; it is passed into p_h.
 k: iteration number, starts at 0. m_pred: point where the Jacobian is evaluated. P_pred: parameter for the Jacobian, usually the covariance matrix.

R_isrk
(k)[source]¶ Returns the square root of the noise matrix of the measurement equation on iteration k.
 k: iteration number, starts at 0.
This function is implemented to use the SVD prediction step.

Rk
(k)[source]¶ Returns the noise matrix of the measurement equation on iteration k.
 k: iteration number, starts at 0.

dHk
(k)[source]¶ Returns the derivative of H on iteration k.
 k: iteration number, starts at 0.

dRk
(k)[source]¶ Returns the derivative of R on iteration k.
 k: iteration number, starts at 0.


Q_handling_Class
¶

class
Q_handling_Python
(Q, index, Q_time_var_index, unique_Q_number, dQ=None)[source]¶ Bases:
GPy.models.state_space_main.Dynamic_Callables_Python
 Q: array with noise on various steps. The result of preprocessing the noise input.
 index: for each step of the Kalman filter, contains the corresponding index in the array.
 Q_time_var_index: another index in the array Q. Computed earlier and passed here.
 unique_Q_number: number of unique noise matrices below which square roots are cached and above which they are computed each time.
 dQ: 3D array[:, :, param_num]
 Derivative of Q. The derivative is supported only when Q does not change over time.
 Object which has two necessary functions:
 f_R(k) inv_R_square_root(k)

Q_srk
(k)[source]¶ Returns the square root of the noise matrix of the dynamic model on iteration k.
k: iteration number, starts at 0.
This function is implemented to use the SVD prediction step.

R_handling_Class
¶

class
R_handling_Python
(R, index, R_time_var_index, unique_R_number, dR=None)[source]¶ Bases:
GPy.models.state_space_main.Measurement_Callables_Python
The class handles the noise matrix R.
 R: array with noise on various steps. The result of preprocessing
 the noise input.
 index: for each step of the Kalman filter, contains the corresponding index
 in the array.
 R_time_var_index: another index in the array R. Computed earlier and
 passed here.
 unique_R_number: number of unique noise matrices below which square
 roots are cached and above which they are computed each time.
 dR: 3D array[:, :, param_num]
 derivative of R. The derivative is supported only when R does not change over time.
 Object which has two necessary functions:
 f_R(k) inv_R_square_root(k)
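The index and caching scheme described above can be illustrated with a small sketch. It is not GPy's implementation: the class name, the scalar noise values, and the exact caching rule (cache when the number of unique values is below unique_R_number) are assumptions made for illustration, and R_time_var_index is left out.

```python
import math


class ScalarNoiseHandlerSketch:
    """Hypothetical, scalar-valued illustration of the index and caching scheme."""

    def __init__(self, R, index, unique_R_number):
        self.R = list(R)          # unique noise variances
        self.index = list(index)  # index[k] selects the entry of R used at step k
        # Assumption: cache only when there are fewer unique values than the threshold.
        self.use_cache = len(self.R) < unique_R_number
        self._sqrt_cache = {}
        self.sqrt_evaluations = 0  # counts how often a square root is computed

    def Rk(self, k):
        """Noise value used at iteration k (k starts at 0)."""
        return self.R[self.index[k]]

    def R_sqrt(self, k):
        """Square root of the noise at iteration k, cached per unique value."""
        ind = self.index[k]
        if self.use_cache and ind in self._sqrt_cache:
            return self._sqrt_cache[ind]
        self.sqrt_evaluations += 1
        root = math.sqrt(self.R[ind])
        if self.use_cache:
            self._sqrt_cache[ind] = root
        return root


handler = ScalarNoiseHandlerSketch(R=[4.0, 9.0], index=[0, 1, 0, 1, 0], unique_R_number=20)
roots = [handler.R_sqrt(k) for k in range(5)]
```

With two unique values and five filter steps, only two square roots are computed; setting unique_R_number to 1 in this sketch disables the cache and recomputes on every step.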

Std_Dynamic_Callables_Class
¶ alias of
GPy.models.state_space_main.Std_Dynamic_Callables_Python

class
Std_Dynamic_Callables_Python
(A, A_time_var_index, Q, index, Q_time_var_index, unique_Q_number, dA=None, dQ=None)[source]¶ Bases:
GPy.models.state_space_main.Q_handling_Python

Ak
(k, m_pred, P_pred)[source]¶ Function (k, m, P) that returns the Jacobian of the dynamic function.
 k: iteration number, starting at 0.
 m: point at which the Jacobian is evaluated.
 P: parameter for the Jacobian, usually the covariance matrix.

dAk
(k)[source]¶ Function (k) that returns the derivative of A at iteration k.
 k: iteration number, starting at 0.


Std_Measurement_Callables_Class
¶ alias of
GPy.models.state_space_main.Std_Measurement_Callables_Python

class
Std_Measurement_Callables_Python
(H, H_time_var_index, R, index, R_time_var_index, unique_R_number, dH=None, dR=None)[source]¶ Bases:
GPy.models.state_space_main.R_handling_Python

Hk
(k, m_pred, P_pred)[source]¶ Function (k, m, P) that returns the Jacobian of the measurement function; it is passed into p_h.
 k: iteration number, starting at 0.
 m: point at which the Jacobian is evaluated.
 P: parameter for the Jacobian, usually the covariance matrix.
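As a rough illustration of how a filter could query callables of this shape, here is a scalar Kalman filter sketch. The class ConstantScalarModel, the method Qk, and the function kalman_filter_sketch are hypothetical names introduced for this example; the real classes work with matrices and are driven by GPy's own filtering code.

```python
class ConstantScalarModel:
    """Hypothetical time-invariant scalar model exposing Ak, Qk, Hk and Rk."""

    def __init__(self, A, Q, H, R):
        self.A, self.Q, self.H, self.R = A, Q, H, R

    def Ak(self, k, m, P):
        return self.A  # Jacobian of the dynamic function

    def Qk(self, k):
        return self.Q  # process noise at iteration k

    def Hk(self, k, m_pred, P_pred):
        return self.H  # Jacobian of the measurement function

    def Rk(self, k):
        return self.R  # measurement noise at iteration k


def kalman_filter_sketch(model, observations, m0, P0):
    """Run a scalar Kalman filter; k starts at 0 as in the callables above."""
    m, P = m0, P0
    means = []
    for k, y in enumerate(observations):
        # Prediction step
        A = model.Ak(k, m, P)
        m_pred = A * m
        P_pred = A * P * A + model.Qk(k)
        # Update step
        H = model.Hk(k, m_pred, P_pred)
        S = H * P_pred * H + model.Rk(k)  # innovation variance
        K = P_pred * H / S                # Kalman gain
        m = m_pred + K * (y - H * m_pred)
        P = (1.0 - K * H) * P_pred
        means.append(m)
    return means, P


model = ConstantScalarModel(A=1.0, Q=0.0, H=1.0, R=1.0)
means, P_final = kalman_filter_sketch(model, [2.0, 2.0], m0=0.0, P0=1.0)
```

Passing k into every callable is what allows time-varying models: a subclass could return a different A or R on each iteration without changing the filter loop.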


balance_matrix
(A)[source]¶ Balance a matrix, i.e. find a similarity transformation of the original matrix A, A = T * bA * T^{-1}, such that the norms of the columns of bA and of the rows of bA are as close as possible. Balancing is usually used as a preprocessing step in eigenvalue calculation routines. It is also useful for StateSpace models.
 See also:
 [1] Beresford N. Parlett and Christian Reinsch (1969). Balancing
 a matrix for calculation of eigenvalues and eigenvectors. Numerische Mathematik, 13(4): 293-304.
 Input:
 A: square matrix
 Matrix to be balanced
 Output:
 bA: matrix
 Balanced matrix
 T: matrix
 Left part of the similarity transformation
 T_inv: matrix
 Right part of the similarity transformation.
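To make the relation A = T * bA * T^{-1} concrete, the following is a minimal pure-Python sketch of diagonal balancing in the spirit of Parlett and Reinsch. It is not the routine GPy calls; the function name balance_sketch, the use of 1-norms, and the 0.95 acceptance threshold are assumptions borrowed from common textbook presentations.

```python
def balance_sketch(A, radix=2.0):
    """Return (bA, d) with T = diag(d) such that A = T * bA * T^{-1}."""
    n = len(A)
    bA = [[float(x) for x in row] for row in A]
    d = [1.0] * n
    converged = False
    while not converged:
        converged = True
        for i in range(n):
            # Off-diagonal 1-norms of column i and row i
            c = sum(abs(bA[j][i]) for j in range(n) if j != i)
            r = sum(abs(bA[i][j]) for j in range(n) if j != i)
            if c == 0.0 or r == 0.0:
                continue
            s = c + r
            f = 1.0
            # Find a power of the radix that brings c * f and r / f close together
            while c < r / radix:
                f *= radix
                c *= radix * radix
            while c > r * radix:
                f /= radix
                c /= radix * radix
            # Apply the scaling only if it reduces the combined norm noticeably
            if (c + r) / f < 0.95 * s:
                converged = False
                d[i] *= f
                for j in range(n):
                    bA[i][j] /= f  # scale row i by 1/f
                for j in range(n):
                    bA[j][i] *= f  # scale column i by f
    return bA, d


A = [[1.0, 1024.0], [1.0 / 1024.0, 1.0]]
bA, d = balance_sketch(A)
# Reconstruct A from the similarity transformation: A[i][j] = d[i] * bA[i][j] / d[j]
A_rebuilt = [[d[i] * bA[i][j] / d[j] for j in range(2)] for i in range(2)]
```

Because the scaling factors are powers of two, the transformation introduces no rounding error, which is the usual reason for choosing radix 2.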
GPy.models.state_space_model module¶

class
StateSpace
(X, Y, kernel=None, noise_var=1.0, kalman_filter_type='regular', use_cython=False, balance=False, name='StateSpace')[source]¶ Bases:
GPy.core.model.Model
balance (bool): whether to balance the model as a whole.

plot
(plot_limits=None, fixed_inputs=None, resolution=None, plot_raw=False, apply_link=False, which_data_ycols='all', which_data_rows='all', visible_dims=None, levels=20, samples=0, samples_likelihood=0, lower=2.5, upper=97.5, plot_data=True, plot_inducing=True, plot_density=False, predict_kw=None, projection='2d', legend=True, **kwargs)¶ Convenience function for plotting the fit of a GP.
You can deactivate the legend for this one plot by supplying None to label.
Give the Y_metadata in the predict_kw if you need it.
If you want fine-grained control, use the specific plotting functions supplied in the model.
Parameters:  plot_limits (np.array) – The limits of the plot. If 1D [xmin,xmax], if 2D [[xmin,ymin],[xmax,ymax]]. Defaults to data limits
 fixed_inputs (a list of tuples) – a list of tuples [(i,v), (i,v)…], specifying that input dimension i should be set to value v.
 resolution (int) – The resolution of the prediction [default:200]
 plot_raw (bool) – plot the latent function (usually denoted f) only?
 apply_link (bool) – whether to apply the link function of the GP to the raw prediction.
 which_data_ycols ('all' or a list of integers) – when the data has several columns (independent outputs), only plot these
 which_data_rows ('all' or a slice object to slice self.X, self.Y) – which of the training data to plot (default all)
 visible_dims (array-like) – which columns of the input X (!) to plot (array-like or list of ints)
 levels (int) – the number of levels in the density (number bigger than 1, where 35 is smooth and 1 is the same as plot_confidence). You can go higher than 50 if the result is not smooth enough for you.
 samples (int) – the number of samples to draw from the GP and plot into the plot. These will always be samples from the latent function.
 samples_likelihood (int) – the number of samples to draw from the GP and apply the likelihood noise. This is usually not what you want!
 lower (float) – the lower percentile to plot
 upper (float) – the upper percentile to plot
 plot_data (bool) – plot the data into the plot?
 plot_inducing (bool) – plot inducing inputs?
 plot_density (bool) – plot density instead of the confidence interval?
 predict_kw (dict) – the keyword arguments for the prediction. If you want to plot a specific kernel give dict(kern=<specific kernel>) in here
 projection ({'2d','3d'}) – plot in 2d or 3d?
 legend (bool) – convenience, whether to put a legend on the plot or not.
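The fixed_inputs convention can be illustrated with a small sketch that builds a one-dimensional prediction grid while pinning the remaining input dimensions. The helper prediction_grid_sketch is hypothetical and only mirrors the documented meaning of plot_limits, resolution and fixed_inputs; it is not the grid code used by the plotting functions.

```python
def prediction_grid_sketch(plot_limits, resolution, input_dim, fixed_inputs):
    """Build grid rows for one free input dimension; the others are held fixed.

    fixed_inputs is a list of (i, v) tuples: input dimension i is set to value v.
    resolution must be at least 2.
    """
    fixed = dict(fixed_inputs)
    free_dims = [i for i in range(input_dim) if i not in fixed]
    if len(free_dims) != 1:
        raise ValueError("this sketch handles exactly one free input dimension")
    xmin, xmax = plot_limits
    step = (xmax - xmin) / (resolution - 1)
    grid = []
    for n in range(resolution):
        row = [fixed.get(i, 0.0) for i in range(input_dim)]
        row[free_dims[0]] = xmin + n * step  # sweep the free dimension
        grid.append(row)
    return grid


grid = prediction_grid_sketch((0.0, 1.0), 3, input_dim=3, fixed_inputs=[(0, 5.0), (2, -1.0)])
```

Here dimensions 0 and 2 stay at 5.0 and -1.0 while dimension 1 sweeps from 0.0 to 1.0, which is the situation in which a model with three inputs can still be drawn as a 1D plot.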

plot_confidence
(lower=2.5, upper=97.5, plot_limits=None, fixed_inputs=None, resolution=None, plot_raw=False, apply_link=False, visible_dims=None, which_data_ycols='all', label='gp confidence', predict_kw=None, **kwargs)¶ Plot the confidence interval between the percentiles lower and upper. E.g. the 95% confidence interval is $2.5, 97.5$. Note: Only implemented for one dimension!
You can deactivate the legend for this one plot by supplying None to label.
Give the Y_metadata in the predict_kw if you need it.
Parameters:  lower (float) – the lower percentile to plot
 upper (float) – the upper percentile to plot
 plot_limits (np.array) – The limits of the plot. If 1D [xmin,xmax], if 2D [[xmin,ymin],[xmax,ymax]]. Defaults to data limits
 fixed_inputs (a list of tuples) – a list of tuples [(i,v), (i,v)…], specifying that input dimension i should be set to value v.
 resolution (int) – The resolution of the prediction [default:200]
 plot_raw (bool) – plot the latent function (usually denoted f) only?
 apply_link (bool) – whether to apply the link function of the GP to the raw prediction.
 visible_dims (array-like) – which columns of the input X (!) to plot (array-like or list of ints)
 which_data_ycols (array-like) – which columns of the output y (!) to plot (array-like or list of ints)
 predict_kw (dict) – the keyword arguments for the prediction. If you want to plot a specific kernel give dict(kern=<specific kernel>) in here
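For a Gaussian predictive distribution, the percentiles lower and upper translate into an interval around the mean. The sketch below uses only the Python standard library; the function name percentile_band_sketch is hypothetical, and the Gaussian assumption does not hold for every likelihood.

```python
from statistics import NormalDist


def percentile_band_sketch(mean, variance, lower=2.5, upper=97.5):
    """Return the (lower, upper) percentile values of a Gaussian prediction."""
    dist = NormalDist(mu=mean, sigma=variance ** 0.5)
    return dist.inv_cdf(lower / 100.0), dist.inv_cdf(upper / 100.0)


low, high = percentile_band_sketch(0.0, 1.0)
```

With the default percentiles this gives roughly the mean plus or minus 1.96 standard deviations, i.e. the familiar 95% interval.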

plot_data
(which_data_rows='all', which_data_ycols='all', visible_dims=None, projection='2d', label=None, **plot_kwargs)¶  Plot the training data
 For higher dimensions than two, use fixed_inputs to plot the data points with some of the inputs fixed.
Can plot only part of the data using which_data_rows and which_data_ycols.
Parameters:  which_data_rows ('all' or a slice object to slice self.X, self.Y) – which of the training data to plot (default all)
 which_data_ycols ('all' or a list of integers) – when the data has several columns (independent outputs), only plot these
 visible_dims (a numpy array) – an array specifying the input dimensions to plot (maximum two)
 projection ({'2d','3d'}) – whether to plot in 2d or 3d. This only applies when plotting two dimensional inputs!
 label (str) – the label for the plot
 plot_kwargs (kwargs) – kwargs for the data plot for the plotting library you are using
Returns list: of plots created.

plot_data_error
(which_data_rows='all', which_data_ycols='all', visible_dims=None, projection='2d', label=None, **error_kwargs)¶ Plot the training data input error.
For higher dimensions than two, use fixed_inputs to plot the data points with some of the inputs fixed.
Can plot only part of the data using which_data_rows and which_data_ycols.
Parameters:  which_data_rows ('all' or a slice object to slice self.X, self.Y) – which of the training data to plot (default all)
 which_data_ycols ('all' or a list of integers) – when the data has several columns (independent outputs), only plot these
 visible_dims (a numpy array) – an array specifying the input dimensions to plot (maximum two)
 projection ({'2d','3d'}) – whether to plot in 2d or 3d. This only applies when plotting two dimensional inputs!
 error_kwargs (dict) – kwargs for the error plot for the plotting library you are using
 label (str) – the label for the plot
 plot_kwargs (kwargs) – kwargs for the data plot for the plotting library you are using
Returns list: of plots created.

plot_density
(plot_limits=None, fixed_inputs=None, resolution=None, plot_raw=False, apply_link=False, visible_dims=None, which_data_ycols='all', levels=35, label='gp density', predict_kw=None, **kwargs)¶ Plot the predictive density of the GP as a gradient of nested percentile bands. Note: Only implemented for one dimension!
You can deactivate the legend for this one plot by supplying None to label.
Give the Y_metadata in the predict_kw if you need it.
Parameters:  plot_limits (np.array) – The limits of the plot. If 1D [xmin,xmax], if 2D [[xmin,ymin],[xmax,ymax]]. Defaults to data limits
 fixed_inputs (a list of tuples) – a list of tuples [(i,v), (i,v)…], specifying that input dimension i should be set to value v.
 resolution (int) – The resolution of the prediction [default:200]
 plot_raw (bool) – plot the latent function (usually denoted f) only?
 apply_link (bool) – whether to apply the link function of the GP to the raw prediction.
 visible_dims (array-like) – which columns of the input X (!) to plot (array-like or list of ints)
 which_data_ycols (array-like) – which columns of y to plot (array-like or list of ints)
 levels (int) – the number of levels in the density (number bigger than 1, where 35 is smooth and 1 is the same as plot_confidence). You can go higher than 50 if the result is not smooth enough for you.
 predict_kw (dict) – the keyword arguments for the prediction. If you want to plot a specific kernel give dict(kern=<specific kernel>) in here
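One way to read the levels parameter is as the number of nested percentile bands drawn on top of each other. The sketch below is an assumption about how such bands could be laid out (evenly spaced percentiles between 2.5 and 97.5, paired from the outside in); the helper name is hypothetical and the actual plotting code may differ.

```python
def density_percentile_pairs_sketch(levels, lower=2.5, upper=97.5):
    """Return `levels` nested (low, high) percentile pairs, outermost first."""
    count = 2 * levels
    step = (upper - lower) / (count - 1)
    percentiles = [lower + i * step for i in range(count)]
    return [(percentiles[i], percentiles[count - 1 - i]) for i in range(levels)]


single = density_percentile_pairs_sketch(1)
double = density_percentile_pairs_sketch(2)
```

Under this reading, levels=1 reduces to the single band between the 2.5 and 97.5 percentiles, matching the statement that one level is the same as plot_confidence.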

plot_errorbars_trainset
(which_data_rows='all', which_data_ycols='all', fixed_inputs=None, plot_raw=False, apply_link=False, label=None, projection='2d', predict_kw=None, **plot_kwargs)¶ Plot the errorbars of the GP likelihood on the training data. These are the errorbars after the appropriate approximations according to the likelihood are done.
This also works for heteroscedastic likelihoods.
Give the Y_metadata in the predict_kw if you need it.
Parameters:  which_data_rows ('all' or a slice object to slice self.X, self.Y) – which of the training data to plot (default all)
 which_data_ycols – when the data has several columns (independent outputs), only plot these
 fixed_inputs (a list of tuples) – a list of tuples [(i,v), (i,v)…], specifying that input dimension i should be set to value v.
 predict_kwargs (dict) – kwargs for the prediction used to predict the right quantiles.
 plot_kwargs (kwargs) – kwargs for the data plot for the plotting library you are using

plot_f
(plot_limits=None, fixed_inputs=None, resolution=None, apply_link=False, which_data_ycols='all', which_data_rows='all', visible_dims=None, levels=20, samples=0, lower=2.5, upper=97.5, plot_density=False, plot_data=True, plot_inducing=True, projection='2d', legend=True, predict_kw=None, **kwargs)¶ Convenience function for plotting the fit of a GP. This is the same as plot, except it plots the latent function fit of the GP!
If you want fine-grained control, use the specific plotting functions supplied in the model.
You can deactivate the legend for this one plot by supplying None to label.
Give the Y_metadata in the predict_kw if you need it.
Parameters:  plot_limits (np.array) – The limits of the plot. If 1D [xmin,xmax], if 2D [[xmin,ymin],[xmax,ymax]]. Defaults to data limits
 fixed_inputs (a list of tuples) – a list of tuples [(i,v), (i,v)…], specifying that input dimension i should be set to value v.
 resolution (int) – The resolution of the prediction [default:200]
 apply_link (bool) – whether to apply the link function of the GP to the raw prediction.
 which_data_ycols ('all' or a list of integers) – when the data has several columns (independent outputs), only plot these
 which_data_rows ('all' or a slice object to slice self.X, self.Y) – which of the training data to plot (default all)
 visible_dims (array-like) – an array specifying the input dimensions to plot (maximum two)
 levels (int) – the number of levels in the density (number bigger than 1, where 35 is smooth and 1 is the same as plot_confidence). You can go higher than 50 if the result is not smooth enough for you.
 samples (int) – the number of samples to draw from the GP and plot into the plot. These will always be samples from the latent function.
 lower (float) – the lower percentile to plot
 upper (float) – the upper percentile to plot
 plot_data (bool) – plot the data into the plot?
 plot_inducing (bool) – plot inducing inputs?
 plot_density (bool) – plot density instead of the confidence interval?
 predict_kw (dict) – the keyword arguments for the prediction. If you want to plot a specific kernel give dict(kern=<specific kernel>) in here
 error_kwargs (dict) – kwargs for the error plot for the plotting library you are using
 plot_kwargs (kwargs) – kwargs for the data plot for the plotting library you are using

plot_latent
(plot_limits=None, fixed_inputs=None, resolution=None, apply_link=False, which_data_ycols='all', which_data_rows='all', visible_dims=None, levels=20, samples=0, lower=2.5, upper=97.5, plot_density=False, plot_data=True, plot_inducing=True, projection='2d', legend=True, predict_kw=None, **kwargs)¶ Convenience function for plotting the fit of a GP. This is the same as plot, except it plots the latent function fit of the GP!
If you want fine-grained control, use the specific plotting functions supplied in the model.
You can deactivate the legend for this one plot by supplying None to label.
Give the Y_metadata in the predict_kw if you need it.
Parameters:  plot_limits (np.array) – The limits of the plot. If 1D [xmin,xmax], if 2D [[xmin,ymin],[xmax,ymax]]. Defaults to data limits
 fixed_inputs (a list of tuples) – a list of tuples [(i,v), (i,v)…], specifying that input dimension i should be set to value v.
 resolution (int) – The resolution of the prediction [default:200]
 apply_link (bool) – whether to apply the link function of the GP to the raw prediction.
 which_data_ycols ('all' or a list of integers) – when the data has several columns (independent outputs), only plot these
 which_data_rows ('all' or a slice object to slice self.X, self.Y) – which of the training data to plot (default all)
 visible_dims (array-like) – an array specifying the input dimensions to plot (maximum two)
 levels (int) – the number of levels in the density (number bigger than 1, where 35 is smooth and 1 is the same as plot_confidence). You can go higher than 50 if the result is not smooth enough for you.
 samples (int) – the number of samples to draw from the GP and plot into the plot. These will always be samples from the latent function.
 lower (float) – the lower percentile to plot
 upper (float) – the upper percentile to plot
 plot_data (bool) – plot the data into the plot?
 plot_inducing (bool) – plot inducing inputs?
 plot_density (bool) – plot density instead of the confidence interval?
 predict_kw (dict) – the keyword arguments for the prediction. If you want to plot a specific kernel give dict(kern=<specific kernel>) in here
 error_kwargs (dict) – kwargs for the error plot for the plotting library you are using
 plot_kwargs (kwargs) – kwargs for the data plot for the plotting library you are using

plot_mean
(plot_limits=None, fixed_inputs=None, resolution=None, plot_raw=False, apply_link=False, visible_dims=None, which_data_ycols='all', levels=20, projection='2d', label='gp mean', predict_kw=None, **kwargs)¶ Plot the mean of the GP.
You can deactivate the legend for this one plot by supplying None to label.
Give the Y_metadata in the predict_kw if you need it.
Parameters:  plot_limits (np.array) – The limits of the plot. If 1D [xmin,xmax], if 2D [[xmin,ymin],[xmax,ymax]]. Defaults to data limits
 fixed_inputs (a list of tuples) – a list of tuples [(i,v), (i,v)…], specifying that input dimension i should be set to value v.
 resolution (int) – The resolution of the prediction [defaults are 1D:200, 2D:50]
 plot_raw (bool) – plot the latent function (usually denoted f) only?
 apply_link (bool) – whether to apply the link function of the GP to the raw prediction.
 which_data_ycols (array-like) – which columns of y to plot (array-like or list of ints)
 levels (int) – for 2D plotting, the number of contour levels to use
 projection ({'2d','3d'}) – whether to plot in 2d or 3d. This only applies when plotting two dimensional inputs!
 label (str) – the label for the plot.
 predict_kw (dict) – the keyword arguments for the prediction. If you want to plot a specific kernel give dict(kern=<specific kernel>) in here

plot_noiseless
(plot_limits=None, fixed_inputs=None, resolution=None, apply_link=False, which_data_ycols='all', which_data_rows='all', visible_dims=None, levels=20, samples=0, lower=2.5, upper=97.5, plot_density=False, plot_data=True, plot_inducing=True, projection='2d', legend=True, predict_kw=None, **kwargs)¶ Convenience function for plotting the fit of a GP. This is the same as plot, except it plots the latent function fit of the GP!
If you want fine-grained control, use the specific plotting functions supplied in the model.
You can deactivate the legend for this one plot by supplying None to label.
Give the Y_metadata in the predict_kw if you need it.
Parameters:  plot_limits (np.array) – The limits of the plot. If 1D [xmin,xmax], if 2D [[xmin,ymin],[xmax,ymax]]. Defaults to data limits
 fixed_inputs (a list of tuples) – a list of tuples [(i,v), (i,v)…], specifying that input dimension i should be set to value v.
 resolution (int) – The resolution of the prediction [default:200]
 apply_link (bool) – whether to apply the link function of the GP to the raw prediction.
 which_data_ycols ('all' or a list of integers) – when the data has several columns (independent outputs), only plot these
 which_data_rows ('all' or a slice object to slice self.X, self.Y) – which of the training data to plot (default all)
 visible_dims (array-like) – an array specifying the input dimensions to plot (maximum two)
 levels (int) – the number of levels in the density (number bigger than 1, where 35 is smooth and 1 is the same as plot_confidence). You can go higher than 50 if the result is not smooth enough for you.
 samples (int) – the number of samples to draw from the GP and plot into the plot. These will always be samples from the latent function.
 lower (float) – the lower percentile to plot
 upper (float) – the upper percentile to plot
 plot_data (bool) – plot the data into the plot?
 plot_inducing (bool) – plot inducing inputs?
 plot_density (bool) – plot density instead of the confidence interval?
 predict_kw (dict) – the keyword arguments for the prediction. If you want to plot a specific kernel give dict(kern=<specific kernel>) in here
 error_kwargs (dict) – kwargs for the error plot for the plotting library you are using
 plot_kwargs (kwargs) – kwargs for the data plot for the plotting library you are using

plot_samples
(plot_limits=None, fixed_inputs=None, resolution=None, plot_raw=True, apply_link=False, visible_dims=None, which_data_ycols='all', samples=3, projection='2d', label='gp_samples', predict_kw=None, **kwargs)¶ Plot samples drawn from the GP.
You can deactivate the legend for this one plot by supplying None to label.
Give the Y_metadata in the predict_kw if you need it.
Parameters:  plot_limits (np.array) – The limits of the plot. If 1D [xmin,xmax], if 2D [[xmin,ymin],[xmax,ymax]]. Defaults to data limits
 fixed_inputs (a list of tuples) – a list of tuples [(i,v), (i,v)…], specifying that input dimension i should be set to value v.
 resolution (int) – The resolution of the prediction [defaults are 1D:200, 2D:50]
 plot_raw (bool) – plot the latent function (usually denoted f) only? This is usually what you want!
 apply_link (bool) – whether to apply the link function of the GP to the raw prediction.
 visible_dims (array-like) – which columns of the input X (!) to plot (array-like or list of ints)
 which_data_ycols (array-like) – which columns of y to plot (array-like or list of ints)
 predict_kw (dict) – the keyword arguments for the prediction. If you want to plot a specific kernel give dict(kern=<specific kernel>) in here
 levels (int) – for 2D plotting, the number of contour levels to use

GPy.models.state_space_setup module¶
This module is intended for the setup of the state_space_main module. It is needed because of the way the state_space_main module is connected with the Cython code.
GPy.models.tp_regression module¶

class
TPRegression
(X, Y, kernel=None, deg_free=5.0, normalizer=None, mean_function=None, name='TP regression')[source]¶ Bases:
GPy.core.model.Model
Student-t process model for regression, as presented in
Shah, A., Wilson, A. and Ghahramani, Z., 2014, April. Student-t processes as alternatives to Gaussian processes. In Artificial Intelligence and Statistics (pp. 877-885).
Parameters:  X – input observations
 Y – observed values
 kernel – a GPy kernel, defaults to rbf
 deg_free – initial value for the degrees of freedom hyperparameter
 normalizer (Norm) – [False] Normalize Y with the norm given. If normalizer is False, no normalization will be done. If it is None, we use GaussianNorm(alization).
Note
Multiple independent outputs are allowed using columns of Y
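To give a feel for what the deg_free hyperparameter controls, the sketch below compares a univariate Student-t log density with a Gaussian of the same variance. It assumes the variance-scaled parameterisation with deg_free > 2 that, to the best of my reading, the cited paper uses; the function names are hypothetical and this is not the model's log_likelihood.

```python
import math


def student_t_logpdf_sketch(y, deg_free, variance=1.0, mean=0.0):
    """Log density of a univariate Student-t with the given variance (deg_free > 2)."""
    if deg_free <= 2.0:
        raise ValueError("this parameterisation requires deg_free > 2")
    scaled = (y - mean) ** 2 / ((deg_free - 2.0) * variance)
    return (
        math.lgamma((deg_free + 1.0) / 2.0)
        - math.lgamma(deg_free / 2.0)
        - 0.5 * math.log((deg_free - 2.0) * math.pi * variance)
        - 0.5 * (deg_free + 1.0) * math.log1p(scaled)
    )


def gaussian_logpdf_sketch(y, variance=1.0, mean=0.0):
    """Log density of a univariate Gaussian."""
    return -0.5 * math.log(2.0 * math.pi * variance) - 0.5 * (y - mean) ** 2 / variance


# Four standard deviations from the mean: the Student-t assigns more mass than the Gaussian.
tail_t = student_t_logpdf_sketch(4.0, deg_free=5.0)
tail_gauss = gaussian_logpdf_sketch(4.0)
# For a very large number of degrees of freedom the two densities nearly coincide.
near_gauss = student_t_logpdf_sketch(1.0, deg_free=1e6)
```

Small values of deg_free give heavier tails and therefore more tolerance for outliers, while large values make the process behave like a Gaussian process.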

log_likelihood
()[source]¶ The log marginal likelihood of the model, \(p(\mathbf{y})\); this is the objective function of the model being optimised.

parameters_changed
()[source]¶ Method that is called upon any changes to
Param
variables within the model. In particular, in this class this method re-performs inference, recalculating the posterior, log marginal likelihood and gradients of the model.
Warning
This method is not designed to be called manually. The framework is set up to call it automatically upon changes to parameters; if you call this method yourself, there may be unexpected consequences.

plot
(plot_limits=None, fixed_inputs=None, resolution=None, plot_raw=False, apply_link=False, which_data_ycols='all', which_data_rows='all', visible_dims=None, levels=20, samples=0, samples_likelihood=0, lower=2.5, upper=97.5, plot_data=True, plot_inducing=True, plot_density=False, predict_kw=None, projection='2d', legend=True, **kwargs)¶ Convenience function for plotting the fit of a GP.
You can deactivate the legend for this one plot by supplying None to label.
Give the Y_metadata in the predict_kw if you need it.
If you want fine-grained control, use the specific plotting functions supplied in the model.
Parameters:  plot_limits (np.array) – The limits of the plot. If 1D [xmin,xmax], if 2D [[xmin,ymin],[xmax,ymax]]. Defaults to data limits
 fixed_inputs (a list of tuples) – a list of tuples [(i,v), (i,v)…], specifying that input dimension i should be set to value v.
 resolution (int) – The resolution of the prediction [default:200]
 plot_raw (bool) – plot the latent function (usually denoted f) only?
 apply_link (bool) – whether to apply the link function of the GP to the raw prediction.
 which_data_ycols ('all' or a list of integers) – when the data has several columns (independent outputs), only plot these
 which_data_rows ('all' or a slice object to slice self.X, self.Y) – which of the training data to plot (default all)
 visible_dims (array-like) – which columns of the input X (!) to plot (array-like or list of ints)
 levels (int) – the number of levels in the density (number bigger than 1, where 35 is smooth and 1 is the same as plot_confidence). You can go higher than 50 if the result is not smooth enough for you.
 samples (int) – the number of samples to draw from the GP and plot into the plot. These will always be samples from the latent function.
 samples_likelihood (int) – the number of samples to draw from the GP and apply the likelihood noise. This is usually not what you want!
 lower (float) – the lower percentile to plot
 upper (float) – the upper percentile to plot
 plot_data (bool) – plot the data into the plot?
 plot_inducing (bool) – plot inducing inputs?
 plot_density (bool) – plot density instead of the confidence interval?
 predict_kw (dict) – the keyword arguments for the prediction. If you want to plot a specific kernel give dict(kern=<specific kernel>) in here
 projection ({'2d','3d'}) – plot in 2d or 3d?
 legend (bool) – convenience, whether to put a legend on the plot or not.

plot_confidence
(lower=2.5, upper=97.5, plot_limits=None, fixed_inputs=None, resolution=None, plot_raw=False, apply_link=False, visible_dims=None, which_data_ycols='all', label='gp confidence', predict_kw=None, **kwargs)¶ Plot the confidence interval between the percentiles lower and upper. E.g. the 95% confidence interval is $2.5, 97.5$. Note: Only implemented for one dimension!
You can deactivate the legend for this one plot by supplying None to label.
Give the Y_metadata in the predict_kw if you need it.
Parameters:  lower (float) – the lower percentile to plot
 upper (float) – the upper percentile to plot
 plot_limits (np.array) – The limits of the plot. If 1D [xmin,xmax], if 2D [[xmin,ymin],[xmax,ymax]]. Defaults to data limits
 fixed_inputs (a list of tuples) – a list of tuples [(i,v), (i,v)…], specifying that input dimension i should be set to value v.
 resolution (int) – The resolution of the prediction [default:200]
 plot_raw (bool) – plot the latent function (usually denoted f) only?
 apply_link (bool) – whether to apply the link function of the GP to the raw prediction.
 visible_dims (array-like) – which columns of the input X (!) to plot (array-like or list of ints)
 which_data_ycols (array-like) – which columns of the output y (!) to plot (array-like or list of ints)
 predict_kw (dict) – the keyword arguments for the prediction. If you want to plot a specific kernel give dict(kern=<specific kernel>) in here

plot_data
(which_data_rows='all', which_data_ycols='all', visible_dims=None, projection='2d', label=None, **plot_kwargs)¶  Plot the training data
 For higher dimensions than two, use fixed_inputs to plot the data points with some of the inputs fixed.
Can plot only part of the data using which_data_rows and which_data_ycols.
Parameters:  which_data_rows ('all' or a slice object to slice self.X, self.Y) – which of the training data to plot (default all)
 which_data_ycols ('all' or a list of integers) – when the data has several columns (independent outputs), only plot these
 visible_dims (a numpy array) – an array specifying the input dimensions to plot (maximum two)
 projection ({'2d','3d'}) – whether to plot in 2d or 3d. This only applies when plotting two dimensional inputs!
 label (str) – the label for the plot
 plot_kwargs (kwargs) – kwargs for the data plot for the plotting library you are using
Returns list: of plots created.

plot_data_error
(which_data_rows='all', which_data_ycols='all', visible_dims=None, projection='2d', label=None, **error_kwargs)¶ Plot the training data input error.
For higher dimensions than two, use fixed_inputs to plot the data points with some of the inputs fixed.
Can plot only part of the data using which_data_rows and which_data_ycols.
Parameters:  which_data_rows ('all' or a slice object to slice self.X, self.Y) – which of the training data to plot (default all)
 which_data_ycols ('all' or a list of integers) – when the data has several columns (independent outputs), only plot these
 visible_dims (a numpy array) – an array specifying the input dimensions to plot (maximum two)
 projection ({'2d','3d'}) – whether to plot in 2d or 3d. This only applies when plotting two dimensional inputs!
 error_kwargs (dict) – kwargs for the error plot for the plotting library you are using
 label (str) – the label for the plot
 plot_kwargs (kwargs) – kwargs for the data plot for the plotting library you are using
Returns list: of plots created.

plot_density
(plot_limits=None, fixed_inputs=None, resolution=None, plot_raw=False, apply_link=False, visible_dims=None, which_data_ycols='all', levels=35, label='gp density', predict_kw=None, **kwargs)¶ Plot the predictive density of the GP as a gradient of nested percentile bands. Note: Only implemented for one dimension!
You can deactivate the legend for this one plot by supplying None to label.
Give the Y_metadata in the predict_kw if you need it.
Parameters:  plot_limits (np.array) – The limits of the plot. If 1D [xmin,xmax], if 2D [[xmin,ymin],[xmax,ymax]]. Defaults to data limits
 fixed_inputs (a list of tuples) – a list of tuples [(i,v), (i,v)…], specifying that input dimension i should be set to value v.
 resolution (int) – The resolution of the prediction [default:200]
 plot_raw (bool) – plot the latent function (usually denoted f) only?
 apply_link (bool) – whether to apply the link function of the GP to the raw prediction.
 visible_dims (array-like) – which columns of the input X (!) to plot (array-like or list of ints)
 which_data_ycols (array-like) – which columns of y to plot (array-like or list of ints)
 levels (int) – the number of levels in the density (number bigger than 1, where 35 is smooth and 1 is the same as plot_confidence). You can go higher than 50 if the result is not smooth enough for you.
 predict_kw (dict) – the keyword arguments for the prediction. If you want to plot a specific kernel give dict(kern=<specific kernel>) in here

plot_errorbars_trainset
(which_data_rows='all', which_data_ycols='all', fixed_inputs=None, plot_raw=False, apply_link=False, label=None, projection='2d', predict_kw=None, **plot_kwargs)¶ Plot the errorbars of the GP likelihood on the training data. These are the errorbars after the appropriate approximations according to the likelihood are done.
This also works for heteroscedastic likelihoods.
Give the Y_metadata in the predict_kw if you need it.
Parameters:  which_data_rows ('all' or a slice object to slice self.X, self.Y) – which of the training data to plot (default all)
 which_data_ycols – when the data has several columns (independent outputs), only plot these
 fixed_inputs (a list of tuples) – a list of tuples [(i,v), (i,v)…], specifying that input dimension i should be set to value v.
 predict_kwargs (dict) – kwargs for the prediction used to predict the right quantiles.
 plot_kwargs (kwargs) – kwargs for the data plot for the plotting library you are using

plot_f
(plot_limits=None, fixed_inputs=None, resolution=None, apply_link=False, which_data_ycols='all', which_data_rows='all', visible_dims=None, levels=20, samples=0, lower=2.5, upper=97.5, plot_density=False, plot_data=True, plot_inducing=True, projection='2d', legend=True, predict_kw=None, **kwargs)¶ Convenience function for plotting the fit of a GP. This is the same as plot, except it plots the latent function fit of the GP!
If you want fine-grained control, use the specific plotting functions supplied in the model.
You can deactivate the legend for this one plot by supplying None to label.
Give the Y_metadata in the predict_kw if you need it.
Parameters:  plot_limits (np.array) – The limits of the plot. If 1D [xmin,xmax], if 2D [[xmin,ymin],[xmax,ymax]]. Defaults to data limits
 fixed_inputs (a list of tuples) – a list of tuples [(i,v), (i,v)…], specifying that input dimension i should be set to value v.
 resolution (int) – The resolution of the prediction [default:200]
 apply_link (bool) – whether to apply the link function of the GP to the raw prediction.
 which_data_ycols ('all' or a list of integers) – when the data has several columns (independent outputs), only plot these
 which_data_rows ('all' or a slice object to slice self.X, self.Y) – which of the training data to plot (default all)
 visible_dims (array-like) – an array specifying the input dimensions to plot (maximum two)
 levels (int) – the number of levels in the density (number bigger than 1, where 35 is smooth and 1 is the same as plot_confidence). You can go higher than 50 if the result is not smooth enough for you.
 samples (int) – the number of samples to draw from the GP and plot into the plot. These will always be samples from the latent function.
 lower (float) – the lower percentile to plot
 upper (float) – the upper percentile to plot
 plot_data (bool) – plot the data into the plot?
 plot_inducing (bool) – plot inducing inputs?
 plot_density (bool) – plot density instead of the confidence interval?
 predict_kw (dict) – the keyword arguments for the prediction. If you want to plot a specific kernel give dict(kern=<specific kernel>) in here
 error_kwargs (dict) – kwargs for the error plot for the plotting library you are using
 plot_kwargs (kwargs) – kwargs for the data plot for the plotting library you are using
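The plot_limits, resolution and fixed_inputs parameters together decide at which inputs the model is evaluated for a plot. The following is a minimal sketch of that idea for a 1D plot, using only NumPy. The helper make_prediction_grid is hypothetical and is not part of GPy; it only illustrates how a list of (i, v) tuples pins input dimension i to value v while the remaining dimension is swept across the plot limits.

```python
import numpy as np

def make_prediction_grid(plot_limits, fixed_inputs, input_dim, resolution=200):
    """Build a (resolution x input_dim) grid for a 1D plot.

    Hypothetical helper, not GPy code: the single input dimension that is
    not listed in fixed_inputs is swept from xmin to xmax, and every fixed
    dimension i is held at its value v.
    """
    fixed_dims = [i for i, _ in fixed_inputs]
    free_dims = [d for d in range(input_dim) if d not in fixed_dims]
    if len(free_dims) != 1:
        raise ValueError("this sketch handles exactly one free input dimension")

    xmin, xmax = plot_limits
    grid = np.empty((resolution, input_dim))
    grid[:, free_dims[0]] = np.linspace(xmin, xmax, resolution)
    for i, v in fixed_inputs:
        grid[:, i] = v  # hold input dimension i at value v
    return grid

# A 3-input model plotted along input 1, with inputs 0 and 2 held fixed.
Xgrid = make_prediction_grid(plot_limits=(-1.0, 2.0),
                             fixed_inputs=[(0, 0.5), (2, -3.0)],
                             input_dim=3)
```

Each row of Xgrid is one point at which a prediction would be requested; only column 1 varies.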

plot_latent
(plot_limits=None, fixed_inputs=None, resolution=None, apply_link=False, which_data_ycols='all', which_data_rows='all', visible_dims=None, levels=20, samples=0, lower=2.5, upper=97.5, plot_density=False, plot_data=True, plot_inducing=True, projection='2d', legend=True, predict_kw=None, **kwargs)¶ Convenience function for plotting the fit of a GP. This is the same as plot, except that it plots the latent function fit of the GP.
If you want fine-grained control, use the specific plotting functions supplied in the model.
You can deactivate the legend for this one plot by supplying None to label.
Give the Y_metadata in the predict_kw if you need it.
Parameters:  plot_limits (np.array) – The limits of the plot. If 1D [xmin,xmax], if 2D [[xmin,ymin],[xmax,ymax]]. Defaults to data limits
 fixed_inputs (a list of tuples) – a list of tuples [(i,v), (i,v)…], specifying that input dimension i should be set to value v.
 resolution (int) – The resolution of the prediction [default:200]
 apply_link (bool) – whether to apply the link function of the GP to the raw prediction.
 which_data_ycols ('all' or a list of integers) – when the data has several columns (independent outputs), only plot these
 which_data_rows ('all' or a slice object to slice self.X, self.Y) – which of the training data to plot (default all)
 visible_dims (arraylike) – an array specifying the input dimensions to plot (maximum two)
 levels (int) – the number of levels in the density (a number greater than 1, where 35 is smooth and 1 is the same as plot_confidence). You can go higher than 50 if the result is not smooth enough for you.
 samples (int) – the number of samples to draw from the GP and plot into the plot. These will always be samples from the latent function.
 lower (float) – the lower percentile to plot
 upper (float) – the upper percentile to plot
 plot_data (bool) – plot the data into the plot?
 plot_inducing (bool) – plot inducing inputs?
 plot_density (bool) – plot density instead of the confidence interval?
 predict_kw (dict) – the keyword arguments for the prediction. If you want to plot a specific kernel give dict(kern=<specific kernel>) in here
 error_kwargs (dict) – kwargs for the error plot for the plotting library you are using
 plot_kwargs (kwargs) – kwargs for the data plot for the plotting library you are using

plot_mean
(plot_limits=None, fixed_inputs=None, resolution=None, plot_raw=False, apply_link=False, visible_dims=None, which_data_ycols='all', levels=20, projection='2d', label='gp mean', predict_kw=None, **kwargs)¶ Plot the mean of the GP.
You can deactivate the legend for this one plot by supplying None to label.
Give the Y_metadata in the predict_kw if you need it.
Parameters:  plot_limits (np.array) – The limits of the plot. If 1D [xmin,xmax], if 2D [[xmin,ymin],[xmax,ymax]]. Defaults to data limits
 fixed_inputs (a list of tuples) – a list of tuples [(i,v), (i,v)…], specifying that input dimension i should be set to value v.
 resolution (int) – The resolution of the prediction [defaults are 1D:200, 2D:50]
 plot_raw (bool) – plot the latent function (usually denoted f) only?
 apply_link (bool) – whether to apply the link function of the GP to the raw prediction.
 which_data_ycols (arraylike) – which columns of y to plot (arraylike or list of ints)
 levels (int) – for 2D plotting, the number of contour levels to use
 projection ({'2d','3d'}) – whether to plot in 2d or 3d. This only applies when plotting two dimensional inputs!
 label (str) – the label for the plot.
 predict_kw (dict) – the keyword arguments for the prediction. If you want to plot a specific kernel give dict(kern=<specific kernel>) in here

plot_noiseless
(plot_limits=None, fixed_inputs=None, resolution=None, apply_link=False, which_data_ycols='all', which_data_rows='all', visible_dims=None, levels=20, samples=0, lower=2.5, upper=97.5, plot_density=False, plot_data=True, plot_inducing=True, projection='2d', legend=True, predict_kw=None, **kwargs)¶ Convenience function for plotting the fit of a GP. This is the same as plot, except that it plots the latent function fit of the GP.
If you want fine-grained control, use the specific plotting functions supplied in the model.
You can deactivate the legend for this one plot by supplying None to label.
Give the Y_metadata in the predict_kw if you need it.
Parameters:  plot_limits (np.array) – The limits of the plot. If 1D [xmin,xmax], if 2D [[xmin,ymin],[xmax,ymax]]. Defaults to data limits
 fixed_inputs (a list of tuples) – a list of tuples [(i,v), (i,v)…], specifying that input dimension i should be set to value v.
 resolution (int) – The resolution of the prediction [default:200]
 apply_link (bool) – whether to apply the link function of the GP to the raw prediction.
 which_data_ycols ('all' or a list of integers) – when the data has several columns (independent outputs), only plot these
 which_data_rows ('all' or a slice object to slice self.X, self.Y) – which of the training data to plot (default all)
 visible_dims (arraylike) – an array specifying the input dimensions to plot (maximum two)
 levels (int) – the number of levels in the density (a number greater than 1, where 35 is smooth and 1 is the same as plot_confidence). You can go higher than 50 if the result is not smooth enough for you.
 samples (int) – the number of samples to draw from the GP and plot into the plot. These will always be samples from the latent function.
 lower (float) – the lower percentile to plot
 upper (float) – the upper percentile to plot
 plot_data (bool) – plot the data into the plot?
 plot_inducing (bool) – plot inducing inputs?
 plot_density (bool) – plot density instead of the confidence interval?
 predict_kw (dict) – the keyword arguments for the prediction. If you want to plot a specific kernel give dict(kern=<specific kernel>) in here
 error_kwargs (dict) – kwargs for the error plot for the plotting library you are using
 plot_kwargs (kwargs) – kwargs for the data plot for the plotting library you are using

plot_samples
(plot_limits=None, fixed_inputs=None, resolution=None, plot_raw=True, apply_link=False, visible_dims=None, which_data_ycols='all', samples=3, projection='2d', label='gp_samples', predict_kw=None, **kwargs)¶ Plot samples drawn from the GP.
You can deactivate the legend for this one plot by supplying None to label.
Give the Y_metadata in the predict_kw if you need it.
Parameters:  plot_limits (np.array) – The limits of the plot. If 1D [xmin,xmax], if 2D [[xmin,ymin],[xmax,ymax]]. Defaults to data limits
 fixed_inputs (a list of tuples) – a list of tuples [(i,v), (i,v)…], specifying that input dimension i should be set to value v.
 resolution (int) – The resolution of the prediction [defaults are 1D:200, 2D:50]
 plot_raw (bool) – plot the latent function (usually denoted f) only? This is usually what you want!
 apply_link (bool) – whether to apply the link function of the GP to the raw prediction.
 visible_dims (arraylike) – which columns of the input X (!) to plot (arraylike or list of ints)
 which_data_ycols (arraylike) – which columns of y to plot (arraylike or list of ints)
 predict_kw (dict) – the keyword arguments for the prediction. If you want to plot a specific kernel give dict(kern=<specific kernel>) in here
 levels (int) – for 2D plotting, the number of contour levels to use

posterior_samples
(X, size=10, full_cov=False, Y_metadata=None, likelihood=None, **predict_kwargs)[source]¶ Samples the posterior GP at the points X, equivalent to posterior_samples_f due to the absence of a likelihood.

posterior_samples_f
(X, size=10, full_cov=True, **predict_kwargs)[source]¶ Samples the posterior TP at the points X.
Parameters:  X (np.ndarray (Nnew x self.input_dim)) – The points at which to take the samples.
 size (int) – the number of a posteriori samples.
 full_cov (bool) – whether to return the full covariance matrix, or just the diagonal.
Returns: fsim: set of simulations
Return type: np.ndarray (D x N x samples) (if D==1 we flatten out the first dimension)
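To make the role of size and full_cov concrete, here is a minimal NumPy sketch of drawing joint samples from a multivariate Student-t distribution, given a posterior mean and a posterior scale matrix. It is an illustration under stated assumptions, not GPy's implementation: the function name sample_student_t, the deg_free argument and the toy mean and covariance are all made up for the example, and it covers the single-output case only, returning an array of shape N x size.

```python
import numpy as np

def sample_student_t(mean, cov, deg_free, size=10, full_cov=True, seed=0):
    """Draw `size` samples from a multivariate Student-t distribution.

    mean: (N,) posterior mean; cov: (N, N) posterior scale matrix.
    With full_cov=False only the diagonal of cov is used, so the points
    are sampled independently of each other.
    """
    rng = np.random.default_rng(seed)
    n = mean.shape[0]
    if full_cov:
        # small jitter keeps the Cholesky factorisation numerically stable
        chol = np.linalg.cholesky(cov + 1e-10 * np.eye(n))
    else:
        chol = np.diag(np.sqrt(np.diag(cov)))
    z = rng.standard_normal((n, size))
    # one chi-square draw per sample turns Gaussian draws into Student-t draws
    u = rng.chisquare(deg_free, size=size)
    return mean[:, None] + (chol @ z) * np.sqrt(deg_free / u)

# Toy posterior at five test points (assumed values, for illustration only).
X = np.linspace(0.0, 1.0, 5)
mean = np.sin(X)
cov = np.exp(-0.5 * (X[:, None] - X[None, :]) ** 2 / 0.2 ** 2)

fsim = sample_student_t(mean, cov, deg_free=5.0, size=10)
fsim_diag = sample_student_t(mean, cov, deg_free=5.0, size=10, full_cov=False)
```

With full_cov=True each column of fsim is a smooth joint draw across the five points; with full_cov=False the points in a column are drawn independently, which is cheaper but loses the correlation between neighbouring inputs.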

predict
(Xnew, full_cov=False, kern=None, **kwargs)[source]¶ Predict the function(s) at the new point(s) Xnew. For Student-t processes, this method is equivalent to predict_noiseless as no likelihood is included in the model.

predict_noiseless
(Xnew, full_cov=False, kern=None)[source]¶ Predict the underlying function f at the new point(s) Xnew.
Parameters:  Xnew (np.ndarray (Nnew x self.input_dim)) – The points at which to make a prediction
 full_cov (bool) – whether to return the full covariance matrix, or just the diagonal
 kern – The kernel to use for prediction (defaults to the model kern).
Returns:  (mean, var):
mean: posterior mean, a NumPy array, Nnew x self.input_dim
var: posterior variance, a NumPy array, Nnew x 1 if full_cov=False, Nnew x Nnew otherwise
If full_cov and self.input_dim > 1, the return shape of var is Nnew x Nnew x self.input_dim. If self.input_dim == 1, the return shape is Nnew x Nnew. This is to allow for different normalizations of the output dimensions.
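The effect of full_cov on the shape of var can be seen in a small NumPy sketch of noise-free latent prediction. This uses the standard Gaussian-process posterior formulas with an RBF kernel and is not taken from GPy's Student-t implementation; the function names rbf and predict_latent, the kernel settings and the toy data are assumptions made for the example.

```python
import numpy as np

def rbf(A, B, variance=1.0, lengthscale=0.3):
    """Squared-exponential kernel matrix between the rows of A and B."""
    sq = np.sum(A**2, axis=1)[:, None] + np.sum(B**2, axis=1)[None, :] - 2.0 * A @ B.T
    return variance * np.exp(-0.5 * np.maximum(sq, 0.0) / lengthscale**2)

def predict_latent(X, Y, Xnew, noise_var=1e-2, full_cov=False):
    """Posterior mean and variance of the latent function f at Xnew."""
    K = rbf(X, X) + noise_var * np.eye(X.shape[0])
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, Y))  # K^{-1} Y
    Ks = rbf(X, Xnew)                                     # (N, Nnew)
    mean = Ks.T @ alpha                                   # (Nnew, 1) for this data
    V = np.linalg.solve(L, Ks)                            # (N, Nnew)
    if full_cov:
        var = rbf(Xnew, Xnew) - V.T @ V                   # (Nnew, Nnew)
    else:
        var = (np.diag(rbf(Xnew, Xnew)) - np.sum(V**2, axis=0))[:, None]  # (Nnew, 1)
    return mean, var

X = np.linspace(0.0, 1.0, 8)[:, None]
Y = np.sin(2.0 * np.pi * X)
Xnew = np.linspace(0.0, 1.0, 5)[:, None]

mean, var = predict_latent(X, Y, Xnew)               # var is Nnew x 1
_, cov = predict_latent(X, Y, Xnew, full_cov=True)   # cov is Nnew x Nnew
```

The diagonal of cov equals var, so full_cov=False is the cheaper choice whenever only pointwise uncertainty is needed.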

predict_quantiles
(X, quantiles=(2.5, 97.5), kern=None, **kwargs)[source]¶ Get the predictive quantiles around the prediction at X
Parameters:  X (np.ndarray (Xnew x self.input_dim)) – The points at which to make a prediction
 quantiles (tuple) – tuple of quantiles, default is (2.5, 97.5) which is the 95% interval
 kern – optional kernel to use for prediction
Returns: list of quantiles for each X and predictive quantiles for interval combination
Return type: [np.ndarray (Xnew x self.output_dim), np.ndarray (Xnew x self.output_dim)]
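The quantiles argument is given in percent, so (2.5, 97.5) brackets the central 95% of the predictive distribution. As a hedged illustration, the sketch below turns a predictive mean and variance into those two bounds under a Gaussian predictive distribution; a Student-t model would use Student-t quantiles instead. The function gaussian_quantiles and the input values are assumptions for the example, not GPy code.

```python
import numpy as np
from statistics import NormalDist

def gaussian_quantiles(mean, var, quantiles=(2.5, 97.5)):
    """Return one array per requested percentile, each shaped like mean."""
    # convert each percentile to a standard-normal z-score
    z_scores = [NormalDist().inv_cdf(q / 100.0) for q in quantiles]
    return [mean + z * np.sqrt(var) for z in z_scores]

# Predictive mean and variance at two test points (assumed values).
mean = np.array([[0.0], [1.0]])
var = np.array([[1.0], [4.0]])
lower, upper = gaussian_quantiles(mean, var)
```

For the first test point (mean 0, variance 1) the bounds are approximately -1.96 and 1.96, the familiar 95% interval of a standard normal.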
GPy.models.warped_gp module¶

class
WarpedGP
(X, Y, kernel=None, warping_function=None, warping_terms=3, normalizer=False)[source]¶ Bases:
GPy.core.gp.GP
This defines a GP Regression model that applies a warping function to the output.

log_predictive_density
(x_test, y_test, Y_metadata=None)[source]¶ Calculation of the log predictive density. Note that the Jacobian of the warping function is added here.
Parameters:  x_test ((Nx1) array) – test locations (x_{*})
 y_test ((Nx1) array) – test observations (y_{*})
 Y_metadata – metadata associated with the test points
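The Jacobian term comes from the change of variables between the observation y and its warped value f(y): if f(y) is Gaussian with mean mu and variance var, the density of y is the Gaussian density at f(y) multiplied by f'(y). The sketch below shows this for a single test point per row. It is an assumption-laden illustration, not GPy's code: the function name, the choice of a logarithmic warping function and the predictive moments are all made up for the example.

```python
import numpy as np

def warped_log_predictive_density(y_test, mu, var, warp, warp_grad):
    """log p(y*) = log N(warp(y*); mu, var) + log warp'(y*)."""
    z = warp(y_test)  # test observations mapped into the warped space
    log_gauss = -0.5 * np.log(2.0 * np.pi * var) - 0.5 * (z - mu) ** 2 / var
    return log_gauss + np.log(warp_grad(y_test))  # Jacobian correction

# Assumed example: logarithmic warping, so y* follows a log-normal distribution.
y = np.array([[0.5], [1.0], [2.0]])
mu = np.zeros_like(y)
var = np.full_like(y, 0.25)
lpd = warped_log_predictive_density(y, mu, var, np.log, lambda t: 1.0 / t)
```

Without the Jacobian term the result would be a density over f(y) rather than over y, and it would not integrate to one in the observation space.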

predict
(Xnew, kern=None, pred_init=None, Y_metadata=None, median=False, deg_gauss_hermite=20, likelihood=None)[source]¶ Prediction results depend on:
 The value of the self.predict_in_warped_space flag
 The median flag passed as argument
The likelihood keyword is never used; it is only there to follow the plotting API.
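One plausible reading of the median and deg_gauss_hermite arguments is sketched below: the median of the prediction is the inverse warping of the Gaussian mean, while the mean requires an expectation over the Gaussian, approximated here by Gauss-Hermite quadrature. The function unwarp_prediction, the exponential inverse warping and the input values are assumptions for illustration only; this is not presented as the library's exact code path.

```python
import numpy as np

def unwarp_prediction(mu, var, inverse_warp, median=False, deg_gauss_hermite=20):
    """Map a Gaussian prediction N(mu, var) back through inverse_warp."""
    if median:
        # a monotonic map sends the Gaussian median (= mu) to the median of y
        return inverse_warp(mu)
    nodes, weights = np.polynomial.hermite.hermgauss(deg_gauss_hermite)
    # the substitution f = mu + sqrt(2 var) x puts the Gaussian expectation
    # into the form required by Gauss-Hermite quadrature
    f = mu[..., None] + np.sqrt(2.0 * var)[..., None] * nodes
    return (inverse_warp(f) * weights).sum(axis=-1) / np.sqrt(np.pi)

# Assumed example: the warping is log, so the inverse warping is exp.
mu = np.array([0.5])
var = np.array([0.09])
median_y = unwarp_prediction(mu, var, np.exp, median=True)
mean_y = unwarp_prediction(mu, var, np.exp)
```

For this log-normal case the closed forms are exp(mu) for the median and exp(mu + var / 2) for the mean, which gives a convenient check on the quadrature.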

predict_quantiles
(X, quantiles=(2.5, 97.5), Y_metadata=None, likelihood=None, kern=None)[source]¶ Get the predictive quantiles around the prediction at X
Parameters:  X (np.ndarray (Xnew x self.input_dim)) – The points at which to make a prediction
 quantiles (tuple) – tuple of quantiles, default is (2.5, 97.5) which is the 95% interval
Returns: list of quantiles for each X and predictive quantiles for interval combination
Return type: [np.ndarray (Xnew x self.input_dim), np.ndarray (Xnew x self.input_dim)]
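Because a warping function is monotonic, quantiles computed for the Gaussian prediction can be carried over to the observation space by applying the inverse warping to each bound. The sketch below illustrates this under the assumption of a logarithmic warping; the function warped_quantiles and the input values are hypothetical and not part of GPy.

```python
import numpy as np
from statistics import NormalDist

def warped_quantiles(mu, var, inverse_warp, quantiles=(2.5, 97.5)):
    """Gaussian quantiles of the warped prediction, mapped through inverse_warp."""
    bounds = []
    for q in quantiles:
        z = NormalDist().inv_cdf(q / 100.0)
        # a monotonic increasing inverse warping preserves the quantile level
        bounds.append(inverse_warp(mu + z * np.sqrt(var)))
    return bounds

# Assumed example: log warping, so the inverse warping is exp.
mu = np.array([[0.5]])
var = np.array([[0.09]])
lower, upper = warped_quantiles(mu, var, np.exp)
```

The resulting interval is asymmetric around the median exp(mu), which is expected: a symmetric interval in the warped space becomes a skewed one after the nonlinear inverse warping.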
