# GPy.likelihoods package

## GPy.likelihoods.bernoulli module

Bernoulli likelihood

$p(y_{i}|\lambda(f_{i})) = \lambda(f_{i})^{y_{i}}(1-\lambda(f_{i}))^{1-y_{i}}$

Note

Y takes values in either {-1, 1} or {0, 1}. The inverse link function should map into [0, 1], e.g. probit (default) or Heaviside.

Hessian of the log pdf at y, given inv_link_f, w.r.t. inv_link_f. Entry (i, j) is the second derivative w.r.t. the inverse link of f_i and the inverse link of f_j, so the Hessian is 0 unless i == j.

$\frac{d^{2}\ln p(y_{i}|\lambda(f_{i}))}{d\lambda(f)^{2}} = \frac{-y_{i}}{\lambda(f)^{2}} - \frac{(1-y_{i})}{(1-\lambda(f))^{2}}$
Parameters: inv_link_f (Nx1 array) – latent variables passed through the inverse link of f; y (Nx1 array) – data; Y_metadata – not used in Bernoulli
Returns: diagonal of the Hessian of the log likelihood (second derivative of the log likelihood evaluated at the inverse link of f)
Return type: Nx1 array

Note

Returns only the diagonal of the Hessian, since it is 0 everywhere else: the likelihood factorizes over cases (the distribution for y_i depends only on the inverse link of f_i, not on the inverse link of f_(j!=i)).

Third order derivative log-likelihood function at y given inverse link of f w.r.t inverse link of f

$\frac{d^{3} \ln p(y_{i}|\lambda(f_{i}))}{d\lambda(f)^{3}} = \frac{2y_{i}}{\lambda(f)^{3}} - \frac{2(1-y_{i})}{(1-\lambda(f))^{3}}$
Parameters: inv_link_f (Nx1 array) – latent variables passed through the inverse link of f; y (Nx1 array) – data; Y_metadata – not used in Bernoulli
Returns: third derivative of the log likelihood evaluated at inverse_link(f)
Return type: Nx1 array

$\frac{d\ln p(y_{i}|\lambda(f_{i}))}{d\lambda(f)} = \frac{y_{i}}{\lambda(f_{i})} - \frac{(1 - y_{i})}{(1 - \lambda(f_{i}))}$
Parameters: inv_link_f (Nx1 array) – latent variables passed through the inverse link of f; y (Nx1 array) – data; Y_metadata – not used in Bernoulli
Returns: gradient of the log likelihood evaluated at the inverse link of f
Return type: Nx1 array

Log Likelihood function given inverse link of f.

$\ln p(y_{i}|\lambda(f_{i})) = y_{i}\log\lambda(f_{i}) + (1-y_{i})\log (1-\lambda(f_{i}))$
Parameters: inv_link_f (Nx1 array) – latent variables passed through the inverse link of f; y (Nx1 array) – data; Y_metadata – not used in Bernoulli
Returns: log likelihood evaluated at the inverse link of f
Return type: float
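The Bernoulli log likelihood and its first three derivatives with respect to λ(f) can be checked against finite differences. The following is a minimal sketch in plain Python, assuming y in {0, 1} and λ strictly inside (0, 1); the function names are illustrative and are not GPy's API.

```python
import math

def bernoulli_logpdf(lam, y):
    """ln p(y | lam) = y ln(lam) + (1 - y) ln(1 - lam), for y in {0, 1}."""
    return y * math.log(lam) + (1 - y) * math.log(1.0 - lam)

def bernoulli_dlogpdf(lam, y):
    """First derivative of the log pdf w.r.t. lam."""
    return y / lam - (1 - y) / (1.0 - lam)

def bernoulli_d2logpdf(lam, y):
    """Second derivative w.r.t. lam (one diagonal entry of the Hessian)."""
    return -y / lam**2 - (1 - y) / (1.0 - lam) ** 2

def bernoulli_d3logpdf(lam, y):
    """Third derivative w.r.t. lam."""
    return 2.0 * y / lam**3 - 2.0 * (1 - y) / (1.0 - lam) ** 3

def central_diff(fn, x, h=1e-5):
    """Central finite difference, used only as a sanity check."""
    return (fn(x + h) - fn(x - h)) / (2.0 * h)

lam = 0.3
for y in (0, 1):
    numeric = central_diff(lambda l: bernoulli_logpdf(l, y), lam)
    print(y, bernoulli_dlogpdf(lam, y), numeric)
```

Each analytic derivative should agree with the finite difference of the expression one order below it.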

Moment matching of the marginal approximation in the EP algorithm

Parameters: i – index of the observation (int); tau_i – precision of the cavity distribution (float); v_i – mean divided by variance of the cavity distribution (float)
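For a probit inverse link the EP moments have a closed form. The sketch below assumes labels y in {-1, +1}, a probit inverse link, and a cavity distribution N(f | v_i / tau_i, 1 / tau_i); the function `probit_moments` is a hypothetical helper written for illustration, not GPy's implementation.

```python
import math

def std_normal_pdf(z):
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

def std_normal_cdf(z):
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def probit_moments(y, tau_i, v_i):
    """Zeroth, first and second central moments of Phi(y f) N(f | mu, sigma2).

    Assumes y in {-1, +1}; mu = v_i / tau_i and sigma2 = 1 / tau_i are the
    cavity mean and variance recovered from the natural parameters.
    """
    mu = v_i / tau_i
    sigma2 = 1.0 / tau_i
    z = y * mu / math.sqrt(1.0 + sigma2)
    z_hat = std_normal_cdf(z)                    # normaliser
    ratio = std_normal_pdf(z) / z_hat
    mu_hat = mu + y * sigma2 * ratio / math.sqrt(1.0 + sigma2)
    sigma2_hat = sigma2 - sigma2**2 * ratio * (z + ratio) / (1.0 + sigma2)
    return z_hat, mu_hat, sigma2_hat

print(probit_moments(y=1, tau_i=0.5, v_i=0.3))
```

For likelihoods without such a closed form, the same three moments can be obtained by one-dimensional quadrature over f.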

Likelihood function given inverse link of f.

$p(y_{i}|\lambda(f_{i})) = \lambda(f_{i})^{y_{i}}(1-\lambda(f_{i}))^{1-y_{i}}$
Parameters: inv_link_f (Nx1 array) – latent variables passed through the inverse link of f; y (Nx1 array) – data; Y_metadata – not used in Bernoulli
Returns: likelihood evaluated for this point
Return type: float

Get the “quantiles” of the binary labels (Bernoulli draws). All the quantiles must be either 0 or 1, since those are the only values a draw can take.

Returns a set of samples of observations based on a given value of the latent variable.

Parameters: gp – latent variable
to_dict()

## GPy.likelihoods.binomial module

Binomial likelihood

$p(y_{i}|\lambda(f_{i})) = \binom{N}{y_{i}}\lambda(f_{i})^{y_{i}}(1-\lambda(f_{i}))^{N-y_{i}}$

Note

Y takes values in {0, 1, …, N}, the number of successes out of N trials. The inverse link function should map into [0, 1], e.g. probit (default) or Heaviside.

Hessian of the log pdf at y, given inv_link_f, w.r.t. inv_link_f. Entry (i, j) is the second derivative w.r.t. the inverse link of f_i and the inverse link of f_j, so the Hessian is 0 unless i == j.

$\frac{d^{2}\ln p(y_{i}|\lambda(f_{i}))}{d\lambda(f)^{2}} = \frac{-y_{i}}{\lambda(f)^{2}} - \frac{(N-y_{i})}{(1-\lambda(f))^{2}}$
Parameters: inv_link_f (Nx1 array) – latent variables passed through the inverse link of f; y (Nx1 array) – data; Y_metadata – must contain ‘trials’ (N in the formula)
Returns: diagonal of the Hessian of the log likelihood (second derivative of the log likelihood evaluated at the inverse link of f)
Return type: Nx1 array

Note

Returns only the diagonal of the Hessian, since it is 0 everywhere else: the likelihood factorizes over cases (the distribution for y_i depends only on the inverse link of f_i, not on the inverse link of f_(j!=i)).

Third order derivative log-likelihood function at y given inverse link of f w.r.t inverse link of f

$\frac{d^{3}\ln p(y_{i}|\lambda(f_{i}))}{d\lambda(f)^{3}} = \frac{2y_{i}}{\lambda(f)^{3}} - \frac{2(N-y_{i})}{(1-\lambda(f))^{3}}$
Parameters: inv_link_f (Nx1 array) – latent variables passed through the inverse link of f; y (Nx1 array) – data; Y_metadata – must contain ‘trials’ (N in the formula)
Returns: third derivative of the log likelihood evaluated at the inverse link of f
Return type: Nx1 array

Note

Returns only the diagonal entries, since the derivative is 0 everywhere else: the likelihood factorizes over cases (the distribution for y_i depends only on the inverse link of f_i, not on the inverse link of f_(j!=i)).

$\frac{d\ln p(y_{i}|\lambda(f_{i}))}{d\lambda(f)} = \frac{y_{i}}{\lambda(f)} - \frac{(N-y_{i})}{(1-\lambda(f))}$
Parameters: inv_link_f (Nx1 array) – latent variables passed through the inverse link of f; y (Nx1 array) – data; Y_metadata – must contain ‘trials’
Returns: gradient of the log likelihood evaluated at the inverse link of f
Return type: Nx1 array

Log Likelihood function given inverse link of f.

$\ln p(y_{i}|\lambda(f_{i})) = \log\binom{N}{y_{i}} + y_{i}\log\lambda(f_{i}) + (N-y_{i})\log (1-\lambda(f_{i}))$
Parameters: inv_link_f (Nx1 array) – latent variables passed through the inverse link of f; y (Nx1 array) – data; Y_metadata – must contain ‘trials’
Returns: log likelihood evaluated at the inverse link of f
Return type: float
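A binomial log likelihood with N trials can be sketched in a few lines. This example assumes that `n_trials` plays the role of the ‘trials’ entry in Y_metadata and that λ lies strictly inside (0, 1); the helper names are hypothetical.

```python
import math

def binomial_logpdf(lam, y, n_trials):
    """ln p(y | lam) for y successes out of n_trials, success probability lam."""
    log_n_choose_y = (math.lgamma(n_trials + 1) - math.lgamma(y + 1)
                      - math.lgamma(n_trials - y + 1))
    return log_n_choose_y + y * math.log(lam) + (n_trials - y) * math.log(1.0 - lam)

def binomial_dlogpdf(lam, y, n_trials):
    """Gradient of the log pdf w.r.t. lam."""
    return y / lam - (n_trials - y) / (1.0 - lam)

# The probabilities over y = 0..N should sum to one.
n_trials, lam = 10, 0.3
total = sum(math.exp(binomial_logpdf(lam, y, n_trials)) for y in range(n_trials + 1))
print(total)
```

The binomial coefficient does not depend on λ, so it drops out of every derivative with respect to λ.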

Likelihood function given inverse link of f.

$p(y_{i}|\lambda(f_{i})) = \binom{N}{y_{i}}\lambda(f_{i})^{y_{i}}(1-\lambda(f_{i}))^{N-y_{i}}$
Parameters: inv_link_f (Nx1 array) – latent variables passed through the inverse link of f; y (Nx1 array) – data; Y_metadata – must contain ‘trials’
Returns: likelihood evaluated for this point
Return type: float

Returns a set of samples of observations based on a given value of the latent variable.

Parameters: gp – latent variable

## GPy.likelihoods.exponential module

Exponential likelihood. Y is expected to take non-negative real values.

$p(y_{i}|\lambda(f_{i})) = \lambda(f_{i})\exp (-y_{i}\lambda(f_{i}))$

$\frac{d^{2} \ln p(y_{i}|\lambda(f_{i}))}{d^{2}\lambda(f)} = -\frac{1}{\lambda(f_{i})^{2}}$
Parameters: link_f (Nx1 array) – latent variables link(f); y (Nx1 array) – data; Y_metadata – not used in the exponential distribution
Returns: diagonal of the Hessian (second derivative of the log likelihood evaluated at link(f))
Return type: Nx1 array

Note

Returns only the diagonal of the Hessian, since it is 0 everywhere else: the likelihood factorizes over cases (the distribution for y_i depends only on link(f_i), not on link(f_(j!=i))).

$\frac{d^{3} \ln p(y_{i}|\lambda(f_{i}))}{d^{3}\lambda(f)} = \frac{2}{\lambda(f_{i})^{3}}$
Parameters: link_f (Nx1 array) – latent variables link(f); y (Nx1 array) – data; Y_metadata – not used in the exponential distribution
Returns: third derivative of the log likelihood evaluated at link(f)
Return type: Nx1 array

$\frac{d \ln p(y_{i}|\lambda(f_{i}))}{d\lambda(f)} = \frac{1}{\lambda(f)} - y_{i}$
Parameters: link_f (Nx1 array) – latent variables link(f); y (Nx1 array) – data; Y_metadata – not used in the exponential distribution
Returns: gradient of the log likelihood evaluated at link(f)
Return type: Nx1 array

$\ln p(y_{i}|\lambda(f_{i})) = \ln \lambda(f_{i}) - y_{i}\lambda(f_{i})$
Parameters: link_f (Nx1 array) – latent variables link(f); y (Nx1 array) – data; Y_metadata – not used in the exponential distribution
Returns: log likelihood evaluated for this point
Return type: float

$p(y_{i}|\lambda(f_{i})) = \lambda(f_{i})\exp (-y_{i}\lambda(f_{i}))$
Parameters: link_f (Nx1 array) – latent variables link(f); y (Nx1 array) – data; Y_metadata – not used in the exponential distribution
Returns: likelihood evaluated for this point
Return type: float

Returns a set of samples of observations based on a given value of the latent variable.

Parameters: gp – latent variable

## GPy.likelihoods.gamma module

Gamma likelihood

$\begin{split}p(y_{i}|\lambda(f_{i})) = \frac{\beta^{\alpha_{i}}}{\Gamma(\alpha_{i})}y_{i}^{\alpha_{i}-1}e^{-\beta y_{i}}\\ \alpha_{i} = \beta \lambda(f_{i})\end{split}$

$\begin{split}\frac{d^{2} \ln p(y_{i}|\lambda(f_{i}))}{d^{2}\lambda(f)} = -\beta^{2}\frac{d\Psi(\alpha_{i})}{d\alpha_{i}}\\ \alpha_{i} = \beta \lambda(f_{i})\end{split}$
Parameters: link_f (Nx1 array) – latent variables link(f); y (Nx1 array) – data; Y_metadata – not used in the gamma distribution
Returns: diagonal of the Hessian (second derivative of the log likelihood evaluated at link(f))
Return type: Nx1 array

Note

Returns only the diagonal of the Hessian, since it is 0 everywhere else: the likelihood factorizes over cases (the distribution for y_i depends only on link(f_i), not on link(f_(j!=i))).

$\begin{split}\frac{d^{3} \ln p(y_{i}|\lambda(f_{i}))}{d^{3}\lambda(f)} = -\beta^{3}\frac{d^{2}\Psi(\alpha_{i})}{d\alpha_{i}^{2}}\\ \alpha_{i} = \beta \lambda(f_{i})\end{split}$
Parameters: link_f (Nx1 array) – latent variables link(f); y (Nx1 array) – data; Y_metadata – not used in the gamma distribution
Returns: third derivative of the log likelihood evaluated at link(f)
Return type: Nx1 array

$\begin{split}\frac{d \ln p(y_{i}|\lambda(f_{i}))}{d\lambda(f)} = \beta \log (\beta y_{i}) - \Psi(\alpha_{i})\beta\\ \alpha_{i} = \beta \lambda(f_{i})\end{split}$
Parameters: link_f (Nx1 array) – latent variables link(f); y (Nx1 array) – data; Y_metadata – not used in the gamma distribution
Returns: gradient of the log likelihood evaluated at link(f)
Return type: Nx1 array

$\begin{split}\ln p(y_{i}|\lambda(f_{i})) = \alpha_{i}\log \beta - \log \Gamma(\alpha_{i}) + (\alpha_{i} - 1)\log y_{i} - \beta y_{i}\\ \alpha_{i} = \beta \lambda(f_{i})\end{split}$
Parameters: link_f (Nx1 array) – latent variables link(f); y (Nx1 array) – data; Y_metadata – not used in the gamma distribution
Returns: log likelihood evaluated for this point
Return type: float

$\begin{split}p(y_{i}|\lambda(f_{i})) = \frac{\beta^{\alpha_{i}}}{\Gamma(\alpha_{i})}y_{i}^{\alpha_{i}-1}e^{-\beta y_{i}}\\ \alpha_{i} = \beta \lambda(f_{i})\end{split}$
Parameters: link_f (Nx1 array) – latent variables link(f); y (Nx1 array) – data; Y_metadata – not used in the gamma distribution
Returns: likelihood evaluated for this point
Return type: float
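The derivative expressions for the gamma likelihood only depend on link(f) if the shape is tied to the latent value. The sketch below assumes α_i = β · link(f_i), under which the mean of y is α_i / β = link(f_i), and checks that by a simple Riemann sum; the function names and the integration grid are illustrative choices.

```python
import math

def gamma_logpdf(link_f, y, beta):
    """ln p(y | link_f) for a gamma density with rate beta.

    Assumption: the shape is alpha = beta * link_f.
    """
    alpha = beta * link_f
    return (alpha * math.log(beta) - math.lgamma(alpha)
            + (alpha - 1.0) * math.log(y) - beta * y)

def gamma_mean_by_quadrature(link_f, beta, upper=60.0, n=60000):
    """Approximate E[y] = integral of y p(y) dy on (0, upper] with a Riemann sum."""
    h = upper / n
    total = 0.0
    for k in range(1, n + 1):
        y = k * h
        total += y * math.exp(gamma_logpdf(link_f, y, beta)) * h
    return total

print(gamma_mean_by_quadrature(link_f=3.0, beta=2.0))
```

Under this parameterisation link(f) acts as the mean of the observation and β controls how tightly y concentrates around it.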

## GPy.likelihoods.gaussian module

A lot of this code assumes that the link function is the identity.

I think the Laplace code is okay, but I’m quite sure that the EP moments will only work if the link is identity.

Furthermore, exact Gaussian inference can only be done for the identity link, so we should be asserting so for all calls which relate to that.

James 11/12/13

Gaussian likelihood

$\ln p(y_{i}|\lambda(f_{i})) = -\frac{N \ln 2\pi}{2} - \frac{\ln |K|}{2} - \frac{(y_{i} - \lambda(f_{i}))^{T}\sigma^{-2}(y_{i} - \lambda(f_{i}))}{2}$
Parameters: variance – variance value of the Gaussian distribution; N (int) – number of data points

The hessian will be 0 unless i == j

$\frac{d^{2} \ln p(y_{i}|\lambda(f_{i}))}{d^{2}f} = -\frac{1}{\sigma^{2}}$
Parameters: link_f (Nx1 array) – latent variables link(f); y (Nx1 array) – data; Y_metadata – not used in Gaussian
Returns: diagonal of the Hessian of the log likelihood (second derivative of the log likelihood evaluated at link(f))
Return type: Nx1 array

Note

Returns only the diagonal of the Hessian, since it is 0 everywhere else: the likelihood factorizes over cases (the distribution for y_i depends only on link(f_i), not on link(f_(j!=i))).

$\frac{d}{d\sigma^{2}}(\frac{d^{2} \ln p(y_{i}|\lambda(f_{i}))}{d^{2}\lambda(f)}) = \frac{1}{\sigma^{4}}$
Parameters: link_f (Nx1 array) – latent variables link(f); y (Nx1 array) – data; Y_metadata – not used in Gaussian
Returns: derivative of the Hessian of the log likelihood, evaluated at link(f_i) and link(f_j), w.r.t. the variance parameter
Return type: Nx1 array

$\frac{d^{3} \ln p(y_{i}|\lambda(f_{i}))}{d^{3}\lambda(f)} = 0$
Parameters: link_f (Nx1 array) – latent variables link(f); y (Nx1 array) – data; Y_metadata – not used in Gaussian
Returns: third derivative of the log likelihood evaluated at link(f)
Return type: Nx1 array

$\frac{d \ln p(y_{i}|\lambda(f_{i}))}{d\lambda(f)} = \frac{1}{\sigma^{2}}(y_{i} - \lambda(f_{i}))$

Derivative of the dlogpdf_dlink w.r.t variance parameter (noise_variance)

$\frac{d}{d\sigma^{2}}(\frac{d \ln p(y_{i}|\lambda(f_{i}))}{d\lambda(f)}) = \frac{1}{\sigma^{4}}(-y_{i} + \lambda(f_{i}))$
Parameters: link_f (Nx1 array) – latent variables link(f); y (Nx1 array) – data; Y_metadata – not used in Gaussian
Returns: derivative of the gradient of the log likelihood, evaluated at link(f), w.r.t. the variance parameter
Return type: Nx1 array

Gradient of the log-likelihood function at y given link(f), w.r.t variance parameter (noise_variance)

$\frac{d \ln p(y_{i}|\lambda(f_{i}))}{d\sigma^{2}} = -\frac{N}{2\sigma^{2}} + \frac{(y_{i} - \lambda(f_{i}))^{2}}{2\sigma^{4}}$
Parameters: link_f (Nx1 array) – latent variables link(f); y (Nx1 array) – data; Y_metadata – not used in Gaussian
Returns: derivative of the log likelihood, evaluated at link(f), w.r.t. the variance parameter
Return type: float
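The Gaussian derivatives with respect to link(f) and to the noise variance can be verified numerically. This sketch assumes the per-observation form of the log density (N = 1 and |K| = σ² in the formulas here) and uses hypothetical function names.

```python
import math

def gaussian_logpdf(link_f, y, variance):
    """Per-observation log density of N(y | link_f, variance)."""
    return (-0.5 * math.log(2.0 * math.pi) - 0.5 * math.log(variance)
            - (y - link_f) ** 2 / (2.0 * variance))

def gaussian_dlogpdf_dlink(link_f, y, variance):
    """Gradient w.r.t. link_f."""
    return (y - link_f) / variance

def gaussian_dlogpdf_dvar(link_f, y, variance):
    """Gradient w.r.t. the variance, for a single observation."""
    return -0.5 / variance + (y - link_f) ** 2 / (2.0 * variance**2)

def gaussian_dlogpdf_dlink_dvar(link_f, y, variance):
    """Derivative of the link gradient w.r.t. the variance."""
    return (-y + link_f) / variance**2

print(gaussian_dlogpdf_dlink(0.2, 1.0, 0.5), gaussian_dlogpdf_dvar(0.2, 1.0, 0.5))
```

The second derivative with respect to link(f) is the constant −1/σ², which is why the third derivative is 0.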

assumes independence

$\ln p(y_{i}|\lambda(f_{i})) = -\frac{N \ln 2\pi}{2} - \frac{\ln |K|}{2} - \frac{(y_{i} - \lambda(f_{i}))^{T}\sigma^{-2}(y_{i} - \lambda(f_{i}))}{2}$
Parameters: link_f (Nx1 array) – latent variables link(f); y (Nx1 array) – data; Y_metadata – not used in Gaussian
Returns: log likelihood evaluated for this point
Return type: float

Moment matching of the marginal approximation in the EP algorithm

Parameters: i – index of the observation (int); tau_i – precision of the cavity distribution (float); v_i – mean divided by variance of the cavity distribution (float)

$p(y_{i}|\lambda(f_{i})) = (2\pi)^{-\frac{N}{2}}|K|^{-\frac{1}{2}}\exp\left(-\frac{(y_{i} - \lambda(f_{i}))^{T}\sigma^{-2}(y_{i} - \lambda(f_{i}))}{2}\right)$
Parameters: link_f (Nx1 array) – latent variables link(f); y (Nx1 array) – data; Y_metadata – not used in Gaussian
Returns: likelihood evaluated for this point
Return type: float
predictive_mean(mu, sigma)
predictive_variance(mu, sigma, predictive_mean=None)

Returns a set of samples of observations based on a given value of the latent variable.

Parameters: gp – latent variable
to_dict()

## GPy.likelihoods.likelihood module

Likelihood base class, used to define p(y|f).

All instances use _inverse_ link functions, which can be swapped out. It is expected that inheriting classes define a default inverse link function.

To use this class, inherit and define missing functionality.

Inheriting classes must implement:
pdf_link: a bound method which turns the output of the link function into the pdf
logpdf_link: the logarithm of the above

To enable use with EP, inheriting classes must define:
TODO: a suitable derivative function for any parameters of the class

It is also desirable to define:
moments_match_ep: a function to compute the EP moments. If this isn’t defined, the moments will be computed using 1D quadrature.

To enable use with Laplace approximation, inheriting classes must define:
Some derivative functions AS TODO

For exact Gaussian inference, define JH TODO

MCMC_pdf_samples(fNew, num_samples=1000, starting_loc=None, stepsize=0.1, burn_in=1000, Y_metadata=None)

Simple implementation of the Metropolis sampling algorithm.

Runs a parallel chain for each input dimension (treats each f independently), and thus assumes f*_1 is independent of f*_2, etc.

Parameters: num_samples – number of samples to take; fNew – f around which to sample; starting_loc – starting locations of the independent chains (usually the conditional_mean of the likelihood), often link_f; stepsize – step size for the normal proposal distribution (will need modifying); burn_in – number of samples to use for burn-in (will need modifying); Y_metadata – Y_metadata for pdf
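A single Metropolis chain with a normal proposal can be sketched as follows. This is a generic illustration, not the body of MCMC_pdf_samples: the function `metropolis_samples`, its `log_pdf` callable and the `seed` argument are hypothetical, and only one scalar chain is run.

```python
import math
import random

def metropolis_samples(log_pdf, starting_loc, num_samples=1000, stepsize=0.1,
                       burn_in=1000, seed=0):
    """Draw samples from the density proportional to exp(log_pdf(x))."""
    rng = random.Random(seed)
    current = starting_loc
    current_lp = log_pdf(current)
    samples = []
    for step in range(burn_in + num_samples):
        # Symmetric normal proposal centred on the current state.
        proposal = current + rng.gauss(0.0, stepsize)
        proposal_lp = log_pdf(proposal)
        # Accept with probability min(1, p(proposal) / p(current)).
        accept_prob = math.exp(min(0.0, proposal_lp - current_lp))
        if rng.random() < accept_prob:
            current, current_lp = proposal, proposal_lp
        # Keep the state only once the burn-in period is over.
        if step >= burn_in:
            samples.append(current)
    return samples

draws = metropolis_samples(lambda x: -0.5 * x * x, starting_loc=0.0,
                           num_samples=2000, stepsize=1.0)
print(len(draws), sum(draws) / len(draws))
```

The step size trades off acceptance rate against how far each accepted move travels, which is why it usually needs tuning per problem.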
conditional_mean(gp)

The mean of the random variable conditioned on one value of the GP

conditional_variance(gp)

The variance of the random variable conditioned on one value of the GP

d2logpdf_df2(*args, **kwargs)

TODO: Doc strings

d3logpdf_df3(*args, **kwargs)

Evaluates the link function link(f), then computes the derivative of the log likelihood using it. Uses Faà di Bruno’s formula for the chain rule.

$\frac{d\log p(y|\lambda(f))}{df} = \frac{d\log p(y|\lambda(f))}{d\lambda(f)}\frac{d\lambda(f)}{df}$
Parameters: f (Nx1 array) – latent variables f; y (Nx1 array) – data; Y_metadata – not used
Returns: derivative of the log likelihood evaluated for this point
Return type: 1xN array
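The chain rule from derivatives w.r.t. λ(f) to derivatives w.r.t. f can be illustrated with a Bernoulli likelihood and a probit inverse link. The sketch below is an assumption-laden example (y in {0, 1}, probit link, hypothetical function names); the second-order expression is the two-term case of Faà di Bruno’s formula.

```python
import math

def probit(f):
    """Inverse link: standard normal CDF."""
    return 0.5 * (1.0 + math.erf(f / math.sqrt(2.0)))

def dprobit_df(f):
    """First derivative of the inverse link: standard normal pdf."""
    return math.exp(-0.5 * f * f) / math.sqrt(2.0 * math.pi)

def d2probit_df2(f):
    """Second derivative of the inverse link."""
    return -f * dprobit_df(f)

def dlogpdf_dlam(lam, y):
    return y / lam - (1 - y) / (1.0 - lam)

def d2logpdf_dlam2(lam, y):
    return -y / lam**2 - (1 - y) / (1.0 - lam) ** 2

def dlogpdf_df(f, y):
    """d log p / df = (d log p / d lam) * (d lam / df)."""
    return dlogpdf_dlam(probit(f), y) * dprobit_df(f)

def d2logpdf_df2(f, y):
    """Second-order chain rule: curvature term plus gradient term."""
    lam = probit(f)
    return (d2logpdf_dlam2(lam, y) * dprobit_df(f) ** 2
            + dlogpdf_dlam(lam, y) * d2probit_df2(f))

print(dlogpdf_df(0.4, 1), d2logpdf_df2(0.4, 1))
```

With an identity link the second term vanishes and the derivatives w.r.t. f equal those w.r.t. λ(f).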

TODO: Doc strings

TODO: Doc strings

static from_dict(input_dict)

Calculation of the log predictive density

Parameters: y_test ((Nx1) array) – test observations (y_{*}); mu_star ((Nx1) array) – predictive mean of Gaussian p(f_{*}|mu_{*}, var_{*}); var_star ((Nx1) array) – predictive variance of Gaussian p(f_{*}|mu_{*}, var_{*})
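The log predictive density is the log of the likelihood averaged over the Gaussian posterior on f_{*}. The sketch below approximates that integral with a Riemann sum on a uniform grid and compares it, in the usage example, with a Gaussian likelihood where the answer is known in closed form; the helper names, the grid settings and the noise variance 0.25 are illustrative assumptions.

```python
import math

def log_predictive_density(log_lik, y_test, mu_star, var_star, n=4001, width=10.0):
    """log of the integral of p(y* | f*) N(f* | mu*, var*) over f*."""
    sd = math.sqrt(var_star)
    lo = mu_star - width * sd
    h = 2.0 * width * sd / (n - 1)
    total = 0.0
    for k in range(n):
        f = lo + k * h
        prior = math.exp(-0.5 * ((f - mu_star) / sd) ** 2) / (sd * math.sqrt(2.0 * math.pi))
        total += math.exp(log_lik(y_test, f)) * prior * h
    return math.log(total)

def gaussian_log_lik(y, f, noise_var=0.25):
    """log N(y | f, noise_var); noise_var is an arbitrary illustrative value."""
    return -0.5 * math.log(2.0 * math.pi * noise_var) - (y - f) ** 2 / (2.0 * noise_var)

def gaussian_closed_form(y, mu_star, var_star, noise_var=0.25):
    """For a Gaussian likelihood the integral is N(y | mu*, var* + noise_var)."""
    total_var = var_star + noise_var
    return -0.5 * math.log(2.0 * math.pi * total_var) - (y - mu_star) ** 2 / (2.0 * total_var)

print(log_predictive_density(gaussian_log_lik, 1.0, 0.3, 0.5),
      gaussian_closed_form(1.0, 0.3, 0.5))
```

For non-Gaussian likelihoods only `log_lik` changes; the integration over f_{*} stays the same.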

Calculation of the log predictive density via sampling

Parameters: y_test ((Nx1) array) – test observations (y_{*}); mu_star ((Nx1) array) – predictive mean of Gaussian p(f_{*}|mu_{*}, var_{*}); var_star ((Nx1) array) – predictive variance of Gaussian p(f_{*}|mu_{*}, var_{*}); num_samples (int) – number of samples of p(f_{*}|mu_{*}, var_{*}) to take

Evaluates the link function link(f) then computes the log likelihood (log pdf) using it

Parameters: f (Nx1 array) – latent variables f; y (Nx1 array) – data; Y_metadata – not used
Returns: log likelihood evaluated for this point
Return type: float

Convenience function that can be overridden for likelihoods where this could be computed more efficiently

Parameters: obs – observed output; tau – cavity distribution 1st natural parameter (precision); v – cavity distribution 2nd natural parameter (mu*precision)

Evaluates the link function link(f) then computes the likelihood (pdf) using it

Parameters: f (Nx1 array) – latent variables f; y (Nx1 array) – data; Y_metadata – not used
Returns: likelihood evaluated for this point
Return type: float

Quadrature calculation of the predictive mean: E(Y_star|Y) = E( E(Y_star|f_star, Y) )

Parameters: mu – mean of posterior; sigma – standard deviation of posterior

Compute mean and variance of the predictive distribution.

Parameters: mu – mean of the latent variable, f, of posterior; var – variance of the latent variable, f, of posterior; full_cov (Boolean) – whether to use the full covariance or just the diagonal

Approximation to the predictive variance: V(Y_star)

The following variance decomposition (the law of total variance) is used: V(Y_star) = E( V(Y_star|f_star) ) + V( E(Y_star|f_star) )

Parameters: mu – mean of posterior; sigma – standard deviation of posterior; predictive_mean – output’s predictive mean; if None, the _predictive_mean function will be called.
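The predictive mean and the two terms of the variance decomposition are one-dimensional Gaussian expectations. The sketch below evaluates them with a Riemann sum for a Poisson likelihood with an exponential link, where E(Y|f) = V(Y|f) = exp(f); the helper names, the grid and the choice of likelihood are assumptions made for illustration.

```python
import math

def gaussian_expectation(fn, mu, sigma, n=4001, width=10.0):
    """E[fn(f)] for f ~ N(mu, sigma^2), by a Riemann sum on a uniform grid."""
    lo = mu - width * sigma
    h = 2.0 * width * sigma / (n - 1)
    total = 0.0
    for k in range(n):
        f = lo + k * h
        weight = math.exp(-0.5 * ((f - mu) / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))
        total += fn(f) * weight * h
    return total

def predictive_mean_and_variance(cond_mean, cond_var, mu, sigma):
    """Return E(Y*) and V(Y*) = E(V(Y*|f*)) + V(E(Y*|f*))."""
    mean = gaussian_expectation(cond_mean, mu, sigma)
    expected_cond_var = gaussian_expectation(cond_var, mu, sigma)
    second_moment = gaussian_expectation(lambda f: cond_mean(f) ** 2, mu, sigma)
    var_of_cond_mean = second_moment - mean**2
    return mean, expected_cond_var + var_of_cond_mean

# Poisson with exponential link: conditional mean and variance are both exp(f).
mean, var = predictive_mean_and_variance(math.exp, math.exp, mu=0.2, sigma=0.5)
print(mean, var)
```

For this particular choice both quantities also follow from the log-normal moments, which gives a convenient cross-check.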
request_num_latent_functions(Y)

The likelihood should infer how many latent functions it needs.

The default is the number of outputs.

Returns a set of samples of observations based on a given value of the latent variable.

Parameters: gp – latent variable; samples – number of samples to take for each f location
to_dict()

E_p(f) [ log p(y|f) ], d/dm E_p(f) [ log p(y|f) ], d/dv E_p(f) [ log p(y|f) ]

where p(f) is a Gaussian with mean m and variance v. The shapes of Y, m and v should match.

If no gh_points are passed, we construct them using default options.
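The expectation E_p(f) [ log p(y|f) ] can be approximated with Gauss-Hermite points. The sketch below hard-codes the 3-point rule and applies it to a Gaussian likelihood, for which the rule is exact because the integrand is quadratic in f; the noise variance 0.5 and the function names are illustrative assumptions, not GPy defaults.

```python
import math

# Nodes and weights of the 3-point Gauss-Hermite rule for the weight exp(-x^2).
GH_X = (-math.sqrt(1.5), 0.0, math.sqrt(1.5))
GH_W = (math.sqrt(math.pi) / 6.0, 2.0 * math.sqrt(math.pi) / 3.0, math.sqrt(math.pi) / 6.0)

def variational_expectation(logpdf, y, m, v):
    """Approximate E[log p(y | f)] for f ~ N(m, v).

    Uses the change of variables f = m + sqrt(2 v) x, which turns the
    Gaussian expectation into (1 / sqrt(pi)) * sum_k w_k g(x_k).
    """
    scale = math.sqrt(2.0 * v)
    total = sum(w * logpdf(y, m + scale * x) for x, w in zip(GH_X, GH_W))
    return total / math.sqrt(math.pi)

def gaussian_logpdf(y, f, noise_var=0.5):
    """log N(y | f, noise_var)."""
    return -0.5 * math.log(2.0 * math.pi * noise_var) - (y - f) ** 2 / (2.0 * noise_var)

def gaussian_closed_form(y, m, v, noise_var=0.5):
    """Exact E[log N(y | f, noise_var)] for f ~ N(m, v)."""
    return -0.5 * math.log(2.0 * math.pi * noise_var) - ((y - m) ** 2 + v) / (2.0 * noise_var)

print(variational_expectation(gaussian_logpdf, 1.0, 0.3, 0.4),
      gaussian_closed_form(1.0, 0.3, 0.4))
```

The derivatives w.r.t. m and v follow by differentiating the same sum; for non-quadratic log likelihoods more quadrature points are needed.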

## GPy.likelihoods.loggaussian module

$p(y_{i}|f_{i}, z_{i}) = \prod_{i=1}^{n} (\frac{ry^{r-1}}{\exp{f(x_{i})}})^{1-z_i} (1 + (\frac{y}{\exp(f(x_{i}))})^{r})^{z_i-2}$


Parameters: link_f (Nx1 array) – latent variables link(f); y (Nx1 array) – data; Y_metadata – includes censoring information in dictionary key ‘censored’
Returns: diagonal of the Hessian (second derivative of the log likelihood evaluated at points f)
Return type: Nx1 array

Note

Returns only the diagonal of the Hessian, since it is 0 everywhere else: the likelihood factorizes over cases (the distribution for y_i depends only on link(f_i), not on link(f_(j!=i))).

Parameters: link_f (Nx1 array) – latent variables link(f); y (Nx1 array) – data; Y_metadata – not used in Gaussian
Returns: derivative of the log likelihood evaluated at points link(f) w.r.t. the variance parameter
Return type: Nx1 array

Parameters: link_f (Nx1 array) – latent variables link(f); y (Nx1 array) – data; Y_metadata – not used in Gaussian
Returns: derivative of the log likelihood evaluated at points link(f) w.r.t. the variance parameter
Return type: Nx1 array

Gradient of the log-likelihood function at y given f, w.r.t shape parameter


Parameters: inv_link_f (Nx1 array) – latent variables link(f); y (Nx1 array) – data; Y_metadata – includes censoring information in dictionary key ‘censored’
Returns: derivative of the likelihood evaluated at points f w.r.t. the variance parameter
Return type: float

Derivative of logpdf w.r.t. link_f

Parameters: y (Nx1 array) – data; Y_metadata – includes censoring information in dictionary key ‘censored’
Returns: likelihood evaluated for this point
Return type: float
Parameters: link_f (Nx1 array) – latent variables link(f); y (Nx1 array) – data; Y_metadata – not used in Gaussian
Returns: derivative of the log likelihood evaluated at points link(f) w.r.t. the variance parameter
Return type: Nx1 array

Parameters: link_f (Nx1 array) – latent variables link(f); y (Nx1 array) – data; Y_metadata – not used in Gaussian
Returns: derivative of the log likelihood evaluated at points link(f) w.r.t. the variance parameter
Return type: Nx1 array

Parameters: link_f (Nx1 array) – latent variables link(f); y (Nx1 array) – data; Y_metadata – not used in Gaussian
Returns: derivative of the log likelihood evaluated at points link(f) w.r.t. the variance parameter
Return type: Nx1 array

Gradient of the log-likelihood function at y given f, w.r.t variance parameter


Parameters: inv_link_f (Nx1 array) – latent variables link(f); y (Nx1 array) – data; Y_metadata – includes censoring information in dictionary key ‘censored’
Returns: derivative of the likelihood evaluated at points f w.r.t. the variance parameter
Return type: float

Parameters: link_f (Nx1 array) – latent variables link(f); y (Nx1 array) – data; Y_metadata – includes censoring information in dictionary key ‘censored’
Returns: likelihood evaluated for this point
Return type: float

Parameters: link_f (Nx1 array) – latent variables link(f); y (Nx1 array) – data; Y_metadata – includes censoring information in dictionary key ‘censored’
Returns: likelihood evaluated for this point
Return type: float

Returns a set of samples of observations based on a given value of the latent variable.

Parameters: gp – latent variable

Pull out the gradients. Be careful: the order must match the order in which the parameters are added.

## GPy.likelihoods.loglogistic module

$p(y_{i}|f_{i}, z_{i}) = \prod_{i=1}^{n} (\frac{ry^{r-1}}{\exp{f(x_{i})}})^{1-z_i} (1 + (\frac{y}{\exp(f(x_{i}))})^{r})^{z_i-2}$


Parameters: link_f (Nx1 array) – latent variables link(f); y (Nx1 array) – data; Y_metadata – includes censoring information in dictionary key ‘censored’
Returns: diagonal of the Hessian (second derivative of the log likelihood evaluated at points f)
Return type: Nx1 array

Note

Returns only the diagonal of the Hessian, since it is 0 everywhere else: the likelihood factorizes over cases (the distribution for y_i depends only on link(f_i), not on link(f_(j!=i))).


Parameters: inv_link_f (Nx1 array) – latent variables link(f); y (Nx1 array) – data; Y_metadata – includes censoring information in dictionary key ‘censored’
Returns: derivative of the Hessian evaluated at points f and f_j w.r.t. the variance parameter
Return type: Nx1 array


Parameters: link_f (Nx1 array) – latent variables link(f); y (Nx1 array) – data; Y_metadata – includes censoring information in dictionary key ‘censored’
Returns: third derivative of the log likelihood evaluated at points f
Return type: Nx1 array


Parameters: link_f (Nx1 array) – latent variables link(f); y (Nx1 array) – data; Y_metadata – includes censoring information in dictionary key ‘censored’
Returns: gradient of the log likelihood evaluated at points f
Return type: Nx1 array

Derivative of the dlogpdf_dlink w.r.t shape parameter


Parameters: inv_link_f (Nx1 array) – latent variables inv_link_f; y (Nx1 array) – data; Y_metadata – includes censoring information in dictionary key ‘censored’
Returns: derivative of the likelihood evaluated at points f w.r.t. the variance parameter
Return type: Nx1 array

Gradient of the log-likelihood function at y given f, w.r.t shape parameter


Parameters: inv_link_f (Nx1 array) – latent variables link(f); y (Nx1 array) – data; Y_metadata – includes censoring information in dictionary key ‘censored’
Returns: derivative of the likelihood evaluated at points f w.r.t. the variance parameter
Return type: float


Parameters: link_f (Nx1 array) – latent variables link(f); y (Nx1 array) – data; Y_metadata – includes censoring information in dictionary key ‘censored’
Returns: likelihood evaluated for this point
Return type: float


Parameters: link_f (Nx1 array) – latent variables link(f); y (Nx1 array) – data; Y_metadata – includes censoring information in dictionary key ‘censored’
Returns: likelihood evaluated for this point
Return type: float

Returns a set of samples of observations based on a given value of the latent variable.

Parameters: gp – latent variable

Pull out the gradients. Be careful: the order must match the order in which the parameters are added.

## GPy.likelihoods.mixed_noise module

class MixedNoise(likelihoods_list, name='mixed_noise')

Returns a set of samples of observations based on a given value of the latent variable.

Parameters: gp – latent variable

## GPy.likelihoods.poisson module

Poisson likelihood

$p(y_{i}|\lambda(f_{i})) = \frac{\lambda(f_{i})^{y_{i}}}{y_{i}!}e^{-\lambda(f_{i})}$

Note

Y is expected to take values in {0,1,2,…}

conditional_mean(gp)

The mean of the random variable conditioned on one value of the GP

conditional_variance(gp)

The variance of the random variable conditioned on one value of the GP

$\frac{d^{2} \ln p(y_{i}|\lambda(f_{i}))}{d^{2}\lambda(f)} = \frac{-y_{i}}{\lambda(f_{i})^{2}}$
Parameters: link_f (Nx1 array) – latent variables link(f); y (Nx1 array) – data; Y_metadata – not used in the Poisson distribution
Returns: diagonal of the Hessian (second derivative of the log likelihood evaluated at link(f))
Return type: Nx1 array

Note

Returns only the diagonal of the Hessian, since it is 0 everywhere else: the likelihood factorizes over cases (the distribution for y_i depends only on link(f_i), not on link(f_(j!=i))).

$\frac{d^{3} \ln p(y_{i}|\lambda(f_{i}))}{d^{3}\lambda(f)} = \frac{2y_{i}}{\lambda(f_{i})^{3}}$
Parameters: link_f (Nx1 array) – latent variables link(f); y (Nx1 array) – data; Y_metadata – not used in the Poisson distribution
Returns: third derivative of the log likelihood evaluated at link(f)
Return type: Nx1 array

$\frac{d \ln p(y_{i}|\lambda(f_{i}))}{d\lambda(f)} = \frac{y_{i}}{\lambda(f_{i})} - 1$
Parameters: link_f (Nx1 array) – latent variables link(f); y (Nx1 array) – data; Y_metadata – not used in the Poisson distribution
Returns: gradient of the log likelihood evaluated at link(f)
Return type: Nx1 array

$\ln p(y_{i}|\lambda(f_{i})) = -\lambda(f_{i}) + y_{i}\log \lambda(f_{i}) - \log y_{i}!$
Parameters: link_f (Nx1 array) – latent variables link(f); y (Nx1 array) – data; Y_metadata – not used in the Poisson distribution
Returns: log likelihood evaluated for this point
Return type: float

$p(y_{i}|\lambda(f_{i})) = \frac{\lambda(f_{i})^{y_{i}}}{y_{i}!}e^{-\lambda(f_{i})}$
Parameters: link_f (Nx1 array) – latent variables link(f); y (Nx1 array) – data; Y_metadata – not used in the Poisson distribution
Returns: likelihood evaluated for this point
Return type: float

Returns a set of samples of observations based on a given value of the latent variable.

Parameters: gp – latent variable

## GPy.likelihoods.student_t module

Student T likelihood

For nomenclature see Bayesian Data Analysis 2003 p576

$p(y_{i}|\lambda(f_{i})) = \frac{\Gamma\left(\frac{v+1}{2}\right)}{\Gamma\left(\frac{v}{2}\right)\sqrt{v\pi\sigma^{2}}}\left(1 + \frac{1}{v}\left(\frac{(y_{i} - \lambda(f_{i}))^{2}}{\sigma^{2}}\right)\right)^{-\frac{v+1}{2}}$
conditional_mean(gp)
conditional_variance(gp)

$\frac{d^{2} \ln p(y_{i}|\lambda(f_{i}))}{d^{2}\lambda(f)} = \frac{(v+1)((y_{i}-\lambda(f_{i}))^{2} - \sigma^{2}v)}{((y_{i}-\lambda(f_{i}))^{2} + \sigma^{2}v)^{2}}$
Parameters: inv_link_f (Nx1 array) – latent variables inv_link(f); y (Nx1 array) – data; Y_metadata – not used in the Student t distribution
Returns: diagonal of the Hessian (second derivative of the log likelihood evaluated at inv_link(f))
Return type: Nx1 array

Note

Returns only the diagonal of the Hessian, since it is 0 everywhere else: the likelihood factorizes over cases (the distribution for y_i depends only on link(f_i), not on link(f_(j!=i))).

$\frac{d}{d\sigma^{2}}(\frac{d^{2} \ln p(y_{i}|\lambda(f_{i}))}{d^{2}f}) = \frac{v(v+1)(\sigma^{2}v - 3(y_{i} - \lambda(f_{i}))^{2})}{(\sigma^{2}v + (y_{i} - \lambda(f_{i}))^{2})^{3}}$
Parameters: inv_link_f (Nx1 array) – latent variables link(f); y (Nx1 array) – data; Y_metadata – not used in the Student t distribution
Returns: derivative of the Hessian evaluated at points f and f_j w.r.t. the variance parameter
Return type: Nx1 array

$\frac{d^{3} \ln p(y_{i}|\lambda(f_{i}))}{d^{3}\lambda(f)} = \frac{2(v+1)((y_{i} - \lambda(f_{i}))^{3} - 3(y_{i} - \lambda(f_{i}))\sigma^{2}v)}{((y_{i} - \lambda(f_{i}))^{2} + \sigma^{2}v)^{3}}$
Parameters: inv_link_f (Nx1 array) – latent variables link(f); y (Nx1 array) – data; Y_metadata – not used in the Student t distribution
Returns: third derivative of the log likelihood evaluated at link(f)
Return type: Nx1 array

$\frac{d \ln p(y_{i}|\lambda(f_{i}))}{d\lambda(f)} = \frac{(v+1)(y_{i}-\lambda(f_{i}))}{(y_{i}-\lambda(f_{i}))^{2} + \sigma^{2}v}$
Parameters: inv_link_f (Nx1 array) – latent variables link(f); y (Nx1 array) – data; Y_metadata – not used in the Student t distribution
Returns: gradient of the log likelihood evaluated at link(f)
Return type: Nx1 array

Derivative of the dlogpdf_dlink w.r.t variance parameter (t_noise)

$\frac{d}{d\sigma^{2}}(\frac{d \ln p(y_{i}|\lambda(f_{i}))}{df}) = \frac{-v(v + 1)(y_{i}-\lambda(f_{i}))}{((y_{i}-\lambda(f_{i}))^{2} + \sigma^{2}v)^{2}}$
Parameters: inv_link_f (Nx1 array) – latent variables inv_link_f; y (Nx1 array) – data; Y_metadata – not used in the Student t distribution
Returns: derivative of the gradient of the log likelihood, evaluated at points f, w.r.t. the variance parameter
Return type: Nx1 array

Gradient of the log-likelihood function at y given f, w.r.t variance parameter (t_noise)

$\frac{d \ln p(y_{i}|\lambda(f_{i}))}{d\sigma^{2}} = \frac{v((y_{i} - \lambda(f_{i}))^{2} - \sigma^{2})}{2\sigma^{2}(\sigma^{2}v + (y_{i} - \lambda(f_{i}))^{2})}$
Parameters: inv_link_f (Nx1 array) – latent variables link(f); y (Nx1 array) – data; Y_metadata – not used in the Student t distribution
Returns: derivative of the log likelihood evaluated at points f w.r.t. the variance parameter
Return type: float

$\ln p(y_{i}|\lambda(f_{i})) = \ln \Gamma\left(\frac{v+1}{2}\right) - \ln \Gamma\left(\frac{v}{2}\right) - \ln \sqrt{v \pi\sigma^{2}} - \frac{v+1}{2}\ln \left(1 + \frac{1}{v}\left(\frac{(y_{i} - \lambda(f_{i}))^{2}}{\sigma^{2}}\right)\right)$
Parameters: inv_link_f (Nx1 array) – latent variables link(f); y (Nx1 array) – data; Y_metadata – not used in the Student t distribution
Returns: log likelihood evaluated for this point
Return type: float

$p(y_{i}|\lambda(f_{i})) = \frac{\Gamma\left(\frac{v+1}{2}\right)}{\Gamma\left(\frac{v}{2}\right)\sqrt{v\pi\sigma^{2}}}\left(1 + \frac{1}{v}\left(\frac{(y_{i} - \lambda(f_{i}))^{2}}{\sigma^{2}}\right)\right)^{-\frac{v+1}{2}}$
Parameters: inv_link_f (Nx1 array) – latent variables link(f); y (Nx1 array) – data; Y_metadata – not used in the Student t distribution
Returns: likelihood evaluated for this point
Return type: float
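The Student t derivatives are easy to get wrong by a sign or a power, so a finite-difference check is useful. The sketch below assumes v is the degrees of freedom and `sigma2` the scale parameter σ²; each higher derivative is written as the derivative of the expression one order below it, and all function names are hypothetical.

```python
import math

def student_t_logpdf(link_f, y, v, sigma2):
    """Log density of a Student t with v degrees of freedom and scale sigma2."""
    e = y - link_f
    return (math.lgamma((v + 1) / 2.0) - math.lgamma(v / 2.0)
            - 0.5 * math.log(v * math.pi * sigma2)
            - 0.5 * (v + 1) * math.log(1.0 + e**2 / (v * sigma2)))

def student_t_dlogpdf_dlink(link_f, y, v, sigma2):
    e = y - link_f
    return (v + 1) * e / (e**2 + sigma2 * v)

def student_t_d2logpdf_dlink2(link_f, y, v, sigma2):
    e = y - link_f
    return (v + 1) * (e**2 - sigma2 * v) / (e**2 + sigma2 * v) ** 2

def student_t_d3logpdf_dlink3(link_f, y, v, sigma2):
    e = y - link_f
    return 2.0 * (v + 1) * (e**3 - 3.0 * e * sigma2 * v) / (e**2 + sigma2 * v) ** 3

def student_t_dlogpdf_dvar(link_f, y, v, sigma2):
    e = y - link_f
    return v * (e**2 - sigma2) / (2.0 * sigma2 * (sigma2 * v + e**2))

def student_t_dlogpdf_dlink_dvar(link_f, y, v, sigma2):
    e = y - link_f
    return -v * (v + 1) * e / (sigma2 * v + e**2) ** 2

print(student_t_dlogpdf_dlink(0.2, 1.0, 4.0, 0.5),
      student_t_d2logpdf_dlink2(0.2, 1.0, 4.0, 0.5))
```

Unlike the Gaussian case, the second derivative changes sign for large residuals, so the log likelihood is not log-concave in link(f).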

Returns a set of samples of observations based on a given value of the latent variable.

Parameters: gp – latent variable

Pull out the gradients. Be careful: the order must match the order in which the parameters are added.

## GPy.likelihoods.weibull module

Implementing Weibull likelihood function …

$\begin{split}\frac{d^{2} \ln p(y_{i}|\lambda(f_{i}))}{d^{2}\lambda(f)} = -\beta^{2}\frac{d\Psi(\alpha_{i})}{d\alpha_{i}}\\ \alpha_{i} = \beta y_{i}\end{split}$
Parameters: link_f (Nx1 array) – latent variables link(f); y (Nx1 array) – data; Y_metadata – Y_metadata which is not used in gamma distribution
Returns: diagonal of the Hessian (second derivative of the likelihood evaluated at points f)
Return type: Nx1 array

Note

Returns only the diagonal of the Hessian, since it is 0 everywhere else: the likelihood factorizes over cases (the distribution for y_i depends only on link(f_i), not on link(f_(j!=i))).

Derivative of the Hessian of the log likelihood w.r.t. the shape parameter r.

Parameters: f; y; Y_metadata

$\begin{split}\frac{d^{3} \ln p(y_{i}|\lambda(f_{i}))}{d^{3}\lambda(f)} = -\beta^{3}\frac{d^{2}\Psi(\alpha_{i})}{d\alpha_{i}}\\ \alpha_{i} = \beta y_{i}\end{split}$
Parameters: link_f (Nx1 array) – latent variables link(f); y (Nx1 array) – data; Y_metadata – Y_metadata which is not used in gamma distribution
Returns: third derivative of the likelihood evaluated at points f
Return type: Nx1 array

$\begin{split}\frac{d \ln p(y_{i}|\lambda(f_{i}))}{d\lambda(f)} = \beta (\log \beta y_{i}) - \Psi(\alpha_{i})\beta\\ \alpha_{i} = \beta y_{i}\end{split}$
Parameters: link_f (Nx1 array) – latent variables (f); y (Nx1 array) – data; Y_metadata – Y_metadata which is not used in gamma distribution
Returns: gradient of the likelihood evaluated at points f
Return type: Nx1 array

First-order derivative of the log likelihood w.r.t. the shape parameter r

Parameters: link_f (Nx1 array) – latent variables link(f); y (Nx1 array) – data; Y_metadata – Y_metadata which is not used in gamma distribution
Returns: third derivative of the likelihood evaluated at points f
Return type: Nx1 array

Parameters: f; y; Y_metadata

Gradient of the log-likelihood function at y given f, w.r.t shape parameter


Parameters: inv_link_f (Nx1 array) – latent variables link(f); y (Nx1 array) – data; Y_metadata – includes censoring information in dictionary key ‘censored’
Returns: derivative of the likelihood evaluated at points f w.r.t. the variance parameter
Return type: float

Parameters: f; y; Y_metadata

$\begin{split}\ln p(y_{i}|\lambda(f_{i})) = \alpha_{i}\log \beta - \log \Gamma(\alpha_{i}) + (\alpha_{i} - 1)\log y_{i} - \beta y_{i}\\ \alpha_{i} = \beta y_{i}\end{split}$
Parameters: link_f (Nx1 array) – latent variables (link(f)); y (Nx1 array) – data; Y_metadata – Y_metadata which is not used in poisson distribution
Returns: likelihood evaluated for this point
Return type: float