# GPy.util package¶

## Introduction¶

A variety of utility functions including matrix operations and quick access to test datasets.

## GPy.util.block_matrices module¶

block_dot(A, B, diagonal=False)[source]

Element-wise dot product on block matrices: each block of the result is the dot product of the corresponding blocks of A and B.

$\begin{pmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{pmatrix} \circ \begin{pmatrix} B_{11} & B_{12} \\ B_{21} & B_{22} \end{pmatrix} = \begin{pmatrix} A_{11} \cdot B_{11} & A_{12} \cdot B_{12} \\ A_{21} \cdot B_{21} & A_{22} \cdot B_{22} \end{pmatrix}$

Note: If any block of A or B is stored as a 1d vector, it is assumed to denote a diagonal matrix, and a more efficient dot product using numpy broadcasting is used, i.e. A11*B11.
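The idea of multiplying matching blocks, with 1d blocks standing in for diagonal matrices, can be sketched in plain numpy. This is an independent illustration, not the code of block_dot: the helper name `blockwise_product` and the list-of-lists block layout are assumptions made for the example.

```python
import numpy as np

def blockwise_product(A_blocks, B_blocks):
    """Multiply matching blocks; a 1-d block is treated as the diagonal of a diagonal matrix."""
    def multiply(C, D):
        if C.ndim == 1 and D.ndim == 1:
            return C * D                  # diag * diag, kept as a vector
        if C.ndim == 1:
            return C[:, None] * D         # diag(C) times D via broadcasting
        if D.ndim == 1:
            return C * D[None, :]         # C times diag(D) via broadcasting
        return C.dot(D)                   # two dense blocks: ordinary matrix product
    return [[multiply(C, D) for C, D in zip(row_a, row_b)]
            for row_a, row_b in zip(A_blocks, B_blocks)]

dense = np.array([[1.0, 2.0], [3.0, 4.0]])
diag = np.array([2.0, 5.0])               # stands for np.diag([2, 5])
A_blocks = [[dense, diag], [diag, dense]]
B_blocks = [[dense, dense], [diag, diag]]
C_blocks = blockwise_product(A_blocks, B_blocks)
```

Broadcasting avoids building the full diagonal matrix, so a product with a diagonal block costs one multiplication per element of the dense block.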

get_block_shapes(B)[source]
get_block_shapes_3d(B)[source]
get_blocks(A, blocksizes)[source]
get_blocks_3d(A, blocksizes, pagesizes=None)[source]

Given a 3d matrix, make a block matrix, where the first and second dimensions are blocked according to blocksizes, and the pages are blocked using pagesizes

unblock(B)[source]

## GPy.util.choleskies module¶

Given the derivative of an objective function with respect to the Cholesky factor L, compute the derivative with respect to the original matrix K, defined as

K = LL^T

where L was obtained by Cholesky decomposition.

flat_to_triang(flat_mat)
indexes_to_fix_for_low_rank(rank, size)[source]

Work out which indexes of the flattened array should be fixed if we want the Cholesky factor to represent a low rank matrix

multiple_dpotri(Ls)[source]
safe_root(N)[source]
triang_to_cov(L)[source]
triang_to_flat(L)

## GPy.util.classification module¶

conf_matrix(p, labels, names=['1', '0'], threshold=0.5, show=True)[source]

Returns the error rate and the true/false positives in a binary classification problem.

• Actual classes are displayed by column.
• Predicted classes are displayed by row.

Parameters:
• p – array of class ‘1’ probabilities.
• labels – array of actual classes.
• names – list of class names, defaults to [‘1’, ‘0’].
• threshold – probability value used to decide the class.
• show (False|True) – whether the matrix should be shown or not.
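The counts behind such a matrix can be sketched in plain Python. The helper `confusion_counts` is hypothetical, and details such as using a strict `>` against the threshold and the order of the returned values are assumptions, not taken from conf_matrix itself.

```python
def confusion_counts(p, labels, threshold=0.5):
    """Return (error_rate, true_1, false_1, false_0, true_0) for labels in {0, 1}."""
    true_1 = false_1 = false_0 = true_0 = 0
    for prob, actual in zip(p, labels):
        predicted = 1 if prob > threshold else 0   # assumed decision rule
        if predicted == 1 and actual == 1:
            true_1 += 1
        elif predicted == 1 and actual == 0:
            false_1 += 1
        elif predicted == 0 and actual == 1:
            false_0 += 1
        else:
            true_0 += 1
    error_rate = (false_1 + false_0) / len(labels)
    return error_rate, true_1, false_1, false_0, true_0

probs = [0.9, 0.8, 0.3, 0.6, 0.1]
actual = [1, 1, 1, 0, 0]
error_rate, true_1, false_1, false_0, true_0 = confusion_counts(probs, actual)
```

Here two of the five points fall on the wrong side of the threshold, so the error rate is 0.4.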

## GPy.util.cluster_with_offset module¶

cluster(data, inputs, verbose=False)[source]

Clusters data

Using the new offset model, this method uses a greedy algorithm to cluster the data. It starts with all the data points in separate clusters and tests whether combining them increases the overall log-likelihood (LL). It then iteratively joins pairs of clusters which cause the greatest increase in the LL, until no join increases the LL.

Arguments:
• inputs – the ‘X’s in a list, one item per cluster
• data – the ‘Y’s in a list, one item per cluster

Returns a list of the clusters.
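The greedy joining loop described above can be sketched without any GP machinery. In this sketch the score function is a stand-in for the log-likelihood of the offset model, and `greedy_merge` and `toy_score` are hypothetical names; it shows the control flow only, not the model fitting that cluster performs.

```python
from itertools import combinations

def greedy_merge(n_items, score):
    """Greedy agglomerative clustering driven by a score (higher is better)."""
    clusters = [[i] for i in range(n_items)]      # start with every item on its own
    current = score(clusters)
    while len(clusters) > 1:
        best_gain, best_clusters = 0.0, None
        for a, b in combinations(range(len(clusters)), 2):
            candidate = [c for k, c in enumerate(clusters) if k not in (a, b)]
            candidate.append(clusters[a] + clusters[b])
            gain = score(candidate) - current
            if gain > best_gain:                  # remember the best improving join
                best_gain, best_clusters = gain, candidate
        if best_clusters is None:                 # no join increases the score: stop
            break
        clusters = best_clusters
        current = score(clusters)
    return clusters

values = [0.0, 0.1, 0.2, 5.0, 5.1]

def toy_score(clusters):
    # Stand-in for the LL: penalise within-cluster spread and the number of clusters.
    total = 0.0
    for c in clusters:
        mean = sum(values[i] for i in c) / len(c)
        total -= sum((values[i] - mean) ** 2 for i in c)
    return total - 1.0 * len(clusters)

result = sorted(sorted(c) for c in greedy_merge(len(values), toy_score))
```

With this toy score the three small values end up in one cluster and the two large values in another, after which no further join helps.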

get_log_likelihood(inputs, data, clust)[source]

Get the LL of a combined set of clusters, ignoring time series offsets.

Get the log likelihood of a cluster without accounting for the fact that different time series are offset. It is used for the cases in which there is only one cluster to compute the log likelihood of.

Arguments:
• inputs – the ‘X’s in a list, one item per cluster
• data – the ‘Y’s in a list, one item per cluster
• clust – list of clusters to use

Returns a tuple: the log likelihood and the offset (which is always zero for this model).

get_log_likelihood_offset(inputs, data, clust)[source]

Get the log likelihood of a combined set of clusters, fitting the offsets.

Arguments:
• inputs – the ‘X’s in a list, one item per cluster
• data – the ‘Y’s in a list, one item per cluster
• clust – list of clusters to use

Returns a tuple: the log likelihood and the offset.

## GPy.util.datasets module¶

Check with the user that they are happy with the terms and conditions for the data set.

boston_housing(data_set='boston_housing')[source]
boxjenkins_airline(data_set='boxjenkins_airline', num_train=96)[source]
brendan_faces(data_set='brendan_faces')[source]
cifar10_patches(data_set='cifar-10')[source]

The Canadian Institute for Advanced Research 10 image data set. Code for loading in this data is taken from Boris Babenko’s blog post, original code available here: http://bbabenko.tumblr.com/post/86756017649/learning-low-level-vision-feautres-in-10-lines-of-code

cmu_mocap(subject, train_motions, test_motions=[], sample_every=4, data_set='cmu_mocap')[source]

Load a given subject’s training and test motions from the CMU motion capture data.

cmu_mocap_35_walk_jog(data_set='cmu_mocap')[source]

Load CMU subject 35’s walking and jogging motions, the same data that was used by Taylor, Roweis and Hinton at NIPS 2007, but without their preprocessing. Also used by Lawrence at AISTATS 2007.

cmu_mocap_49_balance(data_set='cmu_mocap')[source]

Load CMU subject 49’s one-legged balancing motion that was used by Alvarez, Luengo and Lawrence at AISTATS 2009.

cmu_urls_files(subj_motions, messages=True)[source]

Find which resources are missing on the local disk for the requested CMU motion capture motions.

creep_data(data_set='creep_rupture')[source]

Brun and Yoshida’s metal creep rupture data.

crescent_data(num_data=200, seed=10000)[source]

Data set formed from a mixture of four Gaussians. In each class two of the Gaussians are elongated at right angles to each other and offset to form an approximation to the crescent data that is popular in semi-supervised learning as a toy problem.

Parameters:
• num_data (int) – number of data points to be sampled (default is 200).
• seed (int) – random seed to be used for data generation.
data_available(dataset_name=None)[source]

Check if the data set is available on the local machine already.

data_details_return(data, data_set)[source]

Update the data component of the data dictionary with details drawn from the data_resources.

decampos_digits(data_set='decampos_characters', which_digits=[0, 1, 2, 3, 4, 5, 6, 7, 8, 9])[source]
della_gatta_TRP63_gene_expression(data_set='della_gatta', gene_number=None)[source]

Check with the user that they are happy with the terms and conditions for the data set, then download it.

drosophila_knirps(data_set='drosophila_protein')[source]
drosophila_protein(data_set='drosophila_protein')[source]
football_data(season='1314', data_set='football_data')[source]

Football data from English games since 1993. This downloads data from football-data.co.uk for the given season.

fruitfly_tomancak(data_set='fruitfly_tomancak', gene_number=None)[source]
global_average_temperature(data_set='global_temperature', num_train=1000, refresh_data=False)[source]

Warning: if you use this function multiple times in a row, you get blocked due to terms of service violations. The function caches the result of your query; if you wish to refresh an old query, set refresh_data to True.

The function is inspired by this notebook: http://nbviewer.ipython.org/github/sahuguet/notebooks/blob/master/GoogleTrends%20meet%20Notebook.ipynb

hapmap3(data_set='hapmap3')[source]

The HapMap phase three SNP dataset - 1184 samples out of 11 populations.

SNP_matrix (A) encoding [see Paschou et al. 2007 (PCA-Correlated SNPs…)]: Let (B1,B2) be the alphabetically sorted bases, which occur in the j-th SNP, then

$A_{ij} = \begin{cases} 1, & \text{iff } SNP_{ij} = (B_1, B_1) \\ 0, & \text{iff } SNP_{ij} = (B_1, B_2) \\ -1, & \text{iff } SNP_{ij} = (B_2, B_2) \end{cases}$
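The encoding rule can be written as a small function. This is a sketch only: `encode_snp` is a hypothetical helper, and treating a genotype as an unordered pair (so that (B2,B1) counts as (B1,B2)) is an assumption; missing calls are not handled.

```python
def encode_snp(genotype, bases):
    """Encode one genotype (a pair of bases) as 1, 0 or -1 given the two bases of that SNP."""
    b1, b2 = sorted(bases)            # (B1, B2): alphabetically sorted bases
    pair = tuple(sorted(genotype))    # assumption: order within the pair is irrelevant
    if pair == (b1, b1):
        return 1
    if pair == (b1, b2):
        return 0
    if pair == (b2, b2):
        return -1
    raise ValueError("genotype %r does not match bases %r" % (genotype, bases))

column = [("A", "A"), ("A", "G"), ("G", "A"), ("G", "G")]
encoded = [encode_snp(g, ("G", "A")) for g in column]
```

For the bases A and G this maps AA to 1, both heterozygous orderings to 0, and GG to -1.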

The SNP data and the meta information (such as iid, sex and phenotype) are stored in the dataframe datadf, index is the Individual ID, with following columns for metainfo:

• family_id -> Family ID
• paternal_id -> Paternal ID
• maternal_id -> Maternal ID
• sex -> Sex (1=male; 2=female; other=unknown)
• phenotype -> Phenotype (-9, or 0 for unknown)
• population -> Population string (e.g. ‘ASW’ - ‘YRI’)
• rest are SNP rs (ids)

• Chromosome:
• autosomal chromosomes -> 1-22
• X X chromosome -> 23
• Y Y chromosome -> 24
• XY Pseudo-autosomal region of X -> 25
• MT Mitochondrial -> 26
• Relative Position (to Chromosome) [base pairs]
isomap_faces(num_samples=698, data_set='isomap_face_data')[source]
lee_yeast_ChIP(data_set='lee_yeast_ChIP')[source]
mauna_loa(data_set='mauna_loa', num_train=545, refresh_data=False)[source]
oil(data_set='three_phase_oil_flow')[source]

The three phase oil data from Bishop and James (1993).

oil_100(seed=10000, data_set='three_phase_oil_flow')[source]
olivetti_faces(data_set='olivetti_faces')[source]
olivetti_glasses(data_set='olivetti_glasses', num_training=200, seed=10000)[source]
olympic_100m_men(data_set='rogers_girolami_data')[source]
olympic_100m_women(data_set='rogers_girolami_data')[source]
olympic_200m_men(data_set='rogers_girolami_data')[source]
olympic_200m_women(data_set='rogers_girolami_data')[source]
olympic_400m_men(data_set='rogers_girolami_data')[source]
olympic_400m_women(data_set='rogers_girolami_data')[source]
olympic_marathon_men(data_set='olympic_marathon_men')[source]
olympic_sprints(data_set='rogers_girolami_data')[source]

All olympics sprint winning times for multiple output prediction.

osu_run1(data_set='osu_run1', sample_every=4)[source]
prompt_user(prompt)[source]

reporthook(a, b, c)[source]
ripley_synth(data_set='ripley_prnn_data')[source]
robot_wireless(data_set='robot_wireless')[source]
sample_class(f)[source]
silhouette(data_set='ankur_pose_data')[source]
simulation_BGPLVM()[source]
singlecell(data_set='singlecell')[source]
singlecell_rna_seq_deng(dataset='singlecell_deng')[source]
singlecell_rna_seq_islam(dataset='singlecell_islam')[source]
sod1_mouse(data_set='sod1_mouse')[source]
spellman_yeast(data_set='spellman_yeast')[source]
spellman_yeast_cdc15(data_set='spellman_yeast')[source]
swiss_roll(num_samples=3000, data_set='swiss_roll')[source]
swiss_roll_1000()[source]
swiss_roll_generated(num_samples=1000, sigma=0.0)[source]
toy_linear_1d_classification(seed=10000)[source]
toy_rbf_1d(seed=10000, num_samples=500)[source]

Samples values of a function from an RBF covariance with very small noise for inputs uniformly distributed between -1 and 1.

Parameters:
• seed (int) – seed to use for random sampling.
• num_samples (int) – number of samples to sample in the function (default 500).
toy_rbf_1d_50(seed=10000)[source]
xw_pen(data_set='xw_pen')[source]

## GPy.util.debug module¶

The module for some general debug tools

checkFinite(arr, name=None)[source]
checkFullRank(m, tol=1e-10, name=None, force_check=False)[source]

## GPy.util.decorators module¶

silence_errors(f)[source]

This wraps a function and it silences numpy errors that happen during the execution. After the function has exited, it restores the previous state of the warnings.
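A decorator with this behaviour can be sketched with `np.seterr`, which returns the previous settings so they can be restored afterwards. The name `silence_numpy_errors` is hypothetical and the code is an illustration of the pattern, not the source of silence_errors.

```python
import functools
import numpy as np

def silence_numpy_errors(f):
    """Ignore numpy floating-point errors while f runs, then restore the old settings."""
    @functools.wraps(f)
    def wrapper(*args, **kwargs):
        old_settings = np.seterr(all="ignore")   # returns the previous settings
        try:
            return f(*args, **kwargs)
        finally:
            np.seterr(**old_settings)            # restore, even if f raised
    return wrapper

@silence_numpy_errors
def log_of_zero():
    return np.log(np.zeros(2))                   # would normally emit a divide warning

settings_before = np.geterr()
result = log_of_zero()
settings_after = np.geterr()
```

Restoring in a `finally` clause keeps the error state consistent for the caller even when the wrapped function fails.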

## GPy.util.diag module¶

Add b to the view of A in place (!). Returns modified A. Broadcasting is allowed, thus b can be scalar.

if offset is not zero, make sure b is of right shape!

Parameters:
• A (ndarray) – 2 dimensional array
• b (ndarray-like) – either one dimensional or scalar
• offset (int) – same as in view.

Returns: view of A, which is adjusted in place
divide(A, b, offset=0)[source]

Divide the view of A by b in place (!). Returns modified A. Broadcasting is allowed, thus b can be scalar.

if offset is not zero, make sure b is of right shape!

Parameters:
• A (ndarray) – 2 dimensional array
• b (ndarray-like) – either one dimensional or scalar
• offset (int) – same as in view.

Returns: view of A, which is adjusted in place
multiply(A, b, offset=0)

Multiply the view of A by b in place (!). Returns modified A. Broadcasting is allowed, thus b can be scalar.

if offset is not zero, make sure b is of right shape!

Parameters:
• A (ndarray) – 2 dimensional array
• b (ndarray-like) – either one dimensional or scalar
• offset (int) – same as in view.

Returns: view of A, which is adjusted in place
offdiag_view(A, offset=0)[source]
subtract(A, b, offset=0)[source]

Subtract b from the view of A in place (!). Returns modified A. Broadcasting is allowed, thus b can be scalar.

if offset is not zero, make sure b is of right shape!

Parameters:
• A (ndarray) – 2 dimensional array
• b (ndarray-like) – either one dimensional or scalar
• offset (int) – same as in view.

Returns: view of A, which is adjusted in place
times(A, b, offset=0)[source]

Multiply the view of A by b in place (!). Returns modified A. Broadcasting is allowed, thus b can be scalar.

if offset is not zero, make sure b is of right shape!

Parameters:
• A (ndarray) – 2 dimensional array
• b (ndarray-like) – either one dimensional or scalar
• offset (int) – same as in view.

Returns: view of A, which is adjusted in place
view(A, offset=0)[source]

Get a view on the diagonal elements of a 2D array.

This is actually a view (!) on the diagonal of the array, so you can in-place adjust the view.

Parameters:
• A (ndarray) – 2 dimensional numpy array
• offset (int) – view offset to give back (negative entries allowed)

Return type: ndarray view of diag(A)

>>> import numpy as np
>>> X = np.arange(9).reshape(3,3)
>>> view(X)
array([0, 4, 8])
>>> d = view(X)
>>> d += 2
>>> view(X)
array([ 2,  6, 10])
>>> view(X, offset=-1)
array([3, 7])
>>> subtract(X, 3, offset=-1)
array([[ 2,  1,  2],
       [ 0,  6,  5],
       [ 6,  4, 10]])

## GPy.util.functions module¶

clip_exp(x)[source]
differfln(x0, x1)[source]
logistic(x)[source]
logisticln(x)[source]
normcdf(x)[source]
normcdfln(x)[source]

## GPy.util.gpu_init module¶

The package for scikits.cuda initialization

Global variables: initSuccess providing CUBLAS handle: cublas_handle

closeGPU()[source]

## GPy.util.initialization module¶

Created on 24 Feb 2014

@author: maxz

initialize_latent(init, input_dim, Y)[source]

## GPy.util.input_warping_functions module¶

class IdentifyWarping[source]

The identity warping function, for testing

f(X, test_data=False)[source]
class InputWarpingFunction(name)[source]

Abstract class for input warping functions

f(X, test=False)[source]
class InputWarpingTest[source]

The identity warping function, for testing

f(X, test_data=False)[source]
class KumarWarping(X, warping_indices=None, epsilon=None, Xmin=None, Xmax=None)[source]

Kumar Warping for input data

Parameters:

X : array_like, shape = (n_samples, n_features)
The input data that is going to be warped.
warping_indices : list of int, optional
The features that are going to be warped. Defaults to warping all the features.
epsilon : float, optional
Used to normalize input data to [0+e, 1-e]. Defaults to 1e-6.
Xmin : list of float, optional
The min values for each feature defined by users. Defaults to the train minimum.
Xmax : list of float, optional
The max values for each feature defined by users. Defaults to the train maximum.

Attributes:

warping_indices : list of int
The features that are going to be warped. Defaults to warping all the features.
warping_dim : int
The number of features to be warped.
Xmin : list of float
The min values for each feature defined by users. Defaults to the train minimum.
Xmax : list of float
The max values for each feature defined by users. Defaults to the train maximum.
epsilon : float
Used to normalize input data to [0+e, 1-e]. Defaults to 1e-6.
X_normalized : array_like, shape = (n_samples, n_features)
The normalized training X.
scaling : list of float, length = n_features in X
Defined as 1.0 / (self.Xmax - self.Xmin).
params : list of Param
The list of all the parameters used in Kumar Warping.
num_parameters : int
The number of parameters used in Kumar Warping.
f(X, test_data=False)[source]

Apply warping_function to some Input data

X : array_like, shape = (n_samples, n_features)

test_data : bool, optional
Defaults to False; should be set to True when transforming test data.
X_warped : array_like, shape = (n_samples, n_features)
The warped input data

f(x) = 1 - (1 - x^a)^b

Compute the gradient of warping function with respect to X

X : array_like, shape = (n_samples, n_features)
grad : array_like, shape = (n_samples, n_features)
The gradient for every location at X

grad = a * b * x ^(a-1) * (1 - x^a)^(b-1)

Update the gradients of marginal log likelihood with respect to the parameters of warping function

X : array_like, shape = (n_samples, n_features)
The input BEFORE warping
dL_dW : array_like, shape = (n_samples, n_features)
The gradient of marginal log likelihood with respect to the Warped input

Let w = f(x), the input after warping, then

dW_da = b * (1 - x^a)^(b - 1) * x^a * ln(x)
dW_db = - (1 - x^a)^b * ln(1 - x^a)
dL_da = dL_dW * dW_da
dL_db = dL_dW * dW_db
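The warping formula and its derivatives can be checked numerically on a single scalar input. The sketch below uses plain Python floats and hypothetical helper names (`kumar_warp`, `kumar_grad_x`, `kumar_grad_params`); it assumes x has already been normalized into (0, 1) and does not reproduce the class itself.

```python
import math

def kumar_warp(x, a, b):
    """Warping f(x) = 1 - (1 - x^a)^b for x in (0, 1)."""
    return 1.0 - (1.0 - x ** a) ** b

def kumar_grad_x(x, a, b):
    """df/dx = a * b * x^(a-1) * (1 - x^a)^(b-1)."""
    return a * b * x ** (a - 1) * (1.0 - x ** a) ** (b - 1)

def kumar_grad_params(x, a, b):
    """Return (dW_da, dW_db), the derivatives of the warped value w.r.t. a and b."""
    dW_da = b * (1.0 - x ** a) ** (b - 1) * x ** a * math.log(x)
    dW_db = -(1.0 - x ** a) ** b * math.log(1.0 - x ** a)
    return dW_da, dW_db

# Central finite differences as an independent check of the three formulas.
x, a, b, h = 0.3, 2.0, 1.5, 1e-6
fd_x = (kumar_warp(x + h, a, b) - kumar_warp(x - h, a, b)) / (2 * h)
fd_a = (kumar_warp(x, a + h, b) - kumar_warp(x, a - h, b)) / (2 * h)
fd_b = (kumar_warp(x, a, b + h) - kumar_warp(x, a, b - h)) / (2 * h)
dW_da, dW_db = kumar_grad_params(x, a, b)
```

The parameter gradients of the likelihood then follow from the last two relations by multiplying dL_dW with dW_da and dW_db.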

## GPy.util.linalg module¶

DSYR(*args, **kwargs)[source]
DSYR_blas(A, x, alpha=1.0)[source]

Performs a symmetric rank-1 update operation: A <- A + alpha * np.dot(x,x.T)

Parameters:
• A – symmetric NxN np.array
• x – Nx1 np.array
• alpha – scalar
DSYR_numpy(A, x, alpha=1.0)[source]

Performs a symmetric rank-1 update operation: A <- A + alpha * np.dot(x,x.T)

Parameters:
• A – symmetric NxN np.array
• x – Nx1 np.array
• alpha – scalar
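The rank-1 update A <- A + alpha * x x^T described for both DSYR variants can be pictured with a few lines of numpy. The helper `rank1_update` is hypothetical and accepts x with shape (N,) or (N, 1) as a convenience; it is a sketch of the operation, not the library routine.

```python
import numpy as np

def rank1_update(A, x, alpha=1.0):
    """In-place symmetric rank-1 update: A <- A + alpha * x x^T."""
    x = np.asarray(x, dtype=float).ravel()   # accept shape (N,) or (N, 1)
    A += alpha * np.outer(x, x)              # modifies the caller's array
    return A

A = np.eye(3)
x = np.array([[1.0], [2.0], [3.0]])
rank1_update(A, x, alpha=0.5)
```

Because x x^T is symmetric, A stays symmetric after the update.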
backsub_both_sides(L, X, transpose='left')[source]

Return L^-T * X * L^-1, assuming X is symmetrical and L is a lower Cholesky factor

dpotri(A, lower=1)[source]

Wrapper for lapack dpotri function

DPOTRI - compute the inverse of a real symmetric positive definite matrix A using the Cholesky factorization A = U**T*U or A = L*L**T computed by DPOTRF

Parameters:
• A – Matrix A
• lower – is matrix lower (true) or upper (false)

Returns: A inverse
dpotrs(A, B, lower=1)[source]

Wrapper for lapack dpotrs function

Parameters:
• A – Matrix A
• B – Matrix B
• lower – is matrix lower (true) or upper (false)

dtrtri(L)[source]

Inverts a Cholesky lower triangular matrix

Parameters:
• L – lower triangular matrix

Returns: inverse of L
dtrtrs(A, B, lower=1, trans=0, unitdiag=0)[source]

Wrapper for lapack dtrtrs function

DTRTRS solves a triangular system of the form

A * X = B or A**T * X = B,

where A is a triangular matrix of order N, and B is an N-by-NRHS matrix. A check is made to verify that A is nonsingular.

Parameters:
• A – Matrix A (triangular)
• B – Matrix B
• lower – is matrix lower (true) or upper (false)

Returns: Solution to A * X = B or A**T * X = B
force_F_ordered(A)[source]

Return an F-ordered version of A, assuming A is triangular

force_F_ordered_symmetric(A)[source]

Return an F-ordered version of A, assuming A is symmetric

ij_jlk_to_ilk(A, B)[source]

Faster version of np.einsum(‘ij,jlk->ilk’, A, B)

ijk_jlk_to_il(A, B)[source]

Faster version of np.einsum(‘ijk,jlk->il’, A, B)

ijk_ljk_to_ilk(A, B)[source]

Faster version of einsum np.einsum(‘ijk,ljk->ilk’, A, B)

I.e. A[:, :, k].dot(B[:, :, k].T) for every index k of the last dimension
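The meaning of the ‘ijk,ljk->ilk’ contraction can be checked against an explicit loop over the last axis. This sketch only uses np.einsum and ordinary matrix products; the array shapes are arbitrary choices for the example.

```python
import numpy as np

rng = np.random.RandomState(0)
A = rng.randn(4, 3, 5)
B = rng.randn(6, 3, 5)

# 'ijk,ljk->ilk': for every slice k, multiply A[:, :, k] by B[:, :, k].T
reference = np.einsum('ijk,ljk->ilk', A, B)
by_slices = np.stack(
    [A[:, :, k].dot(B[:, :, k].T) for k in range(A.shape[2])], axis=-1
)
```

Both arrays have shape (4, 6, 5) and agree to floating-point precision.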

jitchol(A, maxtries=5)[source]
mdot(*args)[source]

Multiply all the arguments using matrix product rules. The output is equivalent to multiplying the arguments one by one from left to right using dot(). Precedence can be controlled by creating tuples of arguments, for instance mdot(a,((b,c),d)) computes a*((b*c)*d). Note that this means the output of dot(a,b) and mdot(a,b) will differ if a or b is a pure tuple of numbers.

multiple_pdinv(A)[source]
Parameters:
• A – A DxDxN numpy array (each A[:,:,i] is pd)

Returns:
• the inverses of A (np.ndarray)
• 0.5 * the log of the determinants of A (np.array)
pca(Y, input_dim)[source]

Principal component analysis: maximum likelihood solution by SVD

Parameters:
• Y – NxD np.array of data
• input_dim – int, dimension of projection

Returns:
• Nxinput_dim np.array of dimensionality reduced data
• input_dimxD mapping from X to Y
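PCA by SVD can be sketched in a few lines of numpy. The function `pca_svd` below is a hypothetical stand-in that returns objects of the shapes listed above; any additional scaling that pca may apply to X and the mapping is not reproduced here.

```python
import numpy as np

def pca_svd(Y, input_dim):
    """Return (X, W): X is N x input_dim reduced data, W is input_dim x D mapping X -> Y."""
    Yc = Y - Y.mean(axis=0)                             # centre the data
    U, s, Vt = np.linalg.svd(Yc, full_matrices=False)
    X = U[:, :input_dim] * s[:input_dim]                # principal component scores
    W = Vt[:input_dim, :]                               # principal directions (rows)
    return X, W

rng = np.random.RandomState(0)
latent = rng.randn(50, 2)
Y = latent.dot(rng.randn(2, 5))                         # rank-2 data in 5 dimensions
X, W = pca_svd(Y, input_dim=2)
reconstruction = X.dot(W) + Y.mean(axis=0)
```

Because the toy data has rank 2, two components are enough to reconstruct Y exactly (up to rounding).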
pddet(A)[source]

Determinant of a positive definite matrix, only symmetric matrices though

pdinv(A, *args)[source]
Parameters:
• A – A DxD pd numpy array

Returns:
• the inverse of A (np.ndarray)
• the Cholesky decomposition of A (np.ndarray)
• the Cholesky decomposition of Ai (np.ndarray)
• the log of the determinant of A (float64)
ppca(Y, Q, iterations=100)[source]

EM implementation for probabilistic pca.

Parameters:
• Y (array-like) – Observed Data
• Q (int) – Dimensionality for reduced array
• iterations (int) – number of iterations for EM
symmetrify(A, upper=False)[source]

Take the square matrix A and make it symmetrical by copying elements from the lower half to the upper.

Works IN PLACE.

Note: tries to use cython, falls back to a slower numpy version.

tdot(*args, **kwargs)[source]
tdot_blas(mat, out=None)[source]

returns np.dot(mat, mat.T), but faster for large 2D arrays of doubles.

tdot_numpy(mat, out=None)[source]
trace_dot(a, b)[source]

Efficiently compute the trace of the matrix product of a and b
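One standard way to get the trace of a product without forming the product is the identity trace(ab) = sum over i, j of a[i, j] * b[j, i]. The sketch below demonstrates that identity with a hypothetical helper `trace_of_product`; whether trace_dot uses exactly this expression is an assumption.

```python
import numpy as np

def trace_of_product(a, b):
    """trace(a.dot(b)) without forming the product: sum_ij a[i, j] * b[j, i]."""
    return np.sum(a * b.T)

rng = np.random.RandomState(1)
a = rng.randn(4, 6)
b = rng.randn(6, 4)
fast = trace_of_product(a, b)
slow = np.trace(a.dot(b))
```

For an N x M and an M x N matrix this costs N*M multiplications instead of the N*N*M needed to build the full product.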

## GPy.util.ln_diff_erfs module¶

ln_diff_erfs(x1, x2, return_sign=False)[source]

Function for computing the log of the difference of two erfs in a numerically stable manner.

Parameters:
• x1 (ndarray) – argument of the positive erf
• x2 (ndarray) – argument of the negative erf

Returns: tuple containing (log(abs(erf(x1) - erf(x2))), sign(erf(x1) - erf(x2)))

Based on MATLAB code that was written by Antti Honkela and modified by David Luengo and originally derived from code by Neil Lawrence.

## GPy.util.misc module¶

blockify_dhess_dtheta(func)[source]
blockify_hessian(func)[source]
blockify_third(func)[source]
chain_1(df_dg, dg_dx)[source]

Generic chaining function for first derivative

$\frac{d(f . g)}{dx} = \frac{df}{dg} \frac{dg}{dx}$
chain_2(d2f_dg2, dg_dx, df_dg, d2g_dx2)[source]

Generic chaining function for second derivative

$\frac{d^{2}(f . g)}{dx^{2}} = \frac{d^{2}f}{dg^{2}}(\frac{dg}{dx})^{2} + \frac{df}{dg}\frac{d^{2}g}{dx^{2}}$
chain_3(d3f_dg3, dg_dx, d2f_dg2, d2g_dx2, df_dg, d3g_dx3)[source]

Generic chaining function for third derivative

$\frac{d^{3}(f . g)}{dx^{3}} = \frac{d^{3}f}{dg^{3}}(\frac{dg}{dx})^{3} + 3\frac{d^{2}f}{dg^{2}}\frac{dg}{dx}\frac{d^{2}g}{dx^{2}} + \frac{df}{dg}\frac{d^{3}g}{dx^{3}}$
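The three chaining formulas can be verified on a case with known closed forms. The sketch below re-implements them for scalars under hypothetical names (`chain_1_scalar` and so on, with the same argument order as the signatures above) and applies them to f(g) = exp(g), g(x) = x^2.

```python
import math

def chain_1_scalar(df_dg, dg_dx):
    return df_dg * dg_dx

def chain_2_scalar(d2f_dg2, dg_dx, df_dg, d2g_dx2):
    return d2f_dg2 * dg_dx ** 2 + df_dg * d2g_dx2

def chain_3_scalar(d3f_dg3, dg_dx, d2f_dg2, d2g_dx2, df_dg, d3g_dx3):
    return d3f_dg3 * dg_dx ** 3 + 3 * d2f_dg2 * dg_dx * d2g_dx2 + df_dg * d3g_dx3

# Worked case: f(g) = exp(g), g(x) = x^2, so the composition is exp(x^2).
x = 0.7
g, dg, d2g, d3g = x ** 2, 2 * x, 2.0, 0.0
f1 = f2 = f3 = math.exp(g)          # every derivative of exp is exp

first = chain_1_scalar(f1, dg)
second = chain_2_scalar(f2, dg, f1, d2g)
third = chain_3_scalar(f3, dg, f2, d2g, f1, d3g)
```

Differentiating exp(x^2) by hand gives 2x exp(x^2), (4x^2 + 2) exp(x^2) and (8x^3 + 12x) exp(x^2), which is what the three calls return.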
kmm_init(X, m=10)[source]

This is the same initialization algorithm that is used in Kmeans++. It’s quite simple and very useful to initialize the locations of the inducing points in sparse GPs.

Parameters:
• X – data
• m – number of inducing points
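For readers unfamiliar with k-means++ seeding, the textbook version can be sketched in plain Python: each new centre is drawn with probability proportional to its squared distance from the nearest centre chosen so far. The helper `kmeanspp_init` and its `seed` argument are hypothetical, and kmm_init may differ from this sketch in its details.

```python
import random

def kmeanspp_init(X, m, seed=0):
    """Pick m rows of X as initial centres using distance-squared sampling."""
    rng = random.Random(seed)
    centres = [X[rng.randrange(len(X))]]           # first centre: uniform at random
    while len(centres) < m:
        # squared distance from each point to its nearest chosen centre
        d2 = [min(sum((p - c) ** 2 for p, c in zip(x, ctr)) for ctr in centres)
              for x in X]
        total = sum(d2)
        if total == 0.0:                           # every point is already a centre
            break
        threshold = rng.random() * total           # sample proportionally to d2
        cumulative = 0.0
        for x, w in zip(X, d2):
            cumulative += w
            if cumulative >= threshold and w > 0:
                centres.append(x)
                break
    return centres

X = [(0.0, 0.0), (0.1, 0.0), (10.0, 10.0), (10.1, 10.0), (-10.0, 5.0)]
centres = kmeanspp_init(X, 3)
```

Points far from the existing centres are more likely to be picked, which spreads the initial inducing locations over the data.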
linear_grid(D, n=100, min_max=(-100, 100))[source]

Creates a D-dimensional grid of n linearly spaced points

Parameters:
• D – dimension of the grid
• n – number of points
• min_max – (min, max) list
opt_wrapper(m, **kwargs)[source]

This function just wraps the optimization procedure of a GPy object so that optimize() is pickleable (necessary for multiprocessing).

param_to_array(*param)[source]

Convert an arbitrary number of parameters to ndarray class objects. This is for converting parameter objects to numpy arrays when using the scipy.weave.inline routine. In scipy.weave.blitz there is no automatic array detection (even when the array inherits from ndarray).

safe_cube(f)[source]
safe_exp(f)[source]
safe_square(f)[source]
safe_three_times(f)[source]

## GPy.util.mocap module¶

class acclaim_skeleton(file_name=None)[source]

Bases: GPy.util.mocap.skeleton

get_child_xyz(ind, channels)[source]

Loads an ASF file into a skeleton structure.

Parameters: file_name – The file name to load in.

Read bone data from an acclaim skeleton file stream.

Read channels from an acclaim file.

Read documentation from an acclaim skeleton file stream.

Read hierarchy information from acclaim skeleton file stream.

Read a line from a file string and check it isn’t either empty or commented before returning.

Read the root node from an acclaim skeleton file stream.

Loads an acclaim skeleton format from a file stream.

Read units from an acclaim skeleton file stream.

resolve_indices(index, start_val)[source]

Get indices for the skeleton from the channels when loading in channel data.

save_channels(file_name, channels)[source]
set_rotation_matrices()[source]

Set the meta information at each vertex to contain the correct matrices C and Cinv as prescribed by the rotations and rotation orders.

to_xyz(channels)[source]
writ_channels(fid, channels)[source]
class skeleton[source]

Bases: GPy.util.mocap.tree

connection_matrix()[source]
finalize()[source]

After loading in a skeleton ensure parents are correct, vertex orders are correct and rotation matrices are correct.

smooth_angle_channels(channels)[source]

Remove discontinuities in angle channels so that they don’t cause artifacts in algorithms that rely on the smoothness of the functions.

to_xyz(channels)[source]
class tree[source]

Bases: object

branch_str(index, indent='')[source]
find_children()[source]

Take a tree and set the children according to the parents.

Takes a tree structure which lists the parents of each vertex and computes the children for each vertex and places them in.

find_parents()[source]

Take a tree and set the parents according to the children

Takes a tree structure which lists the children of each vertex and computes the parents for each vertex and places them in.
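The two conversions are inverses of each other, which a small sketch on index lists makes concrete. The functions below are hypothetical and operate on plain lists rather than on the tree class: entry i of each list holds the parent (or child) indices of vertex i.

```python
def children_from_parents(parents):
    """parents[i] lists the parent indices of vertex i; return the children of each vertex."""
    children = [[] for _ in parents]
    for child, plist in enumerate(parents):
        for parent in plist:
            children[parent].append(child)
    return children

def parents_from_children(children):
    """Inverse operation: children[i] lists the child indices of vertex i."""
    parents = [[] for _ in children]
    for parent, clist in enumerate(children):
        for child in clist:
            parents[child].append(parent)
    return parents

# A small skeleton-like tree: 0 is the root, 1 and 2 hang off it, 3 hangs off 1.
parents = [[], [0], [0], [1]]
children = children_from_parents(parents)
round_trip = parents_from_children(children)
```

The root is the only vertex whose parent list stays empty, which is also what a root search can look for.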

find_root()[source]

Finds the index of the root node of the tree.

get_index_by_id(id)[source]

Give the index associated with a given vertex id.

get_index_by_name(name)[source]

Give the index associated with a given vertex name.

order_vertices()[source]

Order vertices in the graph such that parents always have a lower index than children.

swap_vertices(i, j)[source]

Swap the locations of two vertices in the tree structure array.

Parameters:
• tree – the tree for which two vertices are to be swapped.
• i – the index of the first vertex to be swapped.
• j – the index of the second vertex to be swapped.

Returns: the tree structure with the two vertex locations swapped.
class vertex(name, id, parents=[], children=[], meta={})[source]

Bases: object

Load in a data set of marker points from the Ohio State University C3D motion capture files (http://accad.osu.edu/research/mocap/mocap_data.htm).

parse_text(file_name)[source]

Parse data from Ohio State University text mocap files (http://accad.osu.edu/research/mocap/mocap_data.htm).

Read a file detailing which markers should be connected to which for motion capture data.

rotation_matrix(xangle, yangle, zangle, order='zxy', degrees=False)[source]

Compute the rotation matrix for an angle in each direction. This is a helper function for computing the rotation matrix for a given set of angles in a given order.

Parameters:
• xangle – rotation for x-axis.
• yangle – rotation for y-axis.
• zangle – rotation for z-axis.
• order – the order for the rotations.

## GPy.util.multioutput module¶

ICM(input_dim, num_outputs, kernel, W_rank=1, W=None, kappa=None, name='ICM')[source]

Builds a kernel for an Intrinsic Coregionalization Model

Parameters:
• input_dim – input dimensionality (does not include the dimension of the indices)
• num_outputs – number of outputs
• kernel (a GPy kernel) – kernel that will be multiplied by the coregionalize kernel (matrix B).
• W_rank (integer) – number of tuples of the coregionalization parameters ‘W’
LCM(input_dim, num_outputs, kernels_list, W_rank=1, name='ICM')[source]

Builds a kernel for a Linear Coregionalization Model

Parameters:
• input_dim – input dimensionality (does not include the dimension of the indices)
• num_outputs – number of outputs
• kernel (a GPy kernel) – kernel that will be multiplied by the coregionalize kernel (matrix B).
• W_rank (integer) – number of tuples of the coregionalization parameters ‘W’
Private(input_dim, num_outputs, kernel, output, kappa=None, name='X')[source]

Builds a kernel for an Intrinsic Coregionalization Model

Parameters:
• input_dim – input dimensionality
• num_outputs – number of outputs
• kernel (a GPy kernel) – kernel that will be multiplied by the coregionalize kernel (matrix B).
• W_rank (integer) – number of tuples of the coregionalization parameters ‘W’
build_XY(input_list, output_list=None, index=None)[source]
build_likelihood(Y_list, noise_index, likelihoods_list=None)[source]
get_slices(input_list)[source]
index_to_slices(index)[source]

Take a numpy array of integers (index) and return a nested list of slices such that the slices describe the start and stop points for each integer in the index.

e.g.

>>> index = np.asarray([0,0,0,1,1,1,2,2,2])

returns

>>> [[slice(0,3,None)],[slice(3,6,None)],[slice(6,9,None)]]

or, a more complicated example

>>> index = np.asarray([0,0,1,1,0,2,2,2,1,1])

returns

>>> [[slice(0,2,None),slice(4,5,None)],[slice(2,4,None),slice(8,10,None)],[slice(5,8,None)]]
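The run-grouping behaviour shown in the two examples can be reproduced with a short pure-Python sketch. The function `index_to_slices_sketch` is a hypothetical re-implementation for illustration; it assumes non-negative integers and works on any sequence, not only numpy arrays.

```python
def index_to_slices_sketch(index):
    """Group runs of equal integers in `index` into slices, one list of slices per integer."""
    index = list(index)
    if not index:
        return []
    slices = [[] for _ in range(max(index) + 1)]
    start = 0
    for pos in range(1, len(index) + 1):
        # close the current run at the end of the list or when the value changes
        if pos == len(index) or index[pos] != index[start]:
            slices[index[start]].append(slice(start, pos))
            start = pos
    return slices

simple = index_to_slices_sketch([0, 0, 0, 1, 1, 1, 2, 2, 2])
mixed = index_to_slices_sketch([0, 0, 1, 1, 0, 2, 2, 2, 1, 1])
```

In the mixed case the integers 0 and 1 each occur in two separate runs, so their entries hold two slices.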

## GPy.util.netpbmfile module¶

Read image data from, and write image data to, Netpbm files.

This implementation follows the Netpbm format specifications at http://netpbm.sourceforge.net/doc/. No gamma correction is performed.

The following image formats are supported: PBM (bi-level), PGM (grayscale), PPM (color), PAM (arbitrary), XV thumbnail (RGB332, read-only).

Author: Christoph Gohlke Laboratory for Fluorescence Dynamics, University of California, Irvine 2013.01.18

### Examples¶

>>> im1 = numpy.array([[0, 1],[65534, 65535]], dtype=numpy.uint16)
>>> imsave('_tmp.pgm', im1)
>>> im2 = imread('_tmp.pgm')
>>> assert numpy.all(im1 == im2)
class NetpbmFile(arg=None, **kwargs)[source]

Bases: object

Read and write Netpbm PAM, PBM, PGM and PPM files.

Initialize instance from filename, open file, or numpy array.

asarray(copy=True, cache=False, **kwargs)[source]

Return image data from file as numpy array.

close()[source]

Close open file. Future asarray calls might fail.

write(arg, **kwargs)[source]

Write instance to file.

Return image data from Netpbm file as numpy array.

args and kwargs are arguments to NetpbmFile.asarray().

imsave(filename, data, maxval=None, pam=False)[source]

Write image data to Netpbm file.

>>> image = numpy.array([[0, 1],[65534, 65535]], dtype=numpy.uint16)
>>> imsave('_tmp.pgm', image)

## GPy.util.normalizer module¶

Created on Aug 27, 2014

@author: Max Zwiessele

class Standardize[source]

Bases: GPy.util.normalizer._Norm

inverse_covariance(covariance)[source]

Convert scaled covariance to unscaled.

Args:
• covariance – numpy array of shape (n, n)

Returns:
• covariance – numpy array of shape (n, n, m), where m is the number of outputs
inverse_mean(X)[source]

Project the normalized object X into space of Y

inverse_variance(var)[source]
normalize(Y)[source]

Project Y into normalized space

scale_by(Y)[source]

Use data matrix Y as normalization space to work in.

scaled()[source]

Whether this Norm object has been initialized.

to_dict()[source]

Convert the object into a json serializable dictionary.

Note: It uses the private method _save_to_input_dict of the parent.

Return dict: json serializable dictionary containing the needed information to instantiate the object
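The relationship between scale_by, normalize and the inverse methods can be pictured with a single-column sketch on plain Python lists. The class `StandardizeSketch` is hypothetical, and the assumption that standardization means subtracting the mean and dividing by the standard deviation is stated here rather than taken from the text above.

```python
class StandardizeSketch:
    """Zero-mean, unit-variance scaling of one output column (plain Python lists)."""

    def __init__(self):
        self.mean = None
        self.std = None

    def scale_by(self, Y):
        # Use Y as the normalization space: remember its mean and standard deviation.
        n = len(Y)
        self.mean = sum(Y) / n
        self.std = (sum((y - self.mean) ** 2 for y in Y) / n) ** 0.5

    def scaled(self):
        return self.mean is not None

    def normalize(self, Y):
        return [(y - self.mean) / self.std for y in Y]

    def inverse_mean(self, X):
        return [x * self.std + self.mean for x in X]

    def inverse_variance(self, var):
        return [v * self.std ** 2 for v in var]

Y = [2.0, 4.0, 6.0, 8.0]
norm = StandardizeSketch()
was_scaled = norm.scaled()
norm.scale_by(Y)
Z = norm.normalize(Y)
back = norm.inverse_mean(Z)
```

Means are mapped back with the stored mean and standard deviation, while variances only need the squared standard deviation.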

## GPy.util.parallel module¶

The module of tools for parallelization (MPI)

divide_data(datanum, rank, size)[source]
get_id_within_node(comm=None)[source]
optimize_parallel(model, optimizer=None, messages=True, max_iters=1000, outpath='.', interval=100, name=None, **kwargs)[source]

## GPy.util.pca module¶

Created on 10 Sep 2012

@author: Max Zwiessele @copyright: Max Zwiessele 2012

class PCA(X)[source]

Bases: object

PCA module with automatic primal/dual determination.

center(X)[source]

Center X in PCA space.

plot_2d(X, labels=None, s=20, marker='o', dimensions=(0, 1), ax=None, colors=None, fignum=None, cmap=None, **kwargs)[source]

Plot the dimensions given by dimensions against each other in PC space, with the given labels. Labels can be any sequence of labels of length X.shape[0]. Labels can be drawn with a subsequent call to legend()

plot_fracs(Q=None, ax=None, fignum=None)[source]

Plot fractions of Eigenvalues sorted in descending order.

project(X, Q=None)[source]

Project X into PCA space, defined by the Q highest eigenvalues. Y = X dot V

The file for utilities related to integration by quadrature methods - will contain an implementation of Gauss-Kronrod integration.

getSubs(Subs, XK, NK=1)[source]

Integrate f from fmin to fmax using integration by substitution with x = r / (1-r**2): when r goes from -1 to 1, x goes from -inf to inf. The interval for the quadgk function is from -1 to +1, so the space is transformed from (-inf, inf) to (-1, 1).

Parameters: f, fmin, fmax, difftol

The numpy implementation makes use of the code at http://se.mathworks.com/matlabcentral/fileexchange/18801-quadvgk (Gauss-Kronrod integration, as already used in gpstuff for evaluating one-dimensional integrals). This is vectorised quadrature, which means that several functions can be evaluated at the same time over a grid of points.

Parameters: f, fmin, fmax, difftol
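The substitution x = r / (1 - r**2) can be tried out with a much simpler rule than Gauss-Kronrod. The sketch below swaps in a plain midpoint rule, which is an assumption made to keep the example short; `integrate_real_line` is a hypothetical helper. The Jacobian of the substitution is dx/dr = (1 + r**2) / (1 - r**2)**2.

```python
import math

def integrate_real_line(f, n=2000):
    """Approximate the integral of f over (-inf, inf) via x = r / (1 - r^2), r in (-1, 1)."""
    h = 2.0 / n
    total = 0.0
    for i in range(n):
        r = -1.0 + (i + 0.5) * h                     # midpoints never touch r = -1 or r = 1
        x = r / (1.0 - r * r)
        dx_dr = (1.0 + r * r) / (1.0 - r * r) ** 2   # Jacobian of the substitution
        total += f(x) * dx_dr * h
    return total

def gaussian(x):
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

area = integrate_real_line(gaussian)
```

The standard normal density integrates to one, so `area` should be very close to 1.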

## GPy.util.squashers module¶

sigmoid(x)[source]
single_softmax(x)[source]
softmax(x)[source]

## GPy.util.subarray_and_sorting module¶

Module author: Max Zwiessele <ibinbei@gmail.com>

common_subarrays(X, axis=0)[source]

Find common subarrays of 2 dimensional X, where axis is the axis to apply the search over. Common subarrays are returned as a dictionary of <subarray, [index]> pairs, where the subarray is a tuple representing the subarray and the index is the index for the subarray in X, where index is the index to the remaining axis.

Parameters: X (np.ndarray) – 2d array to check for common subarrays in. axis (int) – axis to apply subarray detection over.

When axis is 0, rows are compared; otherwise, columns are compared.

In a 2d array:

>>> import numpy as np
>>> X = np.zeros((3,6), dtype=bool)
>>> X[[1,1,1],[0,4,5]] = 1; X[1:,[2,3]] = 1
>>> X
array([[False, False, False, False, False, False],
       [ True, False,  True,  True,  True,  True],
       [False, False,  True,  True, False, False]], dtype=bool)
>>> d = common_subarrays(X,axis=1)
>>> len(d)
3
>>> X[:, d[tuple(X[:,0])]]
array([[False, False, False],
       [ True,  True,  True],
       [False, False, False]], dtype=bool)
>>> d[tuple(X[:,4])] == d[tuple(X[:,0])] == [0, 4, 5]
True
>>> d[tuple(X[:,1])]
[1]
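The grouping idea behind this example can be sketched without numpy. The helper below is hypothetical and is not the GPy implementation; it only covers the column-wise case, using each column (converted to a tuple) as a dictionary key.

```python
from collections import defaultdict

def group_identical_columns(rows):
    """Map each distinct column (as a tuple) to the list of its column indices.

    `rows` is a 2d array given as a list of equal-length lists.
    """
    groups = defaultdict(list)
    for j, column in enumerate(zip(*rows)):   # zip(*rows) iterates over columns
        groups[column].append(j)
    return dict(groups)

# Same data as the doctest above, written as nested lists.
X = [
    [False, False, False, False, False, False],
    [True,  False, True,  True,  True,  True],
    [False, False, True,  True,  False, False],
]
d = group_identical_columns(X)
```

Tuples are used as keys because they are hashable, which is also why the doctest above indexes the result with `tuple(X[:,0])`.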

## GPy.util.univariate_Gaussian module¶

cdfNormal(z)[source]

Robust implementation of the cdf of a standard normal.

See the original implementation in C by Matthias Seeger: https://github.com/mseeger/apbsint/blob/master/src/eptools/potentials/SpecfunServices.h
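To illustrate why a "robust" cdf matters, the sketch below compares two mathematically equivalent formulas in plain Python. It is not the algorithm from the referenced C code; both function names are hypothetical, and the erfc-based form is just one common way to keep precision in the lower tail.

```python
import math

def cdf_normal_sketch(z):
    """Standard normal cdf computed as 0.5 * erfc(-z / sqrt(2))."""
    return 0.5 * math.erfc(-z / math.sqrt(2.0))

def cdf_normal_naive(z):
    """Textbook form 0.5 * (1 + erf(z / sqrt(2))); loses the lower tail."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

tail_robust = cdf_normal_sketch(-10.0)   # tiny but strictly positive
tail_naive = cdf_normal_naive(-10.0)     # 1 + erf(...) cancels in floating point
```

The difference matters whenever the logarithm of the cdf is taken afterwards, since the naive form can return exactly zero far in the lower tail.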


derivLogCdfNormal(z)[source]

Robust implementation of the derivative of the log cdf of a standard normal.

See the original implementation in C by Matthias Seeger: https://github.com/mseeger/apbsint/blob/master/src/eptools/potentials/SpecfunServices.h
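The quantity in question is the ratio of the standard normal pdf to its cdf. A minimal plain-Python sketch follows, assuming a direct evaluation for moderate z and a truncated asymptotic series far in the lower tail. The function name, the switch point of -30 and the number of series terms are choices made for this example, not details of the referenced implementation.

```python
import math

SQRT_2PI = math.sqrt(2.0 * math.pi)

def deriv_log_cdf_sketch(z, switch=-30.0):
    """d/dz log Phi(z) = phi(z) / Phi(z) for a standard normal."""
    if z > switch:
        pdf = math.exp(-0.5 * z * z) / SQRT_2PI
        cdf = 0.5 * math.erfc(-z / math.sqrt(2.0))
        return pdf / cdf
    # Far lower tail: pdf and cdf both underflow eventually, so use the
    # asymptotic expansion Phi(z) / phi(z) ~ (1/|z|) * (1 - 1/z^2 + 3/z^4 - 15/z^6).
    z2 = z * z
    series = 1.0 - 1.0 / z2 + 3.0 / z2 ** 2 - 15.0 / z2 ** 3
    return -z / series

at_zero = deriv_log_cdf_sketch(0.0)       # equals sqrt(2 / pi)
far_tail = deriv_log_cdf_sketch(-100.0)   # slightly larger than 100
```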

inv_std_norm_cdf(x)[source]

Inverse cumulative standard Gaussian distribution. Based on Winitzki, S. (2008).
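A plain-Python sketch of this kind of approximation is given below. It assumes the commonly quoted closed form of Winitzki's approximation to the inverse error function; whether GPy uses exactly this constant is not stated above, and the function name is hypothetical. The result is approximate (roughly two to three correct digits), not exact.

```python
import math

# Constant from Winitzki's approximation of erf and its inverse (assumed form).
A = 8.0 * (math.pi - 3.0) / (3.0 * math.pi * (4.0 - math.pi))

def inv_std_norm_cdf_sketch(p):
    """Approximate inverse standard normal cdf for 0 < p < 1."""
    z = 2.0 * p - 1.0                      # map probability to the erf argument
    ln1z2 = math.log(1.0 - z * z)
    b = 2.0 / (math.pi * A) + 0.5 * ln1z2
    inner = math.sqrt(b * b - ln1z2 / A) - b
    # max(..., 0.0) guards against a tiny negative value from rounding.
    inv_erf = math.copysign(math.sqrt(max(inner, 0.0)), z)
    return math.sqrt(2.0) * inv_erf

upper = inv_std_norm_cdf_sketch(0.975)     # close to the familiar 1.96
```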

logCdfNormal(z)[source]

Robust implementation of the log cdf of a standard normal.

See the original implementation in C by Matthias Seeger: https://github.com/mseeger/apbsint/blob/master/src/eptools/potentials/SpecfunServices.h

logPdfNormal(z)[source]

Robust implementation of the log pdf of a standard normal.

See the original implementation in C by Matthias Seeger: https://github.com/mseeger/apbsint/blob/master/src/eptools/potentials/SpecfunServices.h

std_norm_pdf(x)[source]

## GPy.util.warping_functions module¶

class IdentityFunction(closed_inverse=True)[source]

Identity warping function. This is for testing and sanity check purposes and should not be used in practice. The closed_inverse flag should only be set to False for debugging and testing purposes.

f(y)[source]

Function transformation. y is a list of values (GP training data) of shape [N, 1].

Gradient of f w.r.t. y.

Gradient of f w.r.t. y.

class LogFunction(closed_inverse=True)[source]

Easy wrapper for applying a fixed log warping function to positive-only values. The closed_inverse flag should only be set to False for debugging and testing purposes.

f(y)[source]

Function transformation. y is a list of values (GP training data) of shape [N, 1].

Gradient of f w.r.t. y.

Gradient of f w.r.t. y.

class TanhFunction(n_terms=3, initial_y=None)[source]

This is the function proposed in Snelson et al.: A sum of tanh functions with linear trends outside the range. Notice the term ‘d’, which scales the linear trend.

n_terms specifies the number of tanh terms to be used

f(y)[source]

Transform y with f using the parameter vector psi, where psi = [[a, b, c]] holds one row per tanh term:

$$f(y) = d \, y + \sum_{\text{terms}} a \tanh\big(b \, (y + c)\big)$$

Gradient of f w.r.t. y ([N x 1]).

Returns: Nx1 vector of derivatives, unless return_precalc is true, in which case the precomputed quantities are also returned.
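The transformation and its gradient with respect to y can be sketched for a single scalar y in plain Python. This is an illustration under stated assumptions, not the GPy code: the function names and the parameter values are hypothetical, and the gradient formula is obtained by differentiating the sum of tanh terms term by term.

```python
import math

def tanh_warp(y, psi, d):
    """f(y) = d*y + sum over terms of a * tanh(b * (y + c))."""
    return d * y + sum(a * math.tanh(b * (y + c)) for a, b, c in psi)

def tanh_warp_grad_y(y, psi, d):
    """df/dy = d + sum over terms of a * b * (1 - tanh(b * (y + c))**2)."""
    return d + sum(a * b * (1.0 - math.tanh(b * (y + c)) ** 2) for a, b, c in psi)

# Hypothetical parameter values, one [a, b, c] row per tanh term.
psi = [[1.0, 0.5, 0.0], [0.3, 2.0, -1.0], [0.7, 1.0, 0.5]]
d = 0.1

# Check the analytic gradient against a central finite difference.
y0 = 0.4
eps = 1e-6
finite_diff = (tanh_warp(y0 + eps, psi, d) - tanh_warp(y0 - eps, psi, d)) / (2 * eps)
analytic = tanh_warp_grad_y(y0, psi, d)
```

With positive a, b and d every term of the gradient is positive, so the warping is strictly increasing and therefore invertible.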

Gradient of f w.r.t. y and psi.

Returns: NxIx4 tensor of partial derivatives

class WarpingFunction(name)[source]

Abstract function for warping: z = f(y).

f(y, psi)[source]

Function transformation. y is a list of values (GP training data) of shape [N, 1].

f_inv(z, max_iterations=250, y=None)[source]

Calculate the numerical inverse of f. This should be overridden for specific warping functions where the inverse can be found in closed form.

Parameters: max_iterations – maximum number of Newton-Raphson (N.R.) iterations
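A numerical inverse of this kind can be sketched with Newton-Raphson iterations on f(y) - z = 0. The example below is a plain-Python illustration, not the GPy method: the warping function, the starting point of 0, the tolerance and all function names are assumptions made for the sketch.

```python
import math

def f(y):
    """Hypothetical monotone warping: a linear trend plus one tanh term."""
    return 0.5 * y + math.tanh(y)

def fgrad_y(y):
    """Derivative of f with respect to y; strictly positive here."""
    return 0.5 + 1.0 - math.tanh(y) ** 2

def f_inv_newton(z, max_iterations=250, tol=1e-12):
    """Solve f(y) = z for y by Newton-Raphson, starting from y = 0."""
    y = 0.0
    for _ in range(max_iterations):
        step = (f(y) - z) / fgrad_y(y)
        y -= step
        if abs(step) < tol:               # stop once the update is negligible
            break
    return y

z = f(1.7)
y_recovered = f_inv_newton(z)             # should be close to 1.7
```

The iteration relies on the gradient being non-zero everywhere, which holds for a strictly increasing warping function such as the one assumed here.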