GPy.util package

Introduction

A variety of utility functions including matrix operations and quick access to test datasets.

Submodules

GPy.util.block_matrices module

block_dot(A, B, diagonal=False)[source]

Element-wise dot product on block matrices

+------+------+   +------+------+   +-------+-------+
|      |      |   |      |      |   |A11.B11|A12.B12|
| A11  | A12  |   | B11  | B12  |   |       |       |
+------+------+ o +------+------+ = +-------+-------+
|      |      |   |      |      |   |A21.B21|A22.B22|
| A21  | A22  |   | B21  | B22  |   |       |       |
+------+------+   +------+------+   +-------+-------+

Note

If any block of A or B is stored as a 1d vector, it is assumed to denote a diagonal matrix, and a more efficient dot product using numpy broadcasting is used, i.e. A11*B11
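The broadcasting shortcut described in the note can be sketched for a single pair of blocks as follows. This is a minimal illustration, not GPy's implementation: the helper name block_product is hypothetical, and returning a 1d vector when both blocks are diagonal is an assumption.

```python
import numpy as np

def block_product(a, b):
    """Product of two blocks, treating a 1d block as the diagonal of a matrix."""
    if a.ndim == 1 and b.ndim == 1:
        return a * b                  # diag(a) @ diag(b), kept as a 1d diagonal
    if a.ndim == 1:
        return a[:, None] * b         # diag(a) @ b scales the rows of b
    if b.ndim == 1:
        return a * b[None, :]         # a @ diag(b) scales the columns of a
    return a.dot(b)                   # two dense blocks: ordinary matrix product

d = np.array([1.0, 2.0, 3.0])         # diagonal block stored as a vector
M = np.arange(9.0).reshape(3, 3)      # dense block

left = block_product(d, M)
right = block_product(M, d)
```

Broadcasting avoids building the dense diagonal matrix, so the cost of a diagonal-times-dense block product drops from cubic to quadratic in the block size.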

get_block_shapes(B)[source]
get_block_shapes_3d(B)[source]
get_blocks(A, blocksizes)[source]
get_blocks_3d(A, blocksizes, pagesizes=None)[source]

Given a 3d matrix, make a block matrix, where the first and second dimensions are blocked according to blocksizes, and the pages are blocked using pagesizes

unblock(B)[source]

GPy.util.choleskies module

backprop_gradient(dL, L)

Given the derivative of an objective function with respect to the Cholesky factor L, compute the derivative with respect to the original matrix K, defined as

K = LL^T

where L was obtained by Cholesky decomposition

flat_to_triang(flat_mat)
indexes_to_fix_for_low_rank(rank, size)[source]

Work out which indexes of the flattened array should be fixed if we want the Cholesky factor to represent a low-rank matrix

multiple_dpotri(Ls)[source]
safe_root(N)[source]
triang_to_cov(L)[source]
triang_to_flat(L)

GPy.util.choleskies_cython module

GPy.util.classification module

conf_matrix(p, labels, names=['1', '0'], threshold=0.5, show=True)[source]

Returns error rate and true/false positives in a binary classification problem.

  • Actual classes are displayed by column.
  • Predicted classes are displayed by row.

Parameters:
  • p – array of class ‘1’ probabilities.
  • labels – array of actual classes.
  • names – list of class names, defaults to ['1', '0'].
  • threshold – probability value used to decide the class.
  • show (False|True) – whether the matrix should be shown or not
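As a rough illustration of the quantities involved, the sketch below thresholds the class '1' probabilities and counts the four outcomes. It is not the GPy source: the helper name confusion_counts is hypothetical, and the strict comparison against the threshold is an assumption.

```python
def confusion_counts(p, labels, threshold=0.5):
    """Count outcomes for a binary problem with classes 1 and 0."""
    tp = fp = fn = tn = 0
    for prob, actual in zip(p, labels):
        predicted = 1 if prob > threshold else 0   # assumed: strict comparison
        if predicted == 1 and actual == 1:
            tp += 1
        elif predicted == 1 and actual == 0:
            fp += 1
        elif predicted == 0 and actual == 1:
            fn += 1
        else:
            tn += 1
    error_rate = (fp + fn) / len(labels)
    # Rows are predicted classes, columns are actual classes.
    matrix = [[tp, fp],
              [fn, tn]]
    return error_rate, matrix

error_rate, matrix = confusion_counts([0.9, 0.8, 0.3, 0.6, 0.1], [1, 1, 1, 0, 0])
```

Here two of the five points are misclassified, one false positive and one false negative.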

GPy.util.cluster_with_offset module

cluster(data, inputs, verbose=False)[source]

Clusters data

Using the new offset model, this method uses a greedy algorithm to cluster the data. It starts with all the data points in separate clusters and tests whether combining them increases the overall log-likelihood (LL). It then iteratively joins pairs of clusters which cause the greatest increase in the LL, until no join increases the LL.

Arguments:
  • inputs – the ‘X’s in a list, one item per cluster
  • data – the ‘Y’s in a list, one item per cluster

Returns a list of the clusters.
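The greedy procedure described above can be sketched independently of the offset model. In this illustration the function names are hypothetical and toy_score is only a stand-in for the model log-likelihood (it rewards tight clusters and penalises the number of clusters); the GPy function fits a GP model instead.

```python
from itertools import combinations

def greedy_cluster(items, score):
    """Greedily merge clusters while a merge increases score(clusters)."""
    clusters = [[item] for item in items]    # start with one cluster per item
    best = score(clusters)
    while len(clusters) > 1:
        candidates = []
        for i, j in combinations(range(len(clusters)), 2):
            merged = [c for k, c in enumerate(clusters) if k not in (i, j)]
            merged.append(clusters[i] + clusters[j])
            candidates.append((score(merged), merged))
        top_score, top_clusters = max(candidates, key=lambda pair: pair[0])
        if top_score <= best:                # no join improves the score
            break
        best, clusters = top_score, top_clusters
    return clusters

def toy_score(clusters, penalty=1.0):
    """Hypothetical stand-in for a log-likelihood."""
    total = 0.0
    for c in clusters:
        mean = sum(c) / len(c)
        total -= sum((v - mean) ** 2 for v in c) + penalty
    return total

result = greedy_cluster([0.0, 0.1, 0.2, 5.0, 5.1], toy_score)
```

Each pass evaluates every pair of clusters, so the cost grows quickly with the number of starting clusters.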

get_log_likelihood(inputs, data, clust)[source]

Get the LL of a combined set of clusters, ignoring time series offsets.

Get the log likelihood of a cluster without worrying about the fact that different time series are offset. It is used here mainly for those cases in which there is only one cluster to compute the log likelihood of.

Arguments:
  • inputs – the ‘X’s in a list, one item per cluster
  • data – the ‘Y’s in a list, one item per cluster
  • clust – list of clusters to use

Returns a tuple: log likelihood and the offset (which is always zero for this model)

get_log_likelihood_offset(inputs, data, clust)[source]

Get the log likelihood of a combined set of clusters, fitting the offsets

Arguments:
  • inputs – the ‘X’s in a list, one item per cluster
  • data – the ‘Y’s in a list, one item per cluster
  • clust – list of clusters to use

Returns a tuple: log likelihood and the offset

GPy.util.config module

GPy.util.datasets module

authorize_download(dataset_name=None)[source]

Check with the user that they are happy with the terms and conditions for the data set.

boston_housing(data_set='boston_housing')[source]
boxjenkins_airline(data_set='boxjenkins_airline', num_train=96)[source]
brendan_faces(data_set='brendan_faces')[source]
cifar10_patches(data_set='cifar-10')[source]

The Canadian Institute for Advanced Research 10 image data set. Code for loading this data is taken from Boris Babenko’s blog post; the original code is available here: http://bbabenko.tumblr.com/post/86756017649/learning-low-level-vision-feautres-in-10-lines-of-code

cmu_mocap(subject, train_motions, test_motions=[], sample_every=4, data_set='cmu_mocap')[source]

Load a given subject’s training and test motions from the CMU motion capture data.

cmu_mocap_35_walk_jog(data_set='cmu_mocap')[source]

Load CMU subject 35’s walking and jogging motions, the same data that was used by Taylor, Roweis and Hinton at NIPS 2007, but without their preprocessing. Also used by Lawrence at AISTATS 2007.

cmu_mocap_49_balance(data_set='cmu_mocap')[source]

Load CMU subject 49’s one legged balancing motion that was used by Alvarez, Luengo and Lawrence at AISTATS 2009.

cmu_urls_files(subj_motions, messages=True)[source]

Find which resources are missing on the local disk for the requested CMU motion capture motions.

creep_data(data_set='creep_rupture')[source]

Brun and Yoshida’s metal creep rupture data.

crescent_data(num_data=200, seed=10000)[source]

Data set formed from a mixture of four Gaussians. In each class two of the Gaussians are elongated at right angles to each other and offset to form an approximation to the crescent data that is popular in semi-supervised learning as a toy problem.

Parameters:
  • num_data (int) – number of data to be sampled (default is 200).
  • seed (int) – random seed to be used for data generation.
data_available(dataset_name=None)[source]

Check if the data set is available on the local machine already.

data_details_return(data, data_set)[source]

Update the data component of the data dictionary with details drawn from the data_resources.

decampos_digits(data_set='decampos_characters', which_digits=[0, 1, 2, 3, 4, 5, 6, 7, 8, 9])[source]
della_gatta_TRP63_gene_expression(data_set='della_gatta', gene_number=None)[source]
download_data(dataset_name=None)[source]

Check with the user that they are happy with the terms and conditions for the data set, then download it.

download_rogers_girolami_data(data_set='rogers_girolami_data')[source]
download_url(url, store_directory, save_name=None, messages=True, suffix='')[source]

Download a file from a url and save it to disk.

drosophila_knirps(data_set='drosophila_protein')[source]
drosophila_protein(data_set='drosophila_protein')[source]
football_data(season='1314', data_set='football_data')[source]

Football data from English games since 1993. This downloads data from football-data.co.uk for the given season.

fruitfly_tomancak(data_set='fruitfly_tomancak', gene_number=None)[source]
global_average_temperature(data_set='global_temperature', num_train=1000, refresh_data=False)[source]

Data downloaded from Google trends for given query terms.

Warning: if you use this function multiple times in a row you get blocked due to terms of service violations. The function caches the result of your query; to refresh an old query, set refresh_data to True.

The function is inspired by this notebook: http://nbviewer.ipython.org/github/sahuguet/notebooks/blob/master/GoogleTrends%20meet%20Notebook.ipynb

hapmap3(data_set='hapmap3')[source]

The HapMap phase three SNP dataset - 1184 samples out of 11 populations.

SNP_matrix (A) encoding [see Paschou et al. 2007 (PCA-Correlated SNPs…)]: let (B1,B2) be the alphabetically sorted bases that occur in the j-th SNP, then

\[A_{ij} = \begin{cases} 1, & \text{iff } SNP_{ij} = (B1,B1) \\ 0, & \text{iff } SNP_{ij} = (B1,B2) \\ -1, & \text{iff } SNP_{ij} = (B2,B2) \end{cases}\]
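A small sketch of this encoding for a single SNP column is given below. The function name encode_snp is hypothetical, and the sketch assumes that exactly two bases occur in the column, that the order within a heterozygous pair does not matter, and that there is no missing data; the actual loader may treat these cases differently.

```python
def encode_snp(genotypes):
    """Encode one SNP column as 1 / 0 / -1 following the scheme above."""
    # Alphabetically sorted bases that occur in this SNP; two are assumed.
    b1, b2 = sorted({base for pair in genotypes for base in pair})
    codes = {(b1, b1): 1, (b1, b2): 0, (b2, b2): -1}
    # Sort each pair so that ("G", "A") and ("A", "G") are treated alike.
    return [codes[tuple(sorted(pair))] for pair in genotypes]

column = encode_snp([("A", "A"), ("A", "G"), ("G", "A"), ("G", "G")])
```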

The SNP data and the meta information (such as iid, sex and phenotype) are stored in the dataframe datadf, index is the Individual ID, with following columns for metainfo:

  • family_id -> Family ID
  • paternal_id -> Paternal ID
  • maternal_id -> Maternal ID
  • sex -> Sex (1=male; 2=female; other=unknown)
  • phenotype -> Phenotype (-9, or 0 for unknown)
  • population -> Population string (e.g. ‘ASW’ - ‘YRI’)
  • rest are SNP rs (ids)

More information is given in infodf:

  • Chromosome:
    • autosomal chromosomes -> 1-22
    • X: X chromosome -> 23
    • Y: Y chromosome -> 24
    • XY: Pseudo-autosomal region of X -> 25
    • MT: Mitochondrial -> 26
  • Relative Position (to Chromosome) [base pairs]
isomap_faces(num_samples=698, data_set='isomap_face_data')[source]
lee_yeast_ChIP(data_set='lee_yeast_ChIP')[source]
mauna_loa(data_set='mauna_loa', num_train=545, refresh_data=False)[source]
oil(data_set='three_phase_oil_flow')[source]

The three phase oil data from Bishop and James (1993).

oil_100(seed=10000, data_set='three_phase_oil_flow')[source]
olivetti_faces(data_set='olivetti_faces')[source]
olivetti_glasses(data_set='olivetti_glasses', num_training=200, seed=10000)[source]
olympic_100m_men(data_set='rogers_girolami_data')[source]
olympic_100m_women(data_set='rogers_girolami_data')[source]
olympic_200m_men(data_set='rogers_girolami_data')[source]
olympic_200m_women(data_set='rogers_girolami_data')[source]
olympic_400m_men(data_set='rogers_girolami_data')[source]
olympic_400m_women(data_set='rogers_girolami_data')[source]
olympic_marathon_men(data_set='olympic_marathon_men')[source]
olympic_sprints(data_set='rogers_girolami_data')[source]

All olympics sprint winning times for multiple output prediction.

osu_run1(data_set='osu_run1', sample_every=4)[source]
prompt_user(prompt)[source]

Ask the user to agree to the data set licenses.

pumadyn(seed=10000, data_set='pumadyn-32nm')[source]
reporthook(a, b, c)[source]
ripley_synth(data_set='ripley_prnn_data')[source]
robot_wireless(data_set='robot_wireless')[source]
sample_class(f)[source]
silhouette(data_set='ankur_pose_data')[source]
simulation_BGPLVM()[source]
singlecell(data_set='singlecell')[source]
singlecell_rna_seq_deng(dataset='singlecell_deng')[source]
singlecell_rna_seq_islam(dataset='singlecell_islam')[source]
sod1_mouse(data_set='sod1_mouse')[source]
spellman_yeast(data_set='spellman_yeast')[source]
spellman_yeast_cdc15(data_set='spellman_yeast')[source]
swiss_roll(num_samples=3000, data_set='swiss_roll')[source]
swiss_roll_1000()[source]
swiss_roll_generated(num_samples=1000, sigma=0.0)[source]
toy_linear_1d_classification(seed=10000)[source]
toy_rbf_1d(seed=10000, num_samples=500)[source]

Samples values of a function from an RBF covariance with very small noise for inputs uniformly distributed between -1 and 1.

Parameters:
  • seed (int) – seed to use for random sampling.
  • num_samples (int) – number of samples to sample in the function (default 500).
toy_rbf_1d_50(seed=10000)[source]
xw_pen(data_set='xw_pen')[source]

GPy.util.debug module

The module for some general debug tools

checkFinite(arr, name=None)[source]
checkFullRank(m, tol=1e-10, name=None, force_check=False)[source]

GPy.util.decorators module

silence_errors(f)[source]

Wraps a function and silences numpy errors that happen during its execution. After the function has exited, the previous state of the warnings is restored.
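A decorator with this behaviour could look roughly like the following. This is a sketch using np.seterr, not the GPy source; the names silence_numpy_errors and safe_log are hypothetical.

```python
import functools
import numpy as np

def silence_numpy_errors(f):
    """Run f with numpy floating-point errors ignored, then restore settings."""
    @functools.wraps(f)
    def wrapper(*args, **kwargs):
        old_settings = np.seterr(all="ignore")   # returns the previous settings
        try:
            return f(*args, **kwargs)
        finally:
            np.seterr(**old_settings)            # restore even if f raises
    return wrapper

@silence_numpy_errors
def safe_log(x):
    return np.log(x)

before = np.geterr()
value = safe_log(np.array([0.0]))                # -inf, without a warning
after = np.geterr()
```

The try/finally block is what guarantees that the previous error state comes back even when the wrapped function raises.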

GPy.util.diag module

add(A, b, offset=0)[source]

Add b to the view of A in place (!). Returns modified A. Broadcasting is allowed, thus b can be scalar.

if offset is not zero, make sure b is of right shape!

Parameters:
  • A (ndarray) – 2 dimensional array
  • b (ndarray-like) – either one dimensional or scalar
  • offset (int) – same as in view.
Return type:

view of A, which is adjusted inplace

divide(A, b, offset=0)[source]

Divide the view of A by b in place (!). Returns modified A. Broadcasting is allowed, thus b can be scalar.

if offset is not zero, make sure b is of right shape!

Parameters:
  • A (ndarray) – 2 dimensional array
  • b (ndarray-like) – either one dimensional or scalar
  • offset (int) – same as in view.
Return type:

view of A, which is adjusted inplace

multiply(A, b, offset=0)

Multiply the view of A by b in place (!). Returns modified A. Broadcasting is allowed, thus b can be scalar.

if offset is not zero, make sure b is of right shape!

Parameters:
  • A (ndarray) – 2 dimensional array
  • b (ndarray-like) – either one dimensional or scalar
  • offset (int) – same as in view.
Return type:

view of A, which is adjusted inplace

offdiag_view(A, offset=0)[source]
subtract(A, b, offset=0)[source]

Subtract b from the view of A in place (!). Returns modified A. Broadcasting is allowed, thus b can be scalar.

if offset is not zero, make sure b is of right shape!

Parameters:
  • A (ndarray) – 2 dimensional array
  • b (ndarray-like) – either one dimensional or scalar
  • offset (int) – same as in view.
Return type:

view of A, which is adjusted inplace

times(A, b, offset=0)[source]

Multiply the view of A by b in place (!). Returns modified A. Broadcasting is allowed, thus b can be scalar.

if offset is not zero, make sure b is of right shape!

Parameters:
  • A (ndarray) – 2 dimensional array
  • b (ndarray-like) – either one dimensional or scalar
  • offset (int) – same as in view.
Return type:

view of A, which is adjusted inplace

view(A, offset=0)[source]

Get a view on the diagonal elements of a 2D array.

This is actually a view (!) on the diagonal of the array, so you can in-place adjust the view.

Parameters:
  • A (ndarray) – 2 dimensional numpy array
  • offset (int) – view offset to give back (negative entries allowed)
Return type:

ndarray view of diag(A)

>>> import numpy as np
>>> X = np.arange(9).reshape(3,3)
>>> view(X)
array([0, 4, 8])
>>> d = view(X)
>>> d += 2
>>> view(X)
array([ 2,  6, 10])
>>> view(X, offset=-1)
array([3, 7])
>>> subtract(X, 3, offset=-1)
array([[ 2,  1,  2],
       [ 0,  6,  5],
       [ 6,  4, 10]])

GPy.util.functions module

clip_exp(x)[source]
differfln(x0, x1)[source]
logistic(x)[source]
logisticln(x)[source]
normcdf(x)[source]
normcdfln(x)[source]

GPy.util.gpu_init module

The package for scikits.cuda initialization

Global variables: initSuccess. Provides the CUBLAS handle: cublas_handle

closeGPU()[source]

GPy.util.initialization module

Created on 24 Feb 2014

@author: maxz

initialize_latent(init, input_dim, Y)[source]

GPy.util.input_warping_functions module

class IdentifyWarping[source]

Bases: GPy.util.input_warping_functions.InputWarpingFunction

The identity warping function, for testing

f(X, test_data=False)[source]
fgrad_X(X)[source]
update_grads(X, dL_dW)[source]
class InputWarpingFunction(name)[source]

Bases: GPy.core.parameterization.parameterized.Parameterized

Abstract class for input warping functions

f(X, test=False)[source]
fgrad_x(X)[source]
update_grads(X, dL_dW)[source]
class InputWarpingTest[source]

Bases: GPy.util.input_warping_functions.InputWarpingFunction

The identity warping function, for testing

f(X, test_data=False)[source]
fgrad_X(X)[source]
update_grads(X, dL_dW)[source]
class KumarWarping(X, warping_indices=None, epsilon=None, Xmin=None, Xmax=None)[source]

Bases: GPy.util.input_warping_functions.InputWarpingFunction

Kumar Warping for input data

Parameters:
  • X (array_like, shape = (n_samples, n_features)) – the input data that is going to be warped.
  • warping_indices (list of int, optional) – the features that are going to be warped. Defaults to warping all the features.
  • epsilon (float, optional) – used to normalize the input data to [0+e, 1-e]. Defaults to 1e-6.
  • Xmin (list of float, optional) – the min values for each feature defined by users. Defaults to the train minimum.
  • Xmax (list of float, optional) – the max values for each feature defined by users. Defaults to the train maximum.
Attributes:
  • warping_indices (list of int) – the features that are going to be warped. Defaults to warping all the features.
  • warping_dim (int) – the number of features to be warped.
  • Xmin (list of float) – the min values for each feature defined by users. Defaults to the train minimum.
  • Xmax (list of float) – the max values for each feature defined by users. Defaults to the train maximum.
  • epsilon (float) – used to normalize the input data to [0+e, 1-e]. Defaults to 1e-6.
  • X_normalized (array_like, shape = (n_samples, n_features)) – the normalized training X.
  • scaling (list of float, length = n_features in X) – defined as 1.0 / (self.Xmax - self.Xmin).
  • params (list of Param) – the list of all the parameters used in Kumar Warping.
  • num_parameters (int) – the number of parameters used in Kumar Warping.
f(X, test_data=False)[source]

Apply the warping function to some input data.

Parameters:
  • X (array_like, shape = (n_samples, n_features))
  • test_data (bool, optional) – defaults to False; should be set to True when transforming test data.
Returns:

X_warped (array_like, shape = (n_samples, n_features)) – the warped input data

f(x) = 1 - (1 - x^a)^b

fgrad_X(X)[source]

Compute the gradient of warping function with respect to X

Parameters:
  • X (array_like, shape = (n_samples, n_features)) – the location to compute the gradient.
Returns:

grad (array_like, shape = (n_samples, n_features)) – the gradient for every location at X

grad = a * b * x ^(a-1) * (1 - x^a)^(b-1)
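The warping formula and its gradient can be checked against each other numerically. The sketch below works on a single scalar x that is assumed to be already normalized into (0, 1); the function names and the test values of a and b are illustrative and not part of GPy.

```python
def kumar_warp(x, a, b):
    """f(x) = 1 - (1 - x^a)^b for x in (0, 1)."""
    return 1.0 - (1.0 - x ** a) ** b

def kumar_warp_grad(x, a, b):
    """df/dx = a * b * x^(a-1) * (1 - x^a)^(b-1)."""
    return a * b * x ** (a - 1.0) * (1.0 - x ** a) ** (b - 1.0)

x, a, b, h = 0.3, 2.0, 3.0, 1e-6
# Central finite difference as an independent check of the analytic gradient.
numeric = (kumar_warp(x + h, a, b) - kumar_warp(x - h, a, b)) / (2.0 * h)
analytic = kumar_warp_grad(x, a, b)
```

With a = b = 1 the warp reduces to the identity, which is a convenient sanity check.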

update_grads(X, dL_dW)[source]

Update the gradients of marginal log likelihood with respect to the parameters of warping function

Parameters:
  • X (array_like, shape = (n_samples, n_features)) – the input BEFORE warping.
  • dL_dW (array_like, shape = (n_samples, n_features)) – the gradient of the marginal log likelihood with respect to the warped input.

Let w = f(x) be the input after warping; then

dW_da = b * (1 - x^a)^(b - 1) * x^a * ln(x)
dW_db = - (1 - x^a)^b * ln(1 - x^a)
dL_da = dL_dW * dW_da
dL_db = dL_dW * dW_db
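These parameter derivatives can be verified with finite differences. The sketch below does so for one scalar input in (0, 1); the value of dL_dW is arbitrary, and for array inputs one would presumably sum the products over samples, which is an assumption about the implementation rather than something stated above.

```python
import math

def warp(x, a, b):
    return 1.0 - (1.0 - x ** a) ** b

def dW_da(x, a, b):
    return b * (1.0 - x ** a) ** (b - 1.0) * x ** a * math.log(x)

def dW_db(x, a, b):
    return -(1.0 - x ** a) ** b * math.log(1.0 - x ** a)

x, a, b, h = 0.3, 2.0, 3.0, 1e-6
# Central finite differences in a and b as an independent check.
num_da = (warp(x, a + h, b) - warp(x, a - h, b)) / (2.0 * h)
num_db = (warp(x, a, b + h) - warp(x, a, b - h)) / (2.0 * h)

# Chain rule for a single input; dL_dW = 0.5 is an arbitrary illustrative value.
dL_dW = 0.5
dL_da = dL_dW * dW_da(x, a, b)
dL_db = dL_dW * dW_db(x, a, b)
```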

GPy.util.linalg module

DSYR(*args, **kwargs)[source]
DSYR_blas(A, x, alpha=1.0)[source]

Performs a symmetric rank-1 update operation: A <- A + alpha * np.dot(x,x.T)

Parameters:
  • A – Symmetric NxN np.array
  • x – Nx1 np.array
  • alpha – scalar
DSYR_numpy(A, x, alpha=1.0)[source]

Performs a symmetric rank-1 update operation: A <- A + alpha * np.dot(x,x.T)

Parameters:
  • A – Symmetric NxN np.array
  • x – Nx1 np.array
  • alpha – scalar
backsub_both_sides(L, X, transpose='left')[source]

Return L^-T * X * L^-1, assuming X is symmetric and L is a lower Cholesky factor
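The quantity L^-T * X * L^-1 can be obtained with two linear solves instead of forming an explicit inverse. The sketch below uses the general solver np.linalg.solve for brevity; a real implementation would more likely use a triangular solver such as dtrtrs, and the variable names and test matrices here are illustrative.

```python
import numpy as np

rng = np.random.RandomState(0)
B = rng.randn(4, 4)
K = B.dot(B.T) + 4.0 * np.eye(4)      # symmetric positive definite
X = B + B.T                           # symmetric
L = np.linalg.cholesky(K)             # lower triangular, K = L L^T

# Solve L^T Y = X, so Y = L^-T X.
Y = np.linalg.solve(L.T, X)
# Solve Z L = Y, i.e. L^T Z^T = Y^T, so Z = Y L^-1 = L^-T X L^-1.
Z = np.linalg.solve(L.T, Y.T).T

# Reference value using an explicit inverse, for comparison only.
Linv = np.linalg.inv(L)
expected = Linv.T.dot(X).dot(Linv)
```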

dpotri(A, lower=1)[source]

Wrapper for lapack dpotri function

DPOTRI - compute the inverse of a real symmetric positive definite matrix A using the Cholesky factorization A = U**T*U or A = L*L**T computed by DPOTRF
Parameters:
  • A – Matrix A
  • lower – is matrix lower (true) or upper (false)
Returns:

A inverse

dpotrs(A, B, lower=1)[source]

Wrapper for lapack dpotrs function

Parameters:
  • A – Matrix A
  • B – Matrix B
  • lower – is matrix lower (true) or upper (false)

dtrtri(L)[source]

Inverts a Cholesky lower triangular matrix

Parameters:L – lower triangular matrix
Return type:inverse of L
dtrtrs(A, B, lower=1, trans=0, unitdiag=0)[source]

Wrapper for lapack dtrtrs function

DTRTRS solves a triangular system of the form

A * X = B or A**T * X = B,

where A is a triangular matrix of order N, and B is an N-by-NRHS matrix. A check is made to verify that A is nonsingular.

Parameters:
  • A – Matrix A(triangular)
  • B – Matrix B
  • lower – is matrix lower (true) or upper (false)
Returns:

Solution to A * X = B or A**T * X = B

force_F_ordered(A)[source]

Return an F-ordered version of A, assuming A is triangular

force_F_ordered_symmetric(A)[source]

Return an F-ordered version of A, assuming A is symmetric

ij_jlk_to_ilk(A, B)[source]

Faster version of np.einsum('ij,jlk->ilk', A, B)

ijk_jlk_to_il(A, B)[source]

Faster version of np.einsum('ijk,jlk->il', A, B)

ijk_ljk_to_ilk(A, B)[source]

Faster version of np.einsum('ijk,ljk->ilk', A, B)

I.e. A.dot(B.T) for every slice along the last dimension
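The equivalence between the 'ijk,ljk->ilk' contraction and a per-slice matrix product can be checked directly in numpy. The array shapes below are arbitrary, and the loop is only a reference formulation, not the optimized routine.

```python
import numpy as np

rng = np.random.RandomState(0)
A = rng.randn(3, 4, 5)
B = rng.randn(2, 4, 5)

reference = np.einsum('ijk,ljk->ilk', A, B)

# Same result, one matrix product per slice along the last axis.
per_slice = np.empty((3, 2, 5))
for k in range(5):
    per_slice[:, :, k] = A[:, :, k].dot(B[:, :, k].T)
```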

jitchol(A, maxtries=5)[source]
mdot(*args)[source]

Multiply all the arguments using matrix product rules. The output is equivalent to multiplying the arguments one by one from left to right using dot(). Precedence can be controlled by creating tuples of arguments, for instance mdot(a,((b,c),d)) computes a*((b*c)*d). Note that this means the output of dot(a,b) and mdot(a,b) will differ if a or b is a pure tuple of numbers.
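The tuple-as-parentheses convention can be illustrated with a short recursive sketch. The name mdot_sketch is hypothetical and this is not the GPy source; it only mirrors the behaviour described above.

```python
from functools import reduce
import numpy as np

def mdot_sketch(*args):
    """Left-to-right matrix product; nested tuples act as parentheses."""
    # Evaluate any tuple argument first, then chain np.dot from the left.
    evaluated = [mdot_sketch(*a) if isinstance(a, tuple) else a for a in args]
    return reduce(np.dot, evaluated)

rng = np.random.RandomState(0)
a, b, c, d = (rng.randn(3, 3) for _ in range(4))

flat = mdot_sketch(a, b, c, d)            # ((a b) c) d
nested = mdot_sketch(a, ((b, c), d))      # a ((b c) d)
```

Because tuples are interpreted as grouping, a tuple of plain numbers passed as an argument is multiplied out rather than converted to an array, which is the difference from dot() noted above.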

multiple_pdinv(A)[source]
Parameters:A – A DxDxN numpy array (each A[:,:,i] is pd)
Returns:
  • invs (np.ndarray) – the inverses of A
  • hld (np.array) – 0.5 * the log of the determinants of A
pca(Y, input_dim)[source]

Principal component analysis: maximum likelihood solution by SVD

Parameters:
  • Y – NxD np.array of data
  • input_dim – int, dimension of projection
Returns:
  • X – N x input_dim np.array of dimensionality reduced data
  • W – input_dim x D mapping from X to Y
pddet(A)[source]

Determinant of a positive definite matrix; only symmetric matrices are supported

pdinv(A, *args)[source]
Parameters:A – A DxD pd numpy array
Returns:
  • Ai (np.ndarray) – the inverse of A
  • L (np.ndarray) – the Cholesky decomposition of A
  • Li (np.ndarray) – the Cholesky decomposition of Ai
  • logdet (float64) – the log of the determinant of A
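The four return values are related through the Cholesky factor, as the plain numpy sketch below shows. It is not the GPy implementation (which wraps LAPACK routines). Li is taken here to be the inverse of L, which satisfies Li^T Li = Ai; that reading of the docstring is an assumption, as is the name pdinv_sketch.

```python
import numpy as np

def pdinv_sketch(A):
    """Return (Ai, L, Li, logdet) for a symmetric positive definite A."""
    L = np.linalg.cholesky(A)                  # A = L L^T, L lower triangular
    Li = np.linalg.inv(L)                      # assumed meaning of Li: L^-1
    Ai = Li.T.dot(Li)                          # A^-1 = L^-T L^-1
    logdet = 2.0 * np.sum(np.log(np.diag(L)))  # log|A| = 2 * sum(log(L_ii))
    return Ai, L, Li, logdet

A = np.array([[4.0, 2.0], [2.0, 3.0]])
Ai, L, Li, logdet = pdinv_sketch(A)
```

Computing the log determinant from the diagonal of L avoids the overflow and underflow that a direct determinant can suffer for large matrices.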
ppca(Y, Q, iterations=100)[source]

EM implementation for probabilistic pca.

Parameters:
  • Y (array-like) – Observed Data
  • Q (int) – Dimensionality for reduced array
  • iterations (int) – number of iterations for EM
symmetrify(A, upper=False)[source]

Take the square matrix A and make it symmetrical by copying elements from the lower half to the upper

works IN PLACE.

note: tries to use cython, falls back to a slower numpy version

tdot(*args, **kwargs)[source]
tdot_blas(mat, out=None)[source]

returns np.dot(mat, mat.T), but faster for large 2D arrays of doubles.

tdot_numpy(mat, out=None)[source]
trace_dot(a, b)[source]

Efficiently compute the trace of the matrix product of a and b
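One standard way to get this trace without forming the full product is the identity trace(ab) = sum over i, j of a[i, j] * b[j, i]. The sketch below compares the two; whether GPy uses exactly this expression is not stated here, so treat it as an illustration.

```python
import numpy as np

rng = np.random.RandomState(0)
a = rng.randn(50, 30)
b = rng.randn(30, 50)

naive = np.trace(a.dot(b))        # forms the full 50x50 product first
fast = np.sum(a * b.T)            # sums a[i, j] * b[j, i] directly
```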

GPy.util.linalg_cython module

GPy.util.linalg_gpu module

GPy.util.ln_diff_erfs module

ln_diff_erfs(x1, x2, return_sign=False)[source]

Compute the log of the difference of two erfs in a numerically stable manner.

Parameters:
  • x1 (ndarray) – argument of the positive erf
  • x2 (ndarray) – argument of the negative erf
Returns:

tuple containing (log(abs(erf(x1) - erf(x2))), sign(erf(x1) - erf(x2)))

Based on MATLAB code that was written by Antti Honkela and modified by David Luengo and originally derived from code by Neil Lawrence.

GPy.util.misc module

blockify_dhess_dtheta(func)[source]
blockify_hessian(func)[source]
blockify_third(func)[source]
chain_1(df_dg, dg_dx)[source]

Generic chaining function for first derivative

\[\frac{d(f . g)}{dx} = \frac{df}{dg} \frac{dg}{dx}\]
chain_2(d2f_dg2, dg_dx, df_dg, d2g_dx2)[source]

Generic chaining function for second derivative

\[\frac{d^{2}(f . g)}{dx^{2}} = \frac{d^{2}f}{dg^{2}}(\frac{dg}{dx})^{2} + \frac{df}{dg}\frac{d^{2}g}{dx^{2}}\]
chain_3(d3f_dg3, dg_dx, d2f_dg2, d2g_dx2, df_dg, d3g_dx3)[source]

Generic chaining function for third derivative

\[\frac{d^{3}(f . g)}{dx^{3}} = \frac{d^{3}f}{dg^{3}}(\frac{dg}{dx})^{3} + 3\frac{d^{2}f}{dg^{2}}\frac{dg}{dx}\frac{d^{2}g}{dx^{2}} + \frac{df}{dg}\frac{d^{3}g}{dx^{3}}\]
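The three formulas can be checked on a concrete composition. The sketch below re-implements them for scalars directly from the equations above (the GPy functions may operate on arrays) and compares them with the closed-form derivatives of sin(x^2).

```python
import math

def chain_1(df_dg, dg_dx):
    return df_dg * dg_dx

def chain_2(d2f_dg2, dg_dx, df_dg, d2g_dx2):
    return d2f_dg2 * dg_dx ** 2 + df_dg * d2g_dx2

def chain_3(d3f_dg3, dg_dx, d2f_dg2, d2g_dx2, df_dg, d3g_dx3):
    return (d3f_dg3 * dg_dx ** 3
            + 3.0 * d2f_dg2 * dg_dx * d2g_dx2
            + df_dg * d3g_dx3)

# f(g) = sin(g), g(x) = x^2, evaluated at x = 0.7.
x = 0.7
g = x ** 2
dg, d2g, d3g = 2.0 * x, 2.0, 0.0
df, d2f, d3f = math.cos(g), -math.sin(g), -math.cos(g)

first = chain_1(df, dg)
second = chain_2(d2f, dg, df, d2g)
third = chain_3(d3f, dg, d2f, d2g, df, d3g)
```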
kmm_init(X, m=10)[source]

This is the same initialization algorithm that is used in Kmeans++. It’s quite simple and very useful to initialize the locations of the inducing points in sparse GPs.

Parameters:
  • X – data
  • m – number of inducing points
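For readers unfamiliar with the seeding scheme, the following is a sketch of standard k-means++ initialization in pure Python. The function name, the seed argument and the use of tuples for points are illustrative choices; details of the GPy routine may differ.

```python
import random

def kmeanspp_init(X, m, seed=0):
    """Pick m rows of X as initial centres using k-means++ seeding."""
    rng = random.Random(seed)
    centres = [rng.choice(X)]                       # first centre: uniform
    while len(centres) < m:
        # Squared distance from each point to its nearest chosen centre.
        d2 = [min(sum((p - q) ** 2 for p, q in zip(x, c)) for c in centres)
              for x in X]
        # Sample the next centre with probability proportional to d2.
        centres.append(rng.choices(X, weights=d2, k=1)[0])
    return centres

X = [(0.0, 0.0), (0.1, 0.0), (5.0, 5.0), (5.1, 5.0), (10.0, 0.0)]
Z = kmeanspp_init(X, m=3)
```

Points already chosen have zero weight, so the selected centres are distinct as long as m does not exceed the number of distinct points.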
linear_grid(D, n=100, min_max=(-100, 100))[source]

Creates a D-dimensional grid of n linearly spaced points

Parameters:
  • D – dimension of the grid
  • n – number of points
  • min_max – (min, max) list
opt_wrapper(m, **kwargs)[source]

This function just wraps the optimization procedure of a GPy object so that optimize() is pickleable (necessary for multiprocessing).

param_to_array(*param)[source]

Convert an arbitrary number of parameters to ndarray objects. This is for converting parameter objects to numpy arrays when using the scipy.weave.inline routine. In scipy.weave.blitz there is no automatic array detection (even when the array inherits from ndarray).

safe_cube(f)[source]
safe_exp(f)[source]
safe_quad(f)[source]
safe_square(f)[source]
safe_three_times(f)[source]

GPy.util.mocap module

class acclaim_skeleton(file_name=None)[source]

Bases: GPy.util.mocap.skeleton

get_child_xyz(ind, channels)[source]
load_channels(file_name)[source]
load_skel(file_name)[source]

Loads an ASF file into a skeleton structure.

Parameters:file_name – The file name to load in.
read_bonedata(fid)[source]

Read bone data from an acclaim skeleton file stream.

read_channels(fid)[source]

Read channels from an acclaim file.

read_documentation(fid)[source]

Read documentation from an acclaim skeleton file stream.

read_hierarchy(fid)[source]

Read hierarchy information from acclaim skeleton file stream.

read_line(fid)[source]

Read a line from a file string and check it isn’t either empty or commented before returning.

read_root(fid)[source]

Read the root node from an acclaim skeleton file stream.

read_skel(fid)[source]

Loads an acclaim skeleton format from a file stream.

read_units(fid)[source]

Read units from an acclaim skeleton file stream.

resolve_indices(index, start_val)[source]

Get indices for the skeleton from the channels when loading in channel data.

save_channels(file_name, channels)[source]
set_rotation_matrices()[source]

Set the meta information at each vertex to contain the correct matrices C and Cinv as prescribed by the rotations and rotation orders.

to_xyz(channels)[source]
writ_channels(fid, channels)[source]
class skeleton[source]

Bases: GPy.util.mocap.tree

connection_matrix()[source]
finalize()[source]

After loading in a skeleton ensure parents are correct, vertex orders are correct and rotation matrices are correct.

smooth_angle_channels(channels)[source]

Remove discontinuities in angle channels so that they don’t cause artifacts in algorithms that rely on the smoothness of the functions.

to_xyz(channels)[source]
class tree[source]

Bases: object

branch_str(index, indent='')[source]
find_children()[source]

Take a tree and set the children according to the parents.

Takes a tree structure which lists the parents of each vertex and computes the children for each vertex and places them in.

find_parents()[source]

Take a tree and set the parents according to the children

Takes a tree structure which lists the children of each vertex and computes the parents for each vertex and places them in.

find_root()[source]

Finds the index of the root node of the tree.

get_index_by_id(id)[source]

Give the index associated with a given vertex id.

get_index_by_name(name)[source]

Give the index associated with a given vertex name.

order_vertices()[source]

Order vertices in the graph such that parents always have a lower index than children.

swap_vertices(i, j)[source]

Swap the location of two vertices in the tree structure array.

Parameters:
  • tree – the tree for which two vertices are to be swapped.
  • i – the index of the first vertex to be swapped.
  • j – the index of the second vertex to be swapped.
Returns:

the tree structure with the two vertex locations swapped.

class vertex(name, id, parents=[], children=[], meta={})[source]

Bases: object

load_text_data(dataset, directory, centre=True)[source]

Load in a data set of marker points from the Ohio State University C3D motion capture files (http://accad.osu.edu/research/mocap/mocap_data.htm).

parse_text(file_name)[source]

Parse data from Ohio State University text mocap files (http://accad.osu.edu/research/mocap/mocap_data.htm).

read_connections(file_name, point_names)[source]

Read a file detailing which markers should be connected to which for motion capture data.

rotation_matrix(xangle, yangle, zangle, order='zxy', degrees=False)[source]

Compute the rotation matrix for an angle in each direction. This is a helper function for computing the rotation matrix for a given set of angles in a given order.

Parameters:
  • xangle – rotation for x-axis.
  • yangle – rotation for y-axis.
  • zangle – rotation for z-axis.
  • order – the order for the rotations.

GPy.util.multioutput module

ICM(input_dim, num_outputs, kernel, W_rank=1, W=None, kappa=None, name='ICM')[source]

Builds a kernel for an Intrinsic Coregionalization Model

Parameters:
  • input_dim – input dimensionality (does not include dimension of indices)
  • num_outputs – number of outputs
  • kernel (a GPy kernel) – kernel that will be multiplied by the coregionalize kernel (matrix B).
  • W_rank (integer) – number of tuples of the coregionalization parameters ‘W’
LCM(input_dim, num_outputs, kernels_list, W_rank=1, name='ICM')[source]

Builds a kernel for a Linear Coregionalization Model

Parameters:
  • input_dim – input dimensionality (does not include dimension of indices)
  • num_outputs – number of outputs
  • kernels_list (list of GPy kernels) – kernels that will be multiplied by the coregionalize kernel (matrix B).
  • W_rank (integer) – number of tuples of the coregionalization parameters ‘W’
Private(input_dim, num_outputs, kernel, output, kappa=None, name='X')[source]

Builds a kernel for an Intrinsic Coregionalization Model

Parameters:
  • input_dim – input dimensionality
  • num_outputs – number of outputs
  • kernel (a GPy kernel) – kernel that will be multiplied by the coregionalize kernel (matrix B).
  • W_rank (integer) – number of tuples of the coregionalization parameters ‘W’
build_XY(input_list, output_list=None, index=None)[source]
build_likelihood(Y_list, noise_index, likelihoods_list=None)[source]
get_slices(input_list)[source]
index_to_slices(index)[source]

Take a numpy array of integers (index) and return a nested list of slices such that the slices describe the start and stop points for each integer in the index.

e.g.

>>> index = np.asarray([0,0,0,1,1,1,2,2,2])

returns

>>> [[slice(0,3,None)],[slice(3,6,None)],[slice(6,9,None)]]

or, a more complicated example

>>> index = np.asarray([0,0,1,1,0,2,2,2,1,1])

returns

>>> [[slice(0,2,None),slice(4,5,None)],[slice(2,4,None),slice(8,10,None)],[slice(5,8,None)]]
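The run-grouping logic behind these examples can be written in a few lines of pure Python. This sketch is a re-implementation for illustration (the name index_to_slices_sketch is hypothetical) and assumes a non-empty sequence of non-negative integers.

```python
def index_to_slices_sketch(index):
    """Group runs of equal integers into slices, one list per integer value."""
    index = list(index)
    slices = [[] for _ in range(max(index) + 1)]
    start = 0
    for pos in range(1, len(index) + 1):
        # Close the current run at the end of the data or when the value changes.
        if pos == len(index) or index[pos] != index[start]:
            slices[index[start]].append(slice(start, pos))
            start = pos
    return slices

simple = index_to_slices_sketch([0, 0, 0, 1, 1, 1, 2, 2, 2])
mixed = index_to_slices_sketch([0, 0, 1, 1, 0, 2, 2, 2, 1, 1])
```

Both calls reproduce the nested lists shown in the two examples above.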

GPy.util.netpbmfile module

Read image data from, and write image data to, Netpbm files.

This implementation follows the Netpbm format specifications at http://netpbm.sourceforge.net/doc/. No gamma correction is performed.

The following image formats are supported: PBM (bi-level), PGM (grayscale), PPM (color), PAM (arbitrary), XV thumbnail (RGB332, read-only).

Author:Christoph Gohlke
Organization:Laboratory for Fluorescence Dynamics, University of California, Irvine
Version:2013.01.18

Requirements

Examples

>>> im1 = numpy.array([[0, 1],[65534, 65535]], dtype=numpy.uint16)
>>> imsave('_tmp.pgm', im1)
>>> im2 = imread('_tmp.pgm')
>>> assert numpy.all(im1 == im2)
class NetpbmFile(arg=None, **kwargs)[source]

Bases: object

Read and write Netpbm PAM, PBM, PGM and PPM files.

Initialize instance from filename, open file, or numpy array.

asarray(copy=True, cache=False, **kwargs)[source]

Return image data from file as numpy array.

close()[source]

Close open file. Future asarray calls might fail.

write(arg, **kwargs)[source]

Write instance to file.

imread(filename, *args, **kwargs)[source]

Return image data from Netpbm file as numpy array.

args and kwargs are arguments to NetpbmFile.asarray().

>>> image = imread('_tmp.pgm')
imsave(filename, data, maxval=None, pam=False)[source]

Write image data to Netpbm file.

>>> image = numpy.array([[0, 1],[65534, 65535]], dtype=numpy.uint16)
>>> imsave('_tmp.pgm', image)

GPy.util.normalizer module

Created on Aug 27, 2014

@author: Max Zwiessele

class Standardize[source]

Bases: GPy.util.normalizer._Norm

inverse_covariance(covariance)[source]

Convert scaled covariance to unscaled.

Parameters:
  • covariance – numpy array of shape (n, n)
Returns:

covariance – numpy array of shape (n, n, m) where m is number of outputs
inverse_mean(X)[source]

Project the normalized object X into space of Y

inverse_variance(var)[source]
normalize(Y)[source]

Project Y into normalized space

scale_by(Y)[source]

Use data matrix Y as normalization space to work in.

scaled()[source]

Whether this Norm object has been initialized.

to_dict()[source]

Convert the object into a JSON-serializable dictionary.

Note: It uses the private method _save_to_input_dict of the parent.

Returns: JSON-serializable dictionary containing the information needed to instantiate the object
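
The methods of Standardize can be read as a zero-mean, unit-variance standardization applied to each output column. The sketch below illustrates that reading with plain numpy. It is an assumption about the behaviour, not GPy's code, and the class name StandardizeSketch is hypothetical.

```python
import numpy as np

class StandardizeSketch:
    """Illustrative per-column standardizer (hypothetical, not GPy's class)."""

    def __init__(self):
        self.mean = None
        self.std = None

    def scale_by(self, Y):
        # Remember the column statistics of Y as the normalization space.
        Y = np.asarray(Y, dtype=float)
        self.mean = Y.mean(axis=0)
        self.std = Y.std(axis=0)

    def scaled(self):
        # True once scale_by has been called.
        return self.mean is not None

    def normalize(self, Y):
        return (np.asarray(Y, dtype=float) - self.mean) / self.std

    def inverse_mean(self, X):
        return np.asarray(X, dtype=float) * self.std + self.mean

    def inverse_variance(self, var):
        # Variances scale with the square of the standard deviation.
        return np.asarray(var, dtype=float) * self.std ** 2

    def inverse_covariance(self, covariance):
        # (n, n) -> (n, n, m): one rescaled copy per output column.
        return np.asarray(covariance, dtype=float)[:, :, None] * self.std ** 2


Y = np.array([[1.0, 10.0], [2.0, 30.0], [3.0, 50.0]])
norm = StandardizeSketch()
norm.scale_by(Y)
Z = norm.normalize(Y)
```

Under these assumptions, inverse_mean(normalize(Y)) recovers Y, and a shared (n, n) covariance expands to one (n, n) slice per output.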

GPy.util.parallel module

Tools for parallelization using MPI.

divide_data(datanum, rank, size)[source]
get_id_within_node(comm=None)[source]
optimize_parallel(model, optimizer=None, messages=True, max_iters=1000, outpath='.', interval=100, name=None, **kwargs)[source]

GPy.util.pca module

Created on 10 Sep 2012

@author: Max Zwiessele @copyright: Max Zwiessele 2012

class PCA(X)[source]

Bases: object

PCA module with automatic primal/dual determination.

center(X)[source]

Center X in PCA space.

plot_2d(X, labels=None, s=20, marker='o', dimensions=(0, 1), ax=None, colors=None, fignum=None, cmap=None, **kwargs)[source]

Plot the principal components selected by dimensions against each other in PC space, using the given labels. labels can be any sequence of length X.shape[0]. The labels can be drawn with a subsequent call to legend().

plot_fracs(Q=None, ax=None, fignum=None)[source]

Plot the fractions of the eigenvalues, sorted in descending order.

project(X, Q=None)[source]

Project X into the PCA space spanned by the eigenvectors of the Q largest eigenvalues: Y = X dot V.
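
The projection Y = X dot V can be sketched with numpy as follows. This is a minimal illustration that assumes the primal (covariance-based) formulation only; it does not reproduce the automatic primal/dual selection of the PCA class, and the function name pca_project is hypothetical.

```python
import numpy as np

def pca_project(X, Q):
    """Project X onto the eigenvectors of its Q largest covariance eigenvalues."""
    Xc = X - X.mean(axis=0)                  # center the data
    cov = Xc.T @ Xc / (X.shape[0] - 1)       # primal (D x D) covariance
    eigvals, eigvecs = np.linalg.eigh(cov)   # eigh returns ascending eigenvalues
    order = np.argsort(eigvals)[::-1][:Q]    # indices of the Q largest
    V = eigvecs[:, order]
    return Xc @ V, eigvals[order]            # Y = X dot V

rng = np.random.default_rng(0)
X = rng.normal(size=(50, 3)) * np.array([5.0, 1.0, 0.1])
Y, top = pca_project(X, Q=2)
```

Each column of Y then has a sample variance equal to the corresponding eigenvalue, which is what plot_fracs would be expected to summarize.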

GPy.util.quad_integrate module

Utilities related to integration by quadrature methods. It will contain an implementation of Gauss-Kronrod integration.

getSubs(Subs, XK, NK=1)[source]
quadgk_int(f, fmin=-inf, fmax=inf, difftol=0.1)[source]

Integrate f from fmin to fmax. The underlying quadgk routine works on the interval from -1 to +1, so the integration uses the substitution x = r / (1 - r**2): as r goes from -1 to 1, x goes from -inf to inf, which maps (-inf, inf) onto (-1, 1).
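
The substitution can be checked numerically. The sketch below uses a plain midpoint rule on (-1, 1) in place of Gauss-Kronrod quadrature, purely to keep the example short; the function names are hypothetical. The Jacobian follows from differentiating x = r / (1 - r**2), which gives dx/dr = (1 + r**2) / (1 - r**2)**2.

```python
import math

def integrate_real_line(f, n=2000):
    """Approximate the integral of f over (-inf, inf) via x = r / (1 - r**2)."""
    h = 2.0 / n
    total = 0.0
    for i in range(n):
        r = -1.0 + (i + 0.5) * h                       # midpoint, never exactly +-1
        one_minus_r2 = 1.0 - r * r
        x = r / one_minus_r2                           # point on the real line
        jacobian = (1.0 + r * r) / one_minus_r2 ** 2   # dx/dr
        total += f(x) * jacobian * h
    return total

def std_normal_pdf(x):
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

# The standard normal density should integrate to approximately one.
area = integrate_real_line(std_normal_pdf)
```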

quadvgk(feval, fmin, fmax, tol1=1e-05, tol2=1e-05)[source]

This numpy implementation makes use of the code at http://se.mathworks.com/matlabcentral/fileexchange/18801-quadvgk. It uses Gauss-Kronrod integration, as already used in gpstuff for evaluating one-dimensional integrals. The quadrature is vectorised, which means that several functions can be evaluated at the same time over a grid of points.
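
To illustrate what "vectorised" means here, the sketch below integrates several functions over one shared grid of points. It swaps in a fixed-order Gauss-Legendre rule for the adaptive Gauss-Kronrod scheme, so it has no error estimate and no tolerances; the function name and the order argument are hypothetical.

```python
import numpy as np

def vectorised_gauss_legendre(feval, fmin, fmax, order=32):
    """Integrate every row of feval(x) over [fmin, fmax] on one shared grid."""
    nodes, weights = np.polynomial.legendre.leggauss(order)  # nodes on [-1, 1]
    half_width = 0.5 * (fmax - fmin)
    x = half_width * nodes + 0.5 * (fmax + fmin)             # map to [fmin, fmax]
    values = feval(x)                                        # shape (k, order)
    return half_width * (values @ weights)                   # shape (k,)

def feval(x):
    # Three integrands evaluated on the same grid of points.
    return np.vstack([np.ones_like(x), x ** 2, np.sin(x)])

result = vectorised_gauss_legendre(feval, 0.0, np.pi)
```

All three integrals are obtained from a single call to feval, which is the property the adaptive routine exploits as well.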

GPy.util.squashers module

sigmoid(x)[source]
single_softmax(x)[source]
softmax(x)[source]

GPy.util.subarray_and_sorting module

Module author: Max Zwiessele <ibinbei@gmail.com>

common_subarrays(X, axis=0)[source]

Find common subarrays of the 2-dimensional array X, where axis is the axis to apply the search over. Common subarrays are returned as a dictionary of <subarray, [index]> pairs, where subarray is a tuple representing the subarray and [index] is the list of positions at which that subarray occurs in X.

Parameters:
  • X (np.ndarray) – 2d array to check for common subarrays in
  • axis (int) – axis to apply subarray detection over

When axis is 0, rows are compared; otherwise, columns are compared.

In a 2d array:

>>> import numpy as np
>>> X = np.zeros((3,6), dtype=bool)
>>> X[[1,1,1],[0,4,5]] = 1; X[1:,[2,3]] = 1
>>> X
array([[False, False, False, False, False, False],
       [ True, False,  True,  True,  True,  True],
       [False, False,  True,  True, False, False]], dtype=bool)
>>> d = common_subarrays(X,axis=1)
>>> len(d)
3
>>> X[:, d[tuple(X[:,0])]]
array([[False, False, False],
       [ True,  True,  True],
       [False, False, False]], dtype=bool)
>>> d[tuple(X[:,4])] == d[tuple(X[:,0])] == [0, 4, 5]
True
>>> d[tuple(X[:,1])]
[1]
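
The grouping behind this doctest can be sketched in a few lines. This is a minimal illustration, not GPy's implementation; the function name group_identical_slices is hypothetical, and the keys are tuples of plain Python values.

```python
from collections import defaultdict

import numpy as np

def group_identical_slices(X, axis=0):
    """Map each distinct row (axis=0) or column (axis=1) of X to its indices."""
    groups = defaultdict(list)
    slices = X if axis == 0 else X.T
    for index, subarray in enumerate(slices):
        # Tuples are hashable, so identical subarrays share one dictionary key.
        groups[tuple(subarray.tolist())].append(index)
    return dict(groups)

X = np.zeros((3, 6), dtype=bool)
X[[1, 1, 1], [0, 4, 5]] = True
X[1:, [2, 3]] = True
d = group_identical_slices(X, axis=1)
```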

GPy.util.univariate_Gaussian module

cdfNormal(z)[source]

Robust implementation of the cdf of a standard normal.

See the original implementation in C by Matthias Seeger: https://github.com/mseeger/apbsint/blob/master/src/eptools/potentials/SpecfunServices.h


derivLogCdfNormal(z)[source]

Robust implementation of the derivative of the log cdf of a standard normal.

See the original implementation in C by Matthias Seeger: https://github.com/mseeger/apbsint/blob/master/src/eptools/potentials/SpecfunServices.h

inv_std_norm_cdf(x)[source]

Inverse cumulative standard Gaussian distribution, based on Winitzki, S. (2008).
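
For readers unfamiliar with the approximation, the sketch below shows the closed-form inverse error function usually attributed to Winitzki, combined with the identity that the standard normal quantile of p equals sqrt(2) * erfinv(2p - 1). The constant a = 0.147 is the value commonly quoted for this formula. Whether GPy uses the same constant or exactly this form is an assumption, and the function name is hypothetical.

```python
import math

def winitzki_inv_std_norm_cdf(p, a=0.147):
    """Approximate the standard normal quantile of p, for p in (0, 1)."""
    x = 2.0 * p - 1.0                          # argument of erfinv
    log_term = math.log(1.0 - x * x)
    t = 2.0 / (math.pi * a) + 0.5 * log_term
    # max() guards against a tiny negative value caused by rounding.
    inner = max(0.0, math.sqrt(t * t - log_term / a) - t)
    erfinv = math.copysign(math.sqrt(inner), x)
    return math.sqrt(2.0) * erfinv

z = winitzki_inv_std_norm_cdf(0.975)
```

The approximation is accurate to a few decimal places, which is why it is typically used where speed matters more than the last digits.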

logCdfNormal(z)[source]

Robust implementation of the log cdf of a standard normal.

See the original implementation in C by Matthias Seeger: https://github.com/mseeger/apbsint/blob/master/src/eptools/potentials/SpecfunServices.h
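
The need for a robust version shows up in the far lower tail, where the naive formula underflows to zero and its logarithm is undefined. The sketch below is one simple way to stay finite: it uses erfc for moderate arguments and a short asymptotic expansion for z below -10. It is not Matthias Seeger's algorithm; the threshold and the function names are assumptions made for illustration.

```python
import math

LOG_SQRT_2PI = 0.5 * math.log(2.0 * math.pi)

def log_cdf_normal(z):
    """log Phi(z) that stays finite for very negative z."""
    if z > -10.0:
        # erfc keeps precision in the lower tail where 1 + erf would cancel.
        return math.log(0.5 * math.erfc(-z / math.sqrt(2.0)))
    # Far tail: Phi(z) ~ pdf(z) / (-z) * (1 - 1/z**2 + 3/z**4).
    z2 = z * z
    series = 1.0 - 1.0 / z2 + 3.0 / (z2 * z2)
    return -0.5 * z2 - LOG_SQRT_2PI - math.log(-z) + math.log(series)

def deriv_log_cdf_normal(z):
    """d/dz log Phi(z) = pdf(z) / Phi(z), computed in log space."""
    log_pdf = -0.5 * z * z - LOG_SQRT_2PI
    return math.exp(log_pdf - log_cdf_normal(z))

# The naive cdf underflows to exactly zero at z = -40; the log version does not.
naive_is_zero = 0.5 * (1.0 + math.erf(-40.0 / math.sqrt(2.0))) == 0.0
robust_value = log_cdf_normal(-40.0)
```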
logPdfNormal(z)[source]

Robust implementation of the log pdf of a standard normal.

See the original implementation in C by Matthias Seeger: https://github.com/mseeger/apbsint/blob/master/src/eptools/potentials/SpecfunServices.h
std_norm_pdf(x)[source]

GPy.util.warping_functions module

class IdentityFunction(closed_inverse=True)[source]

Bases: GPy.util.warping_functions.WarpingFunction

Identity warping function. This is for testing and sanity check purposes and should not be used in practice. The closed_inverse flag should only be set to False for debugging and testing purposes.

f(y)[source]

Function transformation. y is a list of values (GP training data) of shape [N, 1].

fgrad_y(y)[source]

Gradient of f w.r.t. y.

fgrad_y_psi(y, return_covar_chain=False)[source]

Gradient of f w.r.t. y.

update_grads(Y_untransformed, Kiy)[source]
class LogFunction(closed_inverse=True)[source]

Bases: GPy.util.warping_functions.WarpingFunction

Easy wrapper for applying a fixed log warping function to positive-only values. The closed_inverse flag should only be set to False for debugging and testing purposes.

f(y)[source]

Function transformation. y is a list of values (GP training data) of shape [N, 1].

fgrad_y(y)[source]

Gradient of f w.r.t. y.

fgrad_y_psi(y, return_covar_chain=False)[source]

Gradient of f w.r.t. y.

update_grads(Y_untransformed, Kiy)[source]
class TanhFunction(n_terms=3, initial_y=None)[source]

Bases: GPy.util.warping_functions.WarpingFunction

This is the function proposed in Snelson et al.: A sum of tanh functions with linear trends outside the range. Notice the term ‘d’, which scales the linear trend.

n_terms specifies the number of tanh terms to be used

f(y)[source]

Transform y with f using the parameter vector psi, where psi = [[a, b, c]] holds one row per tanh term.

\(f(y) = d \cdot y + \sum_{i} a_i \tanh(b_i (y + c_i))\)
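
The formula above and its derivative with respect to y can be written out directly in numpy. This is a sketch under the assumption that psi is an array with one row [a, b, c] per tanh term and that d is a scalar; the function names are hypothetical and the code is not taken from GPy.

```python
import numpy as np

def tanh_warp(y, psi, d):
    """f(y) = d*y + sum_i a_i * tanh(b_i * (y + c_i)); psi has rows [a, b, c]."""
    a, b, c = psi[:, 0], psi[:, 1], psi[:, 2]
    # y has shape (N, 1) and broadcasts against the n_terms parameters.
    return d * y + (a * np.tanh(b * (y + c))).sum(axis=1, keepdims=True)

def tanh_warp_grad_y(y, psi, d):
    """df/dy = d + sum_i a_i * b_i * (1 - tanh(b_i * (y + c_i))**2)."""
    a, b, c = psi[:, 0], psi[:, 1], psi[:, 2]
    t = np.tanh(b * (y + c))
    return d + (a * b * (1.0 - t ** 2)).sum(axis=1, keepdims=True)

psi = np.array([[1.0, 0.5, 0.0], [0.3, 2.0, -1.0], [0.2, 1.0, 1.0]])
y = np.linspace(-2.0, 2.0, 5).reshape(-1, 1)
z = tanh_warp(y, psi, d=1.0)
```

With positive a, b and d the warp is strictly increasing, which is what makes a numerical inverse well defined.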

fgrad_y(y, return_precalc=False)[source]

Gradient of f w.r.t. y ([N x 1]).

Returns: Nx1 vector of derivatives. If return_precalc is True, the precomputed intermediate quantities are returned as well.

fgrad_y_psi(y, return_covar_chain=False)[source]

Gradient of f w.r.t. y and psi.

Returns: NxIx4 tensor of partial derivatives
update_grads(Y_untransformed, Kiy)[source]
class WarpingFunction(name)[source]

Bases: GPy.core.parameterization.parameterized.Parameterized

Abstract base class for warping functions, z = f(y).

f(y, psi)[source]

Function transformation. y is a list of values (GP training data) of shape [N, 1].

f_inv(z, max_iterations=250, y=None)[source]

Calculate the numerical inverse of f. This should be overridden for specific warping functions where the inverse can be found in closed form.

Parameters: max_iterations – maximum number of Newton-Raphson iterations
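
A Newton-Raphson inverse of this kind can be sketched as follows. The example assumes a strictly increasing warping function with a known derivative; the names newton_inverse, y0 and tol are hypothetical and do not mirror the signature of f_inv.

```python
import math

def newton_inverse(f, fgrad, z, y0=0.0, max_iterations=250, tol=1e-10):
    """Solve f(y) = z for y by Newton-Raphson, assuming f is strictly increasing."""
    y = y0
    for _ in range(max_iterations):
        step = (f(y) - z) / fgrad(y)   # Newton update from the current residual
        y -= step
        if abs(step) < tol:
            break
    return y

def f(y):
    # Toy warping function: a linear trend plus one tanh term.
    return y + math.tanh(y)

def fgrad(y):
    # 1 + (1 - tanh(y)**2), which is always at least 1.
    return 2.0 - math.tanh(y) ** 2

y_true = 0.7
y_rec = newton_inverse(f, fgrad, f(y_true))
```

Because the derivative is bounded away from zero, the update never divides by a vanishing gradient; warping functions without that property would need a safeguard.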
fgrad_y(y, psi)[source]

Gradient of f w.r.t. y.

fgrad_y_psi(y, psi)[source]

Gradient of f w.r.t. y.

plot(xmin, xmax)[source]