Kernels
George comes equipped with a suite of standard covariance functions or kernels that can be combined to build more complex models. The standard kernels fall into the following categories:
- Basic Kernels — trivial (constant or parameterless) functions,
- Radial Kernels — functions that depend only on the radial distance between points in some user-defined metric, and
- Periodic Kernels — exactly periodic functions that, when combined with a radial kernel, can model quasi-periodic signals.
Combining Kernels describes how to combine kernels to build more sophisticated models and Implementing New Kernels explains how you would go about incorporating a custom kernel.
Note: every kernel takes an optional ndim keyword that must be set to the number of input dimensions for your problem.
Implementation Details
It’s worth understanding how these kernels are implemented. Most of the hard work is done at a low level (in C++) and the Python layer is only a thin wrapper around this functionality. This makes the code fast and consistent across interfaces, but it also means that it isn’t currently possible to implement new kernel functions efficiently without recompiling the code.
Almost every kernel has hyperparameters that you can set to control its behavior, and these can be accessed via the pars property. The values in this array are in the same order as you specified them when initializing the kernel and, in the case of composite kernels (see Combining Kernels), the order goes from left to right. For example,
from george import kernels
k = 2.0 * kernels.Matern32Kernel(5.0)
print(k.pars)
# array([ 2., 5.])
In general, kernel functions have some natural parameterization, possibly different from pars, that can be useful for parameter inference. This can be accessed via the vector property and, unless otherwise specified, this will be the natural logarithm of the pars array for most kernels. So, for our previous example,
k = 2.0 * kernels.Matern32Kernel(5.0)
print(k.vector)
# array([ 0.69314718, 1.60943791])
George is smart about when it recomputes the kernel: it will only do this if you change the parameters. Therefore, the best way to make changes is by subscripting the kernel. It’s worth noting that subscripting changes the vector array (not pars) so, following up on our previous example, we can do something like
import numpy as np
k = 2.0 * kernels.Matern32Kernel(5.0)
k[0] = np.log(4.0)
print(k.pars)
# array([ 4., 5.])
k[:] = np.log([6.0, 10.0])
print(k.pars)
# array([ 6., 10.])
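The bookkeeping between pars and vector can be mimicked with plain NumPy arrays. This is only a sketch of the relationship described above (vector being the natural logarithm of pars, as is the case for most kernels); the variable names are illustrative and the snippet does not call George.

```python
import numpy as np

# Hyperparameters in their natural units, as in the examples above.
pars = np.array([2.0, 5.0])

# For most kernels the fitting vector is the natural log of pars.
vector = np.log(pars)

# Updating an entry of the vector means assigning a *log* value ...
vector[0] = np.log(4.0)

# ... and the natural-unit parameters are recovered by exponentiating.
new_pars = np.exp(vector)
```

Working in log space keeps strictly positive hyperparameters positive no matter what value an optimizer or sampler proposes for the vector.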
Note
The gradient of each kernel is given with respect to vector, not pars. This means that in most cases the gradient is taken with respect to the logarithm of the hyperparameters.
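For a hyperparameter \(p\) with vector entry \(\theta = \ln p\), the chain rule gives \(\partial k / \partial \theta = p \, \partial k / \partial p\). The sketch below checks this numerically for an illustrative kernel, \(k = a \exp(-\Delta x^2 / 2m)\). The function names and the choice of kernel are assumptions made for the example, not part of George.

```python
import numpy as np

def kernel(dx, a, m):
    """Illustrative kernel: amplitude a times an exponential-squared term
    with r^2 = dx^2 / m (m plays the role of the metric)."""
    return a * np.exp(-0.5 * dx**2 / m)

def grad_log(dx, a, m):
    """Gradient with respect to (ln a, ln m), using d/d(ln p) = p * d/dp."""
    k = kernel(dx, a, m)
    dk_da = k / a                     # derivative with respect to a
    dk_dm = k * 0.5 * dx**2 / m**2    # derivative with respect to m
    return np.array([a * dk_da, m * dk_dm])

def grad_log_numeric(dx, a, m, eps=1e-6):
    """Centered finite difference taken directly in log-parameter space."""
    theta = np.log([a, m])
    out = np.empty(2)
    for i in range(2):
        step = np.zeros(2)
        step[i] = eps
        plus = kernel(dx, *np.exp(theta + step))
        minus = kernel(dx, *np.exp(theta - step))
        out[i] = (plus - minus) / (2.0 * eps)
    return out

analytic = grad_log(1.3, 2.0, 5.0)
numeric = grad_log_numeric(1.3, 2.0, 5.0)
```

If you have an analytic gradient in natural units, multiplying each component by its own parameter value converts it to the log-space gradient.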
Basic Kernels
- class george.kernels.Kernel(*pars, **kwargs)
The abstract kernel type. Every kernel implemented in George should be a subclass of this object.
Parameters: - pars – The hyper-parameters of the kernel.
- ndim – (optional) The number of input dimensions of the kernel. (default: 1)
- class george.kernels.ConstantKernel(value, ndim=1)
This kernel returns the constant
\[k(\mathbf{x}_i,\,\mathbf{x}_j) = c\]
where \(c\) is a parameter.
Parameters: value – The constant value \(c\) in the above equation.
- class george.kernels.WhiteKernel(value, ndim=1)
This kernel returns a constant along the diagonal and zero elsewhere:
\[k(\mathbf{x}_i,\,\mathbf{x}_j) = c \, \delta_{ij}\]
where \(c\) is the parameter.
Parameters: value – The constant value \(c\) in the above equation.
- class george.kernels.DotProductKernel(ndim=1)
The dot-product kernel takes the form
\[k(\mathbf{x}_i,\,\mathbf{x}_j) = \mathbf{x}_i^{\mathrm{T}} \cdot \mathbf{x}_j\]
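The three basic kernels are simple enough to write out directly. The following NumPy sketch builds the corresponding covariance matrices from the equations above. The helper names are hypothetical and the code is independent of George's C++ implementation.

```python
import numpy as np

def constant_kernel(x1, x2, c):
    """k(x_i, x_j) = c for every pair of samples."""
    return np.full((len(x1), len(x2)), float(c))

def white_kernel(x, c):
    """k(x_i, x_j) = c * delta_ij, evaluated on a single set of samples."""
    return c * np.eye(len(x))

def dot_product_kernel(x1, x2):
    """k(x_i, x_j) = x_i^T x_j for samples stored as rows."""
    return np.atleast_2d(x1) @ np.atleast_2d(x2).T

x = np.array([[0.0, 1.0], [2.0, 3.0], [4.0, 5.0]])  # 3 samples, ndim = 2
K_const = constant_kernel(x, x, 0.5)
K_white = white_kernel(x, 0.1)
K_dot = dot_product_kernel(x, x)
```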
Radial Kernels
- class george.kernels.RadialKernel(metric, ndim=1, dim=-1, extra=[])
This kernel (and more importantly its subclasses) computes the distance between two samples in an arbitrary metric and applies a radial function to this distance.
Parameters: - metric – The specification of the metric. This can be a float, in which case the metric is considered isotropic with the variance in each dimension given by the value of metric. Alternatively, metric can be a list of variances for each dimension. In this case, it should have length ndim. The fully general (not axis-aligned) metric hasn’t been implemented yet but it’s on the to do list!
- dim – (optional) If provided, this will apply the kernel in only the specified dimension.
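One way to read the metric description is that each dimension's squared separation is divided by the corresponding variance, \(r^2 = \sum_d (x_{i,d} - x_{j,d})^2 / m_d\). The sketch below implements that reading in NumPy. The exact convention used internally is an assumption here, and scaled_r2 is a hypothetical helper rather than a George function.

```python
import numpy as np

def scaled_r2(x1, x2, metric):
    """Squared distance between rows of x1 and x2 for an axis-aligned metric.

    metric is either a scalar (isotropic) or one variance per dimension.
    """
    x1 = np.atleast_2d(x1)
    x2 = np.atleast_2d(x2)
    ndim = x1.shape[1]
    # Broadcast a scalar to every dimension; otherwise require length ndim.
    variances = np.broadcast_to(np.asarray(metric, dtype=float), (ndim,))
    diff = x1[:, None, :] - x2[None, :, :]
    return np.sum(diff**2 / variances, axis=-1)

x = np.array([[0.0, 0.0], [3.0, 4.0]])
r2_iso = scaled_r2(x, x, 1.0)            # isotropic metric
r2_axis = scaled_r2(x, x, [9.0, 16.0])   # one variance per dimension
```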
- class george.kernels.ExpKernel(metric, ndim=1, dim=-1, extra=[])
The exponential kernel is a RadialKernel where the value at a given squared radius \(r^2\) is given by:
\[k(r^2) = \exp \left ( -\sqrt{r^2} \right )\]
Parameters: metric – The custom metric specified as described in the RadialKernel description.
- class george.kernels.ExpSquaredKernel(metric, ndim=1, dim=-1, extra=[])
The exponential-squared kernel is a RadialKernel where the value at a given squared radius \(r^2\) is given by:
\[k(r^2) = \exp \left ( -\frac{r^2}{2} \right )\]
Parameters: metric – The custom metric specified as described in the RadialKernel description.
- class george.kernels.Matern32Kernel(metric, ndim=1, dim=-1, extra=[])
The Matern-3/2 kernel is a RadialKernel where the value at a given squared radius \(r^2\) is given by:
\[k(r^2) = \left( 1+\sqrt{3\,r^2} \right)\, \exp \left (-\sqrt{3\,r^2} \right )\]
Parameters: metric – The custom metric specified as described in the RadialKernel description.
- class george.kernels.Matern52Kernel(metric, ndim=1, dim=-1, extra=[])
The Matern-5/2 kernel is a RadialKernel where the value at a given squared radius \(r^2\) is given by:
\[k(r^2) = \left( 1+\sqrt{5\,r^2} + \frac{5\,r^2}{3} \right)\, \exp \left (-\sqrt{5\,r^2} \right )\]
Parameters: metric – The custom metric specified as described in the RadialKernel description.
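The four radial functions above can be compared side by side by evaluating them on a grid of \(r^2\) values. This is a minimal NumPy transcription of the equations, with hypothetical function names. It takes the already-scaled \(r^2\) as input and does not use George.

```python
import numpy as np

def exp_kernel(r2):
    return np.exp(-np.sqrt(r2))

def exp_squared_kernel(r2):
    return np.exp(-0.5 * r2)

def matern32_kernel(r2):
    s = np.sqrt(3.0 * r2)
    return (1.0 + s) * np.exp(-s)

def matern52_kernel(r2):
    s = np.sqrt(5.0 * r2)
    return (1.0 + s + 5.0 * r2 / 3.0) * np.exp(-s)

r2 = np.linspace(0.0, 9.0, 10)
values = {f.__name__: f(r2) for f in
          (exp_kernel, exp_squared_kernel, matern32_kernel, matern52_kernel)}
```

All four equal one at zero separation and decay monotonically. They differ in how smooth they are near \(r = 0\), which controls how rough the modeled functions are.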
Periodic Kernels
- class george.kernels.CosineKernel(period, ndim=1, dim=0)
The cosine kernel is given by:
\[k(\mathbf{x}_i,\,\mathbf{x}_j) = \cos\left(\frac{2\,\pi}{P}\,\left|x_i-x_j\right| \right)\]
where \(P\) is the period.
Parameters: period – The period \(P\) of the oscillation (in the same units as \(\mathbf{x}\)).
Note: A shortcoming of this kernel is that it currently only accepts a single period, so it’s not very applicable to problems with input dimension larger than one.
- class george.kernels.ExpSine2Kernel(gamma, period, ndim=1, dim=0)
The exp-sine-squared kernel is used to model stellar rotation and might be applicable in some other contexts. It is given by the equation:
\[k(\mathbf{x}_i,\,\mathbf{x}_j) = \exp \left( -\Gamma\,\sin^2\left[ \frac{\pi}{P}\,\left|x_i-x_j\right| \right] \right)\]
where \(\Gamma\) is the “scale” of the correlation and \(P\) is the period of the oscillation measured in the same units as \(\mathbf{x}\).
Parameters: - gamma – The scale \(\Gamma\) of the correlations.
- period – The period \(P\) of the oscillation (in the same units as \(\mathbf{x}\)).
- dim – (optional) The dimension along which this kernel should apply. By default, this will be the zero-th axis.
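Both periodic kernels repeat exactly when one argument is shifted by a whole period. The NumPy sketch below transcribes the two equations for one-dimensional inputs and checks that property. The function names and the test grid are illustrative choices, not part of George.

```python
import numpy as np

def cosine_kernel(x1, x2, period):
    dx = np.abs(np.subtract.outer(x1, x2))
    return np.cos(2.0 * np.pi * dx / period)

def exp_sine2_kernel(x1, x2, gamma, period):
    dx = np.abs(np.subtract.outer(x1, x2))
    return np.exp(-gamma * np.sin(np.pi * dx / period) ** 2)

t = np.linspace(0.0, 10.0, 50)
P = 2.5
K_cos = cosine_kernel(t, t, P)
K_es2 = exp_sine2_kernel(t, t, gamma=1.5, period=P)

# Shifting one argument by a whole period leaves both kernels unchanged.
shift_cos = cosine_kernel(t + P, t, P)
shift_es2 = exp_sine2_kernel(t + P, t, gamma=1.5, period=P)
```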
Combining Kernels
More complicated kernels can be constructed by algebraically combining the basic kernels listed in the previous sections. In particular, all the kernels support addition and multiplication. For example, an exponential-squared kernel with a non-trivial variance can be constructed as follows:
from george import kernels
kernel = 1e-3 * kernels.ExpSquaredKernel(3.4)
This is equivalent to:
kernel = kernels.Product(kernels.ConstantKernel(1e-3),
                         kernels.ExpSquaredKernel(3.4))
As demonstrated in Tutorial: setting the hyperparameters, a mixture of kernels can be implemented with addition:
k1 = 1e-3 * kernels.ExpSquaredKernel(3.4)
k2 = 1e-4 * kernels.Matern32Kernel(14.53)
kernel = k1 + k2
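At the level of covariance matrices, adding or multiplying kernels corresponds to element-wise addition or multiplication of the matrices they produce. The sketch below reproduces the mixture above with plain NumPy, assuming \(r^2 = \Delta t^2 / \mathrm{metric}\) for the radial kernels. The helper functions are hypothetical stand-ins, not George calls.

```python
import numpy as np

def exp_squared(t, metric):
    r2 = np.subtract.outer(t, t) ** 2 / metric
    return np.exp(-0.5 * r2)

def matern32(t, metric):
    s = np.sqrt(3.0 * np.subtract.outer(t, t) ** 2 / metric)
    return (1.0 + s) * np.exp(-s)

t = np.linspace(0.0, 10.0, 30)

# Mirrors k1 + k2 from the text: a scalar multiplies every matrix element
# and "+" adds the two covariance matrices element by element.
K1 = 1e-3 * exp_squared(t, 3.4)
K2 = 1e-4 * matern32(t, 14.53)
K = K1 + K2

# "*" between two kernels multiplies their matrices element by element.
K_prod = exp_squared(t, 3.4) * matern32(t, 14.53)
```

Sums and products of valid kernels are themselves valid kernels, which is why these combinations can be used freely.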
Implementing New Kernels
Implementing custom kernels in George is a bit of a pain in the current version. For now, the only way to do it is with the PythonKernel where you provide a Python function that computes the value of the kernel function at a single pair of training points.
- class george.kernels.PythonKernel(f, g=None, pars=(), dx=1.234e-06, ndim=1)
A custom kernel evaluated in Python. The gradient is optionally evaluated numerically. For big problems, this type of kernel will probably be unbearably slow because each evaluation is done point-wise. Unfortunately, this is the only way to implement custom kernels without re-compiling George. Hopefully we can solve this in the future!
Parameters: - f – A callable that evaluates the kernel function given arguments (x1, x2, p) where x1 and x2 are numpy arrays defining the coordinates of the samples and p is the numpy array giving the current settings of the parameters.
- g – (optional) A function with the same calling parameters as f but it should return the numpy array with the gradient of the kernel function. If this function isn’t given then the gradient is evaluated using a centered finite difference.
- pars – (optional) The initial list of parameter values. If this isn’t provided then the kernel is assumed to have no parameters.
- dx – (optional) The step size used for the gradient computation when using finite difference.
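As a rough illustration of what the f and g callables might look like, the sketch below defines a point-wise kernel, its analytic gradient, and a centered finite difference that approximates the gradient when g is not supplied. The example kernel is hypothetical. Treating the gradient as taken with respect to the entries of p exactly as they are passed in is an assumption, and centered_difference is a stand-in written for this example rather than George's internal routine.

```python
import numpy as np

def f(x1, x2, p):
    """Kernel value for a single pair of samples.

    Hypothetical example kernel: p[0] * exp(-0.5 * |x1 - x2|^2 / p[1]).
    """
    r2 = np.sum((np.asarray(x1) - np.asarray(x2)) ** 2)
    return p[0] * np.exp(-0.5 * r2 / p[1])

def g(x1, x2, p):
    """Analytic gradient of f with respect to each entry of p."""
    r2 = np.sum((np.asarray(x1) - np.asarray(x2)) ** 2)
    e = np.exp(-0.5 * r2 / p[1])
    return np.array([e, p[0] * e * 0.5 * r2 / p[1] ** 2])

def centered_difference(f, x1, x2, p, dx=1.234e-6):
    """Numerical stand-in for g: perturb each parameter by +/- dx."""
    p = np.array(p, dtype=float)
    grad = np.empty(len(p))
    for i in range(len(p)):
        step = np.zeros(len(p))
        step[i] = dx
        grad[i] = (f(x1, x2, p + step) - f(x1, x2, p - step)) / (2.0 * dx)
    return grad

x1, x2 = np.array([0.3]), np.array([1.1])
p = np.array([2.0, 0.5])
analytic = g(x1, x2, p)
numeric = centered_difference(f, x1, x2, p)
```

Following the signature documented above, such callables would be handed over as kernels.PythonKernel(f, g=g, pars=[2.0, 0.5]). Supplying g avoids two extra evaluations of f per parameter for every pair of points.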