A Python interface to the Kepler data

If you’re here, then you probably already know about the Kepler mission. You probably also know that it can be a bit of a pain to get access to this public dataset. As I understand things, the canonical source for a catalog of planet candidates—or more precisely Kepler Objects of Interest (KOIs)—is the NASA Exoplanet Archive but the data is available through MAST. There are programmatic interfaces (APIs) available for both of these services but it can still be frustrating to interact with them in an automated way. That’s why I made kplr.

kplr provides a lightweight Pythonic interface to the catalogs and data. Below, I’ll describe the features provided by kplr, but to get started, let’s see an example of how you would find the published parameters of a KOI and download its light curve data.

import kplr
client = kplr.API()

# Find a KOI.
koi = client.koi(952.01)
print(koi.koi_period)

# This KOI has an associated star.
star = koi.star
print(star.kic_teff)

# Download the lightcurves for this KOI.
lightcurves = koi.get_light_curves()
for lc in lightcurves:
    print(lc.filename)

Installation

You can install kplr using the standard Python packaging tool pip:

pip install kplr

or (if you must) easy_install:

easy_install kplr

The development version can be installed using pip:

pip install -e git+https://github.com/dfm/kplr#egg=kplr-dev

or by cloning the GitHub repository:

git clone https://github.com/dfm/kplr.git
cd kplr
python setup.py install

API Interface

The basic workflow for interacting with the APIs starts by initializing an API object:

import kplr
client = kplr.API()

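Then, this object provides methods for constructing various queries to find KOIs, confirmed planets and Kepler Input Catalog targets, as described in the sections that follow.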

Kepler Objects of Interest

The kplr KOI search interfaces with the Exoplanet Archive API to return the most up-to-date information possible. In particular, it searches the cumulative table. As shown in the sample code at the top of this page, it is very easy to retrieve the listing for a single KOI:

koi = client.koi(952.01)

Note the .01 in the KOI ID. This is required because a KOI is specified by the full number (not just 952, which will fail). The object will have an attribute for each column listed in the Exoplanet Archive documentation. For example, the period and its error bars (positive and negative, respectively) are given by

print(koi.koi_period, koi.koi_period_err1, koi.koi_period_err2)

For KOI 952.01, this will print 5.901269, 1.7e-05, -1.7e-05.
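When reporting a value like this, it can be handy to format the asymmetric error bars together. The helper below is a small illustrative sketch: format_with_errors is a hypothetical name, not part of kplr, and it only uses the three numbers quoted above.

```python
def format_with_errors(value, err_plus, err_minus):
    """Format a value with asymmetric uncertainties as a single string.

    Hypothetical helper, not part of kplr. ``err_minus`` is expected to be
    negative, matching the sign of the ``*_err2`` value quoted in the text.
    """
    return "{0} +{1} / -{2}".format(value, abs(err_plus), abs(err_minus))


# Values quoted in the text for KOI 952.01.
summary = format_with_errors(5.901269, 1.7e-05, -1.7e-05)
print(summary)
```

With a real KOI object, the arguments would be koi.koi_period, koi.koi_period_err1 and koi.koi_period_err2.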

Finding a set of KOIs that satisfy search criteria is a little more complicated because you must provide syntax that is understood by the Exoplanet Archive. For example, to find all the KOIs with period longer than 200 days, you would run

kois = client.kois(where="koi_period>200")

At the time of writing, this should return 224 KOI objects. If you then wanted to sort by period, you could include the sort keyword argument:

kois = client.kois(where="koi_period>200", sort="koi_period")

or, equivalently,

kois = client.kois(where="koi_period>200", sort=("koi_period", 1))

You can specify the sort order to be descending by using

kois = client.kois(where="koi_period>200", sort=("koi_period", -1))
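If you already have a list of results in hand, you can also re-sort it locally instead of issuing another query, since each result exposes its columns as attributes. The sketch below uses stand-in objects built with types.SimpleNamespace in place of real KOI objects. The names and period values are made up for illustration; with real results you would pass the list returned by client.kois().

```python
from operator import attrgetter
from types import SimpleNamespace

# Stand-ins for KOI objects; these values are illustrative, not catalog data.
kois = [
    SimpleNamespace(name="A", koi_period=310.0),
    SimpleNamespace(name="B", koi_period=205.5),
    SimpleNamespace(name="C", koi_period=450.2),
]

# Ascending by period, like sort=("koi_period", 1).
ascending = sorted(kois, key=attrgetter("koi_period"))

# Descending by period, like sort=("koi_period", -1).
descending = sorted(kois, key=attrgetter("koi_period"), reverse=True)

print([k.name for k in ascending])
print([k.name for k in descending])
```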

Confirmed Planets

The confirmed planet interface queries the confirmed planets table using the MAST API. To find a specific planet using this interface, you can use the API.planet() function

planet = client.planet("32b")

or equivalently

planet = client.planet("Kepler-32b")

This object has attributes for each column given in the table in the MAST documentation. For example, the corresponding KOI name for this planet is given by

print(planet.kepoi_name)

In this case, you should see 952.01.

The query syntax on MAST is a little different from that of the Exoplanet Archive. For example, to find planets with estimated radii less than 2 Earth radii, you would run

planets = client.planets(koi_prad="<2")

where koi_prad is the name of a column in the MAST documentation table.

The syntax for sorting the results is the same as described above for the KOIs. To sort the above search by period, you would run

planets = client.planets(koi_prad="<2", sort="koi_period")

Kepler Input Catalog Targets

Access to the Kepler Input Catalog (KIC) is also provided by the MAST API’s kic10 table. It can, therefore, be queried using syntax similar to the confirmed planet table. For example, a particular star can be found as follows

star = client.star(9787239)

Similarly, a query can be run on the table using the following syntax:

stars = client.stars(kic_teff="5700..5800")

To select a set of stars in a 2MASS color range with (non-NULL) estimated temperatures, you would run something like:

stars = client.stars(kic_jkcolor="0.3..0.4", kic_teff="!\\null")

Note: by default, the API.stars() endpoint is limited to 100 results because it’s very easy to time out the MAST server if you’re not careful. To change this behavior, you can specify the max_records keyword argument:

stars = client.stars(kic_jkcolor="0.3..0.4", kic_teff="!\\null", max_records=500)
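The range and not-null strings above are easy to mistype, so one option is to build them with small helpers and pass the result as keyword arguments. This is only a sketch: mast_range and NOT_NULL are hypothetical names, not part of kplr, and they cover only the two query forms shown in this section.

```python
def mast_range(low, high):
    """Return a range expression in the "low..high" form used above."""
    return "{0}..{1}".format(low, high)


# The not-null filter used above. Written as a raw string, so it is the same
# value as the "!\\null" literal in the examples.
NOT_NULL = r"!\null"

query = {
    "kic_jkcolor": mast_range(0.3, 0.4),
    "kic_teff": NOT_NULL,
    "max_records": 500,
}
print(query)

# With a client created by kplr.API(), this would be used as:
#     stars = client.stars(**query)
```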

Data Access

Note: to interact with the Kepler data, you will need to be able to read the FITS files. kplr automatically supports loading the data using pyfits so it’s probably easiest to make sure that you have that installed before trying the examples in this section.

The MAST servers are the main source for the Kepler data products. kplr supports two types of data: light curves and target pixel files. These products are described in detail in the Kepler Archive Manual but in summary:

  • the target pixel files contain the lightly-processed CCD readouts from small fields around the telemetered Kepler targets, and
  • the light curve files contain the results of the aperture photometric pipeline applied to the pixel files and various housekeeping columns.

All of the objects described above (KOI, Planet and Star) have a get_light_curves() method and a get_target_pixel_files() method. These methods return (possibly empty) lists of LightCurve and TargetPixelFile objects, respectively. Both methods take three keyword arguments:

  • short_cadence (default True) decides whether or not the “short cadence” data should be included. If it is False, only the “long cadence” data are returned.
  • fetch (default False) controls when the data are downloaded. If it is True, the data are automatically downloaded from the MAST server if they don’t already exist locally. Otherwise, the data aren’t downloaded until the first time they are opened.
  • clobber sets the behavior when a local copy of the file already exists. If a data file has been corrupted, it can be useful to set clobber=True to make sure that the bad file is overwritten.
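The idea of downloading only when needed, with clobber as an override, can be sketched in a few lines. This is not kplr’s implementation: ensure_local and fake_download are hypothetical names, and the “download” here just writes a small file into a temporary directory.

```python
import os
import tempfile


def ensure_local(path, download, clobber=False):
    """Call ``download(path)`` only if the file is missing or clobber is set.

    Hypothetical helper illustrating the caching idea; not kplr's code.
    """
    if clobber or not os.path.exists(path):
        download(path)
    return path


calls = []


def fake_download(path):
    # Stand-in for a real download: record the call and write a small file.
    calls.append(path)
    with open(path, "w") as f:
        f.write("data")


with tempfile.TemporaryDirectory() as tmp:
    target = os.path.join(tmp, "lightcurve.fits")
    ensure_local(target, fake_download)                # missing: downloads
    ensure_local(target, fake_download)                # cached: skipped
    ensure_local(target, fake_download, clobber=True)  # forced: downloads again

print(len(calls))
```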

Below is an example of the recommended way to load a set of light curves for a particular object:

# Find the target KOI.
koi = client.koi(952.01)

# Get a list of light curve datasets.
lcs = koi.get_light_curves(short_cadence=False)

# Loop over the datasets and read in the data.
time, flux, ferr, quality = [], [], [], []
for lc in lcs:
    with lc.open() as f:
        # The light curve data are in the first extension HDU (index 1);
        # index 0 is the primary HDU.
        hdu_data = f[1].data
        time.append(hdu_data["time"])
        flux.append(hdu_data["sap_flux"])
        ferr.append(hdu_data["sap_flux_err"])
        quality.append(hdu_data["sap_quality"])
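After a loop like the one above, each list holds one array per file. A common next step is to join them and drop unusable samples. The sketch below assumes NumPy is installed and uses synthetic arrays in place of real file contents. combine_good_data is a hypothetical helper, and treating every nonzero quality flag as bad is a conservative assumption that may discard usable data.

```python
import numpy as np


def combine_good_data(time, flux, ferr, quality):
    """Concatenate per-file arrays and keep finite, unflagged samples."""
    t = np.concatenate(time)
    f = np.concatenate(flux)
    e = np.concatenate(ferr)
    q = np.concatenate(quality)
    # Assumption: any nonzero quality flag marks a sample to discard.
    good = np.isfinite(t) & np.isfinite(f) & np.isfinite(e) & (q == 0)
    return t[good], f[good], e[good]


# Synthetic stand-ins for two files' worth of data (not real Kepler values).
time = [np.array([0.0, 1.0, 2.0]), np.array([3.0, 4.0])]
flux = [np.array([10.0, np.nan, 10.2]), np.array([10.1, 9.9])]
ferr = [np.array([0.1, 0.1, 0.1]), np.array([0.1, 0.1])]
quality = [np.array([0, 0, 0]), np.array([8, 0])]

t, f, e = combine_good_data(time, flux, ferr, quality)
print(t)
```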