Fast and scalable Gaussian process modeling with applications to astronomical time series
The growing field of large-scale time domain astronomy requires methods for
probabilistic data analysis that are computationally tractable, even with large
datasets. Gaussian Processes are a popular class of models used for this
purpose but, since the computational cost scales, in general, as the cube of
the number of data points, their application has been limited to small
datasets. In this paper, we present a novel method for Gaussian Process
modeling in one dimension where the computational requirements scale linearly
with the size of the dataset. We demonstrate the method by applying it to
simulated and real astronomical time series datasets. These demonstrations are
examples of probabilistic inference of stellar rotation periods, asteroseismic
oscillation spectra, and transiting planet parameters. The method exploits
structure in the problem when the covariance function is expressed as a mixture
of complex exponentials, without requiring evenly spaced observations or
uniform noise. This form of covariance arises naturally when the process is a
mixture of stochastically driven damped harmonic oscillators -- providing a
physical motivation for and interpretation of this choice -- but we also
demonstrate that it can be a useful effective model in some other cases. We
present a mathematical description of the method and compare it to existing
scalable Gaussian Process methods. The method is fast and interpretable, with a
range of potential applications within astronomical data analysis and beyond.
We provide well-tested and documented open-source implementations of this
method in C++, Python, and Julia.
Comment: Updated in response to referee. Submitted to the AAS Journals.
Comments (still) welcome. Code available: https://github.com/dfm/celerit
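To make the covariance form in the abstract above concrete, here is a minimal sketch of a kernel written as a mixture of exponentially damped sinusoids (the real-valued form of a mixture of complex exponentials). The function name `exp_mixture_kernel` and the coefficient values are assumptions chosen for illustration, not the authors' API. The sketch builds the dense covariance matrix for unevenly spaced times with non-uniform noise and factorizes it with a standard Cholesky decomposition, which scales as the cube of the number of points. It does not implement the linear-scaling solver the paper describes.

```python
import numpy as np

def exp_mixture_kernel(tau, a, b, c, d):
    """Evaluate k(tau) = sum_j exp(-c_j |tau|) * (a_j cos(d_j |tau|) + b_j sin(d_j |tau|)).

    a, b, c, d are 1-D arrays with one entry per mixture term (hypothetical names).
    """
    tau = np.abs(np.asarray(tau, dtype=float))[..., None]  # add a trailing axis for the terms
    decay = np.exp(-c * tau)
    return np.sum(decay * (a * np.cos(d * tau) + b * np.sin(d * tau)), axis=-1)

# Unevenly spaced observation times with heteroscedastic (non-uniform) noise.
rng = np.random.default_rng(0)
t = np.sort(rng.uniform(0.0, 10.0, 50))
yerr = rng.uniform(0.05, 0.2, t.size)

# Two illustrative terms: a pure exponential (d = 0) and a damped cosine.
a = np.array([1.0, 0.5])
b = np.zeros(2)
c = np.array([0.3, 0.1])
d = np.array([0.0, 2.0])

# Dense covariance matrix; the measurement variances go on the diagonal.
K = exp_mixture_kernel(t[:, None] - t[None, :], a, b, c, d)
K[np.diag_indices_from(K)] += yerr**2

# Baseline O(N^3) factorization, shown only for comparison of cost.
L = np.linalg.cholesky(K)
```

With `b = 0` and positive `a` and `c`, each term is a product of two valid kernels, so the Cholesky factorization succeeds. Other coefficient choices may not give a positive-definite matrix.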
A Causal, Data-Driven Approach to Modeling the Kepler Data
Astronomical observations are affected by several kinds of noise, each with
its own causal source; there is photon noise, stochastic source variability,
and residuals coming from imperfect calibration of the detector or telescope.
The precision of NASA Kepler photometry for exoplanet science---the most
precise photometric measurements of stars ever made---appears to be limited by
unknown or untracked variations in spacecraft pointing and temperature, and
unmodeled stellar variability. Here we present the Causal Pixel Model (CPM) for
Kepler data, a data-driven model intended to capture variability but preserve
transit signals. The CPM works at the pixel level so that it can capture very
fine-grained information about the variation of the spacecraft. The CPM
predicts each target pixel value from a large number of pixels of other stars
sharing the instrument variabilities while not containing any information on
possible transits in the target star. In addition, we use the target star's
future and past (auto-regression). By appropriately separating, for each data
point, the data into training and test sets, we ensure that information about
any transit will be perfectly isolated from the model. The method has four
hyper-parameters (the number of predictor stars, the auto-regressive window
size, and two L2-regularization amplitudes for model components), which we set
by cross-validation. We determine a generic set of hyper-parameters that works
well for most of the stars and apply the method to a corresponding set of
target stars. We find that we can consistently outperform (for the purposes of
exoplanet detection) the Kepler Pre-search Data Conditioning (PDC) method for
exoplanet discovery.
Comment: Accepted for publication in the PAS
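The train-and-test separation described in the abstract above can be sketched as a ridge regression that predicts a target light curve from other stars' pixels, leaving out a window of samples around each predicted point. This is a simplified illustration on synthetic data, not the authors' CPM implementation: the function name `cpm_predict`, the single L2 amplitude, the window size, and all numerical values are assumptions, and the auto-regressive component is omitted.

```python
import numpy as np

def cpm_predict(target, predictors, l2=1e-2, exclude=5):
    """Predict each target sample from predictor pixels with ridge regression.

    For sample i, the weights are fit only on samples more than `exclude`
    steps away, so a short transit at i cannot be fit away by the model.
    """
    n = target.size
    idx = np.arange(n)
    eye = np.eye(predictors.shape[1])
    pred = np.empty(n)
    for i in range(n):
        train = np.abs(idx - i) > exclude          # hold out a window around i
        X, y = predictors[train], target[train]
        w = np.linalg.solve(X.T @ X + l2 * eye, X.T @ y)  # ridge (L2) solution
        pred[i] = predictors[i] @ w
    return pred

# Synthetic example: a shared systematic trend plus a short transit-like dip.
rng = np.random.default_rng(1)
n = 200
s = np.sin(np.linspace(0.0, 6.0, n))               # shared instrument trend
gains = rng.uniform(0.5, 1.5, 10)                  # per-predictor response
predictors = 1.0 + s[:, None] * gains + 1e-3 * rng.standard_normal((n, 10))
target = 1.0 + 0.8 * s + 1e-3 * rng.standard_normal(n)
target[100:104] -= 0.01                            # injected dip, shorter than the window

residual = target - cpm_predict(target, predictors)
in_transit = np.zeros(n, dtype=bool)
in_transit[100:104] = True
```

In this toy setup the residual removes the shared trend but keeps the injected dip, because the samples inside the dip are never used to fit the weights that predict them.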