We apply machine learning techniques in an attempt to predict and classify
stellar properties from noisy and sparse time-series data. We preprocessed over
94 GB of Kepler light curves from MAST to classify stars according to ten distinct
physical properties using both representation learning and feature engineering
approaches. Previous machine learning studies in the field have relied primarily
on simulated data, making our study one of the first to apply machine learning
to real light curve data. We tuned our data using previous
work with simulated data as a template and achieved mixed results between the
two approaches. Representation learning using a Long Short-Term Memory (LSTM)
Recurrent Neural Network (RNN) produced no successful predictions, but our work
with feature engineering was successful for both classification and regression.
In particular, we predicted stellar density, stellar radius, and effective
temperature with low error (~2-4%) and classified the number of transits for a
given star with good accuracy (~75%). The results suggest that both approaches
could improve with larger datasets that include a larger minority class. This
work has the potential to provide a
foundation for future tools and techniques to aid in the analysis of
astrophysical data.

Comment: Accepted to The Astronomical Journal