A FLEXIBLE PROCEDURE FOR POSITIVE–UNLABELED LEARNING & PERIODS ESTIMATION FOR MIRAS USING MULTI-BAND LIGHT CURVES AND INVERSE PERIOD-LUMINOSITY RELATIONS

Abstract

This dissertation contains two independent projects: the first develops a general methodology for solving the Positive–Unlabeled (PU) learning problem, and the second creates a hierarchical Bayesian model that solves a specific astronomical problem: period estimation for Miras. In the first project, we study PU learning, which considers two samples: a positive set P with observations from only one class and an unlabeled set U with observations from two classes. The goal is to classify the observations in U. Class mixture proportion estimation (MPE) in U is a key step in PU learning. Blanchard et al. (2010) show that MPE in PU learning is a generalization of the problem of estimating the proportion of true null hypotheses in multiple testing problems. Motivated by this idea, we propose a flexible framework: first reduce the problem to one dimension by constructing a probabilistic classifier trained on the P and U data sets, and then apply a one–dimensional mixture proportion method to the observation class probabilities. The flexibility of this framework lies in the freedom to choose both the classifier and the one–dimensional MPE method. Using this framework, we propose two mixture proportion estimators: one adapts a ROC-based technique (Storey, 2002; Scott, 2015), and the other adapts isotonic regression (Patra and Sen, 2015). We prove the consistency of both estimators. Empirically, we demonstrate that the proposed estimators have competitive performance on simulated waveform data and a protein signaling problem. The implementations of our estimators are free of tuning parameters. The second project presents an inverse Period-Luminosity relation (PLR) enhanced multi-band semi-parametric model (SP3) that efficiently recovers periods for quasi-periodic variable stars such as Miras. Mira variables are promising distance indicators because oxygen-rich Miras follow a tight PLR in the near-infrared.
However, Mira light curves are quasi-periodic, which makes period estimation challenging. In recent years, several methods have been developed to estimate the periods of Miras. He et al. (2016) develop a single-band semi-parametric model based on Gaussian processes. Yuan et al. (2018) extend this model to the multi-band case. Both models are designed to fit the observations of a single Mira (single-band or multi-band) and do not use the PLR. To borrow strength across light curves, our proposed SP3 model uses the inverse Period-Luminosity relation (iPLR) to adaptively supply a frequency prior to each light curve. This model outperforms existing methods on various simulated data sets.
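
As an illustration of the two-step framework of the first project, the following is a minimal sketch under stated assumptions, not the estimators proposed in this dissertation. It assumes synthetic two-dimensional Gaussian data, uses a hand-written logistic regression as a stand-in for "any probabilistic classifier", and replaces the one–dimensional MPE step with a simple Storey-type ratio at a fixed threshold (the median positive score), which is an upper-bound style estimate rather than a tuning-parameter-free one. The function names `fit_logistic`, `predict_proba`, and `storey_type_alpha` are hypothetical. For brevity the scores are computed on the training data; held-out scores would be preferable in practice.

```python
import numpy as np

def fit_logistic(X, y, steps=500, lr=0.1):
    """Gradient-descent logistic regression (stand-in for any probabilistic classifier)."""
    Xb = np.hstack([X, np.ones((len(X), 1))])  # append intercept column
    w = np.zeros(Xb.shape[1])
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-Xb @ w))
        w -= lr * Xb.T @ (p - y) / len(y)
    return w

def predict_proba(w, X):
    """Estimated probability that an observation came from P rather than U."""
    Xb = np.hstack([X, np.ones((len(X), 1))])
    return 1.0 / (1.0 + np.exp(-Xb @ w))

def storey_type_alpha(scores_p, scores_u, quantile=0.5):
    """Ratio U(S) / P(S) on the region S = {score >= t}.

    Since U = alpha * P + (1 - alpha) * N, U(S) >= alpha * P(S) for any S,
    so the ratio bounds alpha from above and is close to alpha when
    negatives rarely fall in S. The fixed quantile is an assumption here.
    """
    t = np.quantile(scores_p, quantile)
    frac_p = np.mean(scores_p >= t)
    frac_u = np.mean(scores_u >= t)
    return float(min(1.0, frac_u / frac_p))

# Synthetic PU data (assumed for illustration): positives ~ N(+2, I), negatives ~ N(-2, I).
rng = np.random.default_rng(0)
alpha_true = 0.3            # proportion of positives hidden in U
n_p, n_u = 2000, 4000
X_p = rng.normal(2.0, 1.0, size=(n_p, 2))
is_pos = rng.random(n_u) < alpha_true
X_u = np.where(is_pos[:, None],
               rng.normal(2.0, 1.0, size=(n_u, 2)),
               rng.normal(-2.0, 1.0, size=(n_u, 2)))

# Step 1: reduce to one dimension with a classifier trained on P (label 1) vs U (label 0).
X = np.vstack([X_p, X_u])
y = np.concatenate([np.ones(n_p), np.zeros(n_u)])
w = fit_logistic(X, y)

# Step 2: apply a one-dimensional mixture proportion estimate to the scores.
alpha_hat = storey_type_alpha(predict_proba(w, X_p), predict_proba(w, X_u))
print(f"true alpha = {alpha_true}, estimated alpha = {alpha_hat:.3f}")
```

Either step can be swapped independently: the classifier only needs to produce scores that rank positive-like observations highly, and the ratio step could be replaced by any one–dimensional MPE method.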
