83 research outputs found

A Unified Analysis of Multi-task Functional Linear Regression Models with Manifold Constraint and Composite Quadratic Penalty

Authors: He Kejun, He Shiyuan, Ye Hanxuan
Publication date: 09/11/2022

This work studies multi-task functional linear regression models in which both the covariates and the unknown regression coefficients (called slope functions) are curves. For slope function estimation, we employ penalized splines to balance bias, variance, and computational complexity. The power of multi-task learning is brought in by imposing additional structures over the slope functions. We propose a general model with double regularization over the spline coefficient matrix: i) a matrix manifold constraint, and ii) a composite penalty as a summation of quadratic terms. Many multi-task learning approaches can be treated as special cases of this proposed model, such as a reduced-rank model and a graph Laplacian regularized model. We show the composite penalty induces a specific norm, which helps to quantify the manifold curvature and determine the corresponding proper subset in the manifold tangent space. The complexity of the tangent space subset is then bridged to the complexity of the geodesic neighbor via generic chaining. A unified convergence upper bound is obtained and specifically applied to the reduced-rank model and the graph Laplacian regularized model. The phase transition behaviors of the estimators are examined as we vary the configurations of model parameters.

arXiv.org e-Print Archive

Spline Estimation of Functional Principal Components via Manifold Conjugate Gradient Algorithm

Authors: He Kejun, He Shiyuan, Ye Hanxuan
Publication date: 09/11/2022

Functional principal component analysis has become the most important dimension reduction technique in functional data analysis. Based on B-spline approximation, functional principal components (FPCs) can be efficiently estimated by the expectation-maximization (EM) and the geometric restricted maximum likelihood (REML) algorithms under the strong assumption of Gaussianity on the principal component scores and observational errors. When computing the solution, the EM algorithm does not exploit the underlying geometric manifold structure, while the performance of REML is known to be unstable. In this article, we propose a conjugate gradient algorithm over the product manifold to estimate FPCs. This algorithm exploits the manifold geometry structure of the overall parameter space, thus improving its search efficiency and estimation accuracy. In addition, a distribution-free interpretation of the loss function is provided from the viewpoint of matrix Bregman divergence, which explains why the proposed method works well under general distribution settings. We also show that a roughness penalization can be easily incorporated into our algorithm with a potentially better fit. The appealing numerical performance of the proposed method is demonstrated by simulation studies and the analysis of a Type Ia supernova light curve dataset.

arXiv.org e-Print Archive

Functional Light Curve Models for Type Ia Supernovae and Mira Variables, with Their Application of Distance Determination

Author: He Shiyuan
Publication date: 21/08/2017

Both type Ia supernovae and variable stars are important distance indicators in astronomy. The peak luminosity of type Ia supernovae and the period-luminosity relation of Miras can be employed for relative distance determination. For both SNIa and Mira, we develop light curve models with noisy, sparse and irregularly-sampled data. We develop a functional principal component method for SNIa light curves. Each SNIa light curve is expressed as a linear combination of a mean function and several principal component functions. The coefficients of the principal component functions are called scores. The proposed method takes into account peak registration and shape constraints, and is equipped with a fast training algorithm. The resulting model provides a high-quality fit to each light curve. In addition, the scores provide a powerful characterization of SNIa. They show connections with interstellar dusting, spectral classes and other physical properties. Moreover, the method provides a functional linear form in place of the commonly used ΔM15 parameter for distance predictions. We also develop a semi-parametric model for Mira period estimation. The proposed method has a close relation with a Gaussian process model, and is solved in an empirical Bayesian framework. The empirical Bayesian problem is solved by a fast quasi-Newton algorithm with warm start, combined with a grid search over the frequency parameter because of the high multimodality in that parameter. The proposed method is compared with the traditional Lomb-Scargle method in a large-scale simulation and shows considerable improvement.

Texas A&M Repository

The M33 Synoptic Stellar Survey. II. Mira Variables

Authors: He Shiyuan, Huang Jianhua Z., Long James, Macri Lucas M., Yuan Wenlong
Publication venue: American Astronomical Society
Publication date: 01/01/2017

We present the discovery of 1847 Mira candidates in the Local Group galaxy M33 using a novel semi-parametric periodogram technique coupled with a Random Forest classifier. The algorithms were applied to ~2.4x10^5 I-band light curves previously obtained by the M33 Synoptic Stellar Survey. We derive preliminary Period-Luminosity relations at optical, near- and mid-infrared wavelengths and compare them to the corresponding relations in the Large Magellanic Cloud. Comment: Includes small corrections to match the published version.

arXiv.org e-Print Archive

Crossref

Texas A&M Repository