Spectral approximations in machine learning

Author: Darren Homrighausen
Daniel J. McDonald
Publication date: 01/01/2011

In many areas of machine learning, it becomes necessary to find the eigenvector decompositions of large matrices. We discuss two methods for reducing the computational burden of spectral decompositions: the more venerable Nyström extension and a newly introduced algorithm based on random projections. Previous work has centered on the ability to reconstruct the original matrix. We argue that a more interesting and relevant comparison is their relative performance in clustering and classification tasks using the approximate eigenvectors as features. We demonstrate that performance is task-specific and depends on the rank of the approximation.

Comment: 11 pages, 4 figures
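As a rough illustration of the Nyström extension named in the abstract, the sketch below approximates the leading eigenpairs of a symmetric positive semidefinite matrix from a subset of its columns. It is a textbook-style sketch, not the authors' implementation; the function name `nystrom_eig`, its parameters, and the choice of landmark columns are assumptions made for this example.

```python
import numpy as np

def nystrom_eig(K, landmarks, rank):
    """Approximate the top `rank` eigenpairs of a symmetric PSD matrix K
    using only the columns indexed by `landmarks` (hypothetical helper)."""
    n = K.shape[0]
    m = len(landmarks)
    C = K[:, landmarks]              # n x m block of sampled columns
    W = C[landmarks, :]              # m x m block among the landmarks
    vals, vecs = np.linalg.eigh(W)   # eigenvalues in ascending order
    order = np.argsort(vals)[::-1][:rank]
    vals, vecs = vals[order], vecs[:, order]
    approx_vals = (n / m) * vals                      # rescaled eigenvalues
    approx_vecs = np.sqrt(m / n) * (C @ vecs) / vals  # extended to all n rows
    return approx_vals, approx_vecs

# Usage on a synthetic rank-3 matrix (assumed data, for illustration only).
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 3))
K = X @ X.T
vals, vecs = nystrom_eig(K, np.arange(10), rank=3)
K_hat = (vecs * vals) @ vecs.T       # low-rank reconstruction of K
```

When the landmark block has the same rank as the full matrix, as in this synthetic case, the reconstruction matches the original; in general it is only an approximation, and the approximate eigenvectors are not exactly orthonormal.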

arXiv.org e-Print Archive

CiteSeerX

Semi-supervised Learning for Photometric Supernova Classification

Author: Chad M. Schafer
Darren Homrighausen
Dovi Poznanski
Joseph W. Richards
Peter E. Freeman
Publication venue: 'Wiley'
Publication date: 27/09/2011

We present a semi-supervised method for photometric supernova typing. Our approach is to first use the nonlinear dimension reduction technique diffusion map to detect structure in a database of supernova light curves and subsequently employ random forest classification on a spectroscopically confirmed training set to learn a model that can predict the type of each newly observed supernova. We demonstrate that this is an effective method for supernova typing. As supernova numbers increase, our semi-supervised method efficiently utilizes this information to improve classification, a property not enjoyed by template-based methods. Applied to supernova data simulated by Kessler et al. (2010b) to mimic those of the Dark Energy Survey, our methods achieve (cross-validated) 95% Type Ia purity and 87% Type Ia efficiency on the spectroscopic sample, but only 50% Type Ia purity and 50% efficiency on the photometric sample due to their spectroscopic follow-up strategy. To improve the performance on the photometric sample, we search for better spectroscopic follow-up procedures by studying the sensitivity of our machine-learned supernova classification to the specific strategy used to obtain training sets. With a fixed amount of spectroscopic follow-up time, we find that deeper magnitude-limited spectroscopic surveys are better for producing training sets. For supernova Ia (II-P) typing, we obtain a 44% (1%) increase in purity to 72% (87%) and 30% (162%) increase in efficiency to 65% (84%) of the sample using a 25th (24.5th) magnitude-limited survey instead of the shallower spectroscopic sample used in the original simulations. When redshift information is available, we incorporate it into our analysis using a novel method of altering the diffusion map representation of the supernovae. Incorporating host redshifts leads to a 5% improvement in Type Ia purity and 13% improvement in Type Ia efficiency.

Comment: 16 pages, 11 figures, accepted for publication in MNRAS
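The percentage gains quoted in the abstract appear to be relative increases rather than percentage-point differences. The short check below, which assumes that reading and uses only figures stated in the abstract, recovers the baseline implied by each Type Ia gain; the helper name `baseline_from_relative_increase` is made up for this example.

```python
def baseline_from_relative_increase(new_value, relative_increase):
    """Return the starting value implied by a relative increase
    (assumes new_value = baseline * (1 + relative_increase))."""
    return new_value / (1.0 + relative_increase)

# Type Ia figures from the abstract: purity rises 44% to 72%,
# efficiency rises 30% to 65%.
purity_baseline = baseline_from_relative_increase(0.72, 0.44)
efficiency_baseline = baseline_from_relative_increase(0.65, 0.30)
```

Under this reading both baselines come out at about 0.50, which agrees with the 50% purity and 50% efficiency the abstract reports for the photometric sample.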

arXiv.org e-Print Archive

Crossref