Decision Tree Classifiers for Star/Galaxy Separation



Authors: Abazajian; Abazajian; Abazajian; Ball; Bernstein; Breiman; Brodley; E. C. Vasconcellos; F. L. LaBarbera; Fayyad; Fayyad; Freund; Gama; Geoffrey; H. Frago Campos Velho; H. V. Capelato; Haijian; Heydon-Dumbleton; Holmes; Kohavi; La Barbera; M. Trevisan; MacGillivray; Maddox; Murthy; Odewahn; Odewahn; Quinlam; Quinlam; Quinlam; R. R. de Carvalho; R. R. Gal; R. S. R. Ruiz; Ruiz; Stoughton; Suchkov; Weir; Witten; Yasuda; York
Publication date: 8 November 2010
Publisher: IOP Publishing

Abstract

We study the star/galaxy classification efficiency of 13 different decision tree algorithms applied to photometric objects in the Sloan Digital Sky Survey Data Release Seven (SDSS DR7). Each algorithm is defined by a set of parameters which, when varied, produce different final classification trees. We extensively explore the parameter space of each algorithm, using the set of $884,126$ SDSS objects with spectroscopic data as the training set. The efficiency of star-galaxy separation is measured using the completeness function. We find that the Functional Tree algorithm (FT) yields the best results as measured by the mean completeness in two magnitude intervals: $14\le r\le21$ ($85.2\%$) and $r\ge19$ ($82.1\%$). We compare the performance of the tree generated with the optimal FT configuration to the classifications provided by the SDSS parametric classifier, 2DPHOT and Ball et al. (2006). We find that our FT classifier is comparable or better in completeness over the full magnitude range