
    Class Discovery in Galaxy Classification

    In recent years, automated, supervised classification techniques have been fruitfully applied to labeling and organizing large astronomical databases. These methods require off-line classifier training, based on labeled examples from each of the (known) object classes. In practice, only a small batch of examples, hand-labeled by a human expert, may be available for training. Moreover, there may be no labeled examples for some classes present in the data, i.e., the database may contain several unknown classes. Unknown classes may be present due to 1) uncertainty in or lack of knowledge of the measurement process, 2) an inability to adequately "survey" a massive database to assess its content (classes), and/or 3) an incomplete scientific hypothesis. In recent work, new class discovery in mixed labeled/unlabeled data was formally posed, with a proposed solution based on mixture models. In this work we investigate this approach, propose a competing technique suitable for class discovery in neural networks, and evaluate both methods for classification and class discovery on several astronomical data sets. Our results demonstrate up to a 57% reduction in classification error compared to a standard neural network classifier that uses only labeled data.

    Objective Subclass Determination of Sloan Digital Sky Survey Unknown Spectral Objects

    We analyze a portion of the SDSS photometric catalog, consisting of approximately 10,000 objects that have been spectroscopically classified into stars, galaxies, QSOs, late-type stars and unknown objects (spectroscopically unclassified objects, SUOs), in order to investigate the existence and nature of subclasses of the unclassified objects. We use a modified mixture modeling approach that makes use of both labeled and unlabeled data and performs class discovery on the data set. The modeling was done using four colors derived from the SDSS photometry: (u-g), (g-r), (r-i), and (i-z). This technique discovers putative novel classes by identifying compact clusters that largely contain objects from the spectroscopically unclassified class of objects. These clusters are of possible scientific interest because they represent structured groups of outliers, relative to the known object classes. We identify two such well-defined subclasses of the SUOs. One subclass contains 58% SUOs, 40% stars, and 2% galaxies, QSOs, and late-type stars. The other contains 91% SUOs, 6% late-type stars, and 3% stars, galaxies, and QSOs. We discuss possible interpretations of these subclasses while also noting that some caution must be applied to purely color-based object classifications. As a side benefit of this limited study we also find two distinct classes, consisting largely of galaxies, that coincide with the recently discussed bimodal galaxy color distribution.

    Comment: 31 pages; 6 figures; revised version accepted for Ap. J. Added one figure, added discussion, compared method with another approach, added appendix with algorithmic detail
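The two concrete steps named in this abstract, forming the four colors from the photometry and reading off the class composition of each cluster, can be illustrated with a minimal sketch. It is not the authors' modified mixture model: the cluster assignments are assumed to come from some clustering step that is not shown, and the function names, the `"SUO"` label string, and the 0.5 threshold standing in for "largely contain" are all hypothetical choices made for illustration.

```python
import numpy as np


def sdss_colors(mags):
    """Convert (n, 5) magnitudes in u, g, r, i, z order to (n, 4) colors.

    The columns of the result are (u-g), (g-r), (r-i), (i-z).
    """
    mags = np.asarray(mags, dtype=float)
    # Each color is the difference between adjacent bands.
    return mags[:, :-1] - mags[:, 1:]


def cluster_composition(cluster_ids, labels):
    """Return, per cluster, the fraction of members carrying each label."""
    cluster_ids = np.asarray(cluster_ids)
    labels = np.asarray(labels)
    composition = {}
    for c in np.unique(cluster_ids):
        members = labels[cluster_ids == c]
        names, counts = np.unique(members, return_counts=True)
        composition[int(c)] = {
            str(name): float(count) / members.size
            for name, count in zip(names, counts)
        }
    return composition


def candidate_subclasses(composition, unknown="SUO", min_fraction=0.5):
    """List clusters whose unknown-label fraction reaches the threshold.

    The threshold is an illustrative assumption, not a value from the paper.
    """
    return [
        c for c, fractions in composition.items()
        if fractions.get(unknown, 0.0) >= min_fraction
    ]


# Toy data (made up): cluster ids would come from a clustering step not shown.
toy_ids = [0, 0, 0, 1, 1, 1, 1]
toy_labels = ["star", "star", "SUO", "SUO", "SUO", "SUO", "galaxy"]
toy_composition = cluster_composition(toy_ids, toy_labels)
toy_candidates = candidate_subclasses(toy_composition)
```

In the toy data, cluster 1 is three quarters SUOs and would be flagged, while cluster 0 is mostly stars and would not. A composition table of this kind is what the percentages quoted in the abstract (for example 58% SUOs and 40% stars) describe.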