Class Discovery in Galaxy Classification
In recent years, automated, supervised classification techniques have been
fruitfully applied to labeling and organizing large astronomical databases.
These methods require off-line classifier training, based on labeled examples
from each of the (known) object classes. In practice, only a small batch of
labeled examples, hand-labeled by a human expert, may be available for
training. Moreover, there may be no labeled examples for some classes present
in the data, i.e. the database may contain several unknown classes. Unknown
classes may be present due to 1) uncertainty in or lack of knowledge of the
measurement process, 2) an inability to adequately "survey" a massive
database to assess its content (classes), and/or 3) an incomplete scientific
hypothesis. In recent work, new class discovery in mixed labeled/unlabeled data
was formally posed, with a proposed solution based on mixture models. In this
work we investigate this approach, propose a competing technique suitable for
class discovery in neural networks, and evaluate both methods for
classification and class discovery on several astronomical data sets. Our
results demonstrate up to a 57% reduction in classification error compared to a
standard neural network classifier that uses only labeled data.
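To make the idea of class discovery in mixed labeled/unlabeled data concrete, here is a minimal sketch. It is not the mixture-model method referred to in the abstract, only a simplified one-dimensional illustration: a Gaussian mixture fitted by EM in which labeled points are clamped to the component of their class, and one extra component can be claimed only by unlabeled points. The function name `fit_mixture`, the one-component-per-class assumption, the variance floor, and the toy data are all assumptions made for this example.

```python
import math

def gauss(x, mu, var):
    """Density of a 1-D Gaussian with mean mu and variance var."""
    return math.exp(-(x - mu) ** 2 / (2 * var)) / math.sqrt(2 * math.pi * var)

def fit_mixture(xs, labels, n_known, n_extra, init_means, n_iter=50):
    """EM for a 1-D Gaussian mixture on mixed labeled/unlabeled data.

    labels[i] is a class index in [0, n_known) or None for unlabeled points.
    Components 0..n_known-1 are tied to the known classes (assumption: one
    component per class). The remaining n_extra components never receive
    labeled points, so they can only be supported by unlabeled data.
    """
    k = n_known + n_extra
    means = list(init_means)
    variances = [1.0] * k
    weights = [1.0 / k] * k
    resp = []
    for _ in range(n_iter):
        # E-step: labeled points are clamped, unlabeled points get posteriors.
        resp = []
        for x, y in zip(xs, labels):
            if y is not None:
                r = [1.0 if j == y else 0.0 for j in range(k)]
            else:
                p = [weights[j] * gauss(x, means[j], variances[j]) for j in range(k)]
                total = sum(p) or 1e-300  # guard against numerical underflow
                r = [pj / total for pj in p]
            resp.append(r)
        # M-step: re-estimate weight, mean, and variance of each component.
        for j in range(k):
            nj = sum(r[j] for r in resp)
            if nj < 1e-12:
                continue  # empty component: keep its previous parameters
            weights[j] = nj / len(xs)
            means[j] = sum(r[j] * x for r, x in zip(resp, xs)) / nj
            var = sum(r[j] * (x - means[j]) ** 2 for r, x in zip(resp, xs)) / nj
            variances[j] = max(var, 1e-3)  # hypothetical floor to avoid collapse
    return means, variances, weights, resp

# Toy data: two known classes near 0 and 5, plus an unlabeled group near 10.
xs = [-0.2, 0.1, 0.0, 0.3, 4.8, 5.1, 5.0, 5.2, 9.9, 10.1, 10.0, 10.2, -0.1, 5.05]
labels = [0, 0, None, None, 1, 1, None, None, None, None, None, None, None, None]

means, variances, weights, resp = fit_mixture(
    xs, labels, n_known=2, n_extra=1, init_means=[0.0, 5.0, 10.0]
)
print("component means:", [round(m, 3) for m in means])
print("weight of the extra component:", round(weights[2], 3))
```

In this sketch, an extra component that ends up with non-negligible weight and a compact variance is a candidate for an unknown class. A real analysis would also need a model-selection step to decide how many extra components to allow.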
Objective Subclass Determination of Sloan Digital Sky Survey Unknown Spectral Objects
We analyze a portion of the SDSS photometric catalog, consisting of
approximately 10,000 objects that have been spectroscopically classified into
stars, galaxies, QSOs, late-type stars and unknown objects (spectroscopically
unclassified objects, SUOs), in order to investigate the existence and nature
of subclasses of the unclassified objects. We use a modified mixture modeling
approach that makes use of both labeled and unlabeled data and performs class
discovery on the data set. The modeling was done using four colors derived from
the SDSS photometry: (u-g), (g-r), (r-i), and (i-z). This technique discovers
putative novel classes by identifying compact clusters that largely contain
objects from the spectroscopically unclassified class of objects. These
clusters are of possible scientific interest because they represent structured
groups of outliers, relative to the known object classes. We identify two such
well-defined subclasses of the SUOs. One subclass contains 58% SUOs, 40% stars,
and 2% galaxies, QSOs, and late-type stars. The other contains 91% SUOs, 6%
late-type stars, and 3% stars, galaxies, and QSOs. We discuss possible
interpretations of these subclasses, while noting that purely color-based
object classifications call for caution. As a side benefit of this
limited study we also find two distinct classes, consisting largely of
galaxies, that coincide with the recently discussed bimodal galaxy color
distribution.
Comment: 31 pages; 6 figures; revised version accepted for Ap. J. Added one
figure, added discussion, compared method with another approach, added
appendix with algorithmic detail.
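The bookkeeping behind statements such as "one subclass contains 58% SUOs, 40% stars" can be sketched as follows. This is only an illustration, assuming that cluster assignments are already available from some clustering step. The function names, the `"SUO"` label string, the 0.5 cutoff for "largely contain", and the toy data are hypothetical and are not taken from the paper.

```python
from collections import Counter, defaultdict

def sdss_colors(u, g, r, i, z):
    """Return the four adjacent-band colors (u-g, g-r, r-i, i-z)."""
    return (u - g, g - r, r - i, i - z)

def cluster_composition(cluster_ids, class_labels):
    """Map each cluster id to {class label: fraction of that cluster's members}."""
    counts = defaultdict(Counter)
    for cid, label in zip(cluster_ids, class_labels):
        counts[cid][label] += 1
    return {
        cid: {label: n / sum(c.values()) for label, n in c.items()}
        for cid, c in counts.items()
    }

def candidate_clusters(composition, unknown_label="SUO", min_fraction=0.5):
    """Return ids of clusters dominated by the spectroscopically unclassified class.

    min_fraction is a hypothetical cutoff chosen for this example.
    """
    return sorted(
        cid
        for cid, fracs in composition.items()
        if fracs.get(unknown_label, 0.0) > min_fraction
    )

# Toy data: eight objects, their cluster ids, and their spectroscopic classes.
cluster_ids = [0, 0, 0, 0, 1, 1, 1, 1]
class_labels = ["SUO", "SUO", "SUO", "star", "galaxy", "galaxy", "SUO", "QSO"]

comp = cluster_composition(cluster_ids, class_labels)
for cid in sorted(comp):
    print(cid, comp[cid])
print("candidate clusters:", candidate_clusters(comp))
```

Here `sdss_colors` shows how the four input features would be formed from the five SDSS magnitudes, and the composition table is what one would inspect to decide whether a compact cluster is a putative new subclass.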