Data-Mining a Large Digital Sky Survey: From the Challenges to the Scientific Results
The analysis and efficient scientific exploration of the Digital Palomar
Observatory Sky Survey (DPOSS) represent a major technical challenge. The
input data set consists of 3 Terabytes of pixel information, and contains a few
billion sources. We describe some of the specific scientific problems posed by
the data, including searches for distant quasars and clusters of galaxies, and
the data-mining techniques we are exploring in addressing them.
Machine-assisted discovery methods may become essential for the analysis of
such multi-Terabyte data sets. New and future approaches involve unsupervised
classification and clustering analysis in the Giga-object data space, including
various Bayesian techniques. In addition to the searches for known types of
objects in this database, these techniques may also offer the possibility of
discovering previously unknown, rare types of astronomical objects.
Comment: Invited paper, to appear in Applications of Digital Image Processing
XX, ed. A. Tescher, Proc. S.P.I.E. vol. 3164, in press; 10 pages, a
self-contained TeX file, and 3 separate postscript figures