The detection of globular clusters in galaxies as a data mining problem

Author: Bassino, Bishop, Broyden, Byrd, Carlson, Chang, Davidon, Dirsch, Duda, Dunn, Fletcher, Giuseppe Longo, Goldfarb, Holland, Kotsiantis, Kundu, Massimo Brescia, Maurizio Paolillo, Meng, Paolillo, Peng, Rubinstein, Shanno, Stefano Cavuoti, Sutton, Thomas Puzia, Yang, Zhu
Publication venue: 'Wiley'
Publication date: 16/12/2011

We present an application of self-adaptive supervised learning classifiers, derived from the Machine Learning paradigm, to the identification of candidate Globular Clusters in deep, wide-field, single-band HST images. Several methods provided by the DAME (Data Mining & Exploration) web application were tested and compared on the NGC1399 HST data described in Paolillo 2011. The best results were obtained using a Multi Layer Perceptron with Quasi Newton learning rule, which achieved a classification accuracy of 98.3%, with a completeness of 97.8% and a contamination of 1.6%. An extensive set of experiments revealed that the use of accurate structural parameters (effective radius, central surface brightness) does improve the final result, but only by 5%. It is also shown that the method is capable of retrieving extreme sources (for instance, very extended objects) that are missed by more traditional approaches.

Comment: Accepted 2011 December 12; Received 2011 November 28; in original form 2011 October 1
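
As an illustration of the three figures quoted in the abstract (accuracy, completeness, contamination), the following is a minimal sketch of how such metrics could be computed from binary labels. The definitions used here are assumptions based on common usage (completeness as the fraction of true clusters recovered, contamination as the fraction of selected objects that are not clusters); the paper's exact definitions may differ. The function name and the label convention (1 = globular cluster, 0 = other) are hypothetical.

```python
def classification_metrics(y_true, y_pred):
    """Return (accuracy, completeness, contamination) for binary labels.

    Assumed convention (not taken from the paper): 1 = globular cluster,
    0 = any other source.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # clusters recovered
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # clusters missed
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false candidates
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # correct rejections

    total = tp + fn + fp + tn
    accuracy = (tp + tn) / total if total else 0.0
    # Assumed definition: fraction of true clusters that were selected.
    completeness = tp / (tp + fn) if (tp + fn) else 0.0
    # Assumed definition: fraction of selected objects that are not clusters.
    contamination = fp / (tp + fp) if (tp + fp) else 0.0
    return accuracy, completeness, contamination


if __name__ == "__main__":
    truth = [1, 1, 1, 0, 0, 0, 1, 0]
    predicted = [1, 1, 0, 0, 0, 1, 1, 0]
    print(classification_metrics(truth, predicted))
```

Under these assumed definitions, completeness and contamination are measured against different denominators, so a classifier can score well on one while doing poorly on the other.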

arXiv.org e-Print Archive

OA@INAF - Istituto Nazionale di Astrofisica

Caltech Authors

Syntactic Topic Models

Author: Blei David M., Boyd-Graber Jordan
Publication date: 01/01/2008

The syntactic topic model (STM) is a Bayesian nonparametric model of language that discovers latent distributions of words (topics) that are both semantically and syntactically coherent. The STM models dependency-parsed corpora in which sentences are grouped into documents. It assumes that each word is drawn from a latent topic chosen by combining document-level features and the local syntactic context. Each document has a distribution over latent topics, as in topic models, which provides the semantic consistency. Each element in the dependency parse tree also has a distribution over the topics of its children, as in latent-state syntax models, which provides the syntactic consistency. These distributions are convolved so that the topic of each word is likely under both its document and its syntactic context. We derive a fast posterior inference algorithm based on variational methods. We report qualitative and quantitative studies on both synthetic data and hand-parsed documents. We show that the STM is a more predictive model of language than current models based only on syntax or only on topics.
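
The idea that a word's topic should be likely under both its document and its syntactic context can be sketched, under a simplifying assumption, as a pointwise product of two topic distributions followed by renormalization. This is only one reading of "convolved" in the abstract, not the authors' inference algorithm; the function name and the example vectors are hypothetical.

```python
def combine_topic_distributions(doc_topics, syntax_topics):
    """Combine two topic distributions by pointwise product and renormalize.

    doc_topics:    assumed document-level distribution over K topics
    syntax_topics: assumed distribution over K topics given the parent
                   node in the dependency tree
    """
    if len(doc_topics) != len(syntax_topics):
        raise ValueError("both distributions must cover the same topics")

    # A topic gets high weight only if both sources assign it high weight.
    products = [d * s for d, s in zip(doc_topics, syntax_topics)]
    total = sum(products)
    if total == 0:
        raise ValueError("the distributions share no topic with nonzero mass")
    return [p / total for p in products]


if __name__ == "__main__":
    document = [0.5, 0.5, 0.0]   # hypothetical document-level weights
    syntactic = [0.2, 0.6, 0.2]  # hypothetical parent-conditioned weights
    print(combine_topic_distributions(document, syntactic))
```

In this sketch, a topic with zero probability under either source is excluded from the combined distribution, which mirrors the requirement that the chosen topic be plausible in both contexts.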

arXiv.org e-Print Archive

CiteSeerX