298,975 research outputs found
The detection of globular clusters in galaxies as a data mining problem
We present an application of self-adaptive supervised learning classifiers
derived from the Machine Learning paradigm, to the identification of candidate
Globular Clusters in deep, wide-field, single band HST images. Several methods
provided by the DAME (Data Mining & Exploration) web application, were tested
and compared on the NGC1399 HST data described in Paolillo 2011. The best
results were obtained using a Multi Layer Perceptron with Quasi Newton learning
rule which achieved a classification accuracy of 98.3%, with a completeness of
97.8% and 1.6% of contamination. An extensive set of experiments revealed that
the use of accurate structural parameters (effective radius, central surface
brightness) does improve the final result, but only by 5%. It is also shown
that the method is capable to retrieve also extreme sources (for instance, very
extended objects) which are missed by more traditional approaches.Comment: Accepted 2011 December 12; Received 2011 November 28; in original
form 2011 October 1
Syntactic Topic Models
The syntactic topic model (STM) is a Bayesian nonparametric model of language
that discovers latent distributions of words (topics) that are both
semantically and syntactically coherent. The STM models dependency parsed
corpora where sentences are grouped into documents. It assumes that each word
is drawn from a latent topic chosen by combining document-level features and
the local syntactic context. Each document has a distribution over latent
topics, as in topic models, which provides the semantic consistency. Each
element in the dependency parse tree also has a distribution over the topics of
its children, as in latent-state syntax models, which provides the syntactic
consistency. These distributions are convolved so that the topic of each word
is likely under both its document and syntactic context. We derive a fast
posterior inference algorithm based on variational methods. We report
qualitative and quantitative studies on both synthetic data and hand-parsed
documents. We show that the STM is a more predictive model of language than
current models based only on syntax or only on topics
- …