5,458 research outputs found

    Data-Mining a Large Digital Sky Survey: From the Challenges to the Scientific Results

    The analysis and efficient scientific exploration of the Digital Palomar Observatory Sky Survey (DPOSS) represent a major technical challenge. The input data set consists of 3 Terabytes of pixel information, and contains a few billion sources. We describe some of the specific scientific problems posed by the data, including searches for distant quasars and clusters of galaxies, and the data-mining techniques we are exploring in addressing them. Machine-assisted discovery methods may become essential for the analysis of such multi-Terabyte data sets. New and future approaches involve unsupervised classification and clustering analysis in the Giga-object data space, including various Bayesian techniques. In addition to the searches for known types of objects in this database, these techniques may also offer the possibility of discovering previously unknown, rare types of astronomical objects. Comment: Invited paper, to appear in Applications of Digital Image Processing XX, ed. A. Tescher, Proc. S.P.I.E. vol. 3164, in press; 10 pages, a self-contained TeX file, and 3 separate postscript figures.

    Approaches to Automated Morphological Classification of Galaxies

    There is an obvious need for automated classification of galaxies, as the number of observed galaxies increases very fast. We examine several approaches to this problem, utilising Artificial Neural Networks (ANNs). We quote results from a recent study which show that ANNs can classify galaxies morphologically as well as humans can. Comment: 8 pages, uu-encoded compressed postscript file (containing 2 figures).

    Comparison Between Supervised and Unsupervised Classifications of Neuronal Cell Types: A Case Study

    In the study of neural circuits, it becomes essential to discern the different neuronal cell types that build the circuit. Traditionally, neuronal cell types have been classified using qualitative descriptors. More recently, several attempts have been made to classify neurons quantitatively, using unsupervised clustering methods. While useful, these algorithms do not take advantage of previous information known to the investigator, which could improve the classification task. For neocortical GABAergic interneurons, the problem of discerning among different cell types is particularly difficult and better methods are needed to perform objective classifications. Here we explore the use of supervised classification algorithms to classify neurons based on their morphological features, using a database of 128 pyramidal cells and 199 interneurons from mouse neocortex. To evaluate the performance of different algorithms we used, as a “benchmark,” the test to automatically distinguish between pyramidal cells and interneurons, defining “ground truth” by the presence or absence of an apical dendrite. We compared hierarchical clustering with a battery of different supervised classification algorithms, finding that supervised classifications outperformed hierarchical clustering. In addition, the selection of subsets of distinguishing features enhanced the classification accuracy for both sets of algorithms. The analysis of selected variables indicates that dendritic features were most useful to distinguish pyramidal cells from interneurons when compared with somatic and axonal morphological variables. We conclude that supervised classification algorithms are better matched to the general problem of distinguishing neuronal cell types when some information on these cell groups, in our case being pyramidal or interneuron, is known a priori. As a spin-off of this methodological study, we provide several methods to automatically distinguish neocortical pyramidal cells from interneurons, based on their morphologies.

    What Your Username Says About You

    Usernames are ubiquitous on the Internet, and they are often suggestive of user demographics. This work looks at the degree to which gender and language can be inferred from a username alone by making use of unsupervised morphology induction to decompose usernames into sub-units. Experimental results on the two tasks demonstrate the effectiveness of the proposed morphological features compared to a character n-gram baseline.

    Methods for Amharic part-of-speech tagging

    The paper describes a set of experiments involving the application of three state-of-the-art part-of-speech taggers to Ethiopian Amharic, using three different tagsets. The taggers showed worse performance than previously reported results for English, in particular having problems with unknown words. The best results were obtained using a Maximum Entropy approach, while HMM-based and SVM-based taggers got comparable results.