11,201 research outputs found

    Structure in the 3D Galaxy Distribution: I. Methods and Example Results

    Three methods for detecting and characterizing structure in point data, such as that generated by redshift surveys, are described: classification using self-organizing maps, segmentation using Bayesian blocks, and density estimation using adaptive kernels. The first two methods are new, and allow detection and characterization of structures of arbitrary shape and at a wide range of spatial scales. These methods should elucidate not only clusters, but also the more distributed, wide-ranging filaments and sheets, and further allow the possibility of detecting and characterizing an even broader class of shapes. The methods are demonstrated and compared in application to three data sets: a carefully selected volume-limited sample from the Sloan Digital Sky Survey redshift data, a similarly selected sample from the Millennium Simulation, and a set of points independently drawn from a uniform probability distribution (a so-called Poisson distribution). We demonstrate a few of the many ways in which these methods elucidate large-scale structure in the distribution of galaxies in the nearby Universe. Comment: Re-posted after referee corrections along with partially re-written introduction. 80 pages, 31 figures, ApJ in Press. For full sized figures please download from: http://astrophysics.arc.nasa.gov/~mway/lss1.pd

    On the use of self-organizing maps to accelerate vector quantization

    Self-organizing maps (SOM) are widely used for their topology preservation property: neighboring input vectors are quantized (or classified) either at the same location or at neighboring ones on a predefined grid. SOM are also widely used for their more classical vector quantization property. We show in this paper that using SOM instead of the more classical Simple Competitive Learning (SCL) algorithm drastically increases the speed of convergence of the vector quantization process. This fact is demonstrated through extensive simulations on artificial and real examples, with specific SOM (fixed and decreasing neighborhoods) and SCL algorithms. Comment: Following the ESANN 199 conference
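
The difference between the two update rules named in this abstract can be sketched as follows. This is a minimal illustration, not the paper's implementation: the function names `train_quantizer` and `distortion`, the 1-D chain of units, the linear shrinking schedule, and all parameter values are assumptions made for the example. With `radius0=0` only the winning unit moves (SCL); with `radius0>0` the winner's grid neighbours move too (SOM with a decreasing neighborhood).

```python
import numpy as np

def train_quantizer(data, n_units=10, epochs=20, lr=0.1, radius0=0.0, seed=0):
    """Online vector quantization on a 1-D chain of units (illustrative only).

    radius0 == 0 -> Simple Competitive Learning: only the winner is updated.
    radius0 > 0  -> SOM: units within `radius` grid steps of the winner are
                    updated, and the radius shrinks linearly towards 0.
    """
    rng = np.random.default_rng(seed)
    # Initialise the codebook from randomly chosen data points.
    codebook = data[rng.choice(len(data), n_units, replace=False)].copy()
    grid = np.arange(n_units)
    for epoch in range(epochs):
        radius = radius0 * (1 - epoch / epochs)
        for x in data[rng.permutation(len(data))]:
            # Winner = unit whose code vector is closest to the input.
            winner = np.argmin(np.sum((codebook - x) ** 2, axis=1))
            # Units close to the winner on the grid (just the winner if radius < 1).
            neighbours = np.abs(grid - winner) <= radius
            codebook[neighbours] += lr * (x - codebook[neighbours])
    return codebook

def distortion(data, codebook):
    """Mean squared distance from each point to its nearest code vector."""
    d2 = ((data[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=2)
    return d2.min(axis=1).mean()

if __name__ == "__main__":
    points = np.random.default_rng(1).random((500, 2))
    scl = train_quantizer(points, radius0=0.0)
    som = train_quantizer(points, radius0=3.0)
    print("SCL distortion:", distortion(points, scl))
    print("SOM distortion:", distortion(points, som))
```

Tracking `distortion` after each epoch for both settings is one way to compare convergence speed on a given data set; the sketch itself makes no claim about which variant converges faster.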

    Investigation of topographical stability of the concave and convex Self-Organizing Map variant

    We investigate, by a systematic numerical study, the parameter dependence of the stability of the Kohonen Self-Organizing Map and the Zheng and Greenleaf concave and convex learning with respect to different input distributions, input and output dimensions.

    Forest biodiversity maintenance

    The global loss of biodiversity within forest ecosystems has attracted attention during recent decades. Awareness increased both worldwide and in Sweden, which led to changes in Swedish forest policy. In the Swedish Forestry Act of 1993 the environmental and production goals became equally important, and several new policy implementation instruments were taken into use. One of these was the Woodland Key Habitat Survey, in which indicator species are used to identify sites with conservation values. The first two papers in this thesis assess the relationships between the lichen indicator species found in the study area (in South-Central Sweden) and their growing substrates and habitats, respectively. Both studies confirm that the indicator species showed habitat preferences which included old or deciduous trees. Paper II also assessed the habitat preferences of sedentary birds, and the mixed deciduous habitats proved to be more species-rich than pure coniferous forest. Papers III and IV evaluate the forest owners' intentions and knowledge of nature conservation, as well as their attitudes towards it. In Paper III the conservation intentions of the forest owners were estimated from the harvest registration form and compared to the actual retention at the clear-cuts. The intentions were followed by associated practices; however, the retained amounts of stand structures were low overall. I conclude that the intentions did not increase over the study period, that the retained amounts were too low to meet, e.g., the Forest Stewardship Council standards, and that such forms could become important information instruments for the forestry authorities concerning the intentions and practices of the forest owners.
In Paper IV the results of a questionnaire sent to non-industrial private forest (NIPF) owners within the study area showed that knowledge about conservation and attendance at a recent educational programme, which included conservation information, were positively related to the attitude towards it. However, those occupied with land use had a more negative attitude towards conservation than others. Also, in the ranking of operational goals for their own conservation efforts, only 7% of the respondents ranked 'long-term species survival' as their first priority, while the vast majority ranked 'forest health' as number one. In the fifth and last paper the usefulness of indicator species in monitoring systems adapted to NIPF owners is discussed. In the questionnaire used in Paper IV, NIPF owners were asked to mark all species that they could recognise from a list of 12 forest species. The results showed that they had a weak knowledge of the listed lichens and fungi, but all four birds were recognised by more than 50%. Thus, in indicator-species-based monitoring systems for NIPF owners and the public, only a few easily communicated and conspicuous species of documented indicator value should be used, e.g. vertebrates

    Expression cartography of human tissues using self organizing maps

    Background: The availability of parallel, high-throughput microarray and sequencing experiments poses the challenge of how best to arrange and analyze the resulting mass of multidimensional data in a concerted way. Self-organizing maps (SOM), a machine learning method, enable a parallel sample- and gene-centered view of the data combined with strong visualization and second-level analysis capabilities. The paper addresses aspects of the method with practical impact in the context of expression analysis of complex data sets.
Results: The method was applied to generate a SOM characterizing the whole-genome expression profiles of 67 healthy human tissues selected from ten tissue categories (adipose, endocrine, homeostasis, digestion, exocrine, epithelium, sexual reproduction, muscle, immune system and nervous tissues). SOM mapping reduces the dimension of expression data from tens of thousands of genes to a few thousand metagenes, where each metagene acts as a representative of a minicluster of co-regulated single genes. Tissue-specific and common properties shared between groups of tissues emerge as a handful of localized spots in the tissue maps collecting groups of co-regulated and co-expressed metagenes. The functional context of the spots was discovered using overrepresentation analysis with respect to pre-defined gene sets of known functional impact. We found that tissue-related spots typically contain enriched populations of gene sets that correspond well to molecular processes in the respective tissues. Analysis techniques normally used at the gene level, such as two-way hierarchical clustering, provide a better signal-to-noise ratio and better representativeness if applied to the metagenes. Metagene-based clustering analyses aggregate the tissues into essentially three clusters containing nervous, immune system and the remaining tissues.
Conclusions: The global view on the behavior of a few well-defined modules of correlated and differentially expressed genes is more intuitive and more informative than the separate discovery of the expression levels of hundreds or thousands of individual genes. The metagene approach is less sensitive to a priori selection of genes. It can detect a coordinated expression pattern whose components would not pass single-gene significance thresholds and it is able to extract context-dependent patterns of gene expression in complex data sets.

    Self-organizing map based adaptive sampling

    We propose a new adaptive sampling method that uses Self-Organizing Maps (SOM). In a SOM, densely sampled regions of the input space are represented by a larger area on the map than sparsely sampled regions. We use this property to progressively tune in on the interesting region of the design space. The method does not rely on a parameterized distribution, and can sample from multi-modal and non-convex distributions. In this paper, we minimize several mathematical test functions. We also show its performance on an inequality-constrained objective satisfaction problem, in which the objective is to seek diversity in solutions satisfying a certain upper-bound constraint on the minimized objective. A new merit function and a measure of space-filling quality are proposed for this purpose.
