12,670 research outputs found

    Soft topographic map for clustering and classification of bacteria

    In this work a new method for clustering and building a topographic representation of a bacterial taxonomy is presented. The method is based on the analysis of stable parts of the genome, the so-called “housekeeping genes”. The proposed method generates topographic maps of the bacterial taxonomy, where relations among different type strains can be visually inspected and verified. Two well-known DNA alignment algorithms are applied to the genomic sequences. Topographic maps are optimized to represent the similarity among the sequences according to their evolutionary distances. The experimental analysis is carried out on 147 type strains of the Gammaproteobacteria class by means of the 16S rRNA housekeeping gene. Complete sequences of the gene have been retrieved from the NCBI public database. In the experimental tests the maps show clusters of homologous type strains and present some singular cases potentially due to incorrect classification or erroneous annotations in the database.
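
The abstract above does not say which evolutionary distance is computed from the aligned sequences. As a minimal sketch only, assuming the sequences are already aligned to equal length and assuming the Jukes-Cantor correction as a stand-in distance (the function names `p_distance` and `jukes_cantor` are hypothetical, not taken from the paper), a pairwise distance could be computed as follows:

```python
import math


def p_distance(seq_a, seq_b):
    """Fraction of differing sites between two aligned sequences.

    Sites where either sequence has a gap ('-') are skipped.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    mismatches = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == "-" or b == "-":
            continue  # ignore gapped alignment columns
        compared += 1
        if a != b:
            mismatches += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    return mismatches / compared


def jukes_cantor(seq_a, seq_b):
    """Jukes-Cantor corrected distance (assumed here, not the paper's stated choice)."""
    p = p_distance(seq_a, seq_b)
    if p >= 0.75:
        return math.inf  # the correction is undefined at saturation
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)
```

A matrix of such pairwise distances is the kind of input a topographic map could then be optimized against; the correction makes the distance slightly larger than the raw mismatch fraction because it accounts for multiple substitutions at the same site.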

    The Cyclohedron Test for Finding Periodic Genes in Time Course Expression Studies

    The problem of finding periodically expressed genes from time course microarray experiments is at the center of numerous efforts to identify the molecular components of biological clocks. We present a new approach to this problem based on the cyclohedron test, which is a rank test inspired by recent advances in algebraic combinatorics. The test has the advantage of being robust to measurement errors, and can be used to ascertain the significance of top-ranked genes. We apply the test to recently published measurements of gene expression during mouse somitogenesis and find 32 genes that are collectively significant. Among these are previously identified periodic genes involved in the Notch/FGF and Wnt signaling pathways, as well as novel candidate genes that may play a role in regulating the segmentation clock. These results confirm that there is an abundance of exceptionally periodic genes expressed during somitogenesis. The emphasis of this paper is on the statistics and combinatorics that underlie the cyclohedron test and its implementation within a multiple testing framework. Comment: Revision consists of reorganization and further statistical discussion; 19 pages, 4 figures.

    vrmlgen: An R Package for 3D Data Visualization on the Web

    The 3-dimensional representation and inspection of complex data is a frequently used strategy in many data analysis domains. Existing data mining software often lacks functionality that would enable users to explore 3D data interactively, especially if one wishes to make dynamic graphical representations directly viewable on the web. In this paper we present vrmlgen, a software package for the statistical programming language R to create 3D data visualizations in web formats like the Virtual Reality Markup Language (VRML) and LiveGraphics3D. vrmlgen can be used to generate 3D charts and bar plots, scatter plots with density estimation contour surfaces, and visualizations of height maps, 3D object models and parametric functions. For greater flexibility, the user can also access low-level plotting methods through a unified interface and freely group different function calls together to create new higher-level plotting methods. Additionally, we present a web tool allowing users to visualize 3D data online and test some of vrmlgen's features without the need to install any software on their computer.

    Swarm-Organized Topographic Mapping

    Topography-preserving mappings attempt to map high-dimensional or complex data sets onto a low-dimensional output space while reproducing the topography of the data sufficiently well. The quality of such a mapping usually depends on the neighborhood concept used by the algorithm that constructs it. Swarm-Organized Projection offers a solution to this parameterization problem by using techniques from swarm intelligence. The practical usability of this method was demonstrated by two applications, one in molecular biology and one in financial analytics.

    Expression cartography of human tissues using self organizing maps

    Background: The availability of parallel, high-throughput microarray and sequencing experiments poses the challenge of how best to arrange and analyze the resulting mass of multidimensional data in a concerted way. Self-organizing maps (SOM), a machine learning method, enable a parallel sample-centered and gene-centered view of the data, combined with strong visualization and second-level analysis capabilities. The paper addresses aspects of the method with practical impact in the context of expression analysis of complex data sets.
Results: The method was applied to generate a SOM characterizing the whole-genome expression profiles of 67 healthy human tissues selected from ten tissue categories (adipose, endocrine, homeostasis, digestion, exocrine, epithelium, sexual reproduction, muscle, immune system and nervous tissues). SOM mapping reduces the dimension of the expression data from tens of thousands of genes to a few thousand metagenes, where each metagene acts as a representative of a minicluster of co-regulated single genes. Tissue-specific properties and common properties shared between groups of tissues emerge as a handful of localized spots in the tissue maps, collecting groups of co-regulated and co-expressed metagenes. The functional context of the spots was discovered using overrepresentation analysis with respect to pre-defined gene sets of known functional impact. We found that tissue-related spots typically contain enriched populations of gene sets that correspond well to molecular processes in the respective tissues. Analysis techniques normally used at the gene level, such as two-way hierarchical clustering, provide a better signal-to-noise ratio and better representativeness if applied to the metagenes. Metagene-based clustering analyses aggregate the tissues into essentially three clusters containing nervous, immune system and the remaining tissues.
Conclusions: The global view of the behavior of a few well-defined modules of correlated and differentially expressed genes is more intuitive and more informative than the separate discovery of the expression levels of hundreds or thousands of individual genes. The metagene approach is less sensitive to a priori selection of genes. It can detect a coordinated expression pattern whose components would not pass single-gene significance thresholds, and it is able to extract context-dependent patterns of gene expression in complex data sets.
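
As a rough illustration of how a SOM compresses many input profiles into a small grid of prototype vectors (the "metagenes" of the abstract above), here is a minimal sketch of the classic online SOM update. The function names, the linear decay schedules and every default parameter are assumptions made for illustration; they are not taken from the paper.

```python
import math
import random


def best_matching_unit(weights, x):
    """Return the grid node whose prototype is closest (squared Euclidean) to x."""
    return min(
        weights,
        key=lambda node: sum((w - v) ** 2 for w, v in zip(weights[node], x)),
    )


def train_som(data, rows, cols, epochs=50, lr0=0.5, sigma0=None, seed=0):
    """Train a rows x cols SOM on a list of equal-length numeric vectors."""
    rng = random.Random(seed)
    dim = len(data[0])
    if sigma0 is None:
        sigma0 = max(rows, cols) / 2.0
    # One prototype vector per grid node, initialised uniformly in [0, 1).
    weights = {
        (r, c): [rng.random() for _ in range(dim)]
        for r in range(rows)
        for c in range(cols)
    }
    total_steps = epochs * len(data)
    step = 0
    for _ in range(epochs):
        order = list(data)
        rng.shuffle(order)
        for x in order:
            frac = step / total_steps
            lr = lr0 * (1.0 - frac)                  # learning rate decays linearly
            sigma = max(sigma0 * (1.0 - frac), 0.5)  # neighbourhood radius shrinks
            bmu = best_matching_unit(weights, x)
            for node, w in weights.items():
                d2 = (node[0] - bmu[0]) ** 2 + (node[1] - bmu[1]) ** 2
                h = math.exp(-d2 / (2.0 * sigma * sigma))  # Gaussian neighbourhood
                for i in range(dim):
                    w[i] += lr * h * (x[i] - w[i])
            step += 1
    return weights
```

After training, each input vector can be assigned to its best-matching node with `best_matching_unit`, and the vectors sharing a node form a minicluster represented by that node's prototype.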

    Supervised learning of short and high-dimensional temporal sequences for life science measurements

    The analysis of physiological processes over time is often based on spectrometric or gene expression profiles with only a few time points but a large number of measured variables. The analysis of such temporal sequences is challenging, and only a few methods have been proposed. The information can be encoded time-independently, by means of classical expression differences at a single time point, or in expression profiles over time. Available methods are limited to unsupervised and semi-supervised settings. The predictive variables can be identified only by means of wrapper or post-processing techniques, which is complicated by the small number of samples in such studies. Here, we present a supervised learning approach, termed Supervised Topographic Mapping Through Time (SGTM-TT). It learns a supervised mapping of the temporal sequences onto a low-dimensional grid. We utilize a hidden Markov model (HMM) to account for the time domain and relevance learning to identify the feature dimensions most predictive over time. The learned mapping can be used to visualize the temporal sequences and to predict the class of a new sequence. The relevance learning permits the identification of discriminating masses or gene expressions and prunes dimensions that are unnecessary for the classification task or that encode mainly noise. In this way we obtain a very efficient learning system for temporal sequences. The results indicate that using simultaneous supervised learning and metric adaptation significantly improves the prediction accuracy for synthetic and real-life data in comparison to the standard techniques. The discriminating features identified by relevance learning compare favorably with the results of alternative methods. Our method permits the visualization of the data on a low-dimensional grid, highlighting the observed temporal structure.

    Intrahepatic cholangiocarcinoma: review and update

    Cholangiocarcinoma (CCA) is a heterogeneous group of malignancies that can develop at any level of the biliary tree. CCA is currently classified into intrahepatic (iCCA), perihilar and distal on the basis of its anatomical location. Of note, these three CCA subtypes have common features but also important inter-tumor and intra-tumor differences that can affect pathogenesis and outcome. A unique feature of iCCA is that its tissues of origin can be either the hepatic parenchyma or the large intrahepatic and extrahepatic bile ducts, which are supplied by two distinct stem cell niches, the canals of Hering and the peribiliary glands, respectively. The complexity of iCCA pathogenesis highlights the need for a multidisciplinary, translational and systemic approach to this malignancy. This review focuses on advances in iCCA epidemiology, histo-morphology, risk factors and molecular pathogenesis, revealing the existence of multiple subsets of iCCA.

    Computational exploration of molecular receptive fields in the olfactory bulb reveals a glomerulus-centric chemical map

    © The Author(s) 2020. This article is licensed under a Creative Commons Attribution 4.0 International License. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/. Progress in olfactory research is currently hampered by incomplete knowledge about the chemical receptive ranges of primary receptors. Moreover, the chemical logic underlying the arrangement of computational units in the olfactory bulb has still not been resolved. We undertook a large-scale approach to characterising the molecular receptive ranges (MRRs) of glomeruli in the dorsal olfactory bulb (dOB) innervated by the MOR18-2 olfactory receptor, also known as Olfr78, with human ortholog OR51E2. Guided by an iterative approach that combined biological screening and machine learning, we selected 214 odorants to characterise the response of MOR18-2 and its neighbouring glomeruli. We found that a combination of conventional physico-chemical and vibrational molecular descriptors performed best in predicting glomerular responses using nonlinear Support-Vector Regression. We also discovered several previously unknown odorants activating MOR18-2 glomeruli, and obtained detailed MRRs of MOR18-2 glomeruli and their neighbours. Our results confirm earlier findings that demonstrated tunotopy, that is, glomeruli with similar tuning curves tend to be located in spatial proximity in the dOB. In addition, our results indicate chemotopy, that is, a preference for glomeruli with similar physico-chemical MRR descriptions being located in spatial proximity. Together, these findings suggest the existence of a partial chemical map underlying glomerular arrangement in the dOB. Our methodology, which combines machine learning and physiological measurements, lights the way towards future high-throughput studies to deorphanise and characterise structure-activity relationships in olfaction. Peer reviewed.