18 research outputs found

    BClass: A Bayesian Approach Based on Mixture Models for Clustering and Classification of Heterogeneous Biological Data

    Based on mixture models, we present a Bayesian method (called BClass) to classify biological entities (e.g. genes) when variables of quite heterogeneous nature are analyzed. Various statistical distributions are used to model the continuous/categorical data commonly produced by genetic experiments and large-scale genomic projects. We calculate the posterior probability that each entry belongs to each element (group) in the mixture. In this way, an original set of heterogeneous variables is transformed into a set of purely homogeneous characteristics represented by the probabilities that each entry belongs to the groups. The number of groups in the analysis is controlled dynamically by rendering the groups as 'alive' and 'dormant' depending upon the number of entities classified within them. Using standard Metropolis-Hastings and Gibbs sampling algorithms, we constructed a sampler to approximate posterior moments and grouping probabilities. Since this method does not require the definition of similarity measures, it is especially suitable for data mining and knowledge discovery in biological databases. We applied BClass to classify genes in RegulonDB, a database specialized in information about the transcriptional regulation of gene expression in the bacterium Escherichia coli. The classification obtained is consistent with current knowledge and allowed prediction of missing values for a number of genes. BClass is object-oriented and fully programmed in Lisp-Stat. The output grouping probabilities are analyzed and interpreted using graphical (dynamically linked plots) and query-based approaches. We discuss the advantages of using Lisp-Stat as a programming language as well as the problems we faced when the data volume increased exponentially due to the ever-growing number of genomic projects.
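The grouping probabilities described in the abstract above can be illustrated with a minimal sketch. It is not the BClass implementation (which is written in Lisp-Stat and samples the parameters by MCMC). It assumes fixed, hypothetical group parameters, one Gaussian variable and one categorical variable per entry, and independence of the variables within a group. The names `membership_probabilities`, `expression`, and `strand` are invented for illustration.

```python
import math


def gaussian_pdf(x, mean, sd):
    """Density of a normal distribution with the given mean and standard deviation."""
    return math.exp(-0.5 * ((x - mean) / sd) ** 2) / (sd * math.sqrt(2 * math.pi))


def membership_probabilities(entry, groups):
    """Return P(group | entry) for every group, using Bayes' rule.

    Assumption: within a group, the continuous and categorical variables
    are independent, so their likelihoods multiply.
    """
    joint = []
    for g in groups:
        likelihood = gaussian_pdf(entry["expression"], g["mean"], g["sd"])
        likelihood *= g["cat_probs"][entry["strand"]]
        joint.append(g["weight"] * likelihood)
    total = sum(joint)
    return [j / total for j in joint]


# Hypothetical parameters for two groups (not taken from the paper).
groups = [
    {"weight": 0.5, "mean": 0.0, "sd": 1.0,
     "cat_probs": {"forward": 0.8, "reverse": 0.2}},
    {"weight": 0.5, "mean": 3.0, "sd": 1.0,
     "cat_probs": {"forward": 0.3, "reverse": 0.7}},
]

# One entry with a continuous and a categorical measurement.
entry = {"expression": 0.2, "strand": "forward"}
probs = membership_probabilities(entry, groups)
print(probs)
```

Each entry, whatever the types of its original variables, ends up represented by a vector of probabilities that sums to one. This is the sense in which heterogeneous variables become homogeneous characteristics.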

    Were last glacial climate events simultaneous between Greenland and France? A quantitative comparison using non-tuned chronologies

    Author Posting. © The Author(s), 2009. This is the author's version of the work. It is posted here by permission of John Wiley & Sons for personal use, not for redistribution. The definitive version was published in Journal of Quaternary Science 25 (2010): 387-394, doi:10.1002/jqs.1330. Several large abrupt climate fluctuations during the last glacial have been recorded in Greenland ice cores and archives from other regions. Often these Dansgaard-Oeschger events are assumed to have been synchronous over wide areas, and then used as tie-points to link chronologies between the proxy archives. However, it has not yet been tested independently whether or not these events were indeed synchronous over large areas. Here, we compare Dansgaard-Oeschger-type events in a well-dated record from southeastern France with those in Greenland ice cores. Instead of assuming simultaneous climate events between both archives, we keep their age models independent. Even these well-dated archives possess large chronological uncertainties that prevent us from inferring synchronous climate events at decadal to multi-centennial time scales. If possible, comparisons between proxy archives should be based on independent, non-tuned time-scales. BW acknowledges support from the Swedish Research Council (VR).

    The Comparison of 14C Wiggle-Matching Results for the ‘Floating’ Tree-Ring Chronology of the Ulandryk-4 Burial Ground (Altai Mountains, Siberia)

    From the 18th International Radiocarbon Conference held in Wellington, New Zealand, September 1-5, 2003. Two independent 14C data sets of 10 tree-ring samples from the longest master chronology of the Pazyryk cultural complex were obtained and wiggle-matched to the absolute timescale. The results show very good agreement, within 10-15 calendar yr. The Ulandryk-4 burial ground (mound 1) was dated to about 320-310 cal BC, and this is consistent with wiggle-matching of the Pazyryk burial ground date series. The Radiocarbon archives are made available by Radiocarbon and the University of Arizona Libraries. Contact [email protected] for further information. Migrated from OJS platform February 202