18 research outputs found
BClass: A Bayesian Approach Based on Mixture Models for Clustering and Classification of Heterogeneous Biological Data
Based on mixture models, we present a Bayesian method (called BClass) to classify biological entities (e.g. genes) when variables of quite heterogeneous nature are analyzed. Various statistical distributions are used to model the continuous/categorical data commonly produced by genetic experiments and large-scale genomic projects. We calculate the posterior probability of each entry to belong to each element (group) in the mixture. In this way, an original set of heterogeneous variables is transformed into a set of purely homogeneous characteristics represented by the probabilities of each entry to belong to the groups. The number of groups in the analysis is controlled dynamically by rendering the groups as 'alive' and 'dormant' depending upon the number of entities classified within them. Using standard Metropolis-Hastings and Gibbs sampling algorithms, we constructed a sampler to approximate posterior moments and grouping probabilities. Since this method does not require the definition of similarity measures, it is especially suitable for data mining and knowledge discovery in biological databases. We applied BClass to classify genes in RegulonDB, a database specialized in information about the transcriptional regulation of gene expression in the bacterium Escherichia coli. The classification obtained is consistent with current knowledge and allowed prediction of missing values for a number of genes. BClass is object-oriented and fully programmed in Lisp-Stat. The output grouping probabilities are analyzed and interpreted using graphical (dynamically linked plots) and query-based approaches. We discuss the advantages of using Lisp-Stat as a programming language as well as the problems we faced when the data volume increased exponentially due to the ever-growing number of genomic projects.
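The central quantity in the abstract above, the posterior probability that an entry belongs to each group of the mixture, can be illustrated with a minimal Python sketch. This is not the BClass implementation (which is written in Lisp-Stat and samples the parameters by Metropolis-Hastings and Gibbs steps). It assumes fixed, already-known group parameters, one continuous variable modelled as normal, one categorical variable, and conditional independence of the two variables within a group. The function names, dictionary keys, and numbers are all hypothetical.

```python
import math


def normal_pdf(x, mean, sd):
    """Density of a normal distribution at x."""
    return math.exp(-0.5 * ((x - mean) / sd) ** 2) / (sd * math.sqrt(2 * math.pi))


def membership_probabilities(x, category, groups):
    """Posterior probability that one entry belongs to each group.

    Assumption: within a group, the continuous value x and the categorical
    value are independent, so their likelihoods multiply.
    """
    weighted = []
    for g in groups:
        likelihood = normal_pdf(x, g["mean"], g["sd"]) * g["cat_probs"][category]
        weighted.append(g["weight"] * likelihood)
    total = sum(weighted)
    # Bayes' rule: normalise weight * likelihood over all groups.
    return [w / total for w in weighted]


# Hypothetical two-group mixture with illustrative parameter values.
groups = [
    {"weight": 0.5, "mean": 0.0, "sd": 1.0, "cat_probs": {"a": 0.8, "b": 0.2}},
    {"weight": 0.5, "mean": 3.0, "sd": 1.0, "cat_probs": {"a": 0.3, "b": 0.7}},
]

probs = membership_probabilities(0.2, "a", groups)
print(probs)
```

Each heterogeneous entry (a number plus a category) is reduced to a vector of group probabilities that sums to one, which is the homogeneous representation the abstract describes. In a full sampler these probabilities would be recomputed at every iteration as the group parameters are updated.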
Were last glacial climate events simultaneous between Greenland and France? A quantitative comparison using non-tuned chronologies
Author Posting. © The Author(s), 2009. This is the author's version of the work. It is posted here by permission of John Wiley & Sons for personal use, not for redistribution. The definitive version was published in Journal of Quaternary Science 25 (2010): 387-394, doi:10.1002/jqs.1330. Several large abrupt climate fluctuations during the last glacial have been recorded in Greenland ice cores and archives from other regions. Often these Dansgaard-Oeschger events are assumed to have been synchronous over wide areas, and then used as tie-points to link chronologies between the proxy archives. However, it has not yet been tested independently whether or not these events were indeed synchronous over large areas. Here, we compare Dansgaard-Oeschger-type events in a well-dated record from southeastern France with those in Greenland ice cores. Instead of assuming simultaneous climate events between both archives, we keep their age models independent. Even these well-dated archives possess large chronological uncertainties that prevent us from inferring synchronous climate events at decadal to multi-centennial time scales. If possible, comparisons between proxy archives should be based on independent, non-tuned time-scales. BW acknowledges support from the Swedish Research Council (VR).
The Comparison of 14C Wiggle-Matching Results for the "Floating" Tree-Ring Chronology of the Ulandryk-4 Burial Ground (Altai Mountains, Siberia)
From the 18th International Radiocarbon Conference held in Wellington, New Zealand, September 1-5, 2003. Two independent 14C data sets of 10 tree-ring samples from the longest master chronology of the Pazyryk cultural complex were obtained and wiggle-matched to the absolute timescale. The results show very good agreement, within 10-15 calendar yr. The Ulandryk-4 burial ground (mound 1) was dated to about 320-310 cal BC, and this is consistent with wiggle-matching of the Pazyryk burial ground date series. The Radiocarbon archives are made available by Radiocarbon and the University of Arizona Libraries. Contact [email protected] for further information. Migrated from OJS platform February 202