    Emergence of regulatory networks in simulated evolutionary processes

    Despite spectacular progress in biophysics, molecular biology and biochemistry, our ability to predict the dynamic behavior of multicellular systems under different conditions is very limited. An important reason for this is that not enough is yet known about how cells change their physical and biological properties by genetic or metabolic regulation, and which of these changes affect the cell behavior. For this reason, it is difficult to predict the behavior of multicellular systems when the cell behavior changes, for example, as a consequence of regulation or differentiation. The rules that underlie the regulation processes have been determined on the time scale of evolution, by selection on the phenotypic level of cells or cell populations. We illustrate by detailed computer simulations in a multi-scale approach how cell behavior controlled by regulatory networks may emerge as a consequence of an evolutionary process, if either the cells or populations of cells are subject to selection on particular features. We consider two examples, migration strategies of single cells searching for a signal source and aggregation of two or more cells, within minimal multiscale models of biological evolution. Both can be found, for example, in the life cycle of the slime mold Dictyostelium discoideum. However, phenotypic changes that can lead to completely different modes of migration have also been observed in cells of multi-cellular organisms, for example, as a consequence of specialization in stem cells or de-differentiation in tumor cells. The regulatory networks are represented by Boolean networks and encoded by binary strings. The latter may be considered as encoding the genetic information (the genotype) and are subject to mutations and crossovers. The cell behavior reflects the phenotype. We find that cells adopt naturally observed migration strategies, controlled by networks that show robustness and redundancy. The simplicity of the model allows us to unambiguously analyze the regulatory networks and the resulting phenotypes by different measures and by knockouts of regulatory elements. We illustrate that in order to maintain a cell's phenotype in case of a knockout, the cell may have to be able to deal with contradictory information. In summary, both the cell phenotype and the emerged regulatory network behave as their biological counterparts observed in nature.
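
The encoding described in the abstract above (Boolean networks stored as binary strings that undergo mutation and crossover) can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' model: the fixed wiring, the number of inputs per node `k`, the synchronous update, and all function names are hypothetical choices made here.

```python
import random

def decode(genome, n_nodes, k):
    """Split a bit list into one truth table of 2**k entries per node.

    Assumption: only the truth tables are encoded; the wiring is fixed.
    """
    size = 2 ** k
    assert len(genome) == n_nodes * size
    return [genome[i * size:(i + 1) * size] for i in range(n_nodes)]

def step(state, tables, wiring):
    """Synchronously update every node from the states of its input nodes."""
    new_state = []
    for table, sources in zip(tables, wiring):
        index = 0
        for src in sources:
            # Build the truth-table index from the input bits, first input
            # as the most significant bit.
            index = (index << 1) | state[src]
        new_state.append(table[index])
    return new_state

def mutate(genome, rate, rng):
    """Flip each bit independently with probability `rate`."""
    return [bit ^ 1 if rng.random() < rate else bit for bit in genome]

def crossover(a, b, rng):
    """One-point crossover: prefix from `a`, suffix from `b`."""
    point = rng.randrange(1, len(a))
    return a[:point] + b[point:]

if __name__ == "__main__":
    rng = random.Random(0)
    # Three nodes with two inputs each: AND, OR and XOR truth tables.
    genome = [0, 0, 0, 1, 0, 1, 1, 1, 0, 1, 1, 0]
    wiring = [(1, 2), (0, 2), (0, 1)]
    tables = decode(genome, n_nodes=3, k=2)
    print(step([1, 0, 1], tables, wiring))
    print(crossover(mutate(genome, 0.1, rng), genome, rng))
```

In an evolutionary loop, the genome would play the role of the genotype and the trajectory produced by repeated `step` calls would drive the cell behavior (the phenotype) on which selection acts.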

    Progressive Multiple Sequence Alignments from Triplets

    Motivation: The quality of progressive sequence alignments strongly depends on the accuracy of the individual pairwise alignment steps, since gaps that are introduced at one step cannot be removed at later aggregation steps. Adjacent insertions and deletions necessarily appear in arbitrary order in pairwise alignments and hence form an unavoidable source of errors. Idea: Here we present a modified variant of progressive sequence alignments that addresses both issues. Instead of pairwise alignments we use exact dynamic programming to align sequence or profile triples. This avoids a large fraction of the ambiguities arising in pairwise alignments. In the subsequent aggregation steps we follow the logic of the Neighbor-Net algorithm, which constructs a phylogenetic network by stepwise replacement of triples by pairs instead of combining pairs to singletons. To this end the three-way alignments are subdivided into two partial alignments, at which stage all-gap columns are naturally removed. This alleviates the “once a gap, always a gap” problem of progressive alignment procedures. Results: The three-way Neighbor-Net based alignment program aln3nn is shown to compare favorably on both protein sequences and nucleic acid sequences to other progressive alignment tools. In the latter case one can easily include scoring terms that consider secondary structure features. Overall, the quality of the resulting alignments in general exceeds that of clustalw or other multiple alignment tools, even though our software does not include heuristics for context-dependent (mis)match scores.
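
To make the idea of exact three-way dynamic programming concrete, here is a minimal sketch for three plain sequences. It is not the aln3nn implementation: the sum-of-pairs scoring, the linear gap cost, the score values, and the names `align3` and `project` are all assumptions made for illustration, and profiles are not handled. The `project` helper mimics the step in which a three-way alignment is restricted to a subset of its rows and all-gap columns are dropped.

```python
from itertools import product

GAP = "-"

def pair_score(a, b, match=1, mismatch=-1, gap=-2):
    """Score one pair of symbols in a column (hypothetical values)."""
    if a == GAP and b == GAP:
        return 0
    if a == GAP or b == GAP:
        return gap
    return match if a == b else mismatch

def column_score(x, y, z):
    """Sum-of-pairs score of one alignment column."""
    return pair_score(x, y) + pair_score(x, z) + pair_score(y, z)

# The seven ways a column can consume a symbol from each of the sequences.
MOVES = [m for m in product((0, 1), repeat=3) if any(m)]

def align3(s1, s2, s3):
    """Exact three-way alignment; returns (score, [row1, row2, row3])."""
    n1, n2, n3 = len(s1), len(s2), len(s3)
    score = {(0, 0, 0): 0}
    back = {}
    # Lexicographic order guarantees predecessors are filled in first.
    for i, j, k in product(range(n1 + 1), range(n2 + 1), range(n3 + 1)):
        if (i, j, k) == (0, 0, 0):
            continue
        best = None
        for di, dj, dk in MOVES:
            pi, pj, pk = i - di, j - dj, k - dk
            if pi < 0 or pj < 0 or pk < 0:
                continue
            col = (s1[pi] if di else GAP,
                   s2[pj] if dj else GAP,
                   s3[pk] if dk else GAP)
            cand = score[(pi, pj, pk)] + column_score(*col)
            if best is None or cand > best:
                best = cand
                back[(i, j, k)] = ((pi, pj, pk), col)
        score[(i, j, k)] = best
    # Trace back from the final cell to recover the columns.
    cols = []
    cell = (n1, n2, n3)
    while cell != (0, 0, 0):
        cell, col = back[cell]
        cols.append(col)
    cols.reverse()
    rows = ["".join(c[r] for c in cols) for r in range(3)]
    return score[(n1, n2, n3)], rows

def project(rows, keep):
    """Restrict an alignment to the rows in `keep` and drop all-gap columns."""
    cols = [c for c in zip(*(rows[r] for r in keep))
            if any(ch != GAP for ch in c)]
    return ["".join(col[r] for col in cols) for r in range(len(keep))]

if __name__ == "__main__":
    total, rows = align3("ACGT", "AGT", "ACT")
    print(total)
    print("\n".join(rows))
    print(project(rows, [1, 2]))
```

The table has one cell per index triple, so time and memory grow with the product of the three sequence lengths; this sketch is meant for short inputs only.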

    Noisy: Identification of problematic columns in multiple sequence alignments

    Motivation: Sequence-based methods for phylogenetic reconstruction from (nucleic acid) sequence data are notoriously plagued by two effects: homoplasies and alignment errors. Large evolutionary distances imply a large number of homoplastic sites. As most protein-coding genes show dramatic variations in substitution rates that are not uncorrelated across the sequence, this often leads to a patchwork pattern of (i) phylogenetically informative and (ii) effectively randomized regions. In highly variable regions, furthermore, alignment errors accumulate, resulting in sometimes misleading signals in phylogenetic reconstruction. Results: We present here a method that, based on assessing the distribution of character states along a cyclic ordering of the taxa, allows the identification of phylogenetically uninformative homoplastic sites in a multiple sequence alignment. Removal of these sites appears to improve the performance of phylogenetic reconstruction algorithms as measured by various indices of "tree quality". In particular, we obtain more stable trees due to the exclusion of phylogenetically incompatible sites that most likely represent strongly randomized characters. Software: The computer program noisy implements this approach. It can be employed to improve phylogenetic reconstruction capability with quite a considerable success rate whenever (1) the average bootstrap support obtained from the original alignment is low, and (2) there are sufficiently many taxa in the data set (at least, say, 12 to 15 taxa). The software can be obtained under the GNU Public License from http://www.bioinf.uni-leipzig.de/Software/noisy/
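
One plausible way to read "assessing the distribution of character states along a cyclic ordering of the taxa" is sketched below. The abstract does not give the actual statistic used by noisy, so everything here is an assumption: the column is taken to be already arranged in the cyclic taxon order, the number of state changes between cyclic neighbours is compared with the mean over random shuffles, and the names `cyclic_changes`, `shuffled_mean_changes` and `looks_randomized` are hypothetical.

```python
import random

def cyclic_changes(column):
    """Count state changes between neighbours along the circular ordering."""
    n = len(column)
    return sum(column[i] != column[(i + 1) % n] for i in range(n))

def shuffled_mean_changes(column, n_shuffles=200, rng=None):
    """Mean number of cyclic changes after randomly permuting the taxa."""
    rng = rng or random.Random(0)
    states = list(column)
    total = 0
    for _ in range(n_shuffles):
        rng.shuffle(states)
        total += cyclic_changes(states)
    return total / n_shuffles

def looks_randomized(column, n_shuffles=200, rng=None):
    """Hypothetical flag: the observed ordering is no more clustered than
    a random one, i.e. it shows at least as many changes as the shuffled mean.
    """
    observed = cyclic_changes(column)
    return observed >= shuffled_mean_changes(column, n_shuffles, rng)

if __name__ == "__main__":
    # Each string is one alignment column, read in the cyclic taxon order.
    for col in ("AAAACCCC", "ACACACAC"):
        print(col, cyclic_changes(col), looks_randomized(col))
```

A column whose states fall into a few contiguous arcs of the cycle produces few changes and is kept; a column whose states alternate around the cycle is flagged for removal.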

    Geo-Information Harvesting from Social Media Data

    As unconventional sources of geo-information, massive imagery and text messages from open platforms and social media form a temporally quasi-seamless, spatially multi-perspective stream, but with unknown and diverse quality. Due to its complementarity to remote sensing data, geo-information from these sources offers promising perspectives, but harvesting is not trivial due to its data characteristics. In this article, we address key aspects in the field, including data availability, analysis-ready data preparation and data management, geo-information extraction from social media text messages and images, and the fusion of social media and remote sensing data. We then showcase some exemplary geographic applications. In addition, we present the first extensive discussion of ethical considerations of social media data in the context of geo-information harvesting and geographic applications. With this effort, we wish to stimulate curiosity and lay the groundwork for researchers who intend to explore social media data for geo-applications. We encourage the community to join forces by sharing their code and data. Comment: Accepted for publication in IEEE Geoscience and Remote Sensing Magazine.

    A survey of uncertainty in deep neural networks

    Over the last decade, neural networks have reached almost every field of science and become a crucial part of various real-world applications. Due to the increasing spread, confidence in neural network predictions has become more and more important. However, basic neural networks do not deliver certainty estimates or suffer from over- or under-confidence, i.e., they are badly calibrated. To overcome this, many researchers have been working on understanding and quantifying uncertainty in a neural network's prediction. As a result, different types and sources of uncertainty have been identified and various approaches to measure and quantify uncertainty in neural networks have been proposed. This work gives a comprehensive overview of uncertainty estimation in neural networks, reviews recent advances in the field, highlights current challenges, and identifies potential research opportunities. It is intended to give anyone interested in uncertainty estimation in neural networks a broad overview and introduction, without presupposing prior knowledge in this field. For that, a comprehensive introduction to the most crucial sources of uncertainty is given and their separation into reducible model uncertainty and irreducible data uncertainty is presented. The modeling of these uncertainties based on deterministic neural networks, Bayesian neural networks (BNNs), ensembles of neural networks, and test-time data augmentation approaches is introduced and different branches of these fields as well as the latest developments are discussed. For a practical application, we discuss different measures of uncertainty, approaches for calibrating neural networks, and give an overview of existing baselines and available implementations. Different examples from the wide spectrum of challenges in the fields of medical image analysis, robotics, and earth observation give an idea of the needs and challenges regarding uncertainties in the practical applications of neural networks. Additionally, the practical limitations of uncertainty quantification methods in neural networks for mission- and safety-critical real-world applications are discussed and an outlook on the next steps towards a broader usage of such methods is given.
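
The separation into model uncertainty and data uncertainty mentioned in the abstract above can be illustrated for an ensemble of classifiers with a commonly used entropy-based decomposition. This is a sketch under assumptions and not necessarily the formulation used in the survey: the total uncertainty is taken as the entropy of the averaged prediction, the data part as the average entropy of the individual members, and the model part as their difference. The function names are hypothetical.

```python
import math

def entropy(probs):
    """Shannon entropy in nats; zero-probability classes contribute nothing."""
    return -sum(p * math.log(p) for p in probs if p > 0)

def decompose(member_probs):
    """Split ensemble uncertainty for one input into (total, data, model).

    member_probs: one class-probability list per ensemble member.
    """
    n = len(member_probs)
    mean = [sum(col) / n for col in zip(*member_probs)]
    total = entropy(mean)                              # entropy of the mean prediction
    data = sum(entropy(p) for p in member_probs) / n   # mean entropy of the members
    return total, data, total - data                   # difference: disagreement term

if __name__ == "__main__":
    # Members agree on an ambiguous input: uncertainty is attributed to the data.
    print(decompose([[0.5, 0.5], [0.5, 0.5]]))
    # Members are individually confident but disagree: attributed to the model.
    print(decompose([[1.0, 0.0], [0.0, 1.0]]))
```

The same function applies to any set of stochastic predictions for one input, for example samples from a Bayesian neural network or predictions under test-time augmentation, although the interpretation of the disagreement term differs between these settings.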
