
    Speciation dynamics of an agent-based evolution model in phenotype space

    This dissertation is an exploration of phase transition behavior and clustering of populations of organisms in an agent-based model of evolutionary dynamics. The agents in the model are organisms, described as branching-coalescing random walkers, which are characterized by their coordinates in a two-dimensional phenotype space. Neutral evolutionary conditions are assumed, such that no organism has a fitness advantage regardless of its phenotype location. Lineages of organisms evolve by limiting the maximum possible offspring distance from their parent(s) (mutability, which is the only heritable trait) along each coordinate in phenotype space. As mutability is varied, a non-equilibrium phase transition is shown to occur for populations reproducing by assortative mating and asexual fission. Furthermore, mutability is also shown to change the clustering behavior of populations. Random mating is shown to destroy both phase transition behavior and clustering. The phase transition behavior is characterized in the asexual fission case. By demonstrating that the populations near criticality collapse to universal scaling functions with appropriate critical exponents, this case is shown to belong to the directed percolation universality class. Finally, lineage behavior is explored for both organisms and clusters. The lineage lifetimes of the initial population of organisms are found to have a power-law probability density which scales with the correlation length exponent near critical mutability. The cluster centroid step-sizes obey a probability density function that is bimodal for all mutability values, and the average displays a linear dependence upon mutability in the supercritical range. Cluster lineage tree structures are shown to have Kingman's coalescent universal tree structure at the directed percolation phase transition despite more complicated lineage structures. --Abstract, page iii

    Computer Vision Approaches for Mapping Gene Expression onto Lineage Trees

    This project concerns studying the early development of living organisms. This period is accompanied by dynamic morphogenetic events. There is an increase in the number of cells, changes in the shape of cells and specification of cell fate during this time. Typically, in order to capture the dynamic morphological changes, one can employ a form of microscopy imaging such as Selective Plane Illumination Microscopy (SPIM) which offers a single-cell resolution across time, and hence allows observing the positions, velocities and trajectories of most cells in a developing embryo. Unfortunately, the dynamic genetic activity which underlies these morphological changes and influences cellular fate decision, is captured only as static snapshots and often requires processing (sequencing or imaging) multiple distinct individuals. In order to set the stage for characterizing the factors which influence cellular fate, one must bring the data arising from the above-mentioned static snapshots of multiple individuals and the data arising from SPIM imaging of other distinct individual(s) which characterizes the changes in morphology, into the same frame of reference. In this project, a computational pipeline is established, which achieves the aforementioned goal of mapping data from these various imaging modalities and specimens to a canonical frame of reference. This pipeline relies on the three core building blocks of Instance Segmentation, Tracking and Registration. In this dissertation work, I introduce EmbedSeg which is my solution to performing instance segmentation of 2D and 3D (volume) image data. Next, I introduce LineageTracer which is my solution to performing tracking of a time-lapse (2d+t, 3d+t) recording. Finally, I introduce PlatyMatch which is my solution to performing registration of volumes. 
Errors from the application of these building blocks accumulate, producing noisy estimates of gene expression for the digitized cells in the canonical frame of reference. These noisy estimates are processed to infer the underlying hidden state by using a Hidden Markov Model (HMM) formulation. Lastly, for wider dissemination of these methods, one requires an effective visualization strategy. A few details about the employed approach are also discussed in the dissertation work. The pipeline was designed with imaging volume data in mind, but can easily be extended to incorporate other data modalities, if available, such as single-cell RNA sequencing (scRNA-Seq) (more details are provided in the Discussion chapter). The methods elucidated in this dissertation would provide a fertile playground for several experiments and analyses in the future. Some of these potential experiments, along with current weaknesses of the computational pipeline, are also discussed in the Discussion chapter.

    Phylogenomic characterization of flaviviruses

    Background: The occurrence of global viral pandemics has been rising as increased travel between distant countries has introduced previously endemic viruses to new environments. Major contributors to global human hemorrhagic and neurological diseases with high mortality rates include half of the ca. 70 species of the genus Flavivirus. The most widespread and well-known flaviviruses are Dengue virus, Japanese encephalitis virus, West Nile virus and Zika virus. Although the transmission routes of major viruses are well-documented and thoroughly researched, the knowledge has been gained from past outbreaks, which has been a limitation in surveillance of novel flaviviruses. Thus, having early information about potential hosts is essential in controlling and preventing viral outbreaks. Aims: The goal of the master’s thesis is to characterize the codon and nucleotide compositions of flaviviruses and to assess their potential use in identifying putative hosts. This methodology will be utilized to develop a new algorithm capable of identifying optimal hosts through a simple comparative codon usage analysis. This information will be highly valuable to estimate the risk of spread of a virus. Methods: The genomic characterization of flaviviruses was done with computational biology methods. Computed codon usages were analyzed with clustering methods to identify subgroups of viruses and their optimal hosts. The rationale behind this methodology was that codon usages vary among species and this variability is driven by the virus adaptation to the hosts. Results: (1) Genotypes of Zika viruses showed distinct codon usage patterns, which linked the origin of American and European virus cases to the Asian genotype. (2) Distinct usage patterns were similarly observed when the methodology was applied to other major flaviviruses. (3) Optimal hosts for mosquito-borne flaviviruses included vertebrates and Aedes mosquitoes, whereas tick-borne viruses were optimized to ticks. Aedes mosquitoes were also optimal for insect-only flaviviruses. Culex and Anopheles mosquitoes were suboptimal for all groups. Moreover, flaviviruses clustered based on established vector-based classification, host type preferences and phylogeny. The identified hosts were in accordance with previous field and laboratory studies. Conclusions: The proposed methodology based on codon usages is able to estimate hosts for flaviviruses within a close range. The algorithm can be implemented on computationally weak equipment, so it may be deployed quickly and on-site during viral pandemics. In further studies this methodology, with minor modifications, could be utilized to predict putative hosts of other viruses. A scientific article describing the host identification algorithm is under preparation (appendix 4).
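
    A comparative codon usage analysis of the kind described in this abstract could be sketched roughly as follows. This is only an illustration: the abstract does not state which distance measure or clustering method the thesis uses, so the Euclidean distance, the function names (codon_usage, usage_distance, closest_host) and the toy sequences are all assumptions.

```python
from collections import Counter
from math import sqrt


def codon_usage(seq):
    """Relative codon frequencies of a coding sequence, read in frame 0."""
    seq = seq.upper().replace("U", "T")
    usable = len(seq) - len(seq) % 3          # drop a trailing partial codon
    codons = [seq[i:i + 3] for i in range(0, usable, 3)]
    # Ignore codons containing ambiguous bases such as N.
    counts = Counter(c for c in codons if set(c) <= set("ACGT"))
    total = sum(counts.values())
    return {c: n / total for c, n in counts.items()} if total else {}


def usage_distance(u, v):
    """Euclidean distance between two codon usage profiles (an assumed metric)."""
    keys = set(u) | set(v)
    return sqrt(sum((u.get(k, 0.0) - v.get(k, 0.0)) ** 2 for k in keys))


def closest_host(virus_seq, host_seqs):
    """Name of the candidate host whose codon usage is nearest to the virus."""
    virus_usage = codon_usage(virus_seq)
    return min(
        host_seqs,
        key=lambda name: usage_distance(virus_usage, codon_usage(host_seqs[name])),
    )


# Toy sequences, invented for illustration only.
hosts = {"host_a": "ATGATGAAG", "host_b": "CCCCCCGGG"}
best = closest_host("ATGATGAAA", hosts)
```

    A real analysis would use complete coding sequences and would probably normalize usage per amino acid rather than over all codons, which this sketch does not do.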

    Global phylogeny of Treponema pallidum lineages reveals recent expansion and spread of contemporary syphilis.

    Funder: Queensland Government. Syphilis, which is caused by the sexually transmitted bacterium Treponema pallidum subsp. pallidum, has an estimated 6.3 million cases worldwide per annum. In the past ten years, the incidence of syphilis has increased by more than 150% in some high-income countries, but the evolution and epidemiology of the epidemic are poorly understood. To characterize the global population structure of T. pallidum, we assembled a geographically and temporally diverse collection of 726 genomes from 626 clinical and 100 laboratory samples collected in 23 countries. We applied phylogenetic analyses and clustering, and found that the global syphilis population comprises just two deeply branching lineages, Nichols and SS14. Both lineages are currently circulating in 12 of the 23 countries sampled. We subdivided T. p. pallidum into 17 distinct sublineages to provide further phylodynamic resolution. Importantly, two Nichols sublineages have expanded clonally across 9 countries contemporaneously with SS14. Moreover, pairwise genome analyses revealed examples of isolates collected within the last 20 years from 14 different countries that had genetically identical core genomes, which might indicate frequent exchange through international transmission. It is striking that most samples collected before 1983 are phylogenetically distinct from more recently isolated sublineages. Using Bayesian temporal analysis, we detected a population bottleneck occurring during the late 1990s, followed by rapid population expansion in the 2000s that was driven by the dominant T. pallidum sublineages circulating today. This expansion may be linked to changing epidemiology, immune evasion or fitness under antimicrobial selection pressure, since many of the contemporary syphilis lineages we have characterized are resistant to macrolides.

    Constructive connectomics: How neuronal axons get from here to there using gene-expression maps derived from their family trees

    During brain development, billions of axons must navigate over multiple spatial scales to reach specific neuronal targets, and so build the processing circuits that generate the intelligent behavior of animals. However, the limited information capacity of the zygotic genome puts a strong constraint on how, and which, axonal routes can be encoded. We propose and validate a mechanism of development that can provide an efficient encoding of this global wiring task. The key principle, confirmed through simulation, is that basic constraints on mitoses of neural stem cells—that mitotic daughters have similar gene expression to their parent and do not stray far from one another—induce a global hierarchical map of nested regions, each marked by the expression profile of its common progenitor population. Thus, a traversal of the lineal hierarchy generates a systematic sequence of expression profiles that traces a staged route, which growth cones can follow to their remote targets. We have analyzed gene expression data of developing and adult mouse brains published by the Allen Institute for Brain Science, and found them consistent with our simulations: gene expression indeed partitions the brain into a global spatial hierarchy of nested contiguous regions that is stable at least from embryonic day 11.5 to postnatal day 56. We use this experimental data to demonstrate that our axonal guidance algorithm is able to robustly extend arbors over long distances to specific targets, and that these connections result in a qualitatively plausible connectome. We conclude that, paradoxically, cell division may be the key to uniting the neurons of the brain

    Graph abstraction reconciles clustering with trajectory inference through a topology preserving map of single cells

    Single-cell RNA-seq allows quantification of biological heterogeneity across both discrete cell types and continuous cell differentiation transitions. We present approximate graph abstraction (AGA), an algorithm that reconciles the computational analysis strategies of clustering and trajectory inference by explaining cell-to-cell variation in terms of both discrete and continuous latent variables (https://github.com/theislab/graph_abstraction). This makes it possible to generate cellular maps of differentiation manifolds with complex topologies, efficiently and robustly across different datasets. Approximate graph abstraction quantifies the connectivity of partitions of a neighborhood graph of single cells, thereby generating a much simpler abstracted graph whose nodes label the partitions. Together with a random walk-based distance measure, this generates a topology-preserving map of single cells: a partial coordinatization of data useful for exploring and explaining its variation. We use the abstracted graph to assess which subsets of data are better explained by discrete clusters than by a continuous variable, to trace gene expression changes along aggregated single-cell paths through data and to infer abstracted trees that best explain the global topology of data. We demonstrate the power of the method by reconstructing differentiation processes with high numbers of branchings from single-cell gene expression datasets and by identifying biological trajectories from single-cell imaging data using a deep-learning based distance metric. Along with the method, we introduce measures for the connectivity of graph partitions, generalize random-walk based distance measures to disconnected graphs and introduce a path-based measure for topological similarity between graphs. Graph abstraction is computationally efficient and provides speedups of at least 30 times when compared to algorithms for the inference of lineage trees.
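
    The idea of collapsing a neighborhood graph of cells into a graph over partitions can be illustrated with a very reduced sketch. It simply counts the edges that run between partitions; the connectivity measures introduced in the paper are not reproduced here, and the function name abstract_graph and the toy graph are assumptions made for illustration.

```python
from collections import defaultdict


def abstract_graph(edges, labels):
    """Collapse a cell-level graph into a partition-level graph.

    edges  : iterable of (i, j) node pairs of the neighborhood graph
    labels : dict mapping each node to its partition (cluster) label
    Returns a dict mapping a sorted pair of partition labels to the
    number of cell-level edges that connect the two partitions.
    """
    weights = defaultdict(int)
    for i, j in edges:
        a, b = labels[i], labels[j]
        if a != b:                      # edges inside a partition are ignored
            weights[tuple(sorted((a, b)))] += 1
    return dict(weights)


# Toy neighborhood graph of five cells in three partitions (invented data).
edges = [(0, 1), (1, 2), (2, 3), (3, 4), (1, 3)]
labels = {0: "A", 1: "A", 2: "B", 3: "B", 4: "C"}
coarse = abstract_graph(edges, labels)
```

    In this toy case partitions A and B share two edges and B and C share one, while A and C are not directly connected. Raw edge counts depend on partition size, so a practical measure would compare them against what is expected by chance.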