
    Statistical estimation problems in phylogenomics and applications in microbial ecology

    With the growing awareness of the potential for microbial communities to play a role in human health, environmental remediation and other important processes, the challenge of understanding such a complex population through the lens of high-throughput sequencing output has risen to the fore. For a de novo sequenced community, the first step to understanding the population involves comparing the sequences to a reference database in some form. In this dissertation, we consider some challenges and benefits of organizing the reference data according to evolution, with orthologous genes grouped together and stored as a multiple sequence alignment and phylogenetic tree. First we consider the related problem of estimating the population-level phylogeny of a group of species based on the alignments and phylogenies of several individual genes. Under one common model, species tree estimation is provably statistically consistent by several different methods, but those proofs rely on two separate and potentially shaky assumptions: that every species appears in the data for every gene (i.e., there is no missing data), and that since gene tree estimation is itself consistent, the gene trees used to compute the population-level tree are correct. Second, we explore some novel ways to use a Bayesian MCMC algorithm for jointly estimating alignment and phylogeny. The result is increased accuracy for large alignments, where the MCMC method alone would not be tractable. In the process, we identify a peculiar property of this Bayesian algorithm: it performs very differently on simulated sequences than on sequences from biological alignment benchmarks. No other alignment method tested showed the same divergence. Finally, we present two different practical applications of a reference database containing an alignment and tree for a group of gene families in the context of microbial ecology.
The first is an algorithm that uses the tree and alignment to construct an ensemble of profile hidden Markov models that improves remote homology detection. The second is a data visualization technique that generates an image of the community with a high density of data, yet makes it naturally easy to compare many different samples at a time, potentially uncovering otherwise elusive patterns in the data.

    Palynology and Paleoclimatology of the Chicxulub Impact Crater in the Early Paleogene

    At the end of the Cretaceous Period, a large bolide impacted the Earth and formed the Chicxulub impact crater in the Yucatán Peninsula, Mexico. In 2016, International Ocean Discovery Program (IODP) Expedition 364 Site M0077 drilled into the buried peak ring of the crater, recovering a marine Paleocene to early Eocene post-impact section deposited on top of the impact breccia. Palynological analysis of 195 samples from the post-impact section has yielded the first pre-Holocene vegetational record from inside the Chicxulub impact crater and the first palynological record of the recovery of life following the end-Cretaceous mass extinction from inside the Chicxulub impact crater. The pollen and plant spore assemblage has been fully described, including one new genus (Scabrastephanoporites) and five new species (Brosipollis reticulatus, Echimonocolpites chicxulubensis, Psilastephanocolporites hammenii, Scabrastephanoporites variabilis, and Striatopollis grahamii) of angiosperm pollen. Dinoflagellate cysts from the K/Pg (Cretaceous/Paleogene) transitional unit, likely deposited within six years of the impact event, include several probably reworked Maastrichtian specimens, as well as possible in situ early Paleocene dinoflagellate cysts. The oldest terrestrial palynomorphs, two specimens of Deltoidospora, were not observed until at least 200,000 years after the impact. The Paleocene-Eocene Thermal Maximum (PETM) has been identified in the Site M0077 core based on biostratigraphy and a negative carbon isotope excursion. Geochemical and microfossil evidence indicates sea surface temperatures of ~38 °C, increased terrestrial input, salinity stratification, and bottom water anoxia. Palynomorph concentrations increase in the PETM, with an acme of the dinoflagellate genus Apectodinium in the lower PETM section, and a diverse pollen assemblage derived from a lowland tropical shrubby forest, likely from exposed portions of the Yucatán Peninsula to the south.
A second spike in palynomorph concentrations occurs upsection of the PETM in a laminated dark shale, possibly representing the early Eocene hyperthermal event ETM3. Pollen and plant spore concentrations generally increase in sediments deposited during and shortly after the Early Eocene Climatic Optimum (EECO), consistent with infilling and shallowing of the crater basin during the early Paleogene. The EECO pollen and plant spore assemblages indicate a continuously present shrubby lowland tropical forest, with Malvacipollis, Bombacacidites, Brosipollis, and Crudia-type pollen.