5,859 research outputs found

OMMA enables population-scale analysis of complex genomic features and phylogenomic relationships from nanochannel-based optical maps.

Author: Chan Ting-Fung
Chu Catherine
Ho Pak-Leung
Kwok Pui-Yan
Lai Yvonne Yuk-Yin
Leung Alden King-Yung
Li Le
Liu Melissa Chun-Jiao
Yip Kevin Y
Publication venue: eScholarship, University of California
Publication date: 01/07/2019

Background: Optical mapping is an emerging technology that complements sequencing-based methods in genome analysis. It is widely used to improve genome assemblies and detect structural variations by providing information over much longer (up to 1 Mb) reads. Current standards in optical mapping analysis involve assembling optical maps into contigs and aligning them to a reference, which is limited to pairwise comparison and becomes bias-prone when analyzing multiple samples. Findings: We present a new method, OMMA, that extends optical mapping to the study of complex genomic features by simultaneously interrogating optical maps across many samples in a reference-independent manner. OMMA captures and characterizes complex genomic features, e.g., multiple haplotypes, copy number variations, and subtelomeric structures, when applied to 154 human samples across the 26 populations sequenced in the 1000 Genomes Project. For small genomes such as pathogenic bacteria, OMMA accurately reconstructs the phylogenomic relationships and identifies functional elements across 21 Acinetobacter baumannii strains. Conclusions: With the increasing data throughput of optical mapping systems, the use of this technology in comparative genome analysis across many samples will become feasible. OMMA is a timely solution that can address this computational need. The OMMA software is available at https://github.com/TF-Chan-Lab/OMTools

eScholarship - University of California

Integration of Alignment and Phylogeny in the Whole-Genome Era

Author: Sun Hongtao
Publication venue: Washington University Open Scholarship
Publication date: 15/05/2015

With the development of new sequencing techniques, whole genomes of many species have become available. This huge amount of data gives rise to new opportunities and challenges. These new sequences provide valuable information on relationships among species, e.g. genome recombination and conservation. One of the principal ways to investigate such information is multiple sequence alignment (MSA). Currently, there is a large amount of MSA data on the internet, such as the UCSC genome database, but how to use this information effectively to solve classical and new problems remains underexplored. In this thesis, we explored how to use this information in four problems: sequence orthology search, multiple alignment improvement, short read mapping, and genome rearrangement inference. For the first problem, we developed an EM algorithm to iteratively align a query with a multiple alignment database, using information from a phylogeny relating the query species and the species in the multiple alignment. We also infer the query's location in the phylogeny. We showed that by doing alignment and phylogeny inference together, we can improve the accuracy of both. For the second problem, we developed an optimization algorithm to iteratively refine the multiple alignment quality. Experimental results showed that our algorithm is very stable in terms of the resulting alignments and that it is more accurate than existing methods, i.e. Mafft, Clustal-O, and Mavid, on test data from three sets of species from the UCSC genome database. For the third problem, we developed a model, PhyMap, to align a read to a multiple alignment allowing mismatches and indels. PhyMap computes local alignments of a query sequence against a fixed multiple-genome alignment of closely related species. PhyMap uses a known phylogenetic tree on the species in the multiple alignment to improve the quality of its computed alignments while also estimating the placement of the query on this tree. Both theoretical computation and experimental results show that our model can differentiate between orthologous and paralogous alignments better than other popular short read mapping tools (BWA, BOWTIE and BLAST). For the fourth problem, we gave a simple genome recombination model which can express insertions, deletions, inversions, translocations and inverted translocations on aligned genome segments. We also developed an MCMC algorithm to infer the order of the query segments. We proved that using any Euclidean metric to measure the distance between two sequence orders in the tree optimization objective function leads to a degenerate solution in which the inferred order is the order of one of the leaf nodes. We also gave a graph-based formulation of the problem which can represent the probability distribution of the order of the query sequences.

Washington University St. Louis: Open Scholarship

Proceedings of the 1st Computer Science Student Workshop: Koc University Istinye Campus, Istanbul, Turkey, February 21, 2010

Publication venue: Sabancı University
Publication date: 01/01/2010

Sabanci University Research Database

The inference of gene trees with species trees

Author: Bastien Boussau
Eric Tannier
Gergely J. Szöllősi
Vincent Daubin
Publication date: 04/11/2013

Molecular phylogeny has focused mainly on improving models for the reconstruction of gene trees based on sequence alignments. Yet, most phylogeneticists seek to reveal the history of species. Although the histories of genes and species are tightly linked, they are seldom identical, because genes duplicate, are lost or horizontally transferred, and because alleles can co-exist in populations for periods that may span several speciation events. Building models describing the relationship between gene and species trees can thus improve the reconstruction of gene trees when a species tree is known, and vice-versa. Several approaches have been proposed to solve the problem in one direction or the other, but in general neither gene trees nor species trees are known. Only a few studies have attempted to jointly infer gene trees and species trees. In this article we review the various models that have been used to describe the relationship between gene trees and species trees. These models account for gene duplication and loss, transfer or incomplete lineage sorting. Some of them consider several types of events together, but none exists currently that considers the full repertoire of processes that generate gene trees along the species tree. Simulations as well as empirical studies on genomic data show that combining gene tree-species tree models with models of sequence evolution improves gene tree reconstruction. In turn, these better gene trees provide a better basis for studying genome evolution or reconstructing ancestral chromosomes and ancestral gene sequences. We predict that gene tree-species tree methods that can deal with genomic data sets will be instrumental to advancing our understanding of genomic evolution.

Comment: Review article in relation to the "Mathematical and Computational Evolutionary Biology" conference, Montpellier, 201

arXiv.org e-Print Archive

CiteSeerX

INRIA a CCSD electronic archive server

PubMed Central

HAL

Repository of the Academy's Library

ELTE Digital Institutional Repository (EDIT)

Hal-Diderot

GenomeFingerprinter and universal genome fingerprint analysis for systematic comparative genomics

Author: Ai Hannan
Ai Yuncan
Meng Fanmei
Zhao Lei
Publication venue: Public Library of Science (PLoS)
Publication date: 09/03/2013

Comparing whole genome sequences at large scale has not been achieved with conventional methods based on pairwise base-to-base comparison. Moreover, no attention has been paid to handling, in a single analysis, many genomes that cross genetic categories (chromosome, plasmid, and phage), are highly divergent (with little or no homology), and span large size ranges (from Kbp to Mbp). We created a new method, GenomeFingerprinter, to unambiguously produce three-dimensional coordinates from a sequence, followed by one three-dimensional plot and six two-dimensional trajectory projections to illustrate whole genome fingerprints. We further developed a set of concepts and tools and thereby established a new method, universal genome fingerprint analysis. We demonstrated their applications through case studies on over a hundred genome sequences. In particular, we defined the total genetic component configuration (TGCC) (i.e., chromosome, plasmid, and phage) for describing a strain as a system, the universal genome fingerprint map (UGFM) of TGCC for differentiating a strain as a universal system, and systematic comparative genomics (SCG) for comparing, in a single analysis, many genomes crossing genetic categories in diverse strains. By using UGFM, UGFM-TGCC, and UGFM-TGCC-SCG, we compared a number of highly divergent genome sequences (chromosome, plasmid, and phage; bacterium, archaeal bacterium, and virus) over large size ranges (6 Kbp to 5 Mbp), giving new insights into critical problematic issues in microbial genomics in the post-genomic era. This paper provides a new method for rapidly computing, geometrically visualizing, and intuitively comparing genome sequences at the fingerprint level, and hence establishes universal genome fingerprint analysis for systematic comparative genomics.

Comment: 63 pages, 15 figures, 5 table

arXiv.org e-Print Archive

Directory of Open Access Journals

PubMed Central

FigShare

Analysis of local genome rearrangement improves resolution of ancestral genomic maps in plants

Author: Dörr Daniel
Martinez Fábio Henrique Viduani
Rubert Diego
Stoye Jens
Publication venue: Springer Science and Business Media LLC
Publication date: 01/01/2020

Rubert D, Martinez FHV, Stoye J, Dörr D. Analysis of local genome rearrangement improves resolution of ancestral genomic maps in plants. BMC Genomics. 2020;21(Suppl. 2): 273. Background: Computationally inferred ancestral genomes play an important role in many areas of genome research. We present an improved workflow for the reconstruction from highly diverged genomes such as those of plants. Results: Our work relies on an established workflow in the reconstruction of ancestral plants, but improves several steps of this process. Instead of using gene annotations for inferring the genome content of the ancestral sequence, we identify genomic markers through a process called genome segmentation. This enables us to reconstruct the ancestral genome from hundreds of thousands of markers rather than the tens of thousands of annotated genes. We also introduce the concept of local genome rearrangement, through which we refine syntenic blocks before they are used in the reconstruction of contiguous ancestral regions. With the enhanced workflow at hand, we reconstruct the ancestral genome of eudicots, a major sub-clade of flowering plants, using whole genome sequences of five modern plants. Conclusions: Our reconstructed genome is highly detailed, yet its layout agrees well with that reported in Badouin et al. (2017). Using local genome rearrangement, not only the marker-based, but also the gene-based reconstruction of the eudicot ancestor exhibited increased genome content, evidencing the power of this novel concept.

Publications at Bielefeld University

Diversity of 23S rRNA Genes within Individual Prokaryotic Genomes

Author: AB Burgin
Anna Pei
BJ Paster
C Zwieb
Carlos W. Nossa
CR Woese
CS Harrington
D Liao
DA Relman
David M. Rosmarin
DE Hunt
DH Mathews
DW Wood
E Duchaud
E Evguenieva-Hackenberg
E Stackebrandt
F Schlunzen
FJ Stewart
G Santoyo
H Hori
H Oberreuter
J Wuyts
JA Klappenbach
Jason E. Stajich
JD Thompson
JR Cole
Liying Yang
M McClelland
Martin J. Blaser
NR Mattatall
O White
P De Rijk
P Stiegler
P Vandamme
PB Eckburg
PG Higgs
Pooja Chokshi
Q Bao
R Cedergren
RR Gutell
SG Acinas
T Asai
T Coenye
T Kaneko
T Yokoyama
TH Eickbush
TZ DeSantis
V Gurtler
W Ludwig
WC Curtiss
WF Doolittle
Z Pei
Zhiheng Pei
Publication venue: Public Library of Science
Publication date: 05/05/2009

The concept of ribosomal constraints on rRNA genes is deduced primarily based on the comparison of consensus rRNA sequences between closely related species, but recent advances in whole-genome sequencing allow evaluation of this concept within organisms with multiple rRNA operons. [...] was the only species in which intragenomic diversity >3% was observed among 4 paralogous 23S rRNA genes. These findings indicate tight ribosomal constraints on individual 23S rRNA genes within a genome. Although classification using primary 23S rRNA sequences could be erroneous, significant diversity among paralogous 23S rRNA genes was observed only once in the 184 species analyzed, indicating little overall impact on the mainstream of 23S rRNA gene-based prokaryotic taxonomy.

Public Library of Science (PLOS)

Crossref

PubMed Central

Phylogenetic relationships of the Wolbachia of nematodes and arthropods

Author: Claire Conlon
Edward J Pearce
Julian Parkhill
Katelyn Fenn
Mark Blaxter
Martin Jones
Michael A Quail
Nancy E Holroyd
Publication venue: Public Library of Science (PLoS)
Publication date: 01/10/2006

Wolbachia are well known as bacterial symbionts of arthropods, where they are reproductive parasites, but have also been described from nematode hosts, where the symbiotic interaction has features of mutualism. The majority of arthropod Wolbachia belong to clades A and B, while nematode Wolbachia mostly belong to clades C and D, but these relationships have been based on analysis of a small number of genes. To investigate the evolution and relationships of Wolbachia symbionts we have sequenced over 70 kb of the genome of wOvo, a Wolbachia from the human-parasitic nematode Onchocerca volvulus, and compared the genes identified to orthologues in other sequenced Wolbachia genomes. In comparisons of conserved local synteny, we find that wBm, from the nematode Brugia malayi, and wMel, from Drosophila melanogaster, are more similar to each other than either is to wOvo. Phylogenetic analysis of the protein-coding and ribosomal RNA genes on the sequenced fragments supports reciprocal monophyly of nematode and arthropod Wolbachia. The nematode Wolbachia did not arise from within the A clade of arthropod Wolbachia, and the root of the Wolbachia clade lies between the nematode and arthropod symbionts. Using the wOvo sequence, we identified a lateral transfer event whereby segments of the Wolbachia genome were inserted into the Onchocerca nuclear genome. This event predated the separation of the human parasite O. volvulus from its cattle-parasitic sister species, O. ochengi. The long association between filarial nematodes and Wolbachia symbionts may permit more frequent genetic exchange between their genomes

Crossref

Directory of Open Access Journals

PubMed Central

Edinburgh Research Explorer