860 research outputs found

A high-quality bonobo genome refines the analysis of hominid evolution

The divergence of chimpanzee and bonobo provides one of the few examples of recent hominid speciation (1,2). Here we describe a fully annotated, high-quality bonobo genome assembly, which was constructed without guidance from reference genomes by applying a multiplatform genomics approach. We generate a bonobo genome assembly in which more than 98% of genes are completely annotated and 99% of the gaps are closed, including the resolution of about half of the segmental duplications and almost all of the full-length mobile elements. We compare the bonobo genome to those of other great apes (1,3-5) and identify more than 5,569 fixed structural variants that specifically distinguish the bonobo and chimpanzee lineages. We focus on genes that have been lost, changed in structure or expanded in the last few million years of bonobo evolution. We produce a high-resolution map of incomplete lineage sorting and estimate that around 5.1% of the human genome is genetically closer to chimpanzee or bonobo and that more than 36.5% of the genome shows incomplete lineage sorting if we consider a deeper phylogeny including gorilla and orangutan. We also show that 26% of the segments of incomplete lineage sorting between human and chimpanzee or human and bonobo are non-randomly distributed and that genes within these clustered segments show significant excess of amino acid replacement compared to the rest of the genome.

Archivio istituzionale della ricerca - Università di Bari

PubMed Central

A High-Quality Bonobo Genome Refines The Analysis Of Hominid Evolution

Author: Antonacci F
Audano P T
Baker C
Catacchio C R
Dishuck P C
Fernandes J D
Fiddes I T
Gordon D S
Harvey W T
Hastie A R
Haukness M
Hillier L W
Hoekzema M
Hoffman J
Hsieh P H
Huang T H
Lee J Y
Lewis A P
Mao Y F
Mercuri L
Montinaro F
Munson K M
Murali S C
Pang A W
Piccolo I
Porubsky D
Salama S R
Sorensen M
Storer J M
Sulovari A
Thibaud-Nissen F
Underwood J G
Walker J A
Publication venue: LSU Digital Commons
Publication date: 03/06/2021

Louisiana State University

Reference genome and comparative genome analysis for the WHO reference strain for Mycobacterium bovis BCG Danish, the present tuberculosis vaccine

Author: Borgers Katlyn
Callewaert Nico
Festjens Nele
Lin Yao-Cheng
Michielsen Gitte
Ou Jheng-Yang
Plets Evelyn
Tiels Petra
Van Hecke Annelies
Zheng Po-Xing
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2019

Background: Mycobacterium bovis bacillus Calmette-Guérin (M. bovis BCG) is the only vaccine available against tuberculosis (TB). In an effort to standardize vaccine production, three substrains, i.e. BCG Danish 1331, Tokyo 172-1 and Russia BCG-1, were established as the WHO reference strains. Reference genomes exist for both BCG Tokyo 172-1 and Russia BCG-1, but not for BCG Danish. In this study, we set out to determine the completely assembled genome sequence for BCG Danish and to establish a workflow for genome characterization of engineering-derived vaccine candidate strains. Results: By combining second (Illumina) and third (PacBio) generation sequencing in an integrated genome analysis workflow for BCG, we could construct the completely assembled genome sequence of BCG Danish 1331 (07/270) (and an engineered derivative that is studied as an improved vaccine candidate, a SapM KO), including the resolution of the analytically challenging long duplication regions. We report the presence of a DU1-like duplication in BCG Danish 1331, while this tandem duplication was previously thought to be exclusively restricted to BCG Pasteur. Furthermore, comparative genome analyses of publicly available data for BCG substrains showed the absence of a DU1 in certain BCG Pasteur substrains and the presence of a DU1-like duplication in some BCG China substrains. By integrating publicly available data, we provide an update to the genome features of the commonly used BCG strains. Conclusions: We demonstrate how this analysis workflow enables the resolution of genome duplications and of the genome of engineered derivatives of the BCG Danish vaccine strain. The BCG Danish WHO reference genome will serve as a reference for future engineered strains and the established workflow can be used to enhance BCG vaccine standardization.

Ghent University Academic Bibliography

Computational methods to improve genome assembly and gene prediction

Author: Kelley David Roy
Publication date: 01/01/2011

DNA sequencing is used to read the nucleotides composing the genetic material that forms individual organisms. As second-generation sequencing technologies offering high throughput at a feasible cost have matured, sequencing has permeated nearly all areas of biological research. Through a combination of large-scale projects led by consortia and smaller endeavors led by individual labs, the flood of sequencing data will continue, which should provide major insights into how genomes produce physical characteristics, including disease, and how they evolve. To realize this potential, computer science is required to develop the bioinformatics pipelines that efficiently and accurately process and analyze the data from large and noisy datasets. Here, I focus on two crucial bioinformatics applications: the assembly of a genome from sequencing reads and protein-coding gene prediction. In genome assembly, we form large contiguous genomic sequences from the short sequence fragments generated by current machines. Starting from the raw sequences, we developed software called Quake that corrects sequencing errors more accurately than previous programs by using coverage of k-mers and probabilistic modeling of sequencing errors. My experiments show that correcting errors with Quake improves genome assembly and leads to the detection of more polymorphisms in re-sequencing studies. For post-assembly analysis, we designed a method to detect a particular type of mis-assembly where the two copies of each chromosome in diploid genomes diverge. We found thousands of examples in each of the chimpanzee, cow, and chicken public genome assemblies that created false segmental duplications. Shotgun sequencing of environmental DNA (often called metagenomics) has shown tremendous potential to both discover unknown microbes and explore complex environments. We developed software called Scimm that clusters metagenomic sequences based on composition in an unsupervised fashion more accurately than previous approaches.
Finally, we extended an approach for predicting protein-coding genes on whole genomes to metagenomic sequences by adding new discriminative features and augmenting the task with taxonomic classification and clustering of the sequences. The program, called Glimmer-MG, predicts genes more accurately than all previous methods. By adding a model for sequencing errors that also allows the program to predict insertions and deletions, accuracy significantly improves on error-prone sequences.

CiteSeerX

Digital Repository at the University of Maryland

Characterisation of pathogen-specific regions and novel effector candidates in Fusarium oxysporum f. sp. cepae

Author: A Bankevich
A Gurevich
A Krogh
A Taylor
AC Testa
AH Williams
AL Delcher
AR Quinlan
B Gel
B Langmead
B Ramos
BJ Haas
BJ Walker
BV Chellapan
C Camacho
C Cramer
C Trapnell
CB Michielse
D Brayford
D Medini
F Gawehns
FA Simão
GH Teetor-Barsch
H Bayraktar
HC Does van der
I Vlaardingerbroek
J Mistry
J Niño-Sánchez
J Sperschneider
K Katoh
K Sasaki
K Tamura
KJ Hoff
L Epstein
L Faino
L Li
L Ma
L-J Ma
M Stanke
MJ Southwood
NH Barton
P Dam van
P Jones
PJ Kersey
PM Houterman
R Apweiler
R Lanfear
RP Baayen
RP Brown
S Aimé
S Dong
S-Y Jiang
SM Schmidt
T Weber
TN Petersen
V Lombard
VGAA Vleeshouwers
Publication venue: Nature Publishing Group
Publication date: 01/01/2018

A reference-quality assembly of Fusarium oxysporum f. sp. cepae (Foc), the causative agent of onion basal rot, has been generated along with genomes of additional pathogenic and non-pathogenic isolates of onion. Phylogenetic analysis confirmed a single origin of the Foc pathogenic lineage. Genome alignments with other F. oxysporum ff. spp. and non-pathogens revealed high levels of syntenic conservation of core chromosomes but little synteny between lineage-specific (LS) chromosomes. Four LS contigs in Foc totaling 3.9 Mb were designated as pathogen-specific (PS). A two-fold increase in segmental duplication events was observed between LS regions of the genome compared to within core regions or from LS regions to the core. RNA-seq expression studies identified candidate effectors expressed in planta, consisting of both known effector homologs and novel candidates. FTF1 and a subset of other transcription factors implicated in regulation of effector expression were found to be expressed in planta.

Crossref

Greenwich Academic Literature Archive

Directory of Open Access Journals

Warwick Research Archives Portal Repository

Evaluation of GRCh38 and de novo haploid genome assemblies demonstrates the enduring quality of the reference assembly

Author: Albracht Derek
et al
Fulton Robert S
Graves-Lindsay Tina
Kremitzki Milinn
Magrini Vincent
Markovic Chris
McGrath Sean
Steinberg Karyn Meltz
Wilson Richard K
Publication venue: Digital Commons@Becker
Publication date: 01/01/2017

Digital Commons@Becker

Multi-platform discovery of haplotype-resolved structural variation in human genomes

Author: Guryev Victor
Lansdorp Peter
Porubský David
Spierings Diana
Publication date: 23/09/2017

The incomplete identification of structural variants from whole-genome sequencing data limits studies of human genetic diversity and disease association. Here, we apply a suite of long- and short-read, strand-specific sequencing technologies, optical mapping, and variant discovery algorithms to comprehensively analyze three human parent-child trios to define the full spectrum of human genetic variation in a haplotype-resolved manner. We identify 818,181 indel variants (<50 bp) and 31,599 structural variants (≥50 bp) per human genome, a sevenfold increase in structural variation compared to previous reports, including from the 1000 Genomes Project. We also discovered 156 inversions per genome, most of which previously escaped detection, as well as large unbalanced chromosomal rearrangements. We provide near-complete, haplotype-resolved structural variation for three genomes that can now be used as a gold standard for the scientific community, and we make specific recommendations for maximizing structural variation sensitivity for future large-scale genome sequencing studies.

Proceedings - University of Groningen

University of Groningen

ARTS repository - University of Groningen

Dissertations of the University of Groningen