The Drosophila genome nexus: a population genomic resource of 623 Drosophila melanogaster genomes, including 197 from a single ancestral range population.

Author: Cardeno Charis M
Corbett-Detig Russell B
Crepeau Marc W
Lack Justin B
Langley Charles H
Pool John E
Stevens Kristian A
Taylor William
Publication venue: eScholarship, University of California
Publication date: 27/01/2015

Hundreds of wild-derived Drosophila melanogaster genomes have been published, but rigorous comparisons across data sets are precluded by differences in alignment methodology. The most common approach to reference-based genome assembly is a single round of alignment followed by quality filtering and variant detection. We evaluated variations and extensions of this approach and settled on an assembly strategy that utilizes two alignment programs and incorporates both substitutions and short indels to construct an updated reference for a second round of mapping prior to final variant detection. Utilizing this approach, we reassembled published D. melanogaster population genomic data sets and added unpublished genomes from several sub-Saharan populations. Most notably, we present aligned data from phase 3 of the Drosophila Population Genomics Project (DPGP3), which provides 197 genomes from a single ancestral range population of D. melanogaster (from Zambia). The large sample size, high genetic diversity, and potentially simpler demographic history of the DPGP3 sample will make this a highly valuable resource for fundamental population genetic research. The complete set of assemblies described here, termed the Drosophila Genome Nexus, presently comprises 623 consistently aligned genomes and is publicly available in multiple formats with supporting documentation and bioinformatic tools. This resource will greatly facilitate population genomic analysis in this model species by reducing the methodological differences between data sets.

CiteSeerX

PubMed Central

eScholarship - University of California

De novo human genome assemblies reveal spectrum of alternative haplotypes in diverse populations.

Author: Kwok Pui-Yan
Levy-Sakin Michal
Wong Karen HY
Publication venue: eScholarship, University of California
Publication date: 01/08/2018

The human reference genome is used extensively in modern biological research. However, a single consensus representation is inadequate to provide a universal reference structure because it is a haplotype among many in the human population. Using 10× Genomics (10×G) "Linked-Read" technology, we perform whole genome sequencing (WGS) and de novo assembly on 17 individuals across five populations. We identify 1842 breakpoint-resolved non-reference unique insertions (NUIs) that, in aggregate, add up to 2.1 Mb of so far undescribed genomic content. Among these, 64% are considered ancestral to humans since they are found in non-human primate genomes. Furthermore, 37% of the NUIs can be found in the human transcriptome and 14% likely arose from Alu-recombination-mediated deletion. Our results underline the need for a set of human reference genomes that includes a comprehensive list of alternative haplotypes to depict the complete spectrum of genetic diversity across populations.

Directory of Open Access Journals

eScholarship - University of California

Archaic mitochondrial DNA inserts in modern day nuclear genomes

Author: Bücking R.
Cox M.
Hudjashov G.
Saag L.
Stoneking M.
Sudoyo H.
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 26/12/2019

Traces of interbreeding of Neanderthals and Denisovans with modern humans in the form of archaic DNA have been detected in the genomes of present-day human populations outside sub-Saharan Africa. Up to now, only nuclear archaic DNA has been detected in modern humans; we therefore attempted to identify archaic mitochondrial DNA (mtDNA) residing in modern human nuclear genomes as nuclear inserts of mitochondrial DNA (NUMTs).

MPG.PuRe

OMMA enables population-scale analysis of complex genomic features and phylogenomic relationships from nanochannel-based optical maps.

Author: Chan Ting-Fung
Chu Catherine
Ho Pak-Leung
Kwok Pui-Yan
Lai Yvonne Yuk-Yin
Leung Alden King-Yung
Li Le
Liu Melissa Chun-Jiao
Yip Kevin Y
Publication venue: eScholarship, University of California
Publication date: 01/07/2019

Background: Optical mapping is an emerging technology that complements sequencing-based methods in genome analysis. It is widely used in improving genome assemblies and detecting structural variations by providing information over much longer (up to 1 Mb) reads. Current standards in optical mapping analysis involve assembling optical maps into contigs and aligning them to a reference, which is limited to pairwise comparison and becomes bias-prone when analyzing multiple samples. Findings: We present a new method, OMMA, that extends optical mapping to the study of complex genomic features by simultaneously interrogating optical maps across many samples in a reference-independent manner. OMMA captures and characterizes complex genomic features, e.g., multiple haplotypes, copy number variations, and subtelomeric structures, when applied to 154 human samples across the 26 populations sequenced in the 1000 Genomes Project. For small genomes such as pathogenic bacteria, OMMA accurately reconstructs the phylogenomic relationships and identifies functional elements across 21 Acinetobacter baumannii strains. Conclusions: With the increasing data throughput of optical mapping systems, the use of this technology in comparative genome analysis across many samples will become feasible. OMMA is a timely solution that can address this computational need. The OMMA software is available at https://github.com/TF-Chan-Lab/OMTools

eScholarship - University of California

Bioinformatics tools for analysing viral genomic data

Author: Davison A.
Gu Q.
Hughes J.
Maabar M.
Modha S.
Orton R.J.
Vattipally Sreenu
Wilkie G.S.
Publication venue: 'O.I.E (World Organisation for Animal Health)'
Publication date: 01/04/2016

The field of viral genomics and bioinformatics is experiencing a strong resurgence due to high-throughput sequencing (HTS) technology, which enables the rapid and cost-effective sequencing and subsequent assembly of large numbers of viral genomes. In addition, the unprecedented power of HTS technologies has enabled the analysis of intra-host viral diversity and quasispecies dynamics in relation to important biological questions on viral transmission, vaccine resistance and host jumping. HTS also enables the rapid identification of both known and potentially new viruses from field and clinical samples, thus adding new tools to the fields of viral discovery and metagenomics. Bioinformatics has been central to the rise of HTS applications because new algorithms and software tools are continually needed to process and analyse the large, complex datasets generated in this rapidly evolving area. In this paper, the authors give a brief overview of the main bioinformatics tools available for viral genomic research, with a particular emphasis on HTS technologies and their main applications. They summarise the major steps in various HTS analyses, starting with quality control of raw reads and encompassing activities ranging from consensus and de novo genome assembly to variant calling and metagenomics, as well as RNA sequencing.

Enlighten

Landscape of standing variation for tandem duplications in Drosophila yakuba and Drosophila simulans

Author: Andolfatto Peter
Cridland Julie M.
Hu Tina T.
Rogers Rebekah L.
Shao Ling
Thornton Kevin R.
Publication venue: 'Oxford University Press (OUP)'
Publication date: 07/04/2014

We have used whole genome paired-end Illumina sequence data to identify tandem duplications in 20 isofemale lines of D. yakuba and 20 isofemale lines of D. simulans, and performed genome-wide validation with PacBio long molecule sequencing. We identify 1,415 tandem duplications that are segregating in D. yakuba as well as 975 duplications in D. simulans, indicating greater variation in D. yakuba. Additionally, we observe high rates of secondary deletions at duplicated sites, with 8% of duplicated sites in D. simulans and 17% of sites in D. yakuba modified with deletions. These secondary deletions are consistent with the action of the large loop mismatch repair system acting to remove polymorphic tandem duplication, resulting in rapid dynamics of gain and loss in duplicated alleles and a richer substrate of genetic novelty than has been previously reported. Most duplications are present in only single strains, suggesting deleterious impacts are common. D. simulans shows larger numbers of whole gene duplications in comparison to larger proportions of gene fragments in D. yakuba. D. simulans displays an excess of high frequency variants on the X chromosome, consistent with adaptive evolution through duplications on the D. simulans X or demographic forces driving duplicates to high frequency. We identify 78 chimeric genes in D. yakuba and 38 chimeric genes in D. simulans, as well as 143 cases of recruited non-coding sequence in D. yakuba and 96 in D. simulans, in agreement with rates of chimeric gene origination in D. melanogaster. Together, these results suggest that tandem duplications often result in complex variation beyond whole gene duplications that offers a rich substrate of standing variation that is likely to contribute both to detrimental phenotypes and disease, as well as to adaptive evolutionary change. Comment: Revised version, accepted at Molecular Biology and Evolution.

arXiv.org e-Print Archive

Princeton University Open Access Repository

PubMed Central

eScholarship - University of California

Chromosomal-level assembly of the Asian Seabass genome using long sequence reads and multi-layered scaffolding

Author: Alan Christoffels
Aleksey Komissarov
Alexey Tupikin
Amy Hin Yan Tong
Andrey A. Yurchenko
Chen-Shan Chin
Darrell Green
Dean R. Jerry
Doreen Lau
Fritz J. Sedlazeck
Gopikrishna Gopalapillai
Gregory W. Vurture
Heiner Kuhl
Inna S. Kuznetsova
James P. Drake
Jolly M. Saju
Jonas Korlach
Junhui Jiang
Kathiresan Purushothaman
Lawrence S. Hon
László Orbán
Mario Jonas
Marsel Kabilov
Matthew Boitano
Michael C. Schatz
Natascha M. Thevasagayam
Paul M. Richardson
Peter Van Heusden
Ramkumar Lachumanan
Richard Hall
Sai Rama Sridatta Prakki
Sarah Mwangi
Shubha Vij
Si Lok
Si Yan Ngoh
Siddharth Singh
Simon Moxon
Sridhar Sivasubbu
Stanley Kimbung Mbandi
Stephen J. O'Brien
Stephen W. Turner
Tamás Dalmay
Tansyn H. Noble
Tyler Garvin
Vinaya Kumar Katneni
Vinod Scaria
Vladimir Trifonov
Woei Chang Liew
Xueyan Shen
Publication venue: 'Public Library of Science (PLoS)'
Publication date: 01/01/2016

We report here the ~670 Mb genome assembly of the Asian seabass (Lates calcarifer), a tropical marine teleost. We used long-read sequencing augmented by transcriptomics, optical and genetic mapping along with shared synteny from closely related fish species to derive a chromosome-level assembly with a contig N50 size over 1 Mb and scaffold N50 size over 25 Mb that span ~90% of the genome. The population structure of the L. calcarifer species complex was analyzed by re-sequencing 61 individuals representing various regions across the species' native range. SNP analyses identified high levels of genetic diversity and confirmed earlier indications of a population stratification comprising three clades with signs of admixture apparent in the South-East Asian population. The quality of the Asian seabass genome assembly far exceeds that of any other fish species, and will serve as a new standard for fish genomics.
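
The contig and scaffold N50 values quoted in this abstract follow the usual definition: the length L such that sequences of length L or longer together cover at least half of the total assembly. A minimal sketch of that calculation is given below. The function name `n50` and the toy lengths are illustrative assumptions, not data or code from the paper.

```python
def n50(lengths):
    """Return the N50 of a collection of contig or scaffold lengths.

    N50 is the length of the sequence at which the running total,
    taken from longest to shortest, first reaches half of the
    combined length of all sequences.
    """
    if not lengths:
        raise ValueError("need at least one sequence length")
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


# Toy lengths (arbitrary units), assumed for illustration only.
toy_contigs = [80, 70, 50, 40, 30, 20, 10]
print(n50(toy_contigs))  # total is 300; 80 + 70 reaches 150, so N50 is 70
```

A larger N50 means that more of the assembly sits in long sequences, which is why the abstract reports it as a contiguity measure alongside the fraction of the genome covered.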

Public Library of Science (PLOS)

Crossref

Cold Spring Harbor Laboratory Institutional Repository

Directory of Open Access Journals

ResearchOnline at James Cook University

PubMed Central

Research Repository

Repository of the Academy's Library

University of East Anglia digital repository

NSU Works

MPG.PuRe

Identifying Structural Variation in Haploid Microbial Genomes from Short-Read Resequencing Data Using Breseq

Author: Barrick Jeffrey E.
Borges Jordan J.
Colburn Geoffrey
Deatherage Daniel E.
Knoester David B.
Meyer Austin G.
Reba Aaron
Strand Matthew D.
Traverse Charles C.
Publication venue
Publication date: 01/11/2014

Background: Mutations that alter chromosomal structure play critical roles in evolution and disease, including in the origin of new lifestyles and pathogenic traits in microbes. Large-scale rearrangements in genomes are often mediated by recombination events involving new or existing copies of mobile genetic elements, recently duplicated genes, or other repetitive sequences. Most current software programs for predicting structural variation from short-read DNA resequencing data are intended primarily for use on human genomes. They typically disregard information in reads mapping to repeat sequences, and significant post-processing and manual examination of their output is often required to rule out false-positive predictions and precisely describe mutational events. Results: We have implemented an algorithm for identifying structural variation from DNA resequencing data as part of the breseq computational pipeline for predicting mutations in haploid microbial genomes. Our method evaluates the support for new sequence junctions present in a clonal sample from split-read alignments to a reference genome, including matches to repeat sequences. Then, it uses a statistical model of read coverage evenness to accept or reject these predictions. Finally, breseq combines predictions of new junctions and deleted chromosomal regions to output biologically relevant descriptions of mutations and their effects on genes. We demonstrate the performance of breseq on simulated Escherichia coli genomes with deletions generating unique breakpoint sequences, new insertions of mobile genetic elements, and deletions mediated by mobile elements. Then, we reanalyze data from an E. coli K-12 mutation accumulation evolution experiment in which structural variation was not previously identified. Transposon insertions and large-scale chromosomal changes detected by breseq account for ~25% of spontaneous mutations in this strain. In all cases, we find that breseq is able to reliably predict structural variation with modest read-depth coverage of the reference genome (>40-fold). Conclusions: Using breseq to predict structural variation should be useful for studies of microbial epidemiology, experimental evolution, synthetic biology, and genetics when a reference genome for a closely related strain is available. In these cases, breseq can discover mutations that may be responsible for important or unintended changes in genomes that might otherwise go undetected.

Funding: U.S. National Institutes of Health R00-GM087550; U.S. National Science Foundation (NSF) DEB-0515729; NSF BEACON Center for the Study of Evolution in Action DBI-0939454; Cancer Prevention & Research Institute of Texas (CPRIT) RP130124; University of Texas at Austin startup funds; University of Texas at Austin; CPRIT Cancer Research Traineeship; Molecular Bioscience
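
To make the ">40-fold" read-depth figure concrete, average coverage is commonly estimated as (number of reads × read length) / genome size. The sketch below applies that standard estimate. The genome size of 4.6 Mb (roughly that of E. coli K-12) and the 100 bp read length are assumptions chosen for illustration, and the function names are hypothetical; none of this is taken from breseq itself.

```python
import math


def fold_coverage(n_reads, read_length, genome_size):
    """Average read depth: total sequenced bases divided by genome size."""
    return n_reads * read_length / genome_size


def reads_needed(target_coverage, read_length, genome_size):
    """Smallest number of reads that reaches the target average depth."""
    return math.ceil(target_coverage * genome_size / read_length)


# Assumed values for illustration, not taken from the abstract.
GENOME_SIZE = 4_600_000  # bp, roughly the size of an E. coli K-12 genome
READ_LENGTH = 100        # bp, a typical short-read length

n = reads_needed(40, READ_LENGTH, GENOME_SIZE)
print(n)                                            # 1840000
print(fold_coverage(n, READ_LENGTH, GENOME_SIZE))   # 40.0
```

This is only an average: real coverage is uneven across a genome, which is the property the abstract says breseq models statistically when accepting or rejecting junction predictions.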

PubMed Central

Texas ScholarWorks