BCFtools/csq: haplotype-aware variant consequences.

Author: Danecek Petr
McCarthy Shane A
Publication venue: Bioinformatics
Publication date: 01/07/2017

MOTIVATION: Prediction of functional variant consequences is an important part of sequencing pipelines, allowing the categorization and prioritization of genetic variants for follow-up analysis. However, current predictors analyze variants as isolated events, which can lead to incorrect predictions when adjacent variants alter the same codon, or when a frame-shifting indel is followed by a frame-restoring indel. Exploiting known haplotype information when making consequence predictions can resolve these issues. RESULTS: BCFtools/csq is a fast program for haplotype-aware consequence calling which can take into account known phase. Consequence predictions are changed for 501 of 5019 compound variants found in the 81.7M variants in the 1000 Genomes Project data, with an average of 139 compound variants per haplotype. Predictions match existing tools when run in localized mode, but the program is an order of magnitude faster and requires an order of magnitude less memory. AVAILABILITY AND IMPLEMENTATION: The program is freely available for commercial and non-commercial use in the BCFtools package, which is available for download from http://samtools.github.io/bcftools. CONTACT: [email protected]. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
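The point about adjacent variants altering the same codon can be illustrated with a toy example. The sketch below is not BCFtools/csq code and does not use its interface; the function names and the four-entry codon table are assumptions made purely for illustration. It compares the consequence of two single-nucleotide variants evaluated one at a time with the consequence of both applied together, as they would be if phased onto the same haplotype.

```python
# Toy illustration only: names and the reduced codon table are hypothetical,
# not part of BCFtools/csq.

# Subset of the standard genetic code covering the codons used below.
CODON_TABLE = {"CAG": "Q", "TAG": "*", "CGG": "R", "TGG": "W"}


def apply_snvs(codon, snvs):
    """Return the codon after applying (offset, alt_base) substitutions."""
    bases = list(codon)
    for offset, alt in snvs:
        bases[offset] = alt
    return "".join(bases)


def consequence(ref_codon, alt_codon):
    """Classify the amino-acid change between two codons."""
    ref_aa = CODON_TABLE[ref_codon]
    alt_aa = CODON_TABLE[alt_codon]
    if alt_aa == ref_aa:
        return "synonymous"
    if alt_aa == "*":
        return "stop_gained"
    return f"missense {ref_aa}>{alt_aa}"


ref = "CAG"                      # glutamine
snvs = [(0, "T"), (1, "G")]      # two SNVs falling in the same codon

# Each variant treated as an isolated event.
isolated = [consequence(ref, apply_snvs(ref, [v])) for v in snvs]

# Both variants applied together, assuming they lie on one haplotype.
joint = consequence(ref, apply_snvs(ref, snvs))

print(isolated)  # ['stop_gained', 'missense Q>R']
print(joint)     # missense Q>W
```

In this toy case the isolated analysis reports a stop-gained variant that does not exist once phase is taken into account, which is the kind of discrepancy the abstract describes.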

Apollo (Cambridge)

A Novel Genome-Wide Association Study Approach Using Genotyping by Exome Sequencing Leads to the Identification of a Primary Open Angle Glaucoma Associated Inversion Disrupting ADAMTS17

Author: A McKenna
András M. Komáromy
Cathryn Mellersh
D Gilliam
D Gould
D Hubmacher
Degui Zhi
FH Farias
G Lim
H Li
HG Parker
HM Kang
J Guo
J Guo
J Kuchtey
J Kuchtey
J Morales
JT Robinson
L Mosyak
Louise Pettitt
N Safra
N Siva
Oliver P. Forman
P Danecek
Peter Bedford
S Porter
S Purcell
SJ Ahonen
Publication venue: Public Library of Science (PLoS)
Publication date: 18/12/2015

Closed breeding populations in the dog in conjunction with advances in gene mapping and sequencing techniques facilitate mapping of autosomal recessive diseases and identification of novel disease-causing variants, often using unorthodox experimental designs. In our investigation we demonstrate successful mapping of the locus for primary open angle glaucoma in the Petit Basset Griffon Vendéen dog breed with 12 cases and 12 controls, using a novel genotyping by exome sequencing approach. The resulting genome-wide association signal was followed up by genome sequencing of an individual case, leading to the identification of an inversion with a breakpoint disrupting the ADAMTS17 gene. Genotyping of additional controls and expression analysis provide strong evidence that the inversion is disease causing. Evidence of cryptic splicing resulting in novel exon transcription as a consequence of the inversion in ADAMTS17 is identified through RNAseq experiments. This investigation demonstrates how a novel genotyping by exome sequencing approach can be used to map an autosomal recessive disorder in the dog, with the use of genome sequencing to facilitate identification of a disease-associated variant

Crossref

Directory of Open Access Journals

PubMed Central

Twelve years of SAMtools and BCFtools.

Author: Bonfield James K
Danecek Petr
Davies Robert M
Keane Thomas
Li Heng
Liddle Jennifer
Marshall John
McCarthy Shane A
Ohan Valeriu
Pollard Martin O
Whitwham Andrew
Publication venue: Gigascience
Publication date: 01/02/2021

BACKGROUND: SAMtools and BCFtools are widely used programs for processing and analysing high-throughput sequencing data. They include tools for file format conversion and manipulation, sorting, querying, statistics, variant calling, and effect analysis amongst other methods. FINDINGS: The first version appeared online 12 years ago and has been maintained and further developed ever since, with many new features and improvements added over the years. The SAMtools and BCFtools packages represent a unique collection of tools that have been used in numerous other software projects and countless genomic pipelines. CONCLUSION: Both SAMtools and BCFtools are freely available on GitHub under the permissive MIT licence, free for both non-commercial and commercial use. Both packages have been installed >1 million times via Bioconda. The source code and documentation are available from https://www.htslib.org

arXiv.org e-Print Archive

Enlighten

Apollo (Cambridge)

The Origin of a New Sex Chromosome by Introgression between Two Stickleback Fishes.

Author: Abbott
Alexander
Anders
Bachtrog
Bachtrog
Bachtrog
Bronson
Browning
Carling
Carneiro
Charlesworth
Charlesworth
Choi
Cook
Corcoran
Coyne
Coyne
Danecek
DePristo
Edwards
Glazer
Green
Groves Dixon
Hagen
Irwin
Ishikawa
Janoušek
Jay
Jombart
Jombart
Jun Kitano
Krueger
Langmead
Li
Li
Lindholm
Love
Luo
Mark Kirkpatrick
Martin
Martin
Martin
Meguro
Melissa Wilson Sayres
Muirhead
Natri
Neafsey
Nei
O’Neill
Payseur
Petit
Presgraves
Quinlan
Ravinet
Ross
Saetre
Sankararaman
Schwander
Sciuchetti
Smit
Stamatakis
Storchová
Stölting
Sun
Takahashi
Takahashi
Toews
Tosi
Toups
Tuttle
van Doorn
Vicoso
von Hippel
Wang
White
Yang
Yoshida
Yoshida
Úbeda
Publication venue: eScholarship, University of California
Publication date: 01/01/2019

Introgression is increasingly recognized as a source of genetic diversity that fuels adaptation. Its role in the evolution of sex chromosomes, however, is not well known. Here, we confirm the hypothesis that the Y chromosome in the ninespine stickleback, Pungitius pungitius, was established by introgression from the Amur stickleback, P. sinensis. Using whole genome resequencing, we identified a large region of Chr 12 in P. pungitius that is diverged between males and females. Within but not outside of this region, several lines of evidence show that the Y chromosome of P. pungitius shares a most recent common ancestor not with the X chromosome, but with the homologous chromosome in P. sinensis. Accumulation of repetitive elements and gene expression changes on the new Y are consistent with a young sex chromosome in early stages of degeneration, but other hallmarks of Y chromosomes have not yet appeared. Our findings indicate that porous species boundaries can trigger rapid sex chromosome evolution

Crossref

eScholarship - University of California

OMBRA: Observing Montello BRoad Activity. Una rete temporanea per lo studio dei processi di deformazione attraverso la faglia del Montello (Alpi orientali).

Author: Augliera P.
Cavaliere A.
Danecek P.
Danesi S.
Franceschina G.
Lovati S.
Maistrello M.
Massa M.
Pessina V.
Pondrelli S.
Salimbeni S.
Serpelloni E.
Publication venue: Istituto Nazionale di Geofisica e Vulcanologia

The Veneto sector of the Eastern Alps is characterized by weak background seismicity. In particular, the seismic activity recorded over the last 30 years [Castello et al., 2006; Bollettino Sismico INGV1] shows low-energy events (ML<3) along the Alpine arc at the Montello anticline (located NW of Treviso). However, several medium-to-high magnitude events are known to have affected the region historically: the most significant is the 1695 Asolo earthquake (Imax 10 and MaW 6.61), alongside three further seismic events of intensity Imax≥VIII (equivalent magnitude 6.0) that occurred in 778, 1286 and 1836 [CPTI Working group 2004] (Figure 1). The Montello is classified among the seismogenically active segments of the Alpine front [Valensise and Pantosti, 2001; Galadini et al., 2005; Poli et al., 2008], originating from the uplift of a S-verging thrust structure, with an estimated deformation slip rate between 1.5 mm/yr [Burrato et al., 2009] and 1.8-2.0 mm/yr [Benedetti et al., 2000]. The aim of the OMBRA project is to study several questions that remain open and scientifically controversial. One is how these strong historical events fit into the context of the weak background seismicity observed recently. It is also of interest to understand how a relatively high plate velocity can be accommodated within the regional pattern, and which structures between the anticline and the Alpine front may be potentially active.

Earth-prints Repository

Population genomics of the Asian tiger mosquito, Aedes albopictus. Insights into the recent worldwide invasion

Author: Aguirre-Obando
Alexander
Armbruster
Baird
Becker
Beebe
Benedict
Birungi
Bonizzoni
Brady
Brown
Chen
Chen
Chouin-Carneiro
Dalla Pozza
Damal
Danecek
Das
Delatte
Denlinger
Eaton
Egizi
Emerson
Excoffier
Flacio
Foll
Frichot
Galtier
Gloria-Soria
Goubert
Gratz
Hawley
Hemingway
Hurst
Ismail
Jackson
Jombart
Kambhampati
Kambhampati
Kamgang
Kamgang
Kasai
Lamballerie
Lambrechts
Langmead
Lewis
Li
Lischer
Liu
Lounibos
Lounibos
Lourenco de Oliveira
Luu
Manni
Manni
Marcombe
Maynard
Medley
Medlock
Minard
Morens
Mori
Mousson
Nazareno
O'Donnell
Patterson
Paupy
Pech-May
Peterson
Porretta
Porretta
Powell
Puckett
Reiter
Smartt
Sprenger
Stamatakis
Tatem
Trucchi
Urbanelli
Urbanski
Willing
Wong
Wray
Wu
Xu
Zhong
Publication venue: Wiley
Publication date: 01/01/2017

Aedes albopictus, the “Asian tiger mosquito,” is an aggressive biting mosquito native to Asia that has colonized all continents except Antarctica during the last ~30–40 years. The species is of great public health concern as it can transmit at least 26 arboviruses, including dengue, chikungunya, and Zika viruses. In this study, using double-digest Restriction site-Associated DNA (ddRAD) sequencing, we developed a panel of ~58,000 single nucleotide polymorphisms (SNPs) based on 20 worldwide Ae. albopictus populations representing both the invasive and the native range. We used this genomic-based approach to study the genetic structure and the differentiation of Ae. albopictus populations and to understand origin(s) and dynamics of the recent invasions. Our analyses indicated the existence of two major genetically differentiated population clusters, each one including both native and invasive populations. The detection of additional genetic structure within each major cluster supports that these SNPs can detect differentiation at a global and local scale, while the similar levels of genomic diversity between native and invasive range populations support the scenario of multiple invasions or colonization by a large number of propagules. Finally, our results revealed the possible source(s) of the recent invasion in Americas, Europe, and Africa, a finding with important implications for vector-control strategies.

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

Crossref

Archivio istituzionale della ricerca - Università di Camerino

Archivio della ricerca- Università di Roma La Sapienza

The variant call format and VCFtools

Author: A. Auton
C. A. Albers
Durbin
E. Banks
G. Abecasis
G. Lunter
G. McVean
G. T. Marth
M. A. DePristo
P. Danecek
R. Durbin
R. E. Handsaker
S. T. Sherry
Publication venue: Oxford University Press
Publication date: 01/01/2011

Summary: The variant call format (VCF) is a generic format for storing DNA polymorphism data such as SNPs, insertions, deletions and structural variants, together with rich annotations. VCF is usually stored in a compressed manner and can be indexed for fast data retrieval of variants from a range of positions on the reference genome. The format was developed for the 1000 Genomes Project, and has also been adopted by other projects such as UK10K, dbSNP and the NHLBI Exome Project. VCFtools is a software suite that implements various utilities for processing VCF files, including validation, merging, comparing and also provides a general Perl API
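As a rough illustration of how a VCF record carries a variant together with its annotations, the sketch below splits one tab-separated data line into the eight fixed VCF columns and unpacks the semicolon-separated INFO field. It is plain Python, not the VCFtools Perl API mentioned in the abstract; the function name `parse_vcf_line` and the sample line are assumptions for illustration, and the sketch ignores headers, genotype columns and compression.

```python
# Minimal sketch, not VCFtools: parse_vcf_line and the sample record are
# hypothetical and cover only the eight fixed columns of a VCF data line.

VCF_COLUMNS = ["CHROM", "POS", "ID", "REF", "ALT", "QUAL", "FILTER", "INFO"]


def parse_vcf_line(line):
    """Parse one VCF data line into a dict keyed by the fixed column names."""
    fields = line.rstrip("\n").split("\t")
    record = dict(zip(VCF_COLUMNS, fields[:8]))
    record["POS"] = int(record["POS"])        # 1-based reference position
    record["ALT"] = record["ALT"].split(",")  # one or more alternate alleles

    # INFO holds key=value pairs and bare flags, separated by semicolons.
    info = {}
    if record["INFO"] != ".":
        for entry in record["INFO"].split(";"):
            key, sep, value = entry.partition("=")
            info[key] = value if sep else True
    record["INFO"] = info
    return record


sample = "20\t14370\trs6054257\tG\tA\t29\tPASS\tNS=3;DP=14;AF=0.5;DB;H2"
rec = parse_vcf_line(sample)
print(rec["CHROM"], rec["POS"], rec["REF"], rec["ALT"], rec["INFO"]["DP"])
```

INFO values are left as strings here; a fuller parser would convert them using the types declared in the file header.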

Oxford University Research Archive

dispel4py: An Open-Source Python library for Data-Intensive Seismology

Author: Atkinson Malcolm
Danecek Peter
Filgueira Vicente Rosa
Klampanos Iraklis
Krause Amrey
Spinuso Alessandro
Publication date: 01/01/2015

Scientific workflows are a necessary tool for many scientific communities as they enable easy composition and execution of applications on computing resources while scientists can focus on their research without being distracted by the computation management. Nowadays, scientific communities (e.g. Seismology) have access to a large variety of computing resources and their computational problems are best addressed using parallel computing technology. However, successful use of these technologies requires a lot of additional machinery whose use is not straightforward for non-experts: different parallel frameworks (MPI, Storm, multiprocessing, etc.) must be used depending on the computing resources (local machines, grids, clouds, clusters) where applications are run. This implies that, to achieve the best application performance, users usually have to change their codes depending on the features of the platform selected for running them. This work presents dispel4py, a new open-source Python library for describing abstract stream-based workflows for distributed data-intensive applications. Special care has been taken to provide dispel4py with the ability to map abstract workflows to different platforms dynamically at run-time. Currently dispel4py has four mappings: Apache Storm, MPI, multi-threading and sequential. The main goal of dispel4py is to provide an easy-to-use tool to develop and test workflows in local resources by using the sequential mode with a small dataset. Later, once a workflow is ready for long runs, it can be automatically executed on different parallel resources. dispel4py takes care of the underlying mappings by performing an efficient parallelisation. Processing Elements (PE) represent the basic computational activities of any dispel4py workflow, which can be a seismological algorithm, or a data transformation process.
For creating a dispel4py workflow, users only have to write very few lines of code to describe their PEs and how they are connected by using Python, which is widely supported on many platforms and is popular in many scientific domains, such as in geosciences. Once a dispel4py workflow is written, a user only has to select which mapping they would like to use, and everything else (parallelisation, distribution of data) is carried out by dispel4py without any cost to the user. Among all dispel4py features we would like to highlight the following:
* The PEs are connected by streams and not by writing to and reading from intermediate files, avoiding many IO operations.
* The PEs can be stored into a registry. Therefore, different users can recombine PEs in many different workflows.
* dispel4py has been enriched with a provenance mechanism to support runtime provenance analysis. We have adopted the W3C-PROV data model, which is accessible via a prototypal browser-based user interface and a web API. It supports the users with the visualisation of graphical products and offers combined operations to access and download the data, which may be selectively stored at runtime, into dedicated data archives.
dispel4py has already been used by seismologists in the VERCE project to develop different seismic workflows. One of them is the Seismic Ambient Noise Cross-Correlation workflow, which preprocesses and cross-correlates traces from several stations. First, this workflow was tested on a local machine by using a small number of stations as input data. Later, it was executed on different parallel platforms (SuperMUC cluster, and Terracorrelator machine), automatically scaling up by using MPI and multiprocessing mappings and up to 1000 stations as input data. The results show that dispel4py achieves scalable performance in both mappings tested on different parallel platforms.
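The idea of processing elements connected by streams, with the execution strategy chosen separately from the workflow description, can be sketched in plain Python. The code below deliberately does not use the dispel4py API; `drop_negative`, `square`, `connect` and `run_sequential` are hypothetical names, and only a sequential mapping is shown. Each PE is modelled as a generator function that receives one item and yields zero or more outputs, so items flow from PE to PE without intermediate files.

```python
# Illustrative sketch only: this is NOT the dispel4py API. All names are
# hypothetical and only a sequential "mapping" is modelled.


def drop_negative(x):
    """PE that forwards non-negative values and discards the rest."""
    if x >= 0:
        yield x


def square(x):
    """PE that emits the square of each incoming value."""
    yield x * x


def connect(pe, upstream):
    """Feed every item of an upstream stream through one PE, lazily."""
    for item in upstream:
        yield from pe(item)


def run_sequential(pes, source):
    """Sequential mapping: chain the PEs into a single lazy stream."""
    stream = iter(source)
    for pe in pes:
        stream = connect(pe, stream)
    return stream


workflow = [drop_negative, square]          # abstract description of the graph
result = list(run_sequential(workflow, [3, -1, 2]))
print(result)  # [9, 4]
```

Because the workflow is just a list of PEs, a different runner (for example one that distributes items across processes) could in principle execute the same description unchanged, which is the separation between workflow and mapping that the abstract describes.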

Heriot Watt Pure

Edinburgh Research Explorer

University of St. Andrews - Pure

Improved imputation of low-frequency and rare variants using the UK10K haplotype reference panel

Author: Danecek P.
Durbin R.
et al.
Gambaro G.
Howie B.
Huang J.
Malerba G.
Marchini J.
McCarthy S.
Memari Y.
Min J.L.
Richards J.B.
Schmidts M.
Soranzo N.
Timpson N.J.
Trabetti E.
Walter K.
Zheng H.F.
Publication venue: Nature Publishing Group
Publication date: 01/01/2015

Imputing genotypes from reference panels created by whole-genome sequencing (WGS) provides a cost-effective strategy for augmenting the single-nucleotide polymorphism (SNP) content of genome-wide arrays. The UK10K Cohorts project has generated a data set of 3,781 whole genomes sequenced at low depth (average 7x), aiming to exhaustively characterize genetic variation down to 0.1% minor allele frequency in the British population. Here we demonstrate the value of this resource for improving imputation accuracy at rare and low-frequency variants in both a UK and an Italian population. We show that large increases in imputation accuracy can be achieved by re-phasing WGS reference panels after initial genotype calling. We also present a method for combining WGS panels to improve variant coverage and downstream imputation accuracy, which we illustrate by integrating 7,562 WGS haplotypes from the UK10K project with 2,184 haplotypes from the 1000 Genomes Project. Finally, we introduce a novel approximation that maintains speed without sacrificing imputation accuracy for rare variants

UCL Discovery

Oxford University Research Archive

Radboud Repository

King's Research Portal

Helsingin yliopiston digitaalinen arkisto

Apollo (Cambridge)

Explore Bristol Research

Genomics of Divergence along a Continuum of Parapatric Population Differentiation

MM received funding from the Max Planck innovation funds for this project. PGDF was supported by a Marie Curie European Reintegration Grant (proposal nr 270891). CE was supported by German Science Foundation grants (DFG, EI 841/4-1 and EI 841/6-1)

OceanRep

Crossref

Directory of Open Access Journals

PubMed Central

Queen Mary Research Online

Bern Open Repository and Information System (BORIS)

MPG.PuRe