1,607 research outputs found

    A Novel Genome-Wide Association Study Approach Using Genotyping by Exome Sequencing Leads to the Identification of a Primary Open Angle Glaucoma Associated Inversion Disrupting ADAMTS17

    Closed breeding populations in the dog in conjunction with advances in gene mapping and sequencing techniques facilitate mapping of autosomal recessive diseases and identification of novel disease-causing variants, often using unorthodox experimental designs. In our investigation we demonstrate successful mapping of the locus for primary open angle glaucoma in the Petit Basset Griffon Vendéen dog breed with 12 cases and 12 controls, using a novel genotyping by exome sequencing approach. The resulting genome-wide association signal was followed up by genome sequencing of an individual case, leading to the identification of an inversion with a breakpoint disrupting the ADAMTS17 gene. Genotyping of additional controls and expression analysis provide strong evidence that the inversion is disease causing. Evidence of cryptic splicing resulting in novel exon transcription as a consequence of the inversion in ADAMTS17 is identified through RNAseq experiments. This investigation demonstrates how a novel genotyping by exome sequencing approach can be used to map an autosomal recessive disorder in the dog, with the use of genome sequencing to facilitate identification of a disease-associated variant

    Twelve years of SAMtools and BCFtools.

    BACKGROUND: SAMtools and BCFtools are widely used programs for processing and analysing high-throughput sequencing data. They include tools for file format conversion and manipulation, sorting, querying, statistics, variant calling, and effect analysis amongst other methods. FINDINGS: The first version appeared online 12 years ago and has been maintained and further developed ever since, with many new features and improvements added over the years. The SAMtools and BCFtools packages represent a unique collection of tools that have been used in numerous other software projects and countless genomic pipelines. CONCLUSION: Both SAMtools and BCFtools are freely available on GitHub under the permissive MIT licence, free for both non-commercial and commercial use. Both packages have been installed >1 million times via Bioconda. The source code and documentation are available from https://www.htslib.org

    The Origin of a New Sex Chromosome by Introgression between Two Stickleback Fishes.

    Introgression is increasingly recognized as a source of genetic diversity that fuels adaptation. Its role in the evolution of sex chromosomes, however, is not well known. Here, we confirm the hypothesis that the Y chromosome in the ninespine stickleback, Pungitius pungitius, was established by introgression from the Amur stickleback, P. sinensis. Using whole genome resequencing, we identified a large region of Chr 12 in P. pungitius that is diverged between males and females. Within but not outside of this region, several lines of evidence show that the Y chromosome of P. pungitius shares a most recent common ancestor not with the X chromosome, but with the homologous chromosome in P. sinensis. Accumulation of repetitive elements and gene expression changes on the new Y are consistent with a young sex chromosome in early stages of degeneration, but other hallmarks of Y chromosomes have not yet appeared. Our findings indicate that porous species boundaries can trigger rapid sex chromosome evolution

    OMBRA: Observing Montello BRoad Activity. A temporary network for studying deformation processes across the Montello fault (Eastern Alps).

    The Veneto area of the Eastern Alps is characterized by weak background seismicity. In particular, the seismic activity recorded over the last 30 years [Castello et al., 2006; Bollettino Sismico INGV1] shows low-energy events (ML<3) along the Alpine arc at the Montello anticline (located NW of Treviso). However, several medium-to-high magnitude events are known to have affected the region historically: the most significant is the 1695 Asolo earthquake (Imax 10 and MaW 6.61), alongside three further seismic events of intensity Imax≥VIII (equivalent magnitude 6.0) that occurred in 778, 1286 and 1836 [CPTI Working group 2004] (Figure 1). The Montello is catalogued among the seismogenically active segments of the Alpine front [Valensise and Pantosti, 2001; Galadini et al., 2005; Poli et al., 2008], and originates from the uplift of a S-verging thrust structure, with an estimated slip rate between 1.5 mm/yr [Burrato et al., 2009] and 1.8-2.0 mm/yr [Benedetti et al., 2000]. The aim of the OMBRA project is to study several questions that remain open and scientifically controversial. One is how these strong historical events fit into the context of the weak background seismicity observed recently. It is also of interest to understand how a relatively high plate velocity can be accommodated within the regional pattern, and which structures between the anticline and the Alpine front may be potentially active

    Population genomics of the Asian tiger mosquito, Aedes albopictus. Insights into the recent worldwide invasion

    Aedes albopictus, the “Asian tiger mosquito,” is an aggressive biting mosquito native to Asia that has colonized all continents except Antarctica during the last ~30–40 years. The species is of great public health concern as it can transmit at least 26 arboviruses, including dengue, chikungunya, and Zika viruses. In this study, using double-digest Restriction site-Associated DNA (ddRAD) sequencing, we developed a panel of ~58,000 single nucleotide polymorphisms (SNPs) based on 20 worldwide Ae. albopictus populations representing both the invasive and the native range. We used this genomic-based approach to study the genetic structure and the differentiation of Ae. albopictus populations and to understand origin(s) and dynamics of the recent invasions. Our analyses indicated the existence of two major genetically differentiated population clusters, each one including both native and invasive populations. The detection of additional genetic structure within each major cluster supports that these SNPs can detect differentiation at a global and local scale, while the similar levels of genomic diversity between native and invasive range populations support the scenario of multiple invasions or colonization by a large number of propagules. Finally, our results revealed the possible source(s) of the recent invasion in Americas, Europe, and Africa, a finding with important implications for vector-control strategies

    The variant call format and VCFtools

    Summary: The variant call format (VCF) is a generic format for storing DNA polymorphism data such as SNPs, insertions, deletions and structural variants, together with rich annotations. VCF is usually stored in a compressed manner and can be indexed for fast data retrieval of variants from a range of positions on the reference genome. The format was developed for the 1000 Genomes Project, and has also been adopted by other projects such as UK10K, dbSNP and the NHLBI Exome Project. VCFtools is a software suite that implements various utilities for processing VCF files, including validation, merging and comparing, and also provides a general Perl API
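
To make the record layout described in this abstract concrete, the following is a minimal sketch that parses one VCF data line into its eight fixed columns (CHROM, POS, ID, REF, ALT, QUAL, FILTER, INFO). It is not VCFtools or its Perl API; the names `VariantRecord` and `parse_vcf_line` are hypothetical, the example line is illustrative, and header lines, genotype columns and compression are assumed to be out of scope.

```python
from dataclasses import dataclass


@dataclass
class VariantRecord:
    """Hypothetical container for the eight fixed VCF columns."""
    chrom: str
    pos: int
    var_id: str
    ref: str
    alt: list
    qual: str
    filt: str
    info: dict


def parse_vcf_line(line):
    """Parse a single tab-separated VCF data line (not a '#' header line)."""
    fields = line.rstrip("\n").split("\t")
    chrom, pos, var_id, ref, alt, qual, filt, info = fields[:8]

    # INFO is a ';'-separated list of KEY=VALUE pairs or bare flags.
    info_dict = {}
    if info != ".":
        for entry in info.split(";"):
            key, sep, value = entry.partition("=")
            info_dict[key] = value if sep else True

    # ALT may list several comma-separated alleles.
    return VariantRecord(chrom, int(pos), var_id, ref,
                         alt.split(","), qual, filt, info_dict)


# Illustrative data line: a G>A SNP at position 14370 on chromosome 20.
example = "20\t14370\trs6054257\tG\tA\t29\tPASS\tNS=3;DP=14;AF=0.5;DB;H2"
record = parse_vcf_line(example)
print(record.chrom, record.pos, record.ref, record.alt, record.info["DP"])
```

Values are kept as strings apart from POS, since the types of INFO entries are declared in the file header, which this sketch does not read.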

    dispel4py: An Open-Source Python library for Data-Intensive Seismology

    Scientific workflows are a necessary tool for many scientific communities as they enable easy composition and execution of applications on computing resources while scientists can focus on their research without being distracted by the computation management. Nowadays, scientific communities (e.g. Seismology) have access to a large variety of computing resources and their computational problems are best addressed using parallel computing technology. However, successful use of these technologies requires a lot of additional machinery whose use is not straightforward for non-experts: different parallel frameworks (MPI, Storm, multiprocessing, etc.) must be used depending on the computing resources (local machines, grids, clouds, clusters) where applications are run. This implies that, to achieve the best application performance, users usually have to change their code depending on the features of the platform selected for running it. This work presents dispel4py, a new open-source Python library for describing abstract stream-based workflows for distributed data-intensive applications. Special care has been taken to provide dispel4py with the ability to map abstract workflows to different platforms dynamically at run-time. Currently dispel4py has four mappings: Apache Storm, MPI, multi-threading and sequential. The main goal of dispel4py is to provide an easy-to-use tool to develop and test workflows on local resources by using the sequential mode with a small dataset. Later, once a workflow is ready for long runs, it can be automatically executed on different parallel resources. dispel4py takes care of the underlying mappings by performing an efficient parallelisation. Processing Elements (PEs) represent the basic computational activities of any dispel4py workflow, which can be a seismological algorithm or a data transformation process. To create a dispel4py workflow, users only have to write a few lines of Python to describe their PEs and how they are connected; Python is widely supported on many platforms and is popular in many scientific domains, such as the geosciences. Once a dispel4py workflow is written, a user only has to select which mapping they would like to use, and everything else (parallelisation, distribution of data) is carried out by dispel4py without any cost to the user. Among all dispel4py features we would like to highlight the following:
    * The PEs are connected by streams and not by writing to and reading from intermediate files, avoiding many IO operations.
    * The PEs can be stored in a registry. Therefore, different users can recombine PEs in many different workflows.
    * dispel4py has been enriched with a provenance mechanism to support runtime provenance analysis. We have adopted the W3C-PROV data model, which is accessible via a prototype browser-based user interface and a web API. It supports users with the visualisation of graphical products and offers combined operations to access and download the data, which may be selectively stored at runtime in dedicated data archives.
    dispel4py has already been used by seismologists in the VERCE project to develop different seismic workflows. One of them is the Seismic Ambient Noise Cross-Correlation workflow, which preprocesses and cross-correlates traces from several stations. First, this workflow was tested on a local machine by using a small number of stations as input data. Later, it was executed on different parallel platforms (SuperMUC cluster and Terracorrelator machine), automatically scaling up by using the MPI and multiprocessing mappings and up to 1000 stations as input data. The results show that dispel4py achieves scalable performance in both mappings tested on different parallel platforms
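
The idea of processing elements connected by streams rather than intermediate files can be sketched in plain Python with generators. This is only an illustration of the concept under stated assumptions, not the dispel4py API: the helper `connect` and the elements `read_traces`, `demean` and `peak_amplitude` are hypothetical, the input data is made up, and the sketch runs sequentially with no mapping to MPI, Storm or other platforms.

```python
def connect(source, *elements):
    """Chain stream-processing elements: each consumes the previous one's output."""
    stream = source
    for element in elements:
        stream = element(stream)
    return stream


def read_traces():
    """Hypothetical producer: emits small made-up traces, one at a time."""
    yield [1.0, 2.0, 3.0]
    yield [10.0, 10.0, 40.0]


def demean(traces):
    """Hypothetical element: subtract the mean from every trace in the stream."""
    for trace in traces:
        mean = sum(trace) / len(trace)
        yield [sample - mean for sample in trace]


def peak_amplitude(traces):
    """Hypothetical element: reduce each trace to its largest absolute sample."""
    for trace in traces:
        yield max(abs(sample) for sample in trace)


# Items flow through the chain one by one; nothing is written to disk in between.
results = list(connect(read_traces(), demean, peak_amplitude))
print(results)
```

Because each element only sees an incoming stream and yields an outgoing one, the same elements can be recombined in a different order or into a different chain without modification, which is the property the abstract attributes to PEs.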

    Improved imputation of low-frequency and rare variants using the UK10K haplotype reference panel

    Get PDF
    Imputing genotypes from reference panels created by whole-genome sequencing (WGS) provides a cost-effective strategy for augmenting the single-nucleotide polymorphism (SNP) content of genome-wide arrays. The UK10K Cohorts project has generated a data set of 3,781 whole genomes sequenced at low depth (average 7x), aiming to exhaustively characterize genetic variation down to 0.1% minor allele frequency in the British population. Here we demonstrate the value of this resource for improving imputation accuracy at rare and low-frequency variants in both a UK and an Italian population. We show that large increases in imputation accuracy can be achieved by re-phasing WGS reference panels after initial genotype calling. We also present a method for combining WGS panels to improve variant coverage and downstream imputation accuracy, which we illustrate by integrating 7,562 WGS haplotypes from the UK10K project with 2,184 haplotypes from the 1000 Genomes Project. Finally, we introduce a novel approximation that maintains speed without sacrificing imputation accuracy for rare variants