Strategies for analyzing bisulfite sequencing data

Author: Akalin A.
Assenov Y.
Gosdschan A.
Grüning B.
Wreczycka K.
Yusuf D.
Publication venue: 'Elsevier BV'
Publication date: 10/11/2017

DNA methylation is one of the main epigenetic modifications in the eukaryotic genome; it has been shown to play a role in cell-type specific regulation of gene expression, and therefore cell-type identity. Bisulfite sequencing is the gold standard for measuring methylation over the genomes of interest. Here, we review several techniques used for the analysis of high-throughput bisulfite sequencing. We introduce specialized short-read alignment techniques as well as pre/post-alignment quality check methods to ensure data quality. Furthermore, we discuss subsequent analysis steps after alignment. We introduce various differential methylation methods and compare their performance using simulated and real bisulfite sequencing datasets. We also discuss the methods used to segment methylomes in order to pinpoint regulatory regions. We introduce annotation methods that can be used for further classification of regions returned by segmentation and differential methylation methods. Finally, we review software packages that implement strategies to efficiently deal with large bisulfite sequencing datasets locally and we discuss online analysis workflows that do not require any prior programming skills. The analysis strategies described in this review will guide researchers at any level to the best practices of bisulfite sequencing analysis.

MDC Repository

Strategies for analyzing bisulfite sequencing data

Author: Akalin A.
Assenov Y.
Gosdschan A.
Gruening B.
Wreczycka K.
Yusuf D.
Publication venue: 'Cold Spring Harbor Laboratory'
Publication date: 09/08/2017

DNA methylation is one of the main epigenetic modifications in the eukaryotic genome and has been shown to play a role in cell-type specific regulation of gene expression, and therefore cell-type identity. Bisulfite sequencing is the gold standard for measuring methylation over the genomes of interest. Here, we review several techniques used for the analysis of high-throughput bisulfite sequencing. We introduce specialized short-read alignment techniques as well as pre/post-alignment quality check methods to ensure data quality. Furthermore, we discuss subsequent analysis steps after alignment. We introduce various differential methylation methods and compare their performance using simulated and real bisulfite-sequencing datasets. We also discuss the methods used to segment methylomes in order to pinpoint regulatory regions. We introduce annotation methods that can be used for further classification of regions returned by segmentation or differential methylation methods. Lastly, we review software packages that implement strategies to efficiently deal with large bisulfite sequencing datasets locally and also discuss online analysis workflows that do not require any prior programming skills. The analysis strategies described in this review will guide researchers at any level to the best practices of bisulfite sequencing analysis.

MDC Repository

PiGx: reproducible genomics analysis pipelines with GNU Guix

Author: Akalin A.
Franke V.
Gosdschan A.
Osberg B.
Ronen J.
Uyar B.
Wreczycka K.
Wurmus R.
Publication venue: 'Oxford University Press (OUP)'
Publication date: 21/04/2018

In bioinformatics, as well as other computationally intensive research fields, there is a need for workflows that can reliably produce consistent output, from known sources, independent of the software environment or configuration settings of the machine on which they are executed. Indeed, this is essential for controlled comparison between different observations or for the wider dissemination of workflows. Providing this type of reproducibility and traceability, however, is often complicated by the need to accommodate the myriad dependencies included in a larger body of software, each of which generally comes in various versions. Moreover, in many fields (bioinformatics being a prime example), these versions are subject to continual change due to rapidly evolving technologies, further complicating problems related to reproducibility. Here, we propose a principled approach for building analysis pipelines and managing their dependencies with GNU Guix. As a case study to demonstrate the utility of our approach, we present a set of highly reproducible pipelines called PiGx for the analysis of RNA-seq, ChIP-seq, Bisulfite-seq, and single-cell RNA-seq. All pipelines process raw experimental data, and generate reports containing publication-ready plots and figures, with interactive report elements and standard observables. Users may install these highly reproducible packages and apply them to their own datasets without any special computational expertise beyond the use of the command line. We hope such a toolkit will provide immediate benefit to laboratory workers wishing to process their own data sets or bioinformaticians seeking to automate all, or parts of, their analyses. In the long term, we hope our approach to reproducibility will serve as a blueprint for reproducible workflows in other areas. Our pipelines, along with their corresponding documentation and sample reports, are available at http://bioinformatics.mdc-berlin.de/pigx

Scipedia

MDC Repository

HOT or not: examining the basis of high-occupancy target regions

Author: Akalin A.
Bulut S.
Franke V.
Tursun B.
Uyar B.
Wreczycka K.
Wurmus R.
Publication venue: 'Oxford University Press (OUP)'
Publication date: 20/06/2019

High-occupancy target (HOT) regions are segments of the genome with an unusually high number of transcription factor binding sites. These regions are observed in multiple species and are thought to have biological importance due to high transcription factor occupancy. Furthermore, they coincide with house-keeping gene promoters, and consequently the associated genes are stably expressed across multiple cell types. Despite these features, HOT regions are solely defined using ChIP-seq experiments and have been shown to lack canonical motifs for the transcription factors that are thought to be bound there. Although ChIP-seq experiments are the gold standard for finding genome-wide binding sites of a protein, they are not noise free. Here, we show that HOT regions are likely to be ChIP-seq artifacts and that they are similar to previously proposed 'hyper-ChIPable' regions. Using ChIP-seq data sets for knocked-out transcription factors, we demonstrate the presence of false positive signals on HOT regions. We observe sequence characteristics and genomic features that are discriminatory of HOT regions, such as GC/CpG-rich k-mers, enrichment of RNA-DNA hybrids (R-loops) and DNA tertiary structures (G-quadruplex DNA). The artificial ChIP-seq enrichment on HOT regions could be associated with these discriminatory features. Furthermore, we propose strategies to deal with such artifacts in future ChIP-seq studies.

Crossref

MDC Repository

Alternative 3' UTRs direct localization of functionally diverse protein isoforms in neuronal compartments

Author: Akalin A.
Arrey G.
Chekulaeva M.
Ciolli Mattioli C.
Franke V.
Imami K.
Rom A.
Terne M.
Ulitsky I.
Woehler A.
Publication venue: 'Oxford University Press (OUP)'
Publication date: 18/03/2019

The proper subcellular localization of RNAs and local translational regulation are crucial in highly compartmentalized cells, such as neurons. RNA localization is mediated by specific cis-regulatory elements usually found in mRNA 3'UTRs. Therefore, processes that generate alternative 3'UTRs (alternative splicing and polyadenylation) have the potential to diversify mRNA localization patterns in neurons. Here, we performed mapping of alternative 3'UTRs in neurites and soma isolated from mESC-derived neurons. Our analysis identified 593 genes with differentially localized 3'UTR isoforms. In particular, we have shown that two isoforms of the Cdc42 gene with distinct functions in neuronal polarity are differentially localized between neurites and soma of mESC-derived and mouse primary cortical neurons, at both the mRNA and protein levels. Using reporter assays and 3'UTR swapping experiments, we have identified the role of alternative 3'UTRs and mRNA transport in differential localization of alternative CDC42 protein isoforms. Moreover, we used SILAC to identify the isoform-specific Cdc42 3'UTR-bound proteome with a potential role in Cdc42 localization and translation. Our analysis points to the usage of alternative 3'UTR isoforms as a novel mechanism to provide for differential localization of functionally diverse alternative protein isoforms.

MDC Repository

Central Role of IL-23 and IL-17 Producing Eosinophils as Immunomodulatory Effector Cells in Acute Pulmonary Aspergillosis and Allergic Asthma

Author: Akalin Ali
Guerra Evelyn V. Santos
Huang Haibin
Huh Jun R.
Lee Chrono K.
Levitz Stuart M.
Mueller Christian
Specht Charles A.
Yadav Bhawna
Publication venue: eScholarship@UMassChan
Publication date: 01/01/2017

Aspergillus fumigatus causes invasive pulmonary disease in immunocompromised hosts and allergic asthma in atopic individuals. We studied the contribution of lung eosinophils to these fungal diseases. By in vivo intracellular cytokine staining and confocal microscopy, we observed that eosinophils act as local sources of IL-23 and IL-17. Remarkably, mice lacking eosinophils had a >95% reduction in the percentage of lung IL-23p19+ cells as well as markedly reduced IL-23 heterodimer in lung lavage fluid. Eosinophils killed A. fumigatus conidia in vivo. Eosinopenic mice had higher mortality rates, decreased recruitment of inflammatory monocytes, and decreased expansion of lung macrophages after challenge with conidia. All of these functions underscore a potential protective role for eosinophils in acute aspergillosis. Given the postulated role for IL-17 in asthma pathogenesis, we assessed whether eosinophils could act as sources of IL-23 and IL-17 in models where mice were sensitized to either A. fumigatus antigens or ovalbumin (OVA). We found IL-23p19+ IL-17AF+ eosinophils in both allergic models. Moreover, close to 95% of IL-23p19+ cells and >90% of IL-17AF+ cells were identified as eosinophils. These data establish a new paradigm in acute and allergic aspergillosis whereby eosinophils act not only as effector cells but also as immunomodulatory cells driving the IL-23/IL-17 axis and contributing to inflammatory cell recruitment.

Directory of Open Access Journals

PubMed Central

eScholarship@UMMS

FigShare