10 research outputs found

    Evaluation of Algorithm Performance in ChIP-Seq Peak Detection

    Next-generation DNA sequencing coupled with chromatin immunoprecipitation (ChIP-seq) is revolutionizing our ability to interrogate whole genome protein-DNA interactions. Identification of protein binding sites from ChIP-seq data has required novel computational tools, distinct from those used for the analysis of ChIP-Chip experiments. The growing popularity of ChIP-seq spurred the development of many different analytical programs (at last count, we noted 31 open source methods), each with some purported advantage. Given that the literature is dense and empirical benchmarking challenging, selecting an appropriate method for ChIP-seq analysis has become a daunting task. Herein we compare the performance of eleven different peak calling programs on common empirical, transcription factor datasets and measure their sensitivity, accuracy and usability. Our analysis provides an unbiased critical assessment of available technologies, and should assist researchers in choosing a suitable tool for handling ChIP-seq data

    Detection and Removal of Biases in the Analysis of Next-Generation Sequencing Reads

    Since the emergence of next-generation sequencing (NGS) technologies, great effort has been put into the development of tools for analysis of the short reads. In parallel, knowledge is increasing regarding biases inherent in these technologies. Here we discuss four different biases we encountered while analyzing various Illumina datasets. These biases are due to both biological and statistical effects that in particular affect comparisons between different genomic regions. Specifically, we encountered biases pertaining to the distributions of nucleotides across sequencing cycles, to mappability, to contamination of pre-mRNA with mRNA, and to non-uniform hydrolysis of RNA. Most of these biases are not specific to one analyzed dataset, but are present across a variety of datasets and within a variety of genomic contexts. Importantly, some of these biases correlated in a highly significant manner with biological features, including transcript length, gene expression levels, conservation levels, and exon-intron architecture, misleadingly increasing the credibility of results due to them. We also demonstrate the relevance of these biases in the context of analyzing an NGS dataset mapping transcriptionally engaged RNA polymerase II (RNAPII) in the context of exon-intron architecture, and show that elimination of these biases is crucial for avoiding erroneous interpretation of the data. Collectively, our results highlight several important pitfalls, challenges and approaches in the analysis of NGS reads

    ZINBA integrates local covariates with DNA-seq data to identify broad and narrow regions of enrichment, even within amplified genomic regions

    ZINBA (Zero-Inflated Negative Binomial Algorithm) identifies genomic regions enriched in a variety of ChIP-seq and related next-generation sequencing experiments (DNA-seq), calling both broad and narrow modes of enrichment across a range of signal-to-noise ratios. ZINBA models and accounts for factors that co-vary with background or experimental signal, such as G/C content, and identifies enrichment in genomes with complex local copy number variations. ZINBA provides a single unified framework for analyzing DNA-seq experiments in challenging genomic contexts

    Estimation of parameters affecting coverage

    Coverage, or sequencing depth, expresses how many times a single nucleotide has been sequenced. Coverage data are used in genome analysis to study an individual's genetic variation, to analyze gene expression, and to study the higher-order structure of DNA. The main problem in these applications is that coverage deviates from the expected uniform value. The aim of this work is to give an overview of the applications of coverage in human genome analyses, to describe the factors that affect coverage values and, in the experimental part, to assess the effect of GC content, genomic position and chromosome on k-mer coverage values. Identifying the parameters that affect coverage, together with suitable models for correcting it, makes it possible to analyze samples sequenced at lower coverage more accurately and to reduce the number of false-positive and false-negative results

    The Arabidopsis thaliana Heat Shock Transcription Factor A1b Transcriptional Regulatory Network

    Plants as sessile organisms have adapted highly sophisticated cellular processes to cope with environmental stress conditions, which include the initiation of complex transcriptional regulatory circuits. The heat shock transcription factors (HSFs) have been shown to be central regulators of plant responses to abiotic and biotic stress conditions. However, the extremely high multiplicity in plant HSF families compared to those of other kingdoms and their unique expression patterns and structures suggest that some of them might have evolved to become major regulators of other non-stress related processes. Arabidopsis thaliana HSFA1b (AtHSFA1b) has been shown to be a major regulator of various forms of plant responses to abiotic and biotic stresses. However, it has also been suggested that overexpression of AtHSFA1b results in a subtle developmental effect in Arabidopsis thaliana and Brassica napus in the form of increased seed yield and harvest index. Through genome-wide mapping of the AtHSFA1b binding profile in the Arabidopsis thaliana genome, monitoring changes in the AtHSFA1b-regulated transcriptome, and functional analysis of AtHSFA1b in Saccharomyces cerevisiae under non-stress and heat stress conditions, this study provides evidence of the association of AtHSFA1b with general plant developmental processes. Furthermore, the outcome of this research shows that AtHSFA1b controls a transcriptional regulatory network operating in a hierarchical manner. However, in agreement with a previously suggested model, the results of this study indicate that the involvement of AtHSFA1b in regulating the heat stress response in Arabidopsis thaliana is possibly limited to the immediate and very early phases of that response. The heat stress response also results in a collapse of the AtHSFA1b transcriptional network, which seems to be accompanied by a general shutdown in plant growth and development

    Développement de nouveaux outils pour l'intégration des données du ChIP-Seq et leurs applications pour l'étude du contrÎle de la transcription

    Recent progress in sequencing technologies has opened the possibility of performing very complex research experiments. Combined with the vast public datasets produced by international consortia such as ENCODE, Roadmap Epigenomics and Fantom, the amount of data to process can be daunting. The goal of my doctoral project is to develop new bioinformatic approaches that facilitate the integration of ChIP-Seq data for the study of the dynamics of interactions between proteins and DNA. New tools such as ENCODExplorer and FantomTSS were developed in R to make the publicly available datasets easier to integrate. Furthermore, the metagene package allows the comparison of enrichment patterns of DNA-interacting proteins. This package efficiently extracts read coverage from genomic regions of interest, normalizes the signal and uses controls to remove background noise. The main functionality of the metagene package is to visually compare enrichment profiles from multiple groups of genomic regions and to offer statistical tools to characterize and compare those profiles. To validate my experimental approach, I used over a hundred datasets from the GM12878 cell line produced by the ENCODE consortium to study the enrichment profiles of transcription factors and histones in enhancer and promoter regions as a function of their transcriptional activity. I was able to define two distinct recruitment patterns: the gradient effect and the threshold effect. With the ever-growing complexity of genomic datasets, it is essential to develop new methodological and statistical approaches to allow a better understanding of the underlying biological processes. ENCODExplorer and metagene are both available on Bioconductor

    In vitro and molecular approaches for propagation and germplasm improvement of blueberries

    Blueberries are known as a "super-fruit" and have tremendous commercial importance due to their high antioxidant content. During in vitro culture, introducing various plant growth regulators (PGRs) into the culture medium stimulates redifferentiation in regenerating cells. During organogenesis and somatic embryogenesis (SE), a somatic cell goes through a process of dedifferentiation that is predominantly controlled by various epigenetic factors. I investigated the effect of different PGRs on SE and established, for the first time, a protocol in half-high blueberry plants using thidiazuron (TDZ) on a semi-solid medium (SSM). I compared the antioxidant capacity of the in vitro grown plants with that of their donor counterparts to see the effect of SE on the biochemical profile of the regenerants. Not only the SE process but also the concentration of TDZ and the physiological age of the explants significantly affected antioxidant activity. To gain more detailed insight into the effect of in vitro propagation on differential methylation patterns, I analysed the global methylation pattern of young leaves and regenerated calli of one hybrid blueberry and three lowbush blueberry clones using the methylation sensitive amplification polymorphism (MSAP) technique. The methylation assay showed that calli regenerated in SSM supplemented with TDZ are significantly hypermethylated relative to the donor plants, and that the level of methylation varies with the concentration of TDZ. Different plant genotypes also showed differential effects on the methylation pattern. These findings further confirm that different aspects of plant tissue culture techniques alter the DNA methylation pattern. Finally, to gain further insight into how various in vitro culture systems affect the global methylation pattern, I performed global methylation analysis on half-highbush blueberry plantlets regenerated from SSM and from liquid medium in a temporary immersion bioreactor (TIB) in the presence of TDZ and zeatin. In this experiment, plantlets from the TIB system showed significant increases in total methylation percentage and methylation polymorphism in comparison to those from SSM. Overall, my results indicate that each component of in vitro propagation has strong effects on the epigenetic and biochemical profile of the regenerants