11 research outputs found
PuFFIN--a parameter-free method to build nucleosome maps from paired-end reads.
Background: We introduce a novel method, called PuFFIN, that takes advantage of paired-end short reads to build genome-wide nucleosome maps with larger numbers of detected nucleosomes and higher accuracy than existing tools. In contrast to other approaches that require users to optimize several parameters according to their data (e.g., the maximum allowed nucleosome overlap or legal ranges for the fragment sizes), our algorithm can accurately determine a genome-wide set of non-overlapping nucleosomes without any user-defined parameter. This feature makes PuFFIN significantly easier to use and prevents users from choosing the "wrong" parameters and obtaining sub-optimal nucleosome maps. Results: PuFFIN builds genome-wide nucleosome maps using a multi-scale (or multi-resolution) approach. Our algorithm relies on a set of nucleosome "landscape" functions at different resolution levels: each function represents the likelihood of each genomic location being occupied by a nucleosome for a particular value of the smoothing parameter. After a set of candidate nucleosomes is computed for each function, PuFFIN produces a consensus set that satisfies non-overlapping constraints and maximizes the number of nucleosomes. Conclusions: We report comprehensive experimental results that compare PuFFIN with recently published tools (NOrMAL, TEMPLATE FILTERING, and NucPosSimulator) on several synthetic datasets as well as real data for S. cerevisiae and P. falciparum. Experimental results show that our approach produces more accurate nucleosome maps with a higher number of non-overlapping nucleosomes than other tools.
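The consensus step described in this abstract (choose as many non-overlapping nucleosomes as possible from a pool of candidates) can be illustrated with a small sketch. It assumes each candidate is a half-open (start, end) interval pooled from all resolution levels, and it uses the classic earliest-end greedy rule, which is known to maximize the number of pairwise non-overlapping intervals. This is an illustration only, not PuFFIN's actual implementation; the function name and the candidate coordinates are hypothetical.

```python
def max_non_overlapping(candidates):
    """Return a largest subset of pairwise non-overlapping intervals.

    candidates: iterable of (start, end) half-open intervals [start, end).
    Assumption of this sketch: candidates from every smoothing level are
    pooled together and treated as equally plausible.
    """
    chosen = []
    last_end = float("-inf")
    # Scanning by increasing end coordinate and keeping every interval that
    # starts at or after the previous kept end yields a maximum-size set.
    for start, end in sorted(candidates, key=lambda iv: iv[1]):
        if start >= last_end:
            chosen.append((start, end))
            last_end = end
    return chosen


# Hypothetical candidates, each 147 bp long, from several resolution levels.
candidates = [(0, 147), (100, 247), (150, 297), (300, 447), (290, 437)]
consensus = max_non_overlapping(candidates)
print(consensus)
```

A real tool would presumably also weigh candidates by their landscape score; the greedy rule here only addresses the "maximize the number of nucleosomes" objective named in the abstract.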
DNA-encoded nucleosome occupancy is associated with transcription levels in the human malaria parasite Plasmodium falciparum.
Background: In eukaryotic organisms, packaging of DNA into nucleosomes controls gene expression by regulating access of transcription factors to the promoter. The human malaria parasite Plasmodium falciparum encodes relatively few transcription factors, while extensive nucleosome remodeling occurs during its replicative cycle in red blood cells. These observations point towards an important role of the nucleosome landscape in regulating gene expression. However, the relation between nucleosome positioning and transcriptional activity has thus far not been explored in detail in the parasite. Results: Here, we analyzed nucleosome positioning in the asexual and sexual stages of the parasite's erythrocytic cycle using chromatin immunoprecipitation of MNase-digested chromatin, followed by next-generation sequencing. We observed a relatively open chromatin structure at the trophozoite and gametocyte stages, consistent with high levels of transcriptional activity in these stages. Nucleosome occupancy of genes and promoter regions was subsequently compared to steady-state mRNA expression levels. Transcript abundance showed a strong inverse correlation with nucleosome occupancy levels in promoter regions. In addition, AT-repeat sequences were strongly unfavorable for nucleosome binding in P. falciparum, and were overrepresented in promoters of highly expressed genes. Conclusions: The connection between chromatin structure and gene expression in P. falciparum shares similarities with other eukaryotes. However, the remarkable nucleosome dynamics during the erythrocytic stages and the absence of a large variety of transcription factors may indicate that nucleosome binding and remodeling are critical regulators of transcript levels. Moreover, the strong dependency between chromatin structure and DNA sequence suggests that the P. falciparum genome may have been shaped by nucleosome binding preferences. Nucleosome remodeling mechanisms in this deadly parasite could thus provide potent novel anti-malarial targets.
ThIEF: Finding Genome-wide Trajectories of Epigenetics Marks
We address the problem of comparing multiple genome-wide maps representing nucleosome positions or specific histone marks. These maps can originate from the comparative analysis of ChIP-Seq/MNase-Seq/FAIRE-Seq data for different cell types/tissues or multiple time points. The input to the problem is a set of maps, each of which is a list of genomic locations for nucleosomes or histone marks. The output is an alignment of nucleosomes/histone marks across time points (which we call trajectories), allowing small movements and gaps in some of the maps. We present a tool called ThIEF (TrackIng of Epigenetic Features) that can efficiently compute these trajectories. ThIEF comes in two "flavors": ThIEF:Iterative finds the trajectories progressively using bipartite matching, while ThIEF:LP solves a k-partite matching problem on a hypergraph using linear programming. ThIEF:LP is guaranteed to find the optimal solution, but it is slower than ThIEF:Iterative. We demonstrate the utility of ThIEF with an example application to the analysis of temporal nucleosome maps for the human malaria parasite. Remarkably, we show that the output of ThIEF can be used to produce a supervised classifier that can accurately predict the position of stable nucleosomes (i.e., nucleosomes present in all time points) and unstable nucleosomes (i.e., present in at most half of the time points) from the primary DNA sequence. To the best of our knowledge, this is the first result on the prediction of the dynamics of nucleosomes solely based on their DNA binding preference. Software is available at https://github.com/ucrbioinfo/ThIEF
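To make the idea of aligning two maps "allowing small movements and gaps" concrete, here is a minimal sketch for the two-map case. It is a simplification, not ThIEF's algorithm: it uses an order-preserving dynamic program as a stand-in for the bipartite matching step, and the cost model (absolute shift for a match, a fixed penalty for a gap) as well as the parameters max_shift and gap_cost are assumptions made for illustration.

```python
def align_maps(a, b, max_shift=30, gap_cost=30):
    """Pair positions from two sorted maps; unmatched positions become gaps.

    a, b: sorted lists of integer nucleosome centers at two time points.
    Returns a list of (i, j) index pairs (i into a, j into b).
    max_shift and gap_cost are hypothetical parameters of this sketch.
    """
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] = cheapest alignment of a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0
    for i in range(n + 1):
        for j in range(m + 1):
            if i > 0:  # leave a[i-1] unmatched
                cost[i][j] = min(cost[i][j], cost[i - 1][j] + gap_cost)
            if j > 0:  # leave b[j-1] unmatched
                cost[i][j] = min(cost[i][j], cost[i][j - 1] + gap_cost)
            if i > 0 and j > 0 and abs(a[i - 1] - b[j - 1]) <= max_shift:
                shift = abs(a[i - 1] - b[j - 1])
                cost[i][j] = min(cost[i][j], cost[i - 1][j - 1] + shift)
    # Trace back from the full problem to recover the matched pairs.
    pairs, i, j = [], n, m
    while i > 0 and j > 0:
        shift = abs(a[i - 1] - b[j - 1])
        if shift <= max_shift and cost[i][j] == cost[i - 1][j - 1] + shift:
            pairs.append((i - 1, j - 1))
            i, j = i - 1, j - 1
        elif cost[i][j] == cost[i - 1][j] + gap_cost:
            i -= 1
        else:
            j -= 1
    return pairs[::-1]


# Hypothetical nucleosome centers at two time points.
t1 = [100, 300, 520]
t2 = [110, 510, 700]
pairs = align_maps(t1, t2)
print(pairs)
```

Chaining such pairwise alignments across consecutive time points would give trajectories in the spirit of the iterative flavor; the k-partite formulation solved by linear programming is not covered by this sketch.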
Computational Methods for Exploring Nucleosome Dynamics
Nucleosomes are the basic elements of DNA chromatin structure. Not only do they control DNA packaging, but they also play a critical role in gene regulation by allowing physical access to transcription factors. In addition to providing the positions of nucleosomes and their occupancy level, it is becoming more and more important to resolve possible overlaps, extract additional information about nucleosomes such as the probability of placement, and determine whether they are well-positioned or "fuzzy" in the sequenced cell sample. In this dissertation, we address some of the computational issues associated with the analysis of sequencing data enriched for nucleosomes. We propose two novel algorithms to create nucleosome maps, for single- and paired-end sequencing data respectively. Then, we study the problem of aligning these maps. The first method, called NOrMAL, is based on a novel parametric probabilistic model of a nucleosome. Expectation maximization is used to learn the parameters of a Gaussian mixture model. Extensive experiments on real and synthetic data show that our method can produce very accurate maps and can detect a larger number of nucleosomes than published tools. The second method, called PuFFIN, takes advantage of paired-end short reads to build genome-wide nucleosome maps. In contrast to other approaches that require users to optimize several parameters according to their data (e.g., the maximum allowed nucleosome overlap or legal ranges for the fragment sizes), our algorithm can accurately determine a genome-wide set of non-overlapping nucleosomes without any user-defined parameter. On real data, PuFFIN detects stronger associations between nucleosome occupancy and gene expression levels than other tools, which indicates that our tool extracts more biologically relevant features from the data. Finally, we study the problem of aligning nucleosome maps, which is NP-complete when the number of maps is three or more. We use effective bounding tricks to limit the size of the problem and use linear programming to solve it. Our evaluations on synthetic data show that our alignment tool consistently outperforms the naive (greedy) approach and is faster than dynamic programming.
NOrMAL
Year of first release: 2012. Associated documents available: User documentation – Reference. User interface: command line. A command-line tool for accurately placing nucleosomes. NOrMAL was designed to resolve overlapping nucleosomes and extract extra information ("fuzziness", probability, etc.) about nucleosome placement.
NOrMAL: accurate nucleosome positioning using a modified Gaussian mixture model.
Motivation: Nucleosomes are the basic elements of chromatin structure. They control the packaging of DNA and play a critical role in gene regulation by allowing physical access to transcription factors. The advent of second-generation sequencing has enabled landmark genome-wide studies of nucleosome positions for several model organisms. Current methods to determine nucleosome positioning first compute an occupancy coverage profile by mapping nucleosome-enriched sequenced reads to a reference genome; then, nucleosomes are placed according to the peaks of the coverage profile. These methods are quite accurate at placing isolated nucleosomes, but they do not properly handle more complex configurations. Also, they can only provide the positions of nucleosomes and their occupancy level, whereas it is very beneficial to supply molecular biologists with additional information about nucleosomes, such as the probability of placement, the size of DNA fragments enriched for nucleosomes, and/or whether nucleosomes are well positioned or 'fuzzy' in the sequenced cell sample. Results: We address these issues by providing a novel method based on a parametric probabilistic model. An expectation maximization algorithm is used to infer the parameters of the mixture of distributions. We compare the performance of our method on two real datasets against Template Filtering, which is considered the current state of the art. On synthetic data, we show that our method can resolve complex configurations of nucleosomes more accurately, and that it is more robust to user-defined parameters. On real data, we show that our method detects a significantly higher number of nucleosomes. Availability: Visit http://www.cs.ucr.edu/~polishka
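The expectation maximization step mentioned in this abstract can be sketched for a plain one-dimensional Gaussian mixture over read positions. This is a textbook EM loop, not NOrMAL's modified model: the function name, the initial means, the starting sigma, and the 1 bp floor on sigma are all assumptions of the sketch. Reading the fitted sigma as a rough proxy for "fuzziness" is likewise an interpretation made here, not a claim about the tool.

```python
import math


def em_gmm_1d(xs, means, sigma=20.0, iters=50):
    """Fit a 1-D Gaussian mixture to positions xs with standard EM.

    means: initial guesses for component centers (hypothetical seeds).
    Returns (means, sigmas, weights) after a fixed number of iterations.
    """
    k = len(means)
    means = list(means)
    sigmas = [sigma] * k
    weights = [1.0 / k] * k
    for _ in range(iters):
        # E-step: responsibility of each component for each position.
        resp = []
        for x in xs:
            dens = [
                w * math.exp(-0.5 * ((x - m) / s) ** 2) / (s * math.sqrt(2 * math.pi))
                for w, m, s in zip(weights, means, sigmas)
            ]
            total = sum(dens) or 1e-300  # guard against numerical underflow
            resp.append([d / total for d in dens])
        # M-step: re-estimate each component from its weighted positions.
        for c in range(k):
            nc = sum(r[c] for r in resp)
            if nc < 1e-12:
                continue  # component owns no data; keep its old parameters
            means[c] = sum(r[c] * x for r, x in zip(resp, xs)) / nc
            var = sum(r[c] * (x - means[c]) ** 2 for r, x in zip(resp, xs)) / nc
            sigmas[c] = max(math.sqrt(var), 1.0)  # floor avoids collapse
            weights[c] = nc / len(xs)
    return means, sigmas, weights


# Hypothetical read centers clustered around two nucleosome positions.
reads = [190, 195, 200, 205, 210, 350, 355, 360, 365, 370]
means, sigmas, weights = em_gmm_1d(reads, means=[150, 400])
print([round(m, 1) for m in means])
```

In this toy input the two clusters are far apart, so the fitted centers land close to the cluster averages; overlapping nucleosomes, the harder case the abstract targets, would yield softer responsibilities and wider sigmas.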