32 research outputs found

    Multi-species integrative biclustering

    We describe an algorithm, multi-species cMonkey, for the simultaneous biclustering of heterogeneous multiple-species data collections and apply the algorithm to a group of bacteria comprising Bacillus subtilis, Bacillus anthracis, and Listeria monocytogenes. The algorithm reveals evolutionary insights into the surprisingly high degree of conservation of regulatory modules across these three species and allows data and insights from well-studied organisms to complement the analysis of related but less well-studied organisms.

    Comparative Microbial Modules Resource: Generation and Visualization of Multi-species Biclusters

    The increasing abundance of large-scale, high-throughput datasets for many closely related organisms provides opportunities for comparative analysis via the simultaneous biclustering of datasets from multiple species. These analyses require a reformulation of how to organize multi-species datasets and visualize the results of comparative genomics data analyses. Recently, we developed a method, multi-species cMonkey, which integrates heterogeneous high-throughput data types from multiple species to identify conserved regulatory modules. Here we present an integrated data visualization system, built upon the Gaggle, enabling exploration of our method's results (available at http://meatwad.bio.nyu.edu/cmmr.html). The system can also be used to explore other comparative genomics datasets and outputs from other data analysis procedures, such as results from other multiple-species clustering programs or from independent clustering of different single-species datasets. We provide an example use of our system for two bacteria, Escherichia coli and Salmonella Typhimurium. We illustrate the use of our system by exploring conserved biclusters involved in nitrogen metabolism, uncovering a putative function for yjjI, a currently uncharacterized gene that we predict to be involved in nitrogen assimilation.

    Publisher Correction: Demonstration of reduced neoclassical energy transport in Wendelstein 7-X

    Demonstration of reduced neoclassical energy transport in Wendelstein 7-X

    Towards a new image processing system at Wendelstein 7-X: From spatial calibration to characterization of thermal events

    Wendelstein 7-X (W7-X) is the most advanced fusion experiment in the stellarator line and is aimed at proving that the stellarator concept is suitable for a fusion reactor. One of the most important issues for fusion reactors is the monitoring of plasma-facing components exposed to very high heat loads, using visible and infrared (IR) cameras. In this paper, a new image processing system for the analysis of the strike lines on the inboard limiters from the first W7-X experimental campaign is presented. This system builds a model of the IR cameras using spatial calibration techniques, which helps to characterize the strike lines from the real spatial coordinates of each pixel. The strike lines are characterized in terms of position, size, and shape after projecting the camera image onto a 2D grid that tries to preserve the curvilinear surface distances between points. The strike-line shape is described by means of Fourier descriptors.
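To make the shape-description step concrete, here is a minimal sketch of generic, textbook Fourier descriptors for a closed 2D contour. It is not the W7-X system's implementation: the function name `fourier_descriptors`, the parameter `n_keep`, the normalization by the first harmonic, and the assumption of a counter-clockwise, evenly sampled contour are all choices made for illustration.

```python
import cmath
import math


def fourier_descriptors(contour, n_keep=3):
    """Invariant Fourier descriptors of a closed contour (illustrative sketch).

    contour: list of (x, y) points, assumed evenly sampled and traversed
             counter-clockwise (an assumption of this sketch).
    Returns magnitudes of harmonics +1, -1, +2, -2, ..., +n_keep, -n_keep,
    each divided by the magnitude of harmonic +1.
    """
    n = len(contour)
    if 2 * n_keep >= n:
        raise ValueError("need more contour points than 2 * n_keep")

    # Treat each point as a complex number x + iy.
    z = [complex(x, y) for x, y in contour]

    # Naive discrete Fourier transform (O(n^2), fine for short contours).
    coeffs = [
        sum(z[k] * cmath.exp(-2j * math.pi * u * k / n) for k in range(n)) / n
        for u in range(n)
    ]

    # Harmonic 0 is the centroid; skipping it gives translation invariance.
    # Taking magnitudes removes rotation and start-point dependence.
    # Dividing by |harmonic +1| removes scale dependence.
    scale = abs(coeffs[1])
    if scale == 0:
        raise ValueError("first harmonic is zero; contour may be clockwise")

    descriptors = []
    for u in range(1, n_keep + 1):
        descriptors.append(abs(coeffs[u]) / scale)   # harmonic +u
        descriptors.append(abs(coeffs[-u]) / scale)  # harmonic -u
    return descriptors


# Usage: an ellipse with semi-axes 3 and 1, sampled at 32 points.
n_points = 32
ellipse = [
    (3 * math.cos(2 * math.pi * k / n_points), math.sin(2 * math.pi * k / n_points))
    for k in range(n_points)
]
print(fourier_descriptors(ellipse))
```

Because only normalized magnitudes are kept, the same ellipse shifted, rotated, rescaled, or sampled from a different starting point yields the same descriptor vector, which is what makes descriptors of this kind convenient for comparing shapes.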

    Forward modeling of collective Thomson scattering for Wendelstein 7-X plasmas: Electrostatic approximation

    In this paper, we present a method for numerical computation of collective Thomson scattering (CTS). We developed a forward model, eCTS, in the electrostatic approximation and benchmarked it against a full electromagnetic model. Differences between the electrostatic and the electromagnetic models are discussed. The sensitivity of the results to the ion temperature and the plasma composition is demonstrated. We integrated the model into the Bayesian data analysis framework Minerva and used it for the analysis of noisy synthetic data sets produced by a full electromagnetic model. It is shown that eCTS can be used for the inference of the bulk ion temperature. The model has been used to infer the bulk ion temperature from the first CTS measurements on Wendelstein 7-X.

    “Same difference”: comprehensive evaluation of four DNA methylation measurement platforms

    Background: DNA methylation in CpG context is fundamental to the epigenetic regulation of gene expression in higher eukaryotes. Changes in methylation patterns are implicated in many diseases, cellular differentiation, imprinting, and other biological processes. Techniques that enrich for biologically relevant genomic regions with high CpG content are desired because, depending on the size of an organism’s methylome, the depth of sequencing required to cover all CpGs can be prohibitively expensive. Currently, restriction enzyme-based reduced representation bisulfite sequencing and its modified protocols are widely used to study methylation differences. Recently, Agilent Technologies, Roche NimbleGen, and Illumina have ventured to both reduce sequencing costs and capture CpGs of known biological relevance by marketing in-solution custom-capture hybridization platforms. We aimed to evaluate the similarities and differences of these four methods, given that each platform targets approximately 10–13% of the human methylome.
    Results: Overall, the regions covered per platform were as expected: targeted capture-based methods covered >95% of their designed regions, whereas the restriction enzyme-based method covered >70% of the expected fragments. While the total number of CpG loci shared by all methods was low, ~24% of any platform, the methylation levels of CpGs covered by all platforms were concordant. Annotation of CpG loci with genomic features revealed roughly the same proportions of feature annotations across the four platforms. Targeted capture methods comprise similar types and coverage of annotations and, relative to the targeted methods, the restriction enzyme method covers fewer promoters (~9%), CpG shores (~8%) and unannotated loci (~11%).
    Conclusions: Although all methods are largely consistent in terms of covered CpG loci, the commercially available capture methods cover nearly all CpG sites in their target regions with few off-target loci and cover similar proportions of annotated CpG loci, whereas restriction-based enrichment results in more off-target and unannotated CpG loci. Quality of DNA is very important for restriction-based enrichment, although the amount of starting material can be low. Conversely, quality of the starting material is less important for capture methods, but at least twice the amount of starting material is required. Pricing is marginally lower for restriction-based enrichment, and the number of samples that can be prepared is not restricted to the number of capture reactions a kit supports. However, the advantage of capture libraries is the ability to custom-design areas of interest. The choice of technique should be guided by the number of samples, the quality and quantity of DNA available, and the biological areas of interest, since comparable data are obtained from all platforms.
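The "shared by all methods, as a fraction of any platform" figure in the Results can be illustrated with a small set computation. This is only a sketch of the arithmetic: the function name `shared_fraction`, the platform labels, and the toy locus identifiers are hypothetical and are not data from the study.

```python
def shared_fraction(platform_loci):
    """For each platform, return the fraction of its CpG loci covered by ALL platforms.

    platform_loci: dict mapping a platform name to an iterable of locus identifiers
                   (for example, (chromosome, position) tuples).
    """
    sets = {name: set(loci) for name, loci in platform_loci.items()}
    if not sets or any(len(s) == 0 for s in sets.values()):
        raise ValueError("every platform needs at least one locus")

    # Loci present on every platform.
    shared = set.intersection(*sets.values())

    return {name: len(shared) / len(loci) for name, loci in sets.items()}


# Toy example with made-up locus IDs (hypothetical, for illustration only).
toy = {
    "platform_A": range(1, 9),    # loci 1..8
    "platform_B": range(3, 11),   # loci 3..10
    "platform_C": range(5, 13),   # loci 5..12
    "platform_D": range(7, 15),   # loci 7..14
}
print(shared_fraction(toy))
```

In this toy input only two loci appear on all four platforms, so each platform shares a quarter of its loci with the others, even though neighbouring platforms overlap much more heavily in pairs.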

    Multiplexing of ChIP-Seq Samples in an Optimized Experimental Condition Has Minimal Impact on Peak Detection

    Multiplexing samples in sequencing experiments is a common approach to maximize information yield while minimizing cost. In most cases the number of samples that are multiplexed is determined by financial considerations or experimental convenience, with limited understanding of the effects on the experimental results. Here we set out to examine the impact of multiplexing ChIP-seq experiments on the ability to identify a specific epigenetic modification. We performed peak detection analyses to determine the effects of multiplexing. These include false discovery rates; the size, position, and statistical significance of detected peaks; and changes in gene annotation. We found that, for the histone mark H3K4me3, one can multiplex up to 8 samples (7 IP + 1 input) at ~21 million single-end reads each and still detect over 90% of all peaks found when using a full lane for a single sample (~181 million reads). Furthermore, no variation is introduced by indexing or lane batch effects and, importantly, there is no significant reduction in the number of genes with neighboring H3K4me3 peaks. We conclude that, for a well-characterized antibody and, therefore, a model IP condition, multiplexing 8 samples per lane is sufficient to capture most of the biological signal.
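As a rough check of the read-depth figures quoted above, the sketch below divides one lane's reads evenly among the multiplexed samples. The even split is an idealizing assumption of this sketch, and the helper name `reads_per_sample` is hypothetical; only the 181 million reads per lane and the 8-sample design come from the abstract.

```python
def reads_per_sample(lane_reads, n_samples):
    """Reads available to each sample if one lane is split evenly (idealized)."""
    if n_samples < 1:
        raise ValueError("n_samples must be at least 1")
    return lane_reads / n_samples


LANE_READS = 181e6  # approximate full-lane yield quoted in the abstract

for n in (1, 2, 4, 8):
    millions = reads_per_sample(LANE_READS, n) / 1e6
    print(f"{n} sample(s) per lane: about {millions:.1f} million reads each")
```

An even 8-way split gives about 22.6 million reads per sample, the same order as the ~21 million reads per sample reported for the multiplexed libraries.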