27 research outputs found

    ModuleDigger: an itemset mining framework for the detection of cis-regulatory modules

    Background: The detection of cis-regulatory modules (CRMs) that mediate transcriptional responses in eukaryotes remains a key challenge in the postgenomic era. A CRM is characterized by a set of co-occurring transcription factor binding sites (TFBS). In silico methods have been developed to search for CRMs by determining the combination of TFBS that are statistically overrepresented in a certain gene set. Most of these methods solve this combinatorial problem by relying on computationally intensive optimization methods. As a result, their usage is limited to finding CRMs in small datasets (containing a few genes only) and using binding sites for a restricted number of transcription factors (TFs) out of which the optimal module will be selected. Results: We present an itemset-mining-based strategy for computationally detecting CRMs in a set of genes. We tested our method by applying it to a large benchmark data set, derived from a ChIP-chip analysis, and compared its performance with other well-known cis-regulatory module detection tools. Conclusion: We show that by exploiting the computational efficiency of an itemset mining approach and combining it with a well-designed statistical scoring scheme, we were able to prioritize the biologically valid CRMs in a large set of coregulated genes using binding sites for a large number of potential TFs as input.

    Inferring transcriptional modules from ChIP-chip, motif and microarray data

    'ReMoDiscovery' is an intuitive algorithm to correlate regulatory programs with regulators and corresponding motifs to a set of co-expressed genes. It concurrently exploits three independent data sources: ChIP-chip data, motif information and gene expression profiles. When compared to published module discovery algorithms, ReMoDiscovery is fast and easily tunable. We evaluated our method on yeast data, where it was shown to generate biologically meaningful findings and allowed the prediction of potential novel roles of transcriptional regulators.

    DISTILLER: a data integration framework to reveal condition dependency of complex regulons in Escherichia coli

    DISTILLER, a data integration framework for the inference of transcriptional module networks, is presented and used to investigate the condition dependency and modularity in Escherichia coli networks

    VectorNet Data Series 3: Culicoides Abundance Distribution Models for Europe and Surrounding Regions

    This is the third in a planned series of data papers presenting modelled vector distributions produced during the ECDC- and EFSA-funded VectorNet project. The data package presented here includes those Culicoides vector species first modelled in 2015 as part of the VectorNet gap analysis work, namely C. imicola, C. obsoletus, C. scoticus, C. dewulfi, C. chiopterus, C. pulicaris, C. lupicaris, C. punctatus, and C. newsteadi. The known distributions of these species within the project area (Europe, the Mediterranean Basin, North Africa, and Eurasia) are currently incomplete to a greater or lesser degree. The models are designed to fill the gaps with predicted distributions, to provide a) a first indication of vector species distributions across the project's geographical extent, and b) assistance in targeting surveys to collect distribution data for those areas with no field-validated information. The models are based on input data from light trap surveillance of adult Culicoides across continental Europe and surrounding regions (71.8°N – 33.5°S, – 11.2°W – 62°E), concentrated in Western countries, supplemented by transect samples in eastern and northern Europe. Data from central EU are relatively sparse. Peer reviewed.

    The condition-dependent transcriptional network in Escherichia coli

    Thanks to the availability of high-throughput omics data, bioinformatics approaches are able to hypothesize thus-far undocumented genetic interactions. However, due to the amount of noise in these data, inferences based on a single data source are often unreliable. A popular approach to overcome this problem is to integrate different data sources. In this study, we describe DISTILLER, a novel framework for data integration that simultaneously analyzes microarray and motif information to find modules that consist of genes that are co-expressed in a subset of conditions, and their corresponding regulators. By applying our method on publicly available data, we evaluated the condition-specific transcriptional network of Escherichia coli. DISTILLER confirmed 62% of 736 interactions described in RegulonDB, and 278 novel interactions were predicted.

    UGent Open Science

    Training and guidance material from Ghent University's open science team