16 research outputs found

    Comparison of Patser and Consite.

    No full text
    <p>Number of TFBSs from the dataset by Harbison et al. against the total number of TFBSs detected by Patser and Consite.</p>

    Top-20 TF combinations.

    No full text
    <p>First dataset. The twenty TF combinations with the lowest p-value and highest support obtained when using the dataset by Harbison et al. The Evidence column shows whether a PubMed query returned supporting literature for the given TFs (P), STRING <a href="http://www.plosone.org/article/info:doi/10.1371/journal.pone.0108065#pone.0108065-Franceschini1" target="_blank">[48]</a> yielded a connected graph for them (S), both conditions were met (SP) or neither was met (-).</p>

    Parameter values.

    No full text
    <p>Second dataset. Summary of the input parameters used.</p>

    TF combinations.

    No full text
    <p>Second dataset. Some of the TF combinations obtained when using the TFBSs detected by Patser (yeast genome).</p>

    Outline of the CisMiner procedure.

    No full text
    <p>Diagram of the main steps of the CisMiner procedure. Given a set of TFBSs, the process starts by performing a fuzzy hierarchical clustering to obtain sets of closely located TFBSs. The result of this step is a fuzzy transactional database, which is then mined by a Fuzzy Frequent Itemset Mining algorithm (Fuzzy Frequent-Pattern Tree) to obtain a set of frequent fuzzy itemsets. Finally, a post-processing step handles overlapping TFBSs that appear in each frequent itemset. The output is a set of putative CRMs, along with their estimated p-value and their fuzzy support.</p>
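    The fuzzy support mentioned in the description above can be illustrated with a minimal sketch. It assumes one common definition (the minimum membership degree of the itemset's items in each transaction, averaged over all transactions); the listing does not state which definition CisMiner uses, and the function name and data layout are hypothetical.

```python
from typing import Dict, Iterable, List

# A fuzzy transaction maps each TF name to its membership degree in [0, 1].
FuzzyTransaction = Dict[str, float]


def fuzzy_support(itemset: Iterable[str],
                  transactions: List[FuzzyTransaction]) -> float:
    """Average, over all transactions, of the weakest membership in the itemset.

    Assumption: the minimum is used as the t-norm. A TF that is absent
    from a transaction contributes a membership degree of 0.
    """
    items = list(itemset)
    if not items or not transactions:
        return 0.0
    total = 0.0
    for transaction in transactions:
        total += min(transaction.get(item, 0.0) for item in items)
    return total / len(transactions)


# Toy database with three fuzzy transactions (made-up values).
database = [
    {"A": 0.9, "B": 0.5},
    {"A": 0.4, "C": 1.0},
    {"A": 1.0, "B": 1.0},
]
support_ab = fuzzy_support(["A", "B"], database)
```

    In a crisp database every degree is 0 or 1, and this quantity reduces to the usual relative support.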

    Procedure for generating the fuzzy transactional database.

    No full text
    <p>(1) Each circle represents a binding site, labeled with the name of the TF that binds it. (2) Three clusters are obtained. Centroids are calculated for each cluster. (3) Fuzzy sets are defined for each cluster. (4) Fuzzy transactions are generated from the fuzzy sets. The value after the colon indicates the membership degree of the corresponding TF to the transaction.</p>
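    Steps (2) to (4) above can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: the clusters are taken as already computed, the fuzzy set of each cluster is a triangular membership function centred on the centroid, and the `width` parameter and all names are hypothetical.

```python
from statistics import mean
from typing import Dict, List, Tuple

Site = Tuple[str, float]  # (TF name, position); hypothetical representation


def fuzzy_transactions(sites: List[Site],
                       clusters: List[List[int]],
                       width: float) -> List[Dict[str, float]]:
    """Build one fuzzy transaction per cluster.

    clusters holds, for each cluster, the indices of its member sites.
    Assumption: membership falls linearly from 1 at the centroid to 0
    at a distance of `width` from it.
    """
    transactions = []
    for members in clusters:
        # Step (2): centroid of the cluster.
        centroid = mean(sites[i][1] for i in members)
        transaction: Dict[str, float] = {}
        for tf, position in sites:
            # Step (3): membership degree of this site in the cluster's fuzzy set.
            degree = max(0.0, 1.0 - abs(position - centroid) / width)
            if degree > 0.0:
                # Step (4): keep the strongest membership per TF.
                transaction[tf] = max(degree, transaction.get(tf, 0.0))
        transactions.append(transaction)
    return transactions


sites = [("A", 100), ("B", 110), ("C", 300)]
result = fuzzy_transactions(sites, clusters=[[0, 1], [2]], width=50.0)
```

    Each resulting dictionary corresponds to one transaction in the "TF:degree" notation described above.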

    Fuzzy-crisp comparison.

    No full text
    <p>The first four rows show the mean values of the fuzzy and crisp support and <i>p</i>-value of the combinations. The last two rows show the statistical significance returned by the ANOVA procedure.</p>

    Post-processing of the results.

    No full text
    <p>The value indicates the membership degree of each binding site to its corresponding transaction. (a) Pairs of overlapping binding sites are directly removed. (b) The optimum way of fitting itemset {A, B, C} is found.</p>
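    Case (a) above can be sketched as a simple interval check. The sketch assumes that "directly removed" means both members of an overlapping pair are dropped and that coordinates are inclusive; the caption does not confirm either point, and the `BindingSite` type is hypothetical. Case (b) is not covered.

```python
from typing import List, NamedTuple


class BindingSite(NamedTuple):
    tf: str        # name of the TF that binds the site
    start: int     # first position (inclusive, assumed)
    end: int       # last position (inclusive, assumed)
    degree: float  # membership degree to the transaction


def overlaps(a: BindingSite, b: BindingSite) -> bool:
    """True when the two closed intervals share at least one position."""
    return a.start <= b.end and b.start <= a.end


def drop_overlapping(sites: List[BindingSite]) -> List[BindingSite]:
    """Keep only the sites that overlap no other site in the list."""
    kept = []
    for i, site in enumerate(sites):
        clashes = any(overlaps(site, other)
                      for j, other in enumerate(sites) if j != i)
        if not clashes:
            kept.append(site)
    return kept


example = [
    BindingSite("A", 10, 20, 0.8),
    BindingSite("B", 15, 25, 0.6),
    BindingSite("C", 40, 50, 1.0),
]
remaining = drop_overlapping(example)
```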

    InFusion: Advancing Discovery of Fusion Genes and Chimeric Transcripts from Deep RNA-Sequencing Data

    No full text
    <div><p>Analysis of fusion transcripts has become increasingly important due to their link with cancer development. Since high-throughput sequencing approaches survey fusion events exhaustively, several computational methods for the detection of gene fusions from RNA-seq data have been developed. This kind of analysis, however, is complicated by native trans-splicing events, the splicing-induced complexity of the transcriptome and biases and artefacts introduced in experiments and data analysis. There are a number of tools available for the detection of fusions from RNA-seq data; however, certain differences in specificity and sensitivity between commonly used approaches have been found. The ability to detect gene fusions of different types, including isoform fusions and fusions involving non-coding regions, has not been thoroughly studied yet. Here, we propose a novel computational toolkit called InFusion for fusion gene detection from RNA-seq data. InFusion introduces several unique features, such as discovery of fusions involving intergenic regions, and detection of anti-sense transcription in chimeric RNAs based on strand-specificity. Our approach demonstrates superior detection accuracy on simulated data and several public RNA-seq datasets. This improved performance was also evident when evaluating data from RNA deep-sequencing of two well-established prostate cancer cell lines. InFusion identified 26 novel fusion events that were validated in vitro, including alternatively spliced gene fusion isoforms and chimeric transcripts that include intergenic regions. The toolkit is freely available to download from <a href="http://bitbucket.org/kokonech/infusion" target="_blank">http://bitbucket.org/kokonech/infusion</a>.</p></div>

    TMPRSS2-ERG fusion isoforms.

    No full text
    <p>(A) Genomic structure of the TMPRSS2–ERG fusion transcripts discovered from deep sequencing data by InFusion. Isoform 3 is a known transcript, while isoforms 1 and 2 are novel. Transcript names are taken from the Ensembl v.68 database. (B) RT-PCR validation of isoforms in VCaP, LNCaP, RWPE-1 and PrEC cell lines; NTC = no template control. The PCR primer design was based on the output from the InFusion pipeline. In order to detect only one product, one PCR primer specific for isoform 3 was designed to cover the fusion junction site. A 50 bp DNA ladder was co-run as a size marker; bright bands indicate 250 bp and 500 bp. (C) Relative expression levels of the fusion isoforms as measured by qRT-PCR. All measurements were performed in triplicate; mean expression values were computed relative to GAPDH. Plotted values are normalized to the computed expression of isoform 3. (D) Expression levels of isoforms estimated in RPKM under the assumption of uniform coverage.</p>
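    The RPKM estimate in panel (D) and the normalization to isoform 3 in panel (C) can be illustrated with the standard RPKM formula (reads per kilobase of transcript per million mapped reads). The sketch below uses made-up counts and hypothetical function names; it does not reproduce how InFusion assigns reads to isoforms.

```python
from typing import Dict


def rpkm(read_count: int, transcript_length_bp: int,
         total_mapped_reads: int) -> float:
    """Reads per kilobase of transcript per million mapped reads."""
    if transcript_length_bp <= 0 or total_mapped_reads <= 0:
        raise ValueError("length and total mapped reads must be positive")
    return read_count * 1e9 / (transcript_length_bp * total_mapped_reads)


def normalize_to_reference(values: Dict[str, float],
                           reference: str) -> Dict[str, float]:
    """Express every value relative to the chosen reference entry."""
    ref = values[reference]
    return {name: value / ref for name, value in values.items()}


# Made-up example: 500 reads on a 2,000 bp isoform, 10 million mapped reads.
example_rpkm = rpkm(500, 2000, 10_000_000)

# Made-up expression values, rescaled so that isoform 3 equals 1.
relative = normalize_to_reference(
    {"isoform 1": 5.0, "isoform 2": 10.0, "isoform 3": 20.0},
    reference="isoform 3",
)
```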