10 research outputs found

    Average Viterbi Decoding Accuracy over 10 different trials (instances) of 10 k-length synthetic 3-level signal data, where all levels have identical Poisson duration but the separation (Gaussian emission means) between the levels varies

    Copyright information: Taken from "A novel, fast, HMM-with-Duration implementation – for application with a new, pattern recognition informed, nanopore detector". http://www.biomedcentral.com/1471-2105/8/S7/S19. BMC Bioinformatics 2007;8(Suppl 7):S19-S19. Published online 1 Nov 2007. PMCID: PMC2099487.

    The Viterbi decoding accuracy improves as the number of bins increases in the decoding HMM's approximation of the Poisson durations, which were generated using a 1 k-bin length distribution representation in the generating HMM. From left to right in each plot, the Viterbi response improves as the separation of the 3 levels (emission means) increases. When all levels have identical attributes, decoding performance is random 3-way guessing, so the expected 3333 out of 10000 correct is observed in all cases. Decoding performance is also shown for distributions with means 19 (for geometric), 19.25 and 19.5 (with Poisson distributed dwell-times).
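The link between level separation and Viterbi accuracy described in this caption can be illustrated with a small simulation. The following is only a minimal sketch under stated assumptions: it uses a plain HMM with geometric dwell times (not the HMM-with-Duration method of the cited paper), unit-variance Gaussian emissions, and a stay probability of 0.95 (mean dwell of 20 samples). The function names `simulate`, `viterbi`, and `accuracy` are hypothetical and not taken from the paper.

```python
import math
import random


def simulate(means, stay, n, rng, sigma=1.0):
    """Generate a hidden level sequence and noisy Gaussian observations."""
    k = len(means)
    state = rng.randrange(k)
    states, obs = [], []
    for _ in range(n):
        states.append(state)
        obs.append(rng.gauss(means[state], sigma))
        if rng.random() > stay:
            # Leave the current level for one of the others, chosen uniformly.
            state = rng.choice([s for s in range(k) if s != state])
    return states, obs


def viterbi(obs, means, stay, sigma=1.0):
    """Most likely state path for a k-state HMM with Gaussian emissions."""
    k = len(means)
    log_stay = math.log(stay)
    log_move = math.log((1.0 - stay) / (k - 1))

    def log_emit(x, s):
        # Gaussian log-density up to a constant shared by all states.
        return -((x - means[s]) ** 2) / (2.0 * sigma ** 2)

    def log_trans(p, s):
        return log_stay if p == s else log_move

    score = [math.log(1.0 / k) + log_emit(obs[0], s) for s in range(k)]
    back = []  # back[t][s] = best predecessor of state s at time t + 1
    for x in obs[1:]:
        new_score, pointers = [], []
        for s in range(k):
            best_prev = max(range(k), key=lambda p: score[p] + log_trans(p, s))
            new_score.append(score[best_prev] + log_trans(best_prev, s) + log_emit(x, s))
            pointers.append(best_prev)
        score = new_score
        back.append(pointers)

    # Trace the best path backwards from the best final state.
    path = [max(range(k), key=lambda s: score[s])]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    path.reverse()
    return path


def accuracy(means, stay=0.95, n=2000, seed=0):
    """Fraction of samples whose decoded level matches the true level."""
    rng = random.Random(seed)
    states, obs = simulate(means, stay, n, rng)
    decoded = viterbi(obs, means, stay)
    return sum(a == b for a, b in zip(states, decoded)) / n
```

Calling `accuracy([0.0, 0.5, 1.0])` and `accuracy([0.0, 5.0, 10.0])` compares closely spaced and widely spaced emission means. The second should decode far more samples correctly, which mirrors the left-to-right trend the caption describes.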

    Decoding performance with distributions with means 19 (for geometric), 20 and 21 (with Poisson distributed dwell-times)

    Copyright information: Taken from "A novel, fast, HMM-with-Duration implementation – for application with a new, pattern recognition informed, nanopore detector". http://www.biomedcentral.com/1471-2105/8/S7/S19. BMC Bioinformatics 2007;8(Suppl 7):S19-S19. Published online 1 Nov 2007. PMCID: PMC2099487.

    Decoding performance with distributions with different mean separations, with a 1000-bin representation of the state dwell-time distribution. (See Fig. 11 caption for further details.)

    Recovered duration histograms by learning the randomly initialized explicit duration DHMM for the maximum state duration of 30

    Copyright information: Taken from "Duration learning for analysis of nanopore ionic current blockades". http://www.biomedcentral.com/1471-2105/8/S7/S14. BMC Bioinformatics 2007;8(Suppl 7):S14-S14. Published online 1 Nov 2007. PMCID: PMC2099482.

    Mixture of convolutions for Aggregate states 1 (Agr1) and 4 (Agr4) where in brackets we include mixture component number

    Copyright information: Taken from "Duration learning for analysis of nanopore ionic current blockades". http://www.biomedcentral.com/1471-2105/8/S7/S14. BMC Bioinformatics 2007;8(Suppl 7):S14-S14. Published online 1 Nov 2007. PMCID: PMC2099482.

    Transitions with weight 0.001 are negligible and were forcibly assigned by the learning algorithms so as not to cause underflow in the forward-backward procedure.

    RNA CoMPASS: A Dual Approach for Pathogen and Host Transcriptome Analysis of RNA-Seq Datasets

    High-throughput RNA sequencing (RNA-seq) has become an instrumental assay for the analysis of multiple aspects of an organism's transcriptome. Further, the analysis of a biological specimen's associated microbiome can also be performed using RNA-seq data and this application is gaining interest in the scientific community. There are many existing bioinformatics tools designed for analysis and visualization of transcriptome data. Despite the availability of an array of next generation sequencing (NGS) analysis tools, the analysis of RNA-seq data sets poses a challenge for many biomedical researchers who are not familiar with command-line tools. Here we present RNA CoMPASS, a comprehensive RNA-seq analysis pipeline for the simultaneous analysis of transcriptomes and metatranscriptomes from diverse biological specimens. RNA CoMPASS leverages existing tools and parallel computing technology to facilitate the analysis of even very large datasets. RNA CoMPASS has a web-based graphical user interface with intrinsic queuing to control a distributed computational pipeline. RNA CoMPASS was evaluated by analyzing RNA-seq data sets from 45 B-cell samples. Twenty-two of these samples were derived from lymphoblastoid cell lines (LCLs) generated by the infection of naïve B-cells with the Epstein Barr virus (EBV), while another 23 samples were derived from Burkitt's lymphomas (BL), some of which arose in part through infection with EBV. Appropriately, RNA CoMPASS identified EBV in all LCLs and in a fraction of the BLs. Cluster analysis of the human transcriptome component of the RNA CoMPASS output clearly separated the BLs (which have a germinal center-like phenotype) from the LCLs (which have a blast-like phenotype) with evidence of activated MYC signaling and lower interferon and NF-kB signaling in the BLs. Together, this analysis illustrates the utility of RNA CoMPASS in the simultaneous analysis of transcriptome and metatranscriptome data. RNA CoMPASS is freely available at http://rnacompass.sourceforge.net/.

    Developmentally linked human DNA hypermethylation is associated with down-modulation, repression, and upregulation of transcription

    DNA methylation can affect tissue-specific gene transcription in ways that are difficult to discern from studies focused on genome-wide analyses of differentially methylated regions (DMRs). To elucidate the variety of associations between differentiation-related DNA hypermethylation and transcription, we used available epigenomic and transcriptomic profiles from 38 human cell/tissue types to focus on such relationships in 94 genes linked to hypermethylated DMRs in myoblasts (Mb). For 19 of the genes, promoter-region hypermethylation in Mb (and often a few heterologous cell types) was associated with gene repression but, importantly, DNA hypermethylation was absent in many other repressed samples. In another 24 genes, DNA hypermethylation overlapped cryptic enhancers or super-enhancers and correlated with down-modulated, but not silenced, gene expression. However, such methylation was absent, surprisingly, in both non-expressing samples and highly expressing samples. This suggests that some genes need DMR hypermethylation to help repress cryptic enhancer chromatin only when they are actively transcribed. For another 11 genes, we found an association between intergenic hypermethylated DMRs and positive expression of the gene in Mb. DNA hypermethylation/transcription correlations similar to those of Mb were evident sometimes in diverse tissues, such as aorta and brain. Our findings have implications for the possible involvement of methylated DNA in Duchenne's muscular dystrophy, congenital heart malformations, and cancer. This epigenomic analysis suggests that DNA methylation is not simply the inevitable consequence of changes in gene expression but, instead, is often an active agent for fine-tuning transcription in association with development.

    Heat Map representing Human B-Cells analyzed using RNA CoMPASS.

    Human transcript counts from the 45 B-cell samples were imported into the R software environment and analyzed using the edgeR package [15]. Genes with low transcript counts (less than 1 CPM (counts per million)) in the majority of samples were filtered out. The Manhattan (L-1) distance matrix for the samples was computed using the remaining transcript counts, and this was taken as input for hierarchical clustering using the Ward algorithm. After assigning each sample to one of two groups identified by hierarchical clustering (Human B-Cell or Burkitt's Lymphoma), the glmFit function was used to fit the mean log(CPM) for each group and likelihood ratio tests were used to identify those genes that were differentially expressed, with adjusted P < 0.05 following the Benjamini-Hochberg correction for multiple testing. The fitted log(CPM) values for the subset of genes that were differentially expressed in the LCL samples relative to the Burkitt's lymphoma samples were then clustered using the Euclidean distance and complete linkage algorithm to detect groups of co-expressed genes. The expression heat map displays the top 250 differentially expressed genes.
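Three of the steps named in this caption (the CPM filter, the Manhattan distance between samples, and the Benjamini-Hochberg adjustment) are simple enough to sketch in plain Python. This is an illustration only, not the authors' edgeR/R code: the function names, the dictionary-based data layout, and the tie handling in the "majority of samples" rule are all assumptions, and the Ward clustering and likelihood ratio tests are left out.

```python
def counts_per_million(counts):
    """Scale one sample's raw gene counts to counts per million (CPM)."""
    total = sum(counts.values())
    return {gene: 1e6 * c / total for gene, c in counts.items()}


def filter_low_expression(samples, min_cpm=1.0):
    """Drop genes whose CPM is below min_cpm in more than half of the samples."""
    cpm = [counts_per_million(s) for s in samples]
    kept = []
    for gene in samples[0]:
        low = sum(1 for c in cpm if c[gene] < min_cpm)
        if low <= len(samples) / 2:
            kept.append(gene)
    return kept


def manhattan(a, b, genes):
    """Manhattan (L1) distance between two samples over the given genes."""
    return sum(abs(a[g] - b[g]) for g in genes)


def benjamini_hochberg(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted
```

With `samples` given as a list of `{gene: count}` dictionaries, `filter_low_expression(samples)` returns the genes to keep, and `manhattan(samples[0], samples[1], kept)` gives one entry of the sample distance matrix. Genes whose adjusted value from `benjamini_hochberg` falls below 0.05 would be the ones called differentially expressed.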

    Performance Analysis of RNA CoMPASS.

    RNA CoMPASS was deployed on a local cluster and benchmarking was performed. An Akata RNA-seq data set was split into six files of varying sizes: (1) 393.4 MB, 1,397,139 reads; (2) 757 MB, 2,685,149 reads; (3) 1.44 GB, 5,120,805 reads; (4) 2.72 GB, 9,651,466 reads; (5) 5.01 GB, 25,465,406 reads; (6) 8.99 GB, 50,930,812 reads. Overall time was calculated for each file on a single machine (blue column) and on the local 4-node cluster (red column). Speedup time is represented as a green line.
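The caption does not define how speedup was computed. A common convention, assumed here, is the ratio of single-machine time to cluster time, with parallel efficiency as speedup divided by the node count. The sketch below uses that convention; the function names and the timings in the usage note are hypothetical placeholders, not measurements from the benchmark.

```python
def speedup(single_machine_time, cluster_time):
    """Ratio of single-machine run time to cluster run time."""
    return single_machine_time / cluster_time


def parallel_efficiency(single_machine_time, cluster_time, nodes):
    """Speedup divided by node count; 1.0 would mean perfect scaling."""
    return speedup(single_machine_time, cluster_time) / nodes
```

For example, with placeholder timings of 120 minutes on one machine and 40 minutes on a 4-node cluster, `speedup(120.0, 40.0)` is a 3-fold speedup and `parallel_efficiency(120.0, 40.0, 4)` is 0.75.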

    Schematic of RNA CoMPASS (RNA comprehensive multi-processor analysis system for sequencing) architecture.

    RNA CoMPASS is a graphical user interface (GUI) based parallel computation pipeline for the analysis of both exogenous and human sequences from RNA-seq data. It employs one commercial program and several open-source programs to analyze RNA-seq data sets, including Novoalign, SAMMate, BLAST, and MEGAN. Each step results in the subtraction of reads in order to further analyze the unmapped reads for pathogen discovery. The mapped reads are analyzed separately. The end result from this pipeline is pathogen discovery and host transcriptome analysis.

    Differences in Gastric Carcinoma Microenvironment Stratify According to EBV Infection Intensity: Implications for Possible Immune Adjuvant Therapy

    Epstein-Barr virus (EBV) is associated with roughly 10% of gastric carcinomas worldwide (EBVaGC). Although previous investigations provide a strong link between EBV and gastric carcinomas, these studies were performed using selected EBV gene probes. Using a cohort of gastric carcinoma RNA-seq data sets from The Cancer Genome Atlas (TCGA), we performed a quantitative and global assessment of EBV gene expression in gastric carcinomas and assessed EBV associated cellular pathway alterations. EBV transcripts were detected in 17% of samples but these samples varied significantly in EBV coverage depth. In four samples with the highest EBV coverage (hiEBVaGC – high EBV associated gastric carcinoma), transcripts from the BamHI A region comprised the majority of EBV reads. Expression of LMP2 and, to a lesser extent, LMP1 was also observed, as was evidence of abortive lytic replication. Analysis of cellular gene expression indicated significant immune cell infiltration and a predominant IFNG response in samples expressing high levels of EBV transcripts relative to samples expressing low or no EBV transcripts. Despite the apparent immune cell infiltration, high levels of the cytotoxic T-cell (CTL) and natural killer (NK) cell inhibitor, IDO1, were observed in the hiEBVaGC samples, suggesting an active tolerance inducing pathway in this subgroup. These results were confirmed in a separate cohort of 21 Vietnamese gastric carcinoma samples using qRT-PCR and on tissue samples using in situ hybridization and immunohistochemistry. Lastly, a panel of tumor suppressors and candidate oncogenes was expressed at lower levels in hiEBVaGC versus EBV-low and EBV-negative gastric cancers, suggesting the direct regulation of tumor pathways by EBV.