RTCGAToolbox: A New Tool for Exporting TCGA Firehose Data
Background & Objective: Managing data from large-scale projects such as The Cancer Genome Atlas (TCGA) for further analysis is an important and time-consuming step for research projects. Several efforts, such as the Firehose project, make TCGA pre-processed data publicly available via web services and data portals, but this information must be managed, downloaded and prepared for subsequent steps. We have developed an open-source and extensible R-based data client for pre-processed data from the Firehose, and demonstrate its use with sample case studies. Results show that our RTCGAToolbox can facilitate data management for researchers interested in working with TCGA data. The RTCGAToolbox can also be integrated with other analysis pipelines for further data processing. Availability and implementation: The RTCGAToolbox is open-source and licensed under the GNU General Public License Version 2.0. All documentation and source code for RTCGAToolbox is freely available at http://mksamur.github.io/RTCGAToolbox/ for Linux and Mac OS X operating systems.
canEvolve: A Web Portal for Integrative Oncogenomics
Background & objective: Genome-wide profiles of tumors obtained using functional genomics platforms are being deposited to the public repositories at an astronomical scale, as a result of focused efforts by individual laboratories and large projects such as the Cancer Genome Atlas (TCGA) and the International Cancer Genome Consortium. Consequently, there is an urgent need for reliable tools that integrate and interpret these data in light of current knowledge and disseminate results to biomedical researchers in a user-friendly manner. We have built the canEvolve web portal to meet this need. Results: canEvolve query functionalities are designed to fulfill the most frequent analysis needs of cancer researchers with a view to generating novel hypotheses. canEvolve stores gene, microRNA (miRNA) and protein expression profiles, copy number alterations for multiple cancer types, and protein-protein interaction information. canEvolve allows querying of results of primary analysis, integrative analysis and network analysis of oncogenomics data. The querying for primary analysis includes differential gene and miRNA expression as well as changes in gene copy number measured with SNP microarrays. canEvolve provides results of integrative analysis of gene expression profiles with copy number alterations and with miRNA profiles, as well as generalized integrative analysis using gene set enrichment analysis. The network analysis capability includes storage and visualization of gene co-expression, inferred gene regulatory networks and protein-protein interaction information. Finally, canEvolve provides correlations between gene expression and clinical outcomes in terms of univariate survival analysis. Conclusion: At present canEvolve provides different types of information extracted from 90 cancer genomics studies comprising more than 10,000 patients.
The presence of multiple data types, novel integrative analysis for identifying regulators of oncogenesis, network analysis and the ability to query gene lists/pathways are distinctive features of canEvolve. canEvolve will facilitate integrative and meta-analysis of oncogenomics datasets.
Overall RTCGAToolbox structure and workflow.
(A) Overall representation of RTCGAToolbox layers from Firehose web portal to user environments. (B) Sample workflow for “BRCA” dataset.
Current Firehose data content (some of these data may not be accessible due to TCGA data restrictions; the full data table is available at http://gdac.broadinstitute.org/runs/stddata__2014_03_16/ingested_data.html).
Summary plot for BRCA dataset.
A circle plot showing differentially expressed genes from the RNASeq and microarray platforms (inner circles 1 and 2; the y axis represents the fold change value, red dots are up-regulated and blue dots are down-regulated in cancer samples), copy number changes (third inner circle; blue zones represent deletions and red zones represent amplifications) and, in the outer circle, genes that are mutated in at least 5% of samples.
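The selection rules described in this caption can be sketched in a few lines of Python. This is only an illustration and not RTCGAToolbox code (the toolbox itself is written in R): the function names, the gene names and the toy data are hypothetical, and the sketch assumes that fold changes are on a log scale, so that the sign gives the direction of regulation.

```python
from typing import Dict, List


def regulation_label(log_fold_change: float) -> str:
    """Label a gene by the sign of its log fold change (cancer vs. normal).

    Assumption: values are log fold changes, so 0 means no change.
    """
    if log_fold_change > 0:
        return "up"      # drawn as a red dot in the caption's colour scheme
    if log_fold_change < 0:
        return "down"    # drawn as a blue dot
    return "unchanged"


def frequently_mutated(
    mutations: Dict[str, List[bool]], min_fraction: float = 0.05
) -> List[str]:
    """Return genes mutated in at least `min_fraction` of samples.

    `mutations` maps a gene name to one True/False call per sample.
    """
    selected = []
    for gene, calls in mutations.items():
        if calls and sum(calls) / len(calls) >= min_fraction:
            selected.append(gene)
    return sorted(selected)


# Hypothetical toy data, for illustration only.
toy_fold_changes = {"GENE_A": 2.1, "GENE_B": -1.4}
toy_mutations = {
    "GENE_A": [True] + [False] * 9,    # mutated in 1 of 10 samples
    "GENE_B": [False] * 40,            # never mutated
    "GENE_C": [True] + [False] * 39,   # mutated in 1 of 40 samples
}

labels = {gene: regulation_label(fc) for gene, fc in toy_fold_changes.items()}
mutated = frequently_mutated(toy_mutations)
print(labels)
print(mutated)
```

With the toy data, only the gene mutated in 1 of 10 samples passes the 5% threshold; the gene mutated in 1 of 40 samples does not.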
Sample heatmap outputs from BRCA dataset.
Panels A and B show the top differentially up- and down-regulated genes between “Cancer” and “Normal” samples using RNASeq and microarray data, respectively.
Deciphering the Chronology of Copy Number Alterations in Multiple Myeloma (MM): What Comes First?