61 research outputs found
Polychrome: Creating and Assessing Qualitative Palettes with Many Colors
Although R includes numerous tools for creating color palettes to display continuous data, facilities for displaying categorical data primarily use the RColorBrewer package, which is, by default, limited to 12 colors. The colorspace package can produce more colors, but it is not immediately clear how to use it to produce colors that can be reliably distinguished in different kinds of plots. However, applications to genomics would be enhanced by the ability to display at least the 24 human chromosomes in distinct colors, as is common in technologies like spectral karyotyping. In this article, we describe the Polychrome package, which can be used to construct palettes with at least 24 colors that can be distinguished by most people with normal color vision. Polychrome includes a variety of visualization methods allowing users to evaluate the proposed palettes. In addition, we review the history of attempts to construct qualitative color palettes with many colors.
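One way to picture the problem of choosing many mutually distinguishable colors is a greedy "farthest point" search in a roughly perceptual color space. The Python sketch below is only an illustration under stated assumptions and is not the Polychrome implementation: it uses CIE L*a*b* distance as a stand-in for perceptual difference, a coarse grid of sRGB candidates, and a white seed so that chosen colors also stay away from a white background. The function names (srgb_to_lab, greedy_palette, to_hex) are hypothetical.

```python
import itertools
import math


def srgb_to_lab(rgb):
    """Convert an sRGB triple (0-255 ints) to CIE L*a*b* (D65 white point)."""
    def linearize(c):
        c = c / 255.0
        return c / 12.92 if c <= 0.04045 else ((c + 0.055) / 1.055) ** 2.4

    r, g, b = (linearize(c) for c in rgb)
    # Linear sRGB -> CIE XYZ (D65 primaries)
    x = 0.4124 * r + 0.3576 * g + 0.1805 * b
    y = 0.2126 * r + 0.7152 * g + 0.0722 * b
    z = 0.0193 * r + 0.1192 * g + 0.9505 * b

    def f(t):
        return t ** (1 / 3) if t > 0.008856 else 7.787 * t + 16 / 116

    fx, fy, fz = f(x / 0.95047), f(y / 1.0), f(z / 1.08883)
    return (116 * fy - 16, 500 * (fx - fy), 200 * (fy - fz))


def greedy_palette(n, seed=(255, 255, 255), step=51):
    """Pick n colors; each new color maximizes its minimum Lab distance
    to the seed and to every color chosen so far (assumed strategy)."""
    levels = range(0, 256, step)
    candidates = list(itertools.product(levels, repeat=3))
    lab = {c: srgb_to_lab(c) for c in candidates}
    chosen_lab = [srgb_to_lab(seed)]  # the seed itself is not part of the palette
    palette = []
    for _ in range(n):
        best = max(
            (c for c in candidates if c not in palette),
            key=lambda c: min(math.dist(lab[c], s) for s in chosen_lab),
        )
        palette.append(best)
        chosen_lab.append(lab[best])
    return palette


def to_hex(rgb):
    """Format an sRGB triple as a #RRGGBB string."""
    return "#{:02X}{:02X}{:02X}".format(*rgb)


palette = greedy_palette(24)
hex_codes = [to_hex(c) for c in palette]
```

A greedy search like this does not guarantee that every pair of colors is easy to tell apart in every kind of plot, which is why visual evaluation of a proposed palette, as the abstract describes, remains necessary.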
Pattern recognition in lymphoid malignancies using CytoGPS and Mercator
BACKGROUND: There have been many recent breakthroughs in processing and analyzing large-scale data sets in biomedical informatics. For example, the CytoGPS algorithm has enabled the use of text-based karyotypes by transforming them into a binary model. However, such advances are accompanied by new problems of data sparsity, heterogeneity, and noisiness that are magnified by the large-scale multidimensional nature of the data. To address these problems, we developed the Mercator R package, which processes and visualizes binary biomedical data. We use Mercator to address biomedical questions of cytogenetic patterns relating to lymphoid hematologic malignancies, which include a broad set of leukemias and lymphomas. Karyotype data are one of the most common forms of genetic data collected on lymphoid malignancies, because karyotyping is part of the standard of care in these cancers.
RESULTS: In this paper we combine the analytic power of CytoGPS and Mercator to perform a large-scale multidimensional pattern recognition study on 22,741 karyotype samples in 47 different hematologic malignancies obtained from the public Mitelman database.
CONCLUSION: Our findings indicate that Mercator was able to identify both known and novel cytogenetic patterns across different lymphoid malignancies, furthering our understanding of the genetics of these diseases.
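To make the idea of comparing samples encoded as binary vectors concrete, the following Python sketch computes pairwise Jaccard distances, one common choice for sparse binary data. It is an assumption that this metric is relevant here; the abstract does not say which distances were used. The tiny input vectors and the function names are invented purely for illustration.

```python
def jaccard_distance(a, b):
    """Jaccard distance between two equal-length binary vectors."""
    both = sum(1 for x, y in zip(a, b) if x and y)
    either = sum(1 for x, y in zip(a, b) if x or y)
    # Two all-zero vectors are treated as identical (a convention, not a rule).
    return 0.0 if either == 0 else 1.0 - both / either


def distance_matrix(samples):
    """Symmetric matrix of pairwise Jaccard distances."""
    n = len(samples)
    return [[jaccard_distance(samples[i], samples[j]) for j in range(n)]
            for i in range(n)]


# Hypothetical binary feature vectors (1 = abnormality present), not real data.
samples = [
    [1, 0, 1, 0],
    [1, 1, 0, 0],
    [0, 0, 0, 1],
]
dist = distance_matrix(samples)
```

Because shared zeros are ignored, a distance of this kind is less dominated by the many absent features that make such data sparse.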
A Two-Gene Signature, SKI and SLAMF1, Predicts Time-to-Treatment in Previously Untreated Patients with Chronic Lymphocytic Leukemia
We developed and validated a two-gene signature that predicts prognosis in previously untreated chronic lymphocytic leukemia (CLL) patients. Using a 65-sample training set, from a cohort of 131 patients, we identified the best clinical models to predict time-to-treatment (TTT) and overall survival (OS). To identify individual genes or combinations in the training set with expression related to prognosis, we cross-validated univariate and multivariate models to predict TTT. We identified four gene sets (5, 6, 12, or 13 genes) to construct multivariate prognostic models. By optimizing each gene set on the training set, we constructed 11 models to predict the time from diagnosis to treatment. Each model also predicted OS and added value to the best clinical models. To determine which contributed the most value when added to clinical variables, we applied the Akaike Information Criterion. Two genes were consistently retained in the models with clinical variables: SKI (v-SKI avian sarcoma viral oncogene homolog) and SLAMF1 (signaling lymphocytic activation molecule family member 1; CD150). We optimized a two-gene model and validated it on an independent test set of 66 samples. This two-gene model predicted prognosis better on the test set than any of the known predictors, including ZAP70 and serum β2-microglobulin.
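The Akaike Information Criterion mentioned above trades goodness of fit against model size: AIC = 2k - 2 ln(L), where k is the number of estimated parameters and L is the maximized likelihood, and the model with the lowest AIC is preferred. The Python sketch below shows the comparison only. The model names, log-likelihoods, and parameter counts are made-up placeholders, not values from the study.

```python
def aic(log_likelihood, n_params):
    """Akaike Information Criterion: 2k - 2 ln(L)."""
    return 2 * n_params - 2 * log_likelihood


# Hypothetical fitted models: (maximized log-likelihood, number of parameters).
models = {
    "clinical_only": (-120.0, 3),
    "clinical_plus_gene_A": (-112.5, 4),
    "clinical_plus_gene_A_and_B": (-112.0, 5),
}

scores = {name: aic(ll, k) for name, (ll, k) in models.items()}
best_model = min(scores, key=scores.get)  # lowest AIC wins
```

In this invented example the third model fits slightly better than the second but pays a penalty of 2 for its extra parameter, so the second model has the lowest AIC.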
Myeloid neoplasm with histiocytosis and spleen tyrosine kinase fusion responds to fostamatinib
Not available
Transformation of Human Mesenchymal Cells and Skin Fibroblasts into Hematopoietic Cells
Patients with prolonged myelosuppression require frequent platelet and occasional granulocyte transfusions. Multi-donor transfusions induce alloimmunization, thereby increasing morbidity and mortality. Therefore, an autologous or HLA-matched allogeneic source of platelets and granulocytes is needed. To determine whether nonhematopoietic cells can be reprogrammed into hematopoietic cells, human mesenchymal stromal cells (MSCs) and skin fibroblasts were incubated with the demethylating agent 5-azacytidine (Aza) and the growth factors (GF) granulocyte-macrophage colony-stimulating factor and stem cell factor. This treatment transformed MSCs to round, non-adherent cells expressing T-, B-, myeloid-, or stem/progenitor-cell markers. The transformed cells engrafted as hematopoietic cells in bone marrow of immunodeficient mice. DNA methylation and mRNA array analysis suggested that Aza and GF treatment demethylated and activated HOXB genes. Indeed, transfection of MSCs or skin fibroblasts with HOXB4, HOXB5, and HOXB2 genes transformed them into hematopoietic cells. Further studies are needed to determine whether transformed MSCs or skin fibroblasts are suitable for therapy.
Thirty biologically interpretable clusters of transcription factors distinguish cancer type
BACKGROUND: Transcription factors are essential regulators of gene expression and play critical roles in development, differentiation, and in many cancers. To carry out their regulatory programs, they must cooperate in networks and bind simultaneously to sites in promoter or enhancer regions of genes. We hypothesize that the mRNA co-expression patterns of transcription factors can be used both to learn how they cooperate in networks and to distinguish between cancer types.
RESULTS: We recently developed a new algorithm, Thresher, that combines principal component analysis, outlier filtering, and von Mises-Fisher mixture models to cluster genes (in this case, transcription factors) based on expression, determining the optimal number of clusters in the process. We applied Thresher to the RNA-Seq expression data of 486 transcription factors from more than 10,000 samples of 33 kinds of cancer studied in The Cancer Genome Atlas (TCGA). We found that 30 clusters of transcription factors from a 29-dimensional principal component space were able to distinguish between most cancer types, and could separate tumor samples from normal controls. Moreover, each cluster of transcription factors could be either (i) linked to a tissue-specific expression pattern or (ii) associated with a fundamental biological process such as cell cycle, angiogenesis, apoptosis, or cytoskeleton. Clusters of the second type were more likely also to be associated with embryonically lethal mouse phenotypes.
CONCLUSIONS: Using our approach, we have shown that the mRNA expression patterns of transcription factors contain most of the information needed to distinguish different cancer types. The Thresher method is capable of discovering biologically interpretable clusters of genes. It can potentially be applied to other gene sets, such as signaling pathways, to decompose them into simpler, yet biologically meaningful, components.
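Von Mises-Fisher mixture models describe points on a unit sphere, so clustering with them means comparing the directions of vectors rather than their lengths. The Python sketch below illustrates only that preprocessing idea and is not the Thresher implementation: it drops vectors that are too short to have a reliable direction, rescales the rest to unit length, and compares them by cosine similarity. The length-based outlier rule, the 0.3 threshold, and the two-dimensional loading vectors are all assumptions made for illustration.

```python
import math


def norm(v):
    """Euclidean length of a vector."""
    return math.sqrt(sum(x * x for x in v))


def filter_and_normalize(loadings, min_length=0.3):
    """Drop short vectors (treated here as outliers) and scale the rest
    to unit length so that only direction remains."""
    kept = {}
    for name, vec in loadings.items():
        length = norm(vec)
        if length >= min_length:
            kept[name] = tuple(x / length for x in vec)
    return kept


def cosine(u, v):
    """Cosine similarity of two unit vectors (their dot product)."""
    return sum(a * b for a, b in zip(u, v))


# Hypothetical loadings of four genes on two principal components.
loadings = {
    "TF_A": (0.8, 0.1),
    "TF_B": (0.7, 0.2),
    "TF_C": (-0.1, 0.9),
    "TF_D": (0.05, 0.02),  # short vector, removed by the filter
}
units = filter_and_normalize(loadings)
similarity_ab = cosine(units["TF_A"], units["TF_B"])
similarity_ac = cosine(units["TF_A"], units["TF_C"])
```

Here TF_A and TF_B point in nearly the same direction and would tend to fall in the same directional cluster, while TF_C points elsewhere. Fitting an actual mixture model and choosing the number of clusters are beyond this sketch.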