    VizRank: Data Visualization Guided by Machine Learning

    Data visualization plays a crucial role in identifying interesting patterns in exploratory data analysis. Its use is, however, made difficult by the large number of possible data projections, each showing a different attribute subset, that the data analyst must evaluate. In this paper, we introduce a method called VizRank, which is applied to class-labeled data to automatically select the most useful data projections. VizRank can be used with any visualization method that maps attribute values to points in a two-dimensional visualization space. It assesses possible data projections and ranks them by their ability to visually discriminate between classes. The quality of class separation is estimated by computing the predictive accuracy of a k-nearest neighbor classifier on the data set consisting of the x and y positions of the projected data points and their class information. The paper introduces the method and presents experimental results which show that VizRank's ranking of projections agrees closely with subjective rankings by data analysts. The practical use of VizRank is also demonstrated by an application in the field of functional genomics.
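
The ranking idea in this abstract can be illustrated with a minimal sketch. It assumes that each projection is a plain scatterplot of one attribute pair and that predictive accuracy is measured by leave-one-out evaluation; neither detail is stated in the abstract, and the function names and toy data are hypothetical.

```python
from collections import Counter
from itertools import combinations
import math


def knn_loo_accuracy(points, labels, k=1):
    """Leave-one-out accuracy of a k-NN classifier on 2-D points."""
    correct = 0
    for i, p in enumerate(points):
        # Distances from point i to every other projected point.
        neighbours = sorted(
            (math.dist(p, q), labels[j])
            for j, q in enumerate(points)
            if j != i
        )[:k]
        votes = Counter(label for _, label in neighbours)
        if votes.most_common(1)[0][0] == labels[i]:
            correct += 1
    return correct / len(points)


def rank_scatterplots(rows, labels, k=1):
    """Score every attribute pair (one scatterplot each), best first."""
    scored = []
    for a, b in combinations(range(len(rows[0])), 2):
        projected = [(row[a], row[b]) for row in rows]
        scored.append((knn_loo_accuracy(projected, labels, k), (a, b)))
    return sorted(scored, reverse=True)


# Toy data (invented): attribute 0 separates the classes, 1 and 2 do not.
rows = [
    (0.0, 5.0, 1.0), (0.1, 1.0, 9.0), (0.2, 8.0, 4.0), (0.3, 3.0, 6.0),
    (5.0, 4.9, 1.2), (5.1, 1.1, 8.8), (5.2, 8.1, 4.1), (5.3, 3.1, 6.1),
]
labels = ["a"] * 4 + ["b"] * 4
ranking = rank_scatterplots(rows, labels)
```

In this toy set, the scatterplots that include attribute 0 rank at the top, while the pair of uninformative attributes ranks last.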

    A Differentiation-Based Phylogeny of Cancer Subtypes

    Histopathological classification of human tumors relies in part on the degree of differentiation of the tumor sample. To date, there is no objective, systematic method to categorize tumor subtypes by maturation. In this paper, we introduce a novel computational algorithm to rank tumor subtypes according to the dissimilarity of their gene expression from that of stem cells and fully differentiated tissue, and thereby construct a phylogenetic tree of cancer. We validate our methodology with expression data of leukemia, breast cancer and liposarcoma subtypes and then apply it to a broader group of sarcomas. The resulting ranking of tumor subtypes allows the identification of genes correlated with differentiation and may help to identify novel therapeutic targets. Our algorithm represents the first phylogeny-based tool to analyze the differentiation status of human tumors.
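
The abstract does not specify the algorithm, so the following is only an illustrative sketch of ranking by dissimilarity from two reference profiles. The Euclidean distance, the score formula, the function name, and the expression values are all assumptions, and the sketch produces a simple ordering, not the phylogenetic tree the authors construct.

```python
import math


def differentiation_score(profile, stem, differentiated):
    """Hypothetical score: 0 = like stem cells, 1 = like mature tissue."""
    d_stem = math.dist(profile, stem)
    d_diff = math.dist(profile, differentiated)
    return d_stem / (d_stem + d_diff)


# Invented three-gene expression profiles, for illustration only.
stem = (9.0, 8.0, 1.0)
differentiated = (1.0, 2.0, 9.0)
subtypes = {
    "subtype_A": (8.0, 7.5, 2.0),
    "subtype_B": (5.0, 5.0, 5.0),
    "subtype_C": (2.0, 2.5, 8.0),
}

# Order subtypes from least to most differentiated.
ranked = sorted(
    subtypes,
    key=lambda name: differentiation_score(subtypes[name], stem, differentiated),
)
```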

    Tissue Phenomics for prognostic biomarker discovery in low- and intermediate-risk prostate cancer

    Tissue Phenomics is the discipline of mining tissue images to identify patterns that are related to clinical outcome, providing potential prognostic and predictive value. This involves the discovery process from assay development, image analysis, and data mining to the final interpretation and validation of the findings. Importantly, this process is not linear but allows backward steps and optimization loops over multiple sub-processes. We provide a detailed description of the Tissue Phenomics methodology while exemplifying each step on the application of prostate cancer recurrence prediction. In particular, we automatically identified tissue-based biomarkers having significant prognostic value for low- and intermediate-risk prostate cancer patients (Gleason scores 6-7b) after radical prostatectomy. We found that promising phenes were related to CD8(+) and CD68(+) cells in the microenvironment of cancerous glands in combination with the local micro-vascularization. Recurrence prediction based on the selected phenes yielded accuracies of up to 83%, thereby clearly outperforming prediction based on the Gleason score. Moreover, we compared different machine learning algorithms to combine the most relevant phenes, resulting in increased accuracies of 88% for tumor progression prediction. These findings will be of potential use for future prognostic tests for prostate cancer patients and provide a proof-of-principle of the Tissue Phenomics approach.

    Differentiation associated regulation of microRNA expression in vivo in human CD8+ T cell subsets

    BACKGROUND: The differentiation of CD8+ T lymphocytes following priming of naïve cells is central to the establishment of the adaptive immune response. Yet, the molecular events underlying this process are not fully understood. MicroRNAs have recently been shown to play a key role in the regulation of haematopoiesis in mouse, but their involvement in peripheral lymphocyte differentiation in humans remains largely unknown. METHODS: In order to explore the potential involvement of microRNAs in CD8+ T cell differentiation in humans, microRNA expression profiles were analysed using microarrays and quantitative PCR in several human CD8+ T cell subsets defining the major steps of the T cell differentiation pathway. RESULTS: We found expression of a limited set of microRNAs, including the miR-17~92 cluster. Moreover, we reveal the existence of differentiation-associated regulation of specific microRNAs. When compared to naïve cells, miR-21 and miR-155 were found to be upregulated upon differentiation to effector cells, while expression of the miR-17~92 cluster tended to decrease concomitantly. CONCLUSIONS: This study establishes for the first time, in a large panel of individuals, the existence of differentiation-associated regulation of microRNA expression in human CD8+ T lymphocytes in vivo, which is likely to affect specific cellular functions.

    Wine metrics: revealing the volatile molecular feature responsible for the wine like aroma

    Wine presents a variety of aromas that stem from a complex, completely non-linear system of interactions among many hundreds of compounds. Aroma is usually associated with odorous or volatile compounds that result from fermentation. Yeast strain and fermentation conditions are claimed to be the most important factors influencing the aromas produced in wine. Fierce competition is forcing wine producers to better understand the expectations and preferences of their target market so that they can produce wines accordingly. The motivation of this thesis is a tool for selecting yeast strains according to their volatile output, leading to better acceptance by consumers. This work provides insights into the connection between wine fermentation metabolism and the perception of a “wine-like” aroma in the product. In this context, 3 replicates of 4 fermentations of a synthetic grape juice were performed, each with a different S. cerevisiae strain (3 wine strains: QA23, VL1 and ZA; 1 cachaça strain: L328). The metabolic profiles of the fermentations were obtained using gas chromatography coupled to a flame ionization detector (GC-FID) or to mass spectrometry (GC-MS), and high-performance liquid chromatography (HPLC). A sensory study was also used to evaluate whether consumers recognized a “wine-like” aroma during the fermentations.
    This targeted approach, coupled with unsupervised analysis, namely Singular Value Decomposition (SVD) and Hierarchical Cluster Analysis (HCA), revealed that the key odorants of the “wine-like” aroma are acetaldehyde, hexyl acetate and ethyl esters. L328 proved to be the strain with the best sensory scores and the best correlation with the compounds responsible for the “wine-like” aroma. A supervised analysis, a partial least squares regression (PLS-R) model, allowed the “wine-like” sensory scores of the samples during the fermentation process to be predicted from the aroma profile (R2 = 0.8). Finally, an untargeted metabolomic approach combining GC-MS data preprocessing with SVD was able to distinguish strains based on the evolution of their metabolic profiles over the fermentation time. The L328 and QA23 strains were easily distinguished from each other and from the pair ZA and VL1. In conclusion, this study explored chemometrics and bioinformatics approaches for the characterization, prediction and classification of metabolic profiles from fermentations, and showed the possibility of selecting the yeast strain according to the characteristics of the final product.
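
For readers unfamiliar with the R2 figure quoted for the PLS-R model, the sketch below shows how a coefficient of determination is computed from observed and predicted scores. The numbers are invented for illustration and are not the thesis data; the abstract also does not say whether its R2 comes from calibration or cross-validation.

```python
def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    # Residual sum of squares: error left after prediction.
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    # Total sum of squares: spread of the observed scores around their mean.
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


# Hypothetical sensory scores and model predictions (not from the thesis).
observed = [1.0, 2.0, 3.0, 4.0, 5.0]
predicted = [1.5, 1.5, 3.0, 4.5, 4.5]
score = r_squared(observed, predicted)
```

A value of 1 means the predictions reproduce the observed scores exactly; a value of 0 means they do no better than always predicting the mean score.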

    FOCIS: A forest classification and inventory system using LANDSAT and digital terrain data

    Accurate, cost-effective stratification of forest vegetation and timber inventory is the primary goal of a Forest Classification and Inventory System (FOCIS). Conventional timber stratification using photointerpretation can be time-consuming, costly, and inconsistent from analyst to analyst. FOCIS was designed to overcome these problems by using machine processing techniques to extract and process tonal, textural, and terrain information from registered LANDSAT multispectral and digital terrain data. Comparison of samples from timber strata identified by conventional procedures and by FOCIS showed that both have about the same potential to reduce the variance of timber volume estimates over simple random sampling.
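
Why stratification reduces the variance of a volume estimate can be shown with a minimal sketch. It uses the textbook formulas for the variance of a sample mean under simple random sampling and under stratified sampling with proportional allocation, and ignores the finite-population correction. The plot volumes are invented, and nothing here reproduces the FOCIS procedure itself.

```python
from statistics import variance


def srs_variance(values, n):
    """Variance of the sample mean under simple random sampling."""
    return variance(values) / n


def stratified_variance(strata, n):
    """Variance of the stratified mean with proportional allocation."""
    total = sum(len(s) for s in strata)
    var = 0.0
    for s in strata:
        weight = len(s) / total   # stratum share of the population
        n_h = n * weight          # samples allocated to this stratum
        var += weight ** 2 * variance(s) / n_h
    return var


# Hypothetical timber volumes per plot, grouped into three strata.
strata = [
    [10, 12, 11, 13],
    [40, 42, 41, 43],
    [80, 82, 81, 83],
]
all_volumes = [v for s in strata for v in s]

var_srs = srs_variance(all_volumes, n=6)
var_strat = stratified_variance(strata, n=6)
```

Because the plots within each stratum are similar, the stratified estimate has a much smaller variance than the simple random sample of the same size.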