VizRank: Data Visualization Guided by Machine Learning
Data visualization plays a crucial role in identifying interesting patterns in exploratory data analysis. Its use is, however, made difficult by the large number of possible data projections, each showing a different attribute subset, that the data analyst must evaluate. In this paper, we introduce a method called VizRank, which is applied to classified data to automatically select the most useful data projections. VizRank can be used with any visualization method that maps attribute values to points in a two-dimensional visualization space. It assesses possible data projections and ranks them by their ability to visually discriminate between classes. The quality of class separation is estimated by computing the predictive accuracy of a k-nearest neighbor classifier on the data set consisting of the x and y positions of the projected data points and their class information. The paper introduces the method and presents experimental results showing that VizRank's ranking of projections agrees closely with subjective rankings by data analysts. The practical use of VizRank is also demonstrated by an application in the field of functional genomics.
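The ranking idea in this abstract can be illustrated with a minimal sketch. It assumes scatterplots as the visualization method, leave-one-out evaluation, and a plain majority vote; the abstract specifies none of these, and the function names, attribute names, and toy data are hypothetical.

```python
from itertools import combinations
from math import dist


def knn_loo_accuracy(points, labels, k=3):
    """Leave-one-out accuracy of a k-NN majority vote on 2-D points."""
    correct = 0
    for i, p in enumerate(points):
        # Indices of the k points closest to the held-out point.
        neighbours = sorted(
            (j for j in range(len(points)) if j != i),
            key=lambda j: dist(p, points[j]),
        )[:k]
        votes = {}
        for j in neighbours:
            votes[labels[j]] = votes.get(labels[j], 0) + 1
        predicted = max(votes, key=votes.get)
        correct += predicted == labels[i]
    return correct / len(points)


def rank_scatterplots(rows, labels, attribute_names, k=3):
    """Score every attribute pair as a scatterplot projection, best first."""
    scored = []
    for a, b in combinations(range(len(attribute_names)), 2):
        # The projection keeps only the x and y positions of each data point.
        projected = [(row[a], row[b]) for row in rows]
        score = knn_loo_accuracy(projected, labels, k)
        scored.append((score, attribute_names[a], attribute_names[b]))
    return sorted(scored, key=lambda t: -t[0])


# Hypothetical toy data: two informative attributes and one noisy attribute.
rows = [
    (0.10, 0.20, 0.0), (0.20, 0.10, 1.0), (0.15, 0.30, 2.0), (0.25, 0.25, 3.0),
    (0.90, 0.80, 0.5), (0.80, 0.90, 1.5), (0.85, 0.70, 2.5), (0.95, 0.75, 3.5),
]
labels = ["a"] * 4 + ["b"] * 4
for score, x_name, y_name in rank_scatterplots(rows, labels, ["gene_a", "gene_b", "noise"]):
    print(f"{x_name} vs {y_name}: {score:.2f}")
```

In this toy set the two informative attributes form the top-ranked scatterplot, while projections that include the unscaled noise attribute score lower. In practice the attributes would probably be scaled to a common range before projection.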
A Differentiation-Based Phylogeny of Cancer Subtypes
Histopathological classification of human tumors relies in part on the degree of differentiation of the tumor sample. To date, there is no objective systematic method to categorize tumor subtypes by maturation. In this paper, we introduce a novel computational algorithm to rank tumor subtypes according to the dissimilarity of their gene expression from that of stem cells and fully differentiated tissue, and thereby construct a phylogenetic tree of cancer. We validate our methodology with expression data of leukemia, breast cancer and liposarcoma subtypes and then apply it to a broader group of sarcomas. The resulting ranking of tumor subtypes allows the identification of genes correlated with differentiation and may help to identify novel therapeutic targets. Our algorithm represents the first phylogeny-based tool to analyze the differentiation status of human tumors.
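The abstract gives no formula for the ranking, so the following is only a rough illustration of ordering subtypes by dissimilarity from two reference profiles. The Euclidean distance, the normalized index, and all names and numbers are assumptions, and the sketch does not build the phylogenetic tree.

```python
from math import dist


def differentiation_index(profile, stem_profile, differentiated_profile):
    """Assumed index in [0, 1]: 0 matches stem cells, 1 matches mature tissue."""
    d_stem = dist(profile, stem_profile)
    d_diff = dist(profile, differentiated_profile)
    if d_stem + d_diff == 0:
        raise ValueError("reference profiles are identical to the sample")
    return d_stem / (d_stem + d_diff)


def rank_subtypes(subtype_profiles, stem_profile, differentiated_profile):
    """Return (name, index) pairs ordered from least to most differentiated."""
    scored = [
        (name, differentiation_index(p, stem_profile, differentiated_profile))
        for name, p in subtype_profiles.items()
    ]
    return sorted(scored, key=lambda item: item[1])


# Hypothetical two-gene expression profiles, not data from the paper.
stem = [0.0, 0.0]
mature = [4.0, 4.0]
subtypes = {
    "subtype_late": [3.0, 3.0],
    "subtype_early": [1.0, 1.0],
    "subtype_mid": [2.0, 2.0],
}
for name, index in rank_subtypes(subtypes, stem, mature):
    print(f"{name}: {index:.2f}")
```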
Tissue Phenomics for prognostic biomarker discovery in low- and intermediate-risk prostate cancer
Tissue Phenomics is the discipline of mining tissue images to identify patterns that are related to clinical outcome, providing potential prognostic and predictive value. This involves the discovery process from assay development, image analysis, and data mining to the final interpretation and validation of the findings. Importantly, this process is not linear but allows backward steps and optimization loops over multiple sub-processes. We provide a detailed description of the Tissue Phenomics methodology while exemplifying each step on the application of prostate cancer recurrence prediction. In particular, we automatically identified tissue-based biomarkers having significant prognostic value for low- and intermediate-risk prostate cancer patients (Gleason scores 6-7b) after radical prostatectomy. We found that promising phenes were related to CD8(+) and CD68(+) cells in the microenvironment of cancerous glands in combination with the local micro-vascularization. Recurrence prediction based on the selected phenes yielded accuracies up to 83%, thereby clearly outperforming prediction based on the Gleason score. Moreover, we compared different machine learning algorithms to combine the most relevant phenes, resulting in increased accuracies of 88% for tumor progression prediction. These findings will be of potential use for future prognostic tests for prostate cancer patients and provide a proof-of-principle of the Tissue Phenomics approach.
Differentiation associated regulation of microRNA expression in vivo in human CD8+ T cell subsets
BACKGROUND: The differentiation of CD8+ T lymphocytes following priming of naïve cells is central in the establishment of the adaptive immune response. Yet, the molecular events underlying this process are not fully understood. MicroRNAs have been recently shown to play a key role in the regulation of haematopoiesis in mouse, but their implication in peripheral lymphocyte differentiation in humans remains largely unknown.
METHODS: In order to explore the potential implication of microRNAs in CD8+ T cell differentiation in humans, microRNA expression profiles were analysed using microarrays and quantitative PCR in several human CD8+ T cell subsets defining the major steps of the T cell differentiation pathway.
RESULTS: We found expression of a limited set of microRNAs, including the miR-17~92 cluster. Moreover, we reveal the existence of differentiation-associated regulation of specific microRNAs. When compared to naive cells, miR-21 and miR-155 were indeed found upregulated upon differentiation to effector cells, while expression of the miR-17~92 cluster tended to concomitantly decrease.
CONCLUSIONS: This study establishes for the first time in a large panel of individuals the existence of differentiation-associated regulation of microRNA expression in human CD8+ T lymphocytes in vivo, which is likely to impact on specific cellular functions.
Wine metrics : revealing the volatile molecular feature responsible for the wine like aroma
Wine represents a variety of aromas that stem from a complex, completely non-linear system of interactions among many hundreds of compounds. Aroma is usually associated with odorous or volatile compounds that result from alcoholic fermentation. Yeast strain and fermentation conditions are claimed to be the most important factors influencing the aromas produced in wine. Fierce competition in the sector is forcing wine producers to understand better the expectations and preferences of their target market so that they can produce wines accordingly. The motivation of this thesis is to provide a tool for understanding the “wine-like” aroma, leading to the selection of S. cerevisiae strains according to a volatile profile that is better accepted by consumers. This work provides insights into the connection between wine fermentation metabolism and the perception of a “wine-like” aroma.
In this context, 3 replicates of 4 fermentations of a synthetic grape juice were performed, each fermentation with a different S. cerevisiae strain (3 wine strains: QA23, VL1 and ZA; 1 cachaça strain: L328). The metabolic profiles of the fermentations were obtained by gas chromatography coupled to a flame ionization detector (GC-FID) or to mass spectrometry (GC-MS) and by high-performance liquid chromatography (HPLC), in order to quantify the compounds. A sensory study was used to evaluate whether consumers recognized a “wine-like” aroma over the course of the fermentations. This targeted approach, coupled with unsupervised multivariate analysis, namely Singular Value Decomposition (SVD) and Hierarchical Cluster Analysis (HCA), revealed that the key odorants of the “wine-like” aroma are acetaldehyde, hexyl acetate and ethyl esters. L328 proved to be the strain with the best sensory scores and the best correlation with the compounds responsible for the “wine-like” aroma. A supervised analysis, a partial least squares regression (PLS-R) model, allowed the “wine-like” sensory scores of the samples to be predicted from their aromatic profile during the fermentation process (R² = 0.8). Finally, an untargeted metabolomic approach combining GC-MS data preprocessing with SVD was able to distinguish strains based on the evolution of their metabolic profiles over the fermentation time. The L328 and QA23 strains proved to be easily distinguished from each other and from the pair ZA and VL1.
In conclusion, this study explored the potential of chemometrics and bioinformatics approaches for the characterization, prediction and classification of metabolic profiles from fermentations, and the possibility of selecting the yeast strain according to the characteristics of the final product.
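For readers unfamiliar with the R² figure quoted for the PLS-R model above, the following minimal sketch shows how a coefficient of determination is computed from observed and predicted scores. The function name and the toy scores are hypothetical, chosen only so that the arithmetic is easy to follow; they are not data from the thesis, and the sketch does not fit a PLS-R model.

```python
def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_residual / SS_total."""
    mean_observed = sum(observed) / len(observed)
    ss_residual = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_total = sum((o - mean_observed) ** 2 for o in observed)
    return 1 - ss_residual / ss_total


# Hypothetical sensory scores and model predictions.
# Residuals are 1, 0, 0, 0, 1, so SS_residual = 2 and SS_total = 10.
observed_scores = [1, 2, 3, 4, 5]
predicted_scores = [2, 2, 3, 4, 4]
print(round(r_squared(observed_scores, predicted_scores), 3))
```

An R² near 1 means the predictions account for most of the variation in the observed scores; a value of 0 means they do no better than predicting the mean.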
FOCIS: A forest classification and inventory system using LANDSAT and digital terrain data
Accurate, cost-effective stratification of forest vegetation and timber inventory is the primary goal of a Forest Classification and Inventory System (FOCIS). Conventional timber stratification using photointerpretation can be time-consuming, costly, and inconsistent from analyst to analyst. FOCIS was designed to overcome these problems by using machine processing techniques to extract and process tonal, textural, and terrain information from registered LANDSAT multispectral and digital terrain data. Comparison of samples from timber strata identified by FOCIS and by conventional procedures showed that both have about the same potential to reduce the variance of timber volume estimates over simple random sampling.
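The variance reduction mentioned above can be illustrated with a short sketch comparing the variance of a mean volume estimate under simple random sampling and under stratified sampling. It assumes proportional allocation and ignores the finite population correction; the function names and plot volumes are hypothetical, not FOCIS results.

```python
from statistics import pvariance


def srs_variance(plot_volumes, n):
    """Variance of the sample mean under simple random sampling of n plots."""
    return pvariance(plot_volumes) / n


def stratified_variance(strata, n):
    """Variance of the stratified mean with n plots allocated proportionally.

    With stratum weight W_h and n_h = n * W_h, the textbook expression
    sum(W_h**2 * var_h / n_h) reduces to sum(W_h * var_h) / n.
    """
    total = sum(len(stratum) for stratum in strata)
    return sum(len(s) / total * pvariance(s) for s in strata) / n


# Hypothetical timber volumes per plot in two strata (low and high volume).
strata = [[10, 12, 14], [40, 42, 44]]
all_plots = [v for stratum in strata for v in stratum]
n = 6
print("simple random sampling:", srs_variance(all_plots, n))
print("stratified sampling:   ", stratified_variance(strata, n))
```

Because the strata are internally homogeneous, almost all of the variation lies between them, and the stratified estimate has a much smaller variance than the simple random sample of the same size.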