Joint co-clustering: co-clustering of genomic and clinical bioimaging data
To better understand the genetic mechanisms underlying clinical observations, and to better define a group of potential candidates for protein-family-inhibiting therapy, it is useful to determine the correlations between genomic data, clinical data, and data from high-resolution and fluorescence microscopy. We introduce a computational method, called joint co-clustering, that finds co-clusters, or groups of genes, bioimaging parameters, and clinical traits that are believed to be closely related to each other based on the given empirical information. As bioimaging parameters, we quantify the expression of the growth factor receptor EGFR/erb-B family in non-small cell lung carcinoma (NSCLC) through a fully automated computer-aided analysis approach. This immunohistochemical analysis is usually performed by pathologists via visual inspection of tissue sample images. Our fully automated technique streamlines this error-prone and time-consuming process, thereby facilitating analysis and diagnosis. Experimental results for several real-life datasets demonstrate the high quantitative precision of our approach. The joint co-clustering method was tested with the EGFR/erb-B receptor family data on NSCLC tissue and identified statistically significant co-clusters of genes, receptor protein expression, and clinical traits. Validation of our results against the literature suggests that the proposed method can provide biologically meaningful co-clusters of genes and traits, and that it is a very promising approach for analysing large-scale biological data and for studying multi-factorial genetic pathologies through their genetic alterations.
Comparing high dimensional partitions, with the Coclustering Adjusted Rand Index
We consider the simultaneous clustering of rows and columns of a matrix and
more particularly the ability to measure the agreement between two
co-clustering partitions. The new criterion we developed is based on the
Adjusted Rand Index and is called the Co-clustering Adjusted Rand Index
(CARI). We also suggest new improvements to existing criteria such as the
Classification Error which counts the proportion of misclassified cells and the
Extended Normalized Mutual Information criterion which is a generalization of
the criterion based on mutual information in the case of classic
classifications. We study these criteria with regard to some desired properties
deriving from the co-clustering context. Experiments on simulated and real
observed data are proposed to compare the behavior of these criteria.
Comment: 52 page
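To make the idea of measuring agreement between two co-clustering partitions concrete, here is a minimal sketch. It assumes that a co-clustering can be viewed as a partition of the matrix cells, where cell (i, j) belongs to the block formed by the cluster of row i and the cluster of column j, and it applies the standard Adjusted Rand Index to those cell labels. This is an illustration only and is not claimed to be the exact CARI definition from the paper; the function names `adjusted_rand_index`, `cell_labels`, and `cocluster_ari` are hypothetical.

```python
from collections import Counter
from math import comb


def adjusted_rand_index(labels_a, labels_b):
    """Standard Adjusted Rand Index between two labelings of the same items."""
    if len(labels_a) != len(labels_b):
        raise ValueError("labelings must have the same length")
    total_pairs = comb(len(labels_a), 2)
    if total_pairs == 0:
        return 1.0  # fewer than two items: the partitions trivially agree

    # Pairs of items grouped together in both, in A only, and in B only.
    together_both = sum(comb(c, 2) for c in Counter(zip(labels_a, labels_b)).values())
    together_a = sum(comb(c, 2) for c in Counter(labels_a).values())
    together_b = sum(comb(c, 2) for c in Counter(labels_b).values())

    expected = together_a * together_b / total_pairs
    maximum = (together_a + together_b) / 2
    if maximum == expected:
        return 1.0  # degenerate case, e.g. both partitions have a single cluster
    return (together_both - expected) / (maximum - expected)


def cell_labels(row_labels, col_labels):
    """Assumption: label each cell (i, j) by its block (row cluster, column cluster)."""
    return [(r, c) for r in row_labels for c in col_labels]


def cocluster_ari(rows_a, cols_a, rows_b, cols_b):
    """ARI between two co-clusterings, each seen as a partition of the matrix cells."""
    return adjusted_rand_index(cell_labels(rows_a, cols_a), cell_labels(rows_b, cols_b))
```

Enumerating every cell costs time proportional to the number of rows times the number of columns, which is acceptable for small matrices but is the kind of cost a dedicated criterion for high-dimensional partitions would want to avoid.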
Co-Clustering Network-Constrained Trajectory Data
Clustering moving object trajectories has recently been gaining interest from
both the data mining and machine learning communities. This problem, however,
has been studied mainly in the setting where moving objects can move
freely in Euclidean space. In this paper, we study the problem of
clustering trajectories of vehicles whose movement is restricted by the
underlying road network. We model relations between these trajectories and road
segments as a bipartite graph and we try to cluster its vertices. We
demonstrate our approaches on synthetic data and show how they could be useful in
inferring knowledge about the flow dynamics and the behavior of the drivers
using the road network.
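The bipartite model described above can be sketched as follows. This is a minimal illustration, assuming each trajectory is given as a list of road-segment identifiers and that an edge's weight is the number of times the trajectory traverses the segment. The input layout, the weighting, and the function name `biadjacency_matrix` are assumptions and are not taken from the paper. The clustering step itself is not shown, since the abstract does not specify it.

```python
def biadjacency_matrix(trajectories):
    """Build the trajectory-by-segment matrix of a bipartite graph.

    trajectories: dict mapping a trajectory id to the list of road-segment
    ids it traverses (hypothetical input format).
    Returns (trajectory ids, segment ids, matrix), where matrix[i][j] counts
    how often trajectory i traverses segment j.
    """
    traj_ids = sorted(trajectories)
    seg_ids = sorted({seg for segs in trajectories.values() for seg in segs})
    column = {seg: j for j, seg in enumerate(seg_ids)}

    matrix = [[0] * len(seg_ids) for _ in traj_ids]
    for i, traj in enumerate(traj_ids):
        for seg in trajectories[traj]:
            matrix[i][column[seg]] += 1  # edge weight = number of traversals
    return traj_ids, seg_ids, matrix


# Toy data: t1 and t2 share segments s2 and s3, t3 uses s4 only.
example = {
    "t1": ["s1", "s2", "s3"],
    "t2": ["s2", "s3"],
    "t3": ["s4", "s4"],
}
rows, cols, m = biadjacency_matrix(example)
```

Rows of this matrix correspond to one vertex set (trajectories) and columns to the other (road segments), so any co-clustering method that groups rows and columns simultaneously could in principle be applied to it.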