8 research outputs found

    Speeding up the Consensus Clustering methodology for microarray data analysis

    Background. The inference of the number of clusters in a dataset, a fundamental problem in Statistics, Data Analysis and Classification, is usually addressed via internal validation measures. The stated problem is quite difficult, in particular for microarrays, since the inferred prediction must be sensible enough to capture the inherent biological structure in a dataset, e.g., functionally related genes. Despite the rich literature in that area, the identification of an internal validation measure that is both fast and precise has proved elusive. In order to partially fill this gap, we propose a speed-up of Consensus (Consensus Clustering), a methodology whose purpose is the provision of a prediction of the number of clusters in a dataset, together with a dissimilarity matrix (the consensus matrix) that can be used by clustering algorithms. As detailed in the remainder of the paper, Consensus is a natural candidate for a speed-up.

    Results. Since the time-precision performance of Consensus depends on two parameters, our first task is to show that a simple adjustment of the parameters is not enough to obtain a good precision-time trade-off. Our second task is to provide a fast approximation algorithm for Consensus, namely the closely related algorithm FC (Fast Consensus), which would have the same precision as Consensus with substantially better time performance. The performance of FC has been assessed via extensive experiments on twelve benchmark datasets that summarize key features of microarray applications, such as cancer studies, gene expression with up and down patterns, and a full spectrum of dimensionality up to over a thousand. Based on their outcome, compared with previous benchmarking results available in the literature, FC turns out to be among the fastest internal validation methods, while retaining the same outstanding precision as Consensus. Moreover, it also provides a consensus matrix that can be used as a dissimilarity matrix, guaranteeing the same performance as the corresponding matrix produced by Consensus. We have also experimented with the use of Consensus and FC in conjunction with NMF (Nonnegative Matrix Factorization), in order to identify the correct number of clusters in a dataset. Although NMF is an increasingly popular technique for biological data mining, our results are somewhat disappointing and complement quite well the state of the art about NMF, shedding further light on its merits and limitations.

    Conclusions. In summary, FC with a parameter setting that makes it robust with respect to small and medium-sized datasets, i.e., number of items to cluster in the hundreds and number of conditions up to a thousand, seems to be the internal validation measure of choice. Moreover, the technique we have developed here can be used in other contexts, in particular for the speed-up of stability-based validation measures.
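
    As a rough illustration of the consensus matrix mentioned in this abstract, the sketch below uses one common resampling formulation: cluster many random subsamples and record how often two items land in the same cluster when both are sampled. It is not the authors' Consensus or FC implementation. The names consensus_matrix and gap_cluster, the toy 1-D clustering rule, and the parameters n_resamples and fraction are all assumptions made for the example. Under this formulation the entries are co-clustering frequencies, so one minus each entry could serve as a dissimilarity.

```python
import random


def gap_cluster(values, k):
    """Toy 1-D clustering (hypothetical helper): cut at the k-1 largest gaps."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    # Positions in sorted order where a new cluster starts.
    cuts = set(sorted(
        range(1, len(order)),
        key=lambda p: values[order[p]] - values[order[p - 1]],
        reverse=True,
    )[:k - 1])
    labels = [0] * len(values)
    current = 0
    for pos, i in enumerate(order):
        if pos in cuts:
            current += 1
        labels[i] = current
    return labels


def consensus_matrix(items, cluster_fn, k, n_resamples=50, fraction=0.8, seed=0):
    """Fraction of resamples in which i and j share a cluster, given both were sampled."""
    rng = random.Random(seed)
    n = len(items)
    together = [[0] * n for _ in range(n)]
    sampled = [[0] * n for _ in range(n)]
    size = max(2, int(fraction * n))
    for _ in range(n_resamples):
        idx = rng.sample(range(n), size)  # subsample without replacement
        labels = cluster_fn([items[i] for i in idx], k)
        for a, i in enumerate(idx):
            for b, j in enumerate(idx):
                sampled[i][j] += 1
                if labels[a] == labels[b]:
                    together[i][j] += 1
    return [
        [together[i][j] / sampled[i][j] if sampled[i][j] else 0.0 for j in range(n)]
        for i in range(n)
    ]


# Two well-separated groups of three values each.
data = [0.0, 0.1, 0.2, 5.0, 5.1, 5.2]
matrix = consensus_matrix(data, gap_cluster, k=2)
```

    Repeating the computation for several candidate values of k and comparing how close the entries are to 0 or 1 is the usual way such a matrix is used to suggest a number of clusters; that comparison step is not shown here.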

    Local operators to detect regions of interest

    The performance of a visual system is strongly influenced by the information processing done in the early vision phase. Computation needs to be limited to areas of interest in order to reduce the total amount of data and its redundancy. This paper describes a new method to drive attention during the analysis of complex scenes. Two new local operators, based on the computation of local moments and symmetries, are combined to drive the selection. Experimental results on real data are also reported. © 1997 Elsevier Science B.V.

    Image-based rendering of intersecting surfaces for dynamic comparative visualization

    Nested or intersecting surfaces are proven techniques for visualizing shape differences between static 3D objects (Weigle and Taylor II, IEEE Visualization, Proceedings, pp. 503–510, 2005). In this paper we present an image-based formulation for these techniques that extends their use to dynamic scenarios, in which surfaces can be manipulated or even deformed interactively. The formulation is based on our new layered rendering pipeline, a generic image-based approach for rendering nested surfaces based on depth peeling and deferred shading. We use layered rendering to enhance the intersecting surfaces visualization. In addition to enabling interactive performance, our enhancements address several limitations of the original technique. Contours remove ambiguity regarding the shape of intersections. Local distances between the surfaces can be visualized at any point using either depth fogging or distance fields: depth fogging is used as a cue for the distance between two surfaces in the viewing direction, whereas closest-point distance measures are visualized interactively by evaluating one surface’s distance field on the other surface. Furthermore, we use these measures to define a three-way surface segmentation, which visualizes regions of growth, shrinkage, and no change of a test surface compared with a reference surface. Finally, we demonstrate an application of our technique in the visualization of statistical shape models. We evaluate our technique based on feedback provided by medical image analysis researchers, who are experts in working with such models.
    Intelligent Systems; Electrical Engineering, Mathematics and Computer Science
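
    The three-way segmentation described in this abstract can be pictured with a small CPU-side sketch: evaluate the reference surface's signed distance field at points of the test surface and label each point by sign. This is not the paper's image-based GPU pipeline. The function names, the sign convention (positive outside the reference surface), the tolerance value, and the use of an analytic sphere as the reference are all assumptions made for the example.

```python
import math


def sphere_sdf(point, center=(0.0, 0.0, 0.0), radius=1.0):
    """Signed distance to a sphere: negative inside, positive outside (assumed convention)."""
    return math.dist(point, center) - radius


def segment(test_vertices, reference_sdf, tolerance=0.05):
    """Label each test-surface vertex relative to the reference surface."""
    labels = []
    for vertex in test_vertices:
        distance = reference_sdf(vertex)
        if distance > tolerance:
            labels.append("growth")      # test surface lies outside the reference
        elif distance < -tolerance:
            labels.append("shrinkage")   # test surface lies inside the reference
        else:
            labels.append("no change")   # within tolerance of the reference
    return labels


# Three sample vertices of a hypothetical test surface along the x-axis.
vertices = [(1.5, 0.0, 0.0), (0.5, 0.0, 0.0), (1.0, 0.0, 0.0)]
labels = segment(vertices, sphere_sdf)
```

    The tolerance band keeps near-zero distances from flickering between growth and shrinkage; its value here is arbitrary and would depend on the scale of the data.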