48 research outputs found

Recommended from our members

Visualization Support to Interactive Cluster Analysis

Author: G Andrienko
J Seo
JW Sammon
S Rinzivillo
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2015

We demonstrate interactive visual embedding of partition-based clustering of multidimensional data using methods from the open-source machine learning library Weka. According to the visual analytics paradigm, knowledge is gradually built and refined by a human analyst through iterative application of clustering with different parameter settings and to different data subsets. To show clustering results to the analyst, cluster membership is typically represented by color coding. Our tools support the color consistency between different steps of the process. We shall demonstrate two-way clustering of spatial time series, in which clustering will be applied to places and to time steps

City Research Online

Crossref

ICAP: An Interactive Cluster Analysis Procedure for analyzing remotely sensed data

Author: Wharton S. W.

An Interactive Cluster Analysis Procedure (ICAP) was developed to derive classifier training statistics from remotely sensed data. The algorithm interfaces the rapid numerical processing capacity of a computer with the human ability to integrate qualitative information. Control of the clustering process alternates between the algorithm, which creates new centroids and forms clusters, and the analyst, who evaluates and elects to modify the cluster structure. Clusters can be deleted or lumped pairwise, or new centroids can be added. A summary of the cluster statistics can be requested to facilitate cluster manipulation. The ICAP was implemented in APL (A Programming Language), an interactive computer language. The flexibility of the algorithm was evaluated using data from different LANDSAT scenes to simulate two situations: one in which the analyst is assumed to have no prior knowledge about the data and wishes to have the clusters formed more or less automatically; and the other in which the analyst is assumed to have some knowledge about the data structure and wishes to use that information to closely supervise the clustering process. For comparison, an existing clustering method was also applied to the two data sets

NASA Technical Reports Server
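
The alternation described in the ICAP abstract (the algorithm forms clusters, then the analyst deletes, lumps, or adds centroids) can be illustrated with a minimal Python sketch. This is not the original APL implementation: the function names, the nearest-centroid assignment rule, and the midpoint rule used for lumping two clusters are all assumptions made for illustration.

```python
from math import dist  # Euclidean distance, Python 3.8+


def assign(points, centroids):
    """Algorithm step: label each point with the index of its nearest centroid."""
    return [min(range(len(centroids)), key=lambda k: dist(p, centroids[k]))
            for p in points]


def recompute(points, labels, n_clusters):
    """Algorithm step: new centroid = mean of each non-empty cluster."""
    centroids = []
    for k in range(n_clusters):
        members = [p for p, lab in zip(points, labels) if lab == k]
        if members:  # empty clusters are simply dropped
            centroids.append(tuple(sum(c) / len(members) for c in zip(*members)))
    return centroids


def delete_cluster(centroids, k):
    """Analyst step: remove cluster k; its points are reassigned on the next pass."""
    return centroids[:k] + centroids[k + 1:]


def merge_clusters(centroids, i, j):
    """Analyst step: lump clusters i and j pairwise (midpoint rule is an assumption)."""
    merged = tuple((a + b) / 2 for a, b in zip(centroids[i], centroids[j]))
    kept = [c for k, c in enumerate(centroids) if k not in (i, j)]
    return kept + [merged]


def add_centroid(centroids, new):
    """Analyst step: seed a new cluster at a chosen location."""
    return centroids + [tuple(new)]


# Hypothetical toy data: two tight groups plus one stray point.
points = [(0, 0), (0, 1), (10, 10), (10, 11), (4, 4)]
centroids = [(0, 0), (10, 10), (4, 4)]

labels = assign(points, centroids)           # algorithm forms clusters
centroids = delete_cluster(centroids, 2)     # analyst deletes the stray cluster
labels_after = assign(points, centroids)     # algorithm reassigns the points
```

In this sketch the analyst's edits only change the centroid list; the next `assign` call redistributes the points, which mirrors the alternating control described in the abstract.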

Interactive spatiotemporal cluster analysis of VAST Challenge 2008 datasets

Author: Gennady Andrienko
Natalia Andrienko
Publication date: 01/01/2009

We describe a visual analytics method supporting the analysis of two different types of spatio-temporal data: point events and trajectories of moving agents. The method combines clustering with interactive visual displays, in particular a map and a space-time cube. We demonstrate the use of the method by applying it to two datasets from the VAST Challenge 2008: evacuation traces (trajectories of people's movement) and landings and interdictions of migrant boats (point events)

CiteSeerX

Crossref

Fraunhofer-ePrints

SEQOPTICS: a protein sequence clustering system

Author: Chen Yonghui
Guan Zhijie
Reilly Kevin D
Sprague Alan P
Publication venue: BioMed Central
Publication date: 01/01/2006

BACKGROUND: Protein sequence clustering has been widely used as a part of the analysis of protein structure and function. In most cases single linkage or graph-based clustering algorithms have been applied. OPTICS (Ordering Points To Identify the Clustering Structure) is an attractive approach due to its emphasis on visualization of results and support for interactive work, e.g., in choosing parameters. However, OPTICS has not been used, as far as we know, for protein sequence clustering. RESULTS: In this paper, a system for clustering proteins, SEQOPTICS (SEQuence clustering with OPTICS), is demonstrated. The system uses Smith-Waterman as its protein distance measure and OPTICS at its core to perform protein sequence clustering. SEQOPTICS is tested with four data sets from different data sources. Visualization of the sequence clustering structure is demonstrated as well. CONCLUSION: The system was evaluated by comparison with other existing methods. Analysis of the results demonstrates that SEQOPTICS performs better on some evaluation criteria, including Jaccard coefficient, Precision, and Recall. It is a promising protein sequence clustering method, with possible future improvements in parallel computing and other protein distance measures

Springer - Publisher Connector

PubMed Central
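
The SEQOPTICS abstract names Jaccard coefficient, Precision, and Recall as evaluation criteria but does not say how they are computed. One common convention for comparing a clustering against a reference grouping is pair counting, sketched below in Python. The pair-counting definition, the function names, and the toy labels are assumptions for illustration, not the paper's stated procedure.

```python
from itertools import combinations


def pair_counts(predicted, reference):
    """Count item pairs by whether each labelling puts them in the same group.

    tp: together in both; fp: together only in `predicted`;
    fn: together only in `reference`.
    """
    tp = fp = fn = 0
    for i, j in combinations(range(len(predicted)), 2):
        same_pred = predicted[i] == predicted[j]
        same_ref = reference[i] == reference[j]
        if same_pred and same_ref:
            tp += 1
        elif same_pred:
            fp += 1
        elif same_ref:
            fn += 1
    return tp, fp, fn


def pair_scores(predicted, reference):
    """Pair-counting precision, recall and Jaccard coefficient (0.0 if undefined)."""
    tp, fp, fn = pair_counts(predicted, reference)
    return {
        "precision": tp / (tp + fp) if tp + fp else 0.0,
        "recall": tp / (tp + fn) if tp + fn else 0.0,
        "jaccard": tp / (tp + fp + fn) if tp + fp + fn else 0.0,
    }


# Hypothetical labels: five sequences, cluster ids vs. reference families.
predicted = [0, 0, 0, 1, 1]
reference = ["a", "a", "b", "b", "b"]
scores = pair_scores(predicted, reference)
```

Because only pair membership is compared, the cluster ids and the reference family names do not need to share a naming scheme.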

Incremental procedures for partitioning highly intermixed multi-class datasets into hyper-spherical and hyper-ellipsoidal clusters

Author: Kong Qinglu
Zhu Qiuming
Publication venue: DigitalCommons@UNO
Publication date: 01/11/2007

Two procedures are presented for partitioning large collections of highly intermixed datasets of different classes into a number of hyper-spherical or hyper-ellipsoidal clusters. The incremental procedures generate a minimum number of hyper-spherical or hyper-ellipsoidal clusters, with each cluster containing a maximum number of data points of the same class. The procedures extend the move-to-front algorithms originally designed for constructing minimum-sized enclosing balls or ellipsoids for a dataset of a single class. The resulting clusters of the dataset can be used for data modeling, outlier detection, discrimination analysis, and knowledge discovery

The University of Nebraska, Omaha

Integrating cluster formation and cluster evaluation in interactive visual analysis

Author: Hauser H.
Parulek J.
Reuter N.
Turkay C.

Cluster analysis is a popular method for data investigation where data items are structured into groups called clusters. This analysis involves two sequential steps, namely cluster formation and cluster evaluation. In this paper, we propose the tight integration of cluster formation and cluster evaluation in interactive visual analysis in order to overcome the challenges that relate to the black-box nature of clustering algorithms. We present our conceptual framework in the form of an interactive visual environment. In this realization of our framework, we build upon general concepts such as cluster comparison, clustering tendency, cluster stability and cluster coherence. Additionally, we showcase our framework on the cluster analysis of mixed lipid bilayers

City Research Online

Revisiting Bertin Matrices: New Interactions for Crafting Tabular Visualizations

Author: Dragicevic P.
Fekete J.
Perin C.
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)
Publication date: 09/11/2014

We present Bertifier, a web app for rapidly creating tabular visualizations from spreadsheets. Bertifier draws from Jacques Bertin's matrix analysis method, whose goal was to “simplify without destroying” by encoding cell values visually and grouping similar rows and columns. Although there were several attempts to bring this method to computers, no implementation exists today that is both exhaustive and accessible to a large audience. Bertifier remains faithful to Bertin's method while leveraging the power of today's interactive computers. Tables are formatted and manipulated through crossets, a new interaction technique for rapidly applying operations on rows and columns. We also introduce visual reordering, a semi-interactive reordering approach that lets users apply and tune automatic reordering algorithms in a WYSIWYG manner. Sessions with eight users from different backgrounds suggest that Bertifier has the potential to bring Bertin's method to a wider audience of both technical and non-technical users, and empower them with data analysis and communication tools that were so far only accessible to a handful of specialists

HAL-CentraleSupelec

City Research Online

Crossref

INRIA a CCSD electronic archive server

HAL-Rennes 1

ClusterNet: A Perception-Based Clustering Model for Scattered Data

Author: Hartwig Sebastian
Hermosilla Pedro
Ropinski Timo
van Onzenoodt Christian
Publication date: 04/09/2023

Visualizations for scattered data are used to help users understand certain attributes of their data by solving different tasks, e.g. correlation estimation, outlier detection, cluster separation. In this paper, we focus on the latter task and develop a technique aligned with human perception that can be used to understand how human subjects perceive clusterings in scattered data and possibly to optimize for better understanding. Cluster separation in scatterplots is a task that is typically tackled by widely used clustering techniques, such as k-means or DBSCAN. However, as these algorithms are based on non-perceptual metrics, we show in our experiments that their output does not reflect human cluster perception. We propose a learning strategy which operates directly on scattered data. To learn perceptual cluster separation on this data, we crowdsourced a large-scale dataset consisting of 7,320 point-wise cluster affiliations for bivariate data, labeled by 384 human crowd workers. Based on this data, we were able to train ClusterNet, a point-based deep learning model, to reflect human perception of cluster separability. In order to train ClusterNet on human-annotated data, we use a PointNet++ architecture, enabling inference on point clouds directly. In this work, we provide details on how we collected our dataset, report statistics of the resulting annotations, and investigate perceptual agreement of cluster separation for real-world data. We further report the training and evaluation protocol of ClusterNet and introduce a novel metric that measures the accuracy between a clustering technique and a group of human annotators. Finally, we compare our approach against existing state-of-the-art clustering techniques and show that ClusterNet is able to generalize to unseen and out-of-scope data. Comment: Currently, this manuscript is under revision at TVC

arXiv.org e-Print Archive