VizRank: Data Visualization Guided by Machine Learning
Data visualization plays a crucial role in identifying interesting patterns in exploratory data analysis. Its use is, however, made difficult by the large number of possible data projections, each showing a different attribute subset, that the data analyst must evaluate. In this paper, we introduce a method called VizRank, which is applied to classified data to automatically select the most useful data projections. VizRank can be used with any visualization method that maps attribute values to points in a two-dimensional visualization space. It assesses possible data projections and ranks them by their ability to visually discriminate between classes. The quality of class separation is estimated by computing the predictive accuracy of a k-nearest neighbor classifier on the data set consisting of the x and y positions of the projected data points and their class information. The paper introduces the method and presents experimental results which show that VizRank's ranking of projections agrees closely with subjective rankings by data analysts. The practical use of VizRank is also demonstrated by an application in the field of functional genomics.
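The ranking idea described in this abstract can be illustrated with a minimal sketch. It assumes the simplest case, where each projection is a scatterplot of two attributes, and it scores a projection by leave-one-out k-nearest neighbor accuracy on the projected points. The function names (`knn_loo_accuracy`, `rank_scatterplots`), the toy data, and the choice of leave-one-out evaluation are assumptions made for illustration, not the authors' implementation.

```python
from collections import Counter
from itertools import combinations


def knn_loo_accuracy(points, labels, k=3):
    """Leave-one-out accuracy of a k-NN classifier on 2-D points."""
    correct = 0
    for i, (xi, yi) in enumerate(points):
        # Squared Euclidean distance from point i to every other point.
        dists = sorted(
            ((xi - xj) ** 2 + (yi - yj) ** 2, j)
            for j, (xj, yj) in enumerate(points)
            if j != i
        )
        # Majority vote among the k nearest neighbors.
        votes = Counter(labels[j] for _, j in dists[:k])
        if votes.most_common(1)[0][0] == labels[i]:
            correct += 1
    return correct / len(points)


def rank_scatterplots(rows, labels, k=3):
    """Score every attribute pair and return (score, pair) tuples, best first."""
    n_attrs = len(rows[0])
    scored = []
    for a, b in combinations(range(n_attrs), 2):
        points = [(row[a], row[b]) for row in rows]
        scored.append((knn_loo_accuracy(points, labels, k), (a, b)))
    return sorted(scored, key=lambda item: item[0], reverse=True)


# Hypothetical toy data: attributes 0 and 1 separate the classes,
# attribute 2 is unrelated to the class and has a larger scale.
rows = [
    (0.0, 0.1, 5), (0.2, 0.0, 1), (0.1, 0.3, 4), (0.3, 0.2, 0),
    (1.0, 1.1, 5), (1.2, 1.0, 1), (1.1, 1.3, 4), (1.3, 1.2, 0),
]
labels = ["A", "A", "A", "A", "B", "B", "B", "B"]

ranking = rank_scatterplots(rows, labels, k=3)
for score, pair in ranking:
    print(pair, round(score, 2))
```

On this toy data the pair of attributes 0 and 1 ranks first, because the unscaled noise attribute dominates the distances in the other two scatterplots. In practice attributes would normally be rescaled before projection, and the number of candidate projections grows quickly with the number of attributes, so an exhaustive loop like this one is only practical for small attribute sets.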
Visualizing probabilistic models: Intensive Principal Component Analysis
Unsupervised learning makes manifest the underlying structure of data without
curated training and specific problem definitions. However, the inference of
relationships between data points is frustrated by the "curse of
dimensionality" in high dimensions. Inspired by replica theory from statistical
mechanics, we consider replicas of the system to tune the dimensionality and
take the limit as the number of replicas goes to zero. The result is the
intensive embedding, which is not only isometric (preserving local distances)
but allows global structure to be more transparently visualized. We develop the
Intensive Principal Component Analysis (InPCA) and demonstrate clear
improvements in visualizations of the Ising model of magnetic spins, a neural
network, and the dark energy cold dark matter (ΛCDM) model as applied
to the Cosmic Microwave Background. Comment: 6 pages, 5 figures