Data analytics enhanced data visualization and interrogation with parallel coordinates plots
© 2018 IEEE. Parallel coordinates plots (PCPs) suffer from the curse of dimensionality when used with larger multidimensional datasets. The curse of dimensionality results in clutter, which hides important visual trends among coordinates. A number of solutions to this problem have been proposed, including filtering, aggregation, and dimension reordering. These solutions, however, have their own limitations with regard to exploring relationships and trends among the coordinates in PCPs. Correlation-based coordinate reordering techniques are among the most popular and have been widely used in PCPs to reduce clutter, though the experiments conducted in this research identified some of their limitations. To achieve better visualization with reduced clutter, we propose and evaluate a dimension reordering approach based on minimizing the number of crossing pairs. In the last step, k-means clustering is combined with the reordered coordinates to highlight key trends and patterns. The comparative analysis shows that the minimum crossing pairs approach performed much better than the other coordinate reordering techniques applied and, when combined with k-means clustering, resulted in better visualization with significantly reduced clutter.
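The crossing-based reordering idea can be illustrated with a minimal sketch. It assumes that two records "cross" between adjacent axes when their relative order flips from one axis to the other, and it uses a simple greedy chain to pick the axis order. The function names and the greedy strategy are illustrative assumptions, not the authors' algorithm, and the k-means step is not shown.

```python
from itertools import combinations


def count_crossings(col_a, col_b):
    """Count record pairs whose line segments cross between two adjacent axes."""
    crossings = 0
    for i, j in combinations(range(len(col_a)), 2):
        # Two segments cross when the records swap order between the axes.
        if (col_a[i] - col_a[j]) * (col_b[i] - col_b[j]) < 0:
            crossings += 1
    return crossings


def greedy_axis_order(columns):
    """Greedy heuristic (an assumption, not an exact minimizer).

    columns: list of equal-length value lists, one per dimension (at least two).
    Returns the chosen axis order and the pairwise crossing-count matrix.
    """
    d = len(columns)
    cost = [[0] * d for _ in range(d)]
    for a, b in combinations(range(d), 2):
        cost[a][b] = cost[b][a] = count_crossings(columns[a], columns[b])

    # Start from the pair of axes with the fewest crossings.
    first, second = min(combinations(range(d), 2), key=lambda p: cost[p[0]][p[1]])
    order = [first, second]
    remaining = set(range(d)) - {first, second}

    # Repeatedly append the axis with the fewest crossings against the last axis.
    while remaining:
        nxt = min(sorted(remaining), key=lambda k: cost[order[-1]][k])
        order.append(nxt)
        remaining.remove(nxt)
    return order, cost
```

Finding the exact minimum over all axis orders is a path-ordering problem whose cost grows quickly with the number of dimensions, which is why this sketch settles for a greedy pass.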
Conditional t-SNE: Complementary t-SNE embeddings through factoring out prior information
Dimensionality reduction and manifold learning methods such as t-Distributed
Stochastic Neighbor Embedding (t-SNE) are routinely used to map
high-dimensional data into a 2-dimensional space to visualize and explore the
data. However, two dimensions are typically insufficient to capture all
structure in the data, the salient structure is often already known, and it is
not obvious how to extract the remaining information in a similarly effective
manner. To fill this gap, we introduce conditional t-SNE (ct-SNE), a
generalization of t-SNE that discounts prior information from the embedding in
the form of labels. To achieve this, we propose a conditioned version of the
t-SNE objective, obtaining a single, integrated, and elegant method. ct-SNE has
one extra parameter over t-SNE; we investigate its effects and show how to
efficiently optimize the objective. Factoring out prior knowledge allows
complementary structure to be captured in the embedding, providing new
insights. Qualitative and quantitative empirical results on synthetic and
(large) real data show ct-SNE is effective and achieves its goal.
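For context, the standard t-SNE objective that ct-SNE generalizes can be written as follows, with high-dimensional points x_i, embedded points y_i, per-point bandwidths sigma_i, and n points in total. The conditioned objective itself is not stated in this abstract, so it is not reproduced here.

```latex
\begin{align*}
p_{j|i} &= \frac{\exp\left(-\lVert x_i - x_j \rVert^2 / 2\sigma_i^2\right)}
                {\sum_{k \neq i} \exp\left(-\lVert x_i - x_k \rVert^2 / 2\sigma_i^2\right)},
\qquad
p_{ij} = \frac{p_{j|i} + p_{i|j}}{2n} \\
q_{ij} &= \frac{\left(1 + \lVert y_i - y_j \rVert^2\right)^{-1}}
               {\sum_{k \neq l} \left(1 + \lVert y_k - y_l \rVert^2\right)^{-1}} \\
C &= \mathrm{KL}(P \,\Vert\, Q) = \sum_{i \neq j} p_{ij} \log \frac{p_{ij}}{q_{ij}}
\end{align*}
```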
Diffusion map for clustering fMRI spatial maps extracted by independent component analysis
Functional magnetic resonance imaging (fMRI) produces data about activity
inside the brain, from which spatial maps can be extracted by independent
component analysis (ICA). A dataset contains n spatial maps, each with p
voxels. The number of voxels is very high compared to the number of analyzed
spatial maps. Clustering of the spatial maps is usually based on correlation
matrices. This usually works well, although such a similarity matrix inherently
can explain only a certain amount of the total variance contained in the
high-dimensional data where n is relatively small but p is large. For
high-dimensional space, it is reasonable to perform dimensionality reduction
before clustering. In this research, we used the recently developed diffusion
map for dimensionality reduction in conjunction with spectral clustering. This
research revealed that the diffusion map based clustering worked as well as the
more traditional methods, and produced more compact clusters when needed.
Comment: 6 pages. 8 figures. Copyright (c) 2013 IEEE. Published at 2013 IEEE
International Workshop on Machine Learning for Signal Processing
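The dimensionality-reduction step described in this abstract can be sketched roughly as below, assuming a Gaussian kernel with a hand-picked bandwidth `epsilon` and NumPy for the eigendecomposition. The function name, parameters, and kernel choice are illustrative assumptions rather than the authors' implementation.

```python
import numpy as np


def diffusion_map(X, epsilon, n_components=2, t=1):
    """Embed the rows of X (n samples by p features) with a basic diffusion map.

    epsilon is the Gaussian kernel bandwidth and t the diffusion time;
    both are assumed tuning parameters.
    """
    X = np.asarray(X, dtype=float)

    # Pairwise squared Euclidean distances between the n rows.
    sq = np.sum(X ** 2, axis=1)
    d2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * X @ X.T, 0.0)

    # Gaussian affinities and their row sums (degrees).
    K = np.exp(-d2 / epsilon)
    deg = K.sum(axis=1)

    # Symmetric normalization S = D^{-1/2} K D^{-1/2}; it shares eigenvalues
    # with the Markov matrix P = D^{-1} K but is safe for eigh.
    S = K / np.sqrt(np.outer(deg, deg))
    vals, vecs = np.linalg.eigh(S)

    # Sort descending and skip the trivial leading eigenpair (eigenvalue 1).
    top = np.argsort(vals)[::-1][1:n_components + 1]

    # Convert to right eigenvectors of P and scale by eigenvalue^t.
    psi = vecs[:, top] / np.sqrt(deg)[:, None]
    return psi * vals[top] ** t
```

The returned coordinates would then be passed to a clustering routine. For two well-separated groups, the sign of the first coordinate already splits them, although that shortcut is a simplification and not the spectral clustering procedure used in the paper.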