ExClus: explainable clustering on low-dimensional data representations
Dimensionality reduction and clustering techniques are frequently used to
analyze complex data sets, but their results are often not easy to interpret.
We consider how to support users in interpreting apparent cluster structure on
scatter plots where the axes are not directly interpretable, such as when the
data is projected onto a two-dimensional space using a dimensionality-reduction
method. Specifically, we propose a new method to compute an interpretable
clustering automatically, where the explanation is in the original
high-dimensional space and the clustering is coherent in the low-dimensional
projection. It provides a tunable balance between the complexity and the amount
of information provided, through the use of information theory. We study the
computational complexity of this problem and introduce restrictions on the
search space of solutions to arrive at an efficient, tunable, greedy
optimization algorithm. This algorithm is furthermore implemented in an
interactive tool called ExClus. Experiments on several data sets highlight that
ExClus can provide informative and easy-to-understand patterns, and they expose
where the algorithm is efficient and where there is room for improvement
in terms of tunability and scalability.
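
The idea of explaining clusters found in a projection by looking at the original features can be illustrated with a small sketch. This is not the ExClus algorithm; it is a simplified stand-in that assumes cluster labels are already available (for instance from clustering the two-dimensional embedding), models each feature as a univariate Gaussian, and ranks the original features of each cluster by the KL divergence between the cluster's distribution and the distribution of the full data set. The function names `gaussian_kl` and `explain_clusters`, the parameter `top_k`, and the synthetic data are all hypothetical and introduced only for illustration.

```python
import numpy as np


def gaussian_kl(mean_c, std_c, mean_g, std_g):
    """KL( N(mean_c, std_c^2) || N(mean_g, std_g^2) ), computed per feature."""
    return (
        np.log(std_g / std_c)
        + (std_c ** 2 + (mean_c - mean_g) ** 2) / (2.0 * std_g ** 2)
        - 0.5
    )


def explain_clusters(X, labels, top_k=2, eps=1e-9):
    """For each cluster, return the top_k original features that deviate
    most from the global distribution, as (feature_index, kl) pairs."""
    mean_g = X.mean(axis=0)
    std_g = X.std(axis=0) + eps  # eps guards against zero variance
    explanations = {}
    for c in np.unique(labels):
        Xc = X[labels == c]
        kl = gaussian_kl(Xc.mean(axis=0), Xc.std(axis=0) + eps, mean_g, std_g)
        top = np.argsort(kl)[::-1][:top_k]  # most informative features first
        explanations[int(c)] = [(int(j), float(kl[j])) for j in top]
    return explanations


# Synthetic 5-dimensional data with three clusters (assumed for the demo):
# cluster 1 is shifted on feature 3, cluster 2 is shifted on feature 0.
rng = np.random.default_rng(0)
X = rng.normal(size=(300, 5))
labels = np.repeat([0, 1, 2], 100)
X[labels == 1, 3] += 6.0
X[labels == 2, 0] -= 6.0

explanations = explain_clusters(X, labels, top_k=2)
for cluster, features in explanations.items():
    print(cluster, features)
```

In this sketch `top_k` plays the role of a crude complexity knob: a larger value gives a longer but more informative explanation per cluster. The abstract describes a more principled, information-theoretic balance, which this example does not attempt to reproduce.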