A taxonomy framework for unsupervised outlier detection techniques for multi-type data sets
The term "outlier" can generally be defined as an observation that is significantly different from
the other values in a data set. Outliers may be instances of error or may indicate events. The
task of outlier detection aims at identifying such outliers in order to improve the analysis of
data and to discover interesting and useful knowledge about unusual events across numerous
application domains. In this paper, we report on contemporary unsupervised outlier detection
techniques for multiple types of data sets and provide a comprehensive taxonomy framework and
two decision trees for selecting the most suitable technique based on the data set. Furthermore, we
highlight the advantages, disadvantages and performance issues of each class of outlier detection
techniques under this taxonomy framework.
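
As a rough illustration of the definition above (an observation significantly different from the other values in a data set), the following sketch flags outliers in one-dimensional numeric data using the modified z-score based on the median absolute deviation. It is not taken from the paper and is not one of the techniques it surveys by name; the function name `mad_outliers`, the threshold of 3.5, and the sample readings are assumptions chosen for illustration only.

```python
from statistics import median


def mad_outliers(values, threshold=3.5):
    """Return the values whose modified z-score exceeds the threshold.

    Hypothetical helper for illustration: it needs no labels, so it is
    unsupervised, but it only handles one-dimensional numeric data.
    """
    med = median(values)
    # Median absolute deviation: a spread estimate that is itself
    # robust to the outliers we are trying to find.
    mad = median([abs(v - med) for v in values])
    if mad == 0:
        # No spread around the median, so no score can be computed.
        return []
    # 0.6745 rescales the MAD so the score is comparable to a
    # standard z-score for normally distributed data.
    return [v for v in values if 0.6745 * abs(v - med) / mad > threshold]


readings = [10.1, 9.8, 10.3, 10.0, 9.9, 42.0, 10.2]
print(mad_outliers(readings))  # prints [42.0]
```

Whether the flagged value is an error or a real event cannot be decided by the score alone; that judgment depends on the application domain.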
Uncovering Group Level Insights with Accordant Clustering
Clustering is a widely used data mining tool, which aims to discover
partitions of similar items in data. We introduce a new clustering paradigm,
accordant clustering, which enables the discovery of (predefined) group
level insights. Unlike previous clustering paradigms that aim to understand
relationships amongst the individual members, the goal of accordant clustering
is to uncover insights at the group level through the analysis of their
members. Group level insight can often support a call to action that cannot be
informed through previous clustering techniques. We propose the first accordant
clustering algorithm, and prove that it finds near-optimal solutions when data
possesses inherent cluster structure. The insights revealed by accordant
clusterings enabled experts in the field of medicine to isolate successful
treatments for a neurodegenerative disease, and those in finance to discover
patterns of unnecessary spending.
Comment: accepted to SDM 2017 (oral
- …