Tree-based mining contrast subspace
All existing contrast subspace mining methods employ a density-based likelihood contrast scoring function to measure the likelihood that a query object belongs to a target class rather than the other class in a subspace. However, density tends to decrease as the dimensionality of subspaces increases, which causes its bounds to identify inaccurate contrast subspaces for the given query object. This paper proposes a novel contrast subspace mining method that employs a tree-based likelihood contrast scoring function, which is not affected by the dimensionality of subspaces. The tree-based scoring measure recursively applies binary partitioning to the subspace so that objects belonging to the target class are grouped together and separated from objects belonging to the other class. In a contrast subspace, the query object should fall in a group containing more objects of the target class than of the other class. The method incorporates a feature selection approach to find a subset of one-dimensional subspaces with high likelihood contrast scores with respect to the query object. The contrast subspaces are then searched through this selected subset of one-dimensional subspaces. An experiment is conducted to evaluate the effectiveness of the tree-based method in terms of classification accuracy. The experimental results show that the proposed method achieves higher classification accuracy and outperforms the existing method on several real-world data sets.
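The tree-based scoring idea in the abstract above can be illustrated with a minimal sketch. This is not the paper's algorithm: the Gini split criterion, the `max_depth` limit, the function names, and the choice of "fraction of target-class objects in the query's leaf" as the score are all assumptions made for illustration.

```python
from collections import Counter

def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def best_split(points, labels, dims):
    """Return the (dim, threshold) that lowers weighted Gini most, or None."""
    best, best_impurity = None, gini(labels)
    n = len(points)
    for d in dims:
        values = sorted({p[d] for p in points})
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2
            left = [y for p, y in zip(points, labels) if p[d] <= t]
            right = [y for p, y in zip(points, labels) if p[d] > t]
            impurity = (len(left) * gini(left) + len(right) * gini(right)) / n
            if impurity < best_impurity - 1e-12:
                best, best_impurity = (d, t), impurity
    return best

def leaf_labels(points, labels, dims, query, max_depth=3):
    """Greedily split, following only the branch that contains the query."""
    for _ in range(max_depth):
        split = best_split(points, labels, dims)
        if split is None:
            break
        d, t = split
        go_left = query[d] <= t
        kept = [(p, y) for p, y in zip(points, labels) if (p[d] <= t) == go_left]
        points = [p for p, _ in kept]
        labels = [y for _, y in kept]
    return labels

def contrast_score(points, labels, dims, query, target):
    """Assumed score: share of target-class objects in the query's leaf."""
    leaf = leaf_labels(points, labels, dims, query)
    return sum(1 for y in leaf if y == target) / len(leaf)

# Toy data: dimension 0 separates the classes, dimension 1 does not.
points = [(1.0, 5.0), (2.0, 1.0), (3.0, 4.0), (7.0, 2.0), (8.0, 5.0), (9.0, 1.0)]
labels = ["A", "A", "A", "B", "B", "B"]
query = (2.5, 3.0)
score_dim0 = contrast_score(points, labels, [0], query, "A")
score_dim1 = contrast_score(points, labels, [1], query, "A")
```

Because the score is a class proportion inside a partition rather than a density estimate, it stays in [0, 1] whatever the number of dimensions in `dims`, which is the property the abstract emphasizes.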
Towards Interpretable Anomaly Detection via Invariant Rule Mining
In the research area of anomaly detection, novel and promising methods are
frequently developed. However, most existing studies, especially those
leveraging deep neural networks, focus exclusively on the detection task
and ignore the interpretability of the underlying models as well as their
detection results. Yet anomaly interpretation, which aims to provide
explanation of why specific data instances are identified as anomalies, is an
equally (if not more) important task in many real-world applications. In this
work, we pursue highly interpretable anomaly detection via invariant rule
mining. Specifically, we leverage decision tree learning and association rule
mining to automatically generate invariant rules that are consistently
satisfied by the underlying data generation process. The generated invariant
rules can provide explicit explanation of anomaly detection results and thus
are extremely useful for subsequent decision-making. Furthermore, our empirical
evaluation shows that the proposed method can also achieve comparable
performance in terms of AUC and partial AUC with popular anomaly detection
models on various benchmark datasets.
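To make the notion of an invariant rule concrete, here is a deliberately simplified sketch. It does not reproduce the paper's pipeline (which combines decision tree learning with association rule mining); it only mines pairwise implications that hold on every normal record and reports which ones a new record violates. The predicate names, the `min_support` parameter, and both function names are hypothetical.

```python
from itertools import permutations

def mine_invariant_rules(records, min_support=2):
    """Return (a, b) pairs where every record satisfying a also satisfies b."""
    names = sorted(records[0])
    rules = []
    for a, b in permutations(names, 2):
        support = sum(1 for r in records if r[a])
        if support >= min_support and all(r[b] for r in records if r[a]):
            rules.append((a, b))
    return rules

def violated_rules(record, rules):
    """Rules whose antecedent holds but whose consequent fails: the explanation."""
    return [(a, b) for a, b in rules if record[a] and not record[b]]

# Hypothetical boolean predicates observed during normal operation.
normal = [
    {"pump_on": True, "flow_high": True, "valve_open": True},
    {"pump_on": True, "flow_high": True, "valve_open": False},
    {"pump_on": False, "flow_high": False, "valve_open": True},
    {"pump_on": False, "flow_high": False, "valve_open": False},
]
rules = mine_invariant_rules(normal)
suspect = {"pump_on": True, "flow_high": False, "valve_open": True}
explanation = violated_rules(suspect, rules)
```

Each violated pair reads as "the first predicate held, yet the second did not", which is the kind of explicit explanation the abstract describes.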
A new dimensionality-unbiased score for efficient and effective outlying aspect mining
The main aim of an outlying aspect mining algorithm is to automatically detect the subspace(s) (a.k.a. aspect(s)) in which a given data point is dramatically different from the rest of the data. To rank the subspaces for a given data point, a scoring measure is required to compute the outlying degree of the given data point in each subspace. In this paper, we introduce a new measure to compute outlying degree, called Simple Isolation score using Nearest Neighbor Ensemble (SiNNE), which not only detects the outliers but also provides an explanation of why the selected point is an outlier. SiNNE is a dimensionally unbiased measure in its raw form, which means the scores produced by SiNNE can be compared directly across subspaces of different dimensionality. Thus, it does not require any normalization to make the score unbiased. Our experimental results on synthetic and publicly available real-world datasets revealed that (i) SiNNE produces better or at least the same results as existing scores, and (ii) it improves the run time of the existing outlying aspect mining algorithm based on beam search by at least two orders of magnitude. SiNNE allows the existing outlying aspect mining algorithm to run on datasets with hundreds of thousands of instances and thousands of dimensions, which was not possible before.
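The abstract above does not spell out how the score is computed, so the following is only a sketch of what an isolation score built from a nearest-neighbor ensemble might look like. The assumptions: each ensemble member is a random subsample, each sampled point covers a ball whose radius is the distance to its nearest neighbor in that subsample, and the score is the fraction of members in which the query is covered by no ball. The function name and all parameters are hypothetical.

```python
import math
import random

def project(point, dims):
    """Restrict a point to the chosen subspace."""
    return tuple(point[d] for d in dims)

def sinne_like_score(data, query, dims, n_models=50, sample_size=8, seed=0):
    """Fraction of ensemble members in which the query lies outside every ball."""
    rng = random.Random(seed)
    pts = [project(p, dims) for p in data]
    q = project(query, dims)
    isolated = 0
    for _ in range(n_models):
        sample = rng.sample(pts, sample_size)  # requires 2 <= sample_size <= len(data)
        covered = False
        for i, c in enumerate(sample):
            # Ball radius: distance from c to its nearest neighbor in the sample.
            radius = min(math.dist(c, o) for j, o in enumerate(sample) if j != i)
            if math.dist(c, q) <= radius:
                covered = True
                break
        if not covered:
            isolated += 1
    return isolated / n_models

# Toy data: a 5 x 5 grid; the query is extreme in dimension 0 only.
grid = [(float(x), float(y)) for x in range(5) for y in range(5)]
query = (100.0, 2.0)
score_x = sinne_like_score(grid, query, [0])
score_y = sinne_like_score(grid, query, [1])
score_xy = sinne_like_score(grid, query, [0, 1])
```

Since the result is a proportion of ensemble members, it lies in [0, 1] for one-dimensional and two-dimensional subspaces alike, so the three scores can be ranked against each other without normalization.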
What did I do Wrong in my MOBA Game?: Mining Patterns Discriminating Deviant Behaviours
The success of electronic sports (eSports), where professional gamers participate in competitive leagues and tournaments, brings new challenges for the video game industry. Other than fun, games must be difficult and challenging for eSports professionals but still easy and enjoyable for amateurs. In this article, we consider Multi-player Online Battle Arena (MOBA) games and, in particular, "Defense of the Ancients 2", commonly known simply as DOTA2. In this context, a challenge is to propose data analysis methods and metrics that help players improve their skills. We design a data mining-based method that discovers strategic patterns from historical behavioral traces: given a model encoding an expected way of playing (the norm), we are interested in patterns deviating from the norm that may explain a game outcome and from which players can learn more efficient ways of playing. The method is formally introduced and shown to be adaptable to different scenarios. Finally, we provide an experimental evaluation over a dataset of 10,000 behavioral game traces.
Visualizing Image Content to Explain Novel Image Discovery
The initial analysis of any large data set can be divided into two phases:
(1) the identification of common trends or patterns and (2) the identification
of anomalies or outliers that deviate from those trends. We focus on the goal
of detecting observations with novel content, which can alert us to artifacts
in the data set or, potentially, the discovery of previously unknown phenomena.
To aid in interpreting and diagnosing the novel aspect of these selected
observations, we recommend the use of novelty detection methods that generate
explanations. In the context of large image data sets, these explanations
should highlight what aspect of a given image is new (color, shape, texture,
content) in a human-comprehensible form. We propose DEMUD-VIS, the first method
for providing visual explanations of novel image content by employing a
convolutional neural network (CNN) to extract image features, a method that
uses reconstruction error to detect novel content, and an up-convolutional
network to convert CNN feature representations back into image space. We
demonstrate this approach on diverse images from ImageNet, freshwater streams,
and the surface of Mars. Comment: Under review.
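The reconstruction-error idea mentioned in the abstract above can be sketched on plain feature vectors. This is not DEMUD-VIS: there is no CNN or up-convolutional network here, and the rank-1 model fitted by power iteration, along with every function name, is an assumption chosen to keep the example small. The point is only that the residual left after reconstruction both scores novelty and indicates which features are new.

```python
import math

def mean_vector(rows):
    """Column-wise mean of a list of equal-length vectors."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]

def top_direction(rows, mean, iters=100):
    """Leading principal direction of the centered rows via power iteration."""
    centered = [[x - m for x, m in zip(r, mean)] for r in rows]
    dim = len(mean)
    v = [1.0 / math.sqrt(dim)] * dim
    for _ in range(iters):
        # Multiply v by the (unnormalized) covariance without forming the matrix.
        proj = [sum(a * b for a, b in zip(r, v)) for r in centered]
        w = [sum(p * r[k] for p, r in zip(proj, centered)) for k in range(dim)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            break
        v = [x / norm for x in w]
    return v

def residual(x, mean, v):
    """Part of x that the rank-1 model cannot reconstruct."""
    centered = [a - m for a, m in zip(x, mean)]
    coeff = sum(a * b for a, b in zip(centered, v))
    return [c - coeff * b for c, b in zip(centered, v)]

def novelty_score(x, mean, v):
    """Squared reconstruction error; larger means more novel."""
    return sum(r * r for r in residual(x, mean, v))

# Previously seen observations vary only along the direction (1, 1, 0).
seen = [(0.0, 0.0, 0.0), (1.0, 1.0, 0.0), (2.0, 2.0, 0.0), (3.0, 3.0, 0.0)]
mu = mean_vector(seen)
v = top_direction(seen, mu)
familiar = novelty_score((4.0, 4.0, 0.0), mu, v)   # close to zero
novel = novelty_score((1.5, 1.5, 3.0), mu, v)      # dominated by feature 2
explanation = residual((1.5, 1.5, 3.0), mu, v)     # residual per feature
```

In an image setting, the per-feature residual would be the quantity one maps back to image space to visualize what is new.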
A Survey on Explainable Anomaly Detection
In the past two decades, most research on anomaly detection has focused on
improving the accuracy of the detection, while largely ignoring the
explainability of the corresponding methods and thus leaving the explanation of
outcomes to practitioners. As anomaly detection algorithms are increasingly
used in safety-critical domains, providing explanations for the high-stakes
decisions made in those domains has become an ethical and regulatory
requirement. Therefore, this work provides a comprehensive and structured
survey on state-of-the-art explainable anomaly detection techniques. We propose
a taxonomy based on the main aspects that characterize each explainable anomaly
detection technique, aiming to help practitioners and researchers find the
explainable anomaly detection method that best suits their needs. Comment:
Paper accepted by the ACM Transactions on Knowledge Discovery from Data (TKDD)
for publication (preprint version).