Anomaly Detection with Joint Representation Learning of Content and Connection
Social media sites are becoming a key factor in politics. These platforms are
easy to manipulate for the purpose of distorting the information space to confuse
and distract voters. Past work on identifying disruptive patterns has mostly
focused on analyzing the content of tweets. In this study, we jointly embed the
information from both user-posted content and a user's follower network,
to detect groups of densely connected users in an unsupervised fashion. We then
investigate these dense sub-blocks of users to flag anomalous behavior. In our
experiments, we study the tweets related to the upcoming 2019 Canadian
Elections, and observe a set of densely-connected users engaging in local
politics in different provinces, and exhibiting troll-like behavior.
Comment: 2019 International Conference on Machine Learning Workshop on AI for
Social Goo
Bayesian Learning of Clique Tree Structure
The problem of categorical data analysis in high dimensions is considered. A
discussion of the fundamental difficulties of probability modeling is provided,
and a solution to the derivation of high dimensional probability distributions
based on Bayesian learning of clique tree decomposition is presented. The main
contributions of this paper are an automated determination of the optimal
clique tree structure for probability modeling, the resulting derived
probability distribution, and a corresponding unified approach to clustering
and anomaly detection based on the probability distribution.
Comment: 7 pages, 11 figures; see
http://worldcomp-proceedings.com/proc/p2016/DMIN16_Contents.htm
Machine Learning Techniques for Intrusion Detection
An Intrusion Detection System (IDS) is software that monitors a single computer or a
network of computers for malicious activities (attacks) that are aimed at
stealing or censoring information or corrupting network protocols. Most
techniques used in today's IDS are not able to deal with the dynamic and
complex nature of cyber attacks on computer networks. Hence, efficient adaptive
methods like various techniques of machine learning can result in higher
detection rates, lower false alarm rates and reasonable computation and
communication costs. In this paper, we study several such schemes and compare
their performance. We divide the schemes into methods based on classical
artificial intelligence (AI) and methods based on computational intelligence
(CI). We explain how various characteristics of CI techniques can be used to
build efficient IDS.
Comment: 11 page
Energy-based Models for Video Anomaly Detection
Automated detection of abnormalities in data has been an active research
area in recent years because of its diverse practical applications, including
video surveillance, industrial damage detection and network intrusion
detection. However, building an effective anomaly detection system is a
non-trivial task, since it requires tackling the shortage
of annotated data, the inability to define anomalous objects explicitly and the
high cost of feature engineering. Unlike existing approaches,
which only partially solve these problems, we develop a unique framework to
cope with all of them simultaneously. Instead of handling the ambiguous
definition of anomalous objects, we propose to work with regular patterns, whose
unlabeled data is abundant and usually easy to collect in practice. This allows
our system to be trained in a completely unsupervised manner and liberates
us from the need for costly data annotation. By learning a generative model that
captures the normality distribution in the data, we can isolate abnormal data points
that result in low normality scores (high abnormality scores). Moreover, by
leveraging the power of generative networks, i.e. energy-based models, we are
also able to learn the feature representation automatically rather than
relying on hand-crafted features, which have dominated anomaly detection
research for many decades. We demonstrate our proposal on the specific
application of video anomaly detection, and the experimental results indicate
that our method performs better than baselines and is comparable with
state-of-the-art methods on many benchmark video anomaly detection datasets.
Deep Representation Learning for Social Network Analysis
Social network analysis is an important problem in data mining. A fundamental
step for analyzing social networks is to encode network data into
low-dimensional representations, i.e., network embeddings, so that the network
topology structure and other attribute information can be effectively
preserved. Network representation learning facilitates further applications such
as classification, link prediction, anomaly detection and clustering. In
addition, techniques based on deep neural networks have attracted great
interest over the past few years. In this survey, we conduct a comprehensive
review of current literature in network representation learning utilizing
neural network models. First, we introduce the basic models for learning node
representations in homogeneous networks. We also introduce some
extensions of the base models for tackling more complex scenarios, such as
analyzing attributed networks, heterogeneous networks and dynamic networks.
Then, we introduce the techniques for embedding subgraphs. After that, we
present the applications of network representation learning. At the end, we
discuss some promising research directions for future work
Machine learning for cognitive networks : technology assessment and research challenges
The field of machine learning has made major strides over the last 20 years. This document summarizes the major problem formulations that the discipline has studied, then reviews three tasks in cognitive networking and briefly discusses how aspects of those tasks fit these formulations. After this, it discusses challenges for machine learning research raised by Knowledge Plane applications and closes with proposals for the evaluation of learning systems developed for these problems
Finding Likely Errors with Bayesian Specifications
We present a Bayesian framework for learning probabilistic specifications
from large, unstructured code corpora, and a method to use this framework to
statically detect anomalous, hence likely buggy, program behavior. The
distinctive insight here is to build a statistical model that correlates all
specifications hidden inside a corpus with the syntax and observed behavior of
programs that implement these specifications. During the analysis of a
particular program, this model is conditioned into a posterior distribution
that prioritizes specifications that are relevant to this program. This allows
accurate program analysis even if the corpus is highly heterogeneous. The
problem of finding anomalies is now framed quantitatively, as a problem of
computing a distance between a "reference distribution" over program behaviors
that our model expects from the program, and the distribution over behaviors
that the program actually produces.
We present a concrete embodiment of our framework that combines a topic model
and a neural network model to learn specifications, and queries the learned
models to compute anomaly scores. We evaluate this implementation on the task
of detecting anomalous usage of Android APIs. Our encouraging experimental
results show that the method can automatically discover subtle errors in
Android applications in the wild, and has high precision and recall compared to
competing probabilistic approaches
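
The abstract above frames anomaly finding as computing a distance between a reference distribution over program behaviors and the distribution the program actually produces. A minimal sketch of that idea follows. It assumes KL divergence as the distance, which the abstract does not specify, and uses small hand-written distributions over hypothetical API call sequences in place of learned models. The function name `kl_divergence` and the example sequences are illustrative only.

```python
import math


def kl_divergence(reference, observed, eps=1e-9):
    """Return KL(reference || observed) for two dicts mapping behavior -> probability.

    Assumption: KL divergence stands in for the unspecified "distance".
    Behaviors missing from `observed` are floored at `eps` to avoid log(0).
    """
    total = 0.0
    for behavior in set(reference) | set(observed):
        p = reference.get(behavior, 0.0)
        q = observed.get(behavior, 0.0)
        if p > 0.0:
            total += p * math.log(p / max(q, eps))
    return total


# Hypothetical reference distribution: what a model might expect from the program.
reference = {"open;read;close": 0.7, "open;close": 0.3}

# A program whose behaviors match the expectation exactly.
conforming = {"open;read;close": 0.7, "open;close": 0.3}

# A program that usually forgets to call close.
suspicious = {"open;read": 0.9, "open;read;close": 0.1}

score_ok = kl_divergence(reference, conforming)
score_bad = kl_divergence(reference, suspicious)
print(f"conforming: {score_ok:.3f}, suspicious: {score_bad:.3f}")
```

A larger score marks a program whose behavior departs further from what is expected, so programs can be ranked by score for inspection.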
Further Exploration of the Dendritic Cell Algorithm: Antigen Multiplier and Time Windows
As an immune-inspired algorithm, the Dendritic Cell Algorithm (DCA) produces
promising performance in the field of anomaly detection. This paper presents
the application of the DCA to a standard data set, the KDD 99 data set. The
results of different implementation versions of the DCA, including the antigen
multiplier and moving time windows, are reported. The real-valued Negative
Selection Algorithm (NSA) using constant-sized detectors and the C4.5 decision
tree algorithm are used to conduct a baseline comparison. The results suggest
that the DCA is applicable to the KDD 99 data set, and the antigen multiplier and
moving time windows have the same effect on the DCA for this particular data
set. The real-valued NSA with constant-sized detectors is not applicable to the
data set, and the C4.5 decision tree algorithm provides a benchmark of the
classification performance for this data set.
Comment: 12 pages, 3 figures, 3 tables, 7th International Conference on
Artificial Immune Systems (ICARIS 2008), Phuket, Thailan
ADS-ME: Anomaly Detection System for Micro-expression Spotting
Micro-expressions (MEs) are infrequent and uncontrollable facial events that
can highlight emotional deception and appear in high-stakes environments. This
paper proposes an algorithm for spatiotemporal ME spotting. Since MEs are
unusual events, we treat them as abnormal patterns that diverge from expected
Normal Facial Behaviour (NFB) patterns. NFBs correspond to facial muscle
activations, eye blink/gaze events and mouth opening/closing movements that are
all facial deformations but not MEs. We propose a probabilistic model to
estimate the probability density function that models the spatiotemporal
distributions of NFB patterns. To rank the outputs, we compute the negative
log-likelihood, and we develop an adaptive thresholding technique to identify
MEs from NFBs. While working only with NFB data, the main challenge is to
capture intrinsic spatiotemporal features; hence we design a recurrent
convolutional autoencoder for feature representation. Finally, we show that our
system is superior to previous works for ME spotting.
Comment: 35 pages, 9 figures, 3 table
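
The scoring step described in the ADS-ME abstract (fit a density to normal data only, rank by negative log-likelihood, then threshold adaptively) can be sketched as below. This is a simplified illustration, not the paper's method. It assumes a one-dimensional Gaussian in place of the learned density over autoencoder features, and a threshold of mean plus `k` standard deviations of the training scores. The abstract does not state the thresholding rule, so that rule, the value of `k`, and the feature values are all assumptions.

```python
import math
import statistics


def fit_gaussian(samples):
    """Estimate mean and standard deviation from normal-behaviour samples."""
    return statistics.fmean(samples), statistics.stdev(samples)


def nll(x, mu, sigma):
    """Negative log-likelihood of x under a Gaussian N(mu, sigma^2)."""
    return 0.5 * math.log(2 * math.pi * sigma ** 2) + (x - mu) ** 2 / (2 * sigma ** 2)


def adaptive_threshold(scores, k=3.0):
    """Assumed rule: mean of the training scores plus k standard deviations."""
    return statistics.fmean(scores) + k * statistics.pstdev(scores)


# Hypothetical one-dimensional features extracted from normal behaviour only.
normal_features = [0.9, 1.0, 1.1, 1.05, 0.95, 1.0, 0.98, 1.02]

mu, sigma = fit_gaussian(normal_features)
train_scores = [nll(x, mu, sigma) for x in normal_features]
threshold = adaptive_threshold(train_scores)


def is_anomalous(x):
    """Flag x when its negative log-likelihood exceeds the adapted threshold."""
    return nll(x, mu, sigma) > threshold


print(is_anomalous(1.0), is_anomalous(1.5))
```

Because the threshold is derived from the scores of the normal data itself, it adapts to how tightly that data clusters and needs no labelled anomalies.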
Incorporating Privileged Information to Unsupervised Anomaly Detection
We introduce a new unsupervised anomaly detection ensemble called SPI which
can harness privileged information - data available only for training examples
but not for (future) test examples. Our ideas build on the Learning Using
Privileged Information (LUPI) paradigm pioneered by Vapnik et al. [19,17],
which we extend to unsupervised learning and in particular to anomaly
detection. SPI (for Spotting anomalies with Privileged Information) constructs
a number of frames/fragments of knowledge (i.e., density estimates) in the
privileged space and transfers them to the anomaly scoring space through
"imitation" functions that use only the partial information available for test
examples. Our generalization of the LUPI paradigm to unsupervised anomaly
detection shepherds the field in several key directions, including (i) domain
knowledge-augmented detection using expert annotations as PI, (ii) fast
detection using computationally-demanding data as PI, and (iii) early detection
using "historical future" data as PI. Through extensive experiments on
simulated and real datasets, we show that augmenting anomaly detection with
privileged information significantly improves detection performance. We also
demonstrate the promise of SPI under all three settings (i-iii), with PI
capturing expert knowledge, computationally expensive features, and future data
on three real-world detection tasks.