6,283 research outputs found
An End-to-End Framework to Identify Pathogenic Social Media Accounts on Twitter
Pathogenic Social Media (PSM) accounts such as terrorist supporter accounts
and fake news writers have the capability of spreading disinformation to viral
proportions. Early detection of PSM accounts is crucial, as they are likely to
be the key users who make malicious information go "viral". In this paper, we adopt the
causal inference framework along with graph-based metrics in order to
distinguish PSMs from normal users within a short time of their activities. We
propose both supervised and semi-supervised approaches without taking the
network information and content into account. Results on a real-world dataset
from Twitter demonstrate the advantage of our proposed frameworks. We show that
our approach achieves a 0.28 improvement in F1 score over existing approaches,
with a precision of 0.90 and an F1 score of 0.63.
Comment: 9 pages, 8 figures, International Conference on Data Intelligence and
Security. arXiv admin note: text overlap with arXiv:1905.0155
Variational Information Maximisation for Intrinsically Motivated Reinforcement Learning
The mutual information is a core statistical quantity that has applications
in all areas of machine learning, whether this is in training of density models
over multiple data modalities, in maximising the efficiency of noisy
transmission channels, or when learning behaviour policies for exploration by
artificial agents. Most learning algorithms that involve optimisation of the
mutual information rely on the Blahut-Arimoto algorithm --- an enumerative
algorithm with exponential complexity that is not suitable for modern machine
learning applications. This paper provides a new approach for scalable
optimisation of the mutual information by merging techniques from variational
inference and deep learning. We develop our approach by focusing on the problem
of intrinsically-motivated learning, where the mutual information forms the
definition of a well-known internal drive known as empowerment. Using a
variational lower bound on the mutual information, combined with convolutional
networks for handling visual input streams, we develop a stochastic
optimisation algorithm that allows for scalable information maximisation and
empowerment-based reasoning directly from pixels to actions.
Comment: Proceedings of the 29th Conference on Neural Information Processing
Systems (NIPS 2015).
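The variational lower bound on the mutual information referred to above (the Barber-Agakov form) can be checked numerically on a toy discrete channel. The sketch below is not the paper's method, which uses convolutional decoders trained by stochastic gradient ascent; the distributions here are invented to show only the inequality itself: any decoder q(a|s') gives a lower bound on I(a; s'), tight when q equals the true posterior.

```python
import numpy as np

# Toy "channel": 2 actions, 3 successor states; p_s_a[a, s] = p(s' = s | a).
p_s_a = np.array([[0.8, 0.1, 0.1],
                  [0.1, 0.1, 0.8]])
omega = np.array([0.5, 0.5])            # action distribution w(a)

joint = omega[:, None] * p_s_a          # p(a, s')
p_s = joint.sum(axis=0)                 # marginal p(s')

# Exact mutual information I(a; s'), for reference.
mi = np.sum(joint * np.log(joint / (omega[:, None] * p_s[None, :])))

# Barber-Agakov bound:  I(a; s') >= H(a) + E_{p(a,s')}[log q(a | s')]
# holds for ANY variational decoder q (columns of q sum to one).
h_a = -np.sum(omega * np.log(omega))
q = np.array([[0.7, 0.4, 0.2],
              [0.3, 0.6, 0.8]])         # an arbitrary decoder q[a, s]
bound = h_a + np.sum(joint * np.log(q))

# The bound is tight exactly when q equals the true posterior p(a | s').
posterior = joint / p_s[None, :]
tight = h_a + np.sum(joint * np.log(posterior))
```

Maximising the bound over the decoder's parameters therefore maximises a guaranteed underestimate of the empowerment objective, which is what makes the approach scalable.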
Implicit Causal Models for Genome-wide Association Studies
Progress in probabilistic generative models has accelerated, developing
richer models with neural architectures, implicit densities, and with scalable
algorithms for their Bayesian inference. However, there has been limited
progress in models that capture causal relationships, for example, how
individual genetic factors cause major human diseases. In this work, we focus
on two challenges in particular: How do we build richer causal models, which
can capture highly nonlinear relationships and interactions between multiple
causes? How do we adjust for latent confounders, which are variables
influencing both cause and effect and which prevent learning of causal
relationships? To address these challenges, we synthesize ideas from causality
and modern probabilistic modeling. For the first, we describe implicit causal
models, a class of causal models that leverages neural architectures with an
implicit density. For the second, we describe an implicit causal model that
adjusts for confounders by sharing strength across examples. In experiments, we
scale Bayesian inference on up to a billion genetic measurements. We achieve
state-of-the-art accuracy for identifying causal factors: we significantly
outperform existing genetics methods by an absolute difference of 15-45.3%.
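The confounding problem the abstract describes can be seen in a deliberately simple linear stand-in (the variable names, effect sizes, and linear model are invented for this sketch; the paper's models are neural and implicit): when a latent factor drives both the putative cause and the effect, a naive regression is biased, and adjusting for the confounder recovers the true effect.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5000
z = rng.normal(size=n)                   # latent confounder (e.g. ancestry)
x = z + rng.normal(size=n)               # "genetic factor", driven by z
y = 1.0 * x + 2.0 * z + 0.1 * rng.normal(size=n)   # trait; true effect is 1.0

# Naive regression y ~ x ignores z and absorbs its effect into the slope.
naive = np.cov(x, y)[0, 1] / np.var(x)

# Adjusted regression y ~ x + z separates the two causal pathways.
X = np.column_stack([x, z, np.ones(n)])
adj, *_ = np.linalg.lstsq(X, y, rcond=None)
```

The paper's contribution is to do this kind of adjustment when the confounder is not observed, by sharing strength across examples; here z is handed to the regression only to make the bias and its removal visible.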
Meta-learning of Sequential Strategies
In this report we review memory-based meta-learning as a tool for building
sample-efficient strategies that learn from past experience to adapt to any
task within a target class. Our goal is to equip the reader with the conceptual
foundations of this tool for building new, scalable agents that operate on
broad domains. To do so, we present basic algorithmic templates for building
near-optimal predictors and reinforcement learners which behave as if they had
a probabilistic model that allowed them to efficiently exploit task structure.
Furthermore, we recast memory-based meta-learning within a Bayesian framework,
showing that the meta-learned strategies are near-optimal because they amortize
Bayes-filtered data, where the adaptation is implemented in the memory dynamics
as a state-machine of sufficient statistics. Essentially, memory-based
meta-learning translates the hard problem of probabilistic sequential inference
into a regression problem.
Comment: DeepMind Technical Report (15 pages, 6 figures). Version V1.
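The claim that adaptation can be implemented in memory as a state machine of sufficient statistics has a textbook instance: sequential prediction of a coin with unknown bias. The sketch below is not from the report; it shows the Bayes-optimal (Laplace) predictor carrying only a two-number sufficient-statistic state, which is the kind of computation a meta-learned recurrent memory ends up amortizing.

```python
def amortized_predictor(bits):
    """Predict each next bit from a sufficient-statistic state (heads, total).

    This is the Bayes-optimal predictor under a uniform prior on the coin
    bias, written as a state machine: the memory holds (heads, total), and
    the prediction is the posterior predictive P(next bit = 1)."""
    heads, total = 0, 0
    preds = []
    for b in bits:
        preds.append((heads + 1) / (total + 2))  # Laplace's rule of succession
        heads += b
        total += 1
    return preds

p = amortized_predictor([1, 1, 1, 0])
```

A memory-based meta-learner trained on many coins would have to regress onto exactly this input-output mapping, which is the sense in which hard sequential inference becomes a regression problem.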
Modeling Uncertainty by Learning a Hierarchy of Deep Neural Connections
Modeling uncertainty in deep neural networks, despite recent important
advances, is still an open problem. Bayesian neural networks are a powerful
solution, where the prior over network weights is a design choice, often a
normal distribution or other distribution encouraging sparsity. However, this
prior is agnostic to the generative process of the input data, which might lead
to unwarranted generalization for out-of-distribution tested data. We suggest
the presence of a confounder for the relation between the input data and the
discriminative function given the target label. We propose an approach for
modeling this confounder by sharing neural connectivity patterns between the
generative and discriminative networks. This approach leads to a new deep
architecture, where networks are sampled from the posterior of local causal
structures, and coupled into a compact hierarchy. We demonstrate that sampling
networks from this hierarchy, proportionally to their posterior, is efficient
and enables estimating various types of uncertainties. Empirical evaluations of
our method demonstrate significant improvement compared to state-of-the-art
calibration and out-of-distribution detection methods.
Mimic and Classify: A meta-algorithm for Conditional Independence Testing
Given independent samples generated from the joint distribution p(x, y, z), we
study the problem of Conditional Independence (CI-Testing), i.e., whether the
joint equals the CI distribution p_CI(x, y, z) = p(z) p(x|z) p(y|z)
or not. We cast this problem
under the purview of the proposed, provable meta-algorithm, "Mimic and
Classify", which is realized in two-steps: (a) Mimic the CI distribution close
enough to recover the support, and (b) Classify to distinguish the joint and
the CI distribution. Thus, as long as we have a good generative model and a
good classifier, we potentially have a sound CI Tester. With this modular
paradigm, CI Testing becomes amenable to state-of-the-art methods, both
generative and classification, from the modern advances in Deep Learning,
which in general can handle issues related to the curse of dimensionality and
operation in the small-sample regime. We present extensive numerical
experiments on synthetic and real datasets where new mimic methods, such as
conditional GANs and regression with neural networks, outperform the current best CI Testing performance
in the literature. Our theoretical results provide analysis on the estimation
of null distribution as well as allow for general measures, i.e., when either
some of the random variables are discrete and some are continuous or when one
or more of them are discrete-continuous mixtures.
Comment: 16 pages, 2 figures
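The two-step recipe can be realized end-to-end in a few dozen lines of numpy. Everything below is a stand-in chosen for the sketch, not the paper's models: the data-generating process is invented, the "mimic" is a nearest-neighbour swap rather than a conditional GAN, and the classifier is a hand-rolled logistic regression. Held-out accuracy near 0.5 means the joint and the mimicked CI distribution are indistinguishable, so the CI hypothesis survives; accuracy well above 0.5 rejects it.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 2000

def dataset(ci):
    """x and y share the confounder z; if ci is False, y also copies x's noise."""
    z = rng.normal(size=n)
    ex = rng.normal(size=n)
    x = z + ex
    y = z + (0.0 if ci else ex) + 0.1 * rng.normal(size=n)
    return x, y, z

def mimic(x, z):
    """Step (a): approximate a draw from p(x|z) by swapping x between
    z-nearest pairs, leaving (y, z) untouched."""
    idx = np.argsort(z)
    x_ci = x.copy()
    x_ci[idx[0::2]], x_ci[idx[1::2]] = x[idx[1::2]], x[idx[0::2]]
    return x_ci

def accuracy(x, y, z):
    """Step (b): train a logistic regression to tell joint rows from mimic
    rows, and report its held-out accuracy."""
    feats = lambda xs: np.column_stack([xs, y, z, (xs - y) ** 2])
    F = np.vstack([feats(x), feats(mimic(x, z))])
    F = (F - F.mean(0)) / F.std(0)
    F = np.column_stack([F, np.ones(len(F))])        # intercept
    lab = np.r_[np.ones(n), np.zeros(n)]
    perm = rng.permutation(2 * n)
    F, lab = F[perm], lab[perm]
    w = np.zeros(F.shape[1])
    for _ in range(500):                             # plain gradient descent
        p = 1.0 / (1.0 + np.exp(-F[:n] @ w))
        w -= 0.5 * F[:n].T @ (p - lab[:n]) / n
    return np.mean(((F[n:] @ w) > 0) == (lab[n:] > 0))

acc_dep = accuracy(*dataset(ci=False))   # x, y dependent given z: rejects CI
acc_ci = accuracy(*dataset(ci=True))     # x, y independent given z: near 0.5
```

Swapping in a stronger generative model for `mimic` and a deep classifier for `accuracy` is exactly the modularity the meta-algorithm advertises.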
Graph2Seq: Scalable Learning Dynamics for Graphs
Neural networks have been shown to be an effective tool for learning
algorithms over graph-structured data. However, graph representation
techniques---that convert graphs to real-valued vectors for use with neural
networks---are still in their infancy. Recent works have proposed several
approaches (e.g., graph convolutional networks), but these methods have
difficulty scaling and generalizing to graphs with different sizes and shapes.
We present Graph2Seq, a new technique that represents vertices of graphs as
infinite time-series. By not limiting the representation to a fixed dimension,
Graph2Seq scales naturally to graphs of arbitrary sizes and shapes. Graph2Seq
is also reversible, allowing full recovery of the graph structure from the
sequences. By analyzing a formal computational model for graph representation,
we show that an unbounded sequence is necessary for scalability. Our
experimental results with Graph2Seq show strong generalization and new
state-of-the-art performance on a variety of graph combinatorial optimization
problems.
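The idea of representing each vertex as a time-series rather than a fixed-dimension vector can be illustrated with a simple local aggregation dynamics. The update rule below is an invented deterministic stand-in, not Graph2Seq's randomized (and provably reversible) dynamics; it shows only how a sequence representation applies unchanged to graphs of any size and can be extended in time at will.

```python
import numpy as np

def vertex_sequences(A, T=8):
    """Represent each vertex of the graph with adjacency matrix A as a
    length-T time-series of its state under a local aggregation dynamics.
    Returns an (n, T) array whose row u is vertex u's sequence."""
    n = A.shape[0]
    x = np.ones(n)
    seq = np.empty((n, T))
    for t in range(T):
        x = x + A @ x                     # aggregate neighbour states
        x = x / np.linalg.norm(x)         # keep the dynamics bounded
        seq[:, t] = x
    return seq

# Path graph 0-1-2-3; the same function works for a graph of any size,
# and T can be grown without changing anything else.
A = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
S = vertex_sequences(A)
```

Structurally equivalent vertices (0 and 3) get identical sequences, while vertices of different degree (0 and 1) are told apart within a few steps.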
Data-driven root-cause analysis for distributed system anomalies
Modern distributed cyber-physical systems encounter a large variety of
anomalies and in many cases, they are vulnerable to catastrophic fault
propagation scenarios due to strong connectivity among the sub-systems. In this
regard, root-cause analysis becomes highly intractable due to complex fault
propagation mechanisms in combination with diverse operating modes. This paper
presents a new data-driven framework for root-cause analysis for addressing
such issues. The framework is based on a spatiotemporal feature extraction
scheme for distributed cyber-physical systems built on the concept of symbolic
dynamics for discovering and representing causal interactions among subsystems
of a complex system. We present two approaches for root-cause analysis, namely
sequential state switching (based on the free-energy concept of a Restricted
Boltzmann Machine, RBM) and artificial anomaly association (a multi-class
classification framework using deep neural networks, DNN).
Synthetic data from cases with failed pattern(s) and anomalous node are
simulated to validate the proposed approaches, then compared with the
performance of vector autoregressive (VAR) model-based root-cause analysis.
A real dataset based on the Tennessee Eastman process (TEP) is also used for
validation. The results show that (1) both approaches achieve high accuracy
in root-cause analysis and successfully handle multiple nominal operation
modes, and (2) the proposed tool-chain is scalable while maintaining high
accuracy.
Comment: 6 pages, 3 figures
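The RBM free-energy score underlying the first approach can be made concrete. For a binary RBM with visible bias b, hidden bias c, and weights W, the free energy is F(v) = -b·v - Σ_j log(1 + exp(c_j + v·W[:,j])), and lower free energy means a more probable (more "nominal") pattern. The tiny RBM below has hand-picked, untrained weights chosen purely for illustration:

```python
import numpy as np

def free_energy(v, b, c, W):
    """F(v) = -b.v - sum_j log(1 + exp(c_j + v.W[:, j])) for a binary RBM."""
    return -b @ v - np.sum(np.logaddexp(0.0, c + v @ W))

# One hidden unit that "likes" the first two visible units on, last two off.
b = np.zeros(4)
c = np.array([-1.0])
W = np.array([[2.0], [2.0], [-2.0], [-2.0]])

nominal = free_energy(np.array([1.0, 1.0, 0.0, 0.0]), b, c, W)
anomalous = free_energy(np.array([0.0, 0.0, 1.0, 1.0]), b, c, W)
```

After training on nominal operation data, the rise in free energy of observed patterns relative to nominal ones can serve as the anomaly score from which root-cause candidates are ranked.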
An Interpretable and Sparse Neural Network Model for Nonlinear Granger Causality Discovery
While most classical approaches to Granger causality detection rely on
linear time series assumptions, many interactions in neuroscience and economics
applications are nonlinear. We develop an approach to nonlinear Granger
causality detection using multilayer perceptrons where the input to the network
is the past time lags of all series and the output is the future value of a
single series. A sufficient condition for Granger non-causality in this setting
is that all of the outgoing weights of the input data, the past lags of a
series, to the first hidden layer are zero. For estimation, we utilize a group
lasso penalty to shrink groups of input weights to zero. We also propose a
hierarchical penalty for simultaneous Granger causality and lag estimation. We
validate our approach on simulated data from both a sparse linear
autoregressive model and the sparse and nonlinear Lorenz-96 model.
Comment: Accepted to the NIPS Time Series Workshop 201
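The sufficient condition above (all first-layer weights leaving one series' lags equal to zero) is exactly what the group penalty targets. A sketch of the group-lasso proximal step on a first-layer weight tensor, with shapes and magnitudes invented for the example:

```python
import numpy as np

def group_prox(W, lam):
    """Proximal step for the group lasso: each input series' block of
    first-layer weights W[:, j, :] is shrunk toward zero as a group and
    zeroed out entirely when its norm falls below lam."""
    out = W.copy()
    for j in range(W.shape[1]):
        nrm = np.linalg.norm(W[:, j, :])
        out[:, j, :] *= max(0.0, 1.0 - lam / nrm) if nrm > 0 else 0.0
    return out

# Shape: hidden units x input series x lags.  Series 2's weights are small,
# so the penalty prunes the whole group -- the estimate then declares
# series 2 Granger non-causal for the output series.
W = np.zeros((3, 3, 2))
W[:, 0, :] = 1.0        # strong dependence on series 0
W[:, 1, :] = 0.8        # strong dependence on series 1
W[:, 2, :] = 0.05       # weak, presumably spurious, dependence on series 2
W_new = group_prox(W, lam=0.5)
```

In the full method this step alternates with gradient updates on the prediction loss; the proximal operator is what produces exact zeros, turning sparsity patterns into Granger non-causality statements.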
Plan2Vec: Unsupervised Representation Learning by Latent Plans
In this paper we introduce plan2vec, an unsupervised representation learning
approach that is inspired by reinforcement learning. Plan2vec constructs a
weighted graph on an image dataset using near-neighbor distances, and then
extrapolates this local metric to a global embedding by distilling path
integrals over planned paths. When applied to control, plan2vec offers a way
to learn goal-conditioned value estimates that are accurate over long horizons
and that are both compute- and sample-efficient. We demonstrate the effectiveness of
plan2vec on one simulated and two challenging real-world image datasets.
Experimental results show that plan2vec successfully amortizes the planning
cost, enabling reactive planning that is linear in memory and computation
complexity rather than exhaustive search over the entire state space.
Comment: Code available at https://geyang.github.io/plan2ve
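The graph-then-plan construction can be sketched in a few lines: points sampled along a curved one-dimensional manifold stand in for the image dataset, a k-nearest-neighbour graph supplies the trusted local metric, and all-pairs shortest paths supply the global one. The data and the choice k=2 are invented for the sketch, and the final step of plan2vec (distilling the global metric into a neural embedding) is omitted.

```python
import numpy as np

# 20 "states" lying on a curved 1-D manifold (a half circle).
theta = np.linspace(0.0, np.pi, 20)
pts = np.stack([np.cos(theta), np.sin(theta)], axis=1)

# Local metric: trust only distances to each point's k nearest neighbours.
D = np.linalg.norm(pts[:, None] - pts[None, :], axis=-1)
n, k = len(pts), 2
G = np.full((n, n), np.inf)
np.fill_diagonal(G, 0.0)
for i in range(n):
    for j in np.argsort(D[i])[1:k + 1]:
        G[i, j] = G[j, i] = D[i, j]

# Global metric: all-pairs shortest paths over the graph (Floyd-Warshall),
# playing the role of the planner.
for m in range(n):
    G = np.minimum(G, G[:, [m]] + G[[m], :])
```

The planned distance between the two endpoints follows the arc (about pi) rather than the straight-line chord (2.0); regressing an embedding onto G is what would amortize this planning cost into a value function.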