Learning Temporal Point Processes via Reinforcement Learning
Social goods, such as healthcare, smart cities, and information networks, often
produce ordered event data in continuous time. The generative processes of
these event data can be very complex, requiring flexible models to capture
their dynamics. Temporal point processes offer an elegant framework for
modeling event data without discretizing the time. However, the existing
maximum-likelihood-estimation (MLE) learning paradigm requires hand-crafting
the intensity function beforehand and cannot directly monitor the
goodness-of-fit of the estimated model in the process of training. To alleviate
the risk of model-misspecification in MLE, we propose to generate samples from
the generative model and monitor the quality of the samples in the process of
training until the samples and the real data are indistinguishable. We take
inspiration from reinforcement learning (RL) and treat the generation of each
event as the action taken by a stochastic policy. We parameterize the policy as
a flexible recurrent neural network and gradually improve the policy to mimic
the observed event distribution. Since the reward function is unknown in this
setting, we uncover an analytic and nonparametric form of the reward function
using an inverse reinforcement learning formulation. This new RL framework
allows us to derive an efficient policy gradient algorithm for learning
flexible point process models, and we show that it performs well on both
synthetic and real data.
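As a rough illustration of what a nonparametric, kernel-based reward could look like, the sketch below scores a time point by how much more densely observed events fall near it than generated events do. The Gaussian kernel, the bandwidth, and the names `kernel_mean` and `reward` are assumptions made for illustration; the abstract does not state the exact form of the reward.

```python
import math

def gaussian_kernel(s, t, bandwidth=1.0):
    # Similarity between two time points (assumed kernel choice).
    return math.exp(-((s - t) ** 2) / (2.0 * bandwidth ** 2))

def kernel_mean(sequences, t, bandwidth=1.0):
    # Average over sequences of the summed kernel between t and every event.
    total = sum(gaussian_kernel(e, t, bandwidth) for seq in sequences for e in seq)
    return total / len(sequences)

def reward(t, observed, generated, bandwidth=1.0):
    # Positive where real events are denser than generated events near t.
    return kernel_mean(observed, t, bandwidth) - kernel_mean(generated, t, bandwidth)

# Hypothetical toy data: observed events cluster in [1, 3], generated ones do not.
observed = [[1.0, 2.0, 3.0], [1.1, 2.1, 2.9]]
generated = [[0.2, 4.5], [0.1, 4.8, 5.0]]

print(reward(2.0, observed, generated))  # positive: the policy under-generates here
print(reward(4.8, observed, generated))  # negative: the policy over-generates here
```

A policy gradient method would then weight the log-probability of each generated event by a reward of this kind, so that events placed where real data is dense become more likely.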
Fully Neural Network based Model for General Temporal Point Processes
A temporal point process is a mathematical model for a time series of
discrete events, which covers various applications. Recently, recurrent neural
network (RNN) based models have been developed for point processes and have
been found effective. RNN based models usually assume a specific functional
form for the time course of the intensity function of a point process (e.g.,
exponentially decreasing or increasing with the time since the most recent
event). However, such an assumption can restrict the expressive power of the
model. We herein propose a novel RNN based model in which the time course of
the intensity function is represented in a general manner. In our approach, we
first model the integral of the intensity function using a feedforward neural
network and then obtain the intensity function as its derivative. This approach
enables us to both obtain a flexible model of the intensity function and
exactly evaluate the log-likelihood function, which contains the integral of
the intensity function, without any numerical approximations. Our model
achieves competitive or superior performance compared to the previous
state-of-the-art methods on both synthetic and real datasets.
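The idea of modeling the integral of the intensity and differentiating it can be sketched in a few lines. This is a minimal toy under stated assumptions, not the paper's architecture: the RNN history encoder is omitted, the parameters `A`, `W`, `B`, `V`, `C` are made-up constants rather than learned weights, and the derivative is written by hand instead of being obtained by automatic differentiation. Keeping `A`, `W`, and `V` positive makes the cumulative intensity increasing, so its derivative is a valid (positive) intensity.

```python
import math

def softplus(x):
    # Numerically stable log(1 + exp(x)).
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))

def sigmoid(x):
    # Derivative of softplus, written to avoid overflow.
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)

# Hypothetical toy parameters; A, W and V must stay positive.
A = 0.3
W = [0.5, 1.5, 3.0]
B = [-1.0, 0.0, 0.5]
V = [0.8, 0.4, 0.2]
C = 0.1

def pre_activation(tau):
    hidden = sum(v * math.tanh(w * tau + b) for w, b, v in zip(W, B, V))
    return A * tau + hidden + C

def cumulative_intensity(tau):
    # Integral of the intensity from 0 to tau; zero at tau = 0 by construction.
    return softplus(pre_activation(tau)) - softplus(pre_activation(0.0))

def intensity(tau):
    # Chain rule: d/dtau softplus(z(tau)) = sigmoid(z(tau)) * z'(tau).
    dz = A + sum(v * w * (1.0 - math.tanh(w * tau + b) ** 2)
                 for w, b, v in zip(W, B, V))
    return sigmoid(pre_activation(tau)) * dz

def log_likelihood(intervals):
    # Each inter-event interval contributes log(intensity) minus the integral.
    return sum(math.log(intensity(t)) - cumulative_intensity(t) for t in intervals)

print(log_likelihood([0.3, 1.2, 0.7]))
```

Because the integral term is evaluated in closed form, the log-likelihood needs no numerical quadrature, which is the property the abstract highlights.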
Reinforcement Learning with Policy Mixture Model for Temporal Point Processes Clustering
The temporal point process is an expressive tool for modeling event sequences
over time. In this paper, we take a reinforcement learning view whereby the
observed sequences are assumed to be generated from a mixture of latent
policies. The purpose is to cluster the sequences with different temporal
patterns into the underlying policies while learning each policy model.
The flexibility of our model lies in: i) all the components are networks
including the policy network for modeling the intensity function of temporal
point process; ii) to handle varying-length event sequences, we resort to
inverse reinforcement learning by decomposing the observed sequence into states
(RNN hidden embedding of history) and actions (time interval to next event) in
order to learn the reward function, thus achieving better performance or
increasing efficiency compared to existing methods using rewards over the
entire sequence such as log-likelihood or Wasserstein distance. We adopt an
expectation-maximization framework with the E-step estimating the cluster
labels for each sequence, and the M-step aiming to learn the respective policy.
Extensive experiments show the efficacy of our method against
state-of-the-art methods. Comment: 8 pages, 3 figures, 4 tables
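The alternation between estimating cluster labels (E-step) and refitting each cluster's model (M-step) can be sketched with a much simpler stand-in. In the toy below, each "policy" is just a constant event rate and the score is the exact log-likelihood, which is a deliberate simplification: the paper uses neural policies and a learned reward. All names and the toy data are hypothetical.

```python
import math

def sequence_log_likelihood(gaps, rate):
    # Log-likelihood of inter-event gaps under a constant-rate policy.
    return len(gaps) * math.log(rate) - rate * sum(gaps)

def e_step(sequences, rates, weights):
    # Posterior probability of each cluster for each sequence.
    responsibilities = []
    for gaps in sequences:
        logs = [math.log(w) + sequence_log_likelihood(gaps, r)
                for r, w in zip(rates, weights)]
        top = max(logs)
        unnorm = [math.exp(v - top) for v in logs]
        total = sum(unnorm)
        responsibilities.append([u / total for u in unnorm])
    return responsibilities

def m_step(sequences, responsibilities, n_clusters):
    # Refit each cluster's rate and mixing weight from soft assignments.
    rates, weights = [], []
    for k in range(n_clusters):
        resp_k = [r[k] for r in responsibilities]
        events = sum(r * len(g) for r, g in zip(resp_k, sequences))
        duration = sum(r * sum(g) for r, g in zip(resp_k, sequences))
        rates.append(max(events, 1e-12) / max(duration, 1e-12))
        weights.append(max(sum(resp_k) / len(sequences), 1e-12))
    return rates, weights

def cluster(sequences, initial_rates, n_iter=20):
    rates = list(initial_rates)
    weights = [1.0 / len(rates)] * len(rates)
    for _ in range(n_iter):
        resp = e_step(sequences, rates, weights)
        rates, weights = m_step(sequences, resp, len(rates))
    labels = [max(range(len(rates)), key=lambda k: r[k]) for r in resp]
    return labels, rates

# Hypothetical data: two fast sequences followed by two slow ones.
sequences = [[0.1, 0.12, 0.09, 0.11], [0.08, 0.1, 0.13, 0.09],
             [2.0, 1.8, 2.3], [2.1, 1.9, 2.4]]
labels, rates = cluster(sequences, initial_rates=[1.0, 5.0])
print(labels, rates)
```

Sequences of different lengths are handled naturally here because each one contributes a single scalar score per cluster.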
Self-Attentive Hawkes Processes
Asynchronous events on the continuous time domain, e.g., social media actions
and stock transactions, occur frequently in the world. The ability to recognize
occurrence patterns of event sequences is crucial to predict which type of
events will happen next and when. A de facto standard mathematical framework to
do this is the Hawkes process. In order to enhance expressivity of multivariate
Hawkes processes, conventional statistical methods and deep recurrent networks
have been employed to modify its intensity function. The former is highly
interpretable and requires little training data but relies on correct
model design, while the latter depends less on prior knowledge and is
more powerful in capturing complicated patterns. We weigh the pros and cons of
these models and propose a self-attentive Hawkes process (SAHP). The proposed
method adapts self-attention to fit the intensity function of Hawkes processes.
This design has two benefits: (1) compared with conventional statistical
methods, the SAHP is better able to identify complicated dependency
relationships between temporal events; (2) compared with deep recurrent
networks, the self-attention mechanism is able to capture longer historical
information, and is more interpretable because the learnt attention weight
tensor shows contributions of each historical event. Experiments on four
real-world datasets demonstrate the effectiveness of the proposed method.
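To make the interpretability claim concrete, here is a minimal sketch of causal scaled dot-product self-attention over an event history. It is a generic illustration rather than the SAHP architecture: the event embeddings are made up, and the same vectors serve as queries, keys, and values, whereas a real model would use learned projections and a time encoding.

```python
import math

def softmax(scores):
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def causal_self_attention(queries, keys, values):
    """Scaled dot-product attention where event i attends to events 0..i only."""
    dim = len(keys[0])
    outputs, weights = [], []
    for i, q in enumerate(queries):
        scores = [dot(q, keys[j]) / math.sqrt(dim) for j in range(i + 1)]
        w = softmax(scores)
        out = [sum(w[j] * values[j][d] for j in range(i + 1))
               for d in range(len(values[0]))]
        # Pad with zeros so each row covers the whole sequence (future is masked).
        weights.append(w + [0.0] * (len(queries) - i - 1))
        outputs.append(out)
    return outputs, weights

# Hypothetical embeddings for three events in a sequence.
embeddings = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
outputs, weights = causal_self_attention(embeddings, embeddings, embeddings)
for row in weights:
    print(row)
```

Row i of `weights` shows how much each earlier event contributes to the representation of event i, which is the quantity one would inspect when reading an attention map.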
ABC Learning of Hawkes Processes with Missing or Noisy Event Times
The self-exciting Hawkes process is widely used to model events which occur
in bursts. However, many real world data sets contain missing events and/or
noisily observed event times, which we refer to as data distortion. The
presence of such distortion can severely bias the learning of the Hawkes
process parameters. To circumvent this, we propose modeling the distortion
function explicitly. This leads to a model with an intractable likelihood
function which makes it difficult to deploy standard parameter estimation
techniques. As such, we develop the ABC-Hawkes algorithm which is a novel
approach to estimation based on Approximate Bayesian Computation (ABC) and
Markov Chain Monte Carlo. This allows the parameters of the Hawkes process to
be learned in settings where conventional methods induce substantial bias or
are inapplicable. The proposed approach is shown to perform well on both real
and simulated data. Comment: Added comparison to literature
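The likelihood-free idea can be sketched with plain rejection ABC, which is simpler than the ABC plus Markov Chain Monte Carlo scheme the abstract describes. Everything specific here is an assumption for illustration: the exponential kernel, the uniform priors, the distortion model (random deletion plus Gaussian jitter with known settings), the two summary statistics, and all function names.

```python
import math
import random

def simulate_hawkes(mu, alpha, beta, horizon, rng):
    # Ogata thinning for intensity mu + alpha * sum(exp(-beta * (t - t_i))).
    events, t, excitation = [], 0.0, 0.0
    while True:
        upper = mu + excitation          # intensity only decays until the next event
        wait = rng.expovariate(upper)
        t += wait
        if t >= horizon:
            return events
        excitation *= math.exp(-beta * wait)
        if rng.random() * upper <= mu + excitation:
            events.append(t)
            excitation += alpha

def distort(events, p_missing, noise_sd, rng):
    # Drop each event with probability p_missing, then jitter the survivors.
    kept = [t + rng.gauss(0.0, noise_sd) for t in events if rng.random() >= p_missing]
    return sorted(kept)

def summary(events, horizon):
    # Event rate and fraction of short gaps (a crude burstiness measure).
    gaps = [b - a for a, b in zip(events, events[1:])]
    short = sum(g < 0.2 for g in gaps) / len(gaps) if gaps else 0.0
    return (len(events) / horizon, short)

def abc_rejection(observed, horizon, n_draws, tolerance, rng,
                  p_missing=0.3, noise_sd=0.05, beta=2.0):
    target = summary(observed, horizon)
    accepted = []
    for _ in range(n_draws):
        mu = rng.uniform(0.1, 2.0)               # assumed prior
        alpha = rng.uniform(0.0, 0.8 * beta)     # keeps alpha / beta below 1
        clean = simulate_hawkes(mu, alpha, beta, horizon, rng)
        simulated = distort(clean, p_missing, noise_sd, rng)
        if math.dist(target, summary(simulated, horizon)) <= tolerance:
            accepted.append((mu, alpha))
    return accepted

rng = random.Random(0)
horizon = 50.0
observed = distort(simulate_hawkes(0.5, 1.0, 2.0, horizon, rng), 0.3, 0.05, rng)
posterior = abc_rejection(observed, horizon, n_draws=200, tolerance=0.2, rng=rng)
print(len(posterior), "of 200 prior draws accepted")
```

The key point is that the distortion is applied to every simulated sequence before comparison, so the accepted parameters describe the undistorted process even though only distorted data is observed.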
Insider Threat Detection via Hierarchical Neural Temporal Point Processes
Insiders usually cause significant losses to organizations and are hard to
detect. Currently, various approaches have been proposed to achieve insider
threat detection based on analyzing the audit data that record information of
the employee's activity type and time. However, the existing approaches usually
focus on modeling the users' activity types but do not consider the activity
time information. In this paper, we propose a hierarchical neural temporal
point process model by combining the temporal point processes and recurrent
neural networks for insider threat detection. Our model is capable of capturing
a general nonlinear dependency over the history of all activities by the
two-level structure that effectively models activity times, activity types,
session durations, and session intervals information. Experimental results on
two datasets demonstrate that our model outperforms models that
consider only the activity types or only the activity times.
Learning Latent Process from High-Dimensional Event Sequences via Efficient Sampling
We target modeling latent dynamics in high-dimensional marked event sequences
without any prior knowledge about marker relations. This problem has rarely
been studied in previous work, which would have fundamental difficulty
handling the challenges that arise: 1) the high-dimensional markers and unknown
relation network among them pose intractable obstacles for modeling the latent
dynamic process; 2) one observed event sequence may concurrently contain
several different chains of interdependent events; 3) it is hard to define
the distance between two high-dimensional event sequences. To this end, in this
paper, we propose a seminal adversarial imitation learning framework for
high-dimensional event sequence generation, which can be decomposed into: 1) a
latent structural intensity model that estimates the adjacent nodes without
explicit networks and learns to capture the temporal dynamics in the latent
space of markers over the observed sequence; 2) an efficient random-walk-based
generation model that aims at imitating the generation process of
high-dimensional event sequences from a bottom-up view; 3) a discriminator
specified as a seq2seq network optimizing the rewards to help the generator
output event sequences that are as realistic as possible. Experimental results on both
synthetic and real-world datasets demonstrate that the proposed method can
effectively detect the hidden network among markers and make decent predictions
for future marked events, even when the number of markers scales to the million
level.
Understanding the Spread of COVID-19 Epidemic: A Spatio-Temporal Point Process View
Since the first coronavirus case was identified in the U.S. on Jan. 21, more
than 1 million people in the U.S. have confirmed cases of COVID-19. This
infectious respiratory disease has spread rapidly across more than 3000
counties and 50 states in the U.S. and has exhibited evolutionary clustering
and complex triggering patterns. It is essential to understand the complex
spacetime intertwined propagation of this disease so that accurate prediction
or smart external intervention can be carried out. In this paper, we model the
propagation of COVID-19 as spatio-temporal point processes and propose a
generative and intensity-free model to track the spread of the disease. We
further adopt a generative adversarial imitation learning framework to learn
the model parameters. In comparison with the traditional likelihood-based
learning methods, this imitation learning framework does not need to prespecify
an intensity function, which alleviates model misspecification. Moreover,
the adversarial learning procedure bypasses the difficult-to-evaluate integral
involved in the likelihood evaluation, which makes the model inference more
scalable with the data and variables. We showcase the dynamic learning
performance on the COVID-19 confirmed cases in the U.S. and evaluate the social
distancing policy based on the learned generative model.
Transformer Hawkes Process
Modern data acquisition routinely produces massive amounts of event sequence
data in various domains, such as social media, healthcare, and financial
markets. These data often exhibit complicated short-term and long-term temporal
dependencies. However, most of the existing recurrent neural network based
point process models fail to capture such dependencies, and yield unreliable
prediction performance. To address this issue, we propose a Transformer Hawkes
Process (THP) model, which leverages the self-attention mechanism to capture
long-term dependencies and meanwhile enjoys computational efficiency. Numerical
experiments on various datasets show that THP outperforms existing models in
terms of both likelihood and event prediction accuracy by a notable margin.
Moreover, THP is quite general and can incorporate additional structural
knowledge. We provide a concrete example, where THP achieves improved
prediction performance for learning multiple point processes when incorporating
their relational information.
Thinking While Moving: Deep Reinforcement Learning with Concurrent Control
We study reinforcement learning in settings where sampling an action from the
policy must be done concurrently with the time evolution of the controlled
system, such as when a robot must decide on the next action while still
performing the previous action. Much like a person or an animal, the robot must
think and move at the same time, deciding on its next action before the
previous one has completed. In order to develop an algorithmic framework for
such concurrent control problems, we start with a continuous-time formulation
of the Bellman equations, and then discretize them in a way that is aware of
system delays. We instantiate this new class of approximate dynamic programming
methods via a simple architectural extension to existing value-based deep
reinforcement learning algorithms. We evaluate our methods on simulated
benchmark tasks and a large-scale robotic grasping task where the robot must
"think while moving". Comment: Published as a conference paper at ICLR 202