Rescaling, thinning or complementing? On goodness-of-fit procedures for point process models and Generalized Linear Models
Generalized Linear Models (GLMs) are an increasingly popular framework for
modeling neural spike trains. They have been linked to the theory of stochastic
point processes and researchers have used this relation to assess
goodness-of-fit using methods from point-process theory, e.g. the
time-rescaling theorem. However, high neural firing rates or coarse
discretization lead to a breakdown of the assumptions necessary for this
connection. Here, we show how goodness-of-fit tests from point-process theory
can still be applied to GLMs by constructing equivalent surrogate point
processes out of time-series observations. Furthermore, two additional tests
based on thinning and complementing point processes are introduced. They
augment the instruments available for checking model adequacy of point
processes as well as discretized models.
Comment: 9 pages, to appear in NIPS 2010 (Neural Information Processing
Systems), corrected missing referenc
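The time-rescaling check mentioned in this abstract can be sketched as follows. This is a minimal illustration, not the paper's procedure: it assumes a homogeneous Poisson process with a known rate, and the helper names (`rescaled_intervals`, `ks_statistic_uniform`, `simulate_poisson`) are hypothetical. If the model intensity is correct, the rescaled intervals are unit-rate exponential, so their transform 1 - exp(-tau) should look Uniform(0, 1).

```python
import math
import random


def rescaled_intervals(event_times, cumulative_intensity):
    """Return tau_k = Lambda(t_k) - Lambda(t_{k-1}), taking t_0 = 0."""
    taus = []
    previous = cumulative_intensity(0.0)
    for t in event_times:
        current = cumulative_intensity(t)
        taus.append(current - previous)
        previous = current
    return taus


def ks_statistic_uniform(values):
    """Kolmogorov-Smirnov distance between a sample and Uniform(0, 1)."""
    ordered = sorted(values)
    n = len(ordered)
    distance = 0.0
    for i, value in enumerate(ordered):
        distance = max(distance, (i + 1) / n - value, value - i / n)
    return distance


def simulate_poisson(rate, horizon, rng):
    """Homogeneous Poisson process on [0, horizon) via exponential gaps."""
    times = []
    t = rng.expovariate(rate)
    while t < horizon:
        times.append(t)
        t += rng.expovariate(rate)
    return times


# Assumed toy setting: constant rate, so Lambda(t) = rate * t.
rng = random.Random(0)
rate = 5.0
events = simulate_poisson(rate, 200.0, rng)
taus = rescaled_intervals(events, lambda t: rate * t)
uniforms = [1.0 - math.exp(-tau) for tau in taus]
ks = ks_statistic_uniform(uniforms)
print(f"{len(events)} events, KS distance to Uniform(0, 1): {ks:.4f}")
```

A small KS distance is consistent with the assumed intensity; the abstract's point is that this continuous-time argument can break down when the data are coarsely discretized or firing rates are high.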
Parameter estimation of binned Hawkes processes
A key difficulty that arises from real event data is imprecision in the recording of event time-stamps. In many cases, retaining event times with a high precision is expensive due to the sheer volume of activity. Combined with practical limits on the accuracy of measurements, binned data is common. In order to use point processes to model such event data, tools for handling parameter estimation are essential. Here we consider parameter estimation of the Hawkes process, a type of self-exciting point process that has found application in the modeling of financial stock markets, earthquakes and social media cascades. We develop a novel optimization approach to parameter estimation of binned Hawkes processes using a modified Expectation-Maximization algorithm, referred to as Binned Hawkes Expectation Maximization (BH-EM). Through a detailed simulation study, we demonstrate that existing methods are capable of producing severely biased and highly variable parameter estimates and that our novel BH-EM method significantly outperforms them in all studied circumstances. We further illustrate the performance on network flow (NetFlow) data between devices in a real large-scale computer network, to characterize triggering behavior. These results highlight the importance of correct handling of binned data
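To make the binned setting concrete, the sketch below simulates a Hawkes process and then discards the exact time-stamps, keeping only counts per bin. It is an illustration under stated assumptions, not the BH-EM method: the exponential kernel, the parameter values, and the function names `simulate_hawkes` and `bin_counts` are all hypothetical choices, and simulation uses Ogata's thinning algorithm.

```python
import math
import random


def simulate_hawkes(mu, alpha, beta, horizon, rng):
    """Ogata thinning for lambda(t) = mu + sum_i alpha * exp(-beta * (t - t_i))."""
    events = []
    t = 0.0
    excitation = 0.0  # summed kernel contribution at the current time t
    while True:
        # Between events the intensity only decays, so its current
        # value is a valid upper bound for the thinning step.
        upper = mu + excitation
        wait = rng.expovariate(upper)
        excitation *= math.exp(-beta * wait)
        t += wait
        if t >= horizon:
            return events
        # Accept the candidate with probability lambda(t) / upper.
        if rng.random() * upper <= mu + excitation:
            events.append(t)
            excitation += alpha


def bin_counts(events, horizon, width):
    """Replace exact event times with counts per bin of the given width."""
    n_bins = math.ceil(horizon / width)
    counts = [0] * n_bins
    for t in events:
        counts[min(int(t / width), n_bins - 1)] += 1
    return counts


# Assumed toy parameters: branching ratio alpha / beta = 0.5.
rng = random.Random(1)
horizon = 100.0
events = simulate_hawkes(mu=0.5, alpha=0.8, beta=1.6, horizon=horizon, rng=rng)
counts = bin_counts(events, horizon, width=1.0)
print(f"{len(events)} events in {len(counts)} bins")
```

An estimator for binned data sees only `counts`, not `events`, which is the situation the abstract describes.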
Point process modeling as a framework to dissociate intrinsic and extrinsic components in neural systems
Understanding the factors shaping neuronal spiking is a central problem in neuroscience. Neurons may have complicated sensitivity and, often, are embedded in dynamic networks whose ongoing activity may influence their likelihood of spiking. One approach to characterizing neuronal spiking is the point process generalized linear model (GLM), which decomposes spike probability into explicit factors. This model represents a higher level of abstraction than biophysical models, such as Hodgkin-Huxley, but benefits from principled approaches for estimation and validation.
Here we address how to infer factors affecting neuronal spiking in different types of neural systems. We first extend the point process GLM, most commonly used to analyze single neurons, to model population-level voltage discharges recorded during human seizures. Both GLMs and descriptive measures reveal rhythmic bursting and directional wave propagation. However, we show that GLM estimates account for covariance between these features in a way that pairwise measures do not. Failure to account for this covariance leads to confounded results. We interpret the GLM results to speculate on the mechanisms of seizure and to suggest new therapies.
The second chapter highlights the flexibility of the GLM. We use this single framework to analyze enhancement, a statistical phenomenon, in three distinct systems. Here we define the enhancement score, a simple measure of shared information between spike factors in a GLM. We demonstrate how to estimate the score, including confidence intervals, using simulated data. In real data, we find that enhancement occurs prominently during human seizure, while redundancy tends to occur in mouse auditory networks. We discuss implications for physiology, particularly during seizure.
In the third part of this thesis, we apply point process modeling to spike trains recorded from single units in vitro under external stimulation. We re-parameterize models in a low-dimensional and physically interpretable way; namely, we represent their effects in principal component space. We show that this approach successfully separates the neurons observed in vitro into different classes consistent with their gene expression profiles.
Taken together, this work contributes a statistical framework for analyzing neuronal spike trains and demonstrates how it can be applied to create new insights into clinical and experimental data sets.
Adversarial Attacks and Defense Mechanisms to Improve Robustness of Deep Temporal Point Processes
Indiana University-Purdue University Indianapolis (IUPUI)
Temporal point processes (TPP) are mathematical approaches for modeling asynchronous
event sequences by considering the temporal dependency of each event on past events and its
instantaneous rate. Temporal point processes can model various problems, from earthquake
aftershocks, trade orders, gang violence, and reported crime patterns, to network analysis,
infectious disease transmissions, and virus spread forecasting. In each of these cases, the
entity’s behavior with the corresponding information is noted over time as an asynchronous
event sequence, and the analysis is done using temporal point processes, which provide a
means to define the generative mechanism of the sequence of events and ultimately predict
events and investigate causality.
Among point processes, the Hawkes process, a stochastic point process, is able to model
a wide range of contagious and self-exciting patterns. One of the Hawkes process’s well-known
applications is predicting the evolution of viral processes on networks, which is an important
problem in biology, the social sciences, and the study of the Internet. In existing works,
mean-field analysis based upon degree distribution is used to predict viral spreading across
networks of different types. However, it has been shown that degree distribution alone
fails to predict the behavior of viruses on some real-world networks. Recent attempts have
been made to use assortativity to address this shortcoming. This thesis illustrates how the
evolution of such a viral process is sensitive to the underlying network’s structure.
In Chapter 3, we show that adding assortativity does not fully explain the variance in
the spread of viruses for a number of real-world networks. We propose using the graphlet
frequency distribution combined with assortativity to explain variations in the evolution
of viral processes across networks with identical degree distribution. Using a data-driven
approach, by coupling predictive modeling with viral process simulation on real-world networks,
we show that simple regression models based on graphlet frequency distribution can
explain over 95% of the variance in virality on networks with the same degree distribution
but different network topologies. Our results highlight the importance of graphlets and identify
a small collection of graphlets that may have the most significant influence over the viral
processes on a network.
Due to the flexibility and expressiveness of deep learning techniques, several neural
network-based approaches have recently shown promise for modeling point process intensities.
However, there is a lack of research on the possible adversarial attacks and the
robustness of such models regarding adversarial attacks and natural shocks to systems.
Furthermore, while neural point processes may outperform simpler parametric models on
in-sample tests, how these models perform when encountering adversarial examples or sharp
non-stationary trends remains unknown.
In Chapter 4, we propose several white-box and black-box adversarial attacks against
deep temporal point processes. Additionally, we investigate the transferability of white-box
adversarial attacks against point processes modeled by deep neural networks, which are
considered a more elevated risk. Extensive experiments confirm that neural point processes
are vulnerable to adversarial attacks. Such a vulnerability is illustrated both in terms of
predictive metrics and the effect of attacks on the underlying point process’s parameters.
Specifically, adversarial attacks successfully transform the temporal Hawkes process regime
from sub-critical into super-critical and manipulate the modeled parameters, which is
considered a risk against parametric modeling approaches. Additionally, we evaluate the
vulnerability and performance of these models in the presence of non-stationary abrupt
changes, using the crimes and Covid-19 pandemic dataset as an example.
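The sub-critical versus super-critical distinction can be illustrated with a short sketch. It assumes a parametric Hawkes process with an exponential kernel alpha * exp(-beta * t), which is an assumption made here for illustration and not necessarily the parameterization used in the thesis. Under that assumption the branching ratio is alpha / beta, the process is stationary only when the ratio is below 1, and the stationary event rate is mu / (1 - ratio). The function names are hypothetical.

```python
def branching_ratio(alpha, beta):
    """Expected number of direct offspring per event (exponential kernel)."""
    return alpha / beta


def regime(alpha, beta):
    """Classify the process by its branching ratio."""
    ratio = branching_ratio(alpha, beta)
    if ratio < 1.0:
        return "sub-critical"
    if ratio > 1.0:
        return "super-critical"
    return "critical"


def stationary_rate(mu, alpha, beta):
    """Long-run event rate mu / (1 - ratio); defined only when sub-critical."""
    ratio = branching_ratio(alpha, beta)
    if ratio >= 1.0:
        raise ValueError("no stationary rate: branching ratio is >= 1")
    return mu / (1.0 - ratio)


# Assumed toy values: a clean fit versus a fit whose excitation
# parameter alpha has been pushed upward.
print(regime(0.8, 1.6), stationary_rate(0.5, 0.8, 1.6))
print(regime(2.0, 1.6))
```

In this toy setting, a perturbation that inflates the estimated alpha past beta moves the fitted model from a stable regime to an explosive one, which is the kind of parameter manipulation the abstract describes.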
Considering the security vulnerability of deep-learning models, including deep temporal
point processes, to adversarial attacks, it is essential to ensure the robustness of the deployed
algorithms, despite the success of deep learning techniques in modeling temporal point
processes.
In Chapter 5, we study the robustness of deep temporal point processes against several
proposed adversarial attacks from the adversarial defense viewpoint. Specifically, we
investigate the effectiveness of adversarial training using universal adversarial samples in
improving the robustness of the deep point processes. Additionally, we propose a general
point process domain-adopted (GPDA) regularization, which is strictly applicable to temporal
point processes, to reduce the effect of adversarial attacks and acquire an empirically
robust model. In this approach, unlike other computationally expensive approaches, there
is no need for additional back-propagation in the training step, and no further network is required. Ultimately, we propose an adversarial detection framework that has been trained
in the Generative Adversarial Network (GAN) manner and solely on clean training data.
Finally, in Chapter 6, we discuss implications of the research and future research directions.