359,028 research outputs found
Detecting One-variable Patterns
Given a pattern $p = s_1 x_1 s_2 x_2 \cdots s_{r-1} x_{r-1} s_r$ such that
$x_1, x_2, \ldots, x_{r-1} \in \{x, \overleftarrow{x}\}$, where $x$ is a
variable and $\overleftarrow{x}$ its reversal, and $s_1, s_2, \ldots, s_r$
are strings that contain no variables, we describe an
algorithm that constructs in $O(rn)$ time a compact representation of all
$P$ instances of $p$ in an input string of length $n$ over a polynomially bounded
integer alphabet, so that one can report those instances in $O(P)$ time.
Comment: 16 pages (+13 pages of Appendix), 4 figures, accepted to SPIRE 201
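To make the notion of an "instance" concrete, here is a minimal brute-force sketch. It is not the algorithm from the abstract, which builds a compact representation far more efficiently; it only enumerates matches naively. The token encoding (`"const"`, `"var"`, `"rev"`), the function name `find_instances`, and the assumption that the variable stands for a non-empty string are all illustrative choices, not taken from the paper.

```python
def find_instances(pattern, text):
    """Naively list (start, x) pairs where the pattern, with the variable
    replaced by x, occurs in text at position start.

    pattern is a list of (kind, value) tokens (hypothetical encoding):
      ("const", "ab") -> a fixed string
      ("var", None)   -> an occurrence of the variable x
      ("rev", None)   -> an occurrence of the reversal of x
    The pattern is assumed to contain at least one variable occurrence.
    """
    n_vars = sum(1 for kind, _ in pattern if kind != "const")
    const_len = sum(len(value) for kind, value in pattern if kind == "const")

    # Length of the constant prefix that precedes the first variable occurrence.
    prefix_len = 0
    first_kind = None
    for kind, value in pattern:
        if kind != "const":
            first_kind = kind
            break
        prefix_len += len(value)

    results = []
    for start in range(len(text)):
        max_k = (len(text) - start - const_len) // n_vars
        for k in range(1, max_k + 1):  # k = assumed length of x (non-empty)
            # The first variable occurrence fixes the value of x.
            chunk = text[start + prefix_len : start + prefix_len + k]
            x = chunk if first_kind == "var" else chunk[::-1]
            candidate = "".join(
                value if kind == "const" else (x if kind == "var" else x[::-1])
                for kind, value in pattern
            )
            if text.startswith(candidate, start):
                results.append((start, x))
    return results


# A variable followed by its reversal, searched for in "abba".
print(find_instances([("var", None), ("rev", None)], "abba"))
```

With the variable followed directly by its reversal, every instance is an even-length palindrome, so the call above reports the occurrences of `abba` and `bb`.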
Multi-Sensor Event Detection using Shape Histograms
Vehicular sensor data consists of multiple time-series arising from a number
of sensors. Using such multi-sensor data we would like to detect occurrences of
specific events that vehicles encounter, e.g., corresponding to particular
maneuvers that a vehicle makes or conditions that it encounters. Events are
characterized by similar waveform patterns re-appearing within one or more
sensors. Further, such patterns can be of variable duration. In this work, we
propose a method for detecting such events in time-series data using a novel
feature descriptor motivated by similar ideas in image processing. We define
the shape histogram: a constant dimension descriptor that nevertheless captures
patterns of variable duration. We demonstrate the efficacy of using shape
histograms as features to detect events in an SVM-based, multi-sensor,
supervised learning scenario, i.e., multiple time-series are used to detect an
event. We present results on real-life vehicular sensor data and show that our
technique performs better than available pattern detection implementations on
our data, and that it can also be used to combine features from multiple
sensors resulting in better accuracy than using any single sensor. Since
previous work on pattern detection in time-series has been in the single series
context, we also present results using our technique on multiple standard
time-series datasets and show that it is the most versatile in terms of how it
ranks compared to other published results.
Exploring Ways of Identifying Outliers in Spatial Point Patterns
This work discusses alternative methods to detect outliers in spatial point patterns.
Outliers are defined based on location only and also with respect to associated variables. Throughout the thesis we discuss five case studies: three come from experiments with spiders and bees, and the other two are data from earthquakes in a certain region. One of the main conclusions is that when detecting outliers from the point of view of location we need to take into consideration both the degree of clustering of the events and the context of the study. When detecting outliers from the point of view of an associated variable, outliers can be identified from a global or local perspective. For global outliers, one of the main questions addressed is whether the outliers tend to be clustered or randomly distributed in the region. All the work was done using the R programming language.
A Semiparametric Bayesian Model for Detecting Synchrony Among Multiple Neurons
We propose a scalable semiparametric Bayesian model to capture dependencies
among multiple neurons by detecting their co-firing (possibly with some lag
time) patterns over time. After discretizing time so there is at most one spike
at each interval, the resulting sequence of 1's (spike) and 0's (silence) for
each neuron is modeled using the logistic function of a continuous latent
variable with a Gaussian process prior. For multiple neurons, the corresponding
marginal distributions are coupled to their joint probability distribution
using a parametric copula model. The advantages of our approach are as follows:
the nonparametric component (i.e., the Gaussian process model) provides a
flexible framework for modeling the underlying firing rates; the parametric
component (i.e., the copula model) allows us to make inference regarding both
contemporaneous and lagged relationships among neurons; using the copula model,
we construct multivariate probabilistic models by separating the modeling of
univariate marginal distributions from the modeling of dependence structure
among variables; our method is easy to implement using a computationally
efficient sampling algorithm that can be easily extended to high dimensional
problems. Using simulated data, we show that our approach could correctly
capture temporal dependencies in firing rates and identify synchronous neurons.
We also apply our model to spike train data obtained from prefrontal cortical
areas in the rat brain.
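The single-neuron part of the model described above can be illustrated with a small simulation. This is a sketch under stated assumptions, not the authors' implementation: the squared-exponential kernel, the parameter values, and the function name `simulate_spike_train` are hypothetical, and the copula coupling across neurons is not covered. It draws a latent Gaussian process over time bins, maps it through the logistic function to a per-bin firing probability, and samples at most one spike per bin.

```python
import numpy as np


def simulate_spike_train(n_bins=200, length_scale=20.0, variance=1.0,
                         mean=-2.0, seed=0):
    """Simulate one binary spike train from a logistic-Gaussian-process model.

    All parameter names and default values are illustrative assumptions.
    """
    rng = np.random.default_rng(seed)
    t = np.arange(n_bins, dtype=float)

    # Squared-exponential covariance between time bins (an assumed kernel choice).
    diff = t[:, None] - t[None, :]
    cov = variance * np.exp(-0.5 * (diff / length_scale) ** 2)
    cov += 1e-6 * np.eye(n_bins)  # small jitter for numerical stability

    # Latent continuous variable with a Gaussian process prior.
    latent = rng.multivariate_normal(np.full(n_bins, mean), cov)

    # Logistic function turns the latent value into a firing probability.
    rate = 1.0 / (1.0 + np.exp(-latent))

    # One Bernoulli draw per bin: 1 = spike, 0 = silence.
    spikes = rng.binomial(1, rate)
    return latent, rate, spikes


latent, rate, spikes = simulate_spike_train()
print(spikes.sum(), "spikes in", spikes.size, "bins")
```

A negative prior mean keeps the baseline firing probability low, which loosely mirrors sparse spiking; in the model from the abstract, the latent process would be inferred from observed spike trains rather than simulated.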
Detecting Differential Item and Step Functioning with Rating Scale and Partial Credit Trees
Several statistical procedures have been suggested for detecting
differential item functioning (DIF) and differential step
functioning (DSF) in polytomous items. However, standard
procedures are designed for the comparison of pre-specified
reference and focal groups, such as males and females.
Here, we propose a framework for the detection of DIF and DSF in
polytomous items under the rating scale and partial credit model
that employs a model-based recursive partitioning algorithm. In contrast to existing
procedures, with this approach no pre-specification of reference
and focal groups is necessary, because they are detected in a
data-driven way. The resulting groups are characterized by
(combinations of) covariates and thus directly interpretable.
The statistical background and construction of the new procedures
are introduced along with an instructive example. Four simulation
studies illustrate and compare their statistical properties to
the well-established likelihood ratio test (LRT). While both the
LRT and the new procedures respect a given significance level,
the new procedures are in most cases equally (simple DIF groups)
or more powerful (complex DIF groups) and can also detect
DSF. The sensitivity to model misspecification is
investigated. An application example with empirical data
illustrates the practical use.
A software implementation of the new procedures is freely
available in the R system for statistical computing.
HYPA: Efficient Detection of Path Anomalies in Time Series Data on Networks
The unsupervised detection of anomalies in time series data has important
applications in user behavioral modeling, fraud detection, and cybersecurity.
Anomaly detection has, in fact, been extensively studied in categorical
sequences. However, we often have access to time series data that represent
paths through networks. Examples include transaction sequences in financial
networks, click streams of users in networks of cross-referenced documents, or
travel itineraries in transportation networks. To reliably detect anomalies, we
must account for the fact that such data contain a large number of independent
observations of paths constrained by a graph topology. Moreover, the
heterogeneity of real systems rules out frequency-based anomaly detection
techniques, which do not account for highly skewed edge and degree statistics.
To address this problem, we introduce HYPA, a novel framework for the
unsupervised detection of anomalies in large corpora of variable-length
temporal paths in a graph. HYPA provides an efficient analytical method to
detect paths with anomalous frequencies that result from nodes being traversed
in unexpected chronological order.
Comment: 11 pages with 8 figures and supplementary material. To appear at SIAM
Data Mining (SDM 2020)