Geometric Cross-Modal Comparison of Heterogeneous Sensor Data
In this work, we address the problem of cross-modal comparison of aerial data
streams. A variety of simulated automobile trajectories are sensed using two
different modalities: full-motion video, and radio-frequency (RF) signals
received by detectors at various locations. The information represented by the
two modalities is compared using self-similarity matrices (SSMs) corresponding
to time-ordered point clouds in feature spaces of each of these data sources;
we note that these feature spaces can be of entirely different scale and
dimensionality. Several metrics for comparing SSMs are explored, including a
cutting-edge time-warping technique that can simultaneously handle local time
warping and partial matches, while also controlling for the change in geometry
between feature spaces of the two modalities. We note that this technique is
quite general, and does not depend on the choice of modalities. In this
particular setting, we demonstrate that the cross-modal distance between SSMs
corresponding to the same trajectory type is smaller than the cross-modal
distance between SSMs corresponding to distinct trajectory types, and we
formalize this observation via precision-recall metrics in experiments.
Finally, we comment on promising implications of these ideas for future
integration into multiple-hypothesis tracking systems.
Comment: 10 pages, 13 figures, Proceedings of IEEE Aeroconf 201
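The idea of comparing modalities through self-similarity matrices can be illustrated with a minimal sketch. It assumes toy, hand-made feature vectors (the names video_features and rf_features are hypothetical) and uses a plain Frobenius difference between normalized SSMs, which is a simple stand-in and not the time-warping technique described in the abstract.

```python
import math

def self_similarity_matrix(points):
    """Pairwise Euclidean distances between time-ordered feature vectors."""
    return [[math.dist(p, q) for q in points] for p in points]

def normalize(ssm):
    """Rescale so the largest entry is 1, removing the feature-space scale."""
    peak = max(max(row) for row in ssm)
    if peak == 0:
        return ssm
    return [[value / peak for value in row] for row in ssm]

def frobenius_gap(a, b):
    """Frobenius norm of the difference between two equally sized matrices."""
    return math.sqrt(sum((x - y) ** 2
                         for row_a, row_b in zip(a, b)
                         for x, y in zip(row_a, row_b)))

# Hypothetical toy data: the same uniform straight-line motion observed in a
# 2-D feature space and in a 3-D feature space with a different scale.
video_features = [(float(t), 2.0 * t) for t in range(5)]
rf_features = [(10.0 * t, 0.0, 5.0 * t) for t in range(5)]

ssm_video = normalize(self_similarity_matrix(video_features))
ssm_rf = normalize(self_similarity_matrix(rf_features))
gap = frobenius_gap(ssm_video, ssm_rf)
```

Both SSMs are 5 x 5 regardless of the dimensionality of the underlying feature space, which is what makes the two modalities directly comparable in this sketch.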
Personalized Purchase Prediction of Market Baskets with Wasserstein-Based Sequence Matching
Personalization in marketing aims at improving the shopping experience of
customers by tailoring services to individuals. In order to achieve this,
businesses must be able to make personalized predictions regarding the next
purchase. That is, one must forecast the exact list of items that will comprise
the next purchase, i.e., the so-called market basket. Despite its relevance to
firm operations, this problem has received surprisingly little attention in
prior research, largely due to its inherent complexity. In fact,
state-of-the-art approaches are limited to intuitive decision rules for pattern
extraction. However, the simplicity of the pre-coded rules impedes performance,
since decision rules operate in an autoregressive fashion: the rules can only
make inferences from past purchases of a single customer without taking into
account the knowledge transfer that takes place between customers. In contrast,
our research overcomes the limitations of pre-set rules by contributing a novel
predictor of market baskets from sequential purchase histories: our predictions
are based on similarity matching in order to identify similar purchase habits
among the complete shopping histories of all customers. Our contributions are
as follows: (1) We propose similarity matching based on subsequential dynamic
time warping (SDTW) as a novel predictor of market baskets. Thereby, we can
effectively identify cross-customer patterns. (2) We leverage the Wasserstein
distance for measuring the similarity among embedded purchase histories. (3) We
develop a fast approximation algorithm for computing a lower bound of the
Wasserstein distance in our setting. An extensive series of computational
experiments demonstrates the effectiveness of our approach. In identifying the
exact market baskets, our approach outperforms state-of-the-art decision rules
from the literature by a factor of 4.0 in accuracy.
Comment: Accepted for oral presentation at 25th ACM SIGKDD Conference on
Knowledge Discovery and Data Mining (KDD 2019)
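Subsequence dynamic time warping, the matching step named in contribution (1), can be sketched as follows. This is a minimal illustration and not the authors' implementation: the function name subsequence_dtw is hypothetical, the sequences are plain numbers, and the local cost is an absolute difference standing in for the Wasserstein distance between embedded baskets.

```python
import math

def subsequence_dtw(query, series, cost=lambda a, b: abs(a - b)):
    """Find the best warped match of `query` anywhere inside `series`.

    Returns (accumulated cost, index in `series` where the match ends).
    `series` must be non-empty.
    """
    m = len(series)
    # Row 0 is all zeros: the match may start at any position for free.
    prev = [0.0] * (m + 1)
    for i in range(1, len(query) + 1):
        cur = [math.inf] * (m + 1)  # column 0 stays infinite for i > 0
        for j in range(1, m + 1):
            best_step = min(prev[j], cur[j - 1], prev[j - 1])
            cur[j] = cost(query[i - 1], series[j - 1]) + best_step
        prev = cur
    # The match may also end anywhere, so take the minimum of the last row.
    best_end = min(range(1, m + 1), key=lambda j: prev[j])
    return prev[best_end], best_end - 1

# Hypothetical toy sequences: the query occurs, locally stretched, inside the series.
match_cost, match_end = subsequence_dtw([1, 2, 3], [9, 9, 1, 2, 2, 3, 9])
```

The free start and free end are what distinguish the subsequence variant from classical DTW, which forces both sequences to align end to end.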
Probabilistic embeddings of the Fréchet distance
The Fréchet distance is a popular distance measure for curves which
naturally lends itself to fundamental computational tasks, such as clustering,
nearest-neighbor searching, and spherical range searching in the corresponding
metric space. However, its inherent complexity poses considerable computational
challenges in practice. To address this problem we study distortion of the
probabilistic embedding that results from projecting the curves to a randomly
chosen line. Such an embedding could be used in combination with, e.g.
locality-sensitive hashing. We show that in the worst case and under reasonable
assumptions, the discrete Fréchet distance between two polygonal curves of
complexity $t$ in $\mathbb{R}^d$, where $d \in \{2,3,4,5\}$, degrades
by a factor linear in $t$ with constant probability. We show upper and lower
bounds on the distortion. We also evaluate our findings empirically on a
benchmark data set. The preliminary experimental results stand in stark
contrast with our lower bounds. They indicate that highly distorted projections
happen very rarely in practice, and only for strongly conditioned input curves.
Keywords: Fréchet distance, metric embeddings, random projections
Comment: 27 pages, 11 figures
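The embedding studied here, projecting curves onto a randomly chosen line, can be sketched with the standard dynamic program for the discrete Fréchet distance. The helper names and the toy curves below are hypothetical, and the sketch only illustrates the projection step, not the distortion bounds of the paper. Since orthogonal projection onto a line never increases distances between points, the projected distance is at most the original one.

```python
import math
import random

def discrete_frechet(p, q):
    """Discrete Fréchet distance between two point sequences (quadratic DP)."""
    n, m = len(p), len(q)
    d = [[0.0] * m for _ in range(n)]
    for i in range(n):
        for j in range(m):
            c = math.dist(p[i], q[j])
            if i == 0 and j == 0:
                d[i][j] = c
            elif i == 0:
                d[i][j] = max(d[0][j - 1], c)
            elif j == 0:
                d[i][j] = max(d[i - 1][0], c)
            else:
                d[i][j] = max(min(d[i - 1][j], d[i][j - 1], d[i - 1][j - 1]), c)
    return d[-1][-1]

def random_unit_vector(dim, rng):
    """Uniformly random direction, obtained by normalizing a Gaussian vector."""
    v = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

def project(curve, direction):
    """Project every vertex onto the line spanned by a unit direction."""
    return [(sum(x * u for x, u in zip(pt, direction)),) for pt in curve]

# Hypothetical toy curves: two parallel horizontal segments at distance 1.
curve_a = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
curve_b = [(0.0, 1.0), (1.0, 1.0), (2.0, 1.0)]

rng = random.Random(0)  # fixed seed so the sketch is reproducible
direction = random_unit_vector(2, rng)
original = discrete_frechet(curve_a, curve_b)
projected = discrete_frechet(project(curve_a, direction), project(curve_b, direction))
```

Projecting these two curves onto the horizontal axis collapses their distance to zero, a small instance of the distortion the abstract refers to.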
Generating Labels for Regression of Subjective Constructs using Triplet Embeddings
Human annotations serve an important role in computational models where the
target constructs under study are hidden, such as dimensions of affect. This is
especially relevant in machine learning, where subjective labels derived from
related observable signals (e.g., audio, video, text) are needed to support
model training and testing. Current research trends focus on correcting
artifacts and biases introduced by annotators during the annotation process
while fusing them into a single annotation. In this work, we propose a novel
annotation approach using triplet embeddings. By lifting the absolute
annotation process to relative annotations where the annotator compares
individual target constructs in triplets, we leverage the accuracy of
comparisons over absolute ratings by human annotators. We then build a
1-dimensional embedding in Euclidean space that is indexed in time and serves
as a label for regression. In this setting, the annotation fusion occurs
naturally as a union of sets of sampled triplet comparisons among different
annotators. We show that by using our proposed sampling method to find an
embedding, we are able to accurately represent synthetic hidden constructs in
time under noisy sampling conditions. We further validate this approach using
human annotations collected from Mechanical Turk and show that we can recover
the underlying structure of the hidden construct up to bias and scaling
factors.
Comment: 9 pages, 5 figures, accepted journal paper
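The statement that the hidden construct is recovered only up to bias and scaling follows from the nature of triplet comparisons, as the sketch below illustrates. It assumes a noiseless toy signal and hypothetical helper names, and it does not implement the sampling or embedding method of the paper. A triplet (i, j, k) records that sample i is closer to sample j than to sample k.

```python
from itertools import combinations

def triplets_from_signal(signal):
    """List all triplets (i, j, k) with |s_i - s_j| < |s_i - s_k|; ties are skipped."""
    triplets = []
    for i in range(len(signal)):
        others = [t for t in range(len(signal)) if t != i]
        for j, k in combinations(others, 2):
            d_ij = abs(signal[i] - signal[j])
            d_ik = abs(signal[i] - signal[k])
            if d_ij < d_ik:
                triplets.append((i, j, k))
            elif d_ik < d_ij:
                triplets.append((i, k, j))
    return triplets

def agreement(embedding, triplets):
    """Fraction of triplets that a 1-dimensional embedding satisfies."""
    satisfied = sum(
        abs(embedding[i] - embedding[j]) < abs(embedding[i] - embedding[k])
        for i, j, k in triplets
    )
    return satisfied / len(triplets)

# Hypothetical hidden construct sampled at six time steps (integers keep the
# comparisons exact).
hidden = [0, 1, 4, 9, 10, 7]
triplets = triplets_from_signal(hidden)

rescaled = [3 * x - 2 for x in hidden]  # same signal with a scale and a bias
scrambled = [4, 0, 10, 1, 7, 9]         # same values in a different time order
```

Any affine rescaling of the signal satisfies exactly the same triplets, so comparisons alone cannot pin down the scale or the offset, while an embedding with the wrong temporal structure violates some of them.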
On Recursive Edit Distance Kernels with Application to Time Series Classification
This paper proposes some extensions to the work on kernels dedicated to
string or time series global alignment based on the aggregation of scores
obtained by local alignments. The extensions we propose make it possible to
construct, from the classical recursive definition of elastic distances,
recursive edit distance (or time-warp) kernels that are positive definite if
some sufficient conditions are satisfied. The sufficient conditions we end up
with are original and weaker than those proposed in earlier works, although a
recursive regularizing term is required to obtain the proof of positive
definiteness as a direct consequence of Haussler's convolution theorem. The
classification experiment we conducted on three classical time-warp distances
(two of which are metrics), using a Support Vector Machine classifier, leads us
to conclude that, when the pairwise distance matrix obtained from the training
data is far from definiteness, the positive definite recursive elastic kernels
in general outperform the distance-substituting kernels for the classical
elastic distances we have tested.
Comment: 14 pages
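For context, a distance-substituting kernel of the kind used as the baseline here can be sketched by plugging an elastic distance into an exponential. The sketch assumes classical dynamic time warping as the elastic distance and the form exp(-d / sigma); both choices, and the function names, are illustrative assumptions and not the recursive kernels proposed in the paper. A Gram matrix built this way is not guaranteed to be positive definite.

```python
import math

def dtw(a, b):
    """Classical dynamic time warping distance with absolute-difference cost."""
    n, m = len(a), len(b)
    d = [[math.inf] * (m + 1) for _ in range(n + 1)]
    d[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            step = min(d[i - 1][j], d[i][j - 1], d[i - 1][j - 1])
            d[i][j] = abs(a[i - 1] - b[j - 1]) + step
    return d[n][m]

def distance_substituting_gram(series, sigma=1.0):
    """Gram matrix with entries exp(-dtw(x, y) / sigma); sigma is an assumed bandwidth."""
    return [[math.exp(-dtw(x, y) / sigma) for y in series] for x in series]

# Hypothetical toy time series of different lengths.
toy_series = [[0, 1, 2], [0, 1, 1, 2], [2, 1, 0], [0, 0, 3, 3]]
gram = distance_substituting_gram(toy_series, sigma=2.0)
```

Such a matrix is symmetric with ones on the diagonal, but it can have negative eigenvalues, which is the situation in which the abstract reports that the positive definite recursive kernels perform better.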