Times series averaging from a probabilistic interpretation of time-elastic kernel
In the light of regularized dynamic time warping kernels, this paper
reconsiders the concept of time elastic centroid (TEC) for a set of time series.
From this perspective, we first show how TEC can easily be addressed as a
preimage problem. Unfortunately, this preimage problem is ill-posed, may suffer
from over-fitting, especially for long time series, and obtaining even a
sub-optimal solution involves heavy computational costs. We then derive two new
algorithms based on a probabilistic interpretation of kernel alignment matrices,
expressed in terms of probability distributions over sets of alignment paths.
The first algorithm is an iterative agglomerative heuristic inspired by the
state of the art DTW barycenter averaging (DBA) algorithm proposed specifically
for the Dynamic Time Warping measure. The second proposed algorithm achieves a
classical averaging of the aligned samples but also implements an averaging of
the times of occurrence of the aligned samples. It exploits a straightforward
progressive agglomerative heuristic. An experiment comparing, on 45
time series datasets, the classification error rates obtained by first nearest
neighbor classifiers that use a single medoid or centroid estimate to
represent each category shows that: i) centroid-based approaches
significantly outperform medoid-based approaches, ii) in the considered
experiment, the two proposed algorithms outperform the state of the art DBA
algorithm, and iii) the second proposed algorithm, which implements an averaging
jointly in the sample space and along the time axis, emerges as the most
significantly robust time elastic averaging heuristic, with an interesting noise
reduction capability. Index Terms: time series averaging, time elastic kernel,
Dynamic Time Warping, time series clustering and classification
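For readers unfamiliar with the DBA baseline mentioned in this abstract, the following is a minimal sketch of classic DTW alignment and one DBA-style centroid update. It is not the paper's kernel-based algorithms; the function names `dtw_path` and `dba_update`, the squared-difference local cost, and the restriction to 1-D series are assumptions made for illustration.

```python
import numpy as np

def dtw_path(a, b):
    """Return (cost, path) of the classic DTW alignment of 1-D series a and b."""
    n, m = len(a), len(b)
    D = np.full((n + 1, m + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = (a[i - 1] - b[j - 1]) ** 2  # assumed local cost
            D[i, j] = d + min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
    # Backtrack from the last cell to recover the alignment path.
    i, j = n, m
    path = []
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        step = np.argmin([D[i - 1, j - 1], D[i - 1, j], D[i, j - 1]])
        if step == 0:
            i, j = i - 1, j - 1
        elif step == 1:
            i -= 1
        else:
            j -= 1
    return D[n, m], path[::-1]

def dba_update(centroid, series_set):
    """One DBA-style step: average the samples aligned to each centroid index."""
    sums = np.zeros(len(centroid))
    counts = np.zeros(len(centroid))
    for s in series_set:
        _, path = dtw_path(centroid, s)
        for i, j in path:
            sums[i] += s[j]
            counts[i] += 1
    # Every centroid index occurs on each path, so counts is never zero.
    return sums / counts
```

Calling `dba_update` repeatedly, starting from a medoid or any member of the set, gives the usual iterative refinement; the number of iterations is a free choice here.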
Reference face graph for face recognition
Face recognition has been studied extensively; however, real-world face recognition still remains a challenging task. The demand for unconstrained practical face recognition is rising with the explosion of online multimedia such as social networks, and video surveillance footage where face analysis is of significant importance. In this paper, we approach face recognition in the context of graph theory. We recognize an unknown face using an external reference face graph (RFG). An RFG is generated and recognition of a given face is achieved by comparing it to the faces in the constructed RFG. Centrality measures are utilized to identify distinctive faces in the reference face graph. The proposed RFG-based face recognition algorithm is robust to the changes in pose and it is also alignment free. The RFG recognition is used in conjunction with DCT locality sensitive hashing for efficient retrieval to ensure scalability. Experiments are conducted on several publicly available databases and the results show that the proposed approach outperforms the state-of-the-art methods without any preprocessing necessities such as face alignment. Due to the richness in the reference set construction, the proposed method can also handle illumination and expression variation
Weakly Supervised Action Learning with RNN based Fine-to-coarse Modeling
We present an approach for weakly supervised learning of human actions. Given
a set of videos and an ordered list of the occurring actions, the goal is to
infer start and end frames of the related action classes within the video and
to train the respective action classifiers without any need for hand labeled
frame boundaries. To address this task, we propose a combination of a
discriminative representation of subactions, modeled by a recurrent neural
network, and a coarse probabilistic model to allow for a temporal alignment and
inference over long sequences. While this system alone already generates good
results, we show that the performance can be further improved by approximating
the number of subactions to the characteristics of the different action
classes. To this end, we adapt the number of subaction classes by iterating
realignment and reestimation during training. The proposed system is evaluated
on two benchmark datasets, the Breakfast and the Hollywood extended dataset,
showing a competitive performance on various weak learning tasks such as
temporal action segmentation and action alignment.
Graph edit distance from spectral seriation
This paper is concerned with computing graph edit distance. One of the criticisms that can be leveled at existing methods for computing graph edit distance is that they lack some of the formality and rigor of the computation of string edit distance. Hence, our aim is to convert graphs to string sequences so that string matching techniques can be used. To do this, we use a graph spectral seriation method to convert the adjacency matrix into a string or sequence order. We show how the serial ordering can be established using the leading eigenvector of the graph adjacency matrix. We pose the problem of graph-matching as a maximum a posteriori probability (MAP) alignment of the seriation sequences for pairs of graphs. This treatment leads to an expression in which the edit cost is the negative logarithm of the a posteriori sequence alignment probability. We compute the edit distance by finding the sequence of string edit operations which minimizes the cost of the path traversing the edit lattice. The edit costs are determined by the components of the leading eigenvectors of the adjacency matrix and by the edge densities of the graphs being matched. We demonstrate the utility of the edit distance on a number of graph clustering problems
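The general idea of turning a graph into a sequence and then applying string edit distance can be sketched as follows. This is a simplified illustration, not the paper's method: nodes are simply sorted by their leading-eigenvector component, each node is represented by its degree, and unit edit costs replace the MAP-derived costs. The names `seriate` and `edit_distance` are hypothetical.

```python
import numpy as np

def seriate(adj):
    """Order nodes by leading-eigenvector component; return their degrees in that order."""
    adj = np.asarray(adj, dtype=float)
    # eigh returns eigenvalues in ascending order, so the last column
    # is the eigenvector of the largest eigenvalue (symmetric adjacency assumed).
    _, vecs = np.linalg.eigh(adj)
    leading = np.abs(vecs[:, -1])  # abs removes the arbitrary sign
    order = np.argsort(-leading, kind="stable")
    degrees = adj.sum(axis=1).astype(int)
    return [int(degrees[k]) for k in order]

def edit_distance(s, t):
    """Levenshtein distance with unit insertion, deletion and substitution costs."""
    prev = list(range(len(t) + 1))
    for i, x in enumerate(s, start=1):
        cur = [i]
        for j, y in enumerate(t, start=1):
            cur.append(min(prev[j] + 1,              # delete x
                           cur[j - 1] + 1,           # insert y
                           prev[j - 1] + (x != y)))  # substitute or match
        prev = cur
    return prev[-1]

# Usage: a 3-node path graph versus a triangle.
path3 = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]
triangle = [[0, 1, 1], [1, 0, 1], [1, 1, 0]]
dist = edit_distance(seriate(path3), seriate(triangle))
```

For disconnected graphs the largest eigenvalue may be repeated, in which case the ordering produced by this sketch is not unique.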
Automatic Synchronization of Multi-User Photo Galleries
In this paper we address the issue of photo gallery synchronization, where
pictures related to the same event are collected by different users. Existing
solutions to the problem are usually based on unrealistic assumptions,
like time consistency across photo galleries, and often rely heavily on
heuristics, which limits their applicability to real-world scenarios. We
propose a solution that achieves better generalization performance for the
synchronization task compared to the available literature. The method is
characterized by three stages: first, deep convolutional neural network
features are used to assess the visual similarity among the photos; then, pairs
of similar photos are detected across different galleries and used to construct
a graph; finally, a probabilistic graphical model is used to estimate the
temporal offset of each pair of galleries, by traversing the minimum spanning
tree extracted from this graph. The experimental evaluation is conducted on
four publicly available datasets covering different types of events,
demonstrating the strength of our proposed method. A thorough discussion of the
obtained results is provided for a critical assessment of the quality in
synchronization. Comment: ACCEPTED to IEEE Transactions on Multimedi
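The last stage described in this abstract, traversing a minimum spanning tree to obtain per-gallery offsets, can be illustrated with a small sketch. It assumes pairwise offset estimates and edge costs are already available and simply accumulates offsets along a Prim-style tree; the paper's probabilistic graphical model is not reproduced. The function name `propagate_offsets` and the edge tuple layout are hypothetical.

```python
import heapq

def propagate_offsets(n, edges, root=0):
    """Accumulate clock offsets along a minimum spanning tree grown from root.

    edges: list of (cost, u, v, offset), where offset is the assumed estimate
    of (clock of gallery v) - (clock of gallery u). Lower cost = more trusted.
    Returns {gallery: offset relative to root} for galleries reachable from root.
    """
    adj = {k: [] for k in range(n)}
    for cost, u, v, off in edges:
        adj[u].append((cost, v, off))
        adj[v].append((cost, u, -off))  # reversed edge has the opposite offset
    offsets = {root: 0.0}
    heap = [(c, root, v, off) for c, v, off in adj[root]]
    heapq.heapify(heap)
    while heap:
        cost, u, v, off = heapq.heappop(heap)
        if v in offsets:
            continue  # already attached through a cheaper edge
        offsets[v] = offsets[u] + off
        for c, w, o in adj[v]:
            if w not in offsets:
                heapq.heappush(heap, (c, v, w, o))
    return offsets

# Usage: three galleries; the costly, inconsistent 0-2 edge is ignored.
edges = [(1.0, 0, 1, 5.0), (1.0, 1, 2, -2.0), (10.0, 0, 2, 100.0)]
result = propagate_offsets(3, edges)
```

Growing the tree from the cheapest edges means an unreliable pairwise estimate only affects the result when no better route to that gallery exists.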