Energy-based Neural Modelling for Large-Scale Multiple Domain Dialogue State Tracking
Scaling up dialogue state tracking to multiple domains is challenging due to the growth in the number of variables being tracked. Furthermore, dialogue state tracking models do not yet explicitly make use of relationships between dialogue variables, such as slots across domains. We propose using energy-based structured prediction methods for the large-scale dialogue state tracking task on two multi-domain dialogue datasets. Our results indicate that: (i) modelling variable dependencies yields better results; and (ii) the structured prediction output aligns with the dialogue slot-value constraint principles. This leads to promising directions for improving state-of-the-art models by incorporating variable dependencies into their prediction process.
Tracking by Prediction: A Deep Generative Model for Multi-Person localisation and Tracking
Current multi-person localisation and tracking systems have an over-reliance
on the use of appearance models for target re-identification and almost no
approaches employ a complete deep learning solution for both objectives. We
present a novel, complete deep learning framework for multi-person localisation
and tracking. In this context we first introduce a lightweight sequential
Generative Adversarial Network architecture for person localisation, which
overcomes issues related to occlusions and noisy detections, typically found in
a multi-person environment. In the proposed tracking framework we build upon
recent advances in pedestrian trajectory prediction approaches and propose a
novel data association scheme based on predicted trajectories. This removes the
need for computationally expensive person re-identification systems based on
appearance features and generates human-like trajectories with minimal
fragmentation. The proposed method is evaluated on multiple public benchmarks
including both static and dynamic cameras and achieves
outstanding performance, especially compared with other recently proposed deep neural
network based approaches. Comment: To appear in IEEE Winter Conference on Applications of Computer
Vision (WACV), 201
Hierarchically Structured Non-Intrusive Sign Language Recognition
This work presents a hierarchically structured approach to the non-intrusive recognition of sign language from a monocular frontal view. Robustness is achieved through sophisticated localization and tracking methods, including a combined EM/CAMSHIFT overlap resolution procedure and the parallel pursuit of multiple hypotheses about hand position and movement. This allows ambiguities to be handled and tracking errors to be corrected automatically. A biomechanical skeleton model and dynamic motion prediction using Kalman filters represent high-level knowledge. Classification is performed by Hidden Markov Models. 152 signs from German sign language were recognized with an accuracy of 97.6%.
Context-aware multi-head self-attentional neural network model for next location prediction
Accurate activity location prediction is a crucial component of many mobility
applications and is particularly required to develop personalized, sustainable
transportation systems. Despite the widespread adoption of deep learning
models, next location prediction models lack a comprehensive discussion and
integration of mobility-related spatio-temporal contexts. Here, we utilize a
multi-head self-attentional (MHSA) neural network that learns location
transition patterns from historical location visits, their visit time and
activity duration, as well as their surrounding land use functions, to infer an
individual's next location. Specifically, we adopt point-of-interest data and
latent Dirichlet allocation for representing locations' land use contexts at
multiple spatial scales, generate embedding vectors of the spatio-temporal
features, and learn to predict the next location with an MHSA network. Through
experiments on two large-scale GNSS tracking datasets, we demonstrate that the
proposed model outperforms other state-of-the-art prediction models, and reveal
the contribution of various spatio-temporal contexts to the model's
performance. Moreover, we find that the model trained on population data
achieves higher prediction performance with fewer parameters than
individual-level models due to learning from collective movement patterns. We
also reveal that mobility conducted in the recent past and one week before has the
largest influence on the current prediction, showing that learning from a
subset of the historical mobility is sufficient to obtain an accurate location
prediction result. We believe that the proposed model is vital for
context-aware mobility prediction. The gained insights will help to understand
location prediction models and promote their implementation for mobility
applications. Comment: updated Discussion section; accepted by Transportation Research Part
Patterns of Text Readability in Human and Predicted Eye Movements
It has been shown that multilingual transformer models are able to predict human reading behavior when fine-tuned on small amounts of eye tracking data. As the cumulated prediction results do not provide insights into the linguistic cues that the model acquires to predict reading behavior, we conduct a deeper analysis of the predictions from the perspective of readability. We try to disentangle the three-fold relationship between human eye movements, the capability of language models to predict these eye movement patterns, and sentence-level readability measures for English. We compare a range of model configurations to multiple baselines. We show that the models exhibit difficulties with function words and that pre-training only provides limited advantages for linguistic generalization.
Extended Object Tracking: Introduction, Overview and Applications
This article provides an elaborate overview of current research in extended
object tracking. We provide a clear definition of the extended object tracking
problem and discuss its delimitation from other types of object tracking. Next,
different aspects of extended object modelling are extensively discussed.
Subsequently, we give a tutorial introduction to two basic and widely used
extended object tracking approaches - the random matrix approach and the Kalman
filter-based approach for star-convex shapes. The next part treats the tracking
of multiple extended objects and elaborates how the large number of feasible
association hypotheses can be tackled using both Random Finite Set (RFS) and
Non-RFS multi-object trackers. The article concludes with a summary of current
applications, where four example applications involving camera, X-band radar,
light detection and ranging (lidar), and red-green-blue-depth (RGB-D) sensors are
highlighted. Comment: 30 pages, 19 figures
Multi-Object Tracking with Interacting Vehicles and Road Map Information
In many applications, tracking of multiple objects is crucial for a
perception of the current environment. Most of the present multi-object
tracking algorithms assume that objects move independently of other
dynamic objects as well as the static environment. Since in many traffic
situations objects interact with each other and in addition there are
restrictions due to drivable areas, the assumption of an independent object
motion is not fulfilled. This paper proposes an approach that adapts a
multi-object tracking system to account for interaction between vehicles and for the
current road geometry. To this end, the prediction step of a Labeled
Multi-Bernoulli filter is extended to facilitate modeling interaction between
objects using the Intelligent Driver Model. Furthermore, to consider road map
information, an approximation of a highly precise road map is used. The results
show that in scenarios where the assumption of a standard motion model is
violated, the tracking system adapted with the proposed method achieves higher
accuracy and robustness in its track estimations.