Interpretable 3D Human Action Analysis with Temporal Convolutional Networks
The discriminative power of modern deep learning models for 3D human action
recognition continues to grow. In conjunction with the recent
resurgence of 3D human action representation with 3D skeletons, the quality and
the pace of recent progress have been significant. However, the inner workings
of state-of-the-art learning based methods in 3D human action recognition still
remain mostly black-box. In this work, we propose to use a new class of models
known as Temporal Convolutional Neural Networks (TCN) for 3D human action
recognition. Compared to popular LSTM-based Recurrent Neural Network models,
given interpretable input such as 3D skeletons, TCN provides us a way to
explicitly learn readily interpretable spatio-temporal representations for 3D
human action recognition. We describe our strategy for re-designing the TCN with
interpretability in mind and how such characteristics of the model are leveraged
to construct a powerful 3D activity recognition method. Through this work, we
wish to take a step towards a spatio-temporal model that is easier to
understand, explain and interpret. The resulting model, Res-TCN, achieves
state-of-the-art results on the largest 3D human action recognition dataset,
NTU-RGBD.
Comment: 8 pages, 5 figures, BNMW CVPR 2017 Submission
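To make the idea of a temporal convolution with a residual connection over skeleton frames concrete, here is a minimal sketch in plain Python. It is not the Res-TCN architecture from the paper: the names `temporal_conv` and `residual_block` are hypothetical, a single filter is shared across all joint coordinates, and zero padding is assumed at the sequence boundaries.

```python
from typing import List

Frame = List[float]  # one skeleton frame: flattened joint coordinates


def temporal_conv(frames: List[Frame], kernel: List[float]) -> List[Frame]:
    """Slide one 1D filter along the time axis of every feature.

    Zero padding keeps the output the same length as the input
    (an assumption made for this illustration).
    """
    half = len(kernel) // 2
    n_frames, n_feats = len(frames), len(frames[0])
    out = []
    for t in range(n_frames):
        row = []
        for f in range(n_feats):
            acc = 0.0
            for k, w in enumerate(kernel):
                src = t + k - half  # frame index this filter tap reads from
                if 0 <= src < n_frames:
                    acc += w * frames[src][f]
            row.append(acc)
        out.append(row)
    return out


def residual_block(frames: List[Frame], kernel: List[float]) -> List[Frame]:
    """Compute ReLU(conv(x)) + x, i.e. an identity shortcut around the convolution."""
    conv = temporal_conv(frames, kernel)
    return [
        [max(0.0, c) + x for c, x in zip(conv_row, in_row)]
        for conv_row, in_row in zip(conv, frames)
    ]
```

Because each filter tap reads from a known frame offset and each feature is a known joint coordinate, the response of such a filter can be traced back to specific joints and time steps, which is the kind of interpretability the abstract refers to.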
Lifelong Learning of Spatiotemporal Representations with Dual-Memory Recurrent Self-Organization
Artificial autonomous agents and robots interacting in complex environments
are required to continually acquire and fine-tune knowledge over sustained
periods of time. The ability to learn from continuous streams of information is
referred to as lifelong learning and represents a long-standing challenge for
neural network models due to catastrophic forgetting. Computational models of
lifelong learning typically alleviate catastrophic forgetting in experimental
scenarios with given datasets of static images and limited complexity, thereby
differing significantly from the conditions artificial agents are exposed to.
In more natural settings, sequential information may become progressively
available over time and access to previous experience may be restricted. In
this paper, we propose a dual-memory self-organizing architecture for lifelong
learning scenarios. The architecture comprises two growing recurrent networks
with the complementary tasks of learning object instances (episodic memory) and
categories (semantic memory). Both growing networks can expand in response to
novel sensory experience: the episodic memory learns fine-grained
spatiotemporal representations of object instances in an unsupervised fashion
while the semantic memory uses task-relevant signals to regulate structural
plasticity levels and develop more compact representations from episodic
experience. For the consolidation of knowledge in the absence of external
sensory input, the episodic memory periodically replays trajectories of neural
reactivations. We evaluate the proposed model on the CORe50 benchmark dataset
for continuous object recognition, showing that we significantly outperform
current methods of lifelong learning in three different incremental learning
scenarios.
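The notion of a network that "expands in response to novel sensory experience" can be illustrated with a generic grow-or-adapt rule. The sketch below is an assumption-laden simplification, not the dual-memory model from the paper: it omits recurrence, replay and the semantic memory, and the function name `grow_or_adapt` and its parameters `novelty_threshold` and `learning_rate` are hypothetical.

```python
import math
from typing import List, Sequence


def grow_or_adapt(
    nodes: List[List[float]],
    x: Sequence[float],
    novelty_threshold: float = 1.0,
    learning_rate: float = 0.5,
) -> int:
    """Insert a new node for a novel input, otherwise adapt the closest node.

    Returns the index of the node that now represents `x`.
    """
    if not nodes:
        nodes.append(list(x))
        return 0
    # Best-matching unit: the stored prototype closest to the input.
    best = min(range(len(nodes)), key=lambda i: math.dist(nodes[i], x))
    if math.dist(nodes[best], x) > novelty_threshold:
        # Input is far from everything seen so far: grow the network.
        nodes.append(list(x))
        return len(nodes) - 1
    # Input is familiar: move the best-matching prototype towards it.
    nodes[best] = [w + learning_rate * (xi - w) for w, xi in zip(nodes[best], x)]
    return best
```

Called on a stream of inputs, the node list grows only when an input lies further than `novelty_threshold` from every existing prototype, so familiar inputs refine existing nodes instead of adding new ones.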
Nature-Inspired Learning Models
Intelligent learning mechanisms found in the natural world are still unsurpassed in their learning performance and efficiency in dealing with uncertain information arriving in a variety of forms, yet they remain under continuous challenge
from human-driven artificial intelligence methods. This work intends to demonstrate how phenomena observed in the physical world can be directly used to guide artificial learning models. Inspiration for the new
learning methods has been found in the mechanics of physical fields at both micro and macro scales.
Exploiting the analogies between data and particles subjected to gravity, electrostatic and gas particle fields, new algorithms have been developed and applied to classification and clustering, while the properties of the
field are further reused in regression and in the visualisation of classification and classifier fusion. The paper includes extensive pictorial examples and visual interpretations of the presented techniques, along with some testing on
well-known real and artificial datasets, compared where possible with traditional methods.
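One simple way to read the data-as-particles analogy is to treat every training sample as a unit mass and assign a query point to the class that exerts the largest total potential on it. The sketch below assumes a 1/distance potential and unit masses; the function name `field_classify` and the `eps` parameter are hypothetical, and the paper's actual field definitions may differ.

```python
import math
from collections import defaultdict
from typing import Dict, Hashable, List, Sequence, Tuple

Sample = Tuple[Sequence[float], Hashable]  # (feature vector, class label)


def field_classify(train: List[Sample], query: Sequence[float], eps: float = 1e-9) -> Hashable:
    """Return the label whose samples generate the strongest potential at `query`.

    Each training sample acts as a unit mass with potential 1 / distance.
    `eps` avoids division by zero when the query coincides with a sample.
    """
    potential: Dict[Hashable, float] = defaultdict(float)
    for point, label in train:
        potential[label] += 1.0 / (math.dist(point, query) + eps)
    return max(potential, key=potential.get)


train = [((0.0, 0.0), "a"), ((0.0, 1.0), "a"), ((5.0, 5.0), "b")]
print(field_classify(train, (0.2, 0.2)))
print(field_classify(train, (4.9, 5.0)))
```

Under these assumptions a nearby sample dominates the sum, while distant samples still contribute a small amount, so the decision blends nearest-neighbour behaviour with a class-density effect.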
Iterative Temporal Learning and Prediction with the Sparse Online Echo State Gaussian Process
In this work, we contribute the online echo state Gaussian process (OESGP), a novel Bayesian-based online method that is capable of iteratively learning complex temporal dynamics and producing predictive distributions (instead of point predictions). Our method can be seen as a combination of the echo state network with a sparse approximation of Gaussian processes (GPs). Extensive experiments on the one-step prediction task on well-known benchmark problems show that OESGP produced statistically superior results to current online ESNs and state-of-the-art regression methods. In addition, we characterise the benefits (and drawbacks) associated with the considered online methods, specifically with regard to the trade-off between computational cost and accuracy. For a high-dimensional action recognition task, we demonstrate that OESGP produces high accuracies comparable to a recently published graphical model, while being fast enough for real-time interactive scenarios.
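The echo state network half of this combination can be sketched as a fixed random recurrent layer whose state is updated at every time step; only a readout on top of that state is trained. The example below shows the reservoir update alone, under stated assumptions: the class name `Reservoir`, the weight ranges and the `scale` parameter are hypothetical, and the sparse Gaussian process readout that OESGP places on the state is not implemented here.

```python
import math
import random
from typing import List, Sequence


class Reservoir:
    """Fixed random recurrent layer: state <- tanh(W_in * u + W * state)."""

    def __init__(self, n_inputs: int, n_units: int, scale: float = 0.1, seed: int = 0):
        rng = random.Random(seed)  # seeded so that runs are reproducible
        # Input and recurrent weights are drawn once and never trained.
        self.w_in = [[rng.uniform(-1.0, 1.0) for _ in range(n_inputs)] for _ in range(n_units)]
        # Small recurrent weights (an assumption) so that old inputs fade out.
        self.w = [[rng.uniform(-scale, scale) for _ in range(n_units)] for _ in range(n_units)]
        self.state: List[float] = [0.0] * n_units

    def step(self, u: Sequence[float]) -> List[float]:
        """Advance the reservoir by one time step and return the new state."""
        new_state = []
        for w_in_row, w_row in zip(self.w_in, self.w):
            drive = sum(a * b for a, b in zip(w_in_row, u))
            drive += sum(a * b for a, b in zip(w_row, self.state))
            new_state.append(math.tanh(drive))
        self.state = new_state
        return new_state
```

In an online setting, each call to `step` yields a feature vector summarising the recent input history, and a regression model fitted on those vectors produces the prediction for the next step.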