Mining Heterogeneous Multivariate Time-Series for Learning Meaningful Patterns: Application to Home Health Telecare
In recent years, time-series mining has become a challenging issue for
researchers. An important application lies in monitoring, which requires
analyzing large sets of time-series to learn usual patterns. Any deviation
from this learned profile is then considered an unexpected situation.
Moreover, complex applications may involve the temporal study of several
heterogeneous parameters. In this paper, we propose a method for mining
heterogeneous multivariate time-series to learn meaningful patterns. The
proposed approach handles mixed time-series -- containing both pattern and
non-pattern data -- as well as imprecise matches, outliers, and the
stretching and global translation of pattern instances in time. We present
the early results of our approach in the context of monitoring the health
status of a person at home. The purpose is to build a behavioral profile of
a person by analyzing the time variations of several quantitative or
qualitative parameters recorded by a set of sensors installed in the home.
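The kind of matching described here -- tolerating imprecise matches, stretching, and global translation of pattern instances in time -- can be sketched with a sliding window scored by dynamic time warping (DTW). This is only an illustrative sketch, not the paper's actual algorithm; the function names and the threshold are assumptions.

```python
import numpy as np

def dtw_distance(a: np.ndarray, b: np.ndarray) -> float:
    """DTW between two multivariate sequences (rows = time steps).
    Warping absorbs local stretching and imprecise alignment."""
    n, m = len(a), len(b)
    cost = np.full((n + 1, m + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = np.linalg.norm(a[i - 1] - b[j - 1])  # per-step distance
            cost[i, j] = d + min(cost[i - 1, j],      # stretch a
                                 cost[i, j - 1],      # stretch b
                                 cost[i - 1, j - 1])  # match
    return cost[n, m]

def find_pattern(series: np.ndarray, pattern: np.ndarray, threshold: float):
    """Slide the pattern over the series (global translation in time)
    and report offsets whose DTW distance falls below the threshold."""
    w = len(pattern)
    hits = []
    for start in range(len(series) - w + 1):
        if dtw_distance(series[start:start + w], pattern) < threshold:
            hits.append(start)
    return hits
```

A deviation from the learned profile would then correspond to a stretch of the series in which no learned pattern scores below its threshold.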
Clustering with phylogenetic tools in astrophysics
Phylogenetic approaches are finding more and more applications outside the
field of biology. Astrophysics is no exception since an overwhelming amount of
multivariate data has appeared in the last twenty years or so. In particular,
the diversification of galaxies throughout the evolution of the Universe quite
naturally invokes phylogenetic approaches. We have demonstrated that Maximum
Parsimony brings useful astrophysical results, and we now proceed toward the
analyses of large datasets for galaxies. In this talk I present how we solve
the major difficulties for this goal: the choice of the parameters, their
discretization, and the analysis of a high number of objects with an
unsupervised NP-hard classification technique like cladistics.
1. Introduction
How do galaxies form, and when? How did galaxies evolve and transform
themselves to create the diversity we observe? What are the progenitors of
present-day galaxies? To answer these big questions, observations throughout
the Universe and physical modelling are obvious tools. But between
these, there is a key process, without which it would be impossible to extract
some digestible information from the complexity of these systems. This is
classification. One century ago, galaxies were discovered by Hubble. From
images obtained in the visible range of wavelengths, he synthesised his
observations through the usual process: classification. With only one parameter
(the shape), qualitative and determined by eye, he found four
categories: ellipticals, spirals, barred spirals and irregulars. This is the
famous Hubble classification. He later hypothesised relationships between these
classes, building the Hubble Tuning Fork. The Hubble classification has been
refined, notably by de Vaucouleurs, and is still used as the only global
classification of galaxies. Even though the physical relationships proposed by
Hubble are not retained any more, the Hubble Tuning Fork is nearly always used
to represent the classification of the galaxy diversity under its new name the
Hubble sequence (e.g. Delgado-Serrano, 2012). Its success is impressive and can
be understood by its simplicity, even its beauty, and by the many correlations
found between the morphology of galaxies and their other properties. And one
must admit that there is no alternative up to now, even though both the Hubble
classification and diagram have been recognised to be unsatisfactory. Among the
most obvious flaws of this classification, one must mention its monovariate,
qualitative, subjective and old-fashioned nature, as well as the difficulty of
characterising the morphology of distant galaxies. The first two most significant
multivariate studies were by Watanabe et al. (1985) and Whitmore (1984). Since
2005, the number of studies attempting to go beyond the Hubble
classification has increased considerably. Why, despite this, are the Hubble
classification and its sequence still alive, with no alternative having yet
emerged (Sandage, 2005)? My feeling is that the results of the multivariate
analyses are not easily integrated into a century-old practice of modeling
the observations. In addition, extragalactic objects like galaxies, stellar
clusters or stars do evolve. Astronomy now provides data on very distant
objects, raising the question of the relationships between those and our
present day nearby galaxies. Clearly, this is a phylogenetic problem.
Astrocladistics aims at exploring the use of phylogenetic tools in
astrophysics (Fraix-Burnet et al., 2006a,b). We have proved that Maximum
Parsimony (or cladistics) can be applied in astrophysics and provides a new
exploration tool of the data (Fraix-Burnet et al., 2009, 2012, Cardone \&
Fraix-Burnet, 2013). As far as the classification of galaxies is concerned, a
larger number of objects must now be analysed.
Comment: Proceedings of the 60th World Statistics Congress of the
International Statistical Institute, ISI2015, Jul 2015, Rio de Janeiro, Brazil
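The two ingredients highlighted above -- discretizing the parameters and scoring objects with a parsimony criterion -- can be made concrete on a toy example. The Fitch small-parsimony count below scores one fixed tree for one discretized character; cladistic analyses search over trees and sum over many characters, which is what makes the problem NP-hard. All names and bin choices here are illustrative, not the astrocladistics pipeline itself.

```python
import numpy as np

def discretize(values, n_bins=4):
    """Map a continuous parameter onto integer character states
    using quantile bin edges (an illustrative discretization)."""
    edges = np.quantile(values, np.linspace(0, 1, n_bins + 1)[1:-1])
    return np.digitize(values, edges)

def fitch_score(tree, states):
    """Minimum number of state changes on a rooted binary tree.
    `tree` is a nested tuple of leaf names; `states` maps leaf -> state."""
    def post(node):
        if isinstance(node, str):                 # leaf: its observed state
            return {states[node]}, 0
        (sl, cl), (sr, cr) = post(node[0]), post(node[1])
        inter = sl & sr
        if inter:                                 # children agree: no change
            return inter, cl + cr
        return sl | sr, cl + cr + 1               # one change on this edge
    return post(tree)[1]
```

Maximum Parsimony then amounts to preferring the tree with the lowest total score over all characters, i.e. the fewest evolutionary changes.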
Joint segmentation of multivariate time series with hidden process regression for human activity recognition
The problem of human activity recognition is central to understanding and
predicting human behavior, in particular with a view to assistive services
such as health monitoring, well-being, and security. There is therefore a
growing need to build accurate models that take into account the variability
of human activities over time (dynamic models), rather than static ones,
which have limitations in such a dynamic context. In this paper, the problem
of activity recognition is analyzed through
the segmentation of the multidimensional time series of the acceleration data
measured in the 3-d space using body-worn accelerometers. The proposed model
for automatic temporal segmentation is a specific statistical latent process
model which assumes that the observed acceleration sequence is governed by a
sequence of hidden (unobserved) activities. More specifically, the proposed
approach is based on a specific multiple regression model incorporating a
hidden discrete logistic process which governs the switching from one activity
to another over time. The model is learned in an unsupervised context by
maximizing the observed-data log-likelihood via a dedicated
expectation-maximization (EM) algorithm. We applied it on a real-world
automatic human activity recognition problem and its performance was assessed
by performing comparisons with alternative approaches, including well-known
supervised static classifiers and the standard hidden Markov model (HMM). The
obtained results are very encouraging and show that the proposed approach is
quite competitive even though it works in an entirely unsupervised way and does
not require a feature-extraction preprocessing step.
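A minimal sketch of the latent-process idea, under illustrative assumptions: each activity k has its own regression of the signal on time, and a logistic function of time governs the switching between activities. Shown here are the gating probabilities and one E-step (responsibilities); the full EM, including the inner update of the logistic parameters, is omitted, and all function names are hypothetical.

```python
import numpy as np

def gating(t, W):
    """pi[t, k] = softmax_k(W[k, 0] + W[k, 1] * t): time-dependent
    probability of each hidden activity (the logistic process)."""
    logits = W[:, 0] + np.outer(t, W[:, 1])
    logits -= logits.max(axis=1, keepdims=True)   # numerical stability
    p = np.exp(logits)
    return p / p.sum(axis=1, keepdims=True)

def e_step(y, t, W, betas, sigma2):
    """Responsibilities gamma[t, k] proportional to
    pi[t, k] * N(y_t | beta_k . [1, t], sigma2)."""
    X = np.column_stack([np.ones_like(t), t])
    mu = X @ betas.T                              # (T, K) per-activity means
    loglik = -0.5 * (y[:, None] - mu) ** 2 / sigma2
    logpost = np.log(gating(t, W)) + loglik
    logpost -= logpost.max(axis=1, keepdims=True)
    g = np.exp(logpost)
    return g / g.sum(axis=1, keepdims=True)
```

Segmenting the series then reduces to assigning each time step to the activity with the highest responsibility.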
Integer Echo State Networks: Hyperdimensional Reservoir Computing
We propose an approximation of Echo State Networks (ESN) that can be
efficiently implemented on digital hardware based on the mathematics of
hyperdimensional computing. The reservoir of the proposed Integer Echo State
Network (intESN) is a vector containing only n-bit integers (where n<8 is
normally sufficient for a satisfactory performance). The recurrent matrix
multiplication is replaced with an efficient cyclic shift operation. The intESN
architecture is verified with typical tasks in reservoir computing: memorizing
a sequence of inputs; classifying time-series; learning dynamic processes.
Such an architecture results in dramatic improvements in memory footprint and
computational efficiency, with minimal performance loss.
Comment: 10 pages, 10 figures, 1 table
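The reservoir update described above is concrete enough to sketch directly: the recurrent matrix multiply is replaced by a cyclic shift, and the state is kept as small integers by clipping to n bits. Encoding input symbols as random bipolar vectors (an item memory) is an illustrative hyperdimensional-computing choice; the function names are assumptions.

```python
import numpy as np

def clip_nbit(x, n=4):
    """Saturate state entries to the signed n-bit range."""
    lim = 2 ** (n - 1) - 1
    return np.clip(x, -lim, lim)

def intesn_step(state, inp_vec, n=4):
    """One reservoir update: cyclic shift + integer input, then clip."""
    return clip_nbit(np.roll(state, 1) + inp_vec, n)

rng = np.random.default_rng(0)
dim = 1000
# illustrative item memory: each input symbol -> random bipolar vector
item_memory = {s: rng.choice([-1, 1], size=dim) for s in "abc"}

state = np.zeros(dim, dtype=int)
for symbol in "abcab":
    state = intesn_step(state, item_memory[symbol])
```

Because each step only shifts, adds, and clips small integers, the state stays in n bits per element, which is where the memory and compute savings come from; a recent input can still be recovered by correlating the state with the (appropriately shifted) item-memory vectors.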
BRUNO: A Deep Recurrent Model for Exchangeable Data
We present a novel model architecture which leverages deep learning tools to
perform exact Bayesian inference on sets of high dimensional, complex
observations. Our model is provably exchangeable, meaning that the joint
distribution over observations is invariant under permutation: this property
lies at the heart of Bayesian inference. The model does not require variational
approximations to train, and new samples can be generated conditional on
previous samples, with cost linear in the size of the conditioning set. The
advantages of our architecture are demonstrated on learning tasks that require
generalisation from short observed sequences while modelling sequence
variability, such as conditional image generation, few-shot learning, and
anomaly detection.
Comment: NIPS 201
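The exchangeability property at the heart of the model can be made concrete with a toy example: if observations are i.i.d. N(theta, 1) given a latent theta ~ N(0, 1), the marginal joint is an equicorrelated Gaussian whose density is invariant under permutation of the observations. BRUNO builds this kind of invariance into a deep model; the numeric check below only illustrates the property itself.

```python
import numpy as np

def joint_logpdf(x):
    """Log-density of x under the exchangeable marginal N(0, I + 11^T):
    each entry has variance 2 and every pair has covariance 1."""
    n = len(x)
    cov = np.eye(n) + np.ones((n, n))
    sign, logdet = np.linalg.slogdet(cov)
    quad = x @ np.linalg.solve(cov, x)
    return -0.5 * (n * np.log(2 * np.pi) + logdet + quad)

x = np.array([0.3, -1.2, 2.0, 0.7])
perm = np.array([2, 0, 3, 1])
# permuting the observations leaves the joint density unchanged
invariant = np.isclose(joint_logpdf(x), joint_logpdf(x[perm]))
```

The same invariance is what licenses treating a conditioning set as an unordered collection when generating new samples.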
Continuum directions for supervised dimension reduction
Dimension reduction of multivariate data supervised by auxiliary information
is considered. A series of bases for dimension reduction is obtained as
minimizers of a novel criterion. The proposed method is akin to continuum
regression, and the resulting bases are called continuum directions. In the
presence of binary supervision data, these directions continuously bridge the
principal component, mean difference and linear discriminant directions, thus
ranging from unsupervised to fully supervised dimension reduction.
High-dimensional asymptotic studies of continuum directions for binary
supervision reveal several interesting facts. The conditions under which the
sample continuum directions are inconsistent, but their classification
performance is good, are specified. While the proposed method can be directly
used for binary and multi-category classification, its generalizations to
incorporate any form of auxiliary data are also presented. The proposed method
enjoys fast computation, and its performance is better than or on par with more
computationally intensive alternatives.
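The three limiting directions that the continuum is said to bridge for binary supervision can be computed directly: the first principal component (unsupervised), the mean-difference direction, and the Fisher/LDA direction (fully supervised). The interpolating criterion itself is the paper's contribution and is not reproduced here; this sketch only shows the endpoints.

```python
import numpy as np

def limiting_directions(X, y):
    """Return unit-norm PC1, mean-difference, and Fisher/LDA directions
    for a binary-labeled data matrix X (rows = samples)."""
    Xc = X - X.mean(axis=0)
    # first principal component: top right singular vector of centered data
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    pc1 = Vt[0]
    # difference of the two class means
    md = X[y == 1].mean(axis=0) - X[y == 0].mean(axis=0)
    # Fisher direction: within-class covariance inverse times mean difference
    Sw = np.cov(X[y == 0], rowvar=False) + np.cov(X[y == 1], rowvar=False)
    lda = np.linalg.solve(Sw, md)
    unit = lambda v: v / np.linalg.norm(v)
    return unit(pc1), unit(md), unit(lda)
```

A continuum direction then interpolates between these as its tuning parameter moves from the unsupervised to the fully supervised end.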
Network Uncertainty Informed Semantic Feature Selection for Visual SLAM
In order to facilitate long-term localization using a visual simultaneous
localization and mapping (SLAM) algorithm, careful feature selection can help
ensure that reference points persist over long durations and the runtime and
storage complexity of the algorithm remain consistent. We present SIVO
(Semantically Informed Visual Odometry and Mapping), a novel
information-theoretic feature selection method for visual SLAM which
incorporates semantic segmentation and neural network uncertainty into the
feature selection pipeline. Our algorithm selects points that provide the
highest reduction in Shannon entropy: the difference between the entropy of
the current state and the joint entropy of the state given the addition of
the new feature, combined with the classification entropy of the feature from
a Bayesian neural network. Each selected feature significantly reduces the
uncertainty of the vehicle state and has repeatedly been detected, with high
confidence, as a static object (building, traffic sign, etc.). This selection
strategy generates a sparse
map which can facilitate long-term localization. The KITTI odometry dataset is
used to evaluate our method, and we also compare our results against ORB_SLAM2.
Overall, SIVO performs comparably to the baseline method while reducing the map
size by almost 70%.Comment: Published in: 2019 16th Conference on Computer and Robot Vision (CRV
