92 research outputs found
Joint segmentation of multivariate time series with hidden process regression for human activity recognition
The problem of human activity recognition is central for understanding and
predicting human behavior, in particular from the perspective of assistive
services to humans, such as health monitoring, well-being, security, etc. There
is therefore a growing need to build accurate models which can take into
account the variability of the human activities over time (dynamic models)
rather than static ones which can have some limitations in such a dynamic
context. In this paper, the problem of activity recognition is analyzed through
the segmentation of the multidimensional time series of the acceleration data
measured in the 3-d space using body-worn accelerometers. The proposed model
for automatic temporal segmentation is a specific statistical latent process
model which assumes that the observed acceleration sequence is governed by
a sequence of hidden (unobserved) activities. More specifically, the proposed
approach is based on a specific multiple regression model incorporating a
hidden discrete logistic process which governs the switching from one activity
to another over time. The model is learned in an unsupervised context by
maximizing the observed-data log-likelihood via a dedicated
expectation-maximization (EM) algorithm. We applied it to a real-world
automatic human activity recognition problem and its performance was assessed
by performing comparisons with alternative approaches, including well-known
supervised static classifiers and the standard hidden Markov model (HMM). The
obtained results are very encouraging and show that the proposed approach is
quite competitive even though it works in an entirely unsupervised way and does
not require a feature extraction preprocessing step.
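As a reading aid, the model described in this abstract can be written out as follows. This is a sketch under stated assumptions, not the paper's exact formulation: the polynomial regressors x_t, the Gaussian noise, the linear logistic argument, and the symbols B_k, Sigma_k and w_k are choices made here for illustration.

```latex
% Assumed notation: y_t is the 3-d acceleration at time t, z_t in {1..K} the hidden activity,
% x_t = (1, t, ..., t^p)^T a polynomial regressor (assumption).
\begin{align}
  y_t &= B_{z_t}^{\top} x_t + e_t, \qquad e_t \sim \mathcal{N}(0, \Sigma_{z_t}), \\
  \pi_k(t; w) &= P(z_t = k \mid t; w)
      = \frac{\exp(w_{k0} + w_{k1} t)}{\sum_{l=1}^{K} \exp(w_{l0} + w_{l1} t)}, \\
  \log L(\theta) &= \sum_{t=1}^{n} \log \sum_{k=1}^{K}
      \pi_k(t; w)\, \mathcal{N}\!\left(y_t;\, B_k^{\top} x_t,\, \Sigma_k\right).
\end{align}
```

Under these assumptions, EM alternates between computing the posterior probability of each activity at each time step and updating B_k, Sigma_k and w to increase the observed-data log-likelihood in the last line.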
An Unsupervised Approach for Automatic Activity Recognition based on Hidden Markov Model Regression
Using supervised machine learning approaches to recognize human activities
from on-body wearable accelerometers generally requires a large amount of
labelled data. When ground truth information is not available, too expensive,
time consuming or difficult to collect, one has to rely on unsupervised
approaches. This paper presents a new unsupervised approach for human activity
recognition from raw acceleration data measured using inertial wearable
sensors. The proposed method is based upon joint segmentation of
multidimensional time series using a Hidden Markov Model (HMM) in a multiple
regression context. The model is learned in an unsupervised framework using the
Expectation-Maximization (EM) algorithm where no activity labels are needed.
The proposed method takes into account the sequential appearance of the data.
It is therefore well suited to detecting activities accurately from temporal
acceleration data. It allows both segmentation and classification of the human
activities. Experimental results are provided to demonstrate the efficiency of
the proposed approach with respect to standard supervised and unsupervised
classification approaches.
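The segmentation step of an HMM-based approach can be illustrated with a small Viterbi decoder. This is a minimal sketch, not the paper's method: it assumes a scalar signal, constant per-state means (a degree-0 regression), a shared variance, and hand-set parameters, whereas the abstract describes multidimensional regression models learned by EM. The names `viterbi`, `log_gauss` and `stay` are hypothetical.

```python
import math

def log_gauss(y, mu, var):
    """Log-density of a scalar Gaussian."""
    return -0.5 * (math.log(2 * math.pi * var) + (y - mu) ** 2 / var)

def viterbi(ys, means, var, stay=0.9):
    """Most likely state sequence for a K-state HMM (K >= 2) with Gaussian emissions.

    Assumptions for illustration: uniform initial distribution, and a transition
    matrix with probability `stay` on the diagonal, the rest spread evenly.
    """
    k = len(means)
    switch = (1.0 - stay) / (k - 1)
    log_a = [[math.log(stay if i == j else switch) for j in range(k)] for i in range(k)]
    score = [math.log(1.0 / k) + log_gauss(ys[0], m, var) for m in means]
    back = []  # back[t][j] = best predecessor of state j at step t + 1
    for y in ys[1:]:
        prev, ptr, score = score, [], []
        for j in range(k):
            best = max(range(k), key=lambda i: prev[i] + log_a[i][j])
            ptr.append(best)
            score.append(prev[best] + log_a[best][j] + log_gauss(y, means[j], var))
        back.append(ptr)
    # Backtrack from the best final state.
    state = max(range(k), key=lambda j: score[j])
    path = [state]
    for ptr in reversed(back):
        state = ptr[state]
        path.append(state)
    return path[::-1]

# Toy signal: low level, then high level, then low again (synthetic data).
ys = [0.1, -0.2, 0.0, 2.1, 1.9, 2.2, 0.1, -0.1]
path = viterbi(ys, means=[0.0, 2.0], var=0.25)
print(path)
```

Each run of identical states in `path` is one segment, and the state index acts as the activity label, which is how a single decoding pass yields both segmentation and classification.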
Noiseless Independent Factor Analysis with mixing constraints in a semi-supervised framework. Application to railway device fault diagnosis.
In Independent Factor Analysis (IFA), latent components (or sources) are recovered from only their linear observed mixtures. Both the mixing process and the source densities (that are assumed to be generated according to mixtures of Gaussians) are learned from observed data. This paper investigates the possibility of estimating the IFA model in its noiseless setting when two kinds of prior information are incorporated: constraints on the mixing process and partial knowledge of the cluster membership of some examples. Semi-supervised or partially supervised learning frameworks can thus be handled. These two proposals were initially motivated by a real-world application that concerns fault diagnosis of a railway device. Results from this application are provided to demonstrate the ability of our approach to enhance estimation accuracy and remove the indeterminacy commonly encountered in unsupervised IFA, such as source permutations.
Visualization tools for spatio-temporal time-series analysis with context awareness: Montreal subway case
TRANSITDATA 2019 - 5th International Workshop and Symposium, Paris, France, 08/07/2019 - 10/07/2019. Forecasting passenger demand is of great interest for public transport operators. Despite the important role that forecasting plays in understanding mobility demand, in-depth transport-oriented analysis of the forecasting results is often overlooked, since it raises some challenges. In this context, we developed two visualization tools with open-source frameworks that support the analysis of spatio-temporal time-series forecasts with context awareness. The first visualization tool makes it possible to analyze the forecasting results over a long period at all stations and to zoom in for more precise temporal detail. The other tool helps to better understand the passenger demand relations between the different stations of the transport network and enables a spatial analysis of the results. The analyzed time series correspond to forecasts of the number of passengers entering each station at a fine-grained temporal resolution (15-minute intervals) over one year, obtained with a well-known machine learning model, a Random Forest. In order to highlight the spatio-temporal specificity of the passenger demand, we computed and analyzed the residuals of a long-term forecast model that returns normal passenger demand. Here we show that both visualization tools reveal the stations and periods that are hard to predict and give insight into which contextual elements (weather, events in the city and incidents on the transport network) could impact the forecasting. Experiments are performed with real data provided by the transport authority of Montreal (Société de transport de Montreal, STM).
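The residual analysis mentioned in this abstract can be sketched as follows. The abstract does not specify the long-term model, so this example assumes a very simple stand-in: the "normal demand" for a time slot is the mean count observed at the same position of the week. The function names and the toy counts are hypothetical.

```python
from collections import defaultdict
from statistics import mean

def seasonal_baseline(counts, slots_per_week):
    """Mean count per slot-of-week position (assumed stand-in for 'normal demand')."""
    buckets = defaultdict(list)
    for t, c in enumerate(counts):
        buckets[t % slots_per_week].append(c)
    return {slot: mean(vals) for slot, vals in buckets.items()}

def residuals(counts, slots_per_week):
    """Observed count minus the baseline for the same slot-of-week position."""
    base = seasonal_baseline(counts, slots_per_week)
    return [c - base[t % slots_per_week] for t, c in enumerate(counts)]

# Synthetic series: three "weeks" of four slots, with a spike in the last week.
counts = [10, 20, 30, 40,
          10, 20, 30, 40,
          10, 20, 90, 40]
res = residuals(counts, slots_per_week=4)
print(res)
```

Large positive or negative residuals mark the slots where demand departs from the usual pattern. With real data, 15-minute intervals would give 672 slots per week.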
Semi-supervised feature extraction using independent factor analysis
Dimensionality reduction can be efficiently achieved by generative latent variable models such as probabilistic principal component analysis (PPCA) or independent component analysis (ICA), which aim to extract a reduced set of variables (latent variables) from the original ones. In most cases, these methods are learned within the unsupervised framework, where only unlabeled samples are used. In this paper we investigate the possibility of estimating the independent factor analysis (IFA) model, and thus projecting the original data onto a lower-dimensional space, when prior knowledge of the cluster membership of some training samples is incorporated. In the basic IFA model, latent variables are recovered only from their linear observed mixtures (original features). Both the mapping matrix (assumed to be linear) and the latent variable densities (that are assumed to be mutually independent and generated according to mixtures of Gaussians) are learned from observed data. We propose to learn this model within a semi-supervised framework where the likelihood of both labeled and unlabeled samples is maximized by a generalized expectation-maximization (GEM) algorithm. Experimental results on real data sets are provided to demonstrate the ability of our approach to find a low-dimensional manifold with good explanatory power.
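The criterion maximized in such a semi-supervised setting can be sketched for a generic K-component mixture. This is an illustrative simplification, not the IFA likelihood itself: it assumes a single mixture with proportions pi_k and component densities f_k, with labeled samples indexed by the set L and unlabeled samples by the set U (all notation introduced here).

```latex
% Assumed notation: z_i is the known cluster label of a labeled sample x_i.
\begin{equation}
  \log L(\theta) =
    \sum_{i \in \mathcal{L}} \log\!\big( \pi_{z_i}\, f_{z_i}(x_i; \theta_{z_i}) \big)
    + \sum_{i \in \mathcal{U}} \log \sum_{k=1}^{K} \pi_k\, f_k(x_i; \theta_k).
\end{equation}
```

Labeled samples contribute the density of their known component only, while unlabeled samples contribute the full mixture density. In an EM-type algorithm this amounts to fixing the posterior membership probabilities of labeled samples to 0 or 1 in the E-step.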
Representation Learning of public transport data. Application to event detection
5th International Workshop and Symposium TransitData 2019, Paris, France, 08/07/2019 - 10/07/2019. On the basis of data collected by counting sensors deployed on trains, this paper deals with forecasting passenger load in public transport, taking train operation into account. Providing passengers with train load forecasts, in addition to the expected arrival time of the next train, can indeed help them plan their journeys better, which can prevent overcrowding in the trains [6] [7]. The proposed approach is built on both a hierarchy of recurrent neural networks [8] and representation learning [9], with the aim of exploring the ability of such mobility data processing to simultaneously perform a forecasting task and highlight the impact of events on public transport operation and demand. An event refers here to unexpected passenger transport activity or to a modification in transport operation compared with normal conditions. Two kinds of historical data are used, namely train load data and automatic vehicle location (AVL) data. The latter source contains all information related to train operation (delay, time of arrival/departure of vehicles ...). The proposed methodology is applied to a railway transit network line operated by the French railway company SNCF in the suburbs of Paris. The historical dataset used in the experiments covers the period from 2015 to 2016.
LSTM encoder-predictor for short-term train load forecasting
ECML/PKDD - The European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases, Würzburg, Germany, 16/09/2019 - 20/09/2019. The increase in the amount of data collected in the transport domain can greatly benefit mobility studies and help to create high value-added mobility services for passengers as well as regulation tools for operators. The research detailed in this paper is related to the development of an advanced machine learning approach with the aim of forecasting the passenger load of trains in public transport. Predicting the crowding level on public transport can indeed be useful for enriching the information available to passengers to enable them to better plan their daily trips. Moreover, operators will increasingly need to assess and predict network passenger load to improve train regulation processes and service quality levels. The main issues to address in this forecasting task are the variability in the train load series induced by the train schedule and the influence of several contextual factors, such as calendar information. We propose a neural network LSTM encoder-predictor combined with contextual representation learning to address this problem. Experiments are conducted on a real dataset provided by the French railway company SNCF and collected over a period of one and a half years. The prediction performance of the proposed model is compared with that of historical models and traditional machine learning models. The results demonstrate the potential of the proposed LSTM encoder-predictor to address both one-step-ahead and multi-step forecasting and to outperform other models by maintaining robust forecast quality throughout the time horizon.
Fuel cells static and dynamic characterizations as tools for the estimation of their ageing time
This paper deals with a pattern-recognition-based diagnosis approach whose aim is to estimate the Fuel Cell (FC) operating time, and consequently its remaining lifetime. With the proposed method, both static and dynamic information extracted from the stack (i.e. polarization curve records and Electrochemical Impedance Spectroscopy (EIS) measurements) can be used. The complete diagnosis method consists of several steps. First, features are extracted from EIS measurements and polarization curves independently. This enables us to simplify the extracted information without losing relevant information, and to remove noise. For the polarization curves, an empirical model is used to perform the feature extraction. For the impedance spectra, both expert knowledge and parametric modeling are used to extract features. In particular, a latent regression model is used to automatically split the imaginary part of the spectra into several segments that are approximated by polynomials. The next step of the method consists in selecting the most relevant features from the whole set of extracted features. This helps us to estimate the operating time while adjusting the complexity of the model. The final step of the approach is a linear regression that uses the selected subset of features to estimate the FC operating time. The performance of the proposed approach is evaluated on a dataset made up of EIS measurements and polarization curves extracted from two FC lifetime tests. A mean error of about 2 h over a global operating duration of 1000 h can be obtained. Moreover, the portability of the method is shown by considering another FC ageing test conducted on a different FC stack type.
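The final step described above, a linear regression from selected features to operating time, can be sketched in a few lines. This is a minimal illustration under stated assumptions: ordinary least squares solved through the normal equations, two made-up features, and synthetic operating times generated from a known linear rule. None of the values or function names come from the paper.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [x - f * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]

def fit_linear(features, times):
    """Ordinary least squares with an intercept, via the normal equations."""
    x = [[1.0] + list(row) for row in features]
    p = len(x[0])
    xtx = [[sum(r[i] * r[j] for r in x) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * t for r, t in zip(x, times)) for i in range(p)]
    return solve(xtx, xty)

def predict(coef, row):
    """Estimated operating time for one feature vector."""
    return coef[0] + sum(c * v for c, v in zip(coef[1:], row))

# Synthetic data: two hypothetical features per measurement and operating
# times (in hours) generated from a known rule, so the fit can be checked.
features = [[0.0, 1.0], [1.0, 0.0], [2.0, 1.0], [3.0, 3.0], [4.0, 2.0]]
times = [100 + 200 * a + 50 * b for a, b in features]

coef = fit_linear(features, times)
estimate = predict(coef, [5.0, 1.0])
print(coef, estimate)
```

In practice the feature subset passed to `fit_linear` would come from the selection step, and the fitted model would be judged on held-out measurements rather than on the data used for fitting.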
Estimation of fuel cell operating time for predictive maintenance strategies
One of the limiting factors for the spread of fuel cell technology is durability, and research to extend fuel cell lifetime is under way worldwide. We present here a pattern recognition approach that aims to estimate fuel cell operating time from electrochemical impedance spectroscopy measurements. It consists in first extracting features from the impedance spectrum. For that purpose, two approaches have been investigated. In the first one, particular points of the spectrum are empirically extracted as features. In the second approach, parametric modelling is performed to extract features from both the real and the imaginary parts of the impedance spectrum. In particular, a latent regression model is used to automatically split the spectrum into several segments that are approximated by polynomials. The number of segments is adjusted taking into account a priori knowledge about the physical behaviour of fuel cell components. Then, a linear regression model using different subsets of extracted features is employed to estimate the fuel cell operating time. The performance of the proposed approach is evaluated on an experimental data set to show its feasibility. Because they make it possible to estimate the fuel cell operating time, and consequently its remaining lifetime, these results could open interesting perspectives for predictive maintenance policies for fuel cells.