A survey on wearable sensor modality centred human activity recognition in health care
Increased life expectancy coupled with declining birth rates is leading to an aging population structure. Aging-caused changes, such as physical or cognitive decline, can affect people's quality of life, resulting in injuries, poor mental health or a lack of physical activity. Sensor-based human activity recognition (HAR) is one of the most promising assistive technologies to support older people's daily life, and has enabled enormous potential in human-centred applications. Recent surveys in HAR either focus only on deep learning approaches or on one specific sensor modality. This survey aims to provide a more comprehensive introduction to HAR for newcomers and researchers. We first introduce the state-of-the-art sensor modalities in HAR. We then look into the techniques involved in each step of wearable-sensor-modality-centred HAR in terms of sensors, activities, data pre-processing, feature learning and classification, covering both conventional approaches and deep learning methods. In the feature-learning section, we focus on both hand-crafted features and features learned automatically by deep networks. We also present ambient-sensor-based HAR, including camera-based systems, and systems which combine wearable and ambient sensors. Finally, we identify the corresponding challenges in HAR to pose research problems for further improvement in HAR.
Models for time series prediction based on neural networks. Case study: LPG sales prediction from ANCAP.
A time series is a sequence of real values that can be considered as observations of a certain system. In this work, we are interested in time series coming from dynamical systems. Such systems can sometimes be described by a set of equations that model the underlying mechanism from which the samples come. However, in many real systems those equations are unknown, and the only information available is a set of temporal measurements that constitute a time series. On the other hand, for practical reasons it is usually required to have a prediction, e.g. to know the (approximate) value of the series at a future instant t. The goal of this thesis is to solve one such real-world prediction problem: given historical data related to liquefied bottled propane gas sales, predict future gas sales as accurately as possible. This time series prediction problem is addressed by means of neural networks, using both (dynamic) reconstruction and prediction. The problem of dynamically reconstructing the original system consists of building a model that captures certain characteristics of it, so that there is a correspondence between the long-term behaviour of the model and that of the system.
The network design process is guided by three main ingredients. The dimensionality of the problem is explored by our first ingredient, the Takens-Mañé theorem. By means of this theorem, the optimal dimension of the (neural) network input can be investigated. Our second ingredient is a strong theorem: neural networks with a single hidden layer are universal approximators. As the third ingredient, we approached the search for the optimal size of the hidden layer by means of genetic algorithms, used to suggest the number of hidden neurons that maximizes a target fitness function (related to prediction errors). These algorithms are also used in some cases to find the most influential network inputs. The determination of the hidden layer size is a central (and hard) problem in the determination of the network topology.
This thesis includes a state of the art of neural network design for time series prediction, covering related topics such as dynamical systems, universal approximators, gradient-descent searches and their variations, as well as meta-heuristics. The survey of the related literature is intended to be extensive, for both printed and electronic material, in order to give a landscape of the main aspects of the state of the art in time series prediction using neural networks. The material found was sometimes extremely redundant (as in the case of the back-propagation algorithm and its improvements) and scarce in other areas (memory structures or estimation of the signal subspace dimension in the stochastic case). The surveyed literature includes classical research works ([27], [50], [52]) as well as more recent ones ([79], [16] or [82]), and is intended to be another contribution of this thesis.
Special attention is given to the available software tools for neural network design and time series processing. After a review of the available software packages, the most promising computational tools for both tasks are discussed. As a result, a whole framework based on mature software tools was set up and used. In order to work with such dynamical systems, software intended specifically for the analysis and processing of time series was employed, and chaotic series became part of our focus.
Since not all randomness is attributable to chaos, in order to characterize the dynamical system generating the time series, an exploration of chaotic-stochastic systems is required, as well as of network models to predict a time series associated with one of them. Here we aim to show how domain knowledge, something extensively treated in the bibliography, can be refined in sophisticated ways (such as through the Lyapunov spectrum of a series or the embedding dimension). In order to model the dynamical system generating the time series we used the state-space model, so the time series prediction was translated into the prediction of the next system
state. This state-space model, together with the delay method (delayed coordinates), has practical importance for the development of this work, specifically in the design of the input layer of some networks (multi-layer perceptrons, MLPs) and of other parameters (the taps of the TLFNs). Additionally, the remaining network components were in many cases determined through procedures traditionally used with neural networks: genetic algorithms.
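The delay-coordinate construction used to design the network input layer can be sketched as follows; the embedding dimension `dim` and lag `tau` are hypothetical parameters (in the thesis they would come from the Takens-Mañé analysis):

```python
import numpy as np

def delay_embed(series, dim, tau):
    """Build delay-coordinate input vectors and one-step-ahead targets.

    Each input row is [x(t), x(t-tau), ..., x(t-(dim-1)*tau)] and the
    target is x(t+1), the next value of the series.
    """
    series = np.asarray(series, dtype=float)
    start = (dim - 1) * tau           # first index with a full delay vector
    rows, targets = [], []
    for t in range(start, len(series) - 1):
        rows.append([series[t - k * tau] for k in range(dim)])
        targets.append(series[t + 1])
    return np.array(rows), np.array(targets)

# Example on a short synthetic series
X, y = delay_embed(np.arange(10.0), dim=3, tau=2)
# X has shape (5, 3): full delay vectors exist for t = 4..8
```

Each row of `X` would feed the MLP input layer, with `y` as the training target.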
The criteria for model (network) selection are discussed, and a trade-off between performance and network complexity is further explored, inspired by Rissanen's minimum description length and its estimation as given by the chosen software. Regarding the employed network models, the topologies suggested in the literature as adequate for prediction are used (TLFNs and recurrent networks) together with MLPs (a classic of artificial neural networks) and network committees. The effectiveness of each method is confirmed on the proposed prediction problem. Network committees, where the predictions are a naive convex combination of predictions from individual networks, are also extensively used.
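The naive convex combination used by the network committees can be sketched as follows; the example predictions and weights are invented for illustration:

```python
import numpy as np

def committee_predict(predictions, weights=None):
    """Combine individual network predictions by a convex combination.

    `predictions` is an (n_networks, n_points) array; `weights` must be
    non-negative and sum to 1 (uniform weights give the naive average).
    """
    predictions = np.asarray(predictions, dtype=float)
    if weights is None:
        weights = np.full(len(predictions), 1.0 / len(predictions))
    weights = np.asarray(weights, dtype=float)
    assert np.all(weights >= 0) and np.isclose(weights.sum(), 1.0)
    return weights @ predictions

# Two hypothetical networks predicting the same three points
p = [[10.0, 12.0, 11.0],
     [14.0, 10.0, 13.0]]
committee_predict(p)   # naive average: [12., 11., 12.]
```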
The need for criteria to compare the behaviours of the model and of the real system in the long run, for dynamic stochastic systems, is presented, and two alternatives are discussed.
The obtained results prove the existence of a solution to the problem of learning the dependence Input → Output. We also conjecture that the system is dynamic-stochastic but not chaotic, because we only have one realization of the random process corresponding to the sales. For a non-chaotic system, the mean of the sales predictions would improve as the available data increase, although the probability of a prediction with a large error is always non-null due to the randomness present. This solution is found in a constructive and exhaustive way. The exhaustiveness can be deduced from the following five statements:
- the design of a neural network requires knowing the input and output dimensions, the number of hidden layers and the number of neurons in each of them;
- the use of the Takens-Mañé theorem allows the dimension of the input data to be derived;
- by theorems such as Kolmogorov's and Cybenko's, the use of multi-layer perceptrons with only one hidden layer is justified, so several such models were tested;
- the number of neurons in the hidden layer is often determined heuristically using genetic algorithms;
- a neuron in the output layer gives the desired prediction.
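The genetic search over the hidden-layer size mentioned above can be sketched as follows; the genome is simply the integer number of neurons, and the fitness function here is a toy stand-in for the prediction-error-based fitness used in the thesis:

```python
import random

def genetic_hidden_size(fitness, size_range=(1, 50), pop=12, gens=25, seed=0):
    """Toy genetic algorithm over the number of hidden neurons.

    `fitness` maps a hidden-layer size to a score to maximize (in the
    thesis this would be derived from validation prediction error).
    """
    rng = random.Random(seed)
    lo, hi = size_range
    population = [rng.randint(lo, hi) for _ in range(pop)]
    for _ in range(gens):
        scored = sorted(population, key=fitness, reverse=True)
        parents = scored[: pop // 2]               # truncation selection
        children = []
        while len(children) < pop - len(parents):
            a, b = rng.sample(parents, 2)
            child = (a + b) // 2                   # "crossover": average sizes
            if rng.random() < 0.3:                 # mutation: small perturbation
                child = min(hi, max(lo, child + rng.choice([-2, -1, 1, 2])))
            children.append(child)
        population = parents + children            # parents survive (elitism)
    return max(population, key=fitness)

# Hypothetical fitness peaking at 20 hidden neurons
best = genetic_hidden_size(lambda n: -(n - 20) ** 2)
```

Keeping the parents each generation makes the best-so-far candidate monotonically improve, a common simplification in such searches.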
As we said, two tasks are carried out: the development of a time series prediction model and the analysis of a feasible model for the dynamic reconstruction of the system. With the best predictive model, obtained from an ensemble of two networks, an acceptable average error was obtained when the week to be predicted is not adjacent to the training set (7.04% for week 46/2011). We believe these results are acceptable given the quantity of information available, and they represent an additional validation that neural networks are useful for predicting time series coming from dynamical systems, whether they are stochastic or not.
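The 7.04% figure suggests a percentage-style error; a mean absolute percentage error is one common choice and can be computed as follows (the thesis's exact metric is not specified here, so this is an assumption):

```python
import numpy as np

def mape(actual, predicted):
    """Mean absolute percentage error, in percent.

    This is one common percentage-error metric; the exact definition
    behind the 7.04% figure is an assumption, not taken from the thesis.
    """
    actual = np.asarray(actual, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    return 100.0 * np.mean(np.abs((actual - predicted) / actual))

mape([100.0, 200.0], [90.0, 210.0])   # (10% + 5%) / 2 = 7.5
```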
Finally, the results confirmed several already-known facts (such as that adding noise to the inputs and outputs of the training values can improve the results; that recurrent networks trained with the back-propagation algorithm do not suffer from vanishing gradients over short periods; and that the use of committees - which can be seen as a very basic form of
distributed artificial intelligence - significantly improves the predictions).
A data fusion-based hybrid sensory system for older people’s daily activity recognition.
The population aged 60 and over is growing rapidly. Ageing-caused changes, such as physical or cognitive decline, can affect people's quality of life, resulting in injuries, poor mental health or a lack of physical activity. Sensor-based human activity recognition (HAR) has become one of the most promising assistive technologies for older people's daily life. The literature in HAR suggests that each sensor modality has its strengths and limitations, and single sensor modalities may not cope with complex situations in practice. This research aims to design and implement a hybrid sensory HAR system to provide more comprehensive, practical and accurate surveillance for older people to assist them in living independently. This research: 1) designs and develops a hybrid HAR system which provides spatio-temporal surveillance for older people by combining wrist-worn sensors and room-mounted ambient sensors (passive infrared); the wearable data are used to recognize the defined specific daily activities, and the ambient information is used to infer the occupant's room-level daily routine; 2) proposes a unique and effective data fusion method to hybridize the two-source sensory data, in which the room-level location information captured by the ambient sensors is also utilized to trigger the sub-classification models pretrained on room-assigned wearable data; 3) implements augmented features extracted from the attitude angles of the wearable device and explores the contribution of the new features to HAR; 4) proposes a feature selection (FS) method based on kernel canonical correlation analysis (KCCA), named mRMJR-KCCA, which maximizes the relevance between the feature candidate and the target class labels while simultaneously minimizing the joint redundancy between the already selected features and the feature candidate; 5) demonstrates all the proposed methods above with ground-truth data collected from recruited participants in home settings.
The proposed system has three function modes: 1) the pure wearable sensing mode (the whole classification model), which can identify all the defined specific daily activities and function alone when the ambient sensing fails; 2) the pure ambient sensing mode, which can deliver the occupant's room-level daily routine without wearable sensing; and 3) the data fusion mode (room-based sub-classification mode), which provides more comprehensive and accurate HAR surveillance when both the wearable sensing and ambient sensing function properly. The research also applies mutual information (MI)-based FS methods for feature selection, and Support Vector Machine (SVM) and Random Forest (RF) for classification. The experimental results demonstrate that the proposed hybrid sensory system improves the recognition accuracy to 98.96% after applying data fusion using RF classification and mRMJR-KCCA feature selection. Furthermore, the improved results are achieved with a much smaller number of features compared with the scenario of recognizing all the defined activities using wearable data alone. The research work conducted in the thesis is unique and is not directly compared with others, since few similar works exist in terms of the proposed data fusion method and the introduced new feature set.
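The data fusion rule described above (room-level ambient location triggering room-assigned sub-classifiers, with the whole wearable-only model as a fallback) can be sketched as follows; all model objects and room/activity names here are hypothetical stand-ins:

```python
def fused_predict(room, wearable_features, sub_models, whole_model):
    """Dispatch to a room-specific sub-classifier when ambient sensing
    provides a room-level location; otherwise fall back to the whole
    wearable-only model (pure wearable sensing mode).

    All model objects are hypothetical stand-ins exposing predict().
    """
    if room is not None and room in sub_models:
        return sub_models[room].predict(wearable_features)
    return whole_model.predict(wearable_features)

class _Stub:
    """Placeholder classifier returning a fixed activity label."""
    def __init__(self, label):
        self.label = label
    def predict(self, features):
        return self.label

subs = {"kitchen": _Stub("cooking"), "bedroom": _Stub("sleeping")}
whole = _Stub("walking")
fused_predict("kitchen", [0.1, 0.2], subs, whole)   # → "cooking"
fused_predict(None, [0.1, 0.2], subs, whole)        # → "walking"
```

The dispatch also explains why fusion needs fewer features: each room-assigned sub-model only discriminates among the activities plausible in that room.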
Enhancing brain-computer interfacing through advanced independent component analysis techniques
A brain-computer interface (BCI) is a direct communication system between a brain and an external device in which messages or commands sent by an individual do not pass through the brain's normal output pathways but are detected through brain signals. Some severe motor impairments, such as Amyotrophic Lateral Sclerosis, head trauma, spinal injuries and other diseases, may cause patients to lose their muscle control and become unable to communicate with the outside environment. Currently no effective cure or treatment has been found for these diseases, so using a BCI system to rebuild the communication pathway becomes a possible alternative solution. Among the different types of BCIs, the electroencephalogram (EEG) based BCI is becoming popular due to EEG's fine temporal resolution, ease of use, portability and low set-up cost. However, EEG's susceptibility to noise is a major obstacle to developing a robust BCI. Signal processing techniques such as coherent averaging, filtering, FFT and AR modelling are used to reduce the noise and extract components of interest. However, these methods process the data in the observed mixture domain, which mixes components of interest and noise. This limitation means that the extracted EEG signals may still contain noise residue, or, conversely, that the removed noise may still contain part of the EEG signal.
Independent Component Analysis (ICA), a Blind Source Separation (BSS)
technique, is able to extract relevant information within noisy signals and separate the
fundamental sources into independent components (ICs). The most common assumption of the ICA method is that the source signals are unknown and statistically independent. Under this assumption, ICA is able to recover the source signals.
Since the ICA concepts appeared in the fields of neural networks and signal
processing in the 1980s, many ICA applications in telecommunications, biomedical
data analysis, feature extraction, speech separation, time-series analysis and data
mining have been reported in the literature. In this thesis several ICA techniques are proposed to optimize two major issues for BCI applications: reducing the recording time needed, in order to speed up the signal processing, and reducing the number of recording channels whilst improving the final classification performance, or at least keeping it at the current level. These will make BCI a more practical prospect for everyday use.
This thesis first defines BCI and the diverse BCI models based on different
control patterns. After the general idea of ICA is introduced along with some
modifications to ICA, several new ICA approaches are proposed. The practical work
in this thesis starts with preliminary analyses of the Southampton BCI pilot datasets, using basic and then advanced signal processing techniques. The proposed ICA techniques are then presented using a multi-channel event-related potential (ERP) based BCI. Next, the ICA algorithm is applied to a multi-channel
spontaneous activity based BCI. The final ICA approach aims to examine the
possibility of using ICA based on just one or a few channel recordings on an ERP
based BCI.
The novel ICA approaches for BCI systems presented in this thesis show that ICA is able to accurately and repeatedly extract the relevant information buried within noisy signals, and the signal quality is enhanced so that even a simple classifier can achieve good classification accuracy. In the ERP-based BCI application, after multi-channel ICA, data averaged over just eight epochs can achieve 83.9% classification accuracy, whilst coherent averaging reaches only 32.3%. In the spontaneous-activity-based BCI, the multi-channel ICA algorithm can effectively extract discriminatory information from two types of single-trial EEG data; the classification accuracy is improved by about 25%, on average, compared to the performance on the unpreprocessed data. The single-channel ICA technique on the ERP-based BCI produces much better results than a lowpass filter, and an appropriate number of averages improves the signal-to-noise ratio of the P300 activity, which helps to achieve better classification. These advantages will lead to a reliable and practical BCI for use outside of the clinical laboratory.
Kernel-based dimensionality reduction using Renyi's α-entropy measures of similarity
Dimensionality reduction (DR) aims to reveal salient properties of high-dimensional (HD) data in a low-dimensional (LD) representation space. Two elements stipulate the success of a DR approach: the definition of a notion of pairwise relations in the HD and LD spaces, and the measurement of the mismatch between these relationships in the HD and LD representations of the data. This paper introduces a new DR method, termed kernel-based entropy dimensionality reduction (KEDR), which measures the embedding quality based on stochastic neighborhood preservation, involving a Gram matrix estimation of Renyi's α-entropy. The proposed approach is a data-driven framework for information theoretic learning, based on infinitely divisible matrices. Instead of relying upon regular Renyi's entropies, KEDR also computes the embedding mismatch through a parameterized mixture of divergences, resulting in improved preservation of both the local and global data structures. Our approach is validated on both synthetic and real-world datasets and compared to several state-of-the-art algorithms, including Stochastic Neighbor Embedding-like techniques, of which the proposed approach is a data-driven extension (from the perspective of kernel-based Gram matrices). In terms of visual inspection and quantitative evaluation of neighborhood preservation, the obtained results show that KEDR is a competitive and promising DR method.
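The Gram-matrix estimator of Renyi's α-entropy that this family of methods builds on can be sketched as follows; this is a generic eigenvalue-based version over a unit-trace RBF Gram matrix, not the exact KEDR formulation, and the bandwidth `sigma` is a hypothetical choice:

```python
import numpy as np

def matrix_renyi_entropy(X, alpha=2.0, sigma=1.0):
    """Gram-matrix estimator of Renyi's alpha-entropy (in bits).

    An RBF Gram matrix is normalized to unit trace; its eigenvalues
    then play the role of a probability distribution, giving
    H_alpha = log2(sum(lambda_i ** alpha)) / (1 - alpha).
    Generic sketch of the estimator family, not KEDR itself.
    """
    X = np.asarray(X, dtype=float)
    sq = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)  # pairwise sq. dists
    K = np.exp(-sq / (2 * sigma ** 2))                   # RBF Gram matrix
    A = K / np.trace(K)                                  # unit-trace normalization
    lam = np.linalg.eigvalsh(A)
    lam = lam[lam > 1e-12]                               # drop numerical zeros
    return np.log2(np.sum(lam ** alpha)) / (1.0 - alpha)

# Widely separated points behave like a uniform distribution over n
# states, so the entropy approaches its maximum, log2(n)
pts = np.array([[0.0], [100.0], [200.0], [300.0]])
matrix_renyi_entropy(pts, alpha=2.0, sigma=1.0)          # ≈ log2(4) = 2.0
```

Because the estimator works directly on the Gram matrix, no explicit density estimate of the data is ever required, which is the appeal of the infinitely-divisible-matrix framework.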