237 research outputs found
Learning as a Nonlinear Line of Attraction for Pattern Association, Classification and Recognition
Development of a mathematical model for learning a nonlinear line of attraction is presented in this dissertation, in contrast to the conventional recurrent neural network model in which the memory is stored in an attractive fixed point at a discrete location in state space. A nonlinear line of attraction is the encapsulation of attractive fixed points scattered in state space as an attractive nonlinear line, describing patterns with similar characteristics as a family of patterns.
It is usually imperative to guarantee the convergence of the dynamics of the recurrent network for associative learning and recall. We propose to alter this picture. That is, if the brain remembers by converging to the state representing familiar patterns, it should also diverge from such states when presented with an unknown encoded representation of a visual image. The conception of the dynamics of the nonlinear line attractor network to operate between stable and unstable states is the second contribution of this dissertation research. These criteria can be used to circumvent the plasticity-stability dilemma by using the unstable state as an indicator to create a new line for an unfamiliar pattern. This novel learning strategy utilizes stability (convergence) and instability (divergence) criteria of the designed dynamics to induce self-organizing behavior. The self-organizing behavior of the nonlinear line attractor model can manifest complex dynamics in an unsupervised manner.
The third contribution of this dissertation is the introduction of the concept of manifold of color perception.
The fourth contribution of this dissertation is the development of a nonlinear dimensionality reduction technique by embedding a set of related observations into a low-dimensional space utilizing the result attained by the learned memory matrices of the nonlinear line attractor network.
Development of a system for affective states computation is also presented in this dissertation. This system is capable of extracting the user's mental state in real time using a low-cost computer. It is successfully interfaced with an advanced learning environment for human-computer interaction.
Hidden Markov models and neural networks for speech recognition
The hidden Markov model (HMM) is one of the most successful modeling approaches for acoustic events in speech recognition, and more recently it has proven useful for several problems in biological sequence analysis. Although the HMM is good at capturing the temporal nature of processes such as speech, it has a very limited capacity for recognizing complex patterns involving more than first-order dependencies in the observed data sequences. This is due to the first-order state process and the assumption of state-conditional independence between observations. Artificial neural networks (NNs) are almost the opposite: they cannot model dynamic, temporally extended phenomena very well, but are good at static classification and regression tasks. Combining the two frameworks in a sensible way can therefore lead to a more powerful model with better classification abilities. The overall aim of this work has been to develop a probabilistic hybrid of hidden Markov models and neural networks and ..
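To make the two modeling assumptions named above concrete, here is a minimal sketch of the standard forward recursion for a discrete HMM. It is a textbook illustration, not the hybrid model developed in the work; the two-state parameters and the names `forward_likelihood`, `initial`, `transition`, and `emission` are hypothetical.

```python
def forward_likelihood(initial, transition, emission, observations):
    """Return P(observations) under a discrete first-order HMM."""
    n_states = len(initial)
    # alpha[i] = P(o_1..o_t, state_t = i)
    alpha = [initial[i] * emission[i][observations[0]] for i in range(n_states)]
    for obs in observations[1:]:
        # First-order state process: the next state depends only on the current one.
        # Conditional independence: the observation depends only on the current state.
        alpha = [
            sum(alpha[i] * transition[i][j] for i in range(n_states)) * emission[j][obs]
            for j in range(n_states)
        ]
    return sum(alpha)


# Hypothetical two-state, two-symbol model (each row sums to 1).
initial = [0.6, 0.4]
transition = [[0.7, 0.3], [0.4, 0.6]]
emission = [[0.9, 0.1], [0.2, 0.8]]

likelihood = forward_likelihood(initial, transition, emission, [0, 1, 0])
print(likelihood)
```

Because each update uses only the previous `alpha` and the current symbol, the recursion cannot express dependencies that reach further back in the observation sequence, which is the limitation the abstract points to.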
Machine learning methods for sign language recognition: a critical review and analysis.
Sign language is an essential tool to bridge the communication gap between hearing and hearing-impaired people. However, the diversity of over 7000 present-day sign languages, with variability in motion position, hand shape, and position of body parts, makes automatic sign language recognition (ASLR) a complex system. In order to overcome such complexity, researchers are investigating better ways of developing ASLR systems to seek intelligent solutions and have demonstrated remarkable success. This paper aims to analyse the research published on intelligent systems in sign language recognition over the past two decades. A total of 649 publications related to decision support and intelligent systems on sign language recognition (SLR) are extracted from the Scopus database and analysed. The extracted publications are analysed using bibliometric VOSViewer software to (1) obtain the publications' temporal and regional distributions, (2) create the cooperation networks between affiliations and authors and identify productive institutions in this context. Moreover, reviews of techniques for vision-based sign language recognition are presented. Various feature extraction and classification techniques used in SLR to achieve good results are discussed. The literature review presented in this paper shows the importance of incorporating intelligent solutions into sign language recognition systems and reveals that perfect intelligent systems for sign language recognition are still an open problem. Overall, it is expected that this study will facilitate knowledge accumulation and creation of intelligent-based SLR and provide readers, researchers, and practitioners a roadmap to guide future direction.
A Comparison of Machine Learning Gesture Recognition Techniques for Medication Adherence
Every year, many poor health outcomes are the result of patients missing their medication, as prescribed by their healthcare providers. Guidance and reminders to these patients would result in better health outcomes and significant financial savings to the economy. This thesis utilizes accelerometers and gyroscopes, which are widely available inside devices (e.g., smart phones and watches), to actively monitor patient activities, including those related to adherence to medication regimens. Different machine learning techniques are compared for recognizing when a pill bottle has been opened. Such actions could remind the patient to take their medication if an opening were not detected. An artificial neural network (ANN) model is compared with a support vector machine (SVM) and a K-nearest neighbor (KNN) classifier. The models are trained on data collected by former University of Oklahoma students. Raw (normalized) sensor data is used, without extensive data processing or feature extraction. A neural network proves the most promising with an accuracy of 98.12%, as well as the greatest flexibility in data pre-processing requirements. KNN achieved high accuracy, although results were likely due to overfitting limited data with the simple model. SVM did not perform as well as the others; however, it did achieve similar results to previous research utilizing the approach (e.g., ~95% accuracy). Data collected from a greater number of gestures and additional test subjects is needed to verify generalization. A medication adherence system utilizing the developed model would be an acceptable approach.
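As an illustration of classifying raw sensor windows without feature extraction, the following is a minimal K-nearest-neighbor sketch. The four-sample windows, the labels "idle" and "open", and the function name `knn_predict` are invented for the example and assume readings already normalized to a common range; the thesis data and its exact preprocessing are not reproduced here.

```python
import math
from collections import Counter


def knn_predict(train_windows, train_labels, query, k=3):
    """Label a sensor window by majority vote among its k nearest training windows."""
    # Euclidean distance on the raw (already normalized) samples, no feature extraction.
    distances = sorted(
        (math.dist(window, query), label)
        for window, label in zip(train_windows, train_labels)
    )
    votes = Counter(label for _, label in distances[:k])
    return votes.most_common(1)[0][0]


# Hypothetical normalized accelerometer windows (values invented for illustration).
train_windows = [
    [0.0, 0.1, 0.0, 0.1],
    [0.1, 0.0, 0.1, 0.0],
    [0.2, 0.9, 0.8, 0.1],
    [0.1, 0.8, 0.9, 0.2],
]
train_labels = ["idle", "idle", "open", "open"]

print(knn_predict(train_windows, train_labels, [0.15, 0.85, 0.85, 0.15]))
```

With so few training windows the vote is dominated by whichever examples happen to be close, which is one way the overfitting concern raised in the abstract can arise.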
Evolutionary design of deep neural networks
International Mention in the doctoral degree.
For three decades, neuroevolution has applied evolutionary computation to the optimization of the topology of artificial neural networks, with most works focusing on very simple architectures. However, times have changed, and nowadays convolutional neural networks are the industry and academia standard for solving a variety of problems, many of which remained unsolved before the discovery of this kind of network.
Convolutional neural networks involve complex topologies, and the manual design of these topologies for solving a problem at hand is expensive and inefficient. In this thesis, our aim is to use neuroevolution in order to evolve the architecture of convolutional neural networks.
To do so, we have decided to try two different techniques: genetic algorithms and grammatical evolution. We have implemented a niching scheme for preserving the genetic diversity, in order to ease the construction of ensembles of neural networks. These techniques have been validated against the MNIST database for handwritten digit recognition, achieving a test error rate of 0.28%, and the OPPORTUNITY data set for human activity recognition, attaining an F1 score of 0.9275. Both results have proven very competitive when compared with the state of the art. Also, in all cases, ensembles have proven to perform better than individual models.
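The observation that ensembles outperform individual models can be illustrated with a plurality-vote sketch. The combination rule, the three hypothetical models, and the names `majority_vote` and `error_rate` are assumptions made for this example; the thesis does not state here how its ensembles were combined.

```python
from collections import Counter


def majority_vote(predictions_per_model):
    """Combine per-model label lists into one label per sample by plurality."""
    combined = []
    for sample_votes in zip(*predictions_per_model):
        # On a tie, the label seen first among the tied ones is returned.
        combined.append(Counter(sample_votes).most_common(1)[0][0])
    return combined


def error_rate(predicted, expected):
    """Fraction of samples whose predicted label differs from the expected one."""
    wrong = sum(p != e for p, e in zip(predicted, expected))
    return wrong / len(expected)


# Hypothetical digit labels: each model errs on a different sample.
truth = [3, 1, 4, 1, 5]
model_a = [3, 1, 4, 1, 9]
model_b = [3, 7, 4, 1, 5]
model_c = [8, 1, 4, 1, 5]

ensemble = majority_vote([model_a, model_b, model_c])
print(error_rate(model_a, truth), error_rate(ensemble, truth))
```

The toy data is built so that the models' mistakes do not overlap, which is the situation diversity-preserving schemes such as niching aim to encourage; when models make the same errors, voting helps far less.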
Later, the topologies learned for MNIST were tested on EMNIST, a database introduced in 2017, which includes more samples and a set of letters for character recognition. Results have shown that the topologies optimized for MNIST perform well on EMNIST, proving that architectures can be reused across domains with similar characteristics.
In summary, neuroevolution is an effective approach for automatically designing topologies for convolutional neural networks. However, it still remains an unexplored field due to hardware limitations. Current advances should constitute the fuel that empowers the emergence of this field, and further research should start as of today.
This Ph.D. dissertation has been partially supported by the Spanish Ministry of Education, Culture and Sports under FPU fellowship with identifier FPU13/03917. This research stay has been partially co-funded by the Spanish Ministry of Education, Culture and Sports under FPU short stay grant with identifier EST15/00260.
Programa Oficial de Doctorado en Ciencia y Tecnología Informática. Chair: María Araceli Sanchís de Miguel. Secretary: Francisco Javier Segovia Pérez. Member: Simon Luca
Human pose and action recognition
This thesis focuses on detection of persons and pose recognition using neural networks. The goal is to detect human body poses in a visual scene with multiple persons and to use this information in order to recognize human activity. This is achieved by first detecting persons in a scene and then by estimating their body joints in order to infer articulated poses.
The work developed in this thesis explored neural networks and deep learning methods. Deep learning allows computational models that are composed of multiple processing layers to learn representations of data with multiple levels of abstraction. These methods have greatly improved the state of the art in many domains such as speech recognition and visual object detection and classification. Deep learning discovers intricate structure in data by using the backpropagation algorithm to indicate how a machine should change its internal parameters that are used to compute the representation in each layer from the representation provided by the previous one.
Person detection, in general, is a difficult task because representations vary widely with factors such as scale, view and occlusion. An object detection framework based on multi-stage convolutional features for pedestrian detection is proposed in this thesis. This framework extends the Fast R-CNN framework by combining several convolutional features from different stages of a CNN (Convolutional Neural Network) to improve the detector's accuracy. This provides high quality detections of persons in a visual scene, which are then used as input in conjunction with a human pose estimation model in order to estimate human body joint locations of multiple persons in an image.
Human pose estimation is done by a deep convolutional neural network composed of a series of residual auto-encoders. These produce multiple predictions which are later combined to provide a heatmap prediction of human body joints. In this network topology, features are processed across all scales, capturing the various spatial relationships associated with the body. Repeated bottom-up and top-down processing with intermediate supervision for each auto-encoder network is applied. This results in very accurate 2D heatmaps of body joint predictions.
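As a rough sketch of how several heatmap predictions might be combined and read out as a joint location, the code below averages per-stage heatmaps and takes the peak cell. Averaging is an assumption made for illustration, since the abstract does not say how the predictions are combined, and the 3x3 heatmaps and function names are hypothetical.

```python
def average_heatmaps(stage_heatmaps):
    """Element-wise mean of several same-sized 2D heatmaps (assumed combination rule)."""
    n = len(stage_heatmaps)
    rows, cols = len(stage_heatmaps[0]), len(stage_heatmaps[0][0])
    return [
        [sum(h[r][c] for h in stage_heatmaps) / n for c in range(cols)]
        for r in range(rows)
    ]


def peak_location(heatmap):
    """Return (row, col, score) of the highest-scoring cell in a 2D heatmap."""
    cells = [(r, c, v) for r, row in enumerate(heatmap) for c, v in enumerate(row)]
    return max(cells, key=lambda cell: cell[2])


# Hypothetical predictions for one joint from two stages of the network.
stage_1 = [[0.0, 0.2, 0.0], [0.1, 0.6, 0.1], [0.0, 0.1, 0.0]]
stage_2 = [[0.0, 0.1, 0.0], [0.0, 0.8, 0.2], [0.0, 0.1, 0.0]]

combined = average_heatmaps([stage_1, stage_2])
row, col, score = peak_location(combined)
print(row, col, score)
```

In a full pipeline this readout would be repeated for each joint's heatmap and for each detected person.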
The methods presented in this thesis were benchmarked against other top-performing methods on popular datasets for pedestrian detection and human pose estimation, achieving good results compared with other state-of-the-art algorithms.
Deep Active Learning Explored Across Diverse Label Spaces
Deep learning architectures have been widely explored in computer vision and have shown commendable performance in a variety of applications. A fundamental challenge in training deep networks is the requirement of large amounts of labeled training data. While gathering large quantities of unlabeled data is cheap and easy, annotating the data is an expensive process in terms of time, labor and human expertise. Thus, developing algorithms that minimize the human effort in training deep models is of immense practical importance. Active learning algorithms automatically identify salient and exemplar samples from large amounts of unlabeled data and can contribute maximal information to supervised learning models, thereby reducing the human annotation effort in training machine learning models. The goal of this dissertation is to fuse ideas from deep learning and active learning and design novel deep active learning algorithms. The proposed learning methodologies explore diverse label spaces to solve different computer vision applications. Three major contributions have emerged from this work: (i) a deep active framework for multi-class image classification, (ii) a deep active model with and without label correlation for multi-label image classification and (iii) a deep active paradigm for regression. Extensive empirical studies on a variety of multi-class, multi-label and regression vision datasets corroborate the potential of the proposed methods for real-world applications. Additional contributions include: (i) a multimodal emotion database consisting of recordings of facial expressions, body gestures, vocal expressions and physiological signals of actors enacting various emotions, (ii) four multimodal deep belief network models and (iii) an in-depth analysis of the effect of transfer of multimodal emotion features between source and target networks on classification accuracy and training time. These related contributions help comprehend the challenges involved in training deep learning models and motivate the main goal of this dissertation.
Dissertation/Thesis. Doctoral Dissertation, Electrical Engineering, 201
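The idea of letting a model pick which unlabeled samples to annotate can be sketched with entropy-based uncertainty sampling, a common active learning baseline. This is not the dissertation's algorithm; the predicted probabilities and the names `entropy` and `select_for_labeling` are hypothetical.

```python
import math


def entropy(probabilities):
    """Shannon entropy (in nats) of one predicted class distribution."""
    return -sum(p * math.log(p) for p in probabilities if p > 0)


def select_for_labeling(predicted_probs, batch_size):
    """Return indices of the batch_size unlabeled samples the model is least sure about."""
    ranked = sorted(
        range(len(predicted_probs)),
        key=lambda i: entropy(predicted_probs[i]),
        reverse=True,
    )
    return ranked[:batch_size]


# Hypothetical class probabilities predicted for four unlabeled images.
predicted_probs = [
    [0.98, 0.01, 0.01],
    [0.34, 0.33, 0.33],
    [0.70, 0.20, 0.10],
    [0.50, 0.50, 0.00],
]

to_annotate = select_for_labeling(predicted_probs, batch_size=2)
print(to_annotate)
```

In a typical loop the selected samples would be labeled by a human, added to the training set, and the model retrained before the next selection round.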