HActivityNet: A Deep Convolutional Neural Network for Human Activity Recognition
Human Activity Recognition (HAR), a vast area of computer vision research, has gained prominence in recent years due to its applications in various fields. Because human activity is diverse in action and interaction, and recognizing it involves large amounts of data and powerful computational resources, it is very difficult to recognize human activities from an image. To address the computational cost and the vanishing gradient problem, in this work we propose a revised simple convolutional neural network (CNN) model named Human Activity Recognition Network (HActivityNet) that automatically extracts and learns features and recognizes activities in a rapid, precise and consistent manner. To solve the problem of imbalanced positive and negative data, we have created two datasets: HARDataset1, built from image frames extracted from the KTH dataset, and HARDataset2, prepared from frames of activity videos performed by us. The comprehensive experiment shows that our model performs better than present state-of-the-art models. The proposed model attains an accuracy of 99.5% on HARDataset1 and almost 100% on HARDataset2. The proposed model also performed well on real data.
Extrinsic Methods for Coding and Dictionary Learning on Grassmann Manifolds
Sparsity-based representations have recently led to notable results in
various visual recognition tasks. In a separate line of research, Riemannian
manifolds have been shown useful for dealing with features and models that do
not lie in Euclidean spaces. With the aim of building a bridge between the two
realms, we address the problem of sparse coding and dictionary learning over
the space of linear subspaces, which form Riemannian structures known as
Grassmann manifolds. To this end, we propose to embed Grassmann manifolds into
the space of symmetric matrices by an isometric mapping. This in turn enables
us to extend two sparse coding schemes to Grassmann manifolds. Furthermore, we
propose closed-form solutions for learning a Grassmann dictionary, atom by
atom. Lastly, to handle non-linearity in data, we extend the proposed Grassmann
sparse coding and dictionary learning algorithms through embedding into Hilbert
spaces.
Experiments on several classification tasks (gender recognition, gesture
classification, scene analysis, face recognition, action recognition and
dynamic texture classification) show that the proposed approaches achieve
considerable improvements in discrimination accuracy, in comparison to
state-of-the-art methods such as kernelized Affine Hull Method and
graph-embedding Grassmann discriminant analysis.
Comment: Appearing in International Journal of Computer Visio
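The embedding of subspaces into symmetric matrices mentioned in this abstract can be illustrated with a small sketch. It assumes the mapping is the usual projection embedding, which sends an orthonormal basis X to the symmetric matrix X X^T; the abstract does not spell out the mapping, so this choice and the helper names `projection_embedding` and `frobenius_distance` are illustrative assumptions, not the authors' code.

```python
import math

def projection_embedding(basis):
    """Map an orthonormal basis (a list of vectors) to the symmetric matrix X X^T.

    Assumption: the vectors in `basis` are orthonormal and span the subspace.
    """
    d = len(basis[0])
    return [[sum(u[i] * u[j] for u in basis) for j in range(d)] for i in range(d)]

def frobenius_distance(a, b):
    """Frobenius norm of the difference between two square matrices."""
    n = len(a)
    return math.sqrt(sum((a[i][j] - b[i][j]) ** 2 for i in range(n) for j in range(n)))

# Two different orthonormal bases of the same plane in R^3.
s = 1.0 / math.sqrt(2.0)
plane_a = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
plane_b = [[s, s, 0.0], [s, -s, 0.0]]

# Two distinct one-dimensional subspaces (lines) in R^3.
line_x = [[1.0, 0.0, 0.0]]
line_y = [[0.0, 1.0, 0.0]]

same_plane = frobenius_distance(projection_embedding(plane_a), projection_embedding(plane_b))
different_lines = frobenius_distance(projection_embedding(line_x), projection_embedding(line_y))
```

The embedded matrix depends only on the subspace, not on the basis chosen for it, so the two bases of the same plane land on the same point. Once subspaces are represented this way, ordinary matrix distances become available.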
Fully Automatic Analysis of Engagement and Its Relationship to Personality in Human-Robot Interactions
Engagement is crucial to designing intelligent systems that can adapt to the characteristics of their users. This paper focuses on automatic analysis and classification of engagement based on humans’ and robot’s personality profiles in a triadic human-human-robot interaction setting. More explicitly, we present a study that involves two participants interacting with a humanoid robot, and investigate how participants’ personalities can be used together with the robot’s personality to predict the engagement state of each participant. The fully automatic system is firstly trained to predict the Big Five personality traits of each participant by extracting individual and interpersonal features from their nonverbal behavioural cues. Secondly, the output of the personality prediction system is used as an input to the engagement classification system. Thirdly, we focus on the concept of “group engagement”, which we define as the collective engagement of the participants with the robot, and analyse the impact of similar and dissimilar personalities on the engagement classification. 
Our experimental results show that (i) using the automatically predicted personality labels for engagement classification yields an F-measure on par with using the manually annotated personality labels, demonstrating the effectiveness of the automatic personality prediction module proposed; (ii) using the individual and interpersonal features without utilising personality information is not sufficient for engagement classification, instead incorporating the participants’ and robot’s personalities with individual/interpersonal features increases engagement classification performance; and (iii) the best classification performance is achieved when the participants and the robot are extroverted, while the worst results are obtained when all are introverted.
This work was performed within the Labex SMART project (ANR-11-LABX-65) supported by French state funds managed by the ANR within the Investissements d’Avenir programme under reference ANR-11-IDEX-0004-02. The work of Oya Celiktutan and Hatice Gunes is also funded by the EPSRC under its IDEAS Factory Sandpits call on Digital Personhood (Grant Ref.: EP/L00416X/1). This is the author accepted manuscript. The final version is available from Institute of Electrical and Electronics Engineers via http://dx.doi.org/10.1109/ACCESS.2016.261452
Development of Biological Movement Recognition by Interaction between Active Basis Model and Fuzzy Optical Flow Division
Computational neuroscience studies based on functional magnetic resonance imaging have claimed that human action recognition in the mammalian brain follows two separate streams, the dorsal and ventral streams. The bioinspired model mirrors this with two pathways, specialized for motion and form information analysis (Giese and Poggio 2003). An active basis model is used for form information, with Gabor wavelets at different orientations and scales forming a dictionary for recognizing the object (human). Biological movement optic-flow patterns are also utilized. Motion information guides the shared sketch algorithm in the form pathway for adjustment, which helps to prevent wrong recognition. A synergetic neural network is used to generate prototype templates representing the general characteristic form of every class. Given the predefined templates, classification is performed by multitemplate matching. As every human action has one action prototype, there is some overlap and consistency among these templates. Fuzzy optical flow division scoring helps to prevent misrecognition. We successfully apply the proposed model to human action videos obtained from the KTH human action database. The proposed approach follows the interaction between dorsal and ventral processing streams in the original model of biological movement recognition. The results indicate a promising outcome and improved robustness with the proposed approach.
Mixture-Based Clustering for High-Dimensional Count Data Using Minorization-Maximization Approaches
The Multinomial distribution has been widely used to model count data. To increase
clustering efficiency, we use an approximation of the Fisher Scoring as a learning algorithm, which is more robust to the choice of the initial parameter values. Moreover,
we consider the generalization of the multinomial model obtained by introducing the
Dirichlet as prior, which is called the Dirichlet Compound Multinomial (DCM). Even
though DCM can address the burstiness phenomenon of count data, the presence of
the Gamma function in its density function usually leads to undesired complications. In
this thesis, we use two alternative representations of DCM distribution to perform
clustering based on finite mixture models, where the mixture parameters are estimated using minorization-maximization algorithm. Moreover, we propose an online
learning technique for unsupervised clustering based on a mixture of Neerchal-Morel
distributions. While the novel mixture model is able to capture overdispersion due
to a weight parameter assigned to each feature in each cluster, online learning is able
to overcome the drawbacks of batch learning in such a way that the mixture’s parameters can be updated instantly for any new data instances. Finally, by implementing
a minimum message length model selection criterion, the weights of irrelevant mixture components are driven towards zero, which resolves the problem of knowing the
number of clusters beforehand. To evaluate and compare the performance of our
proposed models, we have considered five challenging real-world applications that
involve high-dimensional count vectors, namely, sentiment analysis, topic detection,
facial expression recognition, human action recognition and medical diagnosis. The
results show that the proposed algorithms increase the clustering efficiency remarkably as compared to other benchmarks, and the best results are achieved by the
models able to accommodate over-dispersed count data.
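To make the remark about the Gamma function concrete, the following minimal sketch evaluates the log-probability of one count vector under the standard Dirichlet Compound Multinomial density. It is the textbook form of the density, not either of the alternative representations used in the thesis, and the function name `dcm_log_pmf` is a hypothetical choice made for illustration.

```python
import math

def dcm_log_pmf(counts, alpha):
    """Log-probability of a count vector under the standard DCM density.

    counts: non-negative integer counts, one per feature.
    alpha:  positive Dirichlet parameters, same length as counts.
    """
    n = sum(counts)
    a = sum(alpha)
    # Multinomial coefficient n! / prod(x_i!), computed in log space.
    log_coef = math.lgamma(n + 1) - sum(math.lgamma(x + 1) for x in counts)
    # Ratio Gamma(a) / Gamma(a + n).
    log_norm = math.lgamma(a) - math.lgamma(a + n)
    # Product over features of Gamma(x_i + alpha_i) / Gamma(alpha_i).
    log_terms = sum(math.lgamma(x + al) - math.lgamma(al) for x, al in zip(counts, alpha))
    return log_coef + log_norm + log_terms

# With two features and alpha = (1, 1) the DCM is uniform over the n + 1 outcomes.
p = math.exp(dcm_log_pmf([1, 2], [1.0, 1.0]))
total = sum(math.exp(dcm_log_pmf([k, 3 - k], [1.0, 1.0])) for k in range(4))
```

Every term involves a log-Gamma evaluation of the parameters. That dependence is what complicates direct maximum-likelihood updates.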