11 research outputs found

    Feature regularization and learning for human activity recognition.

    Get PDF
    Doctoral Degree. University of KwaZulu-Natal, Durban.
    Feature extraction is an essential component in the design of a human activity recognition model. However, relying on extracted features alone often yields a suboptimal model. This research therefore investigates feature regularization, which encapsulates the discriminative patterns needed for better and more efficient model learning. Firstly, a within-class subspace regularization approach is proposed for eigenfeature extraction and regularization in human activity recognition. In this approach, the within-class subspace is modelled using more eigenvalues from the reliable subspace, yielding a four-parameter modelling scheme. This model enables a more accurate estimation of the eigenvalues that are distorted by the small-sample-size effect. The regularization is done in one piece, avoiding the complexity of modelling the eigenspectrum piecewise. The whole eigenspace is used for performance evaluation because feature extraction and dimensionality reduction are done at a later stage of the evaluation process. Results show that the proposed approach has better discriminative capacity than several other subspace approaches for human activity recognition. Secondly, using a likelihood prior probability, a new regularization scheme that improves the loss function of a deep convolutional neural network is proposed. The results demonstrate that well-regularized features yield better class discrimination in human activity recognition. The major contribution of the thesis is the development of feature extraction strategies for determining the discriminative patterns needed for efficient model learning.
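    The general idea of eigenspectrum regularization can be sketched as follows. The thesis's four-parameter model is not specified in this abstract, so this sketch substitutes the common single-piece decay model λ_k = a/(k + b), fitted to the reliable leading eigenvalues and extrapolated over the whole spectrum:

```python
import numpy as np

def regularize_eigenspectrum(X, n_reliable):
    """Sketch of within-class eigenspectrum regularization.

    X: (n_samples, n_features) data from one class.
    n_reliable: number of leading eigenvalues trusted despite the
    small-sample-size effect (a tuning choice).
    """
    # Within-class eigenvalues via SVD of the centered data.
    Xc = X - X.mean(axis=0)
    _, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    eig = s**2 / max(len(X) - 1, 1)

    # Fit lam_k = a / (k + b) to the first and last reliable eigenvalues
    # (a generic stand-in for the thesis's four-parameter model).
    lam1, lam_m = eig[0], eig[n_reliable - 1]
    a = lam1 * lam_m * (n_reliable - 1) / (lam1 - lam_m)
    b = (n_reliable * lam_m - lam1) / (lam1 - lam_m)

    # Regularize the whole spectrum in one piece, so the distorted
    # trailing eigenvalues follow the model rather than noise.
    k = np.arange(1, len(eig) + 1)
    return a / (k + b), Vt
```

    Features would then be extracted by projecting onto `Vt` and whitening with the regularized eigenvalues instead of the raw, noise-distorted ones.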

    Egocentric Activity Recognition with Multimodal Fisher Vector

    Full text link
    With the increasing availability of wearable devices, research on egocentric activity recognition has received much attention recently. In this paper, we build a Multimodal Egocentric Activity dataset which includes egocentric videos and sensor data for 20 fine-grained and diverse activity categories. We present a novel strategy to extract temporal trajectory-like features from sensor data, and propose to apply the Fisher Kernel framework to fuse video and temporally enhanced sensor features. Experimental results show that, with careful design of the feature extraction and fusion algorithm, sensor data can enhance the information-rich video data. We make the Multimodal Egocentric Activity dataset publicly available to facilitate future research.
    Comment: 5 pages, 4 figures, accepted at ICASSP 2016
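    The Fisher Kernel fusion step can be illustrated with the commonly used mean-gradient part of the Fisher Vector; the paper's full framework and its exact trajectory features are not reproduced here, so this is a minimal sketch over a fitted diagonal-covariance GMM:

```python
import numpy as np
from sklearn.mixture import GaussianMixture

def fisher_vector(local_feats, gmm):
    """Mean-gradient Fisher Vector (a common simplification).

    local_feats: (n, d) local descriptors, e.g. trajectory-like
    sensor features; gmm: fitted GaussianMixture with
    covariance_type='diag'.
    """
    q = gmm.predict_proba(local_feats)          # (n, K) soft assignments
    n = len(local_feats)
    parts = []
    for k in range(gmm.n_components):
        # Whitened deviation of each descriptor from component k.
        diff = (local_feats - gmm.means_[k]) / np.sqrt(gmm.covariances_[k])
        g = (q[:, k, None] * diff).sum(axis=0)
        parts.append(g / (n * np.sqrt(gmm.weights_[k])))
    fv = np.concatenate(parts)
    # Power- and L2-normalization, standard for Fisher Vectors.
    fv = np.sign(fv) * np.sqrt(np.abs(fv))
    return fv / (np.linalg.norm(fv) + 1e-12)
```

    Video and sensor Fisher Vectors could then be fused by concatenation before classification, one plausible reading of the paper's fusion design.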

    Characterizing protein-ligand binding using atomistic simulation and machine learning: Application to drug resistance in HIV-1 protease

    Get PDF
    Over the past several decades, atomistic simulations of biomolecules, whether carried out using molecular dynamics or Monte Carlo techniques, have provided detailed insights into their function. Comparing the results of such simulations for a few closely related systems has guided our understanding of the mechanisms by which changes like ligand binding or mutation can alter function. The general problem of detecting and interpreting such mechanisms from simulations of many related systems, however, remains a challenge. This problem is addressed here by applying supervised and unsupervised machine learning techniques to a variety of thermodynamic observables extracted from molecular dynamics simulations of different systems. As an important test case, these methods are applied to understanding the evasion by HIV-1 protease of darunavir, a potent inhibitor to which resistance can develop via the simultaneous mutation of multiple amino acids. Complex mutational patterns have been observed among resistant strains, presenting a challenge to developing a mechanistic picture of resistance in the protease. In order to dissect these patterns and gain mechanistic insight into the role of specific mutations, molecular dynamics simulations were carried out on a collection of HIV-1 protease variants, chosen to include highly resistant strains and susceptible controls, in complex with darunavir. Using a machine learning approach that takes advantage of the hierarchical nature of the relationships among sequence, structure and function, an integrative analysis of these trajectories reveals key details of the resistance mechanism, including changes in protein structure, hydrogen bonding and protein-ligand contacts.
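    The supervised part of such a workflow can be sketched generically: train a classifier on per-variant observables and read off which features drive the resistant/susceptible distinction. The data below are synthetic stand-ins (the thesis's actual observables and labels are not reproduced here):

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

# Hypothetical setup: each row holds per-residue observables (e.g. mean
# protein-ligand contact frequencies) averaged over one variant's MD
# trajectory; labels mark resistant (1) vs susceptible (0) variants.
rng = np.random.default_rng(0)
n_variants, n_residues = 60, 99
X = rng.normal(size=(n_variants, n_residues))
y = (X[:, 50] + 0.5 * X[:, 30] > 0).astype(int)   # toy "mechanism"

clf = RandomForestClassifier(n_estimators=200, random_state=0).fit(X, y)

# Residues ranked by importance point at the putative resistance
# mechanism; here the planted residues should rank highly.
top = np.argsort(clf.feature_importances_)[::-1][:5]
```

    Feature importances (or coefficients of a linear model) give the mechanistic interpretation layer that raw classification accuracy alone would not.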

    Human action recognition using spatial-temporal analysis.

    Get PDF
    Masters Degree. University of KwaZulu-Natal, Durban.
    In the past few decades, human action recognition (HAR) from video has gained a lot of attention in the computer vision domain. The analysis of human activities in videos spans a variety of applications, including security and surveillance, entertainment, and the monitoring of the elderly. Recognizing human actions in any scenario is a difficult and complex task, characterized by challenges such as self-occlusion, noisy backgrounds and variations in illumination. The literature, however, provides various techniques and approaches for action recognition which deal with these challenges. This dissertation focuses on a holistic approach to the human action recognition problem, with specific emphasis on spatial-temporal analysis, achieved using the Motion History Image (MHI) approach. Three variants of MHI are investigated: the original MHI, a modified MHI and a timed MHI. An MHI is a single image describing a silhouette's motion over a period of time; brighter pixels in the resultant MHI show the most recent movement. A key problem with MHI is that the conditions needed to obtain an MHI silhouette that yields a high recognition rate are not easy to determine. These conditions are often neglected and thus pose a problem for human action recognition systems, as they can affect overall performance. Two methods are proposed to solve the human action recognition problem and to establish the conditions needed to obtain high recognition rates using the MHI approach. The first combines MHI with the Bag of Visual Words (BOVW) approach; the second combines MHI with Local Binary Patterns (LBP). The Weizmann and KTH datasets are used to validate the proposed methods.
    Results from the experiments show promising recognition rates compared to some existing methods. The BOVW approach, combined with the three variants of MHI, achieved higher recognition rates than the LBP method. The original MHI method achieved the highest recognition rate of 87% on the Weizmann dataset, and an 81.6% recognition rate was achieved on the KTH dataset using the modified MHI approach.
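    The original MHI update described above (recent motion at full intensity, older motion decaying) can be sketched in a few lines of NumPy; the threshold and window values here are illustrative tuning choices, not the dissertation's:

```python
import numpy as np

def motion_history_image(frames, tau=30, xi=25):
    """Sketch of the original MHI update.

    frames: sequence of grayscale frames, each an (H, W) array.
    tau: temporal window; xi: frame-difference threshold (both are
    dataset-dependent tuning choices).
    """
    mhi = np.zeros_like(frames[0], dtype=np.float32)
    prev = frames[0].astype(np.float32)
    for f in frames[1:]:
        f = f.astype(np.float32)
        motion = np.abs(f - prev) >= xi          # binary motion mask
        # Recent motion gets full intensity tau; elsewhere, decay by 1.
        mhi = np.where(motion, tau, np.maximum(mhi - 1, 0))
        prev = f
    # Scale to [0, 255]: brighter pixels mean more recent motion.
    return (255.0 * mhi / tau).astype(np.uint8)
```

    The resulting image would then feed either the BOVW pipeline or an LBP descriptor, as in the two proposed methods.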

    Distinguishing Posed and Spontaneous Smiles by Facial Dynamics

    Full text link
    A smile is one of the key elements in identifying emotions and the present state of mind of an individual. In this work, we propose a cluster of approaches to classify posed and spontaneous smiles using deep convolutional neural network (CNN) face features, local phase quantization (LPQ), dense optical flow and histogram of oriented gradients (HOG). Eulerian Video Magnification (EVM) is used for micro-expression smile amplification, along with three normalization procedures for distinguishing posed and spontaneous smiles. Although the deep CNN face model is trained with a large number of face images, HOG features outperform this model on the overall face smile classification task. Using EVM to amplify micro-expressions did not have a significant impact on classification accuracy, while normalizing facial features improved it. Unlike many manual or semi-automatic methodologies, our approach automatically classifies all smiles as either `spontaneous' or `posed' using support vector machines (SVMs). Experimental results on the large UvA-NEMO smile database are promising compared to other relevant methods.
    Comment: 16 pages, 8 figures, ACCV 2016, Second Workshop on Spontaneous Facial Behavior Analysis
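    A minimal HOG-style descriptor, of the kind that performed best here, can be sketched without block normalization (the paper's exact HOG configuration and the SVM stage are not reproduced; cell size and bin count below are illustrative):

```python
import numpy as np

def hog_descriptor(img, n_bins=9, cell=8):
    """Minimal histogram-of-oriented-gradients sketch for a grayscale
    face crop whose sides are multiples of `cell`."""
    img = img.astype(np.float32)
    gy, gx = np.gradient(img)
    mag = np.hypot(gx, gy)
    ang = np.rad2deg(np.arctan2(gy, gx)) % 180   # unsigned orientation
    h, w = img.shape
    feats = []
    for i in range(0, h, cell):
        for j in range(0, w, cell):
            # Magnitude-weighted orientation histogram per cell.
            m = mag[i:i + cell, j:j + cell].ravel()
            a = ang[i:i + cell, j:j + cell].ravel()
            hist, _ = np.histogram(a, bins=n_bins, range=(0, 180), weights=m)
            feats.append(hist)
    f = np.concatenate(feats)
    return f / (np.linalg.norm(f) + 1e-12)
```

    Descriptors extracted from smile frames would then be fed to an SVM for the posed/spontaneous decision.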

    Advances in Spectral Learning with Applications to Text Analysis and Brain Imaging

    Get PDF
    Spectral learning algorithms are becoming increasingly popular in data-rich domains, driven in part by recent advances in large-scale randomized SVD and in spectral estimation of Hidden Markov Models. Extensions of these methods lead to statistical estimation algorithms which are not only fast, scalable, and useful on real data sets, but also provably correct. Following this line of research, we make two contributions. First, we propose a set of spectral algorithms for text analysis and natural language processing. In particular, we propose fast and scalable spectral algorithms for learning word embeddings -- low-dimensional real vectors (called Eigenwords) that capture the "meaning" of words from their context. Second, we show how similar spectral methods can be applied to analyzing brain images. State-of-the-art approaches to learning word embeddings are slow to train or lack theoretical grounding; we propose three spectral algorithms that overcome these limitations. All three algorithms harness the multi-view nature of text data, i.e. the left and right context of each word, and share three characteristics: (1) they are fast to train and scalable; (2) they have strong theoretical properties; (3) they can induce context-specific embeddings, i.e. different embeddings for "river bank" and "Bank of America". They also have lower sample complexity and hence higher statistical power for rare words. We provide theory establishing relationships between these algorithms and optimality criteria for the estimates they provide. We also perform a thorough qualitative and quantitative evaluation of Eigenwords and demonstrate their superior performance over state-of-the-art approaches. Next, we turn to the task of using spectral learning methods for brain imaging data.
    Methods like Sparse Principal Component Analysis (SPCA), Non-negative Matrix Factorization (NMF) and Independent Component Analysis (ICA) have been used to obtain state-of-the-art accuracies in a variety of problems in machine learning. However, their usage in brain imaging, though increasing, is limited by the fact that they are used as out-of-the-box techniques and are seldom tailored to the domain-specific constraints and knowledge of medical imaging, which leads to difficulties in interpreting the results. To address these shortcomings, we propose Eigenanatomy (EANAT), a general framework for sparse matrix factorization. Its goal is to statistically learn the boundaries of, and connections between, brain regions by weighing both the data and prior neuroanatomical knowledge. Although EANAT incorporates some neuroanatomical prior knowledge in the form of connectedness and smoothness constraints, it can still be difficult for clinicians to interpret its results in domains where network-specific hypotheses exist. We thus extend EANAT and present a novel framework for prior-constrained sparse decomposition of matrices derived from brain imaging data, called Prior-Based Eigenanatomy (p-Eigen). We formulate our solution as a prior-constrained l1-penalized (sparse) principal component analysis. Experimental evaluation confirms that p-Eigen extracts biologically relevant, patient-specific functional parcels and that it significantly aids classification of Mild Cognitive Impairment compared to state-of-the-art competing approaches.
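    The spectral route to word embeddings can be illustrated with one common variant: SVD of a PPMI-weighted word-context co-occurrence matrix. The thesis's Eigenwords algorithms use related CCA-style decompositions of left/right context rather than exactly this, so treat the sketch as a stand-in for the family:

```python
import numpy as np

def spectral_embeddings(corpus, dim=2, window=1):
    """Word embeddings via SVD of a PPMI co-occurrence matrix.

    corpus: list of tokenized sentences; dim: embedding size.
    """
    vocab = sorted({w for sent in corpus for w in sent})
    idx = {w: i for i, w in enumerate(vocab)}
    C = np.zeros((len(vocab), len(vocab)))
    for sent in corpus:
        for i, w in enumerate(sent):
            lo, hi = max(0, i - window), min(len(sent), i + window + 1)
            for j in range(lo, hi):
                if j != i:
                    C[idx[w], idx[sent[j]]] += 1
    total = C.sum()
    pw = C.sum(axis=1, keepdims=True) / total    # word marginals
    pc = C.sum(axis=0, keepdims=True) / total    # context marginals
    with np.errstate(divide='ignore', invalid='ignore'):
        pmi = np.log((C / total) / (pw * pc))
    ppmi = np.where(np.isfinite(pmi) & (pmi > 0), pmi, 0.0)
    U, s, _ = np.linalg.svd(ppmi)
    return vocab, U[:, :dim] * np.sqrt(s[:dim])
```

    The SVD step is exactly where the "fast, scalable, provably correct" claims bite: randomized SVD makes it tractable at corpus scale, and the factorization has known optimality properties.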

    Contributions into holistic human action recognition.

    Get PDF
    Doctoral Degree. University of KwaZulu-Natal, Durban.
    In this thesis we holistically investigate the interpretation of human actions in both still images and videos. Human action recognition is currently a research problem of great interest in both academia and industry due to its potential applications, which include security surveillance, sports annotation, human-computer interaction, and robotics. Action recognition, the process of labelling actions using sensory observations, concerns sequences of movements engendered by a human during an executed task. When considering visual observations, this process is quite challenging, facing issues such as background clutter, shadows, illumination variations, occlusions, changes in scale, changes in the person performing the action, and viewpoint variations. Although many approaches to the development of human action recognition systems have been proposed in the literature, they focus more on recognition accuracy while ignoring the computational complexity of the recognition process. However, a human action recognition system is needed that is both effective and efficient and can operate in real time. Firstly, we review, evaluate and compare the most prominent state-of-the-art feature extraction representations, categorized into handcrafted-feature-based and deep-learning-feature-based techniques. Secondly, we propose holistic approaches in each category. The first holistic approach takes advantage of slope patterns in motion history images, a simple two-dimensional representation of video, to reduce the running time of action recognition. The second, based on circular derivative local binary patterns, outperforms the LBP-based state-of-the-art techniques and addresses the issue of dimensionality by producing a feature descriptor of minimal dimension with little compromise in recognition accuracy.
    The third introduces a preprocessing step in a proposed 2D convolutional neural network to deal with the same dimensionality issue differently within deep learning: the temporal dimension is embedded into motion history images before being learned by a two-dimensional convolutional neural network. Thirdly, three datasets (JAFFE, KTH and the Pedestrian Action dataset) were used to validate the proposed human action recognition models. Finally, we show that better performance than the state-of-the-art methods can be achieved using holistic feature-based techniques.
    Author's keywords: Human Action Recognition; Motion History Image; Circular Derivative Local Binary Pattern; Convolutional Neural Network; Facial Expression Recognition; Spatio-Temporal Features
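    The LBP operator underlying the second approach can be sketched in its standard 8-neighbour form; the thesis's circular *derivative* LBP variant builds on this operator but is not reproduced here:

```python
import numpy as np

def lbp_image(img):
    """Basic 8-neighbour LBP: each interior pixel gets an 8-bit code,
    one bit per neighbour that is >= the centre pixel."""
    img = img.astype(np.int32)
    c = img[1:-1, 1:-1]                          # centre pixels
    # Neighbours ordered clockwise from the top-left pixel.
    shifts = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
              (1, 1), (1, 0), (1, -1), (0, -1)]
    code = np.zeros_like(c)
    for bit, (dy, dx) in enumerate(shifts):
        nb = img[1 + dy:img.shape[0] - 1 + dy,
                 1 + dx:img.shape[1] - 1 + dx]
        code |= ((nb >= c).astype(np.int32) << bit)
    return code
```

    A histogram of the resulting codes gives the compact descriptor; the dimensionality saving cited above comes from how the derivative variant reduces the code space.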

    A three-step classification framework to handle complex data distribution for radar UAV detection

    Get PDF
    Unmanned aerial vehicles (UAVs) are used in a wide range of applications and have become an increasingly important radar target. To better model radar data and to tackle the curse of dimensionality, a three-step classification framework is proposed for UAV detection. First, greedy subspace clustering is used to handle potential outliers and the complex sample distribution of radar data. Parameters of the resulting multi-Gaussian model, especially the covariance matrices, cannot be reliably estimated due to insufficient training samples and high dimensionality; thus, in the second step, a multi-Gaussian subspace reliability analysis is proposed to handle the unreliable feature dimensions of these covariance matrices. To address the challenge of classifying samples under the complex multi-Gaussian model, and to fuse the distances from a sample to different clusters at different dimensionalities, a subspace-fusion scheme is proposed in the third step. The proposed approach is validated on a large benchmark dataset, where it significantly outperforms state-of-the-art approaches.
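    The spirit of the second step, distances under a covariance whose unreliable dimensions are tamed, can be sketched with a simple shrinkage regularizer; the paper's subspace reliability analysis is more elaborate, so this is only an illustrative stand-in:

```python
import numpy as np

def shrinkage_mahalanobis(x, mean, cov, alpha=0.1):
    """Squared Mahalanobis distance to one cluster, with the sample
    covariance shrunk toward a scaled identity so that poorly
    estimated dimensions cannot blow up the distance.

    alpha: shrinkage strength (a tuning choice).
    """
    d = cov.shape[0]
    cov_reg = (1 - alpha) * cov + alpha * (np.trace(cov) / d) * np.eye(d)
    diff = x - mean
    return float(diff @ np.linalg.solve(cov_reg, diff))
```

    A sample would then be scored against every cluster of every class, and the third step's fusion scheme would combine those per-cluster distances into one decision.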