55 research outputs found

    RBM-based Silhouette Encoding for Human Action Modelling

    Abstract: In this paper we evaluate the use of Restricted Boltzmann Machines (RBM) in the context of learning and recognizing human actions. The features used as a basis are binary silhouettes of persons. We test the proposed approach on two datasets of human actions where binary silhouettes are available: ViHASi (synthetic data) and Weizmann (real data). In addition, on the Weizmann dataset, we combine features based on optical flow with the associated binary silhouettes. The results show that, thanks to the use of RBM-based models, very informative and shorter feature vectors can be obtained for the classification tasks, improving the classification performance. Keywords: Restricted Boltzmann Machines; binary silhouettes; human actions
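
As a rough illustration of how an RBM can turn a binary silhouette into a shorter feature vector, the sketch below computes the hidden-unit activation probabilities P(h_j = 1 | v) = sigmoid(b_j + sum_i v_i W_ij) for one flattened silhouette. The 4x4 silhouette, the layer sizes and the random weights are all placeholders assumed for the example; a trained model would supply the real parameters, and the paper's actual architecture is not described in this abstract.

```python
import math
import random


def sigmoid(x: float) -> float:
    """Logistic function used for RBM unit activation probabilities."""
    return 1.0 / (1.0 + math.exp(-x))


def encode_silhouette(visible, weights, hidden_bias):
    """Return P(h_j = 1 | v) for each hidden unit j.

    visible:     flat list of 0/1 pixels (the binary silhouette)
    weights:     weights[i][j] links pixel i to hidden unit j
    hidden_bias: one bias per hidden unit
    """
    n_hidden = len(hidden_bias)
    return [
        sigmoid(hidden_bias[j] + sum(v * weights[i][j] for i, v in enumerate(visible)))
        for j in range(n_hidden)
    ]


# Hypothetical 4x4 silhouette, flattened row by row (1 = person, 0 = background).
silhouette = [
    0, 1, 1, 0,
    0, 1, 1, 0,
    1, 1, 1, 1,
    0, 1, 1, 0,
]

# Placeholder parameters (assumption): a trained RBM would supply these.
rng = random.Random(0)
n_visible, n_hidden = len(silhouette), 3
weights = [[rng.gauss(0.0, 0.1) for _ in range(n_hidden)] for _ in range(n_visible)]
hidden_bias = [0.0] * n_hidden

code = encode_silhouette(silhouette, weights, hidden_bias)
print(len(silhouette), "pixels ->", len(code), "features")
```

The hidden probabilities, rather than the raw pixels, would then be passed to a classifier; here 16 inputs are reduced to 3 features purely to show the shape of the computation.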

    Feature regularization and learning for human activity recognition.

    Doctoral Degree. University of KwaZulu-Natal, Durban. Feature extraction is an essential component in the design of a human activity recognition model. However, relying on extracted features alone for learning often makes the model suboptimal. Therefore, this research work seeks to address this potential problem by investigating feature regularization. Feature regularization is used for encapsulating discriminative patterns that are needed for better and more efficient model learning. Firstly, a within-class subspace regularization approach is proposed for eigenfeature extraction and regularization in human activity recognition. In this approach, the within-class subspace is modelled using more eigenvalues from the reliable subspace to obtain a four-parameter modelling scheme. This model enables a better and truer estimation of the eigenvalues that are distorted by the small sample size effect. This regularization is done in one piece, thereby avoiding the undue complexity of modelling the eigenspectrum differently. The whole eigenspace is used for performance evaluation because feature extraction and dimensionality reduction are done at a later stage of the evaluation process. Results show that the proposed approach has better discriminative capacity than several other subspace approaches for human activity recognition. Secondly, with the use of likelihood prior probability, a new regularization scheme that improves the loss function of a deep convolutional neural network is proposed. The results obtained from this work demonstrate that a well-regularized feature yields better class discrimination in human activity recognition. The major contribution of the thesis is the development of feature extraction strategies for determining discriminative patterns needed for efficient model learning

    Vision-based human action recognition using machine learning techniques

    The focus of this thesis is on automatic recognition of human actions in videos. Human action recognition is defined as automatic understanding of what actions occur in a video performed by a human. This is a difficult problem due to the many challenges including, but not limited to, variations in human shape and motion, occlusion, cluttered background, moving cameras, illumination conditions, and viewpoint variations. To start with, the most popular and prominent state-of-the-art techniques are reviewed, evaluated, compared, and presented. Based on the literature review, these techniques are categorized into handcrafted feature-based and deep learning-based approaches. The proposed action recognition framework is then based on these handcrafted and deep learning-based techniques, which are then adopted throughout the thesis by embedding novel algorithms for action recognition, both in the handcrafted and deep learning domains. First, a new method based on a handcrafted approach is presented. This method addresses one of the major challenges known as “viewpoint variations” by presenting a novel feature descriptor for multiview human action recognition. This descriptor employs the region-based features extracted from the human silhouette. The proposed approach is quite simple and achieves state-of-the-art results without compromising the efficiency of the recognition process, which shows its suitability for real-time applications. Second, two innovative methods are presented based on a deep learning approach, to go beyond the limitations of the handcrafted approach. The first method is based on transfer learning using a pre-trained deep learning model as a source architecture to solve the problem of human action recognition. It is experimentally confirmed that a deep Convolutional Neural Network model already trained on a large-scale annotated dataset is transferable to the action recognition task with a limited training dataset. The comparative analysis also confirms its superior performance over handcrafted feature-based methods in terms of accuracy on the same datasets. The second method is based on an unsupervised deep learning approach. This method employs Deep Belief Networks (DBNs) with restricted Boltzmann machines for action recognition in unconstrained videos. The proposed method automatically extracts a suitable feature representation without any prior knowledge using an unsupervised deep learning model. The effectiveness of the proposed method is confirmed with high recognition results on the challenging UCF Sports dataset. Finally, the thesis is concluded with important discussions and research directions in the area of human action recognition

    Bioinformatics Techniques for Studying Drug Resistance In HIV and Staphylococcus Aureus

    The worldwide HIV/AIDS pandemic has been partly controlled and treated by antivirals targeting HIV protease, integrase and reverse transcriptase; however, drug resistance has become a serious problem. HIV-1 drug resistance to protease inhibitors evolves by mutations in the PR gene. The resistance mutations can alter protease catalytic activity, inhibitor binding, and stability. Different machine learning algorithms (restricted Boltzmann machines, clustering, etc.) have been shown to be effective tools for classification of genomic and resistance data. Application of restricted Boltzmann machines produced highly accurate and robust classification of HIV protease resistance. They can also be used to compare resistance profiles of different protease inhibitors. HIV drug resistance has also been studied by enzyme kinetics and X-ray crystallography. Triple mutant HIV-1 protease with resistance mutations V32I, I47V and V82I has been used as a model for the active site of HIV-2 protease. The effects of four investigational antiviral inhibitors were measured for the triple mutant. The tested compounds had significantly worse inhibition of the triple mutant, with Ki values of 17-40 nM compared to 2-10 pM for wild-type protease. The crystal structure of the triple mutant in complex with GRL01111 was solved and showed few changes in protease interactions with the inhibitor. These new inhibitors are not expected to be effective for HIV-2 protease or HIV-1 protease with changes V32I, I47V and V82I. Methicillin-resistant Staphylococcus aureus (MRSA) is an opportunistic pathogen that causes hospital and community-acquired infections. Antibiotic resistance occurs because of a newly acquired low-affinity penicillin-binding protein (PBP2a). Transcriptome analysis was performed to determine how MuM (mutated PBP2 gene) responds to spermine and how Mu50 (wild type) responds to spermine and spermine–β-lactam synergy. Exogenous spermine and oxacillin were found to alter some significant gene expression patterns with major biochemical pathways (iron, sigB regulon) in MRSA with mutant PBP2 protein
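
To put the quoted inhibition constants on a common scale, the short calculation below converts the two Ki ranges to molar units and bounds the fold change between them. The abstract does not say which compound gave which value, so pairing the range endpoints is an assumption and the result should be read only as lower and upper bounds.

```python
# Inhibition constants quoted in the abstract, converted to molar units.
NM = 1e-9   # nanomolar
PM = 1e-12  # picomolar

ki_mutant = (17 * NM, 40 * NM)  # triple mutant range
ki_wild = (2 * PM, 10 * PM)     # wild-type range

# Smallest gap: lowest mutant Ki against highest wild-type Ki.
fold_min = ki_mutant[0] / ki_wild[1]
# Largest gap: highest mutant Ki against lowest wild-type Ki.
fold_max = ki_mutant[1] / ki_wild[0]

print(f"Ki increases roughly {fold_min:,.0f}- to {fold_max:,.0f}-fold")
```

Under that assumption the loss of potency spans about three to four orders of magnitude, which is consistent with the abstract's conclusion that these inhibitors are not expected to be effective against the triple mutant.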

    A review of computer vision-based approaches for physical rehabilitation and assessment

    The computer vision community has extensively researched the area of human motion analysis, which primarily focuses on pose estimation, activity recognition, pose or gesture recognition and so on. However, for many applications, like monitoring of functional rehabilitation of patients with musculoskeletal or physical impairments, the requirement is to comparatively evaluate human motion. In this survey, we capture important literature on vision-based monitoring and physical rehabilitation that focuses on comparative evaluation of human motion during the past two decades and discuss the state of current research in this area. Unlike other reviews in this area, which are written from a clinical objective, this article presents research in this area from a computer vision application perspective. We propose our own taxonomy of computer vision-based rehabilitation and assessment research, which is further divided into sub-categories to capture the novelties of each work. The review discusses the challenges of this domain due to the wide-ranging human motion abnormalities and the difficulty in automatically assessing those abnormalities. Finally, suggestions on the future direction of research are offered

    3D surface reconstruction for lower limb prosthetic model using modified radon transform

    Computer vision has received increased attention for research and innovation on three-dimensional surface reconstruction with the aim of obtaining accurate results. Although many researchers have come up with various novel solutions and shown the feasibility of their findings, most require the use of sophisticated devices, which is computationally expensive. Thus, a proper countermeasure is needed to resolve the reconstruction constraints and create an algorithm that is able to do considerably fast reconstruction by giving attention to devices equipped with appropriate specification, performance and practical affordability. This thesis describes the idea of realizing the three-dimensional surface of residual limb models by adopting the technique of tomographic imaging coupled with a strategy based on multiple views from a digital camera and a turntable. The surface of an object is reconstructed from uncalibrated two-dimensional image sequences of thirty-six different projections with the aid of the Radon transform algorithm and shape-from-silhouette. The results show that the main objective of reconstructing the three-dimensional surface of a lower limb model has been successfully achieved with reasonable accuracy, as a starting point for reconstructing the three-dimensional surface and extracting digital readings of an amputated lower limb model, where the maximum percent error obtained from the computation is approximately 3.3% for the height and 7.4%, 7.9% and 8.1% for the diameters at three specific heights of the objects. It can be concluded that the reconstruction of the three-dimensional surface with the developed method is particularly dependent on the generated silhouette, where high-contrast two-dimensional images contribute to higher accuracy of the silhouette extraction. The advantage of the concept presented in this thesis is that it can be done with a simple experimental setup, and the reconstruction of the three-dimensional model neither involves expensive equipment nor requires any service by an expert to handle a sophisticated mechanical scanning system

    Online Non-linear Prediction of Financial Time Series Patterns

    We consider a mechanistic non-linear machine learning approach to learning signals in financial time series data. A modularised and decoupled algorithm framework is established and is proven on daily sampled closing time-series data for JSE equity markets. The input patterns are based on input data vectors of data windows preprocessed into a sequence of daily, weekly and monthly or quarterly sampled feature measurement changes (log feature fluctuations). The data processing is split into a batch processed step where features are learnt using a Stacked AutoEncoder (SAE) via unsupervised learning, and then both batch and online supervised learning are carried out on Feedforward Neural Networks (FNNs) using these features. The FNN output is a point prediction of measured time-series feature fluctuations (log differenced data) in the future (ex-post). Weight initializations for these networks are implemented with restricted Boltzmann machine pretraining and variance-based initializations. The validity of the FNN backtest results is shown under a rigorous assessment of backtest overfitting using both Combinatorially Symmetrical Cross Validation and Probabilistic and Deflated Sharpe Ratios. Results are further used to develop a view on the phenomenology of financial markets and the value of complex historical data under unstable dynamics
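
The "log feature fluctuations" mentioned above can be sketched as log differences of a price series taken at more than one sampling horizon. The prices below are made up for illustration, and the horizons of 1 and 5 trading days (standing in for daily and weekly sampling) are assumptions; the abstract does not give the exact window lengths used in the study.

```python
import math


def log_fluctuations(prices, horizon):
    """Log differences log(p[t] / p[t - horizon]) for every valid t."""
    return [
        math.log(prices[t] / prices[t - horizon])
        for t in range(horizon, len(prices))
    ]


# Hypothetical daily closing prices; not data from the study.
closes = [100.0, 101.0, 99.5, 102.0, 103.5, 104.0, 102.5, 105.0, 106.0, 107.5, 108.0]

# Assumed horizons in trading days: 1 (daily) and 5 (weekly).
daily = log_fluctuations(closes, 1)
weekly = log_fluctuations(closes, 5)

print(len(daily), len(weekly))
```

One convenient property of log differences is that they add up over time: the daily values sum to the log change over the whole window, which keeps features at different horizons on a comparable scale.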

    Children's scale errors are a natural consequence of learning to associate objects with actions: A computational model.

    Young children sometimes attempt an action on an object that is inappropriate because of the object's size: they make scale errors. Existing theories suggest that scale errors may result from immaturities in children's action planning system, which might be overpowered by increased complexity of object representations or a developing teleofunctional bias. We used computational modelling to emulate children's learning to associate objects with actions and to select appropriate actions, given object shape and size. A computational Developmental Deep Model of Action and Naming (DDMAN) was built on the dual-route theory of action selection, in which actions on objects are selected via a direct (nonsemantic or visual) route or an indirect (semantic) route. As in the case of children, DDMAN produced scale errors: the number of errors was high at the beginning of training and decreased linearly but did not disappear completely. Inspection of emerging object-action associations revealed that these were coarsely organized by shape, hence leading DDMAN to initially select actions based on shape rather than size. With experience, DDMAN gradually learned to use size in addition to shape when selecting actions. Overall, our simulations demonstrate that children's scale errors are a natural consequence of learning to associate objects with actions

    Factored Shapes and Appearances for Parts-based Object Understanding


    Action recognition from RGB-D data

    In recent years, action recognition based on RGB-D data has attracted increasing attention. Different from traditional 2D action recognition, RGB-D data contains extra depth and skeleton modalities. Different modalities have their own characteristics. This thesis presents seven novel methods to take advantage of the three modalities for action recognition. First, effective handcrafted features are designed and a frequent pattern mining method is employed to mine the most discriminative, representative and nonredundant features for skeleton-based action recognition. Second, to take advantage of powerful Convolutional Neural Networks (ConvNets), it is proposed to represent the spatio-temporal information carried in 3D skeleton sequences in three 2D images by encoding the joint trajectories and their dynamics into the color distribution of the images, and ConvNets are adopted to learn the discriminative features for human action recognition. Third, for depth-based action recognition, three strategies of data augmentation are proposed to apply ConvNets to small training datasets. Fourth, to take full advantage of the 3D structural information offered by the depth modality and its insensitivity to illumination variations, three simple, compact yet effective image-based representations are proposed and ConvNets are adopted for feature extraction and classification. However, both of the previous two methods are sensitive to noise and cannot differentiate fine-grained actions well. Fifth, to deal with this issue, it is proposed to represent a depth map sequence as three pairs of structured dynamic images at body, part and joint levels respectively through bidirectional rank pooling. The structured dynamic image preserves the spatial-temporal information, enhances the structure information across both body parts/joints and different temporal scales, and takes advantage of ConvNets for action recognition. Sixth, it is proposed to extract and use scene flow for action recognition from RGB and depth data. Last, to exploit the joint information in multi-modal features arising from heterogeneous sources (RGB, depth), it is proposed to cooperatively train a single ConvNet (referred to as c-ConvNet) on both RGB features and depth features, and deeply aggregate the two modalities to achieve robust action recognition