50 research outputs found

    Investigation of Different Skeleton Features for CNN-based 3D Action Recognition

    Deep learning techniques are being used in skeleton-based action recognition tasks, and outstanding performance has been reported. Compared with RNN-based methods, which tend to overemphasize temporal information, CNN-based approaches can jointly capture spatio-temporal information from texture color images encoded from skeleton sequences. There are several skeleton-based features that have proven effective in RNN-based and handcrafted-feature-based methods. However, it remains unknown whether they are suitable for CNN-based approaches. This paper proposes to encode five spatial skeleton features into images with different encoding methods. In addition, the performance implication of different joints used for feature extraction is studied. The proposed method achieved state-of-the-art performance on the NTU RGB+D dataset for 3D human action analysis. An accuracy of 75.32% was achieved in the Large Scale 3D Human Activity Analysis Challenge in Depth Videos.
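
    The idea of encoding a skeleton sequence as an image can be illustrated with a minimal sketch. It is not the paper's method: the abstract does not specify its five features or its encoding schemes, so this example assumes raw (x, y, z) joint coordinates as the feature and a simple per-axis min-max mapping to 8-bit color channels. The function name `encode_skeleton` and the toy data are hypothetical.

```python
def encode_skeleton(frames):
    """Map a skeleton sequence to an image-like grid of 8-bit RGB pixels.

    frames: list of frames, each a list of (x, y, z) joint coordinates.
    Returns image[joint][frame] = (r, g, b), so rows index joints and
    columns index time. (Assumed layout, chosen for illustration only.)
    """
    # Per-axis minimum and maximum over the whole sequence.
    lo = [min(j[a] for f in frames for j in f) for a in range(3)]
    hi = [max(j[a] for f in frames for j in f) for a in range(3)]

    def to_byte(value, axis):
        span = hi[axis] - lo[axis]
        if span == 0:
            return 0  # constant axis: no information to encode
        return round(255 * (value - lo[axis]) / span)

    n_joints = len(frames[0])
    return [
        [tuple(to_byte(f[j][a], a) for a in range(3)) for f in frames]
        for j in range(n_joints)
    ]


# Hypothetical toy sequence: 2 frames, 2 joints per frame.
toy = [
    [(0.0, 0.0, 1.0), (1.0, 2.0, 1.0)],
    [(0.5, 1.0, 1.0), (1.0, 0.0, 3.0)],
]
image = encode_skeleton(toy)
```

    In this layout the horizontal axis carries temporal change and the vertical axis carries the spatial arrangement of joints, which is one way a 2D convolution could see both at once.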

    RGB-D-based Action Recognition Datasets: A Survey

    Human action recognition from RGB-D (Red, Green, Blue and Depth) data has attracted increasing attention since the first work reported in 2010. Over this period, many benchmark datasets have been created to facilitate the development and evaluation of new algorithms. This raises the question of which dataset to select and how to use it in providing a fair and objective comparative evaluation against state-of-the-art methods. To address this issue, this paper provides a comprehensive review of the most commonly used action recognition related RGB-D video datasets, including 27 single-view datasets, 10 multi-view datasets, and 7 multi-person datasets. The detailed information and analysis of these datasets is a useful resource in guiding insightful selection of datasets for future research. In addition, the issues with current algorithm evaluation vis-à-vis limitations of the available datasets and evaluation protocols are also highlighted, resulting in a number of recommendations for collection of new datasets and use of evaluation protocols.

    ECONOMIC-EMISSION DISPATCH WITH SEMIDEFINITE PROGRAMMING AND RATIONAL FUNCTION APPROXIMATIONS

    The emission function associated with the economic-emission dispatch problem contains exponential functions that model the emission pollutants. This paper presents a strategy for solving the economic-emission dispatch problem whereby the exponential function is approximated by a rational function that permits reduction to a standard polynomial optimization problem. This is reformulated as a hierarchy of semidefinite relaxation problems using moment theory, and the resulting SDP problem is solved. Different degrees of rational function approximation were considered. The approach was tested on the IEEE 30-bus test systems to investigate its effectiveness. Solutions obtained were compared with those from some of the well-known evolutionary methods. Results showed that SDP has an inherently good convergence property and a lower but comparable diversity property.
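
    As a rough illustration of approximating an exponential by a rational function at different degrees, the sketch below uses diagonal Padé approximants of exp(x) around x = 0. This is an assumption made for the example: the abstract does not say which rational approximation the authors used. The function name `pade_exp` is hypothetical.

```python
import math


def pade_exp(x, order=2):
    """Diagonal Pade approximant [order/order] of exp(x) around x = 0."""
    if order == 1:
        num = 1 + x / 2
        den = 1 - x / 2
    elif order == 2:
        num = 1 + x / 2 + x * x / 12
        den = 1 - x / 2 + x * x / 12
    else:
        raise ValueError("only orders 1 and 2 are sketched here")
    return num / den


# Compare the absolute error of the two degrees at a few small arguments.
for x in (0.05, 0.1, 0.2):
    exact = math.exp(x)
    err1 = abs(pade_exp(x, 1) - exact)
    err2 = abs(pade_exp(x, 2) - exact)
    print(f"x={x}: order-1 error={err1:.2e}, order-2 error={err2:.2e}")
```

    Because both numerator and denominator are polynomials, an objective containing such a term can be cleared of denominators and handled as a polynomial problem; a higher degree gives a closer fit near the expansion point at the cost of higher-degree polynomials.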

    New feature-based image adaptive vector quantisation coder

    It is difficult to achieve good low bit rate image compression performance with traditional block coding schemes such as transform coding and vector quantization, without regard for human visual perception or signal dependency. These classical block coding schemes are based on minimizing the MSE at a certain rate. This procedure results in more bits being allocated to areas which may not be visually important, and the resulting quantization noise manifests as a blocking artifact. Blocking artifacts are known to be psychologically more annoying than white noise when the human visual response is considered. While image adaptive vector quantization (IAVQ) attempts to address this problem for traditional vector quantization (VQ) schemes by exploiting image dependency, it ignores human visual perception when allocating bits. This paper addresses this problem through a new IAVQ scheme based on human visual perception. In this method, the input image is partitioned into visual classes and each class, depending on its visual importance, is adaptively or universally encoded. The objective and subjective quality of this scheme has been compared with JPEG and a previously proposed image adaptive VQ scheme. The new scheme subjectively outperforms both schemes at low bit rates.

    Economic-emission Dispatch With Semidefinite Programming And Rational Function Approximations

    The emission function associated with the economic-emission dispatch problem contains exponential functions that model the emission pollutants. This paper presents a strategy of solving the economic-emission dispatch problem whereby the exponential function is approximated by a rational function that permits reduction to a standard polynomial optimization problem. This is reformulated as a hierarchy of semidefinite relaxation problems using the moment theory and the resulting SDP problem is solved. Different degrees of rational functional approximation were considered. The approach was tested on the IEEE 30-bus test systems to investigate its effectiveness. Solutions obtained were compared with those from some of the well known evolutionary methods. Results showed that SDP has inherently good convergence property and a lower but comparable diversity property

    Finding distinctive facial areas for face recognition

    One of the key issues for local appearance based face recognition methods is how to find the most discriminative facial areas. Most of the existing methods assume that anatomical facial components, such as the eyes, nose, and mouth, are the most useful areas for recognition. Other more elaborate methods locate the most salient parts within the face according to a pre-specified criterion. In this paper, a novel method is proposed to identify the discriminative facial areas for face recognition. Unlike the existing methods that only analyze the given face, the proposed method identifies the distinctive areas of each individual’s face by comparing it to the general population. In particular, non-negative matrix factorization (NMF) is extended to learn a localized non-overlapping subspace representation of the facial patterns from a generic face image database. In the learned subspace, the degree of distinctiveness of any facial area is measured by the probability that this area belongs to a general face. For evaluation, the proposed method is tested on exaggerated face images and applied in existing face recognition systems. Experimental results demonstrate the efficiency of the proposed method.

    New wavelet based ART network for texture classification

    A new method for texture classification is proposed. It is composed of two processing stages, namely, a low-level evolutionary feature extraction based on Gabor wavelets and a high-level neural network based pattern recognition. This resembles the process involved in the human visual system. Gabor wavelets are exploited as the feature extractor. A neural network, Fuzzy Adaptive Resonance Theory (Fuzzy ART), acts as the high-level decision making and recognition system. Some modifications to the Fuzzy ART make it capable of simulating the post-natal and evolutionary development of the human visual system. The proposed system has been evaluated using natural textures. The results obtained show that it is able to effectively perform the object recognition task and will find useful application in the study of the human visual system model for artificial vision.

    Inter-occlusion reasoning for human detection based on variational mean field

    Detecting multiple humans in crowded scenes is challenging because the humans are often partially or even totally occluded by each other. In this paper, we propose a novel algorithm for partial inter-occlusion reasoning in human detection based on variational mean field theory. The proposed algorithm can be integrated with various part-based human detectors using different types of features, object representations, and classifiers. The algorithm takes as input an initial set of possible human objects (hypotheses) detected using a part-based human detector. Each hypothesis is decomposed into a number of parts and the occlusion status of each part is inferred by the proposed algorithm. Specifically, initial detections (hypotheses) with spatial layout information are represented in a graphical model and the inference is formulated as an estimation of the marginal probability of the observed data in a Bayesian network. The variational mean field theory is employed as an effective estimation technique. The proposed method was evaluated on popular datasets including CAVIAR, iLIDS, and INRIA. Experimental results have shown that the proposed algorithm not only is able to detect humans under severe occlusion but also enhances the detection performance when there is no occlusion.

    Facial expression recognition for multiplayer online games

    The Multiplayer Online Game (MOG) has become more popular than any other type of computer game because of its support for collaboration, communication and interaction. However, compared with ordinary human communication, the MOG still has many limitations, especially in communication using facial expressions. Although detailed facial animation has already been achieved in a number of MOGs, players have to use text commands to control their avatars' expressions. In this paper, we briefly review the state of the art in facial expression recognition and propose an automatic expression recognition system that can be integrated into a MOG to control the avatar’s facial expressions. We evaluate and improve a number of algorithms to meet the specific requirements of such a system and propose an efficient implementation. In particular, our proposed system uses a smaller, fixed set of facial landmarks to reduce the computational load with little degradation of the recognition performance.