
    Video-based online face recognition using identity surfaces

    Recognising faces across multiple views is more challenging than recognition from a fixed view because of the severe non-linearity caused by rotation in depth, self-occlusion, self-shading, and changes of illumination. The problem is related to that of modelling the spatio-temporal dynamics of moving faces from video input for unconstrained live face recognition. Both problems remain largely under-developed. To address them, a novel approach is presented in this paper. A multi-view dynamic face model is designed to extract the shape-and-pose-free texture patterns of faces. The model provides a precise correspondence for the task of recognition, since the 3D shape information is used to warp the multi-view faces onto the model's mean shape in frontal view. The identity surface of each subject is constructed in a discriminant feature space from a sparse set of face texture patterns or, more practically, from one or more learning sequences containing the face of the subject. Instead of matching templates or estimating multi-modal density functions, face recognition can be performed by computing the pattern distances to the identity surfaces or the trajectory distances between the object and model trajectories. Experimental results demonstrate that this approach achieves an accurate recognition rate, and that using trajectory distances yields more robust performance, since the trajectories encode the spatio-temporal information and contain accumulated evidence about the moving faces in a video input.
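    The two distance measures described above can be sketched as follows. This is a minimal illustration, not the paper's implementation: the identity surface is stood in for by a sparse set of sampled feature patterns, and all names and data are hypothetical.

```python
import numpy as np

def pattern_distance(x, surface_samples):
    # Distance from one shape-and-pose-free texture pattern x to an
    # identity surface approximated by a sparse set of sample patterns.
    return np.min(np.linalg.norm(surface_samples - x, axis=1))

def trajectory_distance(obj_traj, model_traj):
    # Accumulated frame-wise distance between an object trajectory and a
    # model trajectory of the same length, pooling evidence over the sequence.
    return np.sum(np.linalg.norm(obj_traj - model_traj, axis=1))

# Toy example: two identities, each represented by sampled surface patterns.
rng = np.random.default_rng(0)
surface_a = rng.normal(0.0, 0.1, size=(20, 8))   # identity A's surface samples
surface_b = rng.normal(3.0, 0.1, size=(20, 8))   # identity B's surface samples
probe = rng.normal(0.0, 0.1, size=8)             # a pattern near identity A

d_a = pattern_distance(probe, surface_a)
d_b = pattern_distance(probe, surface_b)
print("recognised as A:", d_a < d_b)
```

    In the trajectory variant, the per-frame pattern distances are accumulated over the whole sequence, which is what makes the decision robust to single bad frames.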

    Video Classification Using Spatial-Temporal Features and PCA

    We investigate the problem of automated video classification by analysing the low-level audio-visual signal patterns along the time course in a holistic manner. Five popular TV broadcast genres are studied: sports, cartoon, news, commercial and music. A novel statistically based approach is proposed, comprising two important ingredients designed for implicit semantic content characterisation and class identity modelling. First, a spatial-temporal audio-visual "super" feature vector is computed, capturing crucial clip-level video structure information inherent in a video genre. Second, the feature vector is further processed using Principal Component Analysis to reduce the spatial-temporal redundancy while exploiting the correlations between feature elements, which gives rise to a compact representation for effective probabilistic modelling of each video genre. Extensive experiments are conducted assessing various aspects of the approach and their influence on the overall system performance.
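    The PCA step in the second ingredient can be sketched with a plain SVD-based projection. This is a generic sketch, not the paper's pipeline; the feature dimensions and clip counts below are made up.

```python
import numpy as np

def pca_reduce(X, k):
    # Project rows of X onto the top-k principal components, removing
    # redundancy while keeping the correlated structure between features.
    Xc = X - X.mean(axis=0)                  # centre each feature
    U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:k].T                     # compact clip-level representation

# Toy "super" feature vectors: 30 clips x 100 audio-visual measurements.
rng = np.random.default_rng(1)
X = rng.normal(size=(30, 100))
Z = pca_reduce(X, 10)
print(Z.shape)  # (30, 10)
```

    The compact vectors Z would then feed a per-genre probabilistic model; components are ordered by explained variance, so the first column carries the most.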

    Visual tracking via efficient kernel discriminant subspace learning

    © 2005 IEEE. Robustly tracking moving objects in video sequences is one of the key problems in computer vision. In this paper we introduce a computationally efficient nonlinear kernel learning strategy to find a discriminative model which distinguishes the tracked object from the background. Principal component analysis and linear discriminant analysis have been applied to this problem with some success. These techniques are limited, however, by the fact that they are capable only of identifying linear subspaces within the data. Kernel-based methods, in contrast, are able to extract nonlinear subspaces, and thus represent more complex characteristics of the tracked object and background. This is a particular advantage when tracking deformable objects and when appearance changes occur due to unstable illumination and pose. An efficient approximation to kernel discriminant analysis using QR decomposition, proposed by Xiong et al. (2004), makes possible real-time updating of the optimal nonlinear subspace. We present a tracking method based on this result and show promising experimental results on real videos undergoing large pose and illumination changes.
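    The object/background discrimination idea can be illustrated with a plain two-class kernel Fisher discriminant. This is a textbook sketch under an RBF kernel, not the QR-based approximation of Xiong et al.; the ring-shaped toy data is invented to show a case linear discriminants cannot separate.

```python
import numpy as np

def rbf_kernel(A, B, gamma=0.5):
    d2 = np.sum(A**2, 1)[:, None] + np.sum(B**2, 1)[None, :] - 2 * A @ B.T
    return np.exp(-gamma * d2)

def kda_fit(X, y, gamma=0.5, reg=1e-3):
    # Two-class kernel Fisher discriminant: the discriminative direction lies
    # in the span of the training samples, alpha = (N + reg*I)^-1 (m1 - m0).
    K = rbf_kernel(X, X, gamma)
    m1 = K[:, y == 1].mean(axis=1)
    m0 = K[:, y == 0].mean(axis=1)
    N = np.zeros_like(K)
    for c in (0, 1):
        Kc = K[:, y == c]
        n = Kc.shape[1]
        N += Kc @ (np.eye(n) - np.ones((n, n)) / n) @ Kc.T  # within-class scatter
    return np.linalg.solve(N + reg * np.eye(len(X)), m1 - m0)

def kda_project(X_train, X_new, alpha, gamma=0.5):
    return rbf_kernel(X_new, X_train, gamma) @ alpha

# Toy tracking scenario: the "object" forms a small ring inside the
# "background" ring -- not linearly separable, but separable after the
# kernel mapping.
rng = np.random.default_rng(2)
theta = rng.uniform(0, 2 * np.pi, 40)
obj = np.c_[0.3 * np.cos(theta[:20]), 0.3 * np.sin(theta[:20])]
bg = np.c_[1.5 * np.cos(theta[20:]), 1.5 * np.sin(theta[20:])]
X = np.vstack([obj, bg])
y = np.array([1] * 20 + [0] * 20)

alpha = kda_fit(X, y)
p = kda_project(X, X, alpha)
thr = (p[y == 1].mean() + p[y == 0].mean()) / 2
accuracy = np.mean((p > thr).astype(int) == y)
print("training accuracy:", accuracy)
```

    In a tracker, the discriminant would be re-fitted (or incrementally updated, as in the QR approximation) as object and background appearance drift.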

    Method for solving nonlinearity in recognising tropical wood species

    Classifying tropical wood species poses a considerable economic challenge, and failure to classify wood species accurately can have significant effects on timber industries. Hence, an automatic tropical wood species recognition system was developed at the Centre for Artificial Intelligence and Robotics (CAIRO), Universiti Teknologi Malaysia. The system classifies wood species based on texture analysis, whereby wood surface images are captured and wood features are extracted from these images for use in classification. Previous research on tropical wood species recognition systems considered methods for wood species classification based on linear features. Since wood species are known to exhibit nonlinear features, a Kernel-Genetic Algorithm (Kernel-GA) is proposed in this thesis to perform nonlinear feature selection. This method combines the Kernel Discriminant Analysis (KDA) technique with a Genetic Algorithm (GA) to generate nonlinear wood features and also reduce the dimension of the wood database. The proposed system achieved a classification accuracy of 98.69%, showing a marked improvement over previous work. In addition, a fuzzy logic-based pre-classifier is proposed in this thesis to mimic human interpretation of wood pores, which has been proven to ease the data acquisition bottleneck and to serve as a clustering mechanism that simplifies classification of large databases. The fuzzy logic-based pre-classifier reduced the processing times for training and testing by more than 75% and 26%, respectively. Finally, the fuzzy pre-classifier is combined with the Kernel-GA algorithm to improve the performance of the tropical wood species recognition system. The experimental results show that the combination of the fuzzy pre-classifier and nonlinear feature selection improves the performance of the tropical wood species recognition system in terms of memory space, processing time and classification accuracy.
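    The GA half of the Kernel-GA can be sketched as a bitmask search over feature subsets. This is a minimal, hypothetical sketch: the fitness below is a simple Fisher-style separability score rather than the thesis's KDA-based fitness, and the toy data (features 0 and 1 informative, the rest noise) is invented.

```python
import numpy as np

rng = np.random.default_rng(3)

# Toy wood-feature matrix: only features 0 and 1 separate the two classes.
n = 60
y = np.repeat([0, 1], n // 2)
X = rng.normal(size=(n, 10))
X[:, 0] += 3 * y
X[:, 1] -= 3 * y

def fitness(mask):
    # Class separability of the selected subset, minus a small penalty per
    # feature, so the GA also shrinks the feature dimension.
    if not mask.any():
        return 0.0
    Z = X[:, mask]
    m0, m1 = Z[y == 0].mean(0), Z[y == 1].mean(0)
    within = Z[y == 0].var(0).sum() + Z[y == 1].var(0).sum()
    return np.sum((m0 - m1) ** 2) / (within + 1e-9) - 0.01 * mask.sum()

pop = rng.random((20, 10)) < 0.5           # random bitmask chromosomes
for _ in range(40):                        # generations
    pop = pop[np.argsort([fitness(m) for m in pop])[::-1]]
    children = []
    for _ in range(10):
        a, b = pop[rng.integers(0, 5, 2)]  # parents from the fittest five
        cut = rng.integers(1, 10)
        child = np.r_[a[:cut], b[cut:]]    # one-point crossover
        children.append(child ^ (rng.random(10) < 0.05))  # bit-flip mutation
    pop = np.vstack([pop[:10], children])  # elitism: keep the 10 best
best = max(pop, key=fitness)
print("selected features:", np.flatnonzero(best))
```

    The penalty term is what drives the dimension reduction: a subset that keeps separability while dropping noise features scores higher than the full feature set.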

    Irish Machine Vision and Image Processing Conference Proceedings 2017


    Hand tracking and bimanual movement understanding

    Bimanual movements are a subset of human movements in which the two hands move together in order to perform a task or convey a meaning. A bimanual movement appearing in a sequence of images must be understood in order to enable computers to interact with humans in a natural way. This problem includes two main phases: hand tracking and movement recognition. We approach the problem of hand tracking from a neuroscience point of view. First, the hands are extracted and labelled by colour detection and blob analysis algorithms. In the presence of the two hands, one hand may occasionally occlude the other. Therefore, hand occlusions must be detected in an image sequence. A dynamic model is proposed to model the movement of each hand separately. Using this model in a Kalman filtering process, the exact starting and end points of hand occlusions are detected. We exploit neuroscience phenomena to understand the behaviour of the hands during occlusion periods. Based on this, we propose a general hand tracking algorithm to track and reacquire the hands over a movement including hand occlusion. The advantages of the algorithm and its generality are demonstrated in the experiments.
    In order to recognise the movements, we first recognise the movement of a single hand. Using statistical pattern recognition methods (such as Principal Component Analysis and Nearest Neighbour), the static shape of each hand appearing in an image is recognised. A Graph-Matching algorithm and Discrete Hidden Markov Models (DHMM), as two spatio-temporal pattern recognition techniques, are investigated for recognising a dynamic hand gesture. For recognising bimanual movements we consider two general forms of these movements: single and concatenated periodic. We introduce three Bayesian networks for recognising the movements. The networks are designed to recognise and combine the gestures of the hands in order to understand the whole movement. Experiments on different types of movement demonstrate the advantages and disadvantages of each network.
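    The Kalman filtering step of the tracking phase can be sketched with a constant-velocity model per hand. This is a generic sketch, not the thesis's dynamic model; the noise parameters and trajectory are invented, and the returned innovation is the quantity one might threshold to flag the start of an occlusion.

```python
import numpy as np

def kalman_step(x, P, z, dt=1.0, q=1e-2, r=1e-1):
    # One predict/update cycle of a constant-velocity Kalman filter for a
    # single hand's 2D blob centroid. State x = [px, py, vx, vy].
    F = np.array([[1, 0, dt, 0],
                  [0, 1, 0, dt],
                  [0, 0, 1, 0],
                  [0, 0, 0, 1]], float)
    H = np.array([[1, 0, 0, 0],
                  [0, 1, 0, 0]], float)
    Q, R = q * np.eye(4), r * np.eye(2)
    # Predict
    x = F @ x
    P = F @ P @ F.T + Q
    # Update with the measured centroid z
    innov = z - H @ x                       # large innovation -> possible occlusion
    S = H @ P @ H.T + R
    K = P @ H.T @ np.linalg.inv(S)
    x = x + K @ innov
    P = (np.eye(4) - K @ H) @ P
    return x, P, innov

# Track a hand moving at constant velocity (1.0, 0.5) pixels per frame.
x, P = np.zeros(4), np.eye(4)
for t in range(1, 30):
    z = np.array([t * 1.0, t * 0.5])
    x, P, innov = kalman_step(x, P, z)
print("estimated velocity:", np.round(x[2:], 2))
```

    During a detected occlusion the update step would be skipped and the filter would coast on predictions until the hand is reacquired.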

    Significant Association of Urinary Toxic Metals and Autism-Related Symptoms—A Nonlinear Statistical Analysis with Cross Validation

    Introduction: A number of previous studies examined a possible association of toxic metals and autism, and over half of those studies suggest that toxic metal levels are different in individuals with Autism Spectrum Disorders (ASD). Additionally, several studies found that those levels correlate with the severity of ASD. Methods: In order to further investigate these points, this paper performs the most detailed statistical analysis to date of a data set in this field. First-morning urine samples were collected from 67 children and adults with ASD and 50 neurotypical controls of similar age and gender. The samples were analyzed to determine the levels of 10 urinary toxic metals (UTM). Autism-related symptoms were assessed with eleven behavioral measures. Statistical analysis was used to distinguish participants on the ASD spectrum from neurotypical participants based upon the UTM data alone. The analysis also examined the association of autism severity with toxic metal excretion data using linear and nonlinear analysis. "Leave-one-out" cross-validation was used to ensure statistical independence of results. Results and Discussion: Average excretion levels of several toxic metals (lead, tin, thallium, antimony) were significantly higher in the ASD group. However, ASD classification using univariate statistics proved difficult due to large variability, whereas nonlinear multivariate statistical analysis significantly improved ASD classification, with Type I/II errors of 15% and 18%, respectively. These results clearly indicate that the urinary toxic metal excretion profiles of participants in the ASD group were significantly different from those of the neurotypical participants. Similarly, nonlinear methods found a significantly stronger association between the behavioral measures and toxic metal excretion.
    The association was strongest for the Aberrant Behavior Checklist (including subscales on Irritability, Stereotypy, Hyperactivity, and Inappropriate Speech), but significant associations were found between UTM and all eleven autism-related assessments, with cross-validation R² values ranging from 0.12–0.48. The article is published at http://journals.plos.org/plosone/article?id=10.1371/journal.pone.016952
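    The "leave-one-out" cross-validated R² mentioned above can be sketched generically: each sample is predicted by a model fitted on all the other samples, so the reported fit is independent of the held-out point. This is a minimal sketch with an invented linear toy model, not the paper's regressors.

```python
import numpy as np

def loo_r2(X, y, fit, predict):
    # Leave-one-out cross-validation: predict sample i from a model trained
    # on every sample except i, then score the out-of-sample predictions.
    preds = np.empty(len(y))
    for i in range(len(y)):
        mask = np.arange(len(y)) != i
        model = fit(X[mask], y[mask])
        preds[i] = predict(model, X[i])
    ss_res = np.sum((y - preds) ** 2)
    ss_tot = np.sum((y - y.mean()) ** 2)
    return 1 - ss_res / ss_tot

# Toy least-squares model standing in for the metal-excretion regressors.
fit = lambda X, y: np.linalg.lstsq(np.c_[X, np.ones(len(X))], y, rcond=None)[0]
predict = lambda w, x: np.r_[x, 1.0] @ w

rng = np.random.default_rng(4)
X = rng.normal(size=(40, 3))
y = X @ np.array([0.8, -0.5, 0.3]) + 0.1 * rng.normal(size=40)
print("LOO R^2:", round(loo_r2(X, y, fit, predict), 2))
```

    Because every prediction is out-of-sample, the cross-validated R² is a less optimistic estimate than the in-sample fit, which is why the paper uses it.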

    Automatic Landmarking for Non-cooperative 3D Face Recognition

    This thesis describes a new framework for 3D surface landmarking and evaluates its performance for feature localisation on human faces. This framework has two main parts that can be designed and optimised independently. The first is a keypoint detection system that returns positions of interest for a given mesh surface by using a learnt dictionary of local shapes. The second is a labelling system, using model-fitting approaches that establish a one-to-one correspondence between the set of unlabelled input points and a learnt representation of the class of object to detect. Our keypoint detection system returns local maxima over score maps that are generated from an arbitrarily large set of local shape descriptors. The distributions of these descriptors (scalars or histograms) are learnt for known landmark positions on a training dataset in order to generate a model. The similarity between the input descriptor value for a given vertex and a model shape is used as a descriptor-related score. Our labelling system can make use of both hypergraph matching techniques and rigid registration techniques to reduce the ambiguity attached to unlabelled input keypoints for which a list of model landmark candidates has been seeded. The soft matching techniques use multi-attributed hyperedges to reduce ambiguity, while the registration techniques use a scale-adapted rigid transformation computed from 3 or more points in order to obtain one-to-one correspondences. Our final system achieves results better than or comparable to the state of the art (depending on the metric) while being more generic. It does not require pre-processing such as cropping, spike removal and hole filling, and is more robust to occlusion of salient local regions, such as those near the nose tip and inner eye corners. It is also fully pose invariant and can be used with kinds of objects other than faces, provided that labelled training data is available.
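    The keypoint-detection idea above, scoring each vertex by its descriptor's similarity to a learnt landmark model and keeping local maxima over the mesh neighbourhood, can be sketched on a toy one-dimensional "mesh". All names, the Gaussian-style score, and the data are hypothetical, not the thesis's actual descriptors.

```python
import numpy as np

def keypoints(descriptors, model_mean, model_std, neighbours):
    # Score each vertex by descriptor similarity to a learnt landmark model,
    # then keep vertices that are local maxima over their neighbourhood.
    scores = np.exp(-0.5 * ((descriptors - model_mean) / model_std) ** 2)
    keep = []
    for v, nbrs in enumerate(neighbours):
        if all(scores[v] >= scores[n] for n in nbrs):
            keep.append(v)
    return keep, scores

# Toy 1D "mesh": a chain of 9 vertices with scalar shape descriptors,
# where the learnt landmark model expects descriptor values near 1.0.
desc = np.array([0.1, 0.4, 0.9, 0.5, 0.2, 0.3, 1.0, 0.6, 0.1])
nbrs = [[1]] + [[i - 1, i + 1] for i in range(1, 8)] + [[7]]
kp, _ = keypoints(desc, model_mean=1.0, model_std=0.3, neighbours=nbrs)
print("keypoint vertices:", kp)
```

    On a real mesh the neighbourhood lists would come from the triangulation's 1-rings, and one score map would be produced per descriptor before combination.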