
    Swarm Optimization Algorithms for Face Recognition

    This thesis develops a face recognition system based on swarm intelligence. Swarm intelligence can be defined as the collective intelligence that emerges from a group of simple agents that interact with one another and sense and change their environment locally. A typical face recognition system consists of three stages: feature extraction, feature selection and classification. Two approaches are explored. The first uses Bacterial Foraging Optimization (BFO) to optimize the features extracted by Principal Component Analysis (PCA) and Linear Discriminant Analysis (LDA). The second uses Particle Swarm Optimization (PSO) to optimize the transform coefficients obtained from the Discrete Cosine Transform (DCT) of the images. PCA, LDA and DCT are all appearance-based methods of feature extraction. PCA and LDA are based on global appearance, whereas DCT is performed block by block and so captures local appearance-based features. Finally, the Euclidean distance metric is used for classification. The algorithms are tested on the Yale Face Database.
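
    The classification stage described above can be sketched as a nearest-neighbour search under Euclidean distance, applied only to the features that a swarm optimizer has kept. This is a minimal illustration and not the thesis's implementation: the binary `mask`, the function names and the toy vectors are all assumptions, and the BFO/PSO search that would produce the mask is not shown.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def select(features, mask):
    """Keep only the features whose mask entry is 1 (the candidate subset)."""
    return [f for f, keep in zip(features, mask) if keep]

def classify(probe, gallery, mask):
    """Return the label of the gallery vector nearest to the probe."""
    probe_sel = select(probe, mask)
    best_label, best_dist = None, float("inf")
    for label, vec in gallery:
        d = euclidean(probe_sel, select(vec, mask))
        if d < best_dist:
            best_label, best_dist = label, d
    return best_label

# Toy, made-up feature vectors (not taken from the Yale Face Database).
gallery = [("subject_a", [0.1, 0.9, 5.0]), ("subject_b", [0.8, 0.2, -5.0])]
probe = [0.2, 0.8, -4.0]
mask = [1, 1, 0]  # hypothetical subset: the third feature is dropped
print(classify(probe, gallery, mask))
```

    In this toy data the third feature is misleading, so dropping it changes the nearest match from subject_b to subject_a, which is the kind of effect a feature-selection stage is meant to exploit.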

    Deep Cellular Recurrent Neural Architecture for Efficient Multidimensional Time-Series Data Processing

    Efficient processing of time series data is a fundamental yet challenging problem in pattern recognition. Though recent developments in machine learning and deep learning have enabled remarkable improvements in processing large scale datasets in many application domains, most are designed and regulated to handle inputs that are static in time. Much real-world data, such as in biomedical, surveillance and security, financial, manufacturing and engineering applications, is rarely static in time and demands models able to recognize patterns in both space and time. Current machine learning (ML) and deep learning (DL) models adapted for time series processing tend to grow in complexity and size to accommodate the additional dimensionality of time. Specifically, the biologically inspired learning-based models known as artificial neural networks, which have shown extraordinary success in pattern recognition, tend to grow prohibitively large and cumbersome in the presence of large scale multi-dimensional time series biomedical data such as EEG. Consequently, this work aims to develop representative ML and DL models for robust and efficient large scale time series processing. First, we design a novel ML pipeline with efficient feature engineering to process a large scale multi-channel scalp EEG dataset for automated detection of epileptic seizures. With the use of a sophisticated yet computationally efficient time-frequency analysis technique known as harmonic wavelet packet transform and an efficient self-similarity computation based on fractal dimension, we achieve state-of-the-art performance for automated seizure detection in EEG data. Subsequently, we investigate the development of a novel efficient deep recurrent learning model for large scale time series processing. For this, we first study the functionality and training of a biologically inspired neural network architecture known as cellular simultaneous recurrent neural network (CSRN). We obtain a generalization of this network for multiple topological image processing tasks and investigate the learning efficacy of the complex cellular architecture using several state-of-the-art training methods. Finally, we develop a novel deep cellular recurrent neural network (DCRNN) architecture based on the biologically inspired distributed processing used in CSRN for processing time series data. The proposed DCRNN leverages the cellular recurrent architecture to promote extensive weight sharing and efficient, individualized, synchronous processing of multi-source time series data. Experiments on a large scale multi-channel scalp EEG dataset and a machine fault detection dataset show that the proposed DCRNN offers state-of-the-art recognition performance while using substantially fewer trainable recurrent units.
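
    The abstract does not say which fractal dimension estimator the seizure detection pipeline uses. As a rough illustration of a self-similarity feature of this kind, the sketch below computes the Katz fractal dimension in its common amplitude-only form. The choice of Katz, the function name and the toy signals are assumptions and are not taken from the work itself.

```python
import math

def katz_fd(signal):
    """Katz fractal dimension of a 1-D signal (amplitude-only variant).

    Assumes a non-constant signal with at least three samples. The formula
    is undefined when extent * n == length (the denominator becomes zero).
    """
    n = len(signal) - 1  # number of steps along the curve
    # Total curve length: sum of absolute successive differences.
    length = sum(abs(b - a) for a, b in zip(signal, signal[1:]))
    # Extent: largest distance from the first sample to any later sample.
    extent = max(abs(x - signal[0]) for x in signal[1:])
    return math.log10(n) / (math.log10(n) + math.log10(extent / length))

line = [0, 1, 2, 3, 4]       # straight ramp: the dimension equals 1
rough = [0, 2, 1, 3, 2, 4]   # zigzag ramp: a more irregular curve
print(katz_fd(line), katz_fd(rough))
```

    A more irregular signal yields a larger value, which is why features of this kind are used as cheap per-channel descriptors of signal complexity.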

    Method for the extraction of shock signal features based on the upper limit of density integral

    Shock signal features must be extracted for use in pattern recognition or fault diagnosis. In this work, we proposed a method for the feature extraction of shock signals, which are vibration signals that change faster and have larger amplitude ranges than general signals. First, we proposed the concepts of amplitude density for monotonic functions and piecewise monotonic functions. On the basis of these concepts, we then proposed the concept of the upper limit of density integral (ULDI), which was adopted to obtain signal features. Then, we introduced two types of serious fault cracks to the latch sheet of an automatic gun mechanism that is used on warships. Next, we applied the proposed method to extract the features of shock signals from data acquired when the automatic gun mechanism fired in the normal pattern and in the two fault patterns. Finally, we verified the effectiveness of our proposed method by applying the features that it extracted to a support vector machine (SVM). Our proposed method provided good results and was superior to the traditional statistics-based feature extraction method when applied to an SVM for classification. In addition, the proposed method demonstrated better generalisation than the statistics-based method. Thus, our method is an efficient approach for extracting shock signal features in pattern recognition and fault diagnosis.

    Audio-Visual Automatic Speech Recognition Using PZM, MFCC and Statistical Analysis

    Audio-Visual Automatic Speech Recognition (AV-ASR) has become a highly promising research area for situations in which the audio signal is corrupted by noise. The main objective of this paper is to select the important and discriminative audio and visual speech features for recognizing audio-visual speech. This paper proposes the Pseudo Zernike Moment (PZM) and a feature selection method for audio-visual speech recognition. Visual information is captured from the lip contour, and its moments are computed for lip reading. We extract 19th-order Mel Frequency Cepstral Coefficients (MFCC) as speech features from the audio. Since the 19 speech features are not all equally important, feature selection algorithms are used to select the most efficient features. Several statistical tests, namely Analysis of Variance (ANOVA), the Kruskal-Wallis test and the Friedman test, are employed together with the Incremental Feature Selection (IFS) technique: the tests assess the statistical significance of the speech features, and IFS is then used to select the speech feature subset. Furthermore, multiclass Support Vector Machine (SVM), Artificial Neural Network (ANN) and Naive Bayes (NB) machine learning techniques are used to recognize speech for both the audio and visual modalities. Based on the recognition rates, a combined decision is taken from the two individual recognition systems. This paper compares the results achieved by the proposed model and the existing model for both audio and visual speech recognition. Zernike Moment (ZM) is compared with PZM, and the comparison shows that the proposed model using PZM extracts more discriminative features for visual speech recognition. This study also shows that audio feature selection using statistical analysis outperforms methods without any feature selection technique.
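
    The ranking step that precedes incremental feature selection can be sketched with a one-way ANOVA F-statistic computed per feature. This is a minimal sketch under stated assumptions: the function names and toy data are hypothetical, only the ANOVA test is shown (not the Kruskal-Wallis or Friedman tests), and the IFS loop that would add ranked features one at a time while re-evaluating a classifier is omitted.

```python
def anova_f(groups):
    """One-way ANOVA F-statistic for one feature, given its values per class.

    Assumes at least two classes and some within-class variation.
    """
    k = len(groups)
    n = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n
    means = [sum(g) / len(g) for g in groups]
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    return (ss_between / (k - 1)) / (ss_within / (n - k))

def rank_features(samples, labels):
    """Return feature indices ordered from highest to lowest F-statistic."""
    classes = sorted(set(labels))
    scores = []
    for j in range(len(samples[0])):
        groups = [[s[j] for s, y in zip(samples, labels) if y == c] for c in classes]
        scores.append((anova_f(groups), j))
    return [j for _, j in sorted(scores, reverse=True)]

# Toy, made-up data: feature 1 separates the two words, feature 0 barely does.
samples = [[1.0, 0.1], [2.0, 0.2], [3.0, 0.3], [1.5, 5.1], [2.5, 5.2], [3.5, 5.3]]
labels = ["yes", "yes", "yes", "no", "no", "no"]
print(rank_features(samples, labels))
```

    A high F-statistic means the class means differ strongly relative to the spread within each class, so such features are natural first candidates for the selected subset.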

    A Hand-Based Biometric Verification System Using Ant Colony Optimization

    This paper presents a novel personal authentication system using hand-based biometrics, which utilizes the internal (beneath the skin) structure of veins on the dorsal part of the hand and the outer shape of the hand. The hand-vein and hand-shape images can be acquired simultaneously using an infrared thermal camera and a digital camera, respectively. A claimed identity is authenticated by integrating these two traits through score-level fusion, in which four fusion rules are used for the integration. Before fusion, each modality is evaluated individually in terms of error rates, and weights are assigned according to performance. To achieve adaptive security in the proposed bimodal system, an optimal selection of fusion parameters is required. Hence, Ant Colony Optimization (ACO) is employed in the bimodal system to optimally select the weights and one of the four fusion rules for the adaptive fusion of the two modalities, so as to meet user-defined security levels. The databases of hand veins and hand shapes, covering 150 users, are acquired using a peg-free imaging setup. The experimental results show a genuine acceptance rate (GAR) of 98% at a false acceptance rate (FAR) of 0.001%, and the system has potential for any online personal authentication application.
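
    Score-level fusion and the GAR/FAR figures quoted above can be illustrated with a small sketch. The abstract does not name its four fusion rules, so the weighted-sum rule used here is an assumption, as are the function names, the weight and the made-up match scores. The ACO search over weights and rules is not shown.

```python
def fuse(vein_score, shape_score, w_vein):
    """Weighted-sum fusion of two match scores; the two weights sum to 1."""
    return w_vein * vein_score + (1.0 - w_vein) * shape_score

def gar_far(genuine, impostor, threshold):
    """Fractions of genuine and impostor attempts accepted at a threshold."""
    gar = sum(s >= threshold for s in genuine) / len(genuine)
    far = sum(s >= threshold for s in impostor) / len(impostor)
    return gar, far

# Made-up (vein, shape) similarity scores in [0, 1]; higher means a better match.
genuine_pairs = [(0.9, 0.7), (0.8, 0.6), (0.95, 0.4), (0.7, 0.9)]
impostor_pairs = [(0.2, 0.6), (0.3, 0.1), (0.1, 0.5), (0.4, 0.3)]
w = 0.7  # hypothetical weight favouring the vein modality
genuine = [fuse(v, s, w) for v, s in genuine_pairs]
impostor = [fuse(v, s, w) for v, s in impostor_pairs]
print(gar_far(genuine, impostor, threshold=0.6))
```

    Lowering the threshold admits more impostors while raising it rejects more genuine users, so an optimizer that tunes the weight and the rule is in effect searching for the best GAR at the FAR that the required security level permits.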

    EEG-based brain-computer interfaces using motor-imagery: techniques and challenges.

    Electroencephalography (EEG)-based brain-computer interfaces (BCIs), particularly those using motor-imagery (MI) data, have the potential to become groundbreaking technologies in both clinical and entertainment settings. MI data is generated when a subject imagines the movement of a limb. This paper reviews state-of-the-art signal processing techniques for MI EEG-based BCIs, with a particular focus on the feature extraction, feature selection and classification techniques used. It also summarizes the main applications of EEG-based BCIs, particularly those based on MI data, and finally presents a detailed discussion of the most prevalent challenges impeding the development and commercialization of EEG-based BCIs.