
    Master of Science

    Presently, speech recognition is gaining worldwide popularity in applications like Google Voice, speech-to-text reporter (speech-to-text transcription, video captioning, real-time transcription), hands-free computing, and video games. Research has been done for several years and many speech recognizers have been built. However, most speech recognizers fail to recognize speech accurately. Consider the well-known application Google Voice, which helps users search the web by voice. Though Google Voice does a good job of transcribing spoken words, it does not accurately recognize words spoken with different accents. Because several accents are evolving around the world, it is essential to train the speech recognizer to recognize accented speech. Accent classification is defined as the problem of classifying the accents in a given language. This thesis explores various methods to identify accents. We introduce a new concept of clustering windows of a speech signal and learn a distance metric, using a specific distance measure over phonetic strings, to classify the accents. A language structure is incorporated to learn this distance metric. We also show how kernel approximation algorithms help in learning a distance metric.

    Computer vision based techniques for fall detection with application towards assisted living

    In this thesis, new computer vision based techniques are proposed to detect falls of an elderly person living alone. This is an important problem in assisted living. Different types of information extracted from video recordings are exploited for fall detection using both analytical and machine learning techniques. Initially, a particle filter is used to extract a 2D cue, head velocity, to determine a likely fall event. The human body region is then extracted with a modern background subtraction algorithm. Ellipse fitting is used to represent this shape and its orientation angle is employed for fall detection. An analytical method is used by setting proper thresholds against which the head velocity and orientation angle are compared for fall discrimination. Movement amplitude is then integrated into the fall detector to reduce false alarms. Since 2D features can generate false alarms and are not invariant to different directions, more robust 3D features are next extracted from a 3D person representation formed from video measurements from multiple calibrated cameras. Instead of using thresholds, different data fitting methods are applied to construct models corresponding to fall activities. These are then used to distinguish falls and non-falls. In the final works, two practical fall detection schemes which use only one un-calibrated camera are tested in a real home environment. These approaches are based on 2D features which describe human body posture. These extracted features are then applied to construct either a supervised method for posture classification or an unsupervised method for abnormal posture detection. Certain rules which are set according to the characteristics of fall activities are lastly used to build robust fall detection methods. Extensive evaluation studies are included to confirm the efficiency of the schemes
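
The threshold-based step described above can be sketched as follows. This is a minimal illustration, not the thesis's implementation: it assumes the ellipse is obtained from second-order image moments of the foreground pixels, and the function names and both threshold values are hypothetical.

```python
import math

def orientation_angle(pixels):
    """Major-axis angle (radians, from the x-axis) of the moment-equivalent
    ellipse for a list of (x, y) foreground pixel coordinates."""
    n = len(pixels)
    cx = sum(x for x, _ in pixels) / n
    cy = sum(y for _, y in pixels) / n
    # Central second-order moments of the body region.
    mu20 = sum((x - cx) ** 2 for x, _ in pixels) / n
    mu02 = sum((y - cy) ** 2 for _, y in pixels) / n
    mu11 = sum((x - cx) * (y - cy) for x, y in pixels) / n
    return 0.5 * math.atan2(2 * mu11, mu20 - mu02)

def tilt_from_vertical_deg(pixels):
    """Deviation of the body axis from vertical, in degrees (0 = upright)."""
    theta = orientation_angle(pixels)
    return math.degrees(abs(abs(theta) - math.pi / 2))

def is_fall(pixels, head_speed, speed_threshold=1.0, tilt_threshold_deg=45.0):
    """Flag a fall when both cues exceed their (hypothetical) thresholds."""
    return (head_speed > speed_threshold
            and tilt_from_vertical_deg(pixels) > tilt_threshold_deg)

# Toy silhouettes: an upright bar and a lying bar.
standing = [(x, y) for x in range(3) for y in range(20)]
lying = [(x, y) for x in range(20) for y in range(3)]
```

Requiring both cues to exceed their thresholds mirrors the idea of combining head velocity with orientation angle; in practice the thresholds would have to be tuned on recorded data.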

    Intelligent computer vision processing techniques for fall detection in enclosed environments

    Detecting unusual movement (falls) for elderly people in enclosed environments is receiving increasing attention and is likely to have massive potential social and economic impact. In this thesis, new intelligent computer vision processing based techniques are proposed to detect falls in indoor environments for senior citizens living independently, such as in intelligent homes. Different types of features extracted from video-camera recordings are exploited together with both background subtraction analysis and machine learning techniques. Initially, an improved background subtraction method is used to extract the region of a person in the recording of a room environment. A selective updating technique is introduced for adapting the background model to ensure that the human body region will not be absorbed into the background model when it is static for prolonged periods of time. Since two-dimensional features can generate false alarms and are not invariant to different directions, more robust three-dimensional features are next extracted from a three-dimensional person representation formed from the measurements of multiple calibrated video-cameras. The extracted three-dimensional features are applied to construct a single Gaussian model using the maximum likelihood technique. This can be used to distinguish falls from non-fall activity by comparing the model output with a single threshold. In the final works, new fall detection schemes which use only one uncalibrated video-camera are tested in a real elderly person's home environment. These approaches are based on two-dimensional features which describe different human body postures. The extracted features are applied to construct a supervised method for posture classification for abnormal posture detection. Certain rules which are set according to the characteristics of fall activities are lastly used to build a robust fall detection model.
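
A single Gaussian model fitted by maximum likelihood can be sketched as below. This is an illustration under stated assumptions, not the thesis's code: it assumes the model is fitted to feature vectors from fall examples, uses a diagonal covariance for brevity, and the variance floor and threshold value are hypothetical.

```python
import math

def fit_gaussian(samples, var_floor=1e-6):
    """Maximum-likelihood mean and per-dimension variance (divide by n).
    The variance floor is an added safeguard, not part of the ML estimate."""
    n = len(samples)
    dim = len(samples[0])
    mean = [sum(s[d] for s in samples) / n for d in range(dim)]
    var = [max(sum((s[d] - mean[d]) ** 2 for s in samples) / n, var_floor)
           for d in range(dim)]
    return mean, var

def log_likelihood(x, mean, var):
    """Log-density of x under the diagonal Gaussian (the model output)."""
    return sum(-0.5 * (math.log(2 * math.pi * v) + (xi - m) ** 2 / v)
               for xi, m, v in zip(x, mean, var))

def is_fall(x, mean, var, threshold):
    """Classify as a fall when the model output reaches the threshold."""
    return log_likelihood(x, mean, var) >= threshold

# Toy fall features (hypothetical values).
mean, var = fit_gaussian([[1.0, 2.0], [3.0, 4.0]])
```

Features that resemble the training falls score a high log-likelihood and pass the threshold; features far from the fitted mean score low and are treated as non-fall activity.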

    Vision-based hand shape identification for sign language recognition

    This thesis introduces an approach to obtain image-based hand features that accurately describe hand shapes commonly found in American Sign Language. A hand recognition system capable of identifying 31 hand shapes from American Sign Language was developed to identify hand shapes in a given input image or video sequence. An appearance-based approach with a single camera is used to recognize the hand shape. A region-based shape descriptor, the generic Fourier descriptor, which is invariant to translation, scale, and orientation, has been implemented to describe the shape of the hand. A wrist detection algorithm has been developed to remove the forearm from the hand region before the features are extracted. The recognition of the hand shapes is performed with a multi-class Support Vector Machine. Testing provided a recognition rate of approximately 84% based on a widely varying testing set of approximately 1,500 images and a training set of about 2,400 images. With a larger training set of approximately 2,700 images and a testing set of approximately 1,200 images, the recognition rate increased to about 88%.

    A novel method for apoptosis protein subcellular localization prediction combining encoding based on grouped weight and support vector machine

    Apoptosis proteins have a central role in the development and homeostasis of an organism. These proteins are very important for understanding the mechanism of programmed cell death. Based on the idea of coarse-grained description and grouping in physics, a new feature extraction method with grouped weight for protein sequences is presented and, combined with a support vector machine, applied to apoptosis protein subcellular localization prediction. For the same training dataset and the same predictive algorithm, the overall prediction accuracy of our method in the Jackknife test is 13.2% and 15.3% higher than the accuracy based on the amino acid composition and the instability index, respectively. For the else class of apoptosis proteins in particular, the increase in prediction accuracy is 41.7 and 33.3 percentage points, respectively. The experimental results show that the new feature extraction method efficiently extracts the structural information implied in the protein sequence and that the method reaches a satisfactory performance despite its simplicity. The overall prediction accuracy of the EBGW_SVM model on dataset ZD98 reaches 92.9% in the Jackknife test, which is 8.2–20.4 percentage points higher than other existing models. For a new dataset, ZW225, the overall prediction accuracy of EBGW_SVM reaches 83.1%. These results imply that EBGW_SVM is a simple but efficient model for apoptosis protein subcellular location prediction.
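
For context, the amino acid composition baseline mentioned above is commonly computed as the relative frequency of each of the 20 standard amino acids in a sequence. The sketch below illustrates that baseline only, not the grouped-weight encoding itself; the function name and the handling of non-standard letters are assumptions.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues, one-letter codes

def amino_acid_composition(sequence):
    """Return a 20-dimensional vector of residue frequencies.
    Letters outside the 20 standard codes are ignored (an assumption)."""
    seq = sequence.upper()
    counts = [seq.count(aa) for aa in AMINO_ACIDS]
    total = sum(counts)
    if total == 0:
        raise ValueError("sequence contains no standard amino acids")
    return [c / total for c in counts]

# Toy sequence (hypothetical, not taken from ZD98 or ZW225).
features = amino_acid_composition("AACD")
```

A vector like this can be passed directly to a classifier such as a support vector machine. It discards residue order, which is the information a sequence-aware encoding aims to retain.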

    Reinforcement Learning Environment for Orbital Station-Keeping

    In this thesis, a Reinforcement Learning (RL) environment for orbital station-keeping is created and tested against one of the most widely used Reinforcement Learning algorithms, Proximal Policy Optimization (PPO). This thesis also explores the foundations of Reinforcement Learning, from the taxonomy to a description of PPO, and gives a thorough explanation of the physics required to build the RL environment. Optuna optimizes PPO's hyper-parameters for the created environment via distributed computing. The thesis then presents and analyzes the results from training a PPO agent six times.
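
As a reminder of what PPO optimizes, the sketch below computes the clipped surrogate objective for a batch of samples. It assumes the clipped variant of PPO (the abstract does not say which variant is used), and the function name and the default clipping range of 0.2 are illustrative choices.

```python
def ppo_clip_objective(ratios, advantages, epsilon=0.2):
    """Mean clipped surrogate objective.
    ratios: new-policy / old-policy action probabilities per sample.
    advantages: advantage estimates for the same samples."""
    terms = []
    for r, a in zip(ratios, advantages):
        clipped_r = max(1.0 - epsilon, min(r, 1.0 + epsilon))
        # Take the more pessimistic of the unclipped and clipped terms.
        terms.append(min(r * a, clipped_r * a))
    return sum(terms) / len(terms)

# Toy batch: one sample with positive and one with negative advantage.
objective = ppo_clip_objective([1.5, 0.5], [1.0, -1.0])
```

Clipping the probability ratio removes the incentive to move the policy far from the one that collected the data in a single update.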

    Marine propulsion shaft system fault diagnosis method based on partly ensemble empirical mode decomposition and SVM

    This paper investigates the application of Partly Ensemble Empirical Mode Decomposition (PEEMD), Principal Component Analysis (PCA) and Support Vector Machine (SVM) to signal processing, attribute reduction and pattern recognition. On this basis, a novel method for mechanical fault diagnosis based on PEEMD, PCA and SVM is presented. The method uses PEEMD to extract fault feature parameters from the statistical characteristics of the intrinsic mode functions to form feature vectors, then applies PCA for attribute reduction to obtain the key features, and finally feeds these key features into a GA-optimized SVM to perform fault pattern recognition. Experimental results from applying the proposed method to fault diagnosis of the rolling bearing and stern bearing of a marine propulsion shaft system show that the method extracts fault features with better classification ability while significantly reducing computational complexity, which improves classifier efficiency and yields better classification performance.

    Statistical Approaches for Signal Processing with Application to Automatic Singer Identification

    In the music world, the singing voice is known as the oldest instrument and plays an important role in musical recordings. The singer's identity serves as a primary aid for people to organize, browse, and retrieve music recordings. In this thesis, we focus on the problem of singer identification based on the acoustic features of the singing voice. An automatic singer identification system is constructed and achieves very high identification accuracy. The system consists of three crucial parts: singing voice detection, background music removal and pattern recognition. These parts are introduced and explored in great detail in this thesis. Specifically, for singing voice detection, we first study a traditional method, double GMM, and then propose an improved method, namely single GMM. The experimental results show that the detection accuracy of single GMM reaches 96.42%. For background music removal, Non-negative Matrix Factorization (NMF) and Robust Principal Component Analysis (RPCA) are demonstrated. The evaluation results show that RPCA outperforms NMF. For pattern recognition, we explore the Support Vector Machine (SVM) and Gaussian Mixture Model (GMM) algorithms. The experimental results show that the prediction accuracy of the GMM classifier is about 16% higher than that of the SVM.