    Temporal Feature Integration for Music Organisation

    Acoustic Approaches to Gender and Accent Identification

    There has been considerable research on the problems of speaker and language recognition from samples of speech. A less researched problem is that of accent recognition. Although this is a similar problem to language identification, different accents of a language exhibit more fine-grained differences between classes than languages. This presents a tougher problem for traditional classification techniques. In this thesis, we propose and evaluate a number of techniques for gender and accent classification. These techniques are novel modifications and extensions to state-of-the-art algorithms, and they result in enhanced performance on gender and accent recognition. The first part of the thesis focuses on the problem of gender identification, and presents a technique that gives improved performance in situations where training and test conditions are mismatched. The bulk of this thesis is concerned with the application of the i-Vector technique to accent identification, which is the most successful approach to acoustic classification to have emerged in recent years. We show that it is possible to achieve high accuracy accent identification without reliance on transcriptions and without utilising phoneme recognition algorithms. The thesis describes various stages in the development of i-Vector based accent classification that improve the standard approaches usually applied for speaker or language identification, which are insufficient. We demonstrate that very good accent identification performance is possible with acoustic methods by considering different i-Vector projections, frontend parameters, i-Vector configuration parameters, and an optimised fusion of the resulting i-Vector classifiers we can obtain from the same data. We claim to have achieved the best accent identification performance on the test corpus for acoustic methods, with up to 90% identification rate. This performance is even better than previously reported acoustic-phonotactic based systems on the same corpus, and is very close to performance obtained via transcription based accent identification. Finally, we demonstrate that the utilization of our techniques for speech recognition purposes leads to considerably lower word error rates.
    Keywords: Accent Identification, Gender Identification, Speaker Identification, Gaussian Mixture Model, Support Vector Machine, i-Vector, Factor Analysis, Feature Extraction, British English, Prosody, Speech Recognition

    Text-Independent Speaker Identification using Statistical Learning

    The proliferation of voice-activated devices and systems and over-the-phone bank transactions has made our daily affairs much easier in recent times. The ease that these systems offer also calls for them to be fail-safe against impersonators. Due to the sensitive information that might be shared on these systems, it is imperative that security be of utmost concern during the development stages. Vital systems like these should incorporate the functionality of discriminating between the actual speaker and impersonators. That functionality is the focus of this thesis. Several methods have been proposed to achieve this, and some success has been recorded so far. However, due to the vital role this system has to play in securing critical information, efforts have continually been made to reduce the probability of error in such systems. Therefore, statistical learning methods are utilized in this thesis because they have proven to have high accuracy and efficiency in various other applications. The statistical methods used are Gaussian Mixture Models and Support Vector Machines. These methods have become the de facto techniques for designing speaker identification systems. The effectiveness of the support vector machine is dependent on the type of kernel used. Several kernels have been proposed for achieving better results, and we also introduce a kernel in this thesis which will serve as an alternative to the already defined ones. Other factors, including the number of components used in the Gaussian Mixture Model (GMM), affect the performance of the system; these factors are examined in this thesis and exciting results were obtained.

    Acoustic Features for Environmental Sound Analysis

    Most of the time it is nearly impossible to differentiate between particular types of sound events from a waveform only. Therefore, frequency domain and time-frequency domain representations have been used for years, providing representations of the sound signals that are more in line with human perception. However, these representations are usually too generic and often fail to describe specific content that is present in a sound recording. A lot of work has been devoted to designing features that could allow extracting such specific information, leading to a wide variety of hand-crafted features. During the past years, owing to the increasing availability of medium scale and large scale sound datasets, an alternative approach to feature extraction has become popular, the so-called feature learning. Finally, processing the amount of data that is at hand nowadays can quickly become overwhelming. It is therefore of paramount importance to be able to reduce the size of the dataset in the feature space. The general processing chain to convert a sound signal to a feature vector that can be efficiently exploited by a classifier, and the relation to features used for speech and music processing, are described in this chapter.

    Condition Monitoring Methods for Large, Low-speed Bearings

    In all industrial production plants, well-functioning machines and systems are required for sustained and safe operation. However, asset performance degrades over time and may lead to reduced efficiency, poor product quality, secondary damage to other assets or even complete failure and unplanned downtime of critical systems. Besides the potential safety hazards from machine failure, the economic consequences are large, particularly in offshore applications where repairs are difficult. This thesis focuses on large, low-speed rolling element bearings, exemplified by the main swivel bearing of an offshore drilling machine. Surveys have shown that bearing failure in drilling machines is a major cause of rig downtime. Bearings have a finite lifetime, which can be estimated using formulas supplied by the bearing manufacturer. Premature failure may still occur as a result of irregularities in operating conditions and use, lubrication, mounting, contamination, or external environmental factors. Conversely, a bearing may also exceed the expected lifetime. Compared to smaller bearings, historical failure data from large, low-speed machinery is rare. Due to the high cost of maintenance and repairs, the preferred maintenance arrangement is often condition based. Vibration measurement with accelerometers is the most common data acquisition technique. However, vibration based condition monitoring of large, low-speed bearings is challenging, due to non-stationary operating conditions, low kinetic energy and increased distance from fault to transducer. On the sensor side, this project has also investigated the usage of acoustic emission sensors for condition monitoring purposes. Roller end damage is identified as a failure mode of interest in tapered axial bearings. Early stage abrasive wear has been observed on bearings in drilling machines. The failure mode is currently only detectable upon visual inspection and potentially through wear debris in the bearing lubricant.
    In this thesis, multiple machine learning algorithms are developed and applied to handle the challenges of fault detection in large, low-speed bearings with little or no historical data and unknown fault signatures. The feasibility of transfer learning is demonstrated, as an approach to speed up implementation of automated fault detection systems when historical failure data is available. Variational autoencoders are proposed as a method for unsupervised dimensionality reduction and feature extraction, being useful for obtaining a health indicator with a statistical anomaly detection threshold. Data is collected from numerous experiments throughout the project. Most notably, a test was performed on a real offshore drilling machine with roller end wear in the bearing. To replicate this failure mode and aid development of condition monitoring methods, an axial bearing test rig has been designed and built as a part of the project. An overview of all experiments, methods and results is given in the thesis, with details covered in the appended papers.