410 research outputs found

    Classification of music genres using sparse representations in overcomplete dictionaries

    Get PDF
    This paper presents a simple, but efficient and robust, method for music genre classification that utilizes sparse representations in overcomplete dictionaries. The training step involves creating dictionaries, using the K-SVD algorithm, in which data corresponding to a particular music genre has a sparse representation. In the classification step, the Orthogonal Matching Pursuit (OMP) algorithm is used to separate feature vectors that consist only of Linear Predictive Coding (LPC) coefficients. The paper analyses in detail a popular case study from the literature, the ISMIR 2004 database. Using the presented method, the correct classification percentage of the 6 music genres is 85.59, result that is comparable with the best results published so far

    A Two-Phase Damped-Exponential Model for Speech Synthesis

    Get PDF
    It is well known that there is room for improvement in the resultant quality of speech synthesizers in use today. This research focuses on the improvement of speech synthesis by analyzing various models for speech signals. An improvement in synthesis quality will benefit any system incorporating speech synthesis. Many synthesizers in use today use linear predictive coding (LPC) techniques and only use one set of vocal tract parameters per analysis frame or pitch period for pitch-synchronous synthesizers. This work is motivated by the two-phase analysis-synthesis model proposed by Krishnamurthy. In lieu of electroglottograph data for vocal tract model transition point determination, this work estimates this point directly from the speech signal. The work then evaluates the potential of the two-phase damped-exponential model for synthetic speech quality improvement. LPC and damped-exponential models are used for synthesis. Statistical analysis of data collected in a subjective listening test indicates a statistically significant improvement (at the 0.05 significance level) in quality using this two-phase damped-exponential model over single-phase LPC, single-phase damped-exponential and two-phase LPC for the speakers, sentences, and model orders used. This subjective test shows the potential for quality improvement of synthesized speech and supports the need for further research and testing

    Speech Analysis using Relative Spectral Filtering (RASTA) and Dynamic Time Warping (DTW) methods

    Get PDF
    This work consists of analysis of speech using RASTA and DTW methods. The analysis is based on the speech recognition. Speech recognition converts identified words or speech in spoken language into computer-readable format. The first speech recognition has been developed in the year of 1950s. The variation of speech spoken by individual becomes the main challenge for the speech recognition. Speech recognition has application in many areas such as customer call centers and as a medium in helping those with learning disabilities. This work presents an analysis of speech for Malay single words. There are three stages in speech recognition which are analysis, feature extraction and modeling. The Relative Spectral Filtering (RASTA) is used as the method for feature extraction. RASTA is a method that subsidized the undesirable and additive noise in speech recognition. Dynamic Time Warping (DTW) method is used as the modelling technique

    Bias Removal Approach in System Identification and Arma Spectral Estimation

    Get PDF
    Electrical Engineerin

    An Optimal Orthogonal Decomposition Method for Kalman Filter-Based Turbofan Engine Thrust Estimation

    Get PDF
    A new linear point design technique is presented for the determination of tuning parameters that enable the optimal estimation of unmeasured engine outputs such as thrust. The engine s performance is affected by its level of degradation, generally described in terms of unmeasurable health parameters related to each major engine component. Accurate thrust reconstruction depends upon knowledge of these health parameters, but there are usually too few sensors to be able to estimate their values. In this new technique, a set of tuning parameters is determined which accounts for degradation by representing the overall effect of the larger set of health parameters as closely as possible in a least squares sense. The technique takes advantage of the properties of the singular value decomposition of a matrix to generate a tuning parameter vector of low enough dimension that it can be estimated by a Kalman filter. A concise design procedure to generate a tuning vector that specifically takes into account the variables of interest is presented. An example demonstrates the tuning parameters ability to facilitate matching of both measured and unmeasured engine outputs, as well as state variables. Additional properties of the formulation are shown to lend themselves well to diagnostics

    Testing significance of features by lassoed principal components

    Full text link
    We consider the problem of testing the significance of features in high-dimensional settings. In particular, we test for differentially-expressed genes in a microarray experiment. We wish to identify genes that are associated with some type of outcome, such as survival time or cancer type. We propose a new procedure, called Lassoed Principal Components (LPC), that builds upon existing methods and can provide a sizable improvement. For instance, in the case of two-class data, a standard (albeit simple) approach might be to compute a two-sample tt-statistic for each gene. The LPC method involves projecting these conventional gene scores onto the eigenvectors of the gene expression data covariance matrix and then applying an L1L_1 penalty in order to de-noise the resulting projections. We present a theoretical framework under which LPC is the logical choice for identifying significant genes, and we show that LPC can provide a marked reduction in false discovery rates over the conventional methods on both real and simulated data. Moreover, this flexible procedure can be applied to a variety of types of data and can be used to improve many existing methods for the identification of significant features.Comment: Published in at http://dx.doi.org/10.1214/08-AOAS182 the Annals of Applied Statistics (http://www.imstat.org/aoas/) by the Institute of Mathematical Statistics (http://www.imstat.org

    Evaluation of an Outer Loop Retrofit Architecture for Intelligent Turbofan Engine Thrust Control

    Get PDF
    The thrust control capability of a retrofit architecture for intelligent turbofan engine control and diagnostics is evaluated. The focus of the study is on the portion of the hierarchical architecture that performs thrust estimation and outer loop thrust control. The inner loop controls fan speed so the outer loop automatically adjusts the engine's fan speed command to maintain thrust at the desired level, based on pilot input, even as the engine deteriorates with use. The thrust estimation accuracy is assessed under nominal and deteriorated conditions at multiple operating points, and the closed loop thrust control performance is studied, all in a complex real-time nonlinear turbofan engine simulation test bed. The estimation capability, thrust response, and robustness to uncertainty in the form of engine degradation are evaluated

    Development of the Arabic Voice Pathology Database and Its Evaluation by Using Speech Features and Machine Learning Algorithms

    Get PDF
    A voice disorder database is an essential element in doing research on automatic voice disorder detection and classification. Ethnicity affects the voice characteristics of a person, and so it is necessary to develop a database by collecting the voice samples of the targeted ethnic group. This will enhance the chances of arriving at a global solution for the accurate and reliable diagnosis of voice disorders by understanding the characteristics of a local group. Motivated by such idea, an Arabic voice pathology database (AVPD) is designed and developed in this study by recording three vowels, running speech, and isolated words. For each recorded samples, the perceptual severity is also provided which is a unique aspect of the AVPD. During the development of the AVPD, the shortcomings of different voice disorder databases were identified so that they could be avoided in the AVPD. In addition, the AVPD is evaluated by using six different types of speech features and four types of machine learning algorithms. The results of detection and classification of voice disorders obtained with the sustained vowel and the running speech are also compared with the results of an English-language disorder database, the Massachusetts Eye and Ear Infirmary (MEEI) database
    • …
    corecore