    Non-linear System Identification with Composite Relevance Vector Machines

    Nonlinear system identification based on relevance vector machines (RVMs) has traditionally been addressed by stacking the input and/or output regressors and then performing standard RVM regression. This letter introduces a full family of composite kernels that integrate the input and output information in the mapping function efficiently, thereby generalizing the standard approach. An improved trade-off between accuracy and sparsity is obtained on several benchmark problems. In addition, the RVM yields confidence intervals for its predictions and is less sensitive to the selection of free parameters.
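
    To make the composite-kernel idea concrete, here is a minimal sketch: separate RBF kernels are computed over the input and output regressors of a toy NARX-style series and combined as a weighted sum, which is a valid kernel since sums of kernels are kernels. The letter's model is an RVM; no RVM ships with scikit-learn, so kernel ridge regression stands in here, and the kernel parameters and mixing weight are illustrative assumptions.

```python
# Sketch of a summation-form composite kernel for NARX-style system
# identification. Kernel ridge regression stands in for the RVM; the
# composite kernel construction is the point being illustrated.
import numpy as np
from sklearn.kernel_ridge import KernelRidge

def rbf(A, B, gamma):
    """RBF Gram matrix between row-stacked samples A and B."""
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d2)

def composite_kernel(Zx, Zy, gamma_x=1.0, gamma_y=1.0, w=0.5):
    """Weighted sum of a kernel on input regressors Zx and one on
    output regressors Zy (sums of kernels are kernels)."""
    return w * rbf(Zx, Zx, gamma_x) + (1 - w) * rbf(Zy, Zy, gamma_y)

# Toy NARX data: predict y[t] from (u[t-1], u[t-2]) and (y[t-1], y[t-2]).
rng = np.random.default_rng(0)
u = rng.normal(size=200)
y = np.zeros(200)
for t in range(2, 200):
    y[t] = 0.5 * y[t-1] - 0.2 * y[t-2] + np.tanh(u[t-1]) + 0.05 * rng.normal()

Zx = np.column_stack([u[1:-1], u[:-2]])   # input regressors
Zy = np.column_stack([y[1:-1], y[:-2]])   # output regressors
target = y[2:]

K = composite_kernel(Zx, Zy, w=0.6)
model = KernelRidge(kernel="precomputed", alpha=1e-2).fit(K, target)
print("train RMSE:", np.sqrt(np.mean((model.predict(K) - target) ** 2)))
```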

    Maximum Penalized Likelihood Kernel Regression for Fast Adaptation


    Speaker Recognition Based on Semantic Indexing

    In this paper, a new pitch-extraction method is developed to improve performance on the eigenvoices problem. The pitch of each voice is indexed in a document matrix, and the voice documents are then mapped into preserved semantic features. The proposed voice recognition system operates in two phases: enrollment and recognition. A closed dataset of voices belonging to speakers of different sexes and ages was enrolled in the first phase. The recognition phase achieved promising results of about 81% for both sexes, confirming the success of the recognition task and the efficiency of the proposed system.
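
    The sketch below illustrates the general indexing pipeline the abstract describes, under stated assumptions: pitch contours are quantized into discrete "pitch words", a word-by-document matrix is built over enrolled utterances, and truncated SVD (latent semantic indexing) supplies the low-dimensional semantic features in which recognition is done by cosine similarity. The bin count, pitch range, and matching rule are illustrative, not the paper's exact recipe.

```python
# Minimal sketch: pitch indexing into a document matrix, LSI via
# truncated SVD, and nearest-speaker matching by cosine similarity.
import numpy as np
from sklearn.decomposition import TruncatedSVD
from sklearn.metrics.pairwise import cosine_similarity

def pitch_document(f0, n_bins=32, f0_min=50.0, f0_max=400.0):
    """Histogram of quantized pitch values: one 'document' per utterance."""
    bins = np.linspace(f0_min, f0_max, n_bins + 1)
    idx = np.clip(np.digitize(f0, bins) - 1, 0, n_bins - 1)
    return np.bincount(idx, minlength=n_bins).astype(float)

# Toy enrollment phase: three speakers with different pitch ranges.
rng = np.random.default_rng(1)
enroll = np.stack([
    pitch_document(rng.normal(110, 15, 300)),   # speaker 0 (low pitch)
    pitch_document(rng.normal(180, 15, 300)),   # speaker 1 (mid pitch)
    pitch_document(rng.normal(240, 20, 300)),   # speaker 2 (high pitch)
])

# Map the pitch-document matrix into a low-dimensional semantic space.
lsi = TruncatedSVD(n_components=2).fit(enroll)
enrolled = lsi.transform(enroll)

# Recognition phase: project a test utterance, match by cosine similarity.
test = lsi.transform(pitch_document(rng.normal(115, 15, 300))[None, :])
print("recognized speaker:", cosine_similarity(test, enrolled).argmax())
```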

    Transforming Analogous Time Series Data to Improve Natural Gas Demand Forecast Accuracy

    This work improves daily natural gas demand forecasting models for days with unusual weather patterns through the use of analogous data (also known as surrogate data). Developing accurate mathematical models requires data that describe the system. When the available data do not completely describe the system or all possible events in it, alternative methods are used to account for the missing information. Improved models can be built by supplementing the lack of data with data or models from sources where more information is available. Time series forecasting involves building models from a set of historical data. When enough historical data are available, the training set exhibits ample variation. This results in higher accuracy in GasDay natural gas demand forecasting models, since there is a wide range of history to describe. In real-world applications, it also means that the data are more realistic, owing to the stochastic nature of real events. However, enough historical data are not always available, whether because only a few years of history exist or because the available data do not exhibit as much variation as desired. By taking advantage of GasDay's many customers in various geographical locations, a large pool of data sets may be used to address this problem of insufficient data. Data from utilities with similar climate or gas use may be used to build useful models for other utilities. In other words, available data sets may serve as analogues, or surrogates, for building models for areas with insufficient data. The results show that the use of surrogate data improves forecasting models; notably, forecasts for days with unusual weather patterns are improved. By applying suitable transformation methods and carefully selecting donor areas, the methods discussed in this thesis help GasDay improve forecasts for natural gas demand across the United States.
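
    As a sketch of the surrogate-data idea under simple assumptions, the snippet below rescales a donor utility's long demand history to the target area's mean and variance before pooling it with the target's short history; the thesis's actual transformation and donor-selection methods are more sophisticated than this moment-matching transform.

```python
# Minimal sketch of pooling transformed surrogate data with a short
# target history. The moment-matching transform is an illustrative
# assumption, not the thesis's exact method.
import numpy as np

def transform_donor(donor, target):
    """Match the donor series' mean and variance to the target's."""
    z = (donor - donor.mean()) / donor.std()
    return z * target.std() + target.mean()

rng = np.random.default_rng(2)
target_history = 500 + 80 * rng.standard_normal(365)     # 1 year of demand
donor_history = 900 + 200 * rng.standard_normal(3650)    # 10 years, other area

pooled = np.concatenate([target_history,
                         transform_donor(donor_history, target_history)])
print("training set grew from", target_history.size, "to", pooled.size, "days")
```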

    Kernel Methods in Computer-Aided Constructive Drug Design

    A drug is typically a small molecule that interacts with the binding site of some target protein. Drug design involves the optimization of this interaction so that the drug effectively binds with the target protein while not binding with other proteins (an event that could produce dangerous side effects). Computational drug design involves the geometric modeling of drug molecules, with the goal of generating similar molecules that will be more effective drug candidates. Algorithms must incorporate strategies to measure molecular similarity by comparing molecular descriptors that may involve dozens to hundreds of attributes. We use kernel-based methods to define these measures of similarity; kernels are general functions that can be used to formulate similarity comparisons. The overall goal of this thesis is to develop effective and efficient computational methods, reliant on transparent mathematical descriptors of molecules, with applications to affinity prediction, detection of multiple binding modes, and generation of new drug leads. While this thesis derives computational strategies for the discovery of new drug leads, our approach differs from the traditional ligand-based approach: we have developed novel procedures to calculate inverse mappings and subsequently recover the structure of a potential drug lead. The contributions of this thesis are the following:

    1. We propose a vector space model molecular descriptor (VSMMD), based on a vector space model, that is suitable for kernel studies in QSAR modeling. Our experiments provide convincing comparative empirical evidence that our descriptor formulation, in conjunction with kernel-based regression algorithms, offers sufficient discrimination to predict various biological activities of a molecule with reasonable accuracy.

    2. We present a new component selection algorithm, KACS (Kernel Alignment Component Selection), based on kernel alignment for QSAR study. Kernel alignment has been developed as a measure of similarity between two kernel functions. Our algorithm refines kernel alignment as an evaluation tool, using recursive component elimination to select the components most important for classification. We demonstrate empirically, and prove theoretically, that the algorithm finds the most important components in different QSAR data sets.

    3. We extend the VSMMD, in conjunction with a kernel-based clustering algorithm, to the prediction of multiple binding modes, a challenging area of research previously studied by means of time-consuming docking simulations. The results reported in this study provide strong empirical evidence that our strategy has enough resolving power to distinguish multiple binding modes through the use of a standard k-means algorithm.

    4. We develop a set of reverse engineering strategies for QSAR modeling based on our VSMMD: (a) the use of a kernel feature space algorithm to design or modify descriptor image points in a feature space; (b) the deployment of a pre-image algorithm to map the newly defined descriptor image points in the feature space back to the input space of the descriptors; (c) the design of a probabilistic strategy to convert new descriptors to meaningful chemical graph templates.

    The most important aspect of these contributions is the presentation of strategies that actually generate the structure of a new drug candidate. While the training set is still used to generate a new image point in the feature space, the reverse engineering strategies just described allow us to develop a new drug candidate that is independent of issues related to probability distribution constraints placed on test set molecules.
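
    To ground contribution 2, the sketch below computes empirical kernel alignment and runs one schematic round of recursive component elimination against the ideal target kernel yy^T; the scoring rule and toy data are illustrative assumptions, not the exact KACS algorithm.

```python
# Kernel-target alignment scoring one round of component elimination:
# the component whose removal hurts alignment least is pruned first.
import numpy as np

def alignment(K1, K2):
    """Empirical kernel alignment: <K1,K2>_F / (||K1||_F ||K2||_F)."""
    return np.sum(K1 * K2) / (np.linalg.norm(K1) * np.linalg.norm(K2))

def linear_kernel(X):
    return X @ X.T

rng = np.random.default_rng(3)
y = np.repeat([1.0, -1.0], 20)
X = np.column_stack([y + 0.3 * rng.standard_normal(40),   # informative
                     rng.standard_normal(40),             # noise
                     rng.standard_normal(40)])            # noise
K_ideal = np.outer(y, y)   # "ideal" kernel built from the labels

# Alignment after removing each component in turn; the best-scoring
# removal identifies the least important component.
active = list(range(X.shape[1]))
scores = {j: alignment(linear_kernel(X[:, [i for i in active if i != j]]),
                       K_ideal)
          for j in active}
drop = max(scores, key=scores.get)
print("drop component", drop, "| alignment after each removal:", scores)
```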

    Exploiting primitive grouping constraints for noise robust automatic speech recognition : studies with simultaneous speech.

    Significant strides have been made in the field of automatic speech recognition over the past three decades. However, the systems are not robust: their performance degrades in the presence of even moderate amounts of noise. This thesis presents an approach to developing a speech recognition system that takes inspiration from human speech recognition.

    Kernel eigenvoice speaker adaptation

    Speech recognition is a powerful and widely used technology nowadays. However, its performance is not robust enough, owing to variations in speech introduced by the operating environment, noise (its type and energy), and inter-speaker differences. Speaker adaptation is an important technology for fine-tuning either the features or the speech models to the mismatch caused by inter-speaker variation. In the last decade, eigenvoice (EV) speaker adaptation has been developed; it makes use of prior knowledge of the training speakers to provide a fast adaptation algorithm, meaning that only a small amount of adaptation data is needed. Inspired by the kernel eigenface idea in face recognition, kernel eigenvoice (KEV) adaptation is proposed as a non-linear generalization of EV. It incorporates kernel principal component analysis (KPCA), a non-linear version of principal component analysis (PCA), to capture higher-order correlations, thereby further exploring the speaker space and enhancing recognition performance. The major difficulty is that in KEV adaptation the adapted speaker model is estimated in the kernel feature space, which may not have an exact pre-image in the input speaker-supervector space, yet observation likelihoods are computed in the acoustic observation space for both adaptation and recognition. A composite kernel is proposed to solve this problem. Experimental investigation on the TIDIGITS corpus, an English continuous-digit recognition task, using 4 seconds of adaptation data shows that KEV adaptation gives a 21% relative improvement (RI) over the speaker-independent (SI) model, a 25% RI over MLLR adaptation, a 32% RI over MAP adaptation, and a 32% RI over EV adaptation. When the speaker-adapted models from KEV are interpolated with the SI model, the RIs increase to 32% over the SI model, 35% over MLLR adaptation, 41% over MAP adaptation, and 32% over similarly interpolated EV adaptation.
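
    A minimal sketch of the eigenvoice-in-feature-space idea, assuming toy speaker supervectors: kernel PCA extracts nonlinear eigenvoices, and a new speaker is represented by a few KPCA coordinates. Real KEV adaptation estimates those coordinates by maximizing acoustic likelihood on the adaptation data; the direct projection and inverse map below are for illustration only.

```python
# Kernel PCA over speaker supervectors: leading nonlinear components
# play the role of eigenvoices spanning the speaker space.
import numpy as np
from sklearn.decomposition import KernelPCA

rng = np.random.default_rng(4)
n_speakers, dim = 50, 120               # toy sizes; real supervectors are huge
supervectors = rng.standard_normal((n_speakers, dim))

kpca = KernelPCA(n_components=5, kernel="rbf", gamma=1e-2,
                 fit_inverse_transform=True)
coords = kpca.fit_transform(supervectors)   # training speakers' coordinates

# A new speaker's noisy, short-data supervector is projected onto the
# eigenvoices and mapped back for an adapted model estimate.
new_speaker = supervectors[0] + 0.1 * rng.standard_normal(dim)
w = kpca.transform(new_speaker[None, :])
adapted = kpca.inverse_transform(w)          # approximate pre-image
print("eigenvoice weights:", np.round(w, 3))
print("adapted supervector shape:", adapted.shape)
```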

    Speedup of kernel eigenvoice speaker adaptation by embedded kernel PCA

    Recently, we proposed an improvement to eigenvoice (EV) speaker adaptation called kernel eigenvoice (KEV) speaker adaptation. In KEV adaptation, eigenvoices are computed using kernel PCA, and a new speaker's adapted model is implicitly computed in the kernel-induced feature space. Owing to the many online kernel evaluations, both adaptation and subsequent recognition are slower with KEV adaptation than with EV adaptation. In this paper, we eliminate all online kernel computations by finding an approximate pre-image of the implicit adapted model found by KEV adaptation. Furthermore, the two steps of finding the implicit adapted model and its approximate pre-image are integrated by embedding the kernel PCA procedure in our new embedded kernel eigenvoice (eKEV) speaker adaptation method. When tested on a TIDIGITS task with less than 10 seconds of adaptation speech, eKEV adaptation obtained a speedup of 6–14 times in adaptation and 136 times in recognition over KEV adaptation, with a 12–13% relative improvement in recognition accuracy.
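
    The pre-image step at the heart of eKEV can be sketched as follows for an RBF kernel, assuming the adapted model is expressed in feature space as a weighted expansion over training supervectors: the classic fixed-point iteration of Mika et al. recovers an approximate input-space pre-image, after which no online kernel evaluations are needed at recognition time. The dimensions, weights, and kernel width below are illustrative.

```python
# Fixed-point pre-image approximation for an RBF kernel: recover z such
# that phi(z) is close to Psi = sum_i gamma_i phi(x_i) in feature space.
import numpy as np

def rbf_preimage(X, gamma_w, sigma2, n_iter=100, tol=1e-8):
    """Fixed-point pre-image for k(x, z) = exp(-||x - z||^2 / (2 sigma2)).

    X       : (n, d) training points x_i
    gamma_w : (n,) expansion weights gamma_i of Psi in feature space
    """
    z = X.T @ gamma_w / gamma_w.sum()          # initialize at weighted mean
    for _ in range(n_iter):
        d2 = ((X - z) ** 2).sum(axis=1)
        k = gamma_w * np.exp(-d2 / (2.0 * sigma2))
        z_new = X.T @ k / k.sum()              # fixed-point update
        if np.linalg.norm(z_new - z) < tol:
            break
        z = z_new
    return z

rng = np.random.default_rng(5)
X = rng.standard_normal((30, 8))               # toy speaker supervectors
gamma_w = rng.dirichlet(np.ones(30))           # convex expansion weights
z = rbf_preimage(X, gamma_w, sigma2=4.0)
print("approximate pre-image (first 4 dims):", np.round(z[:4], 3))
```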

    Various Reference Speakers Determination Methods for Embedded Kernel Eigenvoice Speaker Adaptation
