
    An MDL framework for sparse coding and dictionary learning

    The power of sparse signal modeling with learned over-complete dictionaries has been demonstrated in a variety of applications and fields, from signal processing to statistical inference and machine learning. However, the statistical properties of these models, such as under-fitting or over-fitting given sets of data, are still not well characterized in the literature. As a result, the success of sparse modeling depends on hand-tuning critical parameters for each data set and application. This work aims to address this by providing a practical and objective characterization of sparse models by means of the Minimum Description Length (MDL) principle, a well-established information-theoretic approach to model selection in statistical inference. The resulting framework yields a family of efficient sparse coding and dictionary learning algorithms which, by virtue of the MDL principle, are completely parameter free. Furthermore, the framework allows additional prior information, such as Markovian dependencies, to be incorporated into existing models, and allows completely new problem formulations, including in the matrix analysis area, to be defined in a natural way. These virtues are demonstrated with parameter-free algorithms for the classic image denoising and classification problems, and for low-rank matrix recovery in video applications.

    Robust exponential smoothing of multivariate time series

    Multivariate time series may contain outliers of different types. In the presence of such outliers, standard multivariate time series techniques become unreliable. A robust version of multivariate exponential smoothing is proposed. The method is affine equivariant, and involves the selection of a smoothing parameter matrix by minimizing a robust loss function. It is shown that the robust method results in much better forecasts than the classic approach in the presence of outliers, and performs similarly when the data contain no outliers. Moreover, the robust procedure yields an estimator of the smoothing parameter that is less subject to downward bias. As a byproduct, a cleaned version of the time series is obtained, as is illustrated by means of a real data example. Keywords: Data cleaning; Exponential smoothing; Forecasting; Multivariate time series; Robustness.
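The idea of smoothing on a "cleaned" series can be illustrated with a deliberately simplified univariate sketch. This is not the method of the paper above, which is multivariate and selects a smoothing matrix by minimizing a robust loss. Here the smoothing constant `alpha`, the clipping constant `k`, the MAD-based scale, and all function names are assumptions chosen for illustration only.

```python
from statistics import median


def mad_scale(values):
    """Median absolute deviation, rescaled to be consistent for Gaussian data."""
    med = median(values)
    return 1.4826 * median([abs(v - med) for v in values])


def huber_clip(x, k):
    """Clip a standardized residual to the interval [-k, k]."""
    return max(-k, min(k, x))


def robust_exp_smooth(series, alpha=0.3, k=2.0):
    """Exponential smoothing where each one-step forecast error is clipped.

    Assumption: a single fixed scale, estimated from first differences,
    is used for the whole series.
    """
    diffs = [b - a for a, b in zip(series, series[1:])]
    scale = mad_scale(diffs) or 1.0  # fall back to 1.0 if the MAD is zero
    level = series[0]
    smoothed, cleaned = [level], [series[0]]
    for y in series[1:]:
        err = y - level
        # Replace the observation by a "cleaned" value near the forecast.
        y_clean = level + scale * huber_clip(err / scale, k)
        level = alpha * y_clean + (1 - alpha) * level
        cleaned.append(y_clean)
        smoothed.append(level)
    return smoothed, cleaned


series = [10.0, 10.5, 9.8, 10.2, 50.0, 10.1, 9.9, 10.3]  # 50.0 is an outlier
smoothed, cleaned = robust_exp_smooth(series)
# With an infinite clipping constant the recursion reduces to classical smoothing.
classical, _ = robust_exp_smooth(series, k=float("inf"))
```

In this toy series the classical level jumps toward the outlier, while the clipped version stays close to the bulk of the data; the `cleaned` list plays the role of the cleaned series mentioned in the abstract.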

    Improved model identification for nonlinear systems using a random subsampling and multifold modelling (RSMM) approach

    In nonlinear system identification, the available observed data are conventionally partitioned into two parts: the training data that are used for model identification and the test data that are used for model performance testing. This sort of ‘hold-out’ or ‘split-sample’ data partitioning method is convenient and the associated model identification procedure is in general easy to implement. The resultant model obtained from such a once-partitioned single training dataset, however, may occasionally lack robustness and generalisation to represent future unseen data, because the performance of the identified model may be highly dependent on how the data partition is made. To overcome the drawback of the hold-out data partitioning method, this study presents a new random subsampling and multifold modelling (RSMM) approach to produce less biased or preferably unbiased models. The basic idea and the associated procedure are as follows. Firstly, generate K training datasets (and also K validation datasets), using a K-fold random subsampling method. Secondly, detect significant model terms and identify a common model structure that fits all the K datasets using a newly proposed common model selection approach, called the multiple orthogonal search algorithm. Finally, estimate and refine the model parameters for the identified common-structured model using a multifold parameter estimation method. The proposed method can produce robust models with better generalisation performance.
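The first and last steps of the procedure above can be sketched as follows. This is a minimal illustration under stated assumptions, not the RSMM implementation: the multiple orthogonal search step is omitted entirely, the model structure is fixed in advance to a hypothetical one-term model `y = a * x`, and the "multifold" estimate is taken to be a plain average over folds. All function names and the toy data are made up for the example.

```python
import random


def random_subsamples(n, k, train_fraction=0.7, seed=0):
    """Draw k independent random train/validation partitions of range(n)."""
    rng = random.Random(seed)
    cut = round(train_fraction * n)
    splits = []
    for _ in range(k):
        shuffled = list(range(n))
        rng.shuffle(shuffled)
        splits.append((sorted(shuffled[:cut]), sorted(shuffled[cut:])))
    return splits


def fit_slope(xs, ys):
    """Least-squares estimate of a in the one-term model y = a * x."""
    return sum(x * y for x, y in zip(xs, ys)) / sum(x * x for x in xs)


# Toy data (assumption): y = 2x plus small bounded noise.
noise_rng = random.Random(1)
xs = [float(i) for i in range(1, 21)]
ys = [2.0 * x + noise_rng.uniform(-0.5, 0.5) for x in xs]

splits = random_subsamples(len(xs), k=5)
slopes = [
    fit_slope([xs[i] for i in train], [ys[i] for i in train])
    for train, _ in splits
]
# Stand-in for multifold parameter estimation: average the per-fold estimates.
avg_slope = sum(slopes) / len(slopes)
```

Each validation index list in `splits` is left unused here; in a fuller version it would be used to score the candidate model structure on data not seen during fitting.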

    Two Procedures for Robust Monitoring of Probability Distributions of Economic Data Streams induced by Depth Functions

    Data streams (streaming data) consist of transiently observed, time-evolving, multidimensional data sequences that challenge our computational and/or inferential capabilities. In this paper we propose user-friendly approaches for robust monitoring of selected properties of the unconditional and conditional distributions of the stream, based on depth functions. Our proposals are robust to a small fraction of outliers and/or inliers, yet remain sensitive to a regime change of the stream. Their implementations are available in our free R package DepthProc. Comment: Operations Research and Decisions, vol. 25, No. 1, 201

    Emotion Recognition from Acted and Spontaneous Speech

    This doctoral thesis deals with emotion recognition from speech signals. The thesis is divided into two main parts. The first part describes the proposed approaches for emotion recognition using two different multilingual databases of acted emotional speech. The main contributions of this part are a detailed analysis of a large set of acoustic features, new classification schemes for vocal emotion recognition such as “emotion coupling”, and a new method for mapping discrete emotions into a two-dimensional space. The second part of the thesis is devoted to emotion recognition using multilingual databases of spontaneous emotional speech, based on telephone records obtained from real call centers. The knowledge gained from experiments with emotion recognition from acted speech was exploited to design a new approach for classifying seven emotional states. The core of the proposed approach is a complex classification architecture based on the fusion of different systems. The thesis also examines the influence of the speaker’s emotional state on gender recognition performance and proposes a system for automatic identification of successful phone calls in call centers by means of dialogue features.

    Estimation of Single-Index Models Based on Boosting Techniques

    In single-index models the link or response function is not considered as fixed. The data determine the form of the unknown link function. In order to obtain a flexible form of the link function, we specify it as an expansion in basis functions and propose to estimate the parameters as well as the link function by weak learners within a boosting framework. It is shown that the method is a strong competitor to existing methods. The method is investigated in simulation studies and applied to real data.
