393 research outputs found

    Ranking Median Regression: Learning to Order through Local Consensus

    This article is devoted to the problem of predicting the value taken by a random permutation $\Sigma$, describing the preferences of an individual over a set of numbered items $\{1, \ldots, n\}$ say, based on the observation of an input/explanatory r.v. $X$ (e.g. characteristics of the individual), when error is measured by the Kendall $\tau$ distance. In the probabilistic formulation of the 'Learning to Order' problem we propose, which extends the framework for statistical Kemeny ranking aggregation developed in \citet{CKS17}, this boils down to recovering conditional Kemeny medians of $\Sigma$ given $X$ from i.i.d. training examples $(X_1, \Sigma_1), \ldots, (X_N, \Sigma_N)$. For this reason, this statistical learning problem is referred to as ranking median regression here. Our contribution is twofold. We first propose a probabilistic theory of ranking median regression: the set of optimal elements is characterized, the performance of empirical risk minimizers is investigated in this context, and situations where fast learning rates can be achieved are also exhibited. Next we introduce the concept of local consensus/median, in order to derive efficient methods for ranking median regression. The major advantage of this local learning approach lies in its close connection with the widely studied Kemeny aggregation problem. From an algorithmic perspective, this makes it possible to build predictive rules for ranking median regression by implementing efficient techniques for (approximate) Kemeny median computations at a local level in a tractable manner. In particular, versions of $k$-nearest neighbor and tree-based methods, tailored to ranking median regression, are investigated. Accuracy of piecewise constant ranking median regression rules is studied under a specific smoothness assumption for $\Sigma$'s conditional distribution given $X$.
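    The local-consensus idea in the abstract above can be illustrated with a minimal sketch. It is not the authors' implementation: the function names are hypothetical, rankings are assumed to be tuples where position i holds the rank of item i, and the Kemeny median is found by brute force over all n! permutations, which is only feasible for very small n (the abstract refers to efficient or approximate techniques instead).

```python
import math
from itertools import combinations, permutations


def kendall_tau_distance(sigma, pi):
    """Count the item pairs that the two rankings order differently.

    sigma[i] and pi[i] are the ranks given to item i (assumed encoding).
    """
    return sum(
        1
        for i, j in combinations(range(len(sigma)), 2)
        if (sigma[i] - sigma[j]) * (pi[i] - pi[j]) < 0
    )


def kemeny_median(rankings):
    """Brute-force Kemeny median: the permutation minimising the summed
    Kendall tau distance to the observed rankings (exponential in n)."""
    n = len(rankings[0])
    return min(
        permutations(range(n)),
        key=lambda cand: sum(kendall_tau_distance(cand, r) for r in rankings),
    )


def knn_ranking_median(x, train_x, train_rankings, k):
    """Predict a ranking at x as the Kemeny median of the rankings attached
    to the k nearest training inputs (Euclidean distance is an assumption)."""
    nearest = sorted(
        range(len(train_x)), key=lambda i: math.dist(x, train_x[i])
    )[:k]
    return kemeny_median([train_rankings[i] for i in nearest])


# Toy data: two clusters of inputs with opposite preferences.
train_x = [(0.0,), (0.1,), (0.2,), (5.0,), (5.1,)]
train_rankings = [(0, 1, 2), (0, 1, 2), (0, 2, 1), (2, 1, 0), (2, 1, 0)]
prediction = knn_ranking_median((0.05,), train_x, train_rankings, k=3)
```

    The prediction at a new input depends only on the rankings observed near it, which is what makes the rule piecewise constant in the sense used by the abstract.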

    Sparse Linear Prediction and Its Applications to Speech Processing

    Psychophysical and signal-processing aspects of speech representation

    Curve Estimation Based on Localised Principal Components - Theory and Applications

    In this work, basic theory and some proposed developments to localised principal components and curves are introduced. In addition, some areas of application for local principal curves are explored. Only relatively recently, localised principal components utilising kernel-type weights have found their way into the statistical literature. In this study, the asymptotic behaviour of the method is investigated and extended to the context of local principal curves, where the characteristics of the points at which the curve stops at the edges are identified. This is used to develop a method that lets the curve 'delay' convergence if desired, gaining more access to boundary regions of the data. Also, a method is proposed for automatically choosing the starting point to be one of the local modes within the data cloud. The modified local principal curves algorithm is then used for fitting multi-dimensional econometric data. Special attention is given to the role of the curve parametrisation, which serves as a feature extractor and also as a prediction tool when properly linked to time as a probable underlying latent variable. Local principal curves provide a good dimensionality reduction and feature extraction tool for insurance industry key indicators and consumer price indices. Also, by 'calibrating' it with time, the curve parametrisation is used to predict unemployment and inflation rates.

    Sparsity in Linear Predictive Coding of Speech

    nrpages: 197; status: published

    Methods for Estimation of Intrinsic Dimensionality

    Dimension reduction is an important tool used to describe the structure of complex data (explicitly or implicitly) through a small but sufficient number of variables, and thereby make data analysis more efficient. It is also useful for visualization purposes. Dimension reduction helps statisticians to overcome the ‘curse of dimensionality’. However, most dimension reduction techniques require the intrinsic dimension of the low-dimensional subspace to be fixed in advance. The availability of reliable intrinsic dimension (ID) estimation techniques is of major importance. The main goal of this thesis is to develop algorithms for determining the intrinsic dimensions of recorded data sets in a nonlinear context. Whilst this is a well-researched topic for linear planes, based mainly on principal components analysis, relatively little attention has been paid to ways of estimating this number for non–linear variable interrelationships. The proposed algorithms here are based on existing concepts that can be categorized into local methods, relying on randomly selected subsets of a recorded variable set, and global methods, utilizing the entire data set. This thesis provides an overview of ID estimation techniques, with special consideration given to recent developments in non–linear techniques, such as charting manifold and fractal–based methods. Despite their nominal existence, the practical implementation of these techniques is far from straightforward. The intrinsic dimension is estimated via Brand’s algorithm by examining the growth point process, which counts the number of points in hyper-spheres. The estimation needs to determine the starting point for each hyper-sphere. In this thesis we provide settings for selecting starting points which work well for most data sets. Additionally we propose approaches for estimating dimensionality via Brand’s algorithm, the Dip method and the Regression method. 
    Other approaches are proposed for estimating the intrinsic dimension by fractal dimension estimation methods, which exploit the intrinsic geometry of a data set. The most popular concept from this family of methods is the correlation dimension, which requires the estimation of the correlation integral for a ball of radius tending to 0. In this thesis we propose new approaches to approximate the correlation integral in this limit. The new approaches are the Intercept method, the Slope method and the Polynomial method. In addition we propose a new approach, a localized global method, which could be defined as a local version of global ID methods. The objective of the localized global approach is to improve the algorithm based on a local ID method, which could significantly reduce the negative bias. Experimental results on real world and simulated data are used to demonstrate the algorithms and compare them to other methodology. A simulation study which verifies the effectiveness of the proposed methods is also provided. Finally, these algorithms are contrasted using a recorded data set from an industrial melter process.
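    As a rough illustration of the correlation dimension mentioned above, the sketch below estimates the correlation integral C(r) as the fraction of point pairs closer than r, then takes the least-squares slope of log C(r) against log r. This is the plain textbook log-log fit, not the thesis's Intercept, Slope or Polynomial methods; the function names and the choice of radii are assumptions made for the example.

```python
import math
from itertools import combinations


def correlation_integral(points, r):
    """Fraction of distinct point pairs whose Euclidean distance is below r."""
    pairs = list(combinations(points, 2))
    close = sum(1 for p, q in pairs if math.dist(p, q) < r)
    return close / len(pairs)


def correlation_dimension(points, radii):
    """Least-squares slope of log C(r) versus log r over the given radii.

    Every radius must be large enough that C(r) > 0, otherwise the
    logarithm is undefined.
    """
    xs = [math.log(r) for r in radii]
    ys = [math.log(correlation_integral(points, r)) for r in radii]
    x_mean = sum(xs) / len(xs)
    y_mean = sum(ys) / len(ys)
    num = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    den = sum((x - x_mean) ** 2 for x in xs)
    return num / den


# 200 points on a straight line embedded in 3-D: intrinsic dimension 1.
line = [(i / 200, 2 * i / 200, 3 * i / 200) for i in range(200)]
estimate = correlation_dimension(line, radii=[0.05, 0.1, 0.2, 0.4])
```

    For the embedded line the estimate comes out close to 1 even though the ambient space has three coordinates. The result is sensitive to the range of radii, which is the practical difficulty the abstract alludes to when it discusses the limit of radius tending to 0.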

    The impact of inflation on financial development in South Africa

    A growing body of theoretical and empirical studies has predicted different influences of inflation on financial development in different economies. This dissertation examines the impact of South Africa's inflation on financial development over the period between 1990 and 2012. The monetary policy framework in South Africa has, to a great extent, assisted in monitoring the movement of the consumer price index. Although inflation does affect financial sector performance, the study also looked into other variables that have an effect, such as private credit, money supply and gross domestic product. To test for stationarity and so avoid spurious regression, the ADF test and the PP test were used. To determine the long- and short-run relationship, the Johansen Maximum Likelihood test and VECM models were used. Results of the study indicated that money supply and inflation have a negative effect on financial development. In addition, apart from money supply and inflation, the findings revealed that private credit and gross domestic product play a significant part in financial sector performance. The study recommends that the South African Reserve Bank should keep the inflation rate within its target range (3-6 percent). This would ensure price stability and restore investor confidence in the financial sector, which in turn improves financial sector development.

    Recent Advances in Signal Processing

    The signal processing task is a critical issue in the majority of new technological inventions and challenges in a variety of applications in both science and engineering fields. Classical signal processing techniques have largely worked with mathematical models that are linear, local, stationary, and Gaussian. They have always favored closed-form tractability over real-world accuracy. These constraints were imposed by the lack of powerful computing tools. During the last few decades, signal processing theories, developments, and applications have matured rapidly and now include tools from many areas of mathematics, computer science, physics, and engineering. This book is targeted primarily toward both students and researchers who want to be exposed to a wide variety of signal processing techniques and algorithms. It includes 27 chapters that can be categorized into five different areas depending on the application at hand. These five categories are ordered to address image processing, speech processing, communication systems, time-series analysis, and educational packages respectively. The book has the advantage of providing a collection of applications that are completely independent and self-contained; thus, the interested reader can choose any chapter and skip to another without losing continuity.

    Maximum likelihood estimation for discrete exponential families and random graphs

    We characterize the existence of the maximum likelihood estimator for discrete exponential families. Our criterion is simple to apply as we show in various settings, most notably for exponential models of random graphs. As an application, we point out the size of independent identically distributed samples for which the maximum likelihood estimator exists with high probability. Comment: 21 pages, minor editorial changes, added connections to the criterion of Barndorff-Nielsen and linear programming

    Improved compactly computable objective measures for predicting the acceptability of speech communications systems

    Issued as Monthly status reports [1-7], and Final report, Project no. E-21-61