171 research outputs found

    Modelling function-valued processes with complex structure

    PhD Thesis. Existing approaches to functional principal component analysis (FPCA) usually rely on nonparametric estimation of the covariance structure. When function-valued processes are observed on a multidimensional domain, the nonparametric estimation suffers from the curse of dimensionality, forcing FPCA methods to make restrictive assumptions such as covariance separability. In this thesis, we discuss a general Bayesian framework for modelling function-valued processes by using a Gaussian process (GP) as a prior, enabling us to handle nonseparable and/or nonstationary covariance structure. The nonstationarity is introduced by a convolution-based approach through a varying kernel, whose parameters vary along the input space and are estimated via a local empirical Bayesian method. For the varying anisotropy matrix, we propose to use a spherical parametrisation, leading to unconstrained and interpretable parameters and allowing for interaction between coordinate directions in the covariance function. The unconstrained nature allows the parameters to be modelled as a nonparametric function of time, spatial location and even additional covariates. In the spirit of FPCA, the Bayesian framework can decompose the function-valued processes using the eigenvalues and eigensurfaces calculated from the estimated covariance structure. A finite number of the eigensurfaces can be used to extract some of the most important information contained in data with complex covariance structure. We also extend the methods to handle multivariate function-valued processes. The estimated covariance structure is shown to be important for analysing joint variation in the data and is further used in our proposed multiple functional partial least squares regression model. We show that the interaction between the scalar response variable and function-valued covariates can be explained by fewer terms than in a regression model which uses multivariate functional principal components.
Simulation studies and applications to real data show that our proposed approaches provide new insights into the data and excellent prediction results.
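The decomposition into eigenvalues and eigenfunctions mentioned in the abstract can be illustrated with a minimal sketch of classical empirical FPCA on a one-dimensional grid. This is not the thesis's Bayesian GP method: the covariance here is the plain sample covariance, and the function name `empirical_fpca`, the simulated curves, and the equally spaced grid are all assumptions made for illustration.

```python
import numpy as np

def empirical_fpca(curves, grid, n_components):
    """Classical FPCA for curves sampled on a common, equally spaced 1-D grid."""
    dt = grid[1] - grid[0]                           # quadrature weight
    mean = curves.mean(axis=0)
    centered = curves - mean
    cov = centered.T @ centered / (len(curves) - 1)  # sample covariance surface
    # Discretised covariance operator: (K phi)(s) ~ sum_t cov[s, t] * phi[t] * dt
    evals, evecs = np.linalg.eigh(cov * dt)
    top = np.argsort(evals)[::-1][:n_components]     # largest eigenvalues first
    eigenvalues = evals[top]
    eigenfunctions = evecs[:, top] / np.sqrt(dt)     # so that sum(phi**2) * dt == 1
    scores = centered @ eigenfunctions * dt          # L2 inner products with each phi
    return mean, eigenvalues, eigenfunctions, scores

# Hypothetical data: random combinations of two orthonormal basis functions.
rng = np.random.default_rng(0)
grid = np.linspace(0.0, 1.0, 100, endpoint=False)
basis = np.sqrt(2.0) * np.vstack([np.sin(2 * np.pi * grid),
                                  np.cos(2 * np.pi * grid)])
coefs = rng.normal(size=(200, 2)) * np.array([2.0, 1.0])
curves = coefs @ basis

mean, eigenvalues, eigenfunctions, scores = empirical_fpca(curves, grid, n_components=2)
reconstruction = mean + scores @ eigenfunctions.T    # truncated expansion
```

Because the simulated curves lie in a two-dimensional space, two components reconstruct them up to rounding error; with real data the truncation keeps only the leading modes of variation.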

    Theoretical Analysis of Nonparametric Filament Estimation

    This paper provides a rigorous study of the nonparametric estimation of filaments or ridge lines of a probability density $f$. Points on the filament are considered as local extrema of the density when traversing the support of $f$ along the integral curve driven by the vector field of second eigenvectors of the Hessian of $f$. We "parametrize" points on the filaments by such integral curves, and thus both the estimation of integral curves and of filaments will be considered via a plug-in method using kernel density estimation. We establish rates of convergence and asymptotic distribution results for the estimation of both the integral curves and the filaments. The main theoretical result establishes the asymptotic distribution of the uniform deviation of the estimated filament from its theoretical counterpart. This result utilizes the extreme value behavior of non-stationary Gaussian processes indexed by manifolds $M_h$, $h \in (0,1]$, as $h \to 0$. Comment: 55 pages, 1 figure

    Mini-Workshop: Semiparametric Modelling of Multivariate Economic Time Series With Changing Dynamics

    Modelling multivariate time series of possibly high dimension calls for appropriate dimension reduction, e.g. by some factor modelling, additive modelling, or some simplified parametric structure for the dynamics (i.e. the serial dependence) of the time series. This workshop aimed to bring together experts in this field in order to discuss recent methodology for multivariate time series whose dynamics change over time, either through an abrupt switch between two (or more) different regimes or by evolving smoothly. The emphasis has been on mathematical methods for semiparametric modelling and estimation, where "semiparametric" is to be understood in a rather broad sense: parametric models where the parameters are themselves nonparametric functions (of time), regime-switching nonparametric models with a parametric specification of the transition mechanism, and the like. An ultimate goal of applying these models to economic and financial time series is prediction. Another emphasis has been on comparing Bayesian with frequentist approaches, and on covering both theoretical aspects of estimation, such as consistency and efficiency, and computational aspects.

    Challenges in Statistical Theory: Complex Data Structures and Algorithmic Optimization

    Technological developments have created a constant incoming stream of complex new data structures that need analysis. Modern statistics therefore means mathematically sophisticated new statistical theory that generates or supports innovative data-analytic methodologies for complex data structures. Inherent in many of these methodologies are challenging numerical optimization methods. The workshop aimed to bring together experts from mathematical statistics as well as statisticians involved in serious modern applications and computing. The primary goal of this meeting was to advance the mathematical and methodological underpinnings of modern statistics for complex data. Particular focus was given to the advancement of theory and methods under non-stationarity and complex dependence structures including (multivariate) financial time series, scientific data analysis in neurosciences and bio-physics, estimation under shape constraints, and high-dimensional discrimination/classification.

    A computational framework for infinite-dimensional Bayesian inverse problems: Part II. Stochastic Newton MCMC with application to ice sheet flow inverse problems

    We address the numerical solution of infinite-dimensional inverse problems in the framework of Bayesian inference. In the Part I companion to this paper (arXiv.org:1308.1313), we considered the linearized infinite-dimensional inverse problem. Here in Part II, we relax the linearization assumption and consider the fully nonlinear infinite-dimensional inverse problem using a Markov chain Monte Carlo (MCMC) sampling method. To address the challenges of sampling high-dimensional pdfs arising from Bayesian inverse problems governed by PDEs, we build on the stochastic Newton MCMC method. This method exploits problem structure by taking as a proposal density a local Gaussian approximation of the posterior pdf, whose construction is made tractable by invoking a low-rank approximation of the data misfit component of its Hessian. Here we introduce an approximation of the stochastic Newton proposal in which we compute the low-rank-based Hessian at just the MAP point, and then reuse this Hessian at each MCMC step. We compare the performance of the proposed method to the original stochastic Newton MCMC method and to an independence sampler. The comparison of the three methods is conducted on a synthetic ice sheet inverse problem. For this problem, the stochastic Newton MCMC method with a MAP-based Hessian converges at least as rapidly as the original stochastic Newton MCMC method, but is far cheaper since it avoids recomputing the Hessian at each step. On the other hand, it is more expensive per sample than the independence sampler; however, its convergence is significantly more rapid, and thus overall it is much cheaper. Finally, we present extensive analysis and interpretation of the posterior distribution, and classify directions in parameter space based on the extent to which they are informed by the prior or the observations. Comment: 31 pages
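The idea of reusing a single MAP-point Hessian inside a Newton-type proposal can be sketched on a toy problem. The sketch below assumes the proposal is Gaussian with mean given by a Newton step (local gradient, fixed MAP Hessian) and covariance equal to the inverse of that Hessian, followed by a standard Metropolis-Hastings correction. The two-dimensional target, the matrix `A`, and all function names are hypothetical. No PDE, low-rank approximation, or ice sheet model is involved.

```python
import numpy as np

# Toy target (an assumption, not the ice sheet problem):
# -log posterior(x) = 0.5 * x^T A x + 0.25 * sum(x^4), whose MAP point is x = 0.
A = np.array([[2.0, 0.5],
              [0.5, 1.0]])

def neg_log_post(x):
    return 0.5 * x @ A @ x + 0.25 * np.sum(x**4)

def grad_neg_log_post(x):
    return A @ x + x**3

H_map = A                         # Hessian of neg_log_post at the MAP point x = 0
C_map = np.linalg.inv(H_map)      # proposal covariance, computed once and reused
L_map = np.linalg.cholesky(C_map)

def proposal_mean(x):
    # Newton step using the local gradient but the fixed MAP Hessian
    return x - C_map @ grad_neg_log_post(x)

def log_q(y, x):
    # Log density of N(y; proposal_mean(x), C_map), up to a constant that cancels
    d = y - proposal_mean(x)
    return -0.5 * d @ H_map @ d

def sample(n_steps, x0, rng):
    x = np.array(x0, dtype=float)
    samples = np.empty((n_steps, x.size))
    accepted = 0
    for i in range(n_steps):
        y = proposal_mean(x) + L_map @ rng.standard_normal(x.size)
        # Metropolis-Hastings log acceptance ratio for an asymmetric proposal
        log_alpha = (neg_log_post(x) - neg_log_post(y)
                     + log_q(x, y) - log_q(y, x))
        if np.log(rng.uniform()) < log_alpha:
            x = y
            accepted += 1
        samples[i] = x
    return samples, accepted / n_steps

samples, accept_rate = sample(5000, x0=[0.0, 0.0], rng=np.random.default_rng(1))
```

The Hessian is inverted and factorised once, outside the loop, which is where the cost saving described in the abstract would come from in a large-scale setting; each step needs only a gradient evaluation.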

    Non-Negative Matrix Factorization Based Algorithms to Cluster Frequency Basis Functions for Monaural Sound Source Separation.

    Monophonic sound source separation (SSS) refers to a process that separates out audio signals produced from the individual sound sources in a given acoustic mixture, when the mixture signal is recorded using one microphone or is directly recorded onto one reproduction channel. Many audio applications such as pitch modification and automatic music transcription would benefit from the availability of segregated sound sources from the mixture of audio signals for further processing. Recently, non-negative matrix factorization (NMF) has found application in monaural audio source separation due to its ability to factorize audio spectrograms into additive part-based basis functions, where the parts typically correspond to individual notes or chords in music. An advantage of NMF is that there can be a single basis function for each note played by a given instrument, thereby capturing changes in timbre with pitch for each instrument or source. However, these basis functions need to be clustered to their respective sources for the reconstruction of the individual source signals. Many clustering methods have been proposed to map the separated signals into sources with considerable success. Recently, to avoid the need for clustering, Shifted NMF (SNMF) was proposed, which assumes that the timbre of a note is constant for all the pitches produced by an instrument. SNMF has two drawbacks. Firstly, the assumption that the timbre of the notes played by an instrument remains constant is not true in general. Secondly, the SNMF method uses the Constant Q transform (CQT), and the lack of a true inverse of the CQT compromises the separation quality of the reconstructed signal. The principal aim of this thesis is to attempt to solve the problem of clustering NMF basis functions. Our first major contribution is the use of SNMF as a method of clustering the basis functions obtained via standard NMF.
The proposed SNMF clustering method aims to cluster the frequency basis functions obtained via standard NMF to their respective sources by making use of shift invariance in a log-frequency domain. Further, a minor contribution is made by improving the separation performance of the standard SNMF algorithm (here used directly to separate sources) through the use of an improved inverse CQT. Here, the standard SNMF algorithm finds shift-invariance in a CQ spectrogram, which contains the frequency basis functions, obtained directly from the spectrogram of the audio mixture. Our next contribution is an improvement in the SNMF clustering algorithm through the incorporation of the CQT matrix inside the SNMF model in order to avoid the need for an inverse CQT to reconstruct the clustered NMF basis functions. Another major contribution deals with the incorporation of a constraint called group sparsity (GS) into the SNMF clustering algorithm at two stages to improve clustering. The effect of the GS is evaluated on various SNMF clustering algorithms proposed in this thesis. Finally, we have introduced a new family of masks to reconstruct the original signal from the clustered basis functions and compared their performance to the generalized Wiener filter masks using three different factorisation-based separation algorithms. We show that better separation performance can be achieved by using the proposed family of masks.
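The factorization of a magnitude spectrogram into additive basis functions, which the abstract takes as its starting point, can be sketched with standard NMF using multiplicative updates for the Euclidean cost. This is plain NMF only: it does not implement Shifted NMF, the clustering step, group sparsity, or the proposed masks. The function name `nmf` and the synthetic "spectrogram" built from two random templates are assumptions for illustration.

```python
import numpy as np

def nmf(V, rank, n_iter=200, eps=1e-12, seed=0):
    """Factorise a non-negative matrix V (freq x frames) as W @ H.

    Columns of W play the role of frequency basis functions and
    rows of H their time-varying gains.
    """
    rng = np.random.default_rng(seed)
    n_freq, n_frames = V.shape
    W = rng.uniform(0.1, 1.0, size=(n_freq, rank))
    H = rng.uniform(0.1, 1.0, size=(rank, n_frames))
    errors = []
    for _ in range(n_iter):
        # Multiplicative updates keep W and H non-negative by construction
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
        errors.append(np.linalg.norm(V - W @ H))
    return W, H, errors

# Hypothetical spectrogram: two spectral templates with time-varying gains.
rng = np.random.default_rng(42)
templates = rng.uniform(0.0, 1.0, size=(64, 2))
gains = rng.uniform(0.0, 1.0, size=(2, 40))
V = templates @ gains

W, H, errors = nmf(V, rank=2)
```

In a separation pipeline of the kind described, each column of `W` would then have to be assigned to a source before the source signals are reconstructed, which is the clustering problem the thesis addresses.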