9 research outputs found

    A comparative evaluation of nonlinear dynamics methods for time series prediction

    A key problem in time series prediction using autoregressive models is fixing the model order, namely the number of past samples required to model the time series adequately. Estimating the model order by cross-validation can be a lengthy process. In this paper, we investigate alternatives to cross-validation based on nonlinear dynamics methods, namely the Grassberger-Procaccia, Kégl, Levina-Bickel and False Nearest Neighbors algorithms. The experiments have been performed in two ways. In the first, the estimated model order is used to carry out the prediction, performed by an SVM for regression on three real-data time series; the nonlinear dynamics methods achieve performance very close to that of cross-validation. In the second, we test the accuracy of the nonlinear dynamics methods in recovering the known model order of synthetic time series. Most of the methods yield a correct estimate, and when the estimate is not correct, the value is very close to the true one.
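
    The abstract names the False Nearest Neighbors algorithm among the model order estimators; the sketch below is an illustrative reconstruction of the basic FNN idea (not the authors' code), applied to a synthetic AR(2) series. The AR coefficients, the distance-ratio threshold, and the "smallest dimension with near-zero FNN fraction" heuristic are assumptions made for the example.

    import numpy as np

    def false_nearest_neighbors(x, max_dim=6, threshold=10.0):
        """Fraction of false nearest neighbors for each embedding dimension 1..max_dim."""
        fractions = []
        for d in range(1, max_dim + 1):
            n = len(x) - d  # number of d-dimensional delay vectors with a known next sample
            emb = np.column_stack([x[i:i + n] for i in range(d)])
            false_count = 0
            for i in range(n):
                dists = np.linalg.norm(emb - emb[i], axis=1)
                dists[i] = np.inf
                j = int(np.argmin(dists))            # nearest neighbor in dimension d
                extra = abs(x[i + d] - x[j + d])     # separation along the added coordinate
                if extra / max(dists[j], 1e-12) > threshold:
                    false_count += 1
            fractions.append(false_count / n)
        return fractions

    # Synthetic AR(2) series: the true model order is 2.
    rng = np.random.default_rng(0)
    e = rng.normal(size=1000)
    x = np.zeros(1000)
    for t in range(2, 1000):
        x[t] = 0.6 * x[t - 1] - 0.3 * x[t - 2] + e[t]
    fnn = false_nearest_neighbors(x)
    print(fnn)  # the order estimate is the smallest dimension whose FNN fraction is near zero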

    Optimal Watermark Embedding and Detection Strategies Under Limited Detection Resources

    An information-theoretic approach is proposed to watermark embedding and detection under limited detector resources. First, we consider the attack-free scenario, under which asymptotically optimal decision regions in the Neyman-Pearson sense are proposed, along with the optimal embedding rule. We then explore the case of a zero-mean i.i.d. Gaussian covertext distribution with unknown variance under the attack-free scenario. For this case, we propose a lower bound on the exponential decay rate of the false-negative probability and prove that the optimal embedding and detection strategy is superior to the customary linear, additive embedding strategy in the exponential sense. Finally, these results are extended to the case of memoryless attacks and general worst-case attacks. Optimal decision regions and embedding rules are offered, and the worst attack channel is identified. Comment: 36 pages, 5 figures. Revised version. Submitted to IEEE Transactions on Information Theory.
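
    The paper's asymptotically optimal embedding rule and decision regions are not reproduced here; as a point of reference only, the sketch below implements the customary linear, additive embedding baseline with a correlation detector whose threshold is set Neyman-Pearson style for a target false-positive rate. The covertext variance is assumed known here (unlike in the paper), and all numeric values are illustrative.

    import numpy as np
    from scipy.stats import norm

    rng = np.random.default_rng(1)
    n = 1024                                 # covertext length
    sigma = 1.0                              # covertext standard deviation (assumed known here)
    gamma = 0.25                             # embedding strength (distortion budget)
    w = rng.choice([-1.0, 1.0], size=n)      # watermark pattern with unit power

    covertext = rng.normal(0.0, sigma, size=n)
    stego = covertext + gamma * w            # customary linear, additive embedding

    # Under H0 (no watermark) the correlation statistic is approximately
    # N(0, sigma^2 / n), so a Neyman-Pearson-style threshold for a target
    # false-positive rate alpha comes from the Gaussian tail.
    alpha = 1e-3
    tau = norm.isf(alpha) * sigma / np.sqrt(n)

    def detect(y):
        return float(np.dot(y, w)) / n > tau

    print(detect(stego), detect(covertext))  # typically True / False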

    Learning High-Dimensional Markov Forest Distributions: Analysis of Error Rates

    The problem of learning forest-structured discrete graphical models from i.i.d. samples is considered. An algorithm based on pruning of the Chow-Liu tree through adaptive thresholding is proposed. This algorithm is shown to be both structurally consistent and risk consistent, and the error probability of structure learning decays faster than any polynomial in the number of samples under a fixed model size. For the high-dimensional scenario, where the size of the model d and the number of edges k scale with the number of samples n, sufficient conditions on (n, d, k) are given for the algorithm to satisfy structural and risk consistency. In addition, the extremal structures for learning are identified: we prove that the independent (resp. tree) model is the hardest (resp. easiest) to learn with the proposed algorithm in terms of error rates for structure learning. Comment: Accepted to the Journal of Machine Learning Research (Feb 2011).
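
    A minimal sketch of the pruned Chow-Liu idea the abstract describes, assuming discrete data in an (n samples by d variables) array: estimate pairwise empirical mutual information, build the maximum-weight spanning tree, then discard edges whose weight falls below a threshold. The fixed threshold used here stands in for the paper's adaptive, sample-size-dependent choice.

    import numpy as np
    from scipy.sparse.csgraph import minimum_spanning_tree

    def empirical_mi(a, b):
        """Plug-in mutual information estimate for two discrete sample vectors."""
        joint = np.zeros((a.max() + 1, b.max() + 1))
        np.add.at(joint, (a, b), 1.0)
        joint /= joint.sum()
        px, py = joint.sum(axis=1), joint.sum(axis=0)
        nz = joint > 0
        return float(np.sum(joint[nz] * np.log(joint[nz] / np.outer(px, py)[nz])))

    def learn_forest(samples, threshold=0.05):
        """samples: (n, d) array of non-negative integer labels; returns the kept edges."""
        d = samples.shape[1]
        mi = np.zeros((d, d))
        for i in range(d):
            for j in range(i + 1, d):
                mi[i, j] = empirical_mi(samples[:, i], samples[:, j])
        # Maximum-weight spanning tree via a minimum spanning tree on negated weights.
        tree = minimum_spanning_tree(-mi).toarray()
        edges = list(zip(*np.nonzero(tree)))
        # Pruning step: this is where the adaptive thresholding acts in the paper.
        return [(i, j) for i, j in edges if mi[min(i, j), max(i, j)] > threshold]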

    Generalized Phase Space Projection for Nonlinear Noise Reduction

    Improved phase space projection methods, adapted from related work in the linear signal processing field based on subspace decomposition, are presented for application to the problem of additive noise reduction in the context of phase space analysis. These methods improve upon existing methods such as Broomhead–King singular spectrum analysis projection by minimizing overall signal distortion subject to constraints on the residual error, rather than using a direct least-squares fit. This results in a range of weighted projections which estimate and compensate for the portion of the principal components' singular values corresponding to noise rather than signal energy, and which include least-squares (LS) and linear minimum mean-square error (LMMSE) projections as subcases. The nature of phase space covariance, the key element in the construction of the projection matrix, is examined across global phase spaces as well as within local neighborhood regions. The resulting algorithm, illustrated on a noisy Henon map as well as on the task of speech enhancement, is applicable to a wide variety of nonlinear noise reduction tasks.
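
    As a rough illustration of the phase space projection framework (a plain truncated-SVD, least-squares projection rather than the weighted LMMSE variants developed in this work), the sketch below delay-embeds a noisy Henon series, projects the trajectory matrix onto its leading singular subspace, and averages the overlapping entries back into a time series. The embedding dimension, rank, and noise level are illustrative assumptions.

    import numpy as np

    def henon(n, a=1.4, b=0.3):
        x, y = np.zeros(n), np.zeros(n)
        for t in range(1, n):
            x[t] = 1.0 - a * x[t - 1] ** 2 + y[t - 1]
            y[t] = b * x[t - 1]
        return x

    def subspace_denoise(s, dim=10, rank=3):
        n = len(s) - dim + 1
        traj = np.column_stack([s[i:i + n] for i in range(dim)])  # delay-embedded trajectory matrix
        u, sv, vt = np.linalg.svd(traj, full_matrices=False)
        traj_hat = (u[:, :rank] * sv[:rank]) @ vt[:rank]          # least-squares rank-r projection
        out, counts = np.zeros(len(s)), np.zeros(len(s))
        for j in range(dim):                                      # average entries mapped to the same time index
            out[j:j + n] += traj_hat[:, j]
            counts[j:j + n] += 1
        return out / counts

    clean = henon(2000)
    noisy = clean + 0.05 * np.random.default_rng(2).normal(size=2000)
    denoised = subspace_denoise(noisy)
    print(np.std(noisy - clean), np.std(denoised - clean))  # residual error before and after projection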

    Large-deviation analysis and applications of learning tree-structured graphical models

    Thesis (Ph. D.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2011. By Vincent Yan Fu Tan. This electronic version was submitted by the student author; the certified thesis is available in the Institute Archives and Special Collections. Cataloged from the student-submitted PDF version of the thesis. Includes bibliographical references (p. 213-228).
    The design and analysis of complexity-reduced representations for multivariate data is important in many scientific and engineering domains. This thesis explores such representations from two perspectives: deriving and analyzing performance measures for learning tree-structured graphical models, and salient feature subset selection for discrimination. Graphical models have proven to be a flexible class of probabilistic models for approximating high-dimensional data, and learning their structure from data is an important generic task. It is known that if the data are drawn from a tree-structured distribution, the algorithm of Chow and Liu (1968) efficiently finds the tree that maximizes the likelihood of the data. We leverage this algorithm and the theory of large deviations to derive the error exponent of structure learning for discrete and Gaussian graphical models. We determine the extremal tree structures for learning, that is, the structures that lead to the highest and lowest exponents, and prove that the star minimizes the exponent while the chain maximizes it: among all unlabeled trees, the star and the chain are respectively the worst and the best for learning. The analysis is also extended to learning forest-structured graphical models by augmenting the Chow-Liu algorithm with a thresholding procedure, and we prove scaling laws on the number of samples and the number of variables under which structure learning remains consistent in high dimensions. The next part of the thesis is concerned with discrimination. We design computationally efficient tree-based algorithms to learn pairs of distributions that are specifically adapted to the task of discrimination and show that they perform well on various datasets vis-à-vis existing tree-based algorithms. We define the notion of a salient set for discrimination using information-theoretic quantities and derive scaling laws on the number of samples so that the salient set can be recovered asymptotically.
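
    For the Gaussian case analyzed in the thesis, the Chow-Liu step itself reduces to a maximum-weight spanning tree over pairwise mutual informations, which for jointly Gaussian variables are I(X_i; X_j) = -(1/2) log(1 - rho_ij^2). The sketch below shows only this learning step (not the large-deviation analysis), and the use of plug-in sample correlations is an assumption made for the example.

    import numpy as np
    from scipy.sparse.csgraph import minimum_spanning_tree

    def chow_liu_gaussian(samples):
        """samples: (n, d) array; returns the edge list of the estimated tree."""
        rho = np.corrcoef(samples, rowvar=False)
        mi = -0.5 * np.log(1.0 - np.clip(rho, -0.999999, 0.999999) ** 2)
        np.fill_diagonal(mi, 0.0)
        mst = minimum_spanning_tree(-np.triu(mi))  # max-weight tree via negated weights
        return list(zip(*mst.nonzero()))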

    The estimation of the model order in exponential families
