296 research outputs found

    Exact Bayesian curve fitting and signal segmentation.

    We consider regression models where the underlying functional relationship between the response and the explanatory variable is modeled as independent linear regressions on disjoint segments. We present an algorithm for perfect simulation from the posterior distribution of such a model, even allowing for an unknown number of segments and an unknown model order for the linear regressions within each segment. The algorithm is simple, can scale well to large data sets, and avoids the problem of diagnosing convergence that is present with Markov chain Monte Carlo (MCMC) approaches to this problem. We demonstrate our algorithm on standard denoising problems, on a piecewise constant AR model, and on a speech segmentation problem.

    Simulator adaptation at runtime for component-based simulation software

    Component-based simulation software can provide many opportunities to compose and configure simulators, resulting in an algorithm selection problem for the user of this software. This thesis aims to automate the selection and adaptation of simulators at runtime in an application-independent manner. Further, it explores the potential of tailored and approximate simulators, developed in this thesis for the modeling language ML-Rules, to increase the effectiveness of the adaptation scheme.

    Sequential Adaptive Detection for In-Situ Transmission Electron Microscopy (TEM)

    We develop new efficient online algorithms for detecting transient sparse signals in TEM video sequences, by adopting the recently developed framework for sequential detection jointly with online convex optimization [1]. We cast the problem as detecting an unknown sparse mean shift of Gaussian observations, and develop adaptive CUSUM and adaptive SSRS procedures, which are based on likelihood ratio statistics with post-change mean vector being online maximum likelihood estimators with $\ell_1$. We demonstrate the meritorious performance of our algorithms for TEM imaging using real data.
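For orientation, the classical CUSUM recursion that adaptive procedures of this kind build on can be sketched as follows. This is a minimal scalar sketch, not the adaptive method of the abstract: it assumes unit-variance Gaussian observations, a pre-change mean of 0, and a known post-change mean `mu`, whereas the abstract estimates the post-change mean online. The function name and parameters are hypothetical.

```python
def cusum_alarm_time(xs, mu, threshold):
    """Return the 1-based index of the first CUSUM alarm, or None if none occurs.

    Assumptions (illustrative only): observations are N(0, 1) before the
    change and N(mu, 1) after it, with mu known in advance.
    """
    w = 0.0
    for t, x in enumerate(xs, start=1):
        # Log-likelihood ratio of N(mu, 1) against N(0, 1) for one observation.
        llr = mu * x - 0.5 * mu * mu
        # Accumulate evidence for a change, resetting at zero.
        w = max(0.0, w + llr)
        if w > threshold:
            return t
    return None
```

The statistic stays at zero while the data look like the pre-change model and grows roughly linearly after the change, so the threshold trades detection delay against the false alarm rate.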

    Contributions on evolutionary computation for statistical inference

    Evolutionary Computation (EC) techniques were introduced in the 1960s for dealing with complex situations. One example is an optimization problem that has no analytical solution or is computationally intractable; in many such cases these methods, named Evolutionary Algorithms (EAs), have been successfully implemented. In statistics there are many situations where complex problems arise, in particular concerning optimization. A general example is when the statistician needs to select just one element from a prohibitively large discrete set; the element could be a model, a partition, or an experiment, as in model selection, cluster analysis, or design of experiments. In other situations there could be an intractable function of the data, such as a likelihood, which needs to be maximized, as happens in model parameter estimation. These kinds of problems are naturally well suited to EAs, and in the last 20 years a large number of papers have been concerned with applications of EAs to statistical issues. The present dissertation belongs to this part of the literature: it reports several implementations of EAs in statistics, focusing mainly on statistical inference problems. Original results are proposed, as well as overviews and surveys on several topics. EAs are employed and analyzed from various statistical points of view, showing and confirming their efficiency and flexibility. The first proposal is devoted to parametric estimation problems. When EAs are employed in such analysis, a novel form of variability related to their stochastic elements is introduced. We analyze both the variability due to sampling, associated with the selected estimator, and the variability due to the EA. This analysis is set in the framework of the statistical and computational tradeoff question, crucial in present-day problems, by introducing cost functions related to both data acquisition and EA iterations.
The proposed method is illustrated by means of model building examples. The subsequent chapter is concerned with EAs employed in Markov Chain Monte Carlo (MCMC) sampling. When sampling from a multimodal or highly correlated distribution, one possible strategy is to run several chains in parallel in order to improve their mixing. If these chains are allowed to interact with each other, many analogies with EC techniques can be observed, and this has led to research in many fields. The chapter reviews various methods found in the literature that combine EC techniques and MCMC sampling, in order to identify specific and common procedures and to unify them in an EC framework. In the last proposal we present a complex time series model and an identification procedure based on Genetic Algorithms (GAs). The model is capable of dealing with seasonality, through Periodic AutoRegressive (PAR) modelling, and with structural changes in time, leading to a nonstationary structure. Because a very large number of parameters and possible change points are involved, GAs are appropriate for identifying such a model. The effectiveness of the procedure is shown on both simulated data and real examples, the latter being river flow data in hydrology. The thesis concludes with some final remarks, also concerning future work.

    Segmentation of the Poisson and negative binomial rate models: a penalized estimator

    We consider the segmentation problem of Poisson and negative binomial (i.e. overdispersed Poisson) rate distributions. In segmentation, an important issue remains the choice of the number of segments. To this end, we propose a penalized log-likelihood estimator where the penalty function is constructed in a non-asymptotic context following the works of L. Birgé and P. Massart. The resulting estimator is proved to satisfy an oracle inequality. The performance of our criterion is assessed using simulated and real datasets in the RNA-seq data analysis context.
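The general idea of choosing the number of segments by penalized likelihood can be sketched as below. This is an illustration under stated assumptions, not the estimator of the abstract: it covers only the Poisson case, finds the best segmentation for each segment count by a plain dynamic program, and uses a simple linear penalty `beta * k` in place of the Birgé–Massart-style penalty the authors construct. All function and parameter names are hypothetical.

```python
import math


def segment_cost(prefix, i, j):
    """Poisson negative log-likelihood of ys[i:j] at its MLE rate,
    dropping the log(y!) terms, which do not depend on the segmentation."""
    s = prefix[j] - prefix[i]
    n = j - i
    return s - s * math.log(s / n) if s > 0 else 0.0


def penalized_segmentation(ys, k_max, beta):
    """Return the sorted change points (start indices of segments 2..K)
    minimizing cost(K) + beta * K over K = 1..k_max.

    The linear penalty beta * K is an assumption made for illustration.
    """
    n = len(ys)
    prefix = [0]
    for y in ys:
        prefix.append(prefix[-1] + y)
    inf = float("inf")
    # cost[k][j]: best cost of splitting ys[:j] into exactly k segments.
    cost = [[inf] * (n + 1) for _ in range(k_max + 1)]
    back = [[0] * (n + 1) for _ in range(k_max + 1)]
    cost[0][0] = 0.0
    for k in range(1, k_max + 1):
        for j in range(k, n + 1):
            for i in range(k - 1, j):
                c = cost[k - 1][i] + segment_cost(prefix, i, j)
                if c < cost[k][j]:
                    cost[k][j] = c
                    back[k][j] = i
    # Select the number of segments by the penalized criterion.
    best_k = min(range(1, k_max + 1), key=lambda k: cost[k][n] + beta * k)
    # Walk the back-pointers to recover the segment boundaries.
    bounds = []
    j = n
    for k in range(best_k, 0, -1):
        j = back[k][j]
        bounds.append(j)
    return sorted(b for b in bounds if b > 0)
```

For example, `penalized_segmentation([1] * 10 + [20] * 10, k_max=4, beta=5.0)` recovers the single change at index 10, since adding further segments cannot improve the fit enough to pay the penalty.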

    Changepoint detection for data intensive settings

    Detecting a point in a data sequence where the behaviour alters abruptly, otherwise known as a changepoint, has been an active area of interest for decades. More recently, with the advent of the data intensive era, the need for automated and computationally efficient changepoint methods has grown. Here we introduce several new techniques for doing this which address many of the issues inherent in detecting changes in a streaming setting. In short, these new methods, which may be viewed as non-trivial extensions of existing classical procedures, are intended to be as useful in as wide a set of situations as possible, while retaining important theoretical guarantees and ease of implementation. The first novel contribution concerns two methods for parallelising existing dynamic programming based approaches to changepoint detection in the univariate setting. We demonstrate that these methods can result in near quadratic computational gains, while retaining important theoretical guarantees. Our next area of focus is the multivariate setting. We introduce two new methods for data intensive scenarios with a fixed, but possibly large, number of dimensions. The first of these is an offline method which detects one change at a time using a new test statistic. We demonstrate that this test statistic has competitive power in a variety of possible settings for a given changepoint, while allowing the method to be versatile across a range of possible modelling assumptions. The other method we introduce for multivariate data is also suitable in the streaming setting. In addition, it is able to relax many standard modelling assumptions. We discuss the empirical properties of the procedure, especially insofar as they relate to a desired false alarm error rate.