1,686 research outputs found

    Mixture model with multiple allocations for clustering spatially correlated observations in the analysis of ChIP-Seq data

    Get PDF
    Model-based clustering is a technique widely used to group a collection of units into mutually exclusive groups. There are, however, situations in which an observation could in principle belong to more than one cluster. In the context of Next-Generation Sequencing (NGS) experiments, for example, the signal observed in the data might be produced by two (or more) different biological processes operating together and a gene could participate in both (or all) of them. We propose a novel approach to cluster NGS discrete data, coming from a ChIP-Seq experiment, with a mixture model, allowing each unit to belong potentially to more than one group: these multiple allocation clusters can be flexibly defined via a function combining the features of the original groups without introducing new parameters. The formulation naturally gives rise to a `zero-inflation group' in which values close to zero can be allocated, acting as a correction for the abundance of zeros that manifest in this type of data. We take into account the spatial dependency between observations, which is described through a latent Conditional Auto-Regressive process that can reflect different dependency patterns. We assess the performance of our model within a simulation environment and then we apply it to ChIP-seq real data.Comment: 25 pages; 3 tables, 6 figure

    A Simple Class of Bayesian Nonparametric Autoregression Models

    Get PDF
    We introduce a model for a time series of continuous outcomes, that can be expressed as fully nonparametric regression or density regression on lagged terms. The model is based on a dependent Dirichlet process prior on a family of random probability measures indexed by the lagged covariates. The approach is also extended to sequences of binary responses. We discuss implementation and applications of the models to a sequence of waiting times between eruptions of the Old Faithful Geyser, and to a dataset consisting of sequences of recurrence indicators for tumors in the bladder of several patients.MIUR 2008MK3AFZFONDECYT 1100010NIH/NCI R01CA075981Mathematic

    Nonlinear Time Series Modeling: A Unified Perspective, Algorithm, and Application

    Full text link
    A new comprehensive approach to nonlinear time series analysis and modeling is developed in the present paper. We introduce novel data-specific mid-distribution based Legendre Polynomial (LP) like nonlinear transformations of the original time series Y(t) that enables us to adapt all the existing stationary linear Gaussian time series modeling strategy and made it applicable for non-Gaussian and nonlinear processes in a robust fashion. The emphasis of the present paper is on empirical time series modeling via the algorithm LPTime. We demonstrate the effectiveness of our theoretical framework using daily S&P 500 return data between Jan/2/1963 - Dec/31/2009. Our proposed LPTime algorithm systematically discovers all the `stylized facts' of the financial time series automatically all at once, which were previously noted by many researchers one at a time.Comment: Major restructuring has been don

    Non-parametric Probabilistic Time Series Forecasting via Innovations Representation

    Full text link
    Probabilistic time series forecasting predicts the conditional probability distributions of the time series at a future time given past realizations. Such techniques are critical in risk-based decision-making and planning under uncertainties. Existing approaches are primarily based on parametric or semi-parametric time-series models that are restrictive, difficult to validate, and challenging to adapt to varying conditions. This paper proposes a nonparametric method based on the classic notion of {\em innovations} pioneered by Norbert Wiener and Gopinath Kallianpur that causally transforms a nonparametric random process to an independent and identical uniformly distributed {\em innovations process}. We present a machine-learning architecture and a learning algorithm that circumvent two limitations of the original Wiener-Kallianpur innovations representation: (i) the need for known probability distributions of the time series and (ii) the existence of a causal decoder that reproduces the original time series from the innovations representation. We develop a deep-learning approach and a Monte Carlo sampling technique to obtain a generative model for the predicted conditional probability distribution of the time series based on a weak notion of Wiener-Kallianpur innovations representation. The efficacy of the proposed probabilistic forecasting technique is demonstrated on a variety of electricity price datasets, showing marked improvement over leading benchmarks of probabilistic forecasting techniques

    Hierarchical Decomposition of Nonlinear Dynamics and Control for System Identification and Policy Distillation

    Full text link
    The control of nonlinear dynamical systems remains a major challenge for autonomous agents. Current trends in reinforcement learning (RL) focus on complex representations of dynamics and policies, which have yielded impressive results in solving a variety of hard control tasks. However, this new sophistication and extremely over-parameterized models have come with the cost of an overall reduction in our ability to interpret the resulting policies. In this paper, we take inspiration from the control community and apply the principles of hybrid switching systems in order to break down complex dynamics into simpler components. We exploit the rich representational power of probabilistic graphical models and derive an expectation-maximization (EM) algorithm for learning a sequence model to capture the temporal structure of the data and automatically decompose nonlinear dynamics into stochastic switching linear dynamical systems. Moreover, we show how this framework of switching models enables extracting hierarchies of Markovian and auto-regressive locally linear controllers from nonlinear experts in an imitation learning scenario.Comment: 2nd Annual Conference on Learning for Dynamics and Contro

    Methods for event time series prediction and anomaly detection

    Get PDF
    Event time series are sequences of events occurring in continuous time. They arise in many real-world problems and may represent, for example, posts in social media, administrations of medications to patients, or adverse events, such as episodes of atrial fibrillation or earthquakes. In this work, we study and develop methods for prediction and anomaly detection on event time series. We study two general approaches. The first approach converts event time series to regular time series of counts via time discretization. We develop methods relying on (a) nonparametric time series decomposition and (b) dynamic linear models for regular time series. The second approach models the events in continuous time directly. We develop methods relying on point processes. For prediction, we develop a new model based on point processes to combine the advantages of existing models. It is flexible enough to capture complex dependency structures between events, while not sacrificing applicability in common scenarios. For anomaly detection, we develop methods that can detect new types of anomalies in continuous time and that show advantages compared to time discretization
    • …
    corecore