Monotone Regression: A Simple and Fast O(n) PAVA Implementation
Efficient coding and improvements in the execution order of the up-and-down-blocks algorithm for monotone (isotonic) regression lead to a significant increase in speed as well as a short and simple O(n) implementation. Algorithms that use monotone regression as a subroutine, e.g., unimodal or bivariate monotone regression, also benefit from the acceleration. A substantive comparison with, and characterization of, currently available implementations provides an extensive overview of up-and-down-blocks implementations of the pool-adjacent-violators algorithm (PAVA) for simple linearly ordered monotone regression.
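The up-and-down-blocks idea can be sketched in a few lines: scan left to right, and whenever the last two blocks violate monotonicity, pool them into their weighted mean. This is a minimal illustrative sketch, not the paper's optimized code; the function name `pava` and the list-based block representation are my own.

```python
# A minimal O(n) pool-adjacent-violators (PAVA) sketch for least-squares
# isotonic regression; 'pava' and its block representation are illustrative
# names, not the paper's implementation.

def pava(y, w=None):
    """Return the nondecreasing fit minimizing sum_i w_i * (x_i - y_i)^2."""
    n = len(y)
    w = [1.0] * n if w is None else list(w)
    # Each active block stores (weighted mean, total weight, point count).
    means, weights, counts = [], [], []
    for yi, wi in zip(y, w):
        means.append(yi); weights.append(wi); counts.append(1)
        # Merge while the last two blocks violate monotonicity.
        while len(means) > 1 and means[-2] > means[-1]:
            m2, w2, c2 = means.pop(), weights.pop(), counts.pop()
            m1, w1, c1 = means.pop(), weights.pop(), counts.pop()
            wt = w1 + w2
            means.append((w1 * m1 + w2 * m2) / wt)
            weights.append(wt); counts.append(c1 + c2)
    # Expand blocks back to one fitted value per input point.
    fit = []
    for m, c in zip(means, counts):
        fit.extend([m] * c)
    return fit

print(pava([3.0, 1.0, 2.0]))  # -> [2.0, 2.0, 2.0]
```

Each point is merged into a block at most once, which is what makes the total work linear in n.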
Optimal Rates of Statistical Seriation
Given a matrix, the seriation problem consists in permuting its rows in such
a way that all its columns have the same shape, for example, they are monotone
increasing. We propose a statistical approach to this problem where the matrix
of interest is observed with noise and study the corresponding minimax rate of
estimation of the matrices. Specifically, when the columns are either unimodal
or monotone, we show that the least squares estimator is optimal up to
logarithmic factors and adapts to matrices with a certain natural structure.
Finally, we propose a computationally efficient estimator in the monotonic case
and study its performance both theoretically and experimentally. Our work is at
the intersection of shape constrained estimation and recent work that involves
permutation learning, such as graph denoising and ranking.
Comment: v2 corrects an error in Lemma A.1; v3 corrects Appendix F on unimodal regression, where the bounds now hold with polynomial probability rather than exponential.
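As a loose illustration of permutation learning in the noiseless monotone case (a heuristic sketch, not the estimator analyzed in the paper), one can estimate the row permutation by sorting rows on their sums; the function name `seriate_by_row_sums` is hypothetical.

```python
# A sketch of permutation estimation by sorting rows on their sums
# (an illustrative heuristic, not the paper's estimator).
import numpy as np

def seriate_by_row_sums(A):
    """Return A with rows permuted so that row sums are nondecreasing."""
    order = np.argsort(A.sum(axis=1), kind="stable")
    return A[order], order

# A matrix with monotone columns, rows shuffled: sorting by row sums
# recovers the original row order in this noiseless example.
M = np.array([[0, 1], [2, 3], [4, 5]])   # columns are increasing
shuffled = M[[2, 0, 1]]                   # apply some permutation
recovered, order = seriate_by_row_sums(shuffled)
print(recovered.tolist())  # -> [[0, 1], [2, 3], [4, 5]]
```

With noise, such a one-shot sort is only a starting point; the paper studies how far any estimator can get in the minimax sense.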
A dynamic programming approach for generalized nearly isotonic optimization
Shape restricted statistical estimation problems have been extensively
studied, with many important practical applications in signal processing,
bioinformatics, and machine learning. In this paper, we propose and study a
generalized nearly isotonic optimization (GNIO) model, which recovers, as
special cases, many classic problems in shape constrained statistical
regression, such as isotonic regression, nearly isotonic regression and
unimodal regression problems. We develop an efficient and easy-to-implement
dynamic programming algorithm for solving the proposed model, whose recursive
nature is carefully uncovered and exploited. For special ℓ2-GNIO problems,
implementation details and an optimal O(n) running time analysis of our
algorithm are discussed. Numerical experiments, including comparisons between
our approach and the powerful commercial solver Gurobi for solving ℓ1-GNIO
and ℓ2-GNIO problems on both simulated and real data sets, are presented to
demonstrate the high efficiency and robustness of our proposed algorithm in
solving large-scale GNIO problems.
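The recursive structure can be illustrated by a crude discretized version of the model (squared loss, one-sided penalties on downward moves): restrict each fitted value to a finite grid and propagate a cost table forward. This O(n·m²) sketch with hypothetical names is only for intuition; the paper's algorithm is exact and far more efficient.

```python
# A discretized dynamic-programming sketch of a GNIO-style recursion:
# minimize sum_i (x_i - y_i)^2 + lam * sum_i max(x_i - x_{i+1}, 0),
# with each x_i restricted to a finite grid. The name 'gnio_dp_grid' and
# the O(n * m^2) scan are illustrative, not the paper's exact algorithm.

def gnio_dp_grid(y, lam, grid):
    n, m = len(y), len(grid)
    INF = float("inf")
    D = [(grid[j] - y[0]) ** 2 for j in range(m)]
    back = []  # back[i][j] = best predecessor grid index when x_{i+1} = grid[j]
    for i in range(1, n):
        newD, bp = [], []
        for j in range(m):
            best, arg = INF, -1
            for k in range(m):
                c = D[k] + lam * max(grid[k] - grid[j], 0.0)
                if c < best:
                    best, arg = c, k
            newD.append(best + (grid[j] - y[i]) ** 2)
            bp.append(arg)
        D, back = newD, back + [bp]
    # Recover an optimal grid path by backtracking.
    j = min(range(m), key=lambda j: D[j])
    xs = [grid[j]]
    for bp in reversed(back):
        j = bp[j]
        xs.append(grid[j])
    return xs[::-1], min(D)

x, cost = gnio_dp_grid([3.0, 1.0, 2.0], lam=1e6, grid=[1.0, 2.0, 3.0])
print(x, cost)  # a huge lam makes the fit isotonic: [2.0, 2.0, 2.0]
```

Taking lam to infinity recovers isotonic regression; lam = 0 leaves the data untouched, which is exactly the "nearly isotonic" trade-off the model generalizes.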
Private Isotonic Regression
In this paper, we consider the problem of differentially private (DP)
algorithms for isotonic regression. For the most general problem of isotonic
regression over a partially ordered set (poset) X and for any Lipschitz loss
function, we obtain a pure-DP algorithm that, given n input points, has an
expected excess empirical risk of roughly width(X) * log|X| / n, where
width(X) is the width of the poset. In contrast, we also obtain a
near-matching lower bound of roughly (width(X) + log|X|) / n, which holds
even for approximate-DP algorithms. Moreover, we show that the above bounds
are essentially the best that can be obtained without utilizing any further
structure of the poset.
In the special case of a totally ordered set and for ℓ1 and squared-ℓ2
losses, our algorithm can be implemented in near-linear running time; we also
provide extensions of this algorithm to the problem of private isotonic
regression with additional structural constraints on the output function.
Comment: Neural Information Processing Systems (NeurIPS), 2022.
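For intuition only, here is a naive ε-DP baseline that is NOT the paper's algorithm and carries none of its guarantees: perturb binned sums and counts with Laplace noise, then project the noisy bin means onto the monotone cone with PAVA. All function names are hypothetical, and responses are assumed to lie in [0, 1].

```python
# A naive epsilon-DP baseline for isotonic regression (NOT the paper's
# algorithm): add Laplace noise to per-bin sums and counts, form noisy bin
# means, then project onto nondecreasing sequences with PAVA. Assumes
# responses in [0, 1]; all names here are illustrative.
import random

def pava(y):
    out = []  # active blocks as [mean, count]
    for v in y:
        out.append([v, 1])
        while len(out) > 1 and out[-2][0] > out[-1][0]:
            m2, c2 = out.pop()
            m1, c1 = out.pop()
            out.append([(m1 * c1 + m2 * c2) / (c1 + c2), c1 + c2])
    fit = []
    for m, c in out:
        fit.extend([m] * c)
    return fit

def dp_isotonic_bins(bin_sums, bin_counts, eps, rng):
    """One point changes one bin's sum by <= 1 and its count by 1 (L1
    sensitivity 2), so Laplace(2/eps) noise per released statistic gives
    eps-DP. Laplace noise is sampled as a difference of two exponentials."""
    noisy = []
    for s, c in zip(bin_sums, bin_counts):
        ns = s + rng.expovariate(eps / 2) - rng.expovariate(eps / 2)
        nc = c + rng.expovariate(eps / 2) - rng.expovariate(eps / 2)
        noisy.append(ns / max(nc, 1.0))
    return pava(noisy)

rng = random.Random(0)
fit = dp_isotonic_bins([2.0, 10.0, 25.0], [10, 20, 30], eps=1.0, rng=rng)
print(fit)  # nondecreasing by construction
```

The point of the paper is precisely that such naive noise-then-project schemes are far from the optimal width-dependent rates quoted above.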
Efficient Second-Order Shape-Constrained Function Fitting
We give an algorithm to compute a one-dimensional shape-constrained function
that best fits given data in weighted L∞ norm. We give a single
algorithm that works for a variety of commonly studied shape constraints
including monotonicity, Lipschitz continuity and convexity, and more generally,
any shape constraint expressible by bounds on first- and/or second-order
differences. Our algorithm computes an approximation with additive error
ε in O(n log(U/ε)) time, where U
captures the range of input values. We also give a simple greedy algorithm that
runs in O(n) time for the special case of unweighted convex
regression. These are the first (near-)linear-time algorithms for
second-order-constrained function fitting. To achieve these results, we use a
novel geometric interpretation of the underlying dynamic programming problem.
We further show that a generalization of the corresponding problems to directed
acyclic graphs (DAGs) is as difficult as linear programming.
Comment: accepted for WADS 2019 (v2 fixes various typos).
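Since the DAG generalization is as hard as linear programming, a small LP is a natural (if slow) baseline for the one-dimensional problem itself. This sketch solves unweighted L∞ convex regression with SciPy's `linprog`; it is not the paper's near-linear method, and `linf_convex_fit` is an illustrative name.

```python
# Baseline: L-infinity convex regression written directly as a linear
# program over variables x_1..x_n and an error bound t (minimize t subject
# to |x_i - y_i| <= t and nonnegative second differences). This is a slow
# generic sketch, not the paper's near-linear algorithm.
import numpy as np
from scipy.optimize import linprog

def linf_convex_fit(y):
    n = len(y)
    c = np.zeros(n + 1); c[-1] = 1.0   # objective: minimize t
    A, b = [], []
    for i in range(n):                 # |x_i - y_i| <= t
        row = np.zeros(n + 1); row[i] = 1.0; row[-1] = -1.0
        A.append(row); b.append(y[i])
        row = np.zeros(n + 1); row[i] = -1.0; row[-1] = -1.0
        A.append(row); b.append(-y[i])
    for i in range(1, n - 1):          # convexity: x_{i-1} - 2x_i + x_{i+1} >= 0
        row = np.zeros(n + 1)
        row[i - 1] = -1.0; row[i] = 2.0; row[i + 1] = -1.0
        A.append(row); b.append(0.0)
    bounds = [(None, None)] * n + [(0, None)]
    res = linprog(c, A_ub=np.array(A), b_ub=np.array(b), bounds=bounds)
    return res.x[:-1], res.x[-1]

x, t = linf_convex_fit([3.0, 1.0, 0.0, 1.0, 3.0])  # data already convex
print(round(t, 6))
```

On already-convex data the optimal error bound t is zero; the LP has O(n) variables and constraints, but generic solvers do not match the specialized O(n) greedy routine the paper gives for this case.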
gfpop: an R Package for Univariate Graph-Constrained Change-point Detection
In a world with data that change rapidly and abruptly, it is important to
detect those changes accurately. In this paper we describe an R package
implementing an algorithm recently proposed by Hocking et al. [2017] for
penalised maximum likelihood inference of constrained multiple change-point
models. This algorithm can be used to pinpoint the precise locations of abrupt
changes in large data sequences. There are many application domains for such
models, such as medicine, neuroscience or genomics. Often, practitioners have
prior knowledge about the changes they are looking for. For example in genomic
data, biologists sometimes expect peaks: up changes followed by down changes.
Taking advantage of such prior information can substantially improve the
accuracy with which we can detect and estimate changes. Hocking et al. [2017]
described a graph framework to encode many examples of such prior information
and a generic algorithm to infer the optimal model parameters, but implemented
the algorithm for just a single scenario. We present the gfpop package that
implements the algorithm in a generic manner in R/C++. gfpop works for a
user-defined graph that can encode prior information about the types of change
and implements several loss functions (Gauss, Poisson, Binomial, Biweight and
Huber). We then illustrate the use of gfpop on isotonic simulations and several
applications in biology. For a number of graphs, the algorithm runs in a matter
of seconds or minutes for 10^5 data points.
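gfpop itself is an R/C++ package with functional pruning and constraint graphs; as background, the unconstrained Gaussian case it generalizes is the classic penalized "optimal partitioning" recursion F(t) = min over s < t of F(s) + SSE(y[s:t]) + beta. Here is a plain O(n²) Python sketch of that recursion (names are illustrative, and none of gfpop's graph constraints or pruning appear).

```python
# A plain O(n^2) "optimal partitioning" sketch of penalized Gaussian
# change-point detection, the unconstrained special case that gfpop
# generalizes with constraint graphs and functional pruning.

def optimal_partitioning(y, beta):
    n = len(y)
    # Prefix sums give O(1) segment sum-of-squared-errors.
    S, S2 = [0.0], [0.0]
    for v in y:
        S.append(S[-1] + v); S2.append(S2[-1] + v * v)

    def sse(s, t):  # SSE of y[s:t] around its own mean
        m = (S[t] - S[s]) / (t - s)
        return S2[t] - S2[s] - (t - s) * m * m

    F = [-beta] + [float("inf")] * n
    prev = [0] * (n + 1)
    for t in range(1, n + 1):
        for s in range(t):
            c = F[s] + sse(s, t) + beta
            if c < F[t]:
                F[t], prev[t] = c, s
    # Backtrack the change-point positions.
    cps, t = [], n
    while t > 0:
        t = prev[t]
        if t > 0:
            cps.append(t)
    return sorted(cps)

print(optimal_partitioning([0, 0, 0, 5, 5, 5], beta=1.0))  # -> [3]
```

gfpop's contribution is to solve such recursions exactly with continuous functional updates (no O(n²) scan) while restricting transitions to a user-supplied graph, e.g., forcing an up change to be followed by a down change for peak models.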