Efficient Algorithms for Non-convex Isotonic Regression through Submodular Optimization
We consider the minimization of submodular functions subject to ordering constraints. We show that this optimization problem can be cast as a convex optimization problem on a space of uni-dimensional measures, with ordering constraints corresponding to first-order stochastic dominance. We propose new discretization schemes that lead to simple and efficient algorithms based on zeroth-, first-, or higher-order oracles; these algorithms also lead to improvements without isotonic constraints. Finally, our experiments show that non-convex loss functions can be much more robust to outliers for isotonic regression, while still leading to an efficient optimization problem.
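To illustrate the robustness claim, the following is a minimal sketch, not the paper's submodular-optimization algorithm. It assumes the fitted values are restricted to a fixed grid and solves the resulting isotonic problem exactly on that grid by dynamic programming, for any separable loss. The names isotonic_on_grid, squared and truncated are hypothetical, and the truncated quadratic stands in for a generic non-convex robust loss.

```python
import numpy as np

def isotonic_on_grid(y, grid, loss):
    """Minimise sum_i loss(y[i], x[i]) subject to x[0] <= ... <= x[n-1],
    with every x[i] restricted to the values in `grid` (an assumption of
    this sketch, not of the paper)."""
    y = np.asarray(y, dtype=float)
    grid = np.sort(np.asarray(grid, dtype=float))
    n, k = len(y), len(grid)
    cost = np.empty((n, k))            # cost[i, j]: best cost of y[:i+1] with x[i] = grid[j]
    back = np.zeros((n, k), dtype=int) # back[i, j]: grid index chosen for x[i-1]
    cost[0] = loss(y[0], grid)
    for i in range(1, n):
        prev = cost[i - 1]
        best = np.minimum.accumulate(prev)  # min over grid levels j' <= j
        arg = np.zeros(k, dtype=int)        # index attaining that prefix minimum
        for j in range(1, k):
            arg[j] = j if prev[j] < best[j - 1] else arg[j - 1]
        cost[i] = loss(y[i], grid) + best
        back[i] = arg
    # Backtrack from the cheapest final level.
    j = int(np.argmin(cost[-1]))
    idx = np.empty(n, dtype=int)
    for i in range(n - 1, -1, -1):
        idx[i] = j
        j = back[i, j]
    return grid[idx]

def squared(yi, g):
    # Convex loss: sensitive to outliers.
    return (g - yi) ** 2

def truncated(yi, g):
    # Non-convex loss: each point contributes at most 1.
    return np.minimum((g - yi) ** 2, 1.0)

y = [0.0, 1.0, 10.0, 2.0, 3.0]         # 10.0 is an outlier
grid = np.arange(0.0, 11.0)            # candidate levels 0, 1, ..., 10
fit_sq = isotonic_on_grid(y, grid, squared)
fit_tr = isotonic_on_grid(y, grid, truncated)
print(fit_sq, fit_tr)
```

With the squared loss the outlier pulls the last three fitted values up to a common level, whereas the truncated loss pays a bounded price for the outlier and otherwise follows the increasing trend. The cost of this sketch is O(nk) for n observations and k grid levels.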
gfpop: an R Package for Univariate Graph-Constrained Change-point Detection
In a world with data that change rapidly and abruptly, it is important to
detect those changes accurately. In this paper we describe an R package
implementing an algorithm recently proposed by Hocking et al. [2017] for
penalised maximum likelihood inference of constrained multiple change-point
models. This algorithm can be used to pinpoint the precise locations of abrupt
changes in large data sequences. There are many application domains for such
models, such as medicine, neuroscience or genomics. Often, practitioners have
prior knowledge about the changes they are looking for. For example in genomic
data, biologists sometimes expect peaks: up changes followed by down changes.
Taking advantage of such prior information can substantially improve the
accuracy with which we can detect and estimate changes. Hocking et al. [2017]
described a graph framework to encode many examples of such prior information
and a generic algorithm to infer the optimal model parameters, but implemented
the algorithm for just a single scenario. We present the gfpop package that
implements the algorithm in a generic manner in R/C++. gfpop works for a
user-defined graph that can encode prior information about the types of change
and implements several loss functions (Gauss, Poisson, Binomial, Biweight and
Huber). We then illustrate the use of gfpop on isotonic simulations and several
applications in biology. For a number of graphs the algorithm runs in a matter
of seconds or minutes for 10^5 data points.
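As a rough illustration of penalised change-point inference, the sketch below solves the unconstrained special case (any change allowed, Gaussian loss) with a quadratic-time dynamic program. It is written in Python for illustration only and is not the gfpop package, its graph interface, or the functional-pruning algorithm of Hocking et al. [2017]; the name optimal_partitioning and its parameters are hypothetical.

```python
import numpy as np

def optimal_partitioning(y, penalty):
    """Return the sorted change-point positions minimising
    (sum of within-segment squared errors) + penalty * (number of changes).
    A position s means a new segment starts at index s."""
    y = np.asarray(y, dtype=float)
    n = len(y)
    # Prefix sums give the squared-error cost of any segment in O(1).
    s1 = np.concatenate(([0.0], np.cumsum(y)))
    s2 = np.concatenate(([0.0], np.cumsum(y ** 2)))

    def seg_cost(a, b):
        # Squared error of y[a:b] around its own mean.
        return s2[b] - s2[a] - (s1[b] - s1[a]) ** 2 / (b - a)

    best = np.full(n + 1, np.inf)   # best[t]: optimal penalised cost of y[:t]
    best[0] = -penalty              # so the first segment carries no penalty
    last = np.zeros(n + 1, dtype=int)
    for t in range(1, n + 1):
        for s in range(t):          # s: start of the final segment y[s:t]
            c = best[s] + seg_cost(s, t) + penalty
            if c < best[t]:
                best[t], last[t] = c, s
    # Backtrack through the stored segment starts.
    changes, t = [], n
    while t > 0:
        s = last[t]
        if s > 0:
            changes.append(int(s))
        t = s
    return sorted(changes)

signal = [0.0] * 5 + [5.0] * 5      # one abrupt up change at index 5
changes = optimal_partitioning(signal, penalty=1.0)
print(changes)
```

Constraints such as "up changes followed by down changes" would restrict which transitions between consecutive segments are allowed; this sketch omits them, and its O(n^2) loop would not scale to 10^5 data points the way a pruned algorithm can.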