Distribution-Independent Regression for Generalized Linear Models with Oblivious Corruptions
We demonstrate the first algorithms for the problem of regression for
generalized linear models (GLMs) in the presence of additive oblivious noise.
We assume we have sample access to examples $(x, y)$ where $y$ is a noisy
measurement of $g(w^* \cdot x)$. In particular, the noisy labels are of
the form $y = g(w^* \cdot x) + \xi + \epsilon$, where $\xi$ is the oblivious
noise drawn independently of $x$ and satisfies $\Pr[\xi = 0] \geq o(1)$,
and $\epsilon \sim \mathcal{N}(0, \sigma^2)$. Our goal is to accurately recover
a parameter vector $w$ such that the function $g(w \cdot x)$ has
arbitrarily small error when compared to the true values $g(w^* \cdot x)$,
rather than the noisy measurements $y$.
We present an algorithm that tackles this problem in its most general
distribution-independent setting, where the solution may not even be
identifiable. Our algorithm returns an accurate estimate of the
solution if it is identifiable, and otherwise returns a small list of
candidates, one of which is close to the true solution. Furthermore, we
provide a necessary and sufficient condition for identifiability, which
holds in broad settings. Specifically, the problem is identifiable when
the quantile at which $\xi + \epsilon = 0$ is known, or when the family of
hypotheses does not contain candidates that are nearly equal to a translated
$g(w^* \cdot x) + A$ for some real number $A$, while also having large error
when compared to $g(w^* \cdot x)$.
This is the first algorithmic result for GLM regression with
oblivious noise that can handle more than half the samples being arbitrarily
corrupted. Prior work focused largely on the setting of linear regression, and
gave algorithms under restrictive assumptions.
Comment: Published in COLT 202
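To make the setting concrete, here is a minimal simulation sketch of the data model the abstract describes: a label is an activation applied to a linear form, plus oblivious noise drawn independently of the input, plus small Gaussian noise. It is not the paper's algorithm. The ReLU activation, the uniform corruption range, the 40% clean fraction, and every function name are illustrative assumptions. The sketch only shows why error must be measured against the clean values rather than the noisy labels when most samples are corrupted.

```python
import random


def dot(w, x):
    return sum(wi * xi for wi, xi in zip(w, x))


def relu(t):
    # Illustrative activation; the abstract does not fix a particular one.
    return max(0.0, t)


def sample_oblivious_glm(n, w_star, activation, clean_frac, sigma, rng):
    """Draw n rows (x, clean_value, noisy_label).

    Assumption: the oblivious noise is 0 with probability clean_frac and
    otherwise uniform on [5, 50]; it never looks at x.
    """
    rows = []
    for _ in range(n):
        x = [rng.gauss(0.0, 1.0) for _ in w_star]
        clean = activation(dot(w_star, x))
        oblivious = 0.0 if rng.random() < clean_frac else rng.uniform(5.0, 50.0)
        gaussian = rng.gauss(0.0, sigma)
        rows.append((x, clean, clean + oblivious + gaussian))
    return rows


def mean_squared_error(w, activation, rows, target):
    """Average squared error of w against the 'clean' values or the 'noisy' labels."""
    idx = 1 if target == "clean" else 2
    return sum((activation(dot(w, r[0])) - r[idx]) ** 2 for r in rows) / len(rows)


rng = random.Random(0)
w_star = [1.0, -2.0, 0.5]
rows = sample_oblivious_glm(2000, w_star, relu, clean_frac=0.4, sigma=0.1, rng=rng)

# The true parameter has zero error against the clean values but a large
# error against the noisy labels, because about 60% of them are corrupted.
print(mean_squared_error(w_star, relu, rows, "clean"))
print(mean_squared_error(w_star, relu, rows, "noisy"))
```

In this toy setup, a least-squares fit to the noisy labels would be pulled toward the corruptions, which is the failure mode the abstract's guarantee is meant to avoid.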
An exact dynamic programming approach to segmented isotonic regression
This paper proposes a polynomial-time algorithm to construct the monotone stepwise curve that minimizes the sum of squared errors with respect to a given cloud of data points. The fitted curve is also constrained in the maximum number of steps it can comprise and in the minimum step length. Our algorithm relies on dynamic programming and builds on the observation that this curve-fitting task can be tackled as a shortest-path type of problem. Numerical results on synthetic and realistic data sets reveal that our algorithm is able to provide the globally optimal monotone stepwise curve fit for samples with thousands of data points in less than a few hours. Furthermore, the algorithm gives a certificate on the optimality gap of any incumbent solution it generates. From a practical standpoint, this research is motivated by the roll-out of smart grids and the increasing role played by the small flexible consumption of electricity in the large-scale integration of renewable energy sources into current power systems. Within this context, our algorithm constitutes a useful tool to generate bidding curves for a pool of small flexible consumers to partake in wholesale electricity markets.
This research has received funding from the European Research Council (ERC) under the European Union’s Horizon 2020 research and innovation programme (grant agreement no. 755705). This work was also supported in part by the Spanish Ministry of Economy, Industry and Competitiveness and the European Regional Development Fund (ERDF) through project ENE2017-83775-P. Martine Labbé has been partially supported by the Fonds de la Recherche Scientifique - FNRS under Grant(s) no PDR T0098.18.
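The curve-fitting task described above can be illustrated with a small dynamic program. This is a brute-force sketch and not the authors' algorithm: it assumes the points are already sorted by their x-coordinate, measures the minimum step length in number of points rather than in x-units, and uses the hypothetical name `segmented_isotonic`. Each state is a candidate step (a contiguous block of points fitted by its mean), and a step may follow another only if its mean is not lower, so the best fit corresponds to a cheapest path through at most `max_steps` states.

```python
import math
from itertools import accumulate


def segmented_isotonic(y, max_steps, min_len):
    """Least-squares non-decreasing step fit to y (sorted by x).

    Returns (sse, steps), where steps is a list of (start, end, level)
    covering y[start:end]. Runs in roughly O(max_steps * n^3) time.
    """
    n = len(y)
    s1 = [0.0] + list(accumulate(y))                 # prefix sums of y
    s2 = [0.0] + list(accumulate(v * v for v in y))  # prefix sums of y^2

    def mean(j, i):
        return (s1[i] - s1[j]) / (i - j)

    def sse(j, i):
        total = s1[i] - s1[j]
        return (s2[i] - s2[j]) - total * total / (i - j)

    # best[k][(j, i)] = (cost, previous step) when y[:i] is covered by
    # exactly k steps and the last step is y[j:i].
    best = [dict() for _ in range(max_steps + 1)]
    for i in range(min_len, n + 1):
        best[1][(0, i)] = (sse(0, i), None)
    for k in range(2, max_steps + 1):
        for (l, j), (cost, _) in best[k - 1].items():
            for i in range(j + min_len, n + 1):
                if mean(l, j) <= mean(j, i):  # monotonicity between steps
                    cand = cost + sse(j, i)
                    if cand < best[k].get((j, i), (math.inf, None))[0]:
                        best[k][(j, i)] = (cand, (l, j))

    # Pick the cheapest state that ends at the last point, over all k.
    final = min(
        ((cost, k, seg)
         for k in range(1, max_steps + 1)
         for seg, (cost, _) in best[k].items() if seg[1] == n),
        default=None,
    )
    if final is None:
        raise ValueError("no feasible fit: need len(y) >= min_len")
    cost, k, seg = final

    # Walk the predecessor links back to the first step.
    steps = []
    while seg is not None:
        steps.append((seg[0], seg[1], mean(*seg)))
        seg = best[k][seg][1]
        k -= 1
    return cost, steps[::-1]


cost, steps = segmented_isotonic([1, 1, 5, 5, 9, 9], max_steps=3, min_len=2)
print(cost, steps)
```

Restricting each step's level to the block mean loses nothing here: if monotonicity forced two neighbouring steps to share a level, merging them gives a fit with fewer steps and the same error. The cubic running time keeps this sketch practical only for small samples.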