Distribution-Independent Regression for Generalized Linear Models with Oblivious Corruptions

Authors: Diakonikolas Ilias; Karmalkar Sushrut; Park Jongho; Tzamos Christos
Publication date: 27/09/2023

We demonstrate the first algorithms for the problem of regression for generalized linear models (GLMs) in the presence of additive oblivious noise. We assume we have sample access to examples $(x, y)$ where $y$ is a noisy measurement of $g(w^* \cdot x)$. In particular, the noisy labels are of the form $y = g(w^* \cdot x) + \xi + \epsilon$, where $\xi$ is the oblivious noise drawn independently of $x$ and satisfies $\Pr[\xi = 0] \geq o(1)$, and $\epsilon \sim \mathcal{N}(0, \sigma^2)$. Our goal is to accurately recover a parameter vector $w$ such that the function $g(w \cdot x)$ has arbitrarily small error when compared to the true values $g(w^* \cdot x)$, rather than the noisy measurements $y$. We present an algorithm that tackles this problem in its most general distribution-independent setting, where the solution may not even be identifiable. Our algorithm returns an accurate estimate of the solution if it is identifiable, and otherwise returns a small list of candidates, one of which is close to the true solution. Furthermore, we provide a necessary and sufficient condition for identifiability, which holds in broad settings. Specifically, the problem is identifiable when the quantile at which $\xi + \epsilon = 0$ is known, or when the family of hypotheses does not contain candidates that are nearly equal to a translated $g(w^* \cdot x) + A$ for some real number $A$, while also having large error when compared to $g(w^* \cdot x)$. This is the first algorithmic result for GLM regression with oblivious noise which can handle more than half the samples being arbitrarily corrupted. Prior work focused largely on the setting of linear regression, and gave algorithms under restrictive assumptions.

Comment: Published in COLT 202
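To make the noise model in this abstract concrete, the following is a minimal sketch that draws synthetic data of the form y = g(w* · x) + ξ + ε. It only illustrates the data model and is not the authors' recovery algorithm. Everything specific here is an assumption made for illustration: the function name `sample_oblivious_glm`, the choice of ReLU for g, Gaussian covariates, the clean fraction `alpha`, and the uniform distribution used for the nonzero values of ξ.

```python
import numpy as np

def sample_oblivious_glm(n, w_star, alpha=0.3, sigma=0.1, seed=0):
    """Draw n samples from y = g(w* . x) + xi + eps (illustrative assumptions only)."""
    rng = np.random.default_rng(seed)
    w_star = np.asarray(w_star, dtype=float)

    # Covariates: standard Gaussian is an assumption; the paper's setting
    # is distribution-independent.
    x = rng.normal(size=(n, w_star.size))

    # Link function g: ReLU is an assumed example of a GLM activation.
    clean = np.maximum(x @ w_star, 0.0)

    # Oblivious noise xi: drawn independently of x, equal to 0 with
    # probability alpha and otherwise an arbitrary (here uniform) value.
    is_clean = rng.random(n) < alpha
    xi = np.where(is_clean, 0.0, rng.uniform(-50.0, 50.0, size=n))

    # Additive Gaussian measurement noise eps ~ N(0, sigma^2).
    eps = rng.normal(0.0, sigma, size=n)

    y = clean + xi + eps
    return x, y, clean, xi

x, y, clean, xi = sample_oblivious_glm(2000, [1.0, -2.0])
print("fraction of samples with xi = 0:", np.mean(xi == 0.0))
```

With `alpha` below 0.5, more than half of the labels carry a nonzero ξ, which is the regime the abstract highlights.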

arXiv.org e-Print Archive

An exact dynamic programming approach to segmented isotonic regression

Authors: Bucarey Víctor; Labbé Martine; Morales Juan Miguel; Pineda Salvador
Publication venue: 'Elsevier BV'
Publication date: 08/12/2020

This paper proposes a polynomial-time algorithm to construct the monotone stepwise curve that minimizes the sum of squared errors with respect to a given cloud of data points. The fitted curve is also constrained on the maximum number of steps it can be composed of and on the minimum step length. Our algorithm relies on dynamic programming and is built on the basis that said curve-fitting task can be tackled as a shortest-path type of problem. Numerical results on synthetic and realistic data sets reveal that our algorithm is able to provide the globally optimal monotone stepwise curve fit for samples with thousands of data points in less than a few hours. Furthermore, the algorithm gives a certificate on the optimality gap of any incumbent solution it generates. From a practical standpoint, this piece of research is motivated by the roll-out of smart grids and the increasing role played by the small flexible consumption of electricity in the large-scale integration of renewable energy sources into current power systems. Within this context, our algorithm constitutes a useful tool to generate bidding curves for a pool of small flexible consumers to partake in wholesale electricity markets.

This research has received funding from the European Research Council (ERC) under the European Union’s Horizon 2020 research and innovation programme (grant agreement no. 755705). This work was also supported in part by the Spanish Ministry of Economy, Industry and Competitiveness and the European Regional Development Fund (ERDF) through project ENE2017-83775-P. Martine Labbé has been partially supported by the Fonds de la Recherche Scientifique - FNRS under Grant(s) no PDR T0098.18
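The curve-fitting task described in this abstract can be illustrated with a small dynamic program. The sketch below is a simplified stand-in and not the authors' shortest-path formulation. It assumes the points are already sorted by their x-coordinate, measures the minimum step length in number of points, and restricts step heights to a user-supplied grid `levels`, so it is exact only with respect to that grid. The function name `segmented_isotonic_sse` and all parameter names are hypothetical.

```python
import numpy as np

def segmented_isotonic_sse(y, levels, max_steps, min_len=1):
    """Min SSE of a non-decreasing step fit with at most max_steps steps.

    Simplifying assumptions: step heights come from `levels`, and each
    step must cover at least `min_len` consecutive points.
    """
    y = np.asarray(y, dtype=float)
    levels = np.sort(np.asarray(levels, dtype=float))
    n, m = len(y), len(levels)

    # Prefix sums give the SSE of any block against every level in O(m).
    s1 = np.concatenate(([0.0], np.cumsum(y)))
    s2 = np.concatenate(([0.0], np.cumsum(y ** 2)))

    def block_cost(i, j):
        # SSE of points i..j-1 against each candidate level.
        return (s2[j] - s2[i]) - 2.0 * levels * (s1[j] - s1[i]) + (j - i) * levels ** 2

    # prev[i, l]: min SSE for the first i points using k-1 steps whose
    # last height is <= levels[l]. Zero points with zero steps cost 0.
    prev = np.full((n + 1, m), np.inf)
    prev[0, :] = 0.0
    best = np.inf

    for _ in range(max_steps):
        cur = np.full((n + 1, m), np.inf)
        for j in range(min_len, n + 1):
            for i in range(0, j - min_len + 1):
                # Last step covers points i..j-1 at height levels[l];
                # the earlier steps must end at a height <= levels[l].
                cur[j] = np.minimum(cur[j], prev[i] + block_cost(i, j))
        # Running minimum over heights enforces monotonicity next round.
        cur = np.minimum.accumulate(cur, axis=1)
        best = min(best, cur[n, -1])
        prev = cur
    return best

sse = segmented_isotonic_sse([1.0, 1.0, 5.0, 5.0], levels=[1.0, 5.0], max_steps=2)
print("minimum SSE with at most 2 steps:", sse)
```

This sketch runs in time proportional to `max_steps * n**2 * len(levels)` and returns only the optimal cost. Recovering the fitted curve would require storing back-pointers, which is omitted here for brevity.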

arXiv.org e-Print Archive

INRIA a CCSD electronic archive server

HAL Descartes

Repositorio Institucional Universidad de Málaga

DI-fusion

Hal-Diderot