36,503 research outputs found
Efficient Algorithms and Lower Bounds for Robust Linear Regression
We study the problem of high-dimensional linear regression in a robust model
where an $\epsilon$-fraction of the samples can be adversarially corrupted. We
focus on the fundamental setting where the covariates of the uncorrupted
samples are drawn from a Gaussian distribution $\mathcal{N}(0, \Sigma)$ on
$\mathbb{R}^d$. We give nearly tight upper bounds and computational lower
bounds for this problem. Specifically, our main contributions are as follows:
For the case that the covariance matrix is known to be the identity, we give
a sample near-optimal and computationally efficient algorithm that outputs a
candidate hypothesis vector $\widehat{\beta}$ which approximates the unknown
regression vector $\beta$ within $\ell_2$-norm
$O(\epsilon \log(1/\epsilon)\, \sigma)$, where $\sigma$ is the standard
deviation of the random observation noise. An error of
$\Omega(\epsilon \sigma)$ is information-theoretically necessary, even with
infinite sample size. Prior work gave an algorithm for this problem with
sample complexity $\tilde{O}(d^2/\epsilon^2)$ whose error guarantee scales
with the $\ell_2$-norm of $\beta$.
For the case of unknown covariance, we show that we can efficiently achieve
the same error guarantee as in the known covariance case using an additional
$\tilde{O}(d^2/\epsilon^2)$ unlabeled examples. On the other hand, an error of
$O(\epsilon \sigma)$ can be information-theoretically attained with
$O(d/\epsilon^2)$ samples. We prove a Statistical Query (SQ) lower bound
providing evidence that this quadratic tradeoff in the sample size is inherent.
More specifically, we show that any polynomial time SQ learning algorithm for
robust linear regression (in Huber's contamination model) with estimation
complexity $O(d^{2-\delta})$, where $\delta > 0$ is an arbitrarily small
constant, must incur an error of $\Omega(\sqrt{\epsilon}\, \sigma)$.
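To make the contamination model above concrete, here is a minimal Python sketch of an $\epsilon$-corrupted linear regression instance with identity covariance, comparing ordinary least squares against a naive trimmed refit. The trimming heuristic and all parameter choices are illustrative assumptions, not the paper's algorithm; the sketch only shows why an adversarial $\epsilon$-fraction breaks OLS.

```python
# Minimal sketch of the eps-contamination model for linear regression.
# NOT the paper's algorithm: the "trimmed refit" loop below is a simple
# illustrative heuristic for comparison against plain least squares.
import numpy as np

rng = np.random.default_rng(0)
d, n, eps, sigma = 20, 5000, 0.1, 1.0

beta = rng.normal(size=d)                  # unknown regression vector
X = rng.normal(size=(n, d))                # Gaussian covariates, Sigma = I
y = X @ beta + sigma * rng.normal(size=n)  # clean observations

m = int(eps * n)                           # adversary corrupts an eps-fraction
X[:m] = rng.normal(size=(m, d))
y[:m] = 1e3                                # gross outliers in the labels

ols = np.linalg.lstsq(X, y, rcond=None)[0]

# Illustrative heuristic: alternately fit, then keep the (1 - eps)-fraction
# of points with the smallest residuals.
keep = np.arange(n)
for _ in range(10):
    b = np.linalg.lstsq(X[keep], y[keep], rcond=None)[0]
    res = np.abs(y - X @ b)
    keep = np.argsort(res)[: n - m]

print("OLS error:    ", np.linalg.norm(ols - beta))
print("Trimmed error:", np.linalg.norm(b - beta))
```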
SOCP relaxation bounds for the optimal subset selection problem applied to robust linear regression
This paper deals with the problem of finding the globally optimal subset of h
elements from a larger set of n elements in d space dimensions so as to
minimize a quadratic criterion, with a special emphasis on applications to
computing the Least Trimmed Squares Estimator (LTSE) for robust regression. The
computation of the LTSE is a challenging subset selection problem involving a
nonlinear program with continuous and binary variables, linked in a highly
nonlinear fashion. The selection of a globally optimal subset using the branch
and bound (BB) algorithm is limited to problems in very low dimension,
typically d < 5, as the complexity of the problem increases exponentially with
d. We introduce a bold pruning strategy in the BB algorithm that results in a
significant reduction in computing time, at the price of a negligible loss of
accuracy. The novelty of our algorithm is that the bounds at nodes of the BB
tree come from pseudo-convexifications derived using a linearization technique
with approximate bounds for the nonlinear terms. The approximate bounds are
computed by solving an auxiliary semidefinite optimization problem. We show
through a computational study that our algorithm performs well on a wide set
of the most difficult instances of the LTSE problem.
Comment: 12 pages, 3 figures, 2 tables
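As a point of reference for the combinatorial difficulty described above, the sketch below evaluates the LTS objective by brute-force enumeration of all h-subsets on a toy instance. The instance sizes and the enumeration are illustrative assumptions only; the paper's contribution is precisely the BB bounds that avoid this exponential search.

```python
# Toy brute-force computation of the Least Trimmed Squares objective:
# the best h-subset is found by exhaustive enumeration, feasible only
# for tiny n. Purely illustrative; not the paper's SOCP-based algorithm.
from itertools import combinations
import numpy as np

rng = np.random.default_rng(1)
n, d, h = 12, 2, 9                         # keep h of n points

X = rng.normal(size=(n, d))
y = X @ np.array([2.0, -1.0]) + 0.1 * rng.normal(size=n)
y[:3] += 10.0                              # three gross outliers

best = (np.inf, None)
for S in combinations(range(n), h):        # C(n, h) subsets: exponential
    idx = list(S)
    b = np.linalg.lstsq(X[idx], y[idx], rcond=None)[0]
    cost = np.sum((y[idx] - X[idx] @ b) ** 2)
    best = min(best, (cost, S), key=lambda t: t[0])

print("LTS cost:", best[0], "kept subset:", best[1])
```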
Corruption-Robust Algorithms with Uncertainty Weighting for Nonlinear Contextual Bandits and Markov Decision Processes
Despite the significant interest and progress in reinforcement learning (RL)
problems with adversarial corruption, current works are either confined to the
linear setting or lead to an undesired $\tilde{O}(\sqrt{T}\zeta)$ regret bound,
where $T$ is the number of rounds and $\zeta$ is the total amount of
corruption. In this paper, we consider the contextual bandit with general
function approximation and propose a computationally efficient algorithm to
achieve a regret of $\tilde{O}(\sqrt{T}+\zeta)$. The proposed algorithm relies
on the recently developed uncertainty-weighted least-squares regression from
linear contextual bandit \citep{he2022nearly} and a new weighted estimator of
uncertainty for the general function class. In contrast to the existing
analysis that heavily relies on the linear structure, we develop a novel
technique to control the sum of weighted uncertainty, thus establishing the
final regret bounds. We then generalize our algorithm to the episodic MDP
setting and first achieve an additive dependence on the corruption level
$\zeta$ in the scenario of general function approximation. Notably, our
algorithms achieve regret bounds that either nearly match the performance
lower bound or improve on existing methods for all corruption levels, in both
the known and unknown $\zeta$ cases.
Comment: We study the corruption-robust MDPs and contextual bandits with
general function approximation
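The following sketch illustrates the general idea of uncertainty weighting in the simplest linear case: samples whose elliptical-potential uncertainty is large are down-weighted before the least-squares update, which caps the damage any single corrupted round can do. The specific rule w = min(1, alpha/u) and all constants are simplified assumptions for illustration, not the weighting analyzed in the paper.

```python
# Minimal sketch of uncertainty-weighted least squares (linear case).
# Each sample is down-weighted when its uncertainty ||x||_{A^{-1}} is
# large; the rule and constants are illustrative simplifications.
import numpy as np

rng = np.random.default_rng(2)
d, T, lam, alpha = 5, 2000, 1.0, 1.0

theta = rng.normal(size=d)
X = rng.normal(size=(T, d))
y = X @ theta + 0.1 * rng.normal(size=T)
y[:50] += 20.0                             # corrupted rounds

A = lam * np.eye(d)                        # weighted Gram matrix
bvec = np.zeros(d)
for x, r in zip(X, y):
    u = np.sqrt(x @ np.linalg.solve(A, x))  # uncertainty ||x||_{A^{-1}}
    w = min(1.0, alpha / u)                 # down-weight uncertain samples
    A += w * np.outer(x, x)
    bvec += w * x * r
theta_hat = np.linalg.solve(A, bvec)

print("estimation error:", np.linalg.norm(theta_hat - theta))
```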
Global optimization for low-dimensional switching linear regression and bounded-error estimation
The paper provides global optimization algorithms for two particularly
difficult nonconvex problems raised by hybrid system identification: switching
linear regression and bounded-error estimation. While most works focus on local
optimization heuristics without global optimality guarantees or with guarantees
valid only under restrictive conditions, the proposed approach always yields a
solution with a certificate of global optimality. This approach relies on a
branch-and-bound strategy for which we devise lower bounds that can be
efficiently computed. In order to obtain scalable algorithms with respect to
the number of data points, we directly optimize the model parameters in a
continuous
optimization setting without involving integer variables. Numerical experiments
show that the proposed algorithms offer a higher accuracy than convex
relaxations with a reasonable computational burden for hybrid system
identification. In addition, we discuss how bounded-error estimation is related
to robust estimation in the presence of outliers and exact recovery under
sparse noise, for which we also obtain promising numerical results.
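For concreteness, the integer-free switching regression objective described above, in which each point is implicitly assigned to its best-fitting mode via a pointwise minimum, can be written and locally minimized as in the sketch below. The alternating refinement shown is only a local heuristic under illustrative assumptions; the paper's branch-and-bound certifies global optimality of this same continuous objective.

```python
# The switching regression cost below is the integer-free objective
# sum_i min_k (y_i - x_i' theta_k)^2. The alternating loop is a local
# heuristic for illustration, not the paper's global BB algorithm.
import numpy as np

rng = np.random.default_rng(3)
n, d, K = 300, 2, 2

true = [np.array([1.0, 2.0]), np.array([-2.0, 0.5])]
X = rng.normal(size=(n, d))
modes = rng.integers(K, size=n)            # hidden mode of each point
y = np.array([X[i] @ true[modes[i]] for i in range(n)])
y += 0.05 * rng.normal(size=n)

def cost(thetas):
    # integer-free switching objective: pointwise min over modes
    errs = np.stack([(y - X @ th) ** 2 for th in thetas])
    return errs.min(axis=0).sum()

thetas = [rng.normal(size=d) for _ in range(K)]
for _ in range(20):                        # alternate: assign, then refit
    errs = np.stack([(y - X @ th) ** 2 for th in thetas])
    assign = errs.argmin(axis=0)
    for k in range(K):
        if np.any(assign == k):
            thetas[k] = np.linalg.lstsq(
                X[assign == k], y[assign == k], rcond=None)[0]

print("final switching cost:", cost(thetas))
```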