Linear system identification using stable spline kernels and PLQ penalties
The classical approach to linear system identification is given by parametric
Prediction Error Methods (PEM). In this context, model complexity is often
unknown, so that a model order selection step is needed to suitably trade off
bias and variance. Recently, a different approach to linear system
identification has been introduced, where model order determination is avoided
by using a regularized least squares framework. In particular, the penalty term
on the impulse response is defined by so-called stable spline kernels. They
embed information on regularity and BIBO stability, and depend on a small
number of parameters which can be estimated from data. In this paper, we
provide new nonsmooth formulations of the stable spline estimator. In
particular, we consider linear system identification problems in a very broad
context, where regularization functionals and data misfits can come from a rich
set of piecewise linear quadratic functions. Moreover, our analysis includes
polyhedral inequality constraints on the unknown impulse response. For any
formulation in this class, we show that interior point methods can be used to
solve the system identification problem, with complexity O(n^3) + O(mn^2) in each
iteration, where n and m are the number of impulse response coefficients and
measurements, respectively. The usefulness of the framework is illustrated via
a numerical experiment where output measurements are contaminated by outliers.
Comment: 8 pages, 2 figures
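For concreteness, the smooth special case of the above framework, an l2 misfit with no constraints, admits a closed-form solution, unlike the general PLQ formulations, which require the interior point solver. Below is a minimal numpy sketch, assuming the first-order stable spline kernel K_ij = alpha^max(i,j); the function name and the hyperparameters alpha, gamma are illustrative choices, not values from the paper.

import numpy as np

def stable_spline_estimate(u, y, n=50, alpha=0.9, gamma=1.0):
    # Classical l2 stable spline estimator (the smooth special case of the
    # PLQ framework). Kernel and hyperparameters are illustrative choices.
    m = len(y)
    # Toeplitz regression matrix Phi mapping the impulse response g to outputs
    Phi = np.zeros((m, n))
    for i in range(m):
        for j in range(min(i + 1, n)):
            Phi[i, j] = u[i - j]
    idx = np.arange(1, n + 1)
    K = alpha ** np.maximum.outer(idx, idx)  # first-order stable spline kernel
    # ghat = argmin_g ||y - Phi g||^2 + gamma * g' inv(K) g  (closed form)
    ghat = K @ Phi.T @ np.linalg.solve(Phi @ K @ Phi.T + gamma * np.eye(m), y)
    return ghat

The representer form K Phi' (Phi K Phi' + gamma I)^{-1} y used here is what the nonsmooth PLQ misfits and polyhedral constraints take away; the paper's point is that an interior point method recovers those cases at a comparable O(n^3) + O(mn^2) per-iteration cost.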
An asymptotically superlinearly convergent semismooth Newton augmented Lagrangian method for Linear Programming
Powerful commercial solvers based on interior-point methods (IPMs), such as
Gurobi and Mosek, have been hugely successful in solving large-scale linear
programming (LP) problems. The high efficiency of these solvers depends
critically on the sparsity of the problem data and advanced matrix
factorization techniques. For a large-scale LP problem whose data matrix is
dense (possibly structured), or whose corresponding normal matrix has a dense
Cholesky factor (even with re-ordering), these solvers may require
excessive computational cost and/or extremely heavy memory usage in each
interior-point iteration. Unfortunately, the natural remedy, i.e., IPM solvers
built on iterative linear-system methods, is not practically viable: although
such solvers avoid explicitly forming the coefficient matrix and its
factorization, the large-scale normal equations arising in each interior-point
iteration are inherently extremely ill-conditioned. To provide a better
alternative for solving large-scale LPs whose data are dense or whose normal
equations require an expensive factorization, we propose a semismooth Newton
based inexact proximal augmented Lagrangian ({\sc Snipal}) method. Different
from classical IPMs, in each iteration of {\sc Snipal}, iterative methods can
efficiently be used to solve simpler yet better conditioned semismooth Newton
linear systems. Moreover, {\sc Snipal} not only enjoys fast asymptotic
superlinear convergence but is also proven to possess a finite termination
property. Numerical comparisons with Gurobi have demonstrated the encouraging
potential of {\sc Snipal} for handling large-scale LP problems where the
constraint matrix has a dense representation or a dense factorization even
with an appropriate re-ordering.
Comment: Due to the limitation "The abstract field cannot be longer than 1,920 characters", the abstract appearing here is slightly shorter than that in the PDF file.
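To make the contrast with IPMs concrete, the sketch below implements a heavily simplified relative of {\sc Snipal}: an augmented Lagrangian method on the dual of min c'x s.t. Ax = b, x >= 0, with each subproblem solved by a regularized semismooth Newton method. This is an illustrative reconstruction, not the authors' algorithm; the function name, step rules, iteration counts, and the tiny diagonal shift are all ad hoc choices. The point to notice is the Newton matrix sigma * A diag(d) A' with a 0/1 vector d, which is typically far better conditioned than an IPM normal equation and can therefore be solved iteratively (e.g., by CG) at scale.

import numpy as np

def alm_ssn_lp(A, b, c, sigma=10.0, outer=100, inner=30, tol=1e-9):
    # Simplified sketch: augmented Lagrangian on the LP dual
    #   max b'y  s.t.  A'y <= c   (the multiplier x recovers the primal solution),
    # with inner subproblems solved by a regularized semismooth Newton method.
    m, n = A.shape
    x, y = np.zeros(n), np.zeros(m)
    for _ in range(outer):
        for _ in range(inner):
            z = x + sigma * (A.T @ y - c)
            grad = A @ np.maximum(z, 0.0) - b       # gradient of the AL in y
            if np.linalg.norm(grad) <= tol:
                break
            d = (z > 0.0).astype(float)             # 0/1 generalized Jacobian of max(0, .)
            H = sigma * (A * d) @ A.T               # m x m semismooth Newton matrix
            H[np.diag_indices(m)] += 1e-10          # tiny shift in case H is singular
            y -= np.linalg.solve(H, grad)           # replace with CG at scale
        x = np.maximum(x + sigma * (A.T @ y - c), 0.0)  # multiplier update
        if np.linalg.norm(A @ x - b) <= tol and abs(c @ x - b @ y) <= tol:
            break
    return x, y

# Tiny example: min x1 + x2  s.t.  x1 + 2*x2 = 1, x >= 0  (optimum x = (0, 0.5))
A = np.array([[1.0, 2.0]]); b = np.array([1.0]); c = np.array([1.0, 1.0])
x, y = alm_ssn_lp(A, b, c)

Unlike an IPM, nothing here requires factorizing A A' or a dense data matrix; each inner step only needs products with A and A', which is exactly the regime the abstract targets.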
Adversarial Examples Might be Avoidable: The Role of Data Concentration in Adversarial Robustness
The susceptibility of modern machine learning classifiers to adversarial
examples has motivated theoretical results suggesting that these might be
unavoidable. However, these results can be too general to be applicable to
natural data distributions. Indeed, humans are quite robust for tasks involving
vision. This apparent conflict motivates a deeper dive into the question: Are
adversarial examples truly unavoidable? In this work, we theoretically
demonstrate that a key property of the data distribution -- concentration on
small-volume subsets of the input space -- determines whether a robust
classifier exists. We further demonstrate that, for a data distribution
concentrated on a union of low-dimensional linear subspaces, exploiting data
structure naturally leads to classifiers that enjoy good robustness guarantees,
improving upon methods for provable certification in certain regimes.
Comment: Accepted to Neural Information Processing Systems (NeurIPS) 2023
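A toy version of the union-of-subspaces setting makes the mechanism tangible. If each class concentrates near a low-dimensional linear subspace, a nearest-subspace classifier comes with a simple l2 certificate: point-to-set distances are 1-Lipschitz, so the predicted label provably cannot change under any perturbation smaller than half the gap between the two smallest subspace distances. The sketch below is this generic folklore certificate for illustration, not the paper's construction or its guarantees.

import numpy as np

def nearest_subspace_classifier(x, bases):
    # bases[k]: orthonormal basis (D x d_k) of the subspace for class k
    dists = np.array([np.linalg.norm(x - U @ (U.T @ x)) for U in bases])
    k1, k2 = np.argsort(dists)[:2]
    # Distances to sets are 1-Lipschitz in l2, so the label is provably
    # unchanged for any perturbation with norm smaller than this radius.
    radius = 0.5 * (dists[k2] - dists[k1])
    return k1, radius

# Two 1-D subspaces in R^2: the x-axis and the y-axis
bases = [np.array([[1.0], [0.0]]), np.array([[0.0], [1.0]])]
label, radius = nearest_subspace_classifier(np.array([1.0, 0.1]), bases)
# label == 0 (x-axis), certified l2 radius == 0.45

The certified radius grows with the separation between the subspaces, illustrating the abstract's claim that concentration of the data on small-volume (here, low-dimensional) subsets is what makes robust classification possible.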