Adversarial Robustness of Supervised Sparse Coding
Several recent results provide theoretical insights into the phenomenon of
adversarial examples. Existing results, however, are often limited due to a gap
between the simplicity of the models studied and the complexity of those
deployed in practice. In this work, we strike a better balance by considering a
model that involves learning a representation while at the same time giving a
precise generalization bound and a robustness certificate. We focus on the
hypothesis class obtained by coupling a sparsity-promoting encoder
with a linear classifier, and show an interesting interplay between the
expressivity and stability of the (supervised) representation map and a notion
of margin in the feature space. We bound the robust risk (to $\ell_2$-bounded
perturbations) of hypotheses parameterized by dictionaries that achieve a mild
encoder gap on training data. Furthermore, we provide a robustness certificate
for end-to-end classification. We demonstrate the applicability of our analysis
by computing certified accuracy on real data, and compare with other
alternatives for certified robustness.
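
The hypothesis class described in this abstract (a sparsity-promoting encoder followed by a linear classifier) can be illustrated with a minimal sketch. The abstract does not say which encoder or solver is used, so the Lasso objective, the ISTA iteration, and the names `ista_encode`, `classify`, `D`, `w`, `b`, and `lam` below are all assumptions made for illustration, not the authors' implementation.

```python
import numpy as np


def soft_threshold(v, t):
    """Elementwise soft-thresholding, the proximal operator of t * ||.||_1."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)


def ista_encode(x, D, lam, n_iter=200):
    """Approximate argmin_z 0.5 * ||x - D z||^2 + lam * ||z||_1 with ISTA.

    Assumption: the sparsity-promoting encoder is a Lasso problem; the
    abstract itself does not specify the encoder.
    """
    # Step size 1/L, where L is the squared spectral norm of D
    # (the Lipschitz constant of the gradient of the quadratic term).
    step = 1.0 / np.linalg.norm(D, 2) ** 2
    z = np.zeros(D.shape[1])
    for _ in range(n_iter):
        grad = D.T @ (D @ z - x)
        z = soft_threshold(z - step * grad, step * lam)
    return z


def classify(x, D, w, b, lam):
    """Binary prediction in {-1, +1}: linear classifier on the sparse code."""
    z = ista_encode(x, D, lam)
    return 1 if w @ z + b >= 0 else -1
```

With an identity dictionary the encoder reduces to a single soft-thresholding of the input, which gives a quick sanity check of the sketch.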
Reframing Neural Networks: Deep Structure in Overcomplete Representations
In comparison to classical shallow representation learning techniques, deep
neural networks have achieved superior performance in nearly every application
benchmark. But despite their clear empirical advantages, it is still not well
understood what makes them so effective. To approach this question, we
introduce deep frame approximation, a unifying framework for representation
learning with structured overcomplete frames. While exact inference requires
iterative optimization, it may be approximated by the operations of a
feed-forward deep neural network. We then indirectly analyze how model capacity
relates to the frame structure induced by architectural hyperparameters such as
depth, width, and skip connections. We quantify these structural differences
with the deep frame potential, a data-independent measure of coherence linked
to representation uniqueness and stability. As a criterion for model selection,
we show correlation with generalization error on a variety of common deep
network architectures such as ResNets and DenseNets. We also demonstrate how
recurrent networks implementing iterative optimization algorithms achieve
performance comparable to their feed-forward approximations. This connection to
the established theory of overcomplete representations suggests promising new
directions for principled deep network architecture design with less reliance
on ad-hoc engineering.
Comment: arXiv admin note: substantial text overlap with arXiv:2003.1386
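
The abstract describes the deep frame potential as a data-independent coherence measure but does not define it. As a rough illustration of the kind of quantity involved, the sketch below computes two classical, related measures for a single overcomplete frame: the frame potential (sum of squared inner products between normalized frame vectors) and the mutual coherence. These are stand-ins, not the paper's deep frame potential, and the function names are hypothetical. The code assumes every column of `D` is nonzero.

```python
import numpy as np


def normalize_columns(D):
    """Scale each column of D to unit Euclidean norm (columns assumed nonzero)."""
    return D / np.linalg.norm(D, axis=0, keepdims=True)


def frame_potential(D):
    """Classical frame potential: sum over i, j of <d_i, d_j>^2.

    Equals the squared Frobenius norm of the Gram matrix of the
    normalized frame; lower values indicate less correlated vectors.
    """
    Dn = normalize_columns(D)
    G = Dn.T @ Dn
    return float(np.sum(G ** 2))


def mutual_coherence(D):
    """Largest absolute inner product between two distinct normalized columns."""
    Dn = normalize_columns(D)
    G = np.abs(Dn.T @ Dn)
    np.fill_diagonal(G, 0.0)
    return float(G.max())
```

Both functions depend only on the frame itself, not on any data, which matches the "data-independent" property the abstract highlights. For k unit-norm vectors in d dimensions with k >= d, the classical frame potential is bounded below by k^2 / d, attained by tight frames.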