A nodal domain theorem and a higher-order Cheeger inequality for the graph $p$-Laplacian
We consider the nonlinear graph $p$-Laplacian and the set of its eigenvalues and
associated eigenfunctions defined by a variational principle.
We prove a nodal domain theorem for the graph $p$-Laplacian for any $p \geq 1$.
While for $p > 1$ the bounds on the number of weak and strong nodal domains are
the same as for the linear graph Laplacian ($p = 2$), the behavior changes for
$p = 1$. We show that the bounds are tight for $p \geq 1$, as they are
attained by the eigenfunctions of the graph $p$-Laplacian on two graphs.
Finally, using the properties of the nodal domains, we prove a higher-order
Cheeger inequality for the graph $p$-Laplacian for $p > 1$. If the eigenfunction
associated to the $k$-th variational eigenvalue of the graph $p$-Laplacian has
exactly $k$ strong nodal domains, then the higher-order Cheeger inequality
becomes tight as $p \to 1$.
Variants of RMSProp and Adagrad with Logarithmic Regret Bounds
Adaptive gradient methods have recently become very popular, in particular as
they have been shown to be useful in the training of deep neural networks. In
this paper we analyze RMSProp, originally proposed for the training of
deep neural networks, in the context of online convex optimization and show
$\sqrt{T}$-type regret bounds. Moreover, we propose two variants SC-Adagrad and
SC-RMSProp for which we show logarithmic regret bounds for strongly convex
functions. Finally, we demonstrate in the experiments that these new variants
outperform other adaptive gradient techniques or stochastic gradient descent in
the optimization of strongly convex functions as well as in training of deep
neural networks.
Comment: ICML 2017, 16 pages, 23 figures
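The abstract does not spell out the update rule, so the following is only an illustrative sketch. It assumes an Adagrad-style update in which each gradient coordinate is divided by the accumulated squared gradients plus a damping term, without the square root used in plain Adagrad. The name `sc_adagrad_sketch`, the constant damping `delta`, the step size `alpha`, and the toy strongly convex quadratic are hypothetical choices for this example, not the authors' implementation.

```python
def sc_adagrad_sketch(grad, theta, steps=200, alpha=1.0, delta=1e-2):
    """Adagrad-style iteration without the square root (assumed form)."""
    theta = list(theta)
    v = [0.0] * len(theta)  # running sum of squared gradients, per coordinate
    for _ in range(steps):
        g = grad(theta)
        for i, gi in enumerate(g):
            v[i] += gi * gi
            # Assumption: divide by (v + delta), not sqrt(v) + delta.
            # delta is kept constant here purely for simplicity.
            theta[i] -= alpha * gi / (v[i] + delta)
    return theta


def quad_grad(theta):
    """Gradient of the toy objective f(x) = 0.5 * sum(c_i * (x_i - 1)^2)."""
    coeffs = (1.0, 2.0)  # hypothetical curvature per coordinate
    return [c * (x - 1.0) for c, x in zip(coeffs, theta)]


if __name__ == "__main__":
    # The minimizer of the toy objective is (1, 1).
    print(sc_adagrad_sketch(quad_grad, [0.0, 0.0]))
```

On this toy problem the iterates approach the minimizer at (1, 1); the example says nothing about the regret guarantees themselves, which depend on conditions stated in the paper.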
Matrix factorization with Binary Components
Motivated by an application in computational biology, we consider low-rank
matrix factorization with $\{0,1\}$-constraints on one of the factors and
optionally convex constraints on the second one. In addition to the
non-convexity shared with other matrix factorization schemes, our problem is
further complicated by a combinatorial constraint set of size $2^{m \cdot r}$,
where $m$ is the dimension of the data points and $r$ the rank of the
factorization. Despite apparent intractability, we provide - in the line of
recent work on non-negative matrix factorization by Arora et al. (2012) - an
algorithm that provably recovers the underlying factorization in the exact case
with $O(m r 2^r + mnr + r^2 n)$ operations for $n$ datapoints. To obtain this
result, we use theory around the Littlewood-Offord lemma from combinatorics.
Comment: appeared in NIPS 201
Regularization-free estimation in trace regression with symmetric positive semidefinite matrices
Over the past few years, trace regression models have received considerable
attention in the context of matrix completion, quantum state tomography, and
compressed sensing. Estimation of the underlying matrix via
regularization-based approaches promoting low-rankedness, notably nuclear norm
regularization, has enjoyed great popularity. In the present paper, we argue
that such regularization may no longer be necessary if the underlying matrix is
symmetric positive semidefinite (spd) and the design satisfies certain
conditions. In this situation, simple least squares estimation subject to an
spd constraint may perform as well as regularization-based approaches
with a proper choice of the regularization parameter, which entails knowledge
of the noise level and/or tuning. By contrast, constrained least squares
estimation comes without any tuning parameter and may hence be preferred due to
its simplicity.
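For orientation, the two estimators contrasted in this abstract can be written as follows. The notation is assumed for illustration and is not taken from the abstract: $X_i$ are the design matrices, $\Sigma^*$ is the unknown target, $\mathbb{S}_+^m$ is the cone of symmetric positive semidefinite $m \times m$ matrices, and $\lambda$ is the regularization parameter.

```latex
% Trace regression model (assumed notation)
y_i = \operatorname{tr}(X_i^\top \Sigma^*) + \varepsilon_i, \qquad i = 1, \dots, n.

% Constrained least squares: no tuning parameter
\widehat{\Sigma} \in \operatorname*{argmin}_{\Sigma \in \mathbb{S}_+^m}
  \frac{1}{2n} \sum_{i=1}^{n} \bigl( y_i - \operatorname{tr}(X_i^\top \Sigma) \bigr)^2.

% Nuclear-norm regularized alternative: requires choosing \lambda
\widehat{\Sigma}_\lambda \in \operatorname*{argmin}_{\Sigma}
  \frac{1}{2n} \sum_{i=1}^{n} \bigl( y_i - \operatorname{tr}(X_i^\top \Sigma) \bigr)^2
  + \lambda \, \lVert \Sigma \rVert_* .
```

The scaling $1/(2n)$ is a convention chosen here; the paper may normalize differently.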
Disentangling Adversarial Robustness and Generalization
Obtaining deep networks that are robust against adversarial examples and
generalize well is an open problem. A recent hypothesis even states that both
robust and accurate models are impossible, i.e., adversarial robustness and
generalization are conflicting goals. In an effort to clarify the relationship
between robustness and generalization, we assume an underlying, low-dimensional
data manifold and show that: 1. regular adversarial examples leave the
manifold; 2. adversarial examples constrained to the manifold, i.e.,
on-manifold adversarial examples, exist; 3. on-manifold adversarial examples
are generalization errors, and on-manifold adversarial training boosts
generalization; 4. regular robustness and generalization are not necessarily
contradicting goals. These assumptions imply that both robust and accurate
models are possible. However, different models (architectures, training
strategies etc.) can exhibit different robustness and generalization
characteristics. To confirm our claims, we present extensive experiments on
synthetic data (with known manifold) as well as on EMNIST, Fashion-MNIST and
CelebA.
Comment: Conference on Computer Vision and Pattern Recognition 201
Learning Using Privileged Information: SVM+ and Weighted SVM
Prior knowledge can be used to improve predictive performance of learning
algorithms or reduce the amount of data required for training. The same goal is
pursued within the learning using privileged information paradigm which was
recently introduced by Vapnik et al. and is aimed at utilizing additional
information available only at training time -- a framework implemented by SVM+.
We relate the privileged information to importance weighting and show that the
prior knowledge expressible with privileged features can also be encoded by
weights associated with every training example. We show that a weighted SVM can
always replicate an SVM+ solution, while the converse is not true, and we
construct a counterexample highlighting the limitations of SVM+. Finally, we
touch on the problem of choosing weights for weighted SVMs when privileged
features are not available.
Comment: 18 pages, 8 figures; integrated reviewer comments, improved typesetting