9 research outputs found
No-substitution k-means Clustering with Adversarial Order
We investigate $k$-means clustering in the online no-substitution setting
when the input arrives in \emph{arbitrary} order. In this setting, points
arrive one after another, and the algorithm is required to instantly decide
whether to take the current point as a center before observing the next point.
Decisions are irrevocable. The goal is to minimize both the number of centers
and the $k$-means cost. Previous works in this setting assume that the input's
order is random, or that the input's aspect ratio is bounded. It is known that
if the order is arbitrary and there is no assumption on the input, then any
algorithm must take all points as centers. Moreover, assuming a bounded aspect
ratio is too restrictive -- it does not include natural input generated from
mixture models.
We introduce a new complexity measure that quantifies the difficulty of
clustering a dataset arriving in arbitrary order. We design a new randomized
algorithm and prove that, applied to data with complexity $d$, it takes
$O(d \log(n) \, k \log(k))$ centers and is an $O(k^3)$-approximation. We also
prove that if the data is sampled from a ``natural" distribution, such as a
mixture of Gaussians, then the new complexity measure is equal to
$O(k^2 \log(n))$. This implies that for data generated from those
distributions, our new algorithm takes only $\mathrm{poly}(k \log(n))$ centers
and is a $\mathrm{poly}(k)$-approximation. In terms of negative results, we
prove that the number of centers needed to achieve an $\alpha$-approximation
is at least $\Omega\left(\frac{n}{\alpha \log(n)}\right)$.
Comment: accepted to ALT 2021
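To make the no-substitution setting concrete, the following minimal Python sketch simulates the protocol with a hypothetical distance-threshold rule for taking centers; it illustrates the setting only, not the paper's algorithm, and the threshold, data, and ordering are assumptions.

```python
# Illustration of the no-substitution protocol (not the paper's algorithm):
# points arrive in a fixed, non-random order, and the learner must decide
# irrevocably, before seeing the next point, whether to take each point as
# a center. The distance-threshold rule below is a hypothetical stand-in.
import numpy as np

def no_substitution_stream(points, threshold):
    centers = []
    for x in points:
        # Irrevocable decision: take x iff it is far from all current centers.
        if not centers or min(np.linalg.norm(x - c) for c in centers) > threshold:
            centers.append(x)
    return centers

rng = np.random.default_rng(0)
# A mixture of two Gaussians, presented in a sorted (adversarial-style) order.
data = np.concatenate([rng.normal(0, 1, (50, 2)), rng.normal(8, 1, (50, 2))])
data = data[np.argsort(data[:, 0])]
centers = no_substitution_stream(data, threshold=3.0)
print(f"took {len(centers)} centers out of {len(data)} points")
```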
What relations are reliably embeddable in Euclidean space?
We consider the problem of embedding a relation, represented as a directed
graph, into Euclidean space. For three types of embeddings motivated by the
recent literature on knowledge graphs, we obtain characterizations of which
relations they are able to capture, as well as bounds on the minimal
dimensionality and precision needed.
Comment: submitted to COLT 2019
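As an illustration of what "capturing" a relation by an embedding can mean, here is a hedged sketch of one translation-style rule familiar from the knowledge-graph literature (the paper characterizes three embedding types, which need not coincide with this one); the vectors and margin are toy assumptions.

```python
# Hedged sketch: a directed relation is "captured" by a translation-style
# embedding when head + relation vector lands within a margin of the tail
# exactly for the pairs in the relation (an illustrative rule, not
# necessarily one of the paper's three embedding types).
import numpy as np

def captures(edges, entity_vecs, rel_vec, margin=0.5):
    n = len(entity_vecs)
    for a in range(n):
        for b in range(n):
            dist = np.linalg.norm(entity_vecs[a] + rel_vec - entity_vecs[b])
            if ((a, b) in edges) != (dist <= margin):
                return False
    return True

# A toy matching {0 -> 2, 1 -> 3} captured by translating one unit right.
vecs = np.array([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]])
print(captures({(0, 2), (1, 3)}, vecs, rel_vec=np.array([1.0, 0.0])))  # True
```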
Sample Complexity of Adversarially Robust Linear Classification on Separated Data
We consider the sample complexity of learning with adversarial robustness.
Most prior theoretical results for this problem have considered a setting where
different classes in the data are close together or overlapping. Motivated by
some real applications, we consider, in contrast, the well-separated case where
there exists a classifier with perfect accuracy and robustness, and show that
the sample complexity narrates an entirely different story. Specifically, for
linear classifiers, we show a large class of well-separated distributions where
the expected robust loss of any algorithm is at least $\Omega(d/n)$, whereas
the max margin algorithm has expected standard loss $O(1/n)$.
This shows a gap in the standard and robust losses that cannot be obtained via
prior techniques. Additionally, we present an algorithm that, given an instance
where the robustness radius is much smaller than the gap between the classes,
gives a solution with expected robust loss $O(1/n)$. This shows that for very
well-separated data, convergence rates of $O(1/n)$ are achievable, which is
not the case otherwise. Our results apply to robustness measured in any
$\ell_p$ norm with $p > 1$ (including $p = \infty$).
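For concreteness, both losses can be evaluated directly for a linear classifier using the standard dual-norm (Hölder) fact: an $\ell_p$ perturbation of radius $r$ can reduce the margin $y(w \cdot x + b)$ by at most $r\|w\|_q$, where $1/p + 1/q = 1$. The sketch below uses this fact; the data, radius, and classifier are illustrative assumptions.

```python
# Standard vs. robust 0-1 loss of a linear classifier sign(w.x + b).
# A point is robustly correct under l_p perturbations of radius r iff
# y*(w.x + b) > r*||w||_q, with q the dual exponent (1/p + 1/q = 1).
# Data, radius, and classifier below are illustrative assumptions.
import numpy as np

def standard_and_robust_loss(w, b, X, y, r, p):
    q = 1.0 if np.isinf(p) else p / (p - 1.0)  # dual exponent, needs p > 1
    margins = y * (X @ w + b)
    return np.mean(margins <= 0), np.mean(margins <= r * np.linalg.norm(w, ord=q))

rng = np.random.default_rng(1)
# Two well-separated classes: gap of 2 along the first coordinate.
X = np.vstack([rng.normal(0, 0.1, (100, 5)) + [1, 0, 0, 0, 0],
               rng.normal(0, 0.1, (100, 5)) - [1, 0, 0, 0, 0]])
y = np.array([1] * 100 + [-1] * 100)
w, b = np.array([1.0, 0.0, 0.0, 0.0, 0.0]), 0.0
print(standard_and_robust_loss(w, b, X, y, r=0.5, p=np.inf))
```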
Consistent Non-Parametric Methods for Maximizing Robustness
Learning classifiers that are robust to adversarial examples has received a
great deal of recent attention. A major drawback of the standard robust
learning framework is that there is an artificial robustness radius $r$ that applies
to all inputs. This ignores the fact that data may be highly heterogeneous, in
which case it is plausible that robustness regions should be larger in some
regions of data, and smaller in others. In this paper, we address this
limitation by proposing a new limit classifier, called the neighborhood optimal
classifier, that extends the Bayes optimal classifier outside its support by
using the label of the closest in-support point. We then argue that this
classifier maximizes the size of its robustness regions subject to the
constraint of having accuracy equal to the Bayes optimal. We then present
sufficient conditions under which general non-parametric methods that can be
represented as weight functions converge towards this limit, and show that both
nearest neighbors and kernel classifiers satisfy them under certain conditions.
Comment: accepted to NeurIPS 2021
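Since the neighborhood optimal classifier copies the label of the closest in-support point, a dense labeled sample of the support turns prediction into a 1-nearest-neighbor lookup. The sketch below is a toy rendering of that limit object; the support, labels, and queries are assumptions.

```python
# Toy rendering of the neighborhood optimal classifier: given (a dense
# sample of) the support with Bayes optimal labels, predict by the label
# of the closest in-support point; off the support this extends the Bayes
# optimal classifier, enlarging robustness regions as the abstract describes.
import numpy as np

def neighborhood_optimal_predict(query, support_pts, bayes_labels):
    return bayes_labels[np.argmin(np.linalg.norm(support_pts - query, axis=1))]

# Support: two segments on the line with opposite Bayes labels.
support = np.concatenate([np.linspace(0, 1, 50), np.linspace(3, 4, 50)])
support = np.stack([support, np.zeros(100)], axis=1)
labels = np.array([0] * 50 + [1] * 50)
print(neighborhood_optimal_predict(np.array([1.9, 0.0]), support, labels))  # 0
print(neighborhood_optimal_predict(np.array([2.1, 0.0]), support, labels))  # 1
```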
Theoretical Foundations of Trustworthy Machine Learning
Machine learning models have become a ubiquitous part of society, and it has consequently become of paramount importance to understand how to design safe and reliable models. This dissertation attempts to take steps in this direction by considering two specific problems in reliable machine learning: adversarial examples, which are small test-time perturbations to the input designed to cause misclassification, and data-copying, which occurs when a generative model simply memorizes its training data (giving poor generalization and serious security risks).
Structure from Voltage
Effective resistance (ER) is an attractive way to interrogate the structure
of graphs. It is an alternative to computing the eigenvectors of the graph
Laplacian. Graph Laplacians are used to find low-dimensional structures in
high-dimensional data. Here too, ER-based analysis has advantages over
eigenvector-based methods. Unfortunately, von Luxburg et al. (2010) show that,
when vertices
correspond to a sample from a distribution over a metric space, the limit of
the ER between distant points converges to a trivial quantity that holds no
information about the structure of the graph. We show that by scaling the
resistances in a graph with $n$ vertices by $n^2$, one gets a meaningful limit
of the voltages and of the effective resistances. We also show that by adding
a "ground" node to a metric graph, one gets a simple and natural way to
compute all of the distances from a chosen point to all other points.
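For reference, here is a minimal sketch of the textbook ER computation the abstract builds on: $R(u, v) = (e_u - e_v)^\top L^{+} (e_u - e_v)$, where $L^{+}$ is the pseudoinverse of the graph Laplacian. The $n^2$ rescaling mirrors the scaling described above (as reconstructed here); the toy graph is an assumption.

```python
# Effective resistance via the Laplacian pseudoinverse:
# R(u, v) = (e_u - e_v)^T L^+ (e_u - e_v). Toy graph for illustration.
import numpy as np

def effective_resistance(W, u, v):
    L = np.diag(W.sum(axis=1)) - W            # graph Laplacian
    e = np.zeros(len(W)); e[u], e[v] = 1.0, -1.0
    return e @ np.linalg.pinv(L) @ e

# Triangle of unit resistors: ER between any two vertices is 2/3.
W = np.array([[0.0, 1.0, 1.0],
              [1.0, 0.0, 1.0],
              [1.0, 1.0, 0.0]])
print(effective_resistance(W, 0, 1))           # ~0.6667
n = len(W)
print(n ** 2 * effective_resistance(W, 0, 1))  # rescaled, per the abstract
```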
Effective resistance in metric spaces
Effective resistance (ER) is an attractive way to interrogate the structure
of graphs. It is an alternative to computing the eigenvectors of the graph
Laplacian.
One attractive application of ER is to point clouds, i.e. graphs whose
vertices correspond to IID samples from a distribution over a metric space.
Unfortunately, it was shown that the ER between any two points converges to a
trivial quantity that holds no information about the graph's structure as the
size of the sample increases to infinity.
In this study, we show that this trivial solution can be circumvented by
considering a region-based ER between pairs of small regions rather than pairs
of points and by scaling the edge weights appropriately with respect to the
underlying density in each region. By keeping the regions fixed, we show
analytically that the region-based ER converges to a non-trivial limit as the
number of points increases to infinity, namely the ER on the underlying metric
space. We support our theoretical findings with numerical experiments.
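A hedged sketch of the region-based idea described above: contract the sample points falling in each of two fixed small regions into one terminal node apiece, then compute the ER between the terminals. The kernel weights and region choices are illustrative, and the density-based weight scaling the abstract mentions is omitted for brevity.

```python
# Region-based ER sketch: merge each of two fixed regions of a point-cloud
# graph into a single terminal node, then compute ER between terminals.
# Kernel bandwidth and regions are illustrative; the paper's density-based
# weight scaling is omitted here.
import numpy as np

def contract_two(W, A, B):
    """Merge vertex sets A and B into two terminal nodes (weights summed)."""
    keep = [i for i in range(len(W)) if i not in set(A) | set(B)]
    n = len(keep)
    Wc = np.zeros((n + 2, n + 2))
    Wc[:n, :n] = W[np.ix_(keep, keep)]
    Wc[:n, n] = Wc[n, :n] = W[A].sum(axis=0)[keep]
    Wc[:n, n + 1] = Wc[n + 1, :n] = W[B].sum(axis=0)[keep]
    Wc[n, n + 1] = Wc[n + 1, n] = W[np.ix_(A, B)].sum()
    return Wc, n, n + 1                        # terminals are the last two nodes

def effective_resistance(W, u, v):
    L = np.diag(W.sum(axis=1)) - W
    e = np.zeros(len(W)); e[u], e[v] = 1.0, -1.0
    return e @ np.linalg.pinv(L) @ e

rng = np.random.default_rng(2)
pts = rng.uniform(0, 1, (300, 2))              # IID sample from a metric space
D = np.linalg.norm(pts[:, None] - pts[None, :], axis=2)
W = np.exp(-(D / 0.15) ** 2)                   # illustrative Gaussian kernel
np.fill_diagonal(W, 0.0)
# Two fixed small regions: the 10 sample points nearest each anchor.
A = np.argsort(np.linalg.norm(pts - [0.1, 0.1], axis=1))[:10].tolist()
B = np.argsort(np.linalg.norm(pts - [0.9, 0.9], axis=1))[:10].tolist()
Wc, u, v = contract_two(W, A, B)
print(effective_resistance(Wc, u, v))
```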