Noise-adaptive Margin-based Active Learning and Lower Bounds under Tsybakov Noise Condition
We present a simple noise-robust margin-based active learning algorithm for
finding homogeneous (i.e., passing through the origin) linear separators, and
analyze its error convergence when labels are corrupted by noise. We show that
when the noise satisfies the Tsybakov low-noise condition (Mammen, Tsybakov,
and others 1999; Tsybakov 2004), the algorithm adapts to the unknown noise
level and achieves the optimal statistical rate up to poly-logarithmic
factors. We also derive lower bounds for margin-based active learning
algorithms under the Tsybakov noise condition (TNC) in the membership query
synthesis scenario (Angluin 1988). Our result implies lower bounds for the
stream-based selective sampling scenario (Cohn 1990) under TNC for some fairly
simple data distributions. Quite surprisingly, we show that the sample
complexity cannot be improved even if the underlying data distribution is as
simple as the uniform distribution on the unit ball. Our proof involves the
construction of a well-separated hypothesis set on the d-dimensional unit
ball, along with carefully designed label distributions satisfying the
Tsybakov noise condition. Our analysis might provide
insights for other forms of lower bounds as well.
Comment: 16 pages, 2 figures. An abridged version to appear in the Thirtieth
AAAI Conference on Artificial Intelligence (AAAI), held in Phoenix, AZ, USA
in 2016.
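The abstract does not spell out the algorithm, but the generic margin-based
active learning template it builds on can be sketched as follows. This is a
minimal illustration under stated assumptions, not the paper's noise-adaptive
procedure: the `label_oracle` callback, the geometric band-shrinking schedule,
and the least-squares refit are all placeholders chosen for the sketch.

```python
import numpy as np

def margin_based_active_learning(X, label_oracle, rounds=10,
                                 n_queries=50, initial_band=1.0, shrink=0.5):
    """Sketch of margin-based active learning for a homogeneous
    (through-the-origin) linear separator. label_oracle(x) should return
    a (possibly noisy) label in {-1, +1}."""
    d = X.shape[1]
    w = np.random.randn(d)
    w /= np.linalg.norm(w)
    band = initial_band
    for _ in range(rounds):
        # Query labels only for points near the current decision boundary.
        margins = np.abs(X @ w)
        near = np.where(margins <= band)[0]
        if len(near) == 0:
            break
        picked = np.random.choice(near, size=min(n_queries, len(near)),
                                  replace=False)
        y = np.array([label_oracle(X[i]) for i in picked])
        # Refit on the queried points (plain least squares as a stand-in
        # for the paper's ERM step), keeping the separator homogeneous.
        w_new, *_ = np.linalg.lstsq(X[picked], y, rcond=None)
        norm = np.linalg.norm(w_new)
        if norm > 0:
            w = w_new / norm
        band *= shrink  # shrink the sampling band each round
    return w
```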
Online and Differentially-Private Tensor Decomposition
In this paper, we resolve many of the key algorithmic questions regarding the
robustness, memory efficiency, and differential privacy of tensor
decomposition. We propose simple variants of the tensor power method that
enjoy these strong properties. We present the first guarantees for the online
tensor power method, which has a linear memory requirement. Moreover, we
present a noise-calibrated tensor power method with efficient privacy
guarantees. At the heart of all these guarantees lies a careful perturbation
analysis derived in this paper, which improves upon the existing results
significantly.
Comment: 19 pages, 9 figures. To appear at the 30th Annual Conference on
Advances in Neural Information Processing Systems (NIPS 2016), to be held in
Barcelona, Spain. Fix small typos in proofs of Lemmas C.5 and C.
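As a rough illustration of the noise-calibrated idea, the sketch below adds
isotropic Gaussian noise to each tensor power update before normalizing, in
the spirit of the Gaussian mechanism. The `noise_sigma` parameter is an
assumed free knob; the paper's calibration would instead derive the noise
scale from a sensitivity analysis and the target (epsilon, delta) budget.

```python
import numpy as np

def private_tensor_power_method(T, iters=30, noise_sigma=0.1, rng=None):
    """Sketch of a noise-perturbed tensor power iteration for a symmetric
    third-order tensor T of shape (d, d, d). Returns an (eigenvalue,
    eigenvector) estimate for the top component."""
    rng = np.random.default_rng() if rng is None else rng
    d = T.shape[0]
    u = rng.standard_normal(d)
    u /= np.linalg.norm(u)
    for _ in range(iters):
        # Power update: contract T along two modes with the current vector.
        v = np.einsum('ijk,j,k->i', T, u, u)
        # Perturb the update with Gaussian noise before normalizing
        # (noise_sigma is a placeholder, not a calibrated privacy scale).
        v += noise_sigma * rng.standard_normal(d)
        u = v / np.linalg.norm(v)
    # Eigenvalue estimate, e.g. for a subsequent deflation step.
    lam = np.einsum('ijk,i,j,k->', T, u, u, u)
    return lam, u
```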
Graph Connectivity in Noisy Sparse Subspace Clustering
Subspace clustering is the problem of clustering data points into a union of
low-dimensional linear/affine subspaces. It is the mathematical abstraction of
many important problems in computer vision, image processing, and machine
learning. A line of recent work [4, 19, 24, 20] provided strong theoretical
guarantees for sparse subspace clustering [4], the state-of-the-art algorithm
for subspace clustering, on both noiseless and noisy data sets. It was shown
that under mild conditions, with high probability no two points from different
subspaces are clustered together. Such a guarantee, however, is not sufficient
for the clustering to be correct, due to the notorious "graph connectivity
problem" [15]. In this paper, we investigate the graph connectivity problem
for noisy sparse subspace clustering and show that a simple post-processing
procedure is capable of delivering consistent clustering under certain
"general position" or "restricted eigenvalue" assumptions. We also show that
our condition is almost tight under adversarial noise perturbation by
constructing a counter-example. These results provide the first exact
clustering guarantee for noisy SSC on subspaces of dimension greater than 3.
Comment: 14 pages. To appear in The 19th International Conference on
Artificial Intelligence and Statistics, held in Cadiz, Spain in 2016.
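For readers unfamiliar with the base method, here is a minimal sketch of
noisy sparse subspace clustering: a lasso self-representation for each point,
followed by spectral clustering on the resulting affinity graph. The lasso
parameter `lam` and the symmetrization rule are illustrative defaults, and
the paper's connectivity-fixing post-processing step is not reproduced here.

```python
import numpy as np
from sklearn.linear_model import Lasso
from sklearn.cluster import SpectralClustering

def noisy_ssc(X, n_clusters, lam=0.05):
    """Sketch of noisy sparse subspace clustering on a d x n data matrix X
    whose columns are the data points."""
    d, n = X.shape
    C = np.zeros((n, n))
    for i in range(n):
        others = np.delete(np.arange(n), i)
        # Self-representation: express x_i as a sparse combination of the
        # remaining points; the lasso penalty absorbs the noise.
        reg = Lasso(alpha=lam, fit_intercept=False, max_iter=5000)
        reg.fit(X[:, others], X[:, i])
        C[i, others] = reg.coef_
    # Symmetrize coefficient magnitudes into an affinity graph, then cut it.
    W = np.abs(C) + np.abs(C).T
    labels = SpectralClustering(n_clusters=n_clusters,
                                affinity='precomputed').fit_predict(W)
    return labels
```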