Strengths and Weaknesses of Quantum Fingerprinting
We study the power of quantum fingerprints in the simultaneous message
passing (SMP) setting of communication complexity. Yao recently showed how to
simulate, with exponential overhead, classical shared-randomness SMP protocols
by means of quantum SMP protocols without shared randomness. Our first result
extends Yao's simulation to the strongest possible model: every many-round
quantum protocol with unlimited shared entanglement can be simulated, with
exponential overhead, by quantum SMP protocols without shared randomness. We
apply our technique to obtain an efficient protocol of this kind for a function
which cannot be efficiently solved
through more restricted simulations. Second, we tightly characterize the power
of the quantum fingerprinting technique by making a connection to arrangements
of homogeneous halfspaces with maximal margin. These arrangements have been
well studied in computational learning theory, and we use some strong results
obtained in this area to exhibit weaknesses of quantum fingerprinting. In
particular, this implies that for almost all functions, quantum fingerprinting
protocols are exponentially worse than classical deterministic SMP protocols.Comment: 13 pages, no figures, to appear in CCC'0
Quantum Communication Cannot Simulate a Public Coin
We study the simultaneous message passing model of communication complexity.
Building on the quantum fingerprinting protocol of Buhrman et al., Yao recently
showed that a large class of efficient classical public-coin protocols can be
turned into efficient quantum protocols without public coin. This raises the
question whether this can be done always, i.e. whether quantum communication
can always replace a public coin in the SMP model. We answer this question in
the negative, exhibiting a communication problem where classical communication
with public coin is exponentially more efficient than quantum communication.
Together with a separation in the other direction due to Bar-Yossef et al.,
this shows that the quantum SMP model is incomparable with the classical
public-coin SMP model.
In addition we give a characterization of the power of quantum fingerprinting
by means of a connection to geometrical tools from machine learning, a
quadratic improvement of Yao's simulation, and a nearly tight analysis of the
Hamming distance problem from Yao's paper.
Comment: 12 pages LaTe
Sign rank versus VC dimension
This work studies the maximum possible sign rank of $n \times n$ sign
matrices with a given VC dimension $d$. For $d=1$, this maximum is three. For
$d=2$, this maximum is $\tilde{\Theta}(n^{1/2})$. For $d>2$, similar but
slightly less accurate statements hold. The lower bounds improve over previous
ones by Ben-David et al., and the upper bounds are novel.
The lower bounds are obtained by probabilistic constructions, using a theorem
of Warren in real algebraic topology. The upper bounds are obtained using a
result of Welzl about spanning trees with low stabbing number, and using the
moment curve.
The upper bound technique is also used to: (i) provide estimates on the
number of classes of a given VC dimension, and the number of maximum classes of
a given VC dimension -- answering a question of Frankl from '89, and (ii)
design an efficient algorithm that provides a multiplicative
approximation for the sign rank.
We also observe a general connection between sign rank and spectral gaps,
based on Forster's argument. Consider the $n \times n$ adjacency matrix of a
$\Delta$-regular graph whose second eigenvalue has absolute value $\lambda$,
with $\Delta \le n/2$. We show that the sign rank of the signed version of
this matrix is at least $\Delta/\lambda$. We use this connection to prove the
existence of a maximum class with VC dimension $2$ and sign rank
$\tilde{\Theta}(n^{1/2})$. This answers a question of Ben-David et al.
regarding the sign rank of large VC classes. We also describe limitations of
this approach, in the spirit of the Alon-Boppana
theorem.
We further describe connections to communication complexity, geometry,
learning theory, and combinatorics.
Comment: 33 pages. This is a revised version of the paper "Sign rank versus VC
dimension". Additional results in this version: (i) Estimates on the number of
maximum VC classes (answering a question of Frankl from '89). (ii) Estimates
on the sign rank of large VC classes (answering a question of Ben-David et al.
from '03). (iii) A discussion on the computational complexity of computing the
sign-rank.
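The spectral-gap connection rests on Forster's bound, which lower-bounds the sign rank of an $n \times n$ $\pm 1$ matrix $A$ by $n/\|A\|$, where $\|A\|$ is the spectral norm. As a small numerical illustration of our own (not code from the paper): for a Hadamard matrix $H$ we have $\|H\| = \sqrt{n}$, so the bound evaluates to $\sqrt{n}$.

```python
import numpy as np

def forster_lower_bound(A: np.ndarray) -> float:
    # Forster's bound: the sign rank of an n x n +/-1 matrix A
    # is at least n / ||A||, with ||A|| the spectral (2-)norm.
    n = A.shape[0]
    return n / np.linalg.norm(A, 2)

def sylvester_hadamard(k: int) -> np.ndarray:
    # 2^k x 2^k Hadamard matrix via the Sylvester doubling construction.
    H = np.array([[1.0]])
    for _ in range(k):
        H = np.block([[H, H], [H, -H]])
    return H

H = sylvester_hadamard(5)       # 32 x 32, entries +/-1
bound = forster_lower_bound(H)  # n / sqrt(n) = sqrt(32), since H H^T = n I
print(bound)                    # -> 5.65685... (= sqrt(32)), up to float error
```

Since $H H^T = n I$, all singular values equal $\sqrt{n}$, making the Hadamard matrix a standard example where the spectral bound is informative.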
Revealing and analyzing the shared structure of deep face embeddings
2022 Summer. Includes bibliographical references.
Deep convolutional neural networks trained for face recognition are found to output face embeddings which share a fundamental structure. More specifically, one face verification model's embeddings (i.e. last-layer activations) can be compared directly to another model's embeddings after only a rotation or linear transformation, with little performance penalty. If only a rotation is required to convert the bulk of embeddings between models, there is a strong sense in which those models are learning the same thing. In the most recent experiments, the structural similarity (and dissimilarity) of face embeddings is analyzed as a means of understanding face recognition bias. Bias has been identified in many face recognition models, and is often analyzed using distance measures between pairs of faces. By representing and comparing faces as demographic groups rather than as pairs, this shared embedding structure can be further understood. Specifically, demographic-specific subspaces are represented as points on a Grassmann manifold. Across 10 models, the geodesic distances between those points are expressive of demographic differences. By comparing how different groups of people are represented in the structure of embedding space, and how those structures vary with model design, a new perspective on both representational similarity and face recognition bias is offered.
Distribution-Independent Evolvability of Linear Threshold Functions
Valiant's (2007) model of evolvability models the evolutionary process of
acquiring useful functionality as a restricted form of learning from random
examples. Linear threshold functions and their various subclasses, such as
conjunctions and decision lists, play a fundamental role in learning theory and
hence their evolvability has been the primary focus of research on Valiant's
framework (2007). One of the main open problems regarding the model is whether
conjunctions are evolvable distribution-independently (Feldman and Valiant,
2008). We show that the answer is negative. Our proof is based on a new
combinatorial parameter of a concept class that lower-bounds the complexity of
learning from correlations.
We contrast the lower bound with a proof that linear threshold functions
having a non-negligible margin on the data points are evolvable
distribution-independently via a simple mutation algorithm. Our algorithm
relies on a non-linear loss function being used to select the hypotheses
instead of 0-1 loss in Valiant's (2007) original definition. The proof of
evolvability requires that the loss function satisfies several mild conditions
that are, for example, satisfied by the quadratic loss function studied in
several other works (Michael, 2007; Feldman, 2009; Valiant, 2010). An important
property of our evolution algorithm is monotonicity; that is, the algorithm
guarantees evolvability without any decreases in performance. Previously,
monotone evolvability was only shown for conjunctions with quadratic loss
(Feldman, 2009) or when the distribution on the domain is severely restricted
(Michael, 2007; Feldman, 2009; Kanade et al., 2010).
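The margin result can be caricatured as hill-climbing under a non-linear loss: propose a random mutation of the current weight vector and keep it only if the quadratic loss does not increase, so performance never decreases (monotonicity). The toy sketch below is our own illustration under assumed parameters, not the paper's algorithm:

```python
import numpy as np

def quadratic_loss(w, X, y):
    # Quadratic loss of the relaxed hypothesis w.x against labels y in {-1, +1}.
    return np.mean((X @ w - y) ** 2)

def evolve(X, y, steps=3000, sigma=0.1, seed=0):
    rng = np.random.default_rng(seed)
    w = np.zeros(X.shape[1])
    loss = quadratic_loss(w, X, y)
    for _ in range(steps):
        cand = w + sigma * rng.standard_normal(w.shape)  # random mutation
        cand_loss = quadratic_loss(cand, X, y)
        if cand_loss <= loss:  # monotone: never accept a worse hypothesis
            w, loss = cand, cand_loss
    return w

# Target: a halfspace with a non-negligible margin on the data points.
rng = np.random.default_rng(1)
w_star = np.array([1.0, -1.0]) / np.sqrt(2)
X = rng.standard_normal((200, 2))
X = X[np.abs(X @ w_star) > 0.3]   # keep only points with margin > 0.3
y = np.sign(X @ w_star)

w = evolve(X, y)
accuracy = np.mean(np.sign(X @ w) == y)
```

The quadratic loss gives the selection rule a smooth gradient toward the target even when the 0-1 loss is flat, which is the role the non-linear loss plays in the paper's argument.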
Beyond Normal: On the Evaluation of Mutual Information Estimators
Mutual information is a general statistical dependency measure which has
found applications in representation learning, causality, domain generalization
and computational biology. However, mutual information estimators are typically
evaluated on simple families of probability distributions, namely the
multivariate normal distribution and selected distributions with
one-dimensional random
variables. In this paper, we show how to construct a diverse family of
distributions with known ground-truth mutual information and propose a
language-independent benchmarking platform for mutual information estimators.
We discuss the general applicability and limitations of classical and neural
estimators in settings involving high dimensions, sparse interactions,
long-tailed distributions, and high mutual information. Finally, we provide
guidelines for practitioners on how to select an appropriate estimator adapted
to the difficulty of the problem considered, and on the issues one needs to
consider when applying an estimator to a new data set.
Comment: Accepted at NeurIPS 2023. Code available at
https://github.com/cbg-ethz/bm
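For the multivariate normal family, ground-truth mutual information is available in closed form; e.g. a bivariate Gaussian with correlation $\rho$ has $I(X;Y) = -\tfrac{1}{2}\ln(1-\rho^2)$. A minimal sketch of our own (parameters illustrative; the paper's actual benchmark lives at the linked repository) comparing this to a naive histogram plug-in estimator:

```python
import numpy as np

def true_mi_gaussian(rho: float) -> float:
    # Exact MI (in nats) of a bivariate normal with correlation rho.
    return -0.5 * np.log(1.0 - rho ** 2)

def histogram_mi(x, y, bins=30):
    # Naive plug-in estimator: discretize, then sum p * log(p / (px * py)).
    pxy, _, _ = np.histogram2d(x, y, bins=bins)
    pxy /= pxy.sum()
    px = pxy.sum(axis=1, keepdims=True)
    py = pxy.sum(axis=0, keepdims=True)
    mask = pxy > 0
    return float(np.sum(pxy[mask] * np.log(pxy[mask] / (px @ py)[mask])))

rho = 0.8
rng = np.random.default_rng(0)
cov = [[1.0, rho], [rho, 1.0]]
x, y = rng.multivariate_normal([0.0, 0.0], cov, size=100_000).T

print(true_mi_gaussian(rho))  # ~0.511 nats
print(histogram_mi(x, y))     # close to the true value, with plug-in bias
```

Even in this easy Gaussian case the estimate carries discretization and finite-sample bias, which is the kind of gap the benchmark is designed to expose on harder (sparse, long-tailed, high-MI) families.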