On the Succinct Representation of Equivalence Classes
Given a set of n elements that are partitioned into equivalence classes, we study the problem of assigning unique labels to these elements so as to support the query that asks whether the elements corresponding to two given labels belong to the same equivalence class. This problem has been studied by Katz et al., Alstrup et al., and Lewenstein et al. Lewenstein et al. showed that with no auxiliary data structure, a label space of size n lg n is necessary and sufficient to represent the equivalence relation. They also showed that if the labels are assigned from the set [n], a data structure of Θ(√n) bits is necessary and sufficient to represent the equivalence relation and to answer the equivalence query in O(lg n) time. In this thesis, we give an improved data structure that uses O(√n) bits and can answer queries in constant time when the label space is of size n. Moreover, we study the case where the label space is allowed to have size cn for any constant c > 1. We show that with such a label space, a data structure of Θ(lg n) bits is necessary and sufficient to represent the equivalence relation and to answer the equivalence query in constant time. We believe that our work can trigger further work on trade-offs between label space and auxiliary data structure space for other labeling problems.
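To make the query model concrete, here is a minimal Python sketch (our own illustration, not the thesis's construction) of labeling from the set [n]: each class receives a consecutive block of labels, and a query reduces to locating the block containing each label via binary search. The thesis's contribution is to compress the boundary information to O(√n) bits with constant query time; the explicit boundary list below makes no such attempt.

```python
from bisect import bisect_right

class EquivalenceLabeling:
    """Toy labeling with labels drawn from [n] = {0, ..., n-1}."""

    def __init__(self, classes):
        # classes: a partition of the element set, e.g. a list of sets.
        self.label = {}
        self.boundaries = []          # end of each class's label block
        next_label = 0
        for cls in classes:
            for x in cls:
                self.label[x] = next_label
                next_label += 1
            self.boundaries.append(next_label)

    def same_class(self, a, b):
        # Two labels are equivalent iff they fall in the same block.
        return bisect_right(self.boundaries, a) == bisect_right(self.boundaries, b)

eq = EquivalenceLabeling([{"a", "b"}, {"c"}, {"d", "e", "f"}])
print(eq.same_class(eq.label["a"], eq.label["b"]))   # True
print(eq.same_class(eq.label["a"], eq.label["c"]))   # False
```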
Zero-shot keyword spotting for visual speech recognition in-the-wild
Visual keyword spotting (KWS) is the problem of estimating whether a text
query occurs in a given recording using only video information. This paper
focuses on visual KWS for words unseen during training, a real-world, practical
setting which has so far received no attention from the community. To this end,
we devise an end-to-end architecture comprising (a) a state-of-the-art visual
feature extractor based on spatiotemporal Residual Networks, (b) a
grapheme-to-phoneme model based on sequence-to-sequence neural networks, and
(c) a stack of recurrent neural networks which learn how to correlate visual
features with the keyword representation. Unlike prior works on KWS,
which try to learn word representations merely from sequences of graphemes
(i.e. letters), we propose the use of a grapheme-to-phoneme encoder-decoder
model which learns how to map words to their pronunciation. We demonstrate that
our system obtains very promising visual-only KWS results on the challenging
LRS2 database, for keywords unseen during training. We also show that our
system outperforms a baseline which addresses KWS via automatic speech
recognition (ASR), while it drastically improves over other recently proposed
ASR-free KWS methods.
Comment: Accepted at ECCV-2018.
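As a rough illustration of how the three components fit together, here is a minimal PyTorch sketch (module names, dimensions, and the pooling of the final recurrent state are our assumptions, not the paper's exact architecture): a grapheme encoder, standing in for the encoder side of the grapheme-to-phoneme model, produces a keyword embedding that is concatenated with per-frame visual features and fed to a recurrent scorer.

```python
import torch
import torch.nn as nn

class G2PEncoder(nn.Module):
    """Maps a grapheme sequence to a fixed keyword embedding."""
    def __init__(self, n_graphemes=30, dim=256):
        super().__init__()
        self.embed = nn.Embedding(n_graphemes, dim)
        self.rnn = nn.GRU(dim, dim, batch_first=True)

    def forward(self, graphemes):             # (batch, chars)
        _, h = self.rnn(self.embed(graphemes))
        return h[-1]                           # (batch, dim)

class VisualKWS(nn.Module):
    """Correlates per-frame visual features with the keyword embedding."""
    def __init__(self, feat_dim=512, dim=256):
        super().__init__()
        self.g2p = G2PEncoder(dim=dim)
        self.rnn = nn.GRU(feat_dim + dim, dim, batch_first=True)
        self.score = nn.Linear(dim, 1)

    def forward(self, visual_feats, graphemes):
        # visual_feats: (batch, frames, feat_dim) from a pretrained
        # spatiotemporal ResNet (not included in this sketch).
        kw = self.g2p(graphemes)                        # (batch, dim)
        kw = kw.unsqueeze(1).expand(-1, visual_feats.size(1), -1)
        out, _ = self.rnn(torch.cat([visual_feats, kw], dim=-1))
        return torch.sigmoid(self.score(out[:, -1]))    # P(keyword occurs)

model = VisualKWS()
p = model(torch.randn(2, 75, 512), torch.randint(0, 30, (2, 8)))
print(p.shape)  # torch.Size([2, 1])
```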
Hardness of Exact Distance Queries in Sparse Graphs Through Hub Labeling
A distance labeling scheme is an assignment of bit-labels to the vertices of
an undirected, unweighted graph such that the distance between any pair of
vertices can be decoded solely from their labels. An important class of
distance labeling schemes is that of hub labelings, where a node $u$ stores its distance to the so-called hubs $S_u \subseteq V$, chosen so that for any $u, v \in V$ there is $w \in S_u \cap S_v$ belonging to some shortest $u$-$v$ path. Notice that for most existing graph classes, the best known distance labeling constructions use a hub labeling scheme at least as a key building block. Our interest lies in hub labelings of sparse graphs, i.e., those with $|E(G)| = O(n)$, for which we show a lower bound of $\frac{n}{2^{O(\sqrt{\log n})}}$ on the average size of the hubsets. Additionally, we show a hub-labeling construction for sparse graphs of average size $O\big(\frac{n}{RS(n)^{c}}\big)$ for some $0 < c < 1$, where $RS(n)$ is the so-called Ruzsa-Szemerédi function, linked to the structure of induced matchings in dense graphs. This implies that further improving the lower bound on hub labeling size to $\frac{n}{2^{(\log n)^{o(1)}}}$ would require a breakthrough in the study of lower bounds on $RS(n)$, which have resisted substantial improvement in the last 70 years. For general distance labeling of sparse graphs, we show a lower bound of $\frac{1}{2^{O(\sqrt{\log n})}} \cdot \mathrm{SumIndex}(n)$, where $\mathrm{SumIndex}(n)$ is the communication complexity of the Sum-Index problem over a universe of size $n$. Our results suggest that the best achievable hub-label size and distance-label size in sparse graphs may be $\frac{n}{2^{(\log n)^{c}}}$ for some $0 < c < 1$.
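For readers unfamiliar with the model, the decoding step of a hub labeling fits in a few lines of Python (a toy illustration of the cover property described above: every pair must share a hub lying on a shortest path between them):

```python
def hub_distance(hubs_u, hubs_v):
    """Decode dist(u, v) from the two hub labels alone.

    Each label maps hub -> distance; because some common hub lies on a
    shortest u-v path, the minimum below equals the graph distance.
    """
    common = hubs_u.keys() & hubs_v.keys()
    return min(hubs_u[h] + hubs_v[h] for h in common) if common else float("inf")

# Path graph a-b-c: the middle vertex b serves as everyone's hub.
labels = {
    "a": {"a": 0, "b": 1},
    "b": {"b": 0},
    "c": {"b": 1, "c": 0},
}
print(hub_distance(labels["a"], labels["c"]))  # 2
```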
Quantum Reverse Shannon Theorem
Dual to the usual noisy channel coding problem, where a noisy (classical or
quantum) channel is used to simulate a noiseless one, reverse Shannon theorems
concern the use of noiseless channels to simulate noisy ones, and more
generally the use of one noisy channel to simulate another. For channels of
nonzero capacity, this simulation is always possible, but for it to be
efficient, auxiliary resources of the proper kind and amount are generally
required. In the classical case, shared randomness between sender and receiver
is a sufficient auxiliary resource, regardless of the nature of the source, but
in the quantum case the requisite auxiliary resources for efficient simulation
depend on both the channel being simulated, and the source from which the
channel inputs are coming. For tensor power sources (the quantum generalization
of classical IID sources), entanglement in the form of standard ebits
(maximally entangled pairs of qubits) is sufficient, but for general sources,
which may be arbitrarily correlated or entangled across channel inputs,
additional resources, such as entanglement-embezzling states or backward
communication, are generally needed. Combining existing and new results, we
establish the amounts of communication and auxiliary resources needed in both
the classical and quantum cases, the tradeoffs among them, and the loss of
simulation efficiency when auxiliary resources are absent or insufficient. In
particular we find a new single-letter expression for the excess forward
communication cost of coherent feedback simulations of quantum channels (i.e.
simulations in which the sender retains what would escape into the environment
in an ordinary simulation), on non-tensor-power sources in the presence of
unlimited ebits but no other auxiliary resource. Our results on tensor power
sources establish a strong converse to the entanglement-assisted capacity
theorem.
Comment: 35 pages, to appear in IEEE-IT. v2 has a fixed proof of the Clueless Eve result, a new single-letter formula for the "spread deficit", better error scaling, and an improved strong converse. v3 and v4 each make small improvements to the presentation and add references. v5 fixes broken references.
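For reference, the quantity governing the tensor-power case can be written in standard form (notation ours, not necessarily the paper's): the optimal forward classical communication rate for simulating a channel $\mathcal{N}$ with free entanglement equals its entanglement-assisted classical capacity,

\[
  C_E(\mathcal{N}) \;=\; \max_{\rho}\, I(\rho,\mathcal{N}),
  \qquad
  I(\rho,\mathcal{N}) \;=\; S(\rho) \;+\; S\!\big(\mathcal{N}(\rho)\big)
      \;-\; S\!\big((\mathcal{N}\otimes \mathbb{1})(\phi_\rho)\big),
\]

where $\phi_\rho$ is a purification of the input state $\rho$ and $S$ is the von Neumann entropy. The strong converse mentioned above states that coding for classical communication over $\mathcal{N}$ at any rate exceeding $C_E(\mathcal{N})$, even with free entanglement, drives the success probability to zero.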
Hardness of decoding quantum stabilizer codes
In this article we address the computational hardness of optimally decoding a
quantum stabilizer code. Much like classical linear codes, errors are detected
by measuring certain check operators which yield an error syndrome, and the
decoding problem consists of determining the most likely recovery given the
syndrome. The corresponding classical problem is known to be NP-complete, and a
similar decoding problem for quantum codes is also known to be NP-complete.
However, this decoding strategy is not optimal in the quantum setting as it
does not take into account error degeneracy, which causes distinct errors to
have the same effect on the code. Here, we show that optimal decoding of
stabilizer codes is computationally much harder than optimal decoding of
classical linear codes: it is #P-complete.
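To fix ideas, the classical baseline being compared against is minimum-weight syndrome decoding, shown below in a brute-force Python sketch (a toy illustration, not an efficient decoder). Even this baseline is NP-hard in general; degenerate quantum decoding must instead weigh whole equivalence classes of errors rather than single candidates, which is where the counting-style #P-hardness enters.

```python
from itertools import product

import numpy as np

def ml_syndrome_decode(H, s):
    """Brute-force minimum-weight decoding: find the lowest-weight
    error e with H e = s (mod 2) by exhaustive search."""
    n = H.shape[1]
    best = None
    for bits in product((0, 1), repeat=n):
        e = np.array(bits)
        if np.array_equal(H @ e % 2, s):
            if best is None or e.sum() < best.sum():
                best = e
    return best

# 3-bit repetition code: parity checks x1+x2 and x2+x3.
H = np.array([[1, 1, 0],
              [0, 1, 1]])
print(ml_syndrome_decode(H, np.array([1, 0])))  # [1 0 0]
```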
Cross-Lingual Adaptation using Structural Correspondence Learning
Cross-lingual adaptation, a special case of domain adaptation, refers to the
transfer of classification knowledge between two languages. In this article we
describe an extension of Structural Correspondence Learning (SCL), a recently
proposed algorithm for domain adaptation, for cross-lingual adaptation. The
proposed method uses unlabeled documents from both languages, along with a word
translation oracle, to induce cross-lingual feature correspondences. From these
correspondences a cross-lingual representation is created that enables the
transfer of classification knowledge from the source to the target language.
The main advantages of this approach over other approaches are its resource
efficiency and task specificity.
We conduct experiments in the area of cross-language topic and sentiment
classification involving English as source language and German, French, and
Japanese as target languages. The results show a significant improvement of the
proposed method over a machine translation baseline, reducing the relative
error due to cross-lingual adaptation by an average of 30% (topic
classification) and 59% (sentiment classification). We further report on
empirical analyses that reveal insights into the use of unlabeled data, the
sensitivity with respect to important hyperparameters, and the nature of the
induced cross-lingual correspondences.
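A toy numpy sketch of the SCL mechanics may help (the sizes, the random data, and the least-squares stand-in for SCL's actual pivot-classifier loss are all our assumptions): pivot predictors are trained to spot occurrences of translation pairs from the remaining features, and an SVD of their stacked weights yields the projection defining the cross-lingual representation.

```python
import numpy as np

rng = np.random.default_rng(0)

# Unlabeled documents from both languages as rows over a concatenated
# bag-of-words space (source vocab ++ target vocab); toy sizes.
n_docs, n_feats, n_pivots, k = 200, 50, 10, 5
X = (rng.random((n_docs, n_feats)) < 0.1).astype(float)

# Each pivot is a translation pair (source word, target word) chosen by
# the word translation oracle; here just fixed column indices.
pivots = [(i, n_feats // 2 + i) for i in range(n_pivots)]

# One linear predictor per pivot: predict pivot occurrence from all
# non-pivot features.
W = []
for src, tgt in pivots:
    y = np.clip(X[:, src] + X[:, tgt], 0, 1)   # is the pivot present?
    Z = X.copy()
    Z[:, [src, tgt]] = 0                        # mask the pivot itself
    w, *_ = np.linalg.lstsq(Z, y, rcond=None)
    W.append(w)

# SVD of the stacked predictor weights gives the projection theta;
# documents from either language map into the same correspondence space.
_, _, Vt = np.linalg.svd(np.array(W), full_matrices=False)
theta = Vt[:k]
X_cross = X @ theta.T   # features on which a classifier can transfer
print(X_cross.shape)     # (200, 5)
```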