
    On the Succinct Representation of Equivalence Classes

    Given a set of n elements that are partitioned into equivalence classes, we study the problem of assigning unique labels to these elements in order to support the query that asks whether the elements corresponding to two given labels belong to the same equivalence class. This problem has been studied by Katz et al., Alstrup et al., and Lewenstein et al. Lewenstein et al. showed that with no auxiliary data structure, a label space of size nlg(n) is necessary and sufficient to represent the equivalence relation. They also showed that if the labels were to be assigned from the set [n], a data structure of square root of n bits is necessary and sufficient to represent the equivalence relation and to answer the equivalence query in O(lg(n)) time. In this thesis, we give an improved data structure that uses O(square root of n) bits and can answer queries in constant time, when the label space is of size n. Moreover, we study the case where we allow the label space to be of size cn for any constant c > 1. We show that with such a label space, a data structure of O(lg(n)) bits is necessary and sufficient to represent the equivalence relation and to answer the equivalence query in constant time. We believe that our work can trigger further work on tradeoffs between label space and auxiliary data structure space for other labeling problems.
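The labeling idea in this abstract can be illustrated with a minimal sketch. It is not the thesis's O(square root of n)-bit structure: it assumes a much simpler scheme in which each class receives a contiguous run of labels from {0, ..., n-1} and the auxiliary structure is just the list of run starts, queried by binary search. The function names `assign_labels` and `same_class` are hypothetical.

```python
from bisect import bisect_right

def assign_labels(classes):
    """Give each class a contiguous run of labels; return (labels, starts).

    `starts` is the auxiliary structure: the first label of every class.
    Illustrative only: it stores one integer per class, far more than the
    bounds discussed in the abstract.
    """
    labels = {}
    starts = []
    next_label = 0
    for cls in sorted(classes, key=len):  # smaller classes get smaller labels
        starts.append(next_label)
        for element in cls:
            labels[element] = next_label
            next_label += 1
    return labels, starts

def same_class(starts, a, b):
    """Two labels are equivalent iff they fall into the same run."""
    return bisect_right(starts, a) == bisect_right(starts, b)

classes = [["a", "b", "c"], ["d"], ["e", "f"]]
labels, starts = assign_labels(classes)
print(same_class(starts, labels["a"], labels["c"]))  # same class
print(same_class(starts, labels["e"], labels["a"]))  # different classes
```

The query needs only the two labels and `starts`, never the original elements, which is the sense in which labels plus an auxiliary structure represent the equivalence relation.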

    Zero-shot keyword spotting for visual speech recognition in-the-wild

    Visual keyword spotting (KWS) is the problem of estimating whether a text query occurs in a given recording using only video information. This paper focuses on visual KWS for words unseen during training, a real-world, practical setting which so far has received no attention from the community. To this end, we devise an end-to-end architecture comprising (a) a state-of-the-art visual feature extractor based on spatiotemporal Residual Networks, (b) a grapheme-to-phoneme model based on sequence-to-sequence neural networks, and (c) a stack of recurrent neural networks which learn how to correlate visual features with the keyword representation. Different to prior works on KWS, which try to learn word representations merely from sequences of graphemes (i.e. letters), we propose the use of a grapheme-to-phoneme encoder-decoder model which learns how to map words to their pronunciation. We demonstrate that our system obtains very promising visual-only KWS results on the challenging LRS2 database, for keywords unseen during training. We also show that our system outperforms a baseline which addresses KWS via automatic speech recognition (ASR), while it drastically improves over other recently proposed ASR-free KWS methods. Comment: Accepted at ECCV-201

    Hardness of Exact Distance Queries in Sparse Graphs Through Hub Labeling

    A distance labeling scheme is an assignment of bit-labels to the vertices of an undirected, unweighted graph such that the distance between any pair of vertices can be decoded solely from their labels. An important class of distance labeling schemes is that of hub labelings, where a node $v \in G$ stores its distance to the so-called hubs $S_v \subseteq V$, chosen so that for any $u,v \in V$ there is $w \in S_u \cap S_v$ belonging to some shortest $uv$ path. Notice that for most existing graph classes, the best existing distance labeling constructions use, at some point, a hub labeling scheme as at least a key building block. Our interest lies in hub labelings of sparse graphs, i.e., those with $|E(G)| = O(n)$, for which we show a lower bound of $\frac{n}{2^{O(\sqrt{\log n})}}$ for the average size of the hubsets. Additionally, we show a hub-labeling construction for sparse graphs of average size $O(\frac{n}{RS(n)^{c}})$ for some $0 < c < 1$, where $RS(n)$ is the so-called Ruzsa-Szemerédi function, linked to the structure of induced matchings in dense graphs. This implies that further improving the lower bound on hub labeling size to $\frac{n}{2^{(\log n)^{o(1)}}}$ would require a breakthrough in the study of lower bounds on $RS(n)$, which have resisted substantial improvement in the last 70 years. For general distance labeling of sparse graphs, we show a lower bound of $\frac{1}{2^{O(\sqrt{\log n})}} SumIndex(n)$, where $SumIndex(n)$ is the communication complexity of the Sum-Index problem over $Z_n$. Our results suggest that the best achievable hub-label size and distance-label size in sparse graphs may be $\Theta(\frac{n}{2^{(\log n)^c}})$ for some $0 < c < 1$.
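The decoding rule for hub labelings described above (take the minimum of d(u,w) + d(w,v) over common hubs w) can be sketched as follows. This is a minimal illustration on a hand-picked star graph, with hubsets chosen by hand rather than by any construction from the paper. The names `bfs_distances`, `build_labels`, and `query` are hypothetical.

```python
from collections import deque

def bfs_distances(adj, source):
    """Unweighted single-source shortest-path distances via BFS."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        x = queue.popleft()
        for y in adj[x]:
            if y not in dist:
                dist[y] = dist[x] + 1
                queue.append(y)
    return dist

def build_labels(adj, hubsets):
    """Label of v: a map from each hub w in S_v to the distance d(v, w)."""
    labels = {}
    for v, hubs in hubsets.items():
        dist = bfs_distances(adj, v)
        labels[v] = {w: dist[w] for w in hubs}
    return labels

def query(label_u, label_v):
    """Decode d(u, v) from the two labels alone, using common hubs."""
    common = label_u.keys() & label_v.keys()
    if not common:
        return None  # the hubsets do not form a valid hub labeling
    return min(label_u[w] + label_v[w] for w in common)

# Star graph: center 0 with leaves 1..4. Every shortest path between two
# distinct vertices passes through 0, so S_v = {0, v} is a valid hub labeling.
adj = {0: [1, 2, 3, 4], 1: [0], 2: [0], 3: [0], 4: [0]}
hubsets = {v: {0, v} for v in adj}
labels = build_labels(adj, hubsets)
print(query(labels[1], labels[3]))  # distance between two leaves
```

Note that `query` never touches the graph: correctness depends entirely on the hubsets containing, for every pair, a vertex on some shortest path between them.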

    Quantum Reverse Shannon Theorem

    Dual to the usual noisy channel coding problem, where a noisy (classical or quantum) channel is used to simulate a noiseless one, reverse Shannon theorems concern the use of noiseless channels to simulate noisy ones, and more generally the use of one noisy channel to simulate another. For channels of nonzero capacity, this simulation is always possible, but for it to be efficient, auxiliary resources of the proper kind and amount are generally required. In the classical case, shared randomness between sender and receiver is a sufficient auxiliary resource, regardless of the nature of the source, but in the quantum case the requisite auxiliary resources for efficient simulation depend on both the channel being simulated, and the source from which the channel inputs are coming. For tensor power sources (the quantum generalization of classical IID sources), entanglement in the form of standard ebits (maximally entangled pairs of qubits) is sufficient, but for general sources, which may be arbitrarily correlated or entangled across channel inputs, additional resources, such as entanglement-embezzling states or backward communication, are generally needed. Combining existing and new results, we establish the amounts of communication and auxiliary resources needed in both the classical and quantum cases, the tradeoffs among them, and the loss of simulation efficiency when auxiliary resources are absent or insufficient. In particular we find a new single-letter expression for the excess forward communication cost of coherent feedback simulations of quantum channels (i.e. simulations in which the sender retains what would escape into the environment in an ordinary simulation), on non-tensor-power sources in the presence of unlimited ebits but no other auxiliary resource. Our results on tensor power sources establish a strong converse to the entanglement-assisted capacity theorem. Comment: 35 pages, to appear in IEEE-IT. v2 has a fixed proof of the Clueless Eve result, a new single-letter formula for the "spread deficit", better error scaling, and an improved strong converse. v3 and v4 each make small improvements to the presentation and add references. v5 fixes broken reference

    Hardness of decoding quantum stabilizer codes

    In this article we address the computational hardness of optimally decoding a quantum stabilizer code. Much like classical linear codes, errors are detected by measuring certain check operators which yield an error syndrome, and the decoding problem consists of determining the most likely recovery given the syndrome. The corresponding classical problem is known to be NP-complete, and a similar decoding problem for quantum codes is also known to be NP-complete. However, this decoding strategy is not optimal in the quantum setting as it does not take into account error degeneracy, which causes distinct errors to have the same effect on the code. Here, we show that optimal decoding of stabilizer codes is computationally much harder than optimal decoding of classical linear codes, it is #P
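The notions of error syndrome and error degeneracy mentioned in this abstract can be illustrated with a small sketch. It assumes the three-qubit bit-flip code with stabilizer generators Z1Z2 and Z2Z3 (a toy example, not one taken from the article) and represents Pauli operators as binary (x, z) vectors, ignoring phases. All function names are hypothetical.

```python
from itertools import product

def anticommutes(p, q):
    """Return 1 if Paulis p and q anticommute, else 0 (symplectic product)."""
    (px, pz), (qx, qz) = p, q
    total = sum(a & b for a, b in zip(px, qz)) + sum(a & b for a, b in zip(pz, qx))
    return total % 2

def syndrome(error, generators):
    """One bit per stabilizer generator: 1 where the error anticommutes."""
    return tuple(anticommutes(error, g) for g in generators)

def multiply(p, q):
    """Product of two Paulis up to phase: XOR of the binary vectors."""
    (px, pz), (qx, qz) = p, q
    return (tuple(a ^ b for a, b in zip(px, qx)),
            tuple(a ^ b for a, b in zip(pz, qz)))

def in_stabilizer_group(p, generators):
    """Brute-force check that p (up to phase) is a product of generators."""
    n = len(p[0])
    for coeffs in product([0, 1], repeat=len(generators)):
        acc = ((0,) * n, (0,) * n)  # identity
        for c, g in zip(coeffs, generators):
            if c:
                acc = multiply(acc, g)
        if acc == p:
            return True
    return False

# Stabilizer generators of the 3-qubit bit-flip code: Z1Z2 and Z2Z3.
gens = [((0, 0, 0), (1, 1, 0)), ((0, 0, 0), (0, 1, 1))]
X1 = ((1, 0, 0), (0, 0, 0))
Z1 = ((0, 0, 0), (1, 0, 0))
Z2 = ((0, 0, 0), (0, 1, 0))

print(syndrome(X1, gens))
# Z1 and Z2 share a syndrome and differ by a stabilizer, so they act
# identically on the code space: a simple instance of degeneracy.
print(syndrome(Z1, gens) == syndrome(Z2, gens),
      in_stabilizer_group(multiply(Z1, Z2), gens))
```

A decoder that only looks for the single most likely error treats Z1 and Z2 as different candidates, whereas an optimal decoder would add up the probabilities of all errors that are equivalent up to a stabilizer.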

    Cross-Lingual Adaptation using Structural Correspondence Learning

    Cross-lingual adaptation, a special case of domain adaptation, refers to the transfer of classification knowledge between two languages. In this article we describe an extension of Structural Correspondence Learning (SCL), a recently proposed algorithm for domain adaptation, for cross-lingual adaptation. The proposed method uses unlabeled documents from both languages, along with a word translation oracle, to induce cross-lingual feature correspondences. From these correspondences a cross-lingual representation is created that enables the transfer of classification knowledge from the source to the target language. The main advantages of this approach over other approaches are its resource efficiency and task specificity. We conduct experiments in the area of cross-language topic and sentiment classification involving English as source language and German, French, and Japanese as target languages. The results show a significant improvement of the proposed method over a machine translation baseline, reducing the relative error due to cross-lingual adaptation by an average of 30% (topic classification) and 59% (sentiment classification). We further report on empirical analyses that reveal insights into the use of unlabeled data, the sensitivity with respect to important hyperparameters, and the nature of the induced cross-lingual correspondences