Sampling correctors
In many situations, sample data is obtained from a noisy or imperfect source. In order to address such corruptions, this paper introduces the concept of a sampling corrector. Such algorithms use structure that the distribution is purported to have, in order to allow one to make "on-the-fly" corrections to samples drawn from probability distributions. These algorithms then act as filters between the noisy data and the end user.
We show connections between sampling correctors, distribution learning algorithms, and distribution property testing algorithms. We show that these connections can be utilized to expand the applicability of known distribution learning and property testing algorithms as well as to achieve improved algorithms for those tasks. As a first step, we show how to design sampling correctors using proper learning algorithms. We then focus on the question of whether algorithms for sampling correctors can be more efficient in terms of sample complexity than learning algorithms for the analogous families of distributions. When correcting monotonicity, we show that this is indeed the case when also granted query access to the cumulative distribution function. We also obtain sampling correctors for monotonicity without this stronger type of access, provided that the distribution is originally very close to monotone (namely, at a distance O(1/log^2 n)). In addition, we consider a restricted error model that aims at capturing "missing data" corruptions. In this model, we show that distributions that are close to monotone have sampling correctors that are significantly more efficient than achievable by the learning approach. We then consider the question of whether an additional source of independent random bits is required by sampling correctors to implement the correction process. We show that for correcting close-to-uniform distributions and close-to-monotone distributions, no additional source of random bits is required, as the samples from the input source itself can be used to produce this randomness.
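As a rough illustration of the "learn, then sample" route mentioned in the abstract, the sketch below estimates the noisy distribution empirically, projects the estimate onto non-increasing sequences, and then serves samples from that hypothesis. It is not the paper's algorithm: the pool-adjacent-violators projection is a swapped-in least-squares heuristic rather than a proper learner with proven guarantees, and the names `pava_non_increasing`, `learn_then_sample_corrector`, `noisy_sampler`, and `num_learn` are hypothetical.

```python
import random
from collections import Counter


def pava_non_increasing(values):
    """Least-squares projection onto non-increasing sequences
    (pool adjacent violators). The total sum is preserved."""
    blocks = []  # each block is [sum, count]
    for v in values:
        blocks.append([v, 1])
        # Merge the last two blocks while the earlier mean is smaller
        # than the later mean (compared by cross-multiplication).
        while (len(blocks) > 1
               and blocks[-2][0] * blocks[-1][1] < blocks[-1][0] * blocks[-2][1]):
            s, c = blocks.pop()
            blocks[-1][0] += s
            blocks[-1][1] += c
    out = []
    for s, c in blocks:
        out.extend([s / c] * c)
    return out


def learn_then_sample_corrector(noisy_sampler, n, num_learn, rng):
    """Illustrative corrector over the domain {0, ..., n-1}.

    Assumption: noisy_sampler() returns integers in range(n).
    Returns the monotone hypothesis and a sampler that draws from it.
    """
    counts = Counter(noisy_sampler() for _ in range(num_learn))
    empirical = [counts[i] / num_learn for i in range(n)]
    hypothesis = pava_non_increasing(empirical)

    def corrected_sampler():
        return rng.choices(range(n), weights=hypothesis, k=1)[0]

    return hypothesis, corrected_sampler
```

Note that this sketch spends all of its learning samples up front, which is exactly the cost the abstract asks whether a sampling corrector can beat.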
Locally Decodable Codes for Edit Distance
Locally decodable codes (LDC) [1,5] are error correcting codes that allow decoding (any) individual symbol of the message, by reading only few symbols of the codeword. Consider an application such as storage solutions for large data, where errors may occur in the disks (or some disks may just crash). In such an application, it is often desirable to recover only small portions of the data (have random access). Thus, in such applications, using LDC provides enormous efficiency gains over standard error correcting codes (ECCs), that need to read the entire encoded message to learn even a single bit of information. Typically, LDCs, as well as standard ECCs, decode the encoded message if up to some bounded fraction of the symbols has been modified. This corresponds to decoding strings of bounded Hamming distance from a valid codeword. An often more realistic metric is the edit distance, measuring the shortest sequence of insertions and deletions (indel.) of symbols leading from one word to another. For example, (few) indel. modifica
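To make the contrast between the two metrics concrete, here is a minimal sketch that computes the insertion/deletion distance through the longest common subsequence and compares it with the Hamming distance. The function names `indel_distance` and `hamming_distance` are hypothetical, and the code illustrates only the metric, not the paper's code construction.

```python
def indel_distance(a: str, b: str) -> int:
    """Minimum number of single-symbol insertions and deletions turning
    a into b, computed as len(a) + len(b) - 2 * LCS(a, b)."""
    prev = [0] * (len(b) + 1)  # LCS lengths for the previous prefix of a
    for ch_a in a:
        curr = [0]
        for j, ch_b in enumerate(b, start=1):
            if ch_a == ch_b:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return len(a) + len(b) - 2 * prev[-1]


def hamming_distance(a: str, b: str) -> int:
    """Number of positions where two equal-length strings differ."""
    if len(a) != len(b):
        raise ValueError("Hamming distance needs equal-length strings")
    return sum(x != y for x, y in zip(a, b))


# Deleting the first symbol and appending one shifts every position:
# two indel operations, yet all four positions differ.
print(indel_distance("abcd", "bcde"), hamming_distance("abcd", "bcde"))  # 2 4
```

The example shows why a decoder designed for bounded Hamming distance can fail under a couple of indel operations: a single shift misaligns every later symbol.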
Do Israelis understand the Hebrew bible?
The Hebrew Bible should be taught like a foreign language in Israel too, argues Ghil'ad Zuckermann, inter alia endorsing Avraham Ahuvia’s recently-launched translation of the Old Testament into what Zuckermann calls high-register 'Israeli'. According to Zuckermann, Tanakh RAM fulfills the mission of 'red 'el ha'am' not only in its Hebrew meaning (Go down to the people) but also – more importantly – in its Yiddish meaning ('red' meaning 'speak!', as opposed to its colorful communist sense). Ahuvia's translation is most useful and dignified. Given its high register, however, Zuckermann predicts that the future promises consequent translations into more colloquial forms of Israeli, a beautifully multi-layered and intricately multi-sourced language, of which to be proud. Ghil'ad Zuckerman
Testing non-uniform k-wise independent distributions over product spaces (extended abstract)
A distribution D over $\Sigma_1 \times \cdots \times \Sigma_n$ is called (non-uniform) k-wise independent if for any set of k indices $\{i_1, \ldots, i_k\}$ and for any $z_1 \cdots z_k \in \Sigma_{i_1} \times \cdots \times \Sigma_{i_k}$, $\Pr_{X \sim D}[X_{i_1} \cdots X_{i_k} = z_1 \cdots z_k] = \Pr_{X \sim D}[X_{i_1} = z_1] \cdots \Pr_{X \sim D}[X_{i_k} = z_k]$. We study the problem of testing (non-uniform) k-wise independent distributions over product spaces. For the uniform case we show an upper bound on the distance between a distribution D and the set of k-wise independent distributions in terms of the sum of Fourier coefficients of D at vectors of weight at most k. Such a bound was previously known only for the binary field. For the non-uniform case, we give a new characterization of distributions being k-wise independent and further show that such a characterization is robust. These greatly generalize the results of Alon et al. [1] on uniform k-wise independence over the binary field to non-uniform k-wise independence over product spaces. Our results yield natural testing algorithms for k-wise independence with time and sample complexity sublinear in terms of the support size when k is a constant. The main technical tools employed include discrete Fourier transforms and the theory of linear systems of congruences.
Funding: National Science Foundation (U.S.) (NSF grant 0514771); National Science Foundation (U.S.) (grant 0728645); National Science Foundation (U.S.) (Grant 0732334); Marie Curie International Reintegration Grants (Grant PIRG03-GA-2008-231077); Israel Science Foundation (Grant 1147/09); Israel Science Foundation (Grant 1675/09); Massachusetts Institute of Technology (Akamai Presidential Fellowship)
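The definition can be checked directly on a small, fully specified distribution. The sketch below does an exhaustive check against the product of single-coordinate marginals; it assumes the whole probability table is known, so it is not the sublinear-sample tester the abstract describes. The names `marginal`, `is_k_wise_independent`, `alphabets`, and `tol` are hypothetical.

```python
from itertools import combinations, product
from math import prod


def marginal(dist, indices):
    """Marginal of dist (dict: tuple -> probability) on the given indices."""
    out = {}
    for x, p in dist.items():
        key = tuple(x[i] for i in indices)
        out[key] = out.get(key, 0.0) + p
    return out


def is_k_wise_independent(dist, alphabets, k, tol=1e-9):
    """Check that for every set of k indices and every assignment z, the
    joint probability equals the product of the single-coordinate
    marginals. alphabets[i] lists the symbols of coordinate i."""
    n = len(alphabets)
    singles = [marginal(dist, (i,)) for i in range(n)]
    for idx in combinations(range(n), k):
        joint = marginal(dist, idx)
        for z in product(*(alphabets[i] for i in idx)):
            expected = prod(singles[i].get((zi,), 0.0) for i, zi in zip(idx, z))
            if abs(joint.get(z, 0.0) - expected) > tol:
                return False
    return True


# Uniform over (a, b, a XOR b): pairwise independent, not 3-wise independent.
xor_dist = {(a, b, a ^ b): 0.25 for a in (0, 1) for b in (0, 1)}
bits = [(0, 1)] * 3
print(is_k_wise_independent(xor_dist, bits, 2))  # True
print(is_k_wise_independent(xor_dist, bits, 3))  # False
```

The exhaustive loop visits every k-subset and every assignment, which is the cost a tester with complexity sublinear in the support size is meant to avoid.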
Tight Upper and Lower Bounds for Leakage-Resilient, Locally Decodable and Updatable Non-Malleable Codes
In a recent result, Dachman-Soled et al. (TCC '15) proposed a new notion called locally decodable and updatable non-malleable codes, which, informally, provide the security guarantees of a non-malleable code while also allowing for efficient random access. They also considered locally decodable and updatable non-malleable codes that are leakage-resilient, allowing for adversaries who continually leak information in addition to tampering. Unfortunately, the locality of their construction in the continual setting was Omega(log n), meaning that if the original message size was n, then Omega(log n) positions of the codeword had to be accessed upon each decode and update instruction.
In this work, we ask whether super-constant locality is inherent in this setting. We answer the question affirmatively by showing tight upper and lower bounds. Specifically, in any threat model which allows for a rewind attack (wherein the attacker leaks a small amount of data, waits for the data to be overwritten, and then writes the original data back), we show that a locally decodable and updatable non-malleable code with block size Chi in poly(lambda) number of bits requires locality delta(n) in omega(1), where n = poly(lambda) is the message length and lambda is the security parameter. On the other hand, we revisit the threat model of Dachman-Soled et al. (TCC '15), which indeed allows the adversary to launch a rewind attack, and present a construction of a locally decodable and updatable non-malleable code with block size Chi in Omega(lambda^{1/mu}) number of bits (for constant 0 < mu < 1) with locality delta(n), for any delta(n) in omega(1), and n = poly(lambda).
How to Correct Errors in Multi-Server PIR
Suppose that there exist a user and servers . Each server holds a copy of a database , and the user holds a secret index . A b error correcting server PIR (Private Information Retrieval) scheme allows a user to retrieve correctly even if and or less servers return false answers while each server learns no information on in the information theoretic sense. Although there exists such a scheme with the total communication cost where , the decoding algorithm is very inefficient.
In this paper, we show an efficient decoding algorithm for this error correcting server PIR scheme. It runs in time