Property Testing with Online Adversaries
The online manipulation-resilient testing model, proposed by Kalemaj,
Raskhodnikova and Varma (ITCS 2022 and Theory of Computing 2023), studies
property testing in situations where access to the input degrades continuously
and adversarially. Specifically, after each query made by the tester is
answered, the adversary can intervene and either erase or corrupt data
points. In this work, we investigate a more nuanced version of the online model
in order to overcome old and new impossibility results for the original model.
We start by presenting an optimal tester for linearity and a lower bound for
low-degree testing of Boolean functions in the original model. We overcome the
lower bound by allowing batch queries, where the tester gets a group of queries
answered between manipulations of the data. Our batch size is small enough so
that function values for a single batch on their own give no information about
whether the function is of low degree. Finally, to overcome the impossibility
results of Kalemaj et al. for sortedness and the Lipschitz property of
sequences, we extend the model to include adversaries that make fewer than one
erasure per query. For sortedness, we characterize the rate of
erasures for which online testing can be performed, exhibiting a sharp
transition from optimal query complexity to impossibility of testability (with
any number of queries). Our online tester works for a general class of local
properties of sequences. One feature of our results is that we get new (and in
some cases, simpler) optimal algorithms for several properties in the standard
property testing model.
Comment: To be published in the 15th Innovations in Theoretical Computer Science
(ITCS 2024).
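The query model described in this abstract can be illustrated with a minimal sketch. It is not the authors' construction: the class name `OnlineErasureOracle`, the `erasures_per_query` parameter, and the random erasure strategy are all assumptions made for illustration (a real adversary would choose its erasures to hurt the tester). The sketch only shows the interaction pattern: the adversary moves after each answered query, or after each answered batch when batch queries are allowed.

```python
import random

# Sentinel returned for a data point the adversary has erased.
ERASED = object()


class OnlineErasureOracle:
    """Query access to a sequence under an online erasure adversary (sketch)."""

    def __init__(self, values, erasures_per_query=1, seed=0):
        self.values = list(values)
        self.t = erasures_per_query  # assumed integer erasure budget per move
        self.rng = random.Random(seed)

    def _adversary_move(self):
        # Hypothetical strategy: erase up to t random surviving positions.
        alive = [i for i, v in enumerate(self.values) if v is not ERASED]
        for i in self.rng.sample(alive, min(self.t, len(alive))):
            self.values[i] = ERASED

    def query(self, i):
        # Answer a single query, then let the adversary manipulate the data.
        answer = self.values[i]
        self._adversary_move()
        return answer

    def batch_query(self, indices):
        # Answer the whole batch before the adversary gets its next move.
        answers = [self.values[i] for i in indices]
        self._adversary_move()
        return answers
```

With single queries, every answer after the first may already reflect erasures. With `batch_query`, all points in a batch are read from the same snapshot of the data.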
Mixture Models in Machine Learning
Modeling with mixtures is a powerful method in the statistical toolkit that can be used for representing the presence of sub-populations within an overall population. In many applications ranging from financial models to genetics, a mixture model is used to fit the data. The primary difficulty in learning mixture models is that the observed data set does not identify the sub-population to which an individual observation belongs. Despite being studied for more than a century, the theoretical guarantees of mixture models remain unknown for several important settings.
In this thesis, we look at three groups of problems. The first part is aimed at estimating the parameters of a mixture of simple distributions. We ask the following question: How many samples are necessary and sufficient to learn the latent parameters? We propose several approaches for this problem that include complex analytic tools to connect statistical distances between pairs of mixtures with the characteristic function. We show sufficient sample complexity guarantees for mixtures of popular distributions (including Gaussian, Poisson and Geometric). For many distributions, our results provide the first sample complexity guarantees for parameter estimation in the corresponding mixture. Using these techniques, we also provide improved lower bounds on the Total Variation distance between Gaussian mixtures with two components and demonstrate new results in some sequence reconstruction problems.
In the second part, we study Mixtures of Sparse Linear Regressions where the goal is to learn the best set of linear relationships between the scalar responses (i.e., labels) and the explanatory variables (i.e., features). We focus on a scenario where a learner is able to choose the features to get the labels. To tackle the high dimensionality of data, we further assume that the linear maps are also sparse, i.e., have only a few prominent features among many. For this setting, we devise algorithms with sub-linear (as a function of the dimension) sample complexity guarantees that are also robust to noise.
In the final part, we study Mixtures of Sparse Linear Classifiers in the same setting as above. Given a set of features and the binary labels, the objective of this task is to find a set of hyperplanes in the space of features such that for any (feature, label) pair, there exists a hyperplane in the set that justifies the mapping. We devise efficient algorithms with sub-linear sample complexity guarantees for learning the unknown hyperplanes under similar sparsity assumptions as above. To that end, we propose several novel techniques that include tensor decomposition methods and combinatorial designs.
LIPIcs, Volume 261, ICALP 2023, Complete Volume
LIPIcs, Volume 251, ITCS 2023, Complete Volume
Classical and quantum sublinear algorithms
This thesis investigates the capabilities of classical and quantum sublinear algorithms through the lens of complexity theory. The formal classification of problems between “tractable” (by constructing efficient algorithms that solve them) and “intractable” (by proving no efficient algorithm can) is among the most fruitful lines of work in theoretical computer science, which includes, amongst an abundance of fundamental results and open problems, the notorious P vs. NP question.
This particular incarnation of the decision-versus-verification question stems from a choice of computational model: polynomial-time Turing machines. It is far from the only model worthy of investigation, however; indeed, measuring time up to polynomial factors is often too “coarse” for practical applications. We focus on quantum computation, a more complete model of physically realisable computation where quantum mechanical phenomena (such as interference and entanglement) may be used as computational resources; and sublinear algorithms, a formalisation of ultra-fast computation where merely reading or storing the entire input is impractical, e.g., when processing massive datasets such as social networks or large databases.
We begin our investigation by studying structural properties of local algorithms, a large class of sublinear algorithms that includes property testers and is characterised by the inability to even see most of the input. We prove that, in this setting, queries – the main complexity measure – can be replaced with random samples. Applying this transformation yields, among other results, the state-of-the-art query lower bound for relaxed local decoders.
Focusing our attention onto property testers, we begin to chart the complexity-theoretic landscape arising from the classical vs. quantum and decision vs. verification questions in testing. We show that quantum hardware and communication with a powerful but untrusted prover are “orthogonal” resources, so that one cannot be substituted for the other. This implies all of the possible separations among the analogues of QMA, MA and BQP in the property-testing setting.
We conclude with a study of zero-knowledge for (classical) streaming algorithms, which receive one-pass access to the entirety of their input but only have sublinear space. Inspired by cryptographic tools, we construct commitment protocols that are unconditionally secure in the streaming model and can be leveraged to obtain zero-knowledge streaming interactive proofs – and, in particular, show that zero-knowledge is achievable in this model.
Technology and Testing
From early answer sheets filled in with number 2 pencils, to tests administered by mainframe computers, to assessments wholly constructed by computers, it is clear that technology is changing the field of educational and psychological measurement. The numerous and rapid advances have immediate impact on test creators, assessment professionals, and those who implement and analyze assessments. This comprehensive new volume brings together leading experts on the issues posed by technological applications in testing, with chapters on game-based assessment, testing with simulations, video assessment, computerized test development, large-scale test delivery, model choice, validity, and error issues. Including an overview of existing literature and ground-breaking research, each chapter examines the technological, practical, and ethical considerations of this rapidly changing area. Ideal for researchers and professionals in testing and assessment, Technology and Testing provides a critical and in-depth look at one of the most pressing topics in educational testing today.