Approximate degree in classical and quantum computing
In this book, the authors survey what is known about a particularly natural notion of approximation by polynomials: pointwise approximation over the real numbers.
On the Learnability of Monotone Functions
A longstanding lacuna in the field of computational learning theory is the learnability of succinctly representable monotone Boolean functions, i.e., functions that preserve the coordinatewise order of their inputs. This thesis makes significant progress towards understanding both the possibilities and the limitations of learning various classes of monotone functions by carefully considering the complexity measures used to evaluate them. We show that Boolean functions computed by polynomial-size monotone circuits are hard to learn assuming the existence of one-way functions. Having shown the hardness of learning general polynomial-size monotone circuits, we show that the class of Boolean functions computed by polynomial-size depth-3 monotone circuits is hard to learn using statistical queries. As a counterpoint, we give a statistical query learning algorithm that can learn random polynomial-size depth-2 monotone circuits (i.e., monotone DNF formulas). As a preliminary step towards a fully polynomial-time, proper learning algorithm for polynomial-size monotone decision trees, we also establish the relationship between the average depth of a monotone decision tree, its average sensitivity, and its variance. Finally, we return to monotone DNF formulas, and we show that they are teachable (a different model of learning) in the average case. We also show that non-monotone DNF formulas, juntas, and sparse GF(2) formulas are teachable in the average case.
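The decision-tree chapter relates a monotone tree's average depth to the function's average sensitivity and variance. As a hedged illustration of the two function-level quantities involved (not of the thesis's actual argument; the function `maj3` and all helper names are ours), here is an exact brute-force computation for a small Boolean function:

```python
from itertools import product

def avg_sensitivity_and_variance(f, n):
    """Exact average sensitivity and variance of f: {0,1}^n -> {0,1},
    computed by brute force over all 2^n inputs (feasible for small n)."""
    pts = list(product([0, 1], repeat=n))
    vals = [f(x) for x in pts]
    mean = sum(vals) / len(pts)
    var = sum((v - mean) ** 2 for v in vals) / len(pts)
    # Average sensitivity: expected number of single-bit flips
    # that change the value of f at a uniformly random input.
    total = 0
    for x in pts:
        for i in range(n):
            y = list(x)
            y[i] ^= 1
            if f(x) != f(tuple(y)):
                total += 1
    return total / len(pts), var

# 3-bit majority, a monotone function
maj3 = lambda x: int(sum(x) >= 2)
print(avg_sensitivity_and_variance(maj3, 3))  # -> (1.5, 0.25)
```

For majority on 3 bits, every input of Hamming weight 1 or 2 has sensitivity 2 and the rest have sensitivity 0, giving average sensitivity 1.5; the function is balanced, so its variance is 0.25.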
Stability is Stable: Connections between Replicability, Privacy, and Adaptive Generalization
The notion of replicable algorithms was introduced by Impagliazzo et al. [STOC '22] to describe randomized algorithms that are stable under the resampling of their inputs. More precisely, a replicable algorithm gives the same output with high probability when its randomness is fixed and it is run on a new i.i.d. sample drawn from the same distribution. Using replicable algorithms for data analysis can facilitate the verification of published results by ensuring that the results of an analysis will be the same with high probability, even when that analysis is performed on a new data set.
In this work, we establish new connections and separations between replicability and standard notions of algorithmic stability. In particular, we give sample-efficient algorithmic reductions between perfect generalization, approximate differential privacy, and replicability for a broad class of statistical problems. Conversely, we show any such equivalence must break down computationally: there exist statistical problems that are easy under differential privacy but that cannot be solved replicably without breaking public-key cryptography. Furthermore, these results are tight: our reductions are statistically optimal, and we show that any computational separation between DP and replicability must imply the existence of one-way functions. Our statistical reductions give a new algorithmic framework for translating between notions of stability, which we instantiate to answer several open questions in replicability and privacy. This includes giving sample-efficient replicable algorithms for various PAC learning, distribution estimation, and distribution testing problems, algorithmic amplification of δ in approximate DP, conversions from item-level to user-level privacy, and the existence of private agnostic-to-realizable learning reductions under structured distributions.
Comment: STOC 2023, minor typos fixed.
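The stability in the definition above can be made concrete with the random-rounding trick that recurs in the replicability literature: fix a shared random offset of a grid as the algorithm's randomness, then snap the empirical estimate to the nearest grid point, so two i.i.d. samples with nearby empirical means usually produce the identical canonical answer. A minimal sketch for mean estimation (a toy illustration under our own naming, not the paper's construction):

```python
import statistics

def replicable_mean(sample, offset, grid):
    """Snap the empirical mean to the nearest point of {offset + k*grid}.
    With the random offset fixed (shared randomness), two i.i.d. samples
    whose empirical means are close round to the same grid point unless
    a grid boundary falls between them."""
    m = statistics.mean(sample)
    k = round((m - offset) / grid)
    return offset + k * grid

# Two different samples, same shared offset -> same canonical output
print(replicable_mean([0.30, 0.50], offset=0.05, grid=0.2))
print(replicable_mean([0.35, 0.44], offset=0.05, grid=0.2))
```

Both calls print the same grid point (0.05 + 2 × 0.2 = 0.45), even though the two empirical means differ, which is exactly the resampling stability the definition asks for.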
On Tolerant Testing and Tolerant Junta Testing
Over the past few decades, property testing has become an active field of study in theoretical computer science. The algorithmic task is to determine, given access to an unknown large object (e.g., a function, graph, or probability distribution), whether it has some fixed property or is far from any object having the property. The approximate nature of these algorithms often allows a significant saving in running time, yielding sublinear-time algorithms. Nevertheless, in various settings and applications, accepting only inputs that exactly have a certain property is too restrictive, and it is more beneficial to distinguish between inputs that are close to having the property and those that are far from it. The framework of tolerant testing tackles exactly this problem. In this thesis, we focus on one of the most fundamental properties of Boolean functions: the property of being a k-junta (i.e., depending on at most k variables).
The first chapter focuses on algorithms for tolerant junta testing. In particular, we show that there exists a poly(k)-query algorithm distinguishing functions close to k-juntas from functions that are far from every k-junta. We also show how to obtain a trade-off between the tolerance of the algorithm and its query complexity.
The second chapter establishes a query lower bound for tolerant junta testing. In particular, we show that any non-adaptive tolerant junta tester must make at least Ω(k²/polylog k) queries.
The third chapter considers tolerant testing in a more general context and asks whether tolerant testing is strictly harder than standard testing. In particular, we show that for any constant ℓ there exists a property P_ℓ that can be tested with a small number of queries, yet any tolerant tester for P_ℓ requires a substantially larger number (the bounds involve the ℓ-times-iterated log function).
The final chapter focuses on applications. We show how to leverage the techniques developed in previous chapters to obtain results on tolerant isomorphism testing, unateness testing, and erasure-resilient testing.
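Junta testers in this line of work typically work by estimating coordinate influences: a function is close to a k-junta roughly when at most k coordinates carry non-negligible influence. As a hedged toy sketch of that underlying idea (not the thesis's algorithm; all names and parameters are ours), the following estimates each coordinate's influence by random bit flips:

```python
import random

def influential_coords(f, n, thresh=0.1, samples=2000, seed=0):
    """Estimate Inf_i(f) = Pr_x[f(x) != f(x with bit i flipped)] for
    each coordinate i, and return those whose estimate exceeds thresh."""
    rng = random.Random(seed)
    found = set()
    for i in range(n):
        flips = 0
        for _ in range(samples):
            x = [rng.randint(0, 1) for _ in range(n)]
            y = list(x)
            y[i] ^= 1  # flip coordinate i
            if f(x) != f(y):
                flips += 1
        if flips / samples > thresh:
            found.add(i)
    return found

# XOR of bits 0 and 3: a 2-junta on 6 variables
g = lambda x: x[0] ^ x[3]
print(influential_coords(g, 6))  # -> {0, 3}
```

Here coordinates 0 and 3 have influence 1 (flipping either always changes the XOR) while the rest have influence 0, so the sketch recovers exactly the junta's relevant coordinates. Real testers must of course work with far fewer queries and with tolerance parameters, which is precisely what the chapters above study.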
Data-Driven Recommender Systems: Sequences of recommendations
This document describes scalable and reliable methods for recommender systems from a machine-learning point of view. In particular, it addresses difficulties arising in the non-stationary case.
LIPIcs, Volume 251, ITCS 2023, Complete Volume
Privacy and the Complexity of Simple Queries
As both the scope and scale of data collection increase, an increasingly large amount of sensitive personal information is being analyzed. In this thesis, we study the feasibility of effectively carrying out such analyses while respecting the privacy concerns of all parties involved. In particular, we consider algorithms that satisfy differential privacy [30], a stringent notion of privacy that guarantees no individual's data has a significant influence on the information released about the database. Over the past decade, there has been tremendous progress in understanding when accurate data analysis is compatible with differential privacy, with both elegant algorithms and striking impossibility results. However, if we ask further when accurate and computationally efficient data analysis is compatible with differential privacy, then our understanding lags far behind. In this thesis, we make several contributions to understanding the complexity of differentially private data analysis. We show a sharp upper bound on the number of linear queries that can be accurately answered while satisfying differential privacy by an efficient algorithm, assuming the existence of cryptographic traitor-tracing schemes. We show even stronger computational barriers for algorithms that generate private synthetic data, i.e., a new database that consists of "fake" records but preserves certain statistical properties of the original database. Under cryptographic assumptions, any efficient differentially private algorithm that generates synthetic data cannot preserve even extremely simple properties of the database, such as the pairwise correlations between the attributes. On the positive side, we design new algorithms for the widely used class of marginal queries that are faster and require less data.
Computational inefficiency is not the only barrier to effective privacy-preserving data analysis. Another potential obstacle is that many of the existing differentially private algorithms do not guarantee privacy for the data analyst, which can lead researchers with sensitive or proprietary queries to seek other means of access to the database. We also contribute to our understanding of privacy for the analyst: we design new algorithms for answering large sets of queries that guarantee differential privacy for the database and ensure differential privacy for the analysts, even if all other analysts collude.
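For concreteness, the canonical building block for answering a single counting query under differential privacy is the Laplace mechanism (standard background, not one of the thesis's new algorithms): since a counting query has sensitivity 1, adding Laplace noise of scale 1/ε yields an ε-differentially private answer. A minimal sketch, with all names ours:

```python
import math
import random

def laplace_noise(scale, rng):
    """Sample Laplace(0, scale) via the inverse CDF of a uniform draw."""
    u = rng.random() - 0.5  # uniform in [-0.5, 0.5)
    return -scale * math.copysign(1.0, u) * math.log(1.0 - 2.0 * abs(u))

def private_count(true_count, epsilon, rng):
    """epsilon-DP answer to a counting query (sensitivity 1):
    true answer plus Laplace(1/epsilon) noise."""
    return true_count + laplace_noise(1.0 / epsilon, rng)

rng = random.Random(0)
answers = [private_count(42, 1.0, rng) for _ in range(10_000)]
print(sum(answers) / len(answers))  # unbiased: concentrates near 42
```

Each individual answer is noisy (standard deviation √2/ε), but the mechanism is unbiased, which is why averages of repeated noisy releases concentrate; the hardness results above concern answering many such queries simultaneously and efficiently, where this naive per-query approach breaks down.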
Untangling hotel industry’s inefficiency: An SFA approach applied to a renowned Portuguese hotel chain
The present paper explores the technical efficiency of four hotels from the Teixeira Duarte Group, a renowned Portuguese hotel chain. An efficiency ranking of these four hotel units located in Portugal is established using Stochastic Frontier Analysis. This methodology makes it possible to discriminate between measurement error and systematic inefficiency in the estimation process, enabling investigation of the main causes of inefficiency. Several suggestions for efficiency improvement are offered for each hotel studied.