8 research outputs found

    How to Sell Information Optimally: An Algorithmic Study

    Get PDF
    We investigate the algorithmic problem of selling information to agents who face a decision-making problem under uncertainty. We adopt the model recently proposed by Bergemann et al. [Bergemann et al., 2018], in which information is revealed through signaling schemes called experiments. In the single-agent setting, any mechanism can be represented as a menu of experiments. Our results show that the computational complexity of designing the revenue-optimal menu depends heavily on the way the model is specified. When all the parameters of the problem are given explicitly, we provide a polynomial time algorithm that computes the revenue-optimal menu. For cases where the model is specified with a succinct implicit description, we show that the tractability of the problem is tightly related to the efficient implementation of a Best Response Oracle: when it can be implemented efficiently, we provide an additive FPTAS whose running time is independent of the number of actions. On the other hand, we provide a family of problems, where it is computationally intractable to construct a best response oracle, and we show that it is NP-hard to get even a constant fraction of the optimal revenue. Moreover, we investigate a generalization of the original model by Bergemann et al. [Bergemann et al., 2018] that allows multiple agents to compete for useful information. We leverage techniques developed in the study of auction design (see e.g. [Yang Cai et al., 2012; Saeed Alaei et al., 2012; Yang Cai et al., 2012; Yang Cai et al., 2013; Yang Cai et al., 2013]) to design a polynomial time algorithm that computes the revenue-optimal mechanism for selling information

    Multiclass Learnability Beyond the PAC Framework: Universal Rates and Partial Concept Classes

    Full text link
    In this paper we study the problem of multiclass classification with a bounded number of different labels kk, in the realizable setting. We extend the traditional PAC model to a) distribution-dependent learning rates, and b) learning rates under data-dependent assumptions. First, we consider the universal learning setting (Bousquet, Hanneke, Moran, van Handel and Yehudayoff, STOC '21), for which we provide a complete characterization of the achievable learning rates that holds for every fixed distribution. In particular, we show the following trichotomy: for any concept class, the optimal learning rate is either exponential, linear or arbitrarily slow. Additionally, we provide complexity measures of the underlying hypothesis class that characterize when these rates occur. Second, we consider the problem of multiclass classification with structured data (such as data lying on a low dimensional manifold or satisfying margin conditions), a setting which is captured by partial concept classes (Alon, Hanneke, Holzman and Moran, FOCS '21). Partial concepts are functions that can be undefined in certain parts of the input space. We extend the traditional PAC learnability of total concept classes to partial concept classes in the multiclass setting and investigate differences between partial and total concepts

    Is Selling Complete Information (Approximately) Optimal?

    Get PDF
    We study the problem of selling information to a data-buyer who faces a decision problem under uncertainty. We consider the classic Bayesian decision-theoretic model pioneered by Blackwell [Bla51, Bla53]. Initially, the data buyer has only partial information about the payoff-relevant state of the world. A data seller offers additional information about the state of the world. The information is revealed through signaling schemes, also referred to as experiments. In the single-agent setting, any mechanism can be represented as a menu of experiments. A recent paper by Bergemann et al. [BBS18] present a complete characterization of the revenue-optimal mechanism in a binary state and binary action environment. By contrast, no characterization is known for the case with more actions. In this paper, we consider more general environments and study arguably the simplest mechanism, which only sells the fully informative experiment. In the environment with binary state and m ≥ 3 actions, we provide an O(m)-approximation to the optimal revenue by selling only the fully informative experiment and show that the approximation ratio is tight up to an absolute constant factor. An important corollary of our lower bound is that the size of the optimal menu must grow at least linearly in the number of available actions, so no universal upper bound exists for the size of the optimal menu in the general single-dimensional setting. We also provide a sufficient condition under which selling only the fully informative experiment achieves the optimal revenue. For multi-dimensional environments, we prove that even in arguably the simplest matching utility environment with 3 states and 3 actions, the ratio between the optimal revenue and the revenue by selling only the fully informative experiment can grow immediately to a polynomial of the number of agent types. Nonetheless, if the distribution is uniform, we show that selling only the fully informative experiment is indeed the optimal mechanism

    Replicability in Reinforcement Learning

    Full text link
    We initiate the mathematical study of replicability as an algorithmic property in the context of reinforcement learning (RL). We focus on the fundamental setting of discounted tabular MDPs with access to a generative model. Inspired by Impagliazzo et al. [2022], we say that an RL algorithm is replicable if, with high probability, it outputs the exact same policy after two executions on i.i.d. samples drawn from the generator when its internal randomness is the same. We first provide an efficient ρ\rho-replicable algorithm for (ε,δ)(\varepsilon, \delta)-optimal policy estimation with sample and time complexity O~(N3log(1/δ)(1γ)5ε2ρ2)\widetilde O\left(\frac{N^3\cdot\log(1/\delta)}{(1-\gamma)^5\cdot\varepsilon^2\cdot\rho^2}\right), where NN is the number of state-action pairs. Next, for the subclass of deterministic algorithms, we provide a lower bound of order Ω(N3(1γ)3ε2ρ2)\Omega\left(\frac{N^3}{(1-\gamma)^3\cdot\varepsilon^2\cdot\rho^2}\right). Then, we study a relaxed version of replicability proposed by Kalavasis et al. [2023] called TV indistinguishability. We design a computationally efficient TV indistinguishable algorithm for policy estimation whose sample complexity is O~(N2log(1/δ)(1γ)5ε2ρ2)\widetilde O\left(\frac{N^2\cdot\log(1/\delta)}{(1-\gamma)^5\cdot\varepsilon^2\cdot\rho^2}\right). At the cost of exp(N)\exp(N) running time, we transform these TV indistinguishable algorithms to ρ\rho-replicable ones without increasing their sample complexity. Finally, we introduce the notion of approximate-replicability where we only require that two outputted policies are close under an appropriate statistical divergence (e.g., Renyi) and show an improved sample complexity of O~(Nlog(1/δ)(1γ)5ε2ρ2)\widetilde O\left(\frac{N\cdot\log(1/\delta)}{(1-\gamma)^5\cdot\varepsilon^2\cdot\rho^2}\right).Comment: to be published in neurips 202

    Replicable Clustering

    Full text link
    We design replicable algorithms in the context of statistical clustering under the recently introduced notion of replicability from Impagliazzo et al. [2022]. According to this definition, a clustering algorithm is replicable if, with high probability, its output induces the exact same partition of the sample space after two executions on different inputs drawn from the same distribution, when its internal randomness is shared across the executions. We propose such algorithms for the statistical kk-medians, statistical kk-means, and statistical kk-centers problems by utilizing approximation routines for their combinatorial counterparts in a black-box manner. In particular, we demonstrate a replicable O(1)O(1)-approximation algorithm for statistical Euclidean kk-medians (kk-means) with poly(d)\operatorname{poly}(d) sample complexity. We also describe an O(1)O(1)-approximation algorithm with an additional O(1)O(1)-additive error for statistical Euclidean kk-centers, albeit with exp(d)\exp(d) sample complexity. In addition, we provide experiments on synthetic distributions in 2D using the kk-means++ implementation from sklearn as a black-box that validate our theoretical results.Comment: to be published in NeurIPS 202

    Replicable Bandits

    Full text link
    In this paper, we introduce the notion of replicable policies in the context of stochastic bandits, one of the canonical problems in interactive learning. A policy in the bandit environment is called replicable if it pulls, with high probability, the exact same sequence of arms in two different and independent executions (i.e., under independent reward realizations). We show that not only do replicable policies exist, but also they achieve almost the same optimal (non-replicable) regret bounds in terms of the time horizon. More specifically, in the stochastic multi-armed bandits setting, we develop a policy with an optimal problem-dependent regret bound whose dependence on the replicability parameter is also optimal. Similarly, for stochastic linear bandits (with finitely and infinitely many arms) we develop replicable policies that achieve the best-known problem-independent regret bounds with an optimal dependency on the replicability parameter. Our results show that even though randomization is crucial for the exploration-exploitation trade-off, an optimal balance can still be achieved while pulling the exact same arms in two different rounds of executions

    Optimal Learners for Realizable Regression: PAC Learning and Online Learning

    Full text link
    In this work, we aim to characterize the statistical complexity of realizable regression both in the PAC learning setting and the online learning setting. Previous work had established the sufficiency of finiteness of the fat shattering dimension for PAC learnability and the necessity of finiteness of the scaled Natarajan dimension, but little progress had been made towards a more complete characterization since the work of Simon 1997 (SICOMP '97). To this end, we first introduce a minimax instance optimal learner for realizable regression and propose a novel dimension that both qualitatively and quantitatively characterizes which classes of real-valued predictors are learnable. We then identify a combinatorial dimension related to the Graph dimension that characterizes ERM learnability in the realizable setting. Finally, we establish a necessary condition for learnability based on a combinatorial dimension related to the DS dimension, and conjecture that it may also be sufficient in this context. Additionally, in the context of online learning we provide a dimension that characterizes the minimax instance optimal cumulative loss up to a constant factor and design an optimal online learner for realizable regression, thus resolving an open question raised by Daskalakis and Golowich in STOC '22
    corecore