    Discovering Valuable Items from Massive Data

    Suppose there is a large collection of items, each with an associated cost and an inherent utility that is revealed only once we commit to selecting it. Given a budget on the cumulative cost of the selected items, how can we pick a subset of maximal value? This task generalizes several important problems such as multi-arm bandits, active search and the knapsack problem. We present an algorithm, GP-Select, which utilizes prior knowledge about similarity between items, expressed as a kernel function. GP-Select uses Gaussian process prediction to balance exploration (estimating the unknown value of items) and exploitation (selecting items of high value). We extend GP-Select to be able to discover sets that simultaneously have high utility and are diverse. Our preference for diversity can be specified as an arbitrary monotone submodular function that quantifies the diminishing returns obtained when selecting similar items. Furthermore, we exploit the structure of the model updates to achieve an order of magnitude (up to 40X) speedup in our experiments without resorting to approximations. We provide strong guarantees on the performance of GP-Select and apply it to three real-world case studies of industrial relevance: (1) Refreshing a repository of prices in a Global Distribution System for the travel industry, (2) Identifying diverse, binding-affine peptides in a vaccine design task and (3) Maximizing clicks in a web-scale recommender system by recommending items to users.
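The exploration/exploitation trade-off described in this abstract can be illustrated with a minimal sketch. It is not the authors' GP-Select implementation: it assumes an RBF kernel, a fixed noise level, and an upper-confidence-bound rule (posterior mean plus `beta` times posterior standard deviation), and the names `rbf_kernel`, `gp_posterior`, `select_items`, `observe`, and `beta` are hypothetical. It also recomputes the posterior from scratch at every step, so it does not show the update-structure speedup the abstract mentions, nor the diversity extension.

```python
import numpy as np

def rbf_kernel(X, Y, lengthscale=1.0):
    # Similarity between items; the RBF form is an assumption for illustration.
    d2 = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(-1)
    return np.exp(-0.5 * d2 / lengthscale**2)

def gp_posterior(K, picked, y, noise=1e-2):
    # Zero-mean GP posterior mean and standard deviation for every item,
    # conditioned on the utilities y observed at the picked items.
    n = K.shape[0]
    if not picked:
        return np.zeros(n), np.sqrt(np.diag(K))
    K_ss = K[np.ix_(picked, picked)] + noise * np.eye(len(picked))
    K_as = K[:, picked]
    solved = np.linalg.solve(K_ss, K_as.T)      # shape (len(picked), n)
    mean = solved.T @ np.asarray(y, dtype=float)
    var = np.diag(K) - (K_as * solved.T).sum(axis=1)
    return mean, np.sqrt(np.clip(var, 0.0, None))

def select_items(features, costs, budget, observe, beta=2.0):
    # Repeatedly pick the affordable, not-yet-selected item with the highest
    # optimistic score; its utility is revealed only after committing to it.
    K = rbf_kernel(features, features)
    picked, y, spent = [], [], 0.0
    while True:
        mean, std = gp_posterior(K, picked, y)
        score = mean + beta * std               # assumed UCB-style rule
        affordable = [i for i in range(len(costs))
                      if i not in picked and spent + costs[i] <= budget]
        if not affordable:
            break
        best = max(affordable, key=lambda i: score[i])
        picked.append(best)
        y.append(observe(best))
        spent += costs[best]
    return picked
```

Here `observe` stands in for whatever process reveals an item's utility once it is selected; a larger `beta` favors items whose value is still uncertain, a smaller one favors items already predicted to be valuable.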

    Liquid Welfare guarantees for No-Regret Learning in Sequential Budgeted Auctions

    We study the liquid welfare in repeated first-price auctions with budget limited buyers. We focus on first-price auctions, which are commonly used in many settings, and consider liquid welfare, a natural and well-studied generalization of social welfare for the case of budget-limited buyers. We use a behavioral model for the buyers, assuming a learning style guarantee: the resulting utility of each buyer is within a $\gamma$ factor (where $\gamma \ge 1$) of the utility achievable by shading her value with the same factor at each iteration. We show a $\gamma + 1/2 + O(1/\gamma)$ price of anarchy for liquid welfare assuming buyers have additive valuations. This positive result is in stark contrast to repeated second-price auctions, where even with $\gamma = 1$, the resulting liquid welfare can be arbitrarily smaller than the maximum liquid welfare. We prove a lower bound of $\gamma$ on the liquid welfare loss under the above assumption in first-price auctions, making our bound asymptotically tight. For the case when $\gamma = 1$ our theorem implies a price of anarchy upper bound that is about $2.4$; we show a lower bound of $2$ for that case. We also give a learning algorithm that the players can use to achieve the guarantee needed for our liquid welfare result. Our algorithm achieves utility within a $\gamma = O(1)$ factor of the optimal utility even when a buyer's values and the bids of the other buyers are chosen adversarially, assuming the buyer's budget grows linearly with time. The competitiveness guarantee of the learning algorithm deteriorates somewhat as the budget grows slower than linearly with time. Finally, we extend our liquid welfare results for the case where buyers have submodular valuations over the set of items they win across iterations with a slightly worse price of anarchy bound of $\gamma + 1 + O(1/\gamma)$ compared to the guarantee for the additive case.
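For readers unfamiliar with the objective, the following states liquid welfare as it is usually defined in the literature; the abstract itself does not spell it out. The symbols $B_i$ (buyer $i$'s budget), $v_i$ (her valuation), and $x_i$ (the items she wins across all iterations) are notation assumed here for illustration.

```latex
\[
  \mathrm{LW}(x) \;=\; \sum_{i} \min\bigl\{ B_i,\; v_i(x_i) \bigr\},
  \qquad
  \text{price of anarchy} \;=\; \frac{\max_{x'} \mathrm{LW}(x')}{\mathrm{LW}(x)} .
\]
```

Under this reading, each buyer's contribution is capped by her budget, so with unlimited budgets the quantity reduces to ordinary social welfare.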

    SEQUENTIAL DECISION MAKING WITH LIMITED RESOURCES

    One of the goals of Artificial Intelligence (AI) is to enable multiple agents to interact, co-ordinate and compete with each other to realize various goals. Typically, this is achieved via a system which acts as a mediator to control the agents' behavior via incentives. Such systems are ubiquitous and include online systems for shopping (e.g., Amazon), ride-sharing (e.g., Uber, Lyft) and Internet labor markets (e.g., Mechanical Turk). The main algorithmic challenge in such systems is to ensure that they can operate under a variety of informational constraints such as uncertainty in the input, committing to actions based on partial information or being unaffected by noisy input. The mathematical frameworks used to study such systems are broadly called sequential decision making problems, where the algorithm does not receive the entire input at once; it obtains parts of the input through interactions (also called "actions") with the environment. In this thesis, we answer the question: under what informational constraints can we design efficient algorithms for sequential decision making problems? The first part of the thesis deals with the Online Matching problem. Here, the algorithm deals with two prominent constraints: uncertainty in the input and choice of actions being restricted by a combinatorial constraint. We design several new algorithms for many variants of this problem and provide provable guarantees. We also show their efficacy on the ride-share application using a real-world dataset. In the second part of the thesis, we consider the Multi-armed bandit problem with additional informational constraints. In this setting, the algorithm does not receive the entire input and needs to make decisions based on partial observations. Additionally, the set of possible actions is controlled by global resource constraints that bind across time. We design new algorithms for multiple variants of this problem that are worst-case optimal. We provide a general reduction framework to the classic multi-armed bandits problem without any constraints. We complement some of the results with preliminary numerical experiments.

    Approximation algorithms for geometric dispersion

    The most basic form of the max-sum dispersion problem (MSD) is as follows: given n points in R^q and an integer k, select a set of k points such that the sum of the pairwise distances within the set is maximal. This is a prominent diversity problem, with wide applications in web search and information retrieval, where one needs to find a small and diverse representative subset of a large dataset. The problem has recently received a great deal of attention in the computational geometry and operations research communities; and since it is NP-hard, research has focused on efficient heuristics and approximation algorithms. Several classes of distance functions have been considered in the literature. Many of the most common distances used in applications are induced by a norm in a real vector space. The focus of this thesis is on MSD over these geometric instances. We provide for it simple and fast polynomial-time approximation schemes (PTASs), as well as improved constant-factor approximation algorithms. We pay special attention to the class of negative-type distances, a class that includes Euclidean and Manhattan distances, among many others. In order to exploit the properties of this class, we apply several techniques and results from the theory of isometric embeddings. We explore the following variations of the MSD problem: matroid and matroid-intersection constraints, knapsack constraints, and the mixed-objective problem that maximizes a combination of the sum of pairwise distances with a submodular monotone function. In addition to approximation algorithms, we present a core-set for geometric instances of low dimension, and we discuss the efficient implementation of some of our algorithms for massive datasets, using the streaming and distributed models of computation
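To make the max-sum dispersion objective concrete, here is a minimal sketch of a simple greedy heuristic for Euclidean points. It is not one of the approximation schemes from the thesis: it assumes Euclidean distance, starts from the farthest pair, and then repeatedly adds the point with the largest total distance to the points already chosen. The function names `greedy_max_sum_dispersion` and `dispersion` are hypothetical.

```python
import numpy as np

def greedy_max_sum_dispersion(points, k):
    # Pick k of the n points, trying to make the sum of pairwise
    # Euclidean distances within the chosen set large.
    points = np.asarray(points, dtype=float)
    n = len(points)
    if not 2 <= k <= n:
        raise ValueError("k must satisfy 2 <= k <= number of points")
    # Full pairwise distance matrix, O(n^2) memory.
    D = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=-1)
    # Seed with the farthest pair.
    i, j = np.unravel_index(np.argmax(D), D.shape)
    chosen = [int(i), int(j)]
    # gain[p] = sum of distances from point p to the chosen points.
    gain = D[i] + D[j]
    while len(chosen) < k:
        masked = gain.copy()
        masked[chosen] = -np.inf        # never re-pick a chosen point
        p = int(np.argmax(masked))
        chosen.append(p)
        gain = gain + D[p]
    return chosen

def dispersion(points, chosen):
    # Objective value: sum of pairwise distances within the chosen set.
    P = np.asarray(points, dtype=float)[chosen]
    D = np.linalg.norm(P[:, None, :] - P[None, :, :], axis=-1)
    return D.sum() / 2.0
```

The quadratic distance matrix makes this unsuitable for the massive datasets the abstract has in mind; it is meant only to show what the objective rewards.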

    On Connections Between Machine Learning And Information Elicitation, Choice Modeling, And Theoretical Computer Science

    Machine learning, which has its origins at the intersection of computer science and statistics, is now a rapidly growing area of research that is being integrated into almost every discipline in science and business such as economics, marketing and information retrieval. As a consequence of this integration, it is necessary to understand how machine learning interacts with these disciplines and to understand fundamental questions that arise at the resulting interfaces. The goal of my thesis research is to study these interdisciplinary questions at the interface of machine learning and other disciplines including mechanism design/information elicitation, preference/choice modeling, and theoretical computer science