    Self-Paced Multi-Task Learning

    In this paper, we propose a novel multi-task learning (MTL) framework, called Self-Paced Multi-Task Learning (SPMTL). Unlike previous works that treat all tasks and instances equally during training, SPMTL attempts to jointly learn the tasks by taking into consideration the complexities of both tasks and instances. This is inspired by the cognitive process of the human brain, which often learns from easy to hard. We construct a compact SPMTL formulation by proposing a new task-oriented regularizer that can jointly prioritize the tasks and the instances, so the framework can be interpreted as a self-paced learner for MTL. A simple yet effective algorithm is designed for optimizing the proposed objective function, and an error bound for a simplified formulation is analyzed theoretically. Experimental results on toy and real-world datasets demonstrate the effectiveness of the proposed approach compared to state-of-the-art methods.
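
    As an informal illustration of the easy-to-hard training loop described above, the following sketch alternates between refitting per-task linear models on the instances whose loss falls below a growing "pace" threshold and then admitting harder instances in later rounds. The hard binary instance weighting, squared loss, and hyperparameters are illustrative assumptions; this is not the paper's task-oriented regularizer.

        # Minimal sketch of self-paced multi-task training, assuming linear models,
        # squared loss, and hard (binary) self-paced instance weights.
        import numpy as np

        def spmtl_sketch(tasks, pace=0.5, growth=1.3, rounds=10, lr=0.1, epochs=50):
            """tasks: list of (X, y) pairs, one per task."""
            W = [np.zeros(X.shape[1]) for X, _ in tasks]
            for _ in range(rounds):
                # Mark as "easy" the instances whose current loss is below the pace.
                weights = []
                for w, (X, y) in zip(W, tasks):
                    losses = (X @ w - y) ** 2
                    weights.append((losses < pace).astype(float))
                # Refit each task model on its currently easy instances.
                for t, (X, y) in enumerate(tasks):
                    v = weights[t]
                    for _ in range(epochs):
                        grad = X.T @ (v * (X @ W[t] - y)) / max(v.sum(), 1.0)
                        W[t] -= lr * grad
                pace *= growth  # admit harder instances as training progresses
            return W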

    A Hierarchical Bayesian Trust Model based on Reputation and Group Behaviour

    In many systems, agents must rely on their peers to achieve their goals. However, when trusted to perform an action, an agent may betray that trust by not behaving as required. Agents must therefore estimate the behaviour of their peers, so that they may identify reliable interaction partners. To this end, we present a hierarchical Bayesian trust model (HABIT) for assessing trust based on direct experience and (potentially unreliable) reputation. Although existing approaches claim to achieve this, most rely on heuristics with little theoretical foundation. In contrast, HABIT is based on principled statistical techniques; can be used with any representation of behaviour; and can assess trust based on observed similarities between groups of agents. In this paper, we describe the theoretical aspects of the model and present experimental results in which HABIT was shown to be up to twice as accurate at predicting trustee performance as an existing state-of-the-art trust model.
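
    As a rough illustration of Bayesian trust assessment from direct experience and (potentially unreliable) reputation, the sketch below maintains a Beta-Bernoulli posterior over a trustee's reliability and down-weights third-party reports by an assumed reporter-reliability factor. HABIT's hierarchical, representation-agnostic model and its use of group similarities are not captured here.

        # Minimal sketch of Beta-Bernoulli trust updating; the reliability discount
        # on reputation reports is an illustrative assumption, not HABIT's mechanism.
        from dataclasses import dataclass

        @dataclass
        class TrustEstimate:
            alpha: float = 1.0  # pseudo-count of successful interactions
            beta: float = 1.0   # pseudo-count of failed interactions

            def update_direct(self, success: bool) -> None:
                if success:
                    self.alpha += 1.0
                else:
                    self.beta += 1.0

            def update_reputation(self, successes: float, failures: float,
                                  reliability: float) -> None:
                # Down-weight third-party reports by how reliable the reporter seems.
                self.alpha += reliability * successes
                self.beta += reliability * failures

            def expected_trust(self) -> float:
                # Posterior mean probability that the trustee behaves as required.
                return self.alpha / (self.alpha + self.beta)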

    The Sample Complexity of Auctions with Side Information

    Traditionally, the Bayesian optimal auction design problem has been considered either when the bidder values are i.i.d., or when each bidder is individually identifiable via her value distribution. The latter is a reasonable approach when the bidders can be classified into a few categories, but there are many instances where the classification of bidders is a continuum. For example, the classification of the bidders may be based on their annual income, their propensity to buy an item based on past behavior, or, in the case of ad auctions, the click-through rate of their ads. We introduce an alternate model that captures this aspect, where bidders are a priori identical but can be distinguished based (only) on some side information the auctioneer obtains at the time of the auction. We extend the sample complexity approach of Dhangwatnotai et al. and Cole and Roughgarden to this model and obtain almost matching upper and lower bounds. As an aside, we obtain a revenue monotonicity lemma which may be of independent interest. We also show how to use Empirical Risk Minimization techniques to improve the sample complexity bound of Cole and Roughgarden for the non-identical but independent value distribution case.
    Comment: A version of this paper appeared in STOC 201
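
    To make the Empirical Risk Minimization idea concrete, the sketch below chooses a posted (reserve) price that maximizes empirical revenue over a set of sampled values; it illustrates ERM over prices only, under an assumed single-bidder posted-price setting, and does not model side information or reproduce the paper's bounds.

        # Minimal sketch of empirical revenue maximization over reserve prices.
        import numpy as np

        def erm_reserve(value_samples):
            """Return the sample value that maximizes empirical revenue as a reserve."""
            values = np.sort(np.asarray(value_samples, dtype=float))
            n = len(values)
            # Posting price values[i] sells to the (n - i) samples with value >= values[i].
            revenues = values * (n - np.arange(n))
            best = int(np.argmax(revenues))
            return values[best], revenues[best] / n  # reserve, avg revenue per sample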

    DTR Bandit: Learning to Make Response-Adaptive Decisions With Low Regret

    Dynamic treatment regimes (DTRs) are personalized, adaptive, multi-stage treatment plans that adapt treatment decisions both to an individual's initial features and to intermediate outcomes and features at each subsequent stage, which are affected by decisions in prior stages. Examples include personalized first- and second-line treatments of chronic conditions like diabetes, cancer, and depression, which adapt to patient response to first-line treatment, disease progression, and individual characteristics. While the existing literature mostly focuses on estimating the optimal DTR from offline data, such as from sequentially randomized trials, we study the problem of developing the optimal DTR in an online manner, where the interaction with each individual affects both our cumulative reward and our data collection for future learning. We term this the DTR bandit problem. We propose a novel algorithm that, by carefully balancing exploration and exploitation, is guaranteed to achieve rate-optimal regret when the transition and reward models are linear. We demonstrate our algorithm and its benefits both in synthetic experiments and in a case study of adaptive treatment of major depressive disorder using real-world data.
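
    As a loose illustration of online, response-adaptive decision making with linear reward models, the sketch below keeps one ridge-regression estimate per stage-action pair for a two-stage regime and chooses actions epsilon-greedily. The epsilon-greedy rule and the class name are illustrative stand-ins, not the paper's rate-optimal exploration-exploitation scheme.

        # Minimal sketch of a two-stage, response-adaptive (DTR-style) bandit with
        # linear reward models; exploration here is plain epsilon-greedy.
        import numpy as np

        class TwoStageLinearDTR:
            def __init__(self, dim, n_actions, eps=0.1, ridge=1.0):
                self.eps = eps
                self.n_actions = n_actions
                # One ridge-regression reward model per (stage, action) pair.
                self.A = [[ridge * np.eye(dim) for _ in range(n_actions)] for _ in range(2)]
                self.b = [[np.zeros(dim) for _ in range(n_actions)] for _ in range(2)]

            def choose(self, stage, x, rng):
                if rng.random() < self.eps:
                    return int(rng.integers(self.n_actions))   # explore
                estimates = [x @ np.linalg.solve(self.A[stage][a], self.b[stage][a])
                             for a in range(self.n_actions)]
                return int(np.argmax(estimates))               # exploit current model

            def update(self, stage, action, x, reward):
                self.A[stage][action] += np.outer(x, x)
                self.b[stage][action] += reward * x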