67,143 research outputs found

    A Direct Reputation Model for VO Formation

    No full text
    We show that reputation is a basic ingredient in the Virtual Organisation (VO) formation process. Agents can use their experiences gained in direct past interactions to model other’s reputation and deciding on either join a VO or determining who is the most suitable set of partners. Reputation values are computed using a reinforcement learning algorithm, so agents can learn and adapt their reputation models of their partners according to their recent behaviour. Our approach is especially powerful if the agent participates in a VO in which the members can change their behaviour to exploit their partners. The reputation model presented in this paper deals with the questions of deception and fraud that have been ignored in current models of VO formation

    The Influence of Regret Proneness, Evidence Strengthening, and Perceived Responsibility on Verdict Preference

    Get PDF
    In the present study, we investigated perceived responsibility, evidence strengthening, and defendant gender in the context of a criminal trial involving DNA. Evidence was introduced post-trial and varied as strengthening the defendant’s guilt v. innocence. We also examined perceptions of perceived responsibility for verdict in order to more closely evaluate the role of regret in decision-making. Results indicated that DNA evidence is perceived as reliable, regardless of whether it strengthened guilt or innocence. In addition, greater confidence in verdict was observed when evidence strengthened the guilt of a female defendant vs. a male defendant. Finally, jurors experiencing high levels of regret perceived DNA evidence more selectively compared to jurors with low levels of regret, supporting the importance of identifying individual difference factors prior to trial

    An Efficient Bandit Algorithm for Realtime Multivariate Optimization

    Full text link
    Optimization is commonly employed to determine the content of web pages, such as to maximize conversions on landing pages or click-through rates on search engine result pages. Often the layout of these pages can be decoupled into several separate decisions. For example, the composition of a landing page may involve deciding which image to show, which wording to use, what color background to display, etc. Such optimization is a combinatorial problem over an exponentially large decision space. Randomized experiments do not scale well to this setting, and therefore, in practice, one is typically limited to optimizing a single aspect of a web page at a time. This represents a missed opportunity in both the speed of experimentation and the exploitation of possible interactions between layout decisions. Here we focus on multivariate optimization of interactive web pages. We formulate an approach where the possible interactions between different components of the page are modeled explicitly. We apply bandit methodology to explore the layout space efficiently and use hill-climbing to select optimal content in realtime. Our algorithm also extends to contextualization and personalization of layout selection. Simulation results show the suitability of our approach to large decision spaces with strong interactions between content. We further apply our algorithm to optimize a message that promotes adoption of an Amazon service. After only a single week of online optimization, we saw a 21% conversion increase compared to the median layout. Our technique is currently being deployed to optimize content across several locations at Amazon.com.Comment: KDD'17 Audience Appreciation Awar

    Unbiased Offline Evaluation of Contextual-bandit-based News Article Recommendation Algorithms

    Full text link
    Contextual bandit algorithms have become popular for online recommendation systems such as Digg, Yahoo! Buzz, and news recommendation in general. \emph{Offline} evaluation of the effectiveness of new algorithms in these applications is critical for protecting online user experiences but very challenging due to their "partial-label" nature. Common practice is to create a simulator which simulates the online environment for the problem at hand and then run an algorithm against this simulator. However, creating simulator itself is often difficult and modeling bias is usually unavoidably introduced. In this paper, we introduce a \emph{replay} methodology for contextual bandit algorithm evaluation. Different from simulator-based approaches, our method is completely data-driven and very easy to adapt to different applications. More importantly, our method can provide provably unbiased evaluations. Our empirical results on a large-scale news article recommendation dataset collected from Yahoo! Front Page conform well with our theoretical results. Furthermore, comparisons between our offline replay and online bucket evaluation of several contextual bandit algorithms show accuracy and effectiveness of our offline evaluation method.Comment: 10 pages, 7 figures, revised from the published version at the WSDM 2011 conferenc

    Analysis of Different Types of Regret in Continuous Noisy Optimization

    Get PDF
    The performance measure of an algorithm is a crucial part of its analysis. The performance can be determined by the study on the convergence rate of the algorithm in question. It is necessary to study some (hopefully convergent) sequence that will measure how "good" is the approximated optimum compared to the real optimum. The concept of Regret is widely used in the bandit literature for assessing the performance of an algorithm. The same concept is also used in the framework of optimization algorithms, sometimes under other names or without a specific name. And the numerical evaluation of convergence rate of noisy algorithms often involves approximations of regrets. We discuss here two types of approximations of Simple Regret used in practice for the evaluation of algorithms for noisy optimization. We use specific algorithms of different nature and the noisy sphere function to show the following results. The approximation of Simple Regret, termed here Approximate Simple Regret, used in some optimization testbeds, fails to estimate the Simple Regret convergence rate. We also discuss a recent new approximation of Simple Regret, that we term Robust Simple Regret, and show its advantages and disadvantages.Comment: Genetic and Evolutionary Computation Conference 2016, Jul 2016, Denver, United States. 201

    Learning what matters - Sampling interesting patterns

    Get PDF
    In the field of exploratory data mining, local structure in data can be described by patterns and discovered by mining algorithms. Although many solutions have been proposed to address the redundancy problems in pattern mining, most of them either provide succinct pattern sets or take the interests of the user into account-but not both. Consequently, the analyst has to invest substantial effort in identifying those patterns that are relevant to her specific interests and goals. To address this problem, we propose a novel approach that combines pattern sampling with interactive data mining. In particular, we introduce the LetSIP algorithm, which builds upon recent advances in 1) weighted sampling in SAT and 2) learning to rank in interactive pattern mining. Specifically, it exploits user feedback to directly learn the parameters of the sampling distribution that represents the user's interests. We compare the performance of the proposed algorithm to the state-of-the-art in interactive pattern mining by emulating the interests of a user. The resulting system allows efficient and interleaved learning and sampling, thus user-specific anytime data exploration. Finally, LetSIP demonstrates favourable trade-offs concerning both quality-diversity and exploitation-exploration when compared to existing methods.Comment: PAKDD 2017, extended versio
    • …
    corecore