Reinstated episodic context guides sampling-based decisions for reward.

Author: Aaron M Bornstein
AM Bornstein
B Lau
BD Bernheim
D Shohamy
DH Brainard
G Schwarz
GE Wimmer
I Erev
KA Norman
Kenneth A Norman
MN Shadlen
MW Howard
ND Daw
PB Sederberg
R Epstein
SJ Gershman
SM Polyn
SM Smith
TEJ Behrens
Publication venue: eScholarship, University of California
Publication date: 01/07/2017

How does experience inform decisions? In episodic sampling, decisions are guided by a few episodic memories of past choices. This process can yield choice patterns similar to model-free reinforcement learning; however, samples can vary from trial to trial, causing decisions to vary. Here we show that context retrieved during episodic sampling can cause choice behavior to deviate sharply from the predictions of reinforcement learning. Specifically, we show that, when a given memory is sampled, choices (in the present) are influenced by the properties of other decisions made in the same context as the sampled event. This effect is mediated by fMRI measures of context retrieval on each trial, suggesting a mechanism whereby cues trigger retrieval of context, which then triggers retrieval of other decisions from that context. This result establishes a new avenue by which experience can guide choice and, as such, has broad implications for the study of decisions.

Crossref

eScholarship - University of California

Adaptive Contract Design for Crowdsourcing Markets: Bandit Algorithms for Repeated Principal-Agent Problems

Author: Ho Chien-Ju
Slivkins Aleksandrs
Vaughan Jennifer Wortman
Publication date: 02/09/2015

Crowdsourcing markets have emerged as a popular platform for matching available workers with tasks to complete. The payment for a particular task is typically set by the task's requester, and may be adjusted based on the quality of the completed work, for example, through the use of "bonus" payments. In this paper, we study the requester's problem of dynamically adjusting quality-contingent payments for tasks. We consider a multi-round version of the well-known principal-agent model, whereby in each round a worker makes a strategic choice of the effort level which is not directly observable by the requester. In particular, our formulation significantly generalizes the budget-free online task pricing problems studied in prior work. We treat this problem as a multi-armed bandit problem, with each "arm" representing a potential contract. To cope with the large (and in fact, infinite) number of arms, we propose a new algorithm, AgnosticZooming, which discretizes the contract space into a finite number of regions, effectively treating each region as a single arm. This discretization is adaptively refined, so that more promising regions of the contract space are eventually discretized more finely. We analyze this algorithm, showing that it achieves regret sublinear in the time horizon and substantially improves over non-adaptive discretization (which is the only competing approach in the literature). Our results advance the state of the art on several different topics: the theory of crowdsourcing markets, principal-agent problems, multi-armed bandits, and dynamic pricing.

Comment: This is the full version of a paper in the ACM Conference on Economics and Computation (ACM-EC), 201
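
The non-adaptive discretization that the abstract names as the competing approach can be illustrated with a minimal sketch. This is not AgnosticZooming and not the authors' code: it assumes a one-dimensional contract space [0, 1], a fixed evenly spaced grid, the standard UCB1 index as the bandit rule, and a made-up reward function (`toy_reward`) whose mean peaks at 0.6. All of these names and choices are hypothetical.

```python
import math
import random


def ucb1_on_grid(reward_fn, n_arms, horizon, rng):
    """Run UCB1 over n_arms evenly spaced points in [0, 1] (fixed, non-adaptive grid)."""
    # Each grid cell is represented by its midpoint and treated as one arm.
    arms = [(i + 0.5) / n_arms for i in range(n_arms)]
    counts = [0] * n_arms   # number of pulls per arm
    sums = [0.0] * n_arms   # total reward collected per arm
    for t in range(1, horizon + 1):
        if t <= n_arms:
            i = t - 1  # initialization: pull every arm once
        else:
            # UCB1 index: empirical mean plus an exploration bonus.
            i = max(
                range(n_arms),
                key=lambda k: sums[k] / counts[k]
                + math.sqrt(2 * math.log(t) / counts[k]),
            )
        counts[i] += 1
        sums[i] += reward_fn(arms[i], rng)
    return arms, counts


def toy_reward(x, rng):
    """Hypothetical Bernoulli reward whose mean peaks at x = 0.6 (an assumption)."""
    mean = 1.0 - abs(x - 0.6)
    return 1.0 if rng.random() < mean else 0.0


rng = random.Random(0)
arms, counts = ucb1_on_grid(toy_reward, n_arms=10, horizon=2000, rng=rng)
most_pulled = arms[max(range(len(arms)), key=lambda k: counts[k])]
print("most pulled contract:", most_pulled)
```

The grid here never changes, so resolution near the best contract is limited by `n_arms`; the adaptive refinement described in the abstract is meant to address exactly that limitation.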

arXiv.org e-Print Archive

CiteSeerX

The Allocation of Software Development Resources In ‘Open Source’ Production Mode

Author: Jean-Michel Dalle
Paul David

This paper aims to develop a stochastic simulation structure capable of describing the decentralized, micro-level decisions that allocate programming resources both within and among open source/free software (OS/FS) projects, and that thereby generate an array of OS/FS system products each of which possesses particular qualitative attributes. The core or behavioral kernel of the simulation tool presented here represents the effects of the reputational reward structure of OS/FS communities (as characterized by Raymond 1998) to be the key mechanism governing the probabilistic allocation of agents’ individual contributions among the constituent components of an evolving software system. In this regard, our approach follows the institutional analysis approach associated with studies of academic researchers in “open science” communities. For the purposes of this first step, the focus of the analysis is confined to showing the ways in which the specific norms of the reward system and organizational rules can shape emergent properties of successive releases of code for a given project, such as its range of functions and reliability. The global performance of the OS/FS mode, in matching the functional and other characteristics of the variety of software systems that are produced with the needs of users in various sectors of the economy and polity, obviously, is a matter of considerable importance that will bear upon the long-term viability and growth of this mode of organizing production and distribution. Our larger objective, therefore, is to arrive at a parsimonious characterization of the workings of OS/FS communities engaged across a number of projects, and their collective productive performance in dimensions that are amenable to “social welfare” evaluation. Seeking that goal will pose further new and interesting problems for study, a number of which are identified in the essay’s conclusion. Yet, it is argued that these too will be found to be tractable within the framework provided by refining and elaborating on the core (“proof of concept”) model that is presented in this paper.

Research Papers in Economics

Decision-making model for adaptive impedance control of teleoperation systems

Author: Corredor Javier
Peer Angelika
Sofrony Jorge
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)
Publication date: 01/01/2017

© 2008-2011 IEEE. This paper presents a haptic assistance strategy for teleoperation that makes a task- and situation-specific compromise between improving tracking performance and improving human-machine interaction in partially structured environments by scheduling the parameters of an admittance controller. The proposed assistance strategy builds on decision-making models and combines one of them with impedance control techniques that are standard in bilateral teleoperation systems. Even though several decision-making models have been proposed in cognitive science, their application to assisted teleoperation and assisted robotics has hardly been explored yet. Experimental data support the Drift-Diffusion model as a suitable scheduling strategy for haptic shared control, in which the assistance mechanism can be adapted via the parameters of reward functions. Guidelines to tune the decision-making model are presented. The influence of the reward structure on the realized haptic assistance is evaluated in a user study, and results are compared to the no-assistance and human-assistance cases.

Crossref

UWE Bristol Research Repository

Instructional control in choice tasks: the relation between type of schedule and relative expected values

Author: Arantes Joana
Keating José
Martínez Héctor
Viúdez Álvaro
Publication venue: Elsevier
Publication date: 01/08/2022

The present work aims to improve our understanding of the boundaries of instructional control. It does so by resolving contradictory results obtained in two different fields: three studies conducted in the description-experience gap field, showing that instructions are neglected when personal experience is available, and several others conducted within the experimental analysis of behavior paradigm, reaching the opposite conclusion. Two factors were studied: the type of schedule, and the relative expected values between options. The present work showed that (1) positive evidence of instructional control was found in a choice task with probability schedules and different expected values between options; (2) negative evidence of instructional control was found in a choice task with VI schedules and similar expected values between options; and (3) these results, together with previous research, suggest that relative expected values are a fundamental factor in understanding the presence of instructional control in choice tasks. We conclude that the relevance of this factor relies on its capacity to make participants' decisions easier: all else being equal, adding descriptions enables participants to better discriminate optimal behavior in choice tasks.

This study was conducted at Psychology University Center for Biological and Agricultural Sciences, University of Guadalajara, and supported by the Portuguese Foundation for Science and Technology and the Portuguese Ministry of Education and Science through national funds and when applicable co-financed by FEDER under the PT2020 Partnership Agreement (UID/PSI/01662/2013).

Universidade do Minho: RepositoriUM

Probability matching on a simple simulated foraging task: The effects of reward persistence and accumulation on choice behavior

Author: Ellerby Zack W.
Tunney Richard
Publication venue: University of Economics and Human Sciences in Warsaw
Publication date: 30/06/2019

Over a series of decisions between two or more probabilistically rewarded options, humans have a tendency to diversify their choices, even when this will lead to diminished overall reward. In the extreme case of probability matching, this tendency is expressed through allocation of choices in proportion to their likelihood of reward. Research suggests that this behaviour is an instinctive response, driven by heuristics, and that it may be overruled through the application of sufficient deliberation and self-control. However, if this is the case, then how and why did this response become established? The present study explores the hypothesis that diversification of choices, and potentially probability matching, represents an overextension of a historically normative foraging strategy. This is done through examining choice behaviour on a simple simulated foraging task, designed to model the natural process of accumulation of unharvested resources over time. Behaviour was then directly compared with that observed on a standard fixed probability task (cf. Ellerby & Tunney, 2017). Results indicated a convergence of choice patterns on the simulated foraging task, between participants who acted intuitively and those who took a more strategic approach. These findings are also compared with those of another similarly motivated study (Schulze, van Ravenzwaaij, & Newell, 2017).
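
The claim that probability matching diminishes overall reward on a fixed probability task can be checked with a small worked example. This is a sketch under assumed numbers, not data from the study: it supposes two options that pay off with fixed, independent probabilities 0.7 and 0.3 per trial, and the function name `expected_reward` is hypothetical.

```python
def expected_reward(p_choose_a, p_reward_a, p_reward_b):
    """Expected reward per trial when option A is chosen with probability p_choose_a."""
    return p_choose_a * p_reward_a + (1 - p_choose_a) * p_reward_b


# Assumed reward probabilities for the two options (illustrative only).
p_a, p_b = 0.7, 0.3

# Probability matching: choose A as often as A is rewarded (70% of trials).
matching = expected_reward(p_a, p_a, p_b)

# Maximizing: always choose the option with the higher reward probability.
maximizing = expected_reward(1.0, p_a, p_b)

print(f"matching:   {matching:.2f}")
print(f"maximizing: {maximizing:.2f}")
```

Under these assumptions matching earns 0.7 × 0.7 + 0.3 × 0.3 = 0.58 per trial, against 0.70 for always choosing the better option. This gap applies to fixed probabilities only; in a task where unharvested rewards accumulate, as in the foraging design described above, the comparison can differ.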

Crossref

Repository@Nottingham

Aston Publications Explorer

CGAMES'2009

Publication venue: University of Wolverhampton, School of Computing and Information Technology
Publication date: 01/01/2009

Wolverhampton Intellectual Repository and E-theses

Investigations of reward-based crowdfunding success: A marketing perspective

Author: Zhao L.
Publication date: 01/01/2018

International Migration, Integration and Social Cohesion online publications