Structure Learning in Human Sequential Decision-Making

Author: A Fel'dbaum
A Gelman
A Johnson
A Smith
AC Courville
AD Horowitz
AJ Yu
C Anderson
C Watkins
D Acuna
D Heckerman
DA Braun
Daniel E. Acuña
I Erev
J Anderson
J Banks
JB Tenenbaum
JC Gittins
L Kaelbling
M Steyvers
MD Lee
MJA Strens
MS Yi
N Gans
ND Daw
P Poupart
P Whittle
Paul Schrater
R Dearden
R Howard
RE Bellman
RE Neapolitan
RJ Meyer
RS Sutton
SJ Gershman
TEJ Behrens
Tim Behrens
W Edwards
W Schultz
Y Brackbill
Y Sakai
Publication venue: Public Library of Science
Publication date: 01/12/2010

Studies of sequential decision-making in humans frequently find suboptimal performance relative to an ideal actor that has perfect knowledge of the model of how rewards and events are generated in the environment. We argue that, rather than humans being suboptimal, the learning problem they face is more complex, in that it also involves learning the structure of reward generation in the environment. We formulate the problem of structure learning in sequential decision tasks using Bayesian reinforcement learning, and show that learning the generative model for rewards qualitatively changes the behavior of an optimal learning agent. To test whether people exhibit structure learning, we performed experiments involving a mixture of one-armed and two-armed bandit reward models, where structure learning produces many of the qualitative behaviors deemed suboptimal in previous studies. Our results demonstrate that humans can perform structure learning in a near-optimal manner.

Public Library of Science (PLOS)

Crossref

Directory of Open Access Journals

PubMed Central

Data-Driven Research On Engineering Design Thinking And Behaviors In Computer-Aided Systems Design: Analysis, Modeling, And Prediction

Author: Rahman Molla Hafizur
Publication venue: ScholarWorks@UARK
Publication date: 01/08/2022

Research on design thinking and design decision-making is vital for discovering and utilizing beneficial design patterns, strategies, and heuristics of human designers in solving engineering design problems. It is also essential for the development of new algorithms embedded with human intelligence and can facilitate human-computer interactions. However, modeling design thinking is challenging because it takes place in the designer’s mind, which is intricate, implicit, and tacit. For an in-depth understanding of design thinking, fine-grained design behavioral data are important because they are the critical link in studying the relationship between design thinking, design decisions, design actions, and design performance. Therefore, the research in my dissertation aims to develop a new research platform and new research approaches to enable fine-grained data-driven methodology that helps foundationally understand the designers’ thinking and decision-making strategies in engineering design. To achieve this goal, my research has focused on modeling, analysis, and prediction of design thinking and designers’ sequential decision-making behaviors. In the modeling work, different design behaviors, including design action preferences, one-step sequential decision behavior, contextual behavior, long short-term memory behavior, and reflective thinking behavior, are characterized and computationally modeled using statistical and machine learning techniques. For example, to model designers’ sequential decision making, a novel approach is developed by integrating the Function-Behavior-Structure (FBS) design process model into deep learning methods, e.g., the long short-term memory (LSTM) model and the gated recurrent unit (GRU) model. In the work on analysis, this dissertation focuses primarily on different clustering analysis techniques.
Based on the behaviors modeled, designers showing similar behavioral patterns can be clustered, from which the common design patterns can be identified. Another analysis performed in this dissertation is a comparative study of different sequential learning techniques, e.g., deep learning models versus Markov chain models, in modeling sequential decision-making behaviors of human designers. This study compares the prediction accuracy of different models and helps us obtain a better understanding of the performance of deep-learning models in modeling sequential design decisions. Finally, in the work related to prediction, this dissertation aims to predict sequential design decisions and actions. We first test the model that integrates the FBS model with various deep-learning models for the prediction and evaluate the performance of the model. Then, to improve the accuracy of the prediction, we develop two approaches that directly and indirectly combine designer-related attributes (static data) and designers’ action sequences (dynamic data) within the deep learning-based framework. The results show that with appropriate configurations, the deep-learning model with both static data and dynamic data outperforms the models that only rely on the design action sequence. In addition, I developed an artificial design agent using reinforcement learning with a data-driven reward mechanism based on the Markov chain model to mimic human design behavior. The model also helps validate the hypothesis that the design knowledge learned by the agent from one design problem is transferable to new design problems. To support fine-grained design behavioral data collection and validate the proposed approaches, we develop a computer-aided design (CAD)-based research platform in the application context of renewable engineering systems design.
Data are collected through three design case studies, i.e., a solarized home design problem, a solarized parking lot design problem, and a design challenge on solarizing the University of Arkansas (UARK) campus. The contribution of this dissertation can be summarized in the following aspects. First, a novel research platform is developed that can collect fine-grained design behavior data in support of design thinking research. Second, new research approaches are developed to characterize design behaviors from multiple dimensions in a latent space of design thinking. We refer to such a latent representation of design thinking as design embedding. Furthermore, using deep learning techniques, several different predictive models are developed that can successfully predict human sequential design decisions with prediction accuracy higher than traditional sequential learning models. Third, by analyzing designers’ one-step sequential design behaviors, common and beneficial design patterns are identified. These patterns are found to exist in many high-performing designers in the three respective design problems studied. Fourth, new knowledge has been obtained on the ability of deep learning-based models versus traditional sequential learning models to predict sequential design decisions of human designers. Finally, a novel research approach is developed that helps test the hypothesis of transferability of design knowledge. In general, this dissertation creates a new avenue for investigating designers’ thinking and decision-making behaviors in a systems design context based on the data collected from a CAD environment, and tests the capability of various deep-learning algorithms in predicting human sequential design decisions.

ScholarWorks@UARK

UARK (University of Arkansas)

What to choose next? A paradigm for testing human sequential decision making

Author: Clarke A.M.
Herzog M.H.
Tartaglia E.M.
Publication venue: Frontiers Media SA
Publication date: 01/01/2017

Many of the decisions we make in our everyday lives are sequential and entail sparse rewards. While sequential decision-making has been extensively investigated in theory (e.g., by reinforcement learning models), there is no systematic experimental paradigm to test it. Here, we developed such a paradigm and investigated key components of reinforcement learning models: the eligibility trace (i.e., the memory trace of previous decision steps), the external reward, and the ability to exploit the statistics of the environment's structure (model-free vs. model-based mechanisms). We show that the eligibility trace decays not with sheer time, but rather with the number of discrete decision steps made by the participants. We further show that, unexpectedly, neither monetary rewards nor the environment's spatial regularity significantly modulate behavioral performance. Finally, we found that model-free learning algorithms describe human performance better than model-based algorithms. © 2017 Tartaglia, Clarke and Herzog.

Bilkent University Institutional Repository

POMDP Model Learning for Human Robot Collaboration

Author: Lin Hai
Wu Bo
Zheng Wei
Publication date: 29/03/2018

Recent years have seen human-robot collaboration (HRC) quickly emerge as a hot research area at the intersection of control, robotics, and psychology. While most of the existing work in HRC has focused on either low-level human-aware motion planning or HRC interface design, we are particularly interested in a formal design of HRC with respect to high-level complex missions, where it is of critical importance to obtain an accurate yet tractable human model. Instead of assuming the human model is given, we ask whether it is reasonable to learn human models from observed perception data, such as the gestures, eye movements, and head motions of the human concerned. As our initial step, we adopt a partially observable Markov decision process (POMDP) model in this work, as mounting evidence from psychology studies has suggested Markovian properties of human behavior. In addition, POMDP provides a general modeling framework for sequential decision making where states are hidden and actions have stochastic outcomes. Distinct from the majority of the POMDP model learning literature, we do not assume that the states, the transition structure, or a bound on the number of states in the POMDP model is given. Instead, we use a Bayesian non-parametric learning approach to infer the potential human states from data. Then we adopt an approach inspired by probably approximately correct (PAC) learning to obtain not only an estimate of the transition probability but also a confidence interval associated with the estimate. The performance of the control policy derived from the estimated model is then guaranteed to be sufficiently close to that under the true model. Finally, data collected from a driver-assistance test-bed are used to train the model, which illustrates the effectiveness of the proposed learning method.

arXiv.org e-Print Archive

Crossref

Individual Differences In Value-Based Decision-Making: Learning And Time Preference

Author: Pehlivanova Marieta
Publication venue: ScholarlyCommons
Publication date: 01/01/2017

Human decisions are strongly influenced by past experience or by the subjective values attributed to available choice options. Although decision processes show some common trends across individuals, they also vary considerably between individuals. The research presented in this dissertation focuses on two domains of decision-making, related to learning and time preference, and examines factors that explain decision-making differences between individuals. First, we focus on a form of reinforcement learning in a dynamic environment. Across three experiments, we investigated whether individual differences in learning were associated with differences in cognitive abilities, personality, and age. Participants made sequential predictions about an on-screen location in a video game. Consistent with previous work, participants showed high variability in their ability to implement normative strategies related to surprise and uncertainty. We found that higher cognitive ability, but not personality, was associated with stronger reliance on the normative factors that should govern learning. Furthermore, learning in older adults (age 60+) was less influenced by uncertainty, but also less influenced by reward, a non-normative factor that has substantial effects on learning across the lifespan. Second, we focus on delay discounting, the tendency to prefer smaller rewards delivered soon over larger rewards delivered after a delay. Delay discounting has been used as a behavioral measure of impulsivity and is associated with many undesirable real-life outcomes. Specifically, we examined how neuroanatomy is associated with individual differences in delay discounting in a large adolescent sample. Using a novel multivariate method, we identified networks where cortical thickness varied consistently across individuals and brain regions. 
Cortical thickness in several of these networks, including regions such as ventromedial prefrontal cortex, orbitofrontal cortex, and temporal pole, was negatively associated with delay discounting. Furthermore, these brain data predicted differences beyond those typically accounted for by other cognitive variables related to delay discounting. These results suggest that cortical thickness may be a useful brain phenotype of delay discounting and carry unique information about impulsivity. Collectively, this research furthers our understanding of how cognitive abilities, brain structure, and healthy aging relate to individual differences in value-based decision-making.

ScholarlyCommons@Penn

Towards a theory of heuristic and optimal planning for sequential information search

Author: Jones M.
Meder B.
Nelson J.
Publication venue: Center for Open Science
Publication date: 01/01/2018

MPG.PuRe

Analyzing collaborative learning processes automatically

Author: A. C. Graesser
A. King
A. M. O'Donnell
A. Stolcke
A. Weinberger
A. Yeh
Armin Weinberger
B. Goodman
B. Weiner
B. Wever De
C. P. Rosé
C. Rosé
Carolyn Rosé
D. Kuhn
D. Lewis
D. Litman
E. B. Page
E. Schegloff
F. Fischer
F. Henri
Frank Fischer
G. Erkens
G. Gweon
G. Salomon
I. H. Witten
I. Kollar
J. F. Voss
J. Fuernkranz
J. L. Fleiss
J. Piaget
J. Pol van der
J. W. Pennebaker
J. Wiebe
Jaime Arguello
K. Krippendorf
K. Krippendorff
K. VanLehn
Karsten Stegmann
M. Berkowitz
M. Evens
M. T. H. Chi
N. M. Webb
P. Dillenbourg
P. Dönmez
P. Foltz
R. Kumar
R. Luckin
R. Wegerif
S. D. Teasley
S. Leitão
T. Landauer
V. Aleven
V. Carvalho
V. Vapnik
Yi-Chia Wang
Yue Cui
Publication venue: Springer Science and Business Media LLC
Publication date: 01/01/2008

In this article, we describe the emerging area of text classification research focused on the problem of collaborative learning process analysis, both from a broad perspective and more specifically in terms of a publicly available tool set called TagHelper tools. Analyzing the variety of pedagogically valuable facets of learners’ interactions is a time-consuming and effortful process. Improving automated analyses of such highly valued processes of collaborative learning by adapting and applying recent text classification technologies would make it a less arduous task to obtain insights from corpus data. This endeavor also holds the potential for enabling substantially improved on-line instruction, both by providing teachers and facilitators with reports about the groups they are moderating and by triggering context-sensitive collaborative learning support on an as-needed basis. In this article, we report on an interdisciplinary research project that has been investigating the effectiveness of applying text classification technology to a large CSCL corpus that has been analyzed by human coders using a theory-based multidimensional coding scheme. We report promising results and include an in-depth discussion of important issues such as reliability, validity, and efficiency that should be considered when deciding on the appropriateness of adopting a new technology such as TagHelper tools. One major technical contribution of this work is a demonstration that an important part of making text classification technology effective for this purpose is designing and building linguistic pattern detectors, otherwise known as features, that can be extracted reliably from texts and that have high predictive power for the categories of discourse actions that the CSCL community is interested in.

Crossref

Open Access LMU