
    Schematic example of independent and joint clustering agents.

    <p><i>Top Left</i>: The independent clustering agent groups each context into two clusters, associated with a reward (<i>R</i>) and mapping (<i>ϕ</i>) function, respectively. Planning involves combining these functions to generate a policy. The clustering prior induces a parsimony bias such that new contexts are more likely to be assigned to more popular clusters. Arrows denote the assignment of contexts to clusters and the creation of policies from component functions. <i>Top Right</i>: The joint clustering agent assigns each context to a single cluster linked to both functions (i.e., it assumes a holistic task structure), so the policy is determined by this cluster assignment. In this example, both agents generate the same two policies for the three contexts, but the independent clustering agent generalizes the reward function across all three contexts. <i>Bottom</i>: An example mapping (<i>left</i>) and reward function (<i>right</i>) for a gridworld task.</p>
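
    The parsimony bias described above behaves like a Chinese-restaurant-process-style prior over cluster assignments: a new context joins an existing cluster in proportion to how many contexts that cluster already holds, or opens a new cluster in proportion to a concentration parameter. The snippet below is a minimal illustrative sketch of such a prior, not the authors' implementation; the concentration parameter alpha is an assumed free parameter.

```python
import numpy as np

def crp_assignment_probs(cluster_counts, alpha=1.0):
    """Prior probability that a new context joins each existing cluster
    (proportional to its popularity) or opens a brand-new cluster
    (proportional to the concentration parameter alpha)."""
    counts = np.asarray(cluster_counts, dtype=float)
    weights = np.append(counts, alpha)           # last entry = new cluster
    return weights / weights.sum()

# Example: two clusters already hold 3 and 1 contexts respectively.
print(crp_assignment_probs([3, 1], alpha=1.0))   # -> [0.6, 0.2, 0.2]
```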

    Performance of independent vs. joint clustering in predicting a sequence <i>X</i><sup><i>R</i></sup>, measured in bits of information gained by observation of each item.

    <p><i>Left</i>: Relative performance of independent clustering over joint clustering as a function of the mutual information between rewards and transitions. <i>Right</i>: Noise in the observation of <i>X</i><sup><i>T</i></sup> sequences parametrically increases the advantage for independent clustering. The green line shows relative performance in sequences with no residual uncertainty in <i>R</i> given <i>T</i> (perfect correspondence); the orange line shows relative performance for a sequence with residual uncertainty <i>H</i>(<i>R</i>|<i>T</i>) > 0 bits.</p>
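
    For reference, the quantities on these axes can be computed directly from a joint distribution over rewards and transitions. The sketch below is illustrative only (not the authors' analysis code): it computes H(R), the residual uncertainty H(R|T), and the mutual information I(R;T), all in bits.

```python
import numpy as np

def entropies(joint):
    """Given a joint distribution P(R, T) as a 2-D array (rows = R,
    columns = T), return H(R), H(R|T), and I(R;T) in bits."""
    joint = np.asarray(joint, dtype=float)
    joint = joint / joint.sum()
    p_r = joint.sum(axis=1)                      # marginal over T
    p_t = joint.sum(axis=0)                      # marginal over R

    def H(p):
        p = p[p > 0]
        return -(p * np.log2(p)).sum()

    h_r_given_t = H(joint.ravel()) - H(p_t)      # H(R|T) = H(R,T) - H(T)
    return H(p_r), h_r_given_t, H(p_r) - h_r_given_t

# Perfect correspondence between R and T: H(R|T) = 0 and I(R;T) = H(R) = 1 bit.
print(entropies([[0.5, 0.0], [0.0, 0.5]]))
```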

    Simulation 1.

    <p><i>A</i>: Schematic representation of the task domain. Four contexts (blue circles) were simulated, each paired with a unique combination of one of two goal locations (reward functions) and one of two mappings. <i>B</i>: Number of steps taken by each agent shown across trials within a single context (left) and over all trials (right). Fewer steps reflect better performance. <i>C</i>: KL-divergence of the models’ estimates of the reward (left) and mapping (right) functions as a function of time. Lower KL-divergence represents better function estimates. Time shown as the number of trials in a context (left) and the number of steps in a context collapsed across trials (right) for clarity.</p>
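
    The KL-divergence score in panel C compares an agent's estimated function against the true generative function. A minimal sketch of that comparison is shown below; the four-element distributions are hypothetical placeholders, and this is not the authors' evaluation code.

```python
import numpy as np

def kl_divergence(p_true, q_estimate, eps=1e-12):
    """KL(P || Q) in bits between the true function (e.g., reward
    probabilities over goal locations) and the agent's estimate."""
    p = np.asarray(p_true, dtype=float)
    q = np.asarray(q_estimate, dtype=float)
    p, q = p / p.sum(), q / q.sum()
    return float(np.sum(p * np.log2((p + eps) / (q + eps))))

# An estimate concentrating on the correct goal drives the divergence toward 0.
print(kl_divergence([1, 0, 0, 0], [0.7, 0.1, 0.1, 0.1]))  # ~0.51 bits
```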

    Meta-agent.

    <p><i>A</i>: On each trial, the meta-agent samples the policy of the joint or independent actor based on the model evidence for each strategy. Both agents, and their model evidences, are updated at each time step. <i>B</i>: Overall performance of the independent, joint, and meta agents in simulation 1. <i>C</i>: Overall performance of the independent, joint, and meta agents in simulation 2. <i>D,E</i>: Probability of selecting the joint clustering policy over time in simulation 1 (<i>D</i>) and simulation 2 (<i>E</i>).</p>
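
    A minimal sketch of this kind of evidence-based arbitration is given below. It assumes the log model evidence for each strategy is maintained elsewhere and simply samples which agent's policy to follow on the current trial; it illustrates the sampling step only, not the authors' implementation.

```python
import numpy as np

def choose_strategy(log_evidence_joint, log_evidence_indep, rng=None):
    """Sample which agent's policy to follow on this trial, with
    probability proportional to each strategy's (exponentiated) model
    evidence.  The evidence terms are assumed to be updated elsewhere
    after every observation."""
    if rng is None:
        rng = np.random.default_rng()
    log_ev = np.array([log_evidence_joint, log_evidence_indep])
    w = np.exp(log_ev - log_ev.max())            # numerically stable softmax
    p_joint = w[0] / w.sum()
    strategy = "joint" if rng.random() < p_joint else "independent"
    return strategy, p_joint

print(choose_strategy(-10.2, -11.5))             # joint clustering favoured here
```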

    Simulation 2.

    <p><i>A</i>: Schematic representation of the second task domain. Eight contexts (blue circles) were simulated, each paired with a combination of one of four orthogonal reward functions and one of four mappings, such that each pairing was repeated across two contexts, providing a discoverable relationship. <i>B</i>: Number of steps taken by each agent shown across trials within a single context (left) and over all trials (right). <i>C</i>: KL-divergence of the models’ estimates of the reward (left) and mapping (right) functions as a function of time.</p>

    “Diabolic rooms problem”.

    <p><i>A</i>: Schematic diagram of the rooms problem. Agents enter a room and choose a door to navigate to the next room. Choosing the correct door (green) leads to the next room, while choosing either of the other two doors leads back to the start of the task. The agent learns three mappings across rooms. <i>B</i>: Distribution of the number of steps taken to solve the task by the three agents (left) and the median of each distribution (right). <i>C,D</i>: Regression of the number of steps to complete the task as a function of grid area (<i>C</i>) and the number of rooms in the task (<i>D</i>) for the joint and independent clustering agents.</p>
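
    The room-to-room transition rule described in panel A can be summarized in a few lines. The sketch below abstracts away navigation within each gridworld room, and the names (rooms_step, correct_doors) are hypothetical; it is not the authors' task code.

```python
import random

def rooms_step(room, door, correct_doors, n_rooms):
    """One step of the rooms task: the correct door advances the agent to
    the next room; either wrong door sends it back to the start.
    Returns (new_room, done)."""
    if door == correct_doors[room]:
        room += 1
        return room, room == n_rooms             # done after leaving the last room
    return 0, False                               # wrong door: restart the task

# Hypothetical 4-room task with a random correct door (out of 3) per room.
correct_doors = [random.randrange(3) for _ in range(4)]
print(rooms_step(0, correct_doors[0], correct_doors, n_rooms=4))  # -> (1, False)
```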

    Hiragana stimuli used in the probabilistic learning task in experiments 1a and 1b and experiment 3.

    <p>Each pair of stimuli was randomly presented in separate trials. On each trial, participants chose one of the two stimuli in the pair. Feedback following the participant’s choice was determined probabilistically. Reward probability (indicated below each stimulus) differed between characters. Task versions 1 and 2 differed with respect to the characters associated with more probable positive and more probable negative feedback, respectively. Specifically, in task version 2, the Hiragana stimuli were switched within the AB, CD, and EF pairs. Half of the participants received task version 1 and the other half task version 2. In experiment 3, a task version was used in which only the Hiragana stimuli within the AB pair were switched.</p>
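
    The probabilistic feedback rule described here can be illustrated with a short sketch. The reward probabilities below are hypothetical placeholders standing in for the values printed beneath each Hiragana character in the figure, and the choice is drawn at random rather than made by a participant; this is illustration only.

```python
import random

# Hypothetical reward probabilities per stimulus; the actual values are
# those shown beneath each Hiragana character in the figure.
REWARD_P = {"A": 0.80, "B": 0.20, "C": 0.70, "D": 0.30, "E": 0.60, "F": 0.40}

def run_trial(pair=("A", "B")):
    """Present a pair, record a (random) choice, and deliver probabilistic feedback."""
    choice = random.choice(pair)                  # stand-in for the participant's choice
    positive_feedback = random.random() < REWARD_P[choice]
    return choice, positive_feedback

print(run_trial())
```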