A second look at memory: Different Approaches to Understanding Diversity in Memory and Cognition
Memory lies at the heart of human cognitive abilities. Therefore, understanding it from neural, psychological and computational viewpoints is of key importance for computational neuroscience, psychology and beyond. In this thesis, I explore two prominent, but different, memory systems: episodic memory and working memory. First, I propose a modification to a recent reinforcement learning algorithm for decision making in which single memories of events, i.e., episodic memories, are integrated to compute the long-run value of actions. I argue that these memories are recalled and that their contributions are weighted based on context. Further, I propose that predictions made by this algorithm are combined with those that come from a standard, model-free, reinforcement learning algorithm. I suggest that humans can flexibly choose between these two sources of information to make decisions and guide actions. I show that the resulting combined model best fits data on human choices, outperforming previously proposed models. To complement these algorithmic and psychological suggestions, I present a generative model of the world according to which this sort of episodic recall is an appropriate method for making inferences and predictions of future rewards. Contrary to other suggestions for reward-based learning, this generative model can model events that not only drift continuously in time, but can also suddenly change to new or repeated events. Turning to working memory, I use information theoretic analyses to show that dynamic synapses, whose strengths adjust with usage, can increase its capacity. I argue that these components should be included in the study of working memory. The thesis ends with an explanation of the connections between these memory systems.
Discovering Valuable Items from Massive Data
Suppose there is a large collection of items, each with an associated cost
and an inherent utility that is revealed only once we commit to selecting it.
Given a budget on the cumulative cost of the selected items, how can we pick a
subset of maximal value? This task generalizes several important problems such
as multi-arm bandits, active search and the knapsack problem. We present an
algorithm, GP-Select, which utilizes prior knowledge about similarity between
items, expressed as a kernel function. GP-Select uses Gaussian process
prediction to balance exploration (estimating the unknown value of items) and
exploitation (selecting items of high value). We extend GP-Select to be able to
discover sets that simultaneously have high utility and are diverse. Our
preference for diversity can be specified as an arbitrary monotone submodular
function that quantifies the diminishing returns obtained when selecting
similar items. Furthermore, we exploit the structure of the model updates to
achieve an order of magnitude (up to 40X) speedup in our experiments without
resorting to approximations. We provide strong guarantees on the performance of
GP-Select and apply it to three real-world case studies of industrial
relevance: (1) Refreshing a repository of prices in a Global Distribution
System for the travel industry, (2) Identifying diverse, binding-affine
peptides in a vaccine design task and (3) Maximizing clicks in a web-scale
recommender system by recommending items to users.
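To make the budgeted selection idea in the abstract above concrete, here is a minimal sketch. It is not the authors' GP-Select: the abstract does not state its selection rule, so this uses a generic upper-confidence-bound score on a Gaussian process posterior as a stand-in. The names `rbf_kernel`, `gp_posterior`, `ucb_select`, the RBF kernel choice, and the parameters `beta` and `noise` are all assumptions made for illustration.

```python
import numpy as np

def rbf_kernel(X, Y, length_scale=1.0):
    # Similarity between items; the RBF form is an assumption, any kernel works.
    d2 = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-0.5 * d2 / length_scale ** 2)

def gp_posterior(K, picked, y, noise=1e-2):
    # Posterior mean and variance of every item's utility given the
    # utilities observed so far (zero-mean GP prior).
    n = K.shape[0]
    if not picked:
        return np.zeros(n), np.diag(K).copy()
    Kpp = K[np.ix_(picked, picked)] + noise * np.eye(len(picked))
    Kap = K[:, picked]
    mean = Kap @ np.linalg.solve(Kpp, np.asarray(y))
    var = np.diag(K) - np.einsum("ij,ji->i", Kap, np.linalg.solve(Kpp, Kap.T))
    return mean, np.maximum(var, 0.0)

def ucb_select(X, costs, utility_fn, budget, beta=2.0):
    # Greedily pick the affordable, unselected item with the highest
    # optimistic score until no remaining item fits in the budget.
    K = rbf_kernel(X, X)
    picked, y, spent = [], [], 0.0
    while True:
        mean, var = gp_posterior(K, picked, y)
        score = mean + np.sqrt(beta * var)  # exploitation + exploration
        affordable = [i for i in range(len(X))
                      if i not in picked and spent + costs[i] <= budget]
        if not affordable:
            break
        best = max(affordable, key=lambda i: score[i])
        picked.append(best)
        y.append(utility_fn(best))  # utility is revealed only after committing
        spent += costs[best]
    return picked, y, spent
```

A call such as `ucb_select(X, costs, lambda i: true_utility[i], budget=5.0)` returns the chosen indices, their revealed utilities and the total cost. The diversity term and the fast model updates mentioned in the abstract are not modeled here.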
Context-Aware Hierarchical Online Learning for Performance Maximization in Mobile Crowdsourcing
In mobile crowdsourcing (MCS), mobile users accomplish outsourced human
intelligence tasks. MCS requires an appropriate task assignment strategy, since
different workers may have different performance in terms of acceptance rate
and quality. Task assignment is challenging, since a worker's performance (i)
may fluctuate, depending on both the worker's current personal context and the
task context, (ii) is not known a priori, but has to be learned over time.
Moreover, learning context-specific worker performance requires access to
context information, which may not be available at a central entity due to
communication overhead or privacy concerns. Additionally, evaluating worker
performance might require costly quality assessments. In this paper, we propose
a context-aware hierarchical online learning algorithm addressing the problem
of performance maximization in MCS. In our algorithm, a local controller (LC)
in the mobile device of a worker regularly observes the worker's context,
her/his decisions to accept or decline tasks and the quality in completing
tasks. Based on these observations, the LC regularly estimates the worker's
context-specific performance. The mobile crowdsourcing platform (MCSP) then
selects workers based on performance estimates received from the LCs. This
hierarchical approach enables the LCs to learn context-specific worker
performance and it enables the MCSP to select suitable workers. In addition,
our algorithm preserves worker context locally, and it keeps the number of
required quality assessments low. We prove that our algorithm converges to the
optimal task assignment strategy. Moreover, the algorithm outperforms simpler
task assignment strategies in experiments based on synthetic and real data.
Comment: 18 pages, 10 figures
Autonomous Drug Design with Multi-Armed Bandits
Recent developments in artificial intelligence and automation support a new
drug design paradigm: autonomous drug design. Under this paradigm, generative
models can provide suggestions on thousands of molecules with specific
properties, and automated laboratories can potentially make, test and analyze
molecules with minimal human supervision. However, since only a limited
number of molecules can be synthesized and tested, an obvious challenge is how
to efficiently select among provided suggestions in a closed-loop system. We
formulate this task as a stochastic multi-armed bandit problem with multiple
plays, volatile arms and similarity information. To solve this task, we adapt
previous work on multi-armed bandits to this setting, and compare our solution
with random sampling, greedy selection and decaying-epsilon-greedy selection
strategies. According to our simulation results, our approach has the potential
to perform better exploration and exploitation of the chemical space for
autonomous drug design.
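One of the baselines named in the abstract above, decaying-epsilon-greedy selection with multiple plays per round, can be sketched as follows. This is an illustrative version under simplifying assumptions, not the authors' implementation: the arm set is fixed (no volatile arms), similarity information is ignored, and the function name, the geometric decay schedule and the parameters `eps0` and `decay` are hypothetical.

```python
import random

def decaying_epsilon_greedy(pull, n_arms, n_rounds, plays_per_round,
                            eps0=1.0, decay=0.9, seed=0):
    # pull(arm, rng) is a caller-supplied function returning a reward.
    rng = random.Random(seed)
    counts = [0] * n_arms
    means = [0.0] * n_arms
    history = []
    for t in range(n_rounds):
        eps = eps0 * decay ** t  # exploration probability shrinks over time
        if rng.random() < eps:
            # Explore: a uniformly random set of distinct arms.
            chosen = rng.sample(range(n_arms), plays_per_round)
        else:
            # Exploit: the arms with the highest estimated mean reward.
            ranked = sorted(range(n_arms), key=lambda a: means[a], reverse=True)
            chosen = ranked[:plays_per_round]
        for a in chosen:
            r = pull(a, rng)
            counts[a] += 1
            means[a] += (r - means[a]) / counts[a]  # incremental mean update
        history.append(chosen)
    return means, counts, history
```

In the drug design setting, each arm would stand for a candidate molecule and each round for one batch sent to the automated laboratory; that mapping is an interpretation of the abstract rather than something it spells out.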