Goal-Space Planning with Subgoal Models
This paper investigates a new approach to model-based reinforcement learning
using background planning: mixing (approximate) dynamic programming updates and
model-free updates, similar to the Dyna architecture. Background planning with
learned models is often worse than model-free alternatives, such as Double DQN,
even though the former uses significantly more memory and computation. The
fundamental problem is that learned models can be inaccurate and often generate
invalid states, especially when iterated many steps. In this paper, we avoid
this limitation by constraining background planning to a set of (abstract)
subgoals and learning only local, subgoal-conditioned models. This goal-space
planning (GSP) approach is more computationally efficient, naturally
incorporates temporal abstraction for faster long-horizon planning and avoids
learning the transition dynamics entirely. We show that our GSP algorithm can
learn significantly faster than a Double DQN baseline in a variety of
situations.
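
The idea of planning only over a small set of abstract subgoals can be illustrated with a minimal sketch. It is not the paper's GSP algorithm: it assumes the subgoal-conditioned models have already been learned and are summarized by two hypothetical tables, `reward[g][h]` (reward accumulated while travelling from subgoal `g` to subgoal `h`) and `discount[g][h]` (discount accumulated over that trip). The function name and the toy numbers are likewise illustrative assumptions.

```python
def plan_over_subgoals(reward, discount, sweeps=100):
    """Value iteration on an abstract graph whose nodes are subgoals.

    reward[g][h]   -- assumed output of a subgoal-conditioned reward model
    discount[g][h] -- assumed output of a subgoal-conditioned discount model
    Neither table models one-step transition dynamics.
    """
    # Start every subgoal (source or destination) at value 0.
    values = {g: 0.0 for g in reward}
    for edges in reward.values():
        for h in edges:
            values.setdefault(h, 0.0)

    for _ in range(sweeps):
        for g, edges in reward.items():
            if not edges:
                continue  # no outgoing edges: treated as terminal
            # Back up the best reachable subgoal.
            values[g] = max(
                edges[h] + discount[g][h] * values[h] for h in edges
            )
    return values


# Toy abstract graph: A can reach B or the goal G; B can reach G.
reward = {"A": {"B": 0.0, "G": 0.4}, "B": {"G": 1.0}, "G": {}}
discount = {"A": {"B": 0.5, "G": 0.0}, "B": {"G": 0.0}, "G": {}}
values = plan_over_subgoals(reward, discount)
```

In this toy graph the two-hop route through `B` is worth more from `A` than the direct route to `G`. In a background-planning setup, values like these would be combined with model-free updates; how that combination is done is specific to the paper and not shown here.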
Neural Methods for Effective, Efficient, and Exposure-Aware Information Retrieval
Neural networks with deep architectures have demonstrated significant
performance improvements in computer vision, speech recognition, and natural
language processing. The challenges in information retrieval (IR), however, are
different from these other application areas. A common form of IR involves
ranking of documents--or short passages--in response to keyword-based queries.
Effective IR systems must deal with the query-document vocabulary mismatch
problem by modeling relationships between different query and document terms
and how they indicate relevance. Models should also consider lexical matches
when the query contains rare terms--such as a person's name or a product model
number--not seen during training, and should avoid retrieving semantically
related but irrelevant results. In many real-life IR tasks, the retrieval involves
extremely large collections--such as the document index of a commercial Web
search engine--containing billions of documents. Efficient IR methods should
take advantage of specialized IR data structures, such as the inverted index, to
efficiently retrieve from large collections. Given an information need, the IR
system also mediates how much exposure an information artifact receives by
deciding whether it should be displayed, and where it should be positioned,
among other results. Exposure-aware IR systems may optimize for additional
objectives, besides relevance, such as parity of exposure for retrieved items
and content publishers. In this thesis, we present novel neural architectures
and methods motivated by the specific needs and challenges of IR tasks.
Comment: PhD thesis, Univ College London (2020)
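
The inverted index mentioned in the abstract can be shown with a minimal sketch. It is a generic textbook structure, not one of the thesis's neural methods. The function names, the whitespace tokenizer, and the term-overlap scoring are all illustrative assumptions.

```python
from collections import defaultdict


def build_inverted_index(docs):
    """Map each term to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        # Assumption: lowercase whitespace tokenization is good enough here.
        for term in text.lower().split():
            index[term].add(doc_id)
    return index


def retrieve(index, query):
    """Rank documents by the number of distinct query terms they contain.

    Only the posting lists of the query terms are touched, so documents
    sharing no term with the query are never visited.
    """
    scores = defaultdict(int)
    for term in set(query.lower().split()):
        for doc_id in index.get(term, ()):
            scores[doc_id] += 1
    # Highest overlap first; ties broken by document id.
    return sorted(scores, key=lambda d: (-scores[d], d))


docs = {
    1: "neural ranking models",
    2: "inverted index structures",
    3: "neural inverted index",
}
index = build_inverted_index(docs)
ranked = retrieve(index, "neural index")
```

Because lookup cost depends on the length of the query terms' posting lists rather than on the size of the collection, this kind of structure is what makes retrieval from very large collections practical.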