816 research outputs found
Multi-Armed Bandits for Intelligent Tutoring Systems
We present an approach to Intelligent Tutoring Systems which adaptively
personalizes sequences of learning activities to maximize skills acquired by
students, taking into account the limited time and motivational resources. At a
given point in time, the system proposes to the students the activity which
makes them progress faster. We introduce two algorithms that rely on the
empirical estimation of learning progress: RiARiT, which uses information
about the difficulty of each exercise, and ZPDES, which uses much less knowledge
about the problem.
The system is based on the combination of three approaches. First, it
leverages recent models of intrinsically motivated learning by transposing them
to active teaching, relying on empirical estimation of learning progress
provided by specific activities to particular students. Second, it uses
state-of-the-art Multi-Arm Bandit (MAB) techniques to efficiently manage the
exploration/exploitation challenge of this optimization process. Third, it
leverages expert knowledge to constrain and bootstrap initial exploration of
the MAB, while requiring only coarse guidance information of the expert and
allowing the system to deal with didactic gaps in its knowledge. The system is
evaluated in a scenario where 7-8 year old schoolchildren learn how to
decompose numbers while manipulating money. Systematic experiments are
presented with simulated students, followed by results of a user study across a
population of 400 schoolchildren.
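The idea of proposing the activity with the highest empirical learning progress can be sketched as below. This is only a toy illustration under stated assumptions, not the RiARiT or ZPDES algorithms from the abstract: the class name `LearningProgressBandit`, the sliding-window estimate of progress, and the `window` and `explore` parameters are all hypothetical choices made for the example.

```python
import random
from collections import deque


class LearningProgressBandit:
    """Toy activity selector (hypothetical): favors activities whose
    recent success rate has risen compared with the older success rate."""

    def __init__(self, activities, window=4, explore=0.2, seed=0):
        self.activities = list(activities)
        self.window = window
        self.explore = explore
        self.rng = random.Random(seed)
        # Keep the last 2 * window outcomes (1 = correct, 0 = incorrect).
        self.history = {a: deque(maxlen=2 * window) for a in self.activities}

    def learning_progress(self, activity):
        """Recent success rate minus older success rate.

        Returns 0.0 until the history for the activity is full.
        """
        outcomes = list(self.history[activity])
        if len(outcomes) < 2 * self.window:
            return 0.0
        older = outcomes[: self.window]
        recent = outcomes[self.window:]
        return sum(recent) / self.window - sum(older) / self.window

    def choose(self):
        """Explore uniformly with probability `explore` (or when no activity
        shows positive progress); otherwise sample in proportion to progress."""
        weights = [max(self.learning_progress(a), 0.0) for a in self.activities]
        if self.rng.random() < self.explore or sum(weights) == 0:
            return self.rng.choice(self.activities)
        return self.rng.choices(self.activities, weights=weights, k=1)[0]

    def update(self, activity, correct):
        """Record whether the student answered the activity correctly."""
        self.history[activity].append(1 if correct else 0)
```

In use, the tutor would call `choose()` to pick the next activity, present it to the student, and then call `update(activity, correct)` with the observed outcome. Expert knowledge, which the abstract uses to constrain initial exploration, is not modeled here.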
How Technology Impacts and Compares to Humans in Socially Consequential Arenas
One of the main promises of technology development is for it to be adopted by
people, organizations, societies, and governments -- incorporated into their
life, work stream, or processes. Often, this is socially beneficial as it
automates mundane tasks, frees up more time for other more important things, or
otherwise improves the lives of those who use the technology. However, these
beneficial results do not apply in every scenario and may not impact everyone
in a system the same way. Sometimes a technology is developed which both
produces benefits and inflicts some harm. These harms may come at a higher cost to
some people than others, raising the question: how are benefits and harms
weighed when deciding if and how a socially consequential technology gets
developed? The most natural way to answer this question, and in fact how
people first approach it, is to compare the new technology to what used to
exist. As such, in this work, I make comparative analyses between humans and
machines in three scenarios and seek to understand how sentiment about a
technology, performance of that technology, and the impacts of that technology
combine to influence how one decides to answer my main research question.
Comment: Doctoral thesis proposal. arXiv admin note: substantial text overlap
with arXiv:2110.08396, arXiv:2108.12508, arXiv:2006.1262
Learning how to act: making good decisions with machine learning
This thesis is about machine learning and statistical approaches
to decision making. How can we learn from data to anticipate the
consequence of, and optimally select, interventions or actions?
Problems such as deciding which medication to prescribe to
patients, who should be released on bail, and how much to charge
for insurance are ubiquitous, and have far reaching impacts on
our lives. There are two fundamental approaches to learning how
to act: reinforcement learning, in which an agent directly
intervenes in a system and learns from the outcome, and
observational causal inference, whereby we seek to infer the
outcome of an intervention from observing the system.
The goal of this thesis is to connect and unify these key
approaches. I introduce causal bandit problems: a synthesis that
combines causal graphical models, which were developed for
observational causal inference, with multi-armed bandit problems,
which are a subset of reinforcement learning problems that are
simple enough to admit formal analysis. I show that knowledge of
the causal structure allows us to transfer information learned
about the outcome of one action to predict the outcome of an
alternate action, yielding a novel form of structure between
bandit arms that cannot be exploited by existing algorithms. I
propose an algorithm for causal bandit problems and prove bounds
on the simple regret demonstrating it is close to minimax
optimal and better than algorithms that do not use the additional
causal information.
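For reference, simple regret is commonly defined in the bandit literature as the gap between the best arm's expected reward and the expected reward of the arm recommended after exploration ends. The notation below is an assumption made for illustration; the abstract itself does not give a formula.

```latex
% Assumed notation (not from the abstract):
%   \mu_a        expected reward of arm (action) a
%   \mu^*        \max_a \mu_a, the expected reward of the best arm
%   \hat{a}_T    arm recommended by the algorithm after T exploration rounds
\[
  R_T \;=\; \mu^* \;-\; \mathbb{E}\!\left[ \mu_{\hat{a}_T} \right]
\]
```

Unlike cumulative regret, this quantity only penalizes the quality of the final recommendation, not the rewards collected while exploring.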
Learning User Preferences to Incentivize Exploration in the Sharing Economy
We study platforms in the sharing economy and discuss the need for
incentivizing users to explore options that otherwise would not be chosen. For
instance, rental platforms such as Airbnb typically rely on customer reviews to
provide users with relevant information about different options. Yet, often a
large fraction of options does not have any reviews available. Such options are
frequently neglected as viable choices, and in turn are unlikely to be
evaluated, creating a vicious cycle. Platforms can engage users to deviate from
their preferred choice by offering monetary incentives for choosing a different
option instead. To efficiently learn the optimal incentives to offer, we
consider structural information in user preferences and introduce a novel
algorithm - Coordinated Online Learning (CoOL) - for learning with structural
information modeled as convex constraints. We provide formal guarantees on the
performance of our algorithm and test the viability of our approach in a user
study with data of apartments on Airbnb. Our findings suggest that our approach
is well-suited to learn appropriate incentives and increase exploration on the
investigated platform.
Comment: Longer version of AAAI'18 paper. arXiv admin note: text overlap with
arXiv:1702.0284
Assessing and improving recommender systems to deal with user cold-start problem
Recommender systems are part of our everyday life. The main purpose of recommendation
methods is to predict preferences for new items based on the user's past preferences. The
research related to this topic seeks, among other things, to address the user cold-start problem,
which is the challenge of recommending to users with few or no preference records.
One way to address cold-start issues is to infer the missing data relying on side information.
Side information of different types has been explored in the literature. Some
studies use social information combined with users' preferences, others use user click behavior,
location-based information, the user's visual perception, contextual information, etc. The
typical approach is to use side information to build one prediction model for each cold
user. Due to the inherent complexity of this prediction process, for full cold-start users in
particular, the performance of most recommender systems falls a great deal. We, rather,
propose that cold users are best served by models already built into the system.
In this thesis we propose four approaches to deal with the user cold-start problem using
existing models available for analysis in recommender systems. We cover the following
aspects:
o Embedding social information into traditional recommender systems: We investigate
the role of several social metrics in pairwise preference recommendations and
provide the first steps towards a general framework to incorporate social information
into traditional approaches.
o Improving recommendation with visual perception similarities: We extract networks
connecting users with similar visual perception and use them to come up with
prediction models that maximize the information gained from cold users.
o Analyzing the benefits of a general framework to incorporate networked information
into recommender systems: Representing different types of side information as a
user network, we investigate how to incorporate networked information into recommender
systems to understand its benefits in the context of cold user
recommendation.
o Analyzing the impact of prediction model selection for cold users: The last proposal
considers that, without side information, the system will recommend to cold users
by switching among models already built into the system.
We evaluated the proposed approaches in terms of prediction quality and ranking
quality on real-world datasets from different recommendation domains. The experiments
showed that our approaches achieve better results than the comparison methods.
Thesis (Doctorate)
Recommender systems are part of our daily lives. The main goal of the methods used in these
systems is to predict preferences for new items based on the
user's profile. Research on this topic seeks, among other things,
to address the user cold-start problem, which is the challenge of recommending items to
users who have few or no preference records in the system.
One way to address user cold-start is to infer users' preferences
from additional information. Additional information of different types
can thus be explored in research. Some studies use social information combined
with users' preferences, while others rely on clicks made while browsing websites,
geographic location information, visual perception, contextual information, etc. The
typical approach of these systems is to use additional information to build a
prediction model for each user. Besides this process being more complex, for
full cold-start users (with no preferences identified by the system) in particular, most
recommender systems perform poorly. The work presented here,
by contrast, proposes that new users will receive more accurate recommendations
from prediction models that already exist in the system.
In this thesis, four approaches were proposed to deal with the user cold-start problem
using existing models in recommender systems. The approaches
presented addressed the following aspects:
o Inclusion of social information in traditional recommender systems: the roles of
several social metrics were investigated in a pairwise preference recommender
system, providing a basis for defining a general framework
to include social information in traditional approaches.
o Use of visual perception similarity: using visual perception similarity,
networks connecting similar users were inferred, to be used in
selecting prediction models for new users.
o Analysis of the benefits of a general framework for including user network
information in recommender systems: representing different types of additional
information as a user network, it was investigated how user
networks can be included in recommender systems so as to benefit
recommendation for cold-start users.
o Analysis of the impact of prediction model selection for cold-start users:
the last proposed approach considered that, without additional information, the system
could recommend to new users by switching among the models already
existing in the system and seeking to learn which would be most suitable for the
recommendation.
The proposed approaches were evaluated in terms of prediction quality and
ranking quality on real datasets from different domains. The results
obtained showed that the proposed approaches achieved better results than
state-of-the-art methods.
Tutoring Students with Adaptive Strategies
Adaptive learning is a crucial part of intelligent tutoring systems. It provides students with appropriate tutoring interventions, based on students’ characteristics, status, and other related features, in order to optimize their learning outcomes. It must determine students’ knowledge level or learning progress, and then use proper techniques to choose the optimal interventions on that basis. In this dissertation work, I focus on three aspects of the adaptive learning process: student modeling, k-armed bandits, and contextual bandits.
Student modeling. The main objective of student modeling is to develop cognitive models of students, including modeling content skills and knowledge about learning. In this work, we investigate the effect of prerequisite skills in predicting students’ knowledge of post skills, and we make use of prerequisite performance in different student models, which makes them superior to traditional models.
K-armed bandits. We apply k-armed bandit algorithms to personalize interventions for students, to optimize their learning outcomes. Because educational experiments offer few diverse interventions and only small differences in intervention effectiveness, we also propose a simple selection strategy and compare it with several k-armed bandit algorithms.
Contextual bandits. In the contextual bandit problem, additional side information, also called context, can be used to determine which action to select. First, we construct a feature evaluation mechanism, which determines which features to combine with bandits. Second, we propose a new decision tree algorithm, which is capable of detecting aptitude-treatment effects for students. Third, combining bandits with the decision tree, we apply contextual bandits to personalization on two different types of data: simulated data and real experimental data.
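The role of context in a contextual bandit can be illustrated with a minimal tabular sketch: one running reward estimate is kept per (context, intervention) pair, so the preferred intervention can differ between groups of students. This is an assumption-laden illustration, not the dissertation's method, which combines bandits with a decision tree. The class name `ContextualEpsilonGreedy`, the epsilon-greedy rule, and the example contexts and interventions are hypothetical.

```python
import random
from collections import defaultdict


class ContextualEpsilonGreedy:
    """Hypothetical tabular contextual bandit with epsilon-greedy selection."""

    def __init__(self, arms, epsilon=0.1, seed=0):
        self.arms = list(arms)
        self.epsilon = epsilon
        self.rng = random.Random(seed)
        self.counts = defaultdict(int)    # (context, arm) -> number of pulls
        self.means = defaultdict(float)   # (context, arm) -> mean observed reward

    def select(self, context):
        """Pick a random arm with probability epsilon, else the arm with the
        highest estimated reward for this context (ties go to the first arm)."""
        if self.rng.random() < self.epsilon:
            return self.rng.choice(self.arms)
        return max(self.arms, key=lambda arm: self.means[(context, arm)])

    def update(self, context, arm, reward):
        """Incrementally update the mean reward for (context, arm)."""
        key = (context, arm)
        self.counts[key] += 1
        self.means[key] += (reward - self.means[key]) / self.counts[key]


# Illustrative usage with made-up contexts and interventions.
bandit = ContextualEpsilonGreedy(["hint", "video"], epsilon=0.0)
bandit.update("low_prior_knowledge", "hint", 1.0)
bandit.update("low_prior_knowledge", "video", 0.0)
bandit.update("high_prior_knowledge", "hint", 0.0)
bandit.update("high_prior_knowledge", "video", 1.0)
choice_low = bandit.select("low_prior_knowledge")
choice_high = bandit.select("high_prior_knowledge")
```

Dropping the context argument (or using a single constant context) reduces this to an ordinary k-armed bandit, which is the setting of the second aspect in the abstract.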