LIBERO: Benchmarking Knowledge Transfer for Lifelong Robot Learning
Lifelong learning offers a promising paradigm of building a generalist agent
that learns and adapts over its lifespan. Unlike traditional lifelong learning
problems in image and text domains, which primarily involve the transfer of
declarative knowledge of entities and concepts, lifelong learning in
decision-making (LLDM) also necessitates the transfer of procedural knowledge,
such as actions and behaviors. To advance research in LLDM, we introduce
LIBERO, a novel benchmark of lifelong learning for robot manipulation.
Specifically, LIBERO highlights five key research topics in LLDM: 1) how to
efficiently transfer declarative knowledge, procedural knowledge, or the
mixture of both; 2) how to design effective policy architectures and 3)
effective algorithms for LLDM; 4) the robustness of a lifelong learner with
respect to task ordering; and 5) the effect of model pretraining for LLDM. We
develop an extendible procedural generation pipeline that can in principle
generate infinitely many tasks. For benchmarking purposes, we create four task
suites (130 tasks in total) that we use to investigate the above-mentioned
research topics. To support sample-efficient learning, we provide high-quality
human-teleoperated demonstration data for all tasks. Our extensive experiments
present several insightful or even unexpected discoveries: sequential
finetuning outperforms existing lifelong learning methods in forward transfer,
no single visual encoder architecture excels at all types of knowledge
transfer, and naive supervised pretraining can hinder agents' performance in
the subsequent LLDM. The code and the datasets are available at
https://libero-project.github.io.
The sustainable delivery of sexual violence prevention education in schools
Sexual violence is a crime that cannot be ignored: it imposes significant consequences on our communities, including heavy economic costs, and evidence of its effects can be seen in our criminal justice system, public health system, Accident Compensation Corporation (ACC), and education system, particularly in our schools. Many agencies throughout New Zealand work to end sexual violence. Auckland-based Rape Prevention Education: Whakatu Mauri (RPE) is one such agency, and is committed to preventing sexual violence by providing a range of programmes and initiatives, information, education, and advocacy to a broad range of audiences. Up until early 2014 RPE had one or two full-time positions dedicated to co-ordinating and training a large pool (up to 15) of educators on casual contracts to deliver its main school-based programmes, BodySafe – approximately 450 modules per year, delivered to some 20 high schools. Each year several of the contract educators, many of whom were tertiary students, found secure full-time employment elsewhere. Retaining sufficient contract educators to deliver its BodySafe contract meant that RPE had to recruit, induct and train new educators two to three times every year. This model was expensive, resource-intensive, and ultimately untenable. The Executive Director and core staff at RPE wanted to develop a more efficient and stable model of delivery that fitted its scarce resources. To identify the most efficient model in use nationally and internationally, RPE, with Ministry of Justice funding, commissioned Massey University to undertake this report reviewing national and international research on sexual violence prevention education (SVPE).
[Background from Executive Summary.] Rape Prevention Education: Whakatu Maur
Rigorous Experimentation For Reinforcement Learning
Scientific fields make advancements by leveraging the knowledge created by others to push the boundary of understanding. The primary tool in many fields for generating knowledge is empirical experimentation. Although common, generating accurate knowledge from empirical experiments is often challenging due to inherent randomness in execution and confounding variables that can obscure the correct interpretation of the results. As such, researchers must hold themselves and others to a high degree of rigor when designing experiments. Unfortunately, most reinforcement learning (RL) experiments lack this rigor, making the knowledge generated from experiments dubious. This dissertation proposes methods to address central issues in RL experimentation.
Evaluating the performance of an RL algorithm is the most common type of experiment in RL literature. Most performance evaluations are incapable of answering a specific research question and produce misleading results. Thus, the first issue we address is how to create a performance evaluation procedure that holds up to scientific standards.
Despite the prevalence of performance evaluation, these types of experiments produce limited knowledge, e.g., they can only show how well an algorithm worked and not why, and they require significant amounts of time and computational resources. As an alternative, this dissertation proposes that scientific testing, the process of conducting carefully controlled experiments designed to further the knowledge and understanding of how an algorithm works, should be the primary form of experimentation.
Lastly, this dissertation provides a case study using policy gradient methods, showing how scientific testing can replace performance evaluation as the primary form of experimentation. As a result, this dissertation can motivate others in the field to adopt more rigorous experimental practices.
Reinforcement learning for sequential decision-making: a data driven approach for finance
This work presents a variety of reinforcement learning applications to the
domain of finance. It consists of two parts. The first is a technical
overview of the basic concepts in machine learning, which are required
to understand and work with the reinforcement learning paradigm and are
shared among the domains of application. Chapter 1 outlines the fundamental
principles of machine learning reasoning before introducing the neural
network model as a central component of every algorithm presented in this
work. Chapter 2 introduces the idea of reinforcement learning from its roots,
focusing on the mathematical formalism generally employed in every application.
We focus on integrating the reinforcement learning framework with
neural networks, and we explain their critical role in the field's development.
After the technical part, we present our original contribution, articulated
in three different essays. The narrative line follows the idea of introducing
the use of varying reinforcement learning algorithms through a trading application
(Brini and Tantari, 2021) in Chapter 3. Then in Chapter 4 we
focus on one of the presented reinforcement learning algorithms and aim at
improving its performance and scalability in solving the trading problem by
leveraging prior knowledge of the setting. In Chapter 5 of the second part,
we use the same reinforcement learning algorithm to solve the problem of
exchanging liquidity in a system of banks that can borrow and lend money,
highlighting the flexibility and the effectiveness of the reinforcement learning
paradigm in the broad financial domain. We conclude with some remarks
and ideas for further research in reinforcement learning applied to finance.
Batch Reinforcement Learning from Crowds
A shortcoming of batch reinforcement learning is its requirement for rewards
in the data, which makes it inapplicable to tasks without reward functions. Existing
settings that lack rewards, such as behavioral cloning, rely on optimal
demonstrations collected from humans. Unfortunately, extensive expertise is
required for ensuring optimality, which hinders the acquisition of large-scale
data for complex tasks. This paper addresses the lack of reward in a batch
reinforcement learning setting by learning a reward function from preferences.
Generating preferences only requires a basic understanding of a task. Being a
mental process, generating preferences is faster than performing
demonstrations, so preferences can be collected at scale from non-expert humans
using crowdsourcing. This paper tackles a critical challenge that emerges when
collecting data from non-expert humans: the noise in preferences. A novel
probabilistic model is proposed for modelling the reliability of labels, which
utilizes labels collaboratively. Moreover, the proposed model smooths the
estimation with a learned reward function. Evaluation on Atari datasets
demonstrates the effectiveness of the proposed model, followed by an ablation
study to analyze the relative importance of the proposed ideas.
Comment: 16 pages. Accepted by ECML-PKDD 202
Creativity: can artistic perspectives contribute to management?
Today creativity is considered a necessity in all aspects of management. This working paper compares the artistic and managerial conceptions of creativity. Although the two fields share some common points, deep-seated and radically opposed traits account for the divergence between them. This exploratory analysis opens up new research questions and insights into practices.