

A Linearly Constrained Nonparametric Framework for Imitation Learning

Author: Caldwell DG
Huang Y
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/09/2020

In recent years, a myriad of advanced results have been reported in the imitation learning community, ranging from parametric to non-parametric, probabilistic to non-probabilistic and Bayesian to frequentist approaches. Meanwhile, ample applications (e.g., grasping tasks and human-robot collaborations) further show the applicability of imitation learning in a wide range of domains. While a large body of literature is dedicated to the learning of human skills in unconstrained environments, the problem of learning constrained motor skills has not received equal attention. In fact, constrained skills exist widely in robotic systems. For instance, when a robot is required to write letters on a board, its end-effector trajectory must comply with the plane constraint imposed by the board. In this paper, we propose linearly constrained kernelized movement primitives (LC-KMP) to tackle the problem of imitation learning with linear constraints. Specifically, we propose to exploit the probabilistic properties of multiple demonstrations, and subsequently incorporate them into a linearly constrained optimization problem, which finally leads to a non-parametric solution. In addition, a connection between our framework and classical model predictive control is provided. Several examples, including simulated writing and locomotion tasks, are presented to show the effectiveness of our framework.
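The plane constraint mentioned in the abstract above is a linear equality constraint of the form normal . x = offset. The sketch below is not LC-KMP; it only illustrates, under simplifying assumptions, the simplest linearly constrained problem of this kind: finding the point closest to an unconstrained reference point that still lies on the plane. The function names and the example numbers are hypothetical and do not come from the paper.

```python
def project_onto_plane(point, normal, offset):
    """Return the closest point x to `point` that satisfies normal . x = offset.

    This is the closed-form solution of
        minimize ||x - point||^2  subject to  normal . x = offset,
    used here only as an illustration of a linear equality constraint.
    """
    norm_sq = sum(n * n for n in normal)
    if norm_sq == 0:
        raise ValueError("normal must be a non-zero vector")
    # How far the unconstrained point violates the constraint.
    residual = sum(n * p for n, p in zip(normal, point)) - offset
    scale = residual / norm_sq
    # Move along the normal direction until the constraint holds.
    return [p - scale * n for p, n in zip(point, normal)]


def project_trajectory(points, normal, offset):
    """Apply the projection independently to every point of a trajectory."""
    return [project_onto_plane(p, normal, offset) for p in points]


if __name__ == "__main__":
    # Hypothetical board at height z = 0.5 with normal along the z axis.
    trajectory = [[1.0, 2.0, 3.0], [0.0, 1.0, 0.5]]
    print(project_trajectory(trajectory, [0.0, 0.0, 1.0], 0.5))
```

Projecting each point independently ignores the demonstration statistics and the kernel-based smoothing that the abstract describes, so this should be read as a picture of the constraint itself rather than of the proposed method.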

Crossref

White Rose Research Online

Multi-view Learning as a Nonparametric Nonlinear Inter-Battery Factor Analysis

Author: Damianou Andreas
Ek Carl Henrik
Lawrence Neil D.
Publication date: 17/04/2016

Factor analysis aims to determine latent factors, or traits, which summarize a given data set. Inter-battery factor analysis extends this notion to multiple views of the data. In this paper we show how a nonlinear, nonparametric version of these models can be recovered through the Gaussian process latent variable model. This gives us a flexible formalism for multi-view learning where the latent variables can be used both for exploratory purposes and for learning representations that enable efficient inference for ambiguous estimation tasks. Learning is performed in a Bayesian manner through the formulation of a variational compression scheme which gives a rigorous lower bound on the log likelihood. Our Bayesian framework provides strong regularization during training, allowing the structure of the latent space to be determined efficiently and automatically. We demonstrate this by producing the first (to our knowledge) published results of learning from dozens of views, even when data is scarce. We further show experimental results on several different types of multi-view data sets and for different kinds of tasks, including exploratory data analysis, generation, ambiguity modelling through latent priors and classification. Comment: 49 pages including appendi

arXiv.org e-Print Archive

Explore Bristol Research

On Nonparametric Guidance for Learning Autoencoder Representations

Author: Adams Ryan Prescott
Larochelle Hugo
Snoek Jasper
Publication date: 25/10/2011

Unsupervised discovery of latent representations, in addition to being useful for density modeling, visualisation and exploratory data analysis, is also increasingly important for learning features relevant to discriminative tasks. Autoencoders, in particular, have proven to be an effective way to learn latent codes that reflect meaningful variations in data. A continuing challenge, however, is guiding an autoencoder toward representations that are useful for particular tasks. A complementary challenge is to find codes that are invariant to irrelevant transformations of the data. The most common way of introducing such problem-specific guidance in autoencoders has been through the incorporation of a parametric component that ties the latent representation to the label information. In this work, we argue that a preferable approach relies instead on a nonparametric guidance mechanism. Conceptually, it ensures that there exists a function that can predict the label information, without explicitly instantiating that function. The superiority of this guidance mechanism is confirmed on two datasets. In particular, this approach is able to incorporate invariance information (lighting, elevation, etc.) from the small NORB object recognition dataset and yields state-of-the-art performance for a single layer, non-convolutional network. Comment: 9 pages, 12 figures

arXiv.org e-Print Archive

Harvard University - DASH

Strategic Intellectual Property Rights Policy and North-South Technology Transfer

Author: Alireza Naghavi

This paper analyzes welfare implications of protecting intellectual property rights (IPR) in the framework of TRIPS for developing countries (South) through its impact on innovation, market structure and technology transfer. In a North-South trade environment, the South sets its IPR policy strategically to manipulate multinationals’ decisions on innovation and location. Firms can protect their technology by exporting or risk spillovers by undertaking FDI to avoid tariffs. A stringent IPR regime is always optimal for the South as it triggers technology transfer by inducing FDI in less R&D-intensive industries and stimulates innovation by pushing multinationals to deter entry in high-technology sectors. Keywords: Intellectual property rights, Technology transfer, Multinational firms, Foreign direct investment, North-South trade

Research Papers in Economics

On Multi-objective Policy Optimization as a Tool for Reinforcement Learning

Author: Abdolmaleki Abbas
Bousmalis Konstantinos
Byravan Arunkumar
Gyorgy Andras
Hadsell Raia
Heess Nicolas
Huang Sandy H.
Mishra Shruti
Riedmiller Martin
Shahriari Bobak
Springenberg Jost Tobias
Szepesvari Csaba
TB Dhruva
Vezzani Giulia
Publication date: 15/06/2021

Many advances that have improved the robustness and efficiency of deep reinforcement learning (RL) algorithms can, in one way or another, be understood as introducing additional objectives, or constraints, in the policy optimization step. This includes ideas as far-ranging as exploration bonuses, entropy regularization, and regularization toward teachers or data priors when learning from experts or in offline RL. Often, task reward and auxiliary objectives are in conflict with each other and it is therefore natural to treat these examples as instances of multi-objective (MO) optimization problems. We study the principles underlying MORL and introduce a new algorithm, Distillation of a Mixture of Experts (DiME), that is intuitive and scale-invariant under some conditions. We highlight its strengths on standard MO benchmark problems and consider case studies in which we recast offline RL and learning from experts as MO problems. This leads to a natural algorithmic formulation that sheds light on the connection between existing approaches. For offline RL, we use the MO perspective to derive a simple algorithm that optimizes for the standard RL objective plus a behavioral cloning term. This outperforms the state of the art on two established offline RL benchmarks.
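The last idea in the abstract above, an RL objective combined with a behavioral cloning term, can be pictured as a two-term loss. The sketch below is not DiME and is not taken from the paper; it is a generic weighted sum under assumed choices: the RL term is the negated mean critic value of the policy's actions, the behavioral cloning term is a mean squared error against dataset actions, and `bc_weight` is a hypothetical trade-off parameter.

```python
def combined_loss(q_values, policy_actions, dataset_actions, bc_weight):
    """Illustrative loss: RL term plus a weighted behavioral cloning term.

    q_values:        critic estimates Q(s, pi(s)), one float per sample
    policy_actions:  actions proposed by the policy, one list per sample
    dataset_actions: actions recorded in the offline dataset, same shape
    bc_weight:       assumed scalar trading off the two objectives
    """
    n = len(q_values)
    if n == 0 or len(policy_actions) != n or len(dataset_actions) != n:
        raise ValueError("inputs must be non-empty and of equal length")
    # RL term: maximizing the critic value means minimizing its negation.
    rl_term = -sum(q_values) / n
    # Behavioral cloning term: squared distance to the dataset action.
    bc_term = sum(
        sum((p - d) ** 2 for p, d in zip(pa, da))
        for pa, da in zip(policy_actions, dataset_actions)
    ) / n
    return rl_term + bc_weight * bc_term


if __name__ == "__main__":
    # Two samples with one-dimensional actions (made-up numbers).
    print(combined_loss([1.0, 3.0], [[0.0], [1.0]], [[1.0], [1.0]], 0.5))
```

A fixed weighted sum is sensitive to the relative scale of the two terms, which is one reason the abstract stresses scale invariance; the sketch does not attempt to address that.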

arXiv.org e-Print Archive

Bayesian Nonparametric Learning of Cloth Models for Real-time State Estimation

Author: Ikeda Kazushi
Koganti Nishanth
Shibata Tomohiro
Tamei Tomoya
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 17/05/2017

Robotic solutions to clothing assistance can significantly improve quality of life for the elderly and disabled. Real-time estimation of the human-cloth relationship is crucial for efficient learning of motor skills for robotic clothing assistance. The major challenge involved is cloth-state estimation, owing to the inherent nonrigidity and occlusion. In this study, we present a novel framework for real-time estimation of the cloth state using a low-cost depth sensor, making it suitable for a feasible social implementation. The framework relies on the hypothesis that clothing articles are constrained to a low-dimensional latent manifold during clothing tasks. We propose the use of manifold relevance determination (MRD) to learn an offline cloth model that can be used to perform informed cloth-state estimation in real time. The cloth model is trained using observations from a motion capture system and depth sensor. MRD provides a principled probabilistic framework for inferring the accurate motion-capture state when only the noisy depth sensor feature state is available in real time. The experimental results demonstrate that our framework is capable of learning consistent task-specific latent features using few data samples and has the ability to generalize to unseen environmental settings. We further present several factors that affect the predictive performance of the learned cloth-state model.

Kyutacar : Kyushu Institute of Technology Academic Repository