Classifying Options for Deep Reinforcement Learning

Authors: Arulkumaran Kai, Bharath Anil Anthony, Dilokthanakul Nat, Shanahan Murray
Publication date: 23/05/2016

In this paper we combine one method for hierarchical reinforcement learning, the options framework, with deep Q-networks (DQNs) through the use of different "option heads" on the policy network, and a supervisory network for choosing between the different options. We utilise our setup to investigate the effects of architectural constraints in subtasks with positive and negative transfer, across a range of network capacities. We empirically show that our augmented DQN has lower sample complexity when simultaneously learning subtasks with negative transfer, without degrading performance when learning subtasks with positive transfer.

Comment: IJCAI 2016 Workshop on Deep Reinforcement Learning: Frontiers and Challenge
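The "option heads" idea in the abstract above can be illustrated with a minimal sketch. Everything here is an assumption made for illustration, not the authors' architecture: the layer sizes, the random initialisation, the class name `OptionHeadQNetwork`, and the choice to let the supervisor read the raw state through a single dense layer are all hypothetical. The sketch only shows the forward pass: a shared trunk feeds one Q-value head per option, and a separate supervisor picks which head to use.

```python
import random

def linear(weights, bias, x):
    """Dense layer: one output per weight row."""
    return [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, bias)]

def relu(values):
    return [max(0.0, v) for v in values]

def init_layer(rng, n_out, n_in):
    """Random weights and zero biases (illustrative initialisation only)."""
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out

class OptionHeadQNetwork:
    """Hypothetical layout: shared trunk, one Q head per option, separate supervisor."""

    def __init__(self, n_inputs, n_hidden, n_actions, n_options, seed=0):
        rng = random.Random(seed)
        self.trunk = init_layer(rng, n_hidden, n_inputs)
        self.heads = [init_layer(rng, n_actions, n_hidden) for _ in range(n_options)]
        # Assumption: the supervisor is a single dense layer over the raw state.
        self.supervisor = init_layer(rng, n_options, n_inputs)

    def forward(self, state):
        """Return the option chosen by the supervisor and that head's Q-values."""
        features = relu(linear(*self.trunk, state))
        option_scores = linear(*self.supervisor, state)
        option = max(range(len(option_scores)), key=option_scores.__getitem__)
        q_values = linear(*self.heads[option], features)
        return option, q_values

    def act(self, state):
        """Greedy action under the selected option head."""
        option, q_values = self.forward(state)
        action = max(range(len(q_values)), key=q_values.__getitem__)
        return option, action

net = OptionHeadQNetwork(n_inputs=4, n_hidden=8, n_actions=3, n_options=2)
option, action = net.act([0.1, -0.2, 0.3, 0.0])
print(option, action)
```

Training (the DQN target, replay memory, and how the supervisor is learned) is deliberately left out, since the abstract does not specify it.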

arXiv.org e-Print Archive

Spiral - Imperial College Digital Repository

Functional Knowledge Transfer with Self-supervised Representation Learning

Authors: Chhipa Prakash Chandra, Chippa Meenakshi Subhash, Chopra Muskaan, De Kanjar, Gupta Varun, Liwicki Marcus, Mengi Gopal, Saini Rajkumar, Uchida Seiichi, Upadhyay Richa
Publication date: 10/07/2023

This work investigates the unexplored usability of self-supervised representation learning in the direction of functional knowledge transfer. In this work, functional knowledge transfer is achieved by joint optimization of self-supervised learning pseudo task and supervised learning task, improving supervised learning task performance. Recent progress in self-supervised learning uses a large volume of data, which becomes a constraint for its applications on small-scale datasets. This work shares a simple yet effective joint training framework that reinforces human-supervised task learning by learning self-supervised representations just-in-time and vice versa. Experiments on three public datasets from different visual domains, Intel Image, CIFAR, and APTOS, reveal a consistent track of performance improvements on classification tasks during joint optimization. Qualitative analysis also supports the robustness of learnt representations. Source code and trained models are available on GitHub.

Comment: Accepted at IEEE International Conference on Image Processing (ICIP 2023)
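Joint optimization of a pseudo task and a supervised task is commonly written as a single objective that sums the two losses. The sketch below assumes exactly that: both tasks are scored with cross-entropy and combined with a hypothetical weight `ssl_weight`. The abstract does not state which self-supervised loss or weighting the authors use, so treat the names and the formula as illustrative.

```python
import math

def cross_entropy(probs, label):
    """Negative log-likelihood of the correct class."""
    return -math.log(probs[label])

def joint_loss(sup_probs, sup_label, ssl_probs, ssl_label, ssl_weight=1.0):
    """Supervised loss plus a weighted pseudo-task loss (assumed form)."""
    supervised = cross_entropy(sup_probs, sup_label)
    self_supervised = cross_entropy(ssl_probs, ssl_label)
    return supervised + ssl_weight * self_supervised

# One example: class probabilities for the labelled task and for a pseudo task.
loss = joint_loss([0.5, 0.5], 0, [0.25, 0.75], 1, ssl_weight=2.0)
print(loss)
```

Because both terms are minimised in the same step, gradients from the pseudo task shape the shared representation while the supervised task is being learned, which is the "just-in-time" aspect the abstract describes.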

arXiv.org e-Print Archive

Deep Polyphonic ADSR Piano Note Transcription

Authors: Böck Sebastian, Kelz Rainer, Widmer Gerhard
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)
Publication date: 21/06/2019

We investigate a late-fusion approach to piano transcription, combined with a strong temporal prior in the form of a handcrafted Hidden Markov Model (HMM). The network architecture under consideration is compact in terms of its number of parameters and easy to train with gradient descent. The network outputs are fused over time in the final stage to obtain note segmentations, with an HMM whose transition probabilities are chosen based on a model of attack, decay, sustain, release (ADSR) envelopes, commonly used for sound synthesis. The note segments are then subject to a final binary decision rule that rejects note segment hypotheses that are too weak. We obtain state-of-the-art results on the MAPS dataset, and are able to outperform other approaches by a large margin when predicting complete note regions from onsets to offsets.

Comment: 5 pages, 2 figures, published as ICASSP'1
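Decoding per-frame network outputs with an ADSR-shaped HMM can be sketched with a standard Viterbi pass. The state names follow the abstract, but the transition probabilities, the single per-frame `activation` value, and the emission model below are assumptions chosen for illustration; the paper fuses several network outputs and uses its own handcrafted parameters.

```python
import math

STATES = ["off", "attack", "decay", "sustain", "release"]

# Illustrative left-to-right transition probabilities (assumed, not from the paper).
TRANSITIONS = {
    "off": {"off": 0.9, "attack": 0.1},
    "attack": {"attack": 0.5, "decay": 0.5},
    "decay": {"decay": 0.5, "sustain": 0.5},
    "sustain": {"sustain": 0.9, "release": 0.1},
    "release": {"release": 0.5, "off": 0.5},
}

def emission(state, activation):
    """Assumed emission model: sounding states like high activation, 'off' likes low."""
    prob = activation if state != "off" else 1.0 - activation
    return max(prob, 1e-9)

def viterbi(activations):
    """Most likely state sequence for one pitch, starting in 'off'."""
    scores = {state: -math.inf for state in STATES}
    scores["off"] = math.log(emission("off", activations[0]))
    backpointers = []
    for activation in activations[1:]:
        new_scores, pointers = {}, {}
        for state in STATES:
            best_prev, best = None, -math.inf
            for prev in STATES:
                prob = TRANSITIONS[prev].get(state, 0.0)
                if prob == 0.0 or scores[prev] == -math.inf:
                    continue
                candidate = scores[prev] + math.log(prob)
                if candidate > best:
                    best_prev, best = prev, candidate
            if best_prev is None:
                new_scores[state] = -math.inf
            else:
                new_scores[state] = best + math.log(emission(state, activation))
            pointers[state] = best_prev
        scores = new_scores
        backpointers.append(pointers)
    # Trace back from the best final state.
    state = max(STATES, key=scores.get)
    path = [state]
    for pointers in reversed(backpointers):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return path

frames = [0.05, 0.05, 0.95, 0.95, 0.95, 0.95, 0.05, 0.05]
print(viterbi(frames))
```

A note segment is then any maximal run of non-"off" states in the decoded path; the final rejection rule for weak segments mentioned in the abstract is not modelled here.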

arXiv.org e-Print Archive

Crossref

Deep Convolutional Neural Networks for Multilabel Prediction Using RGBD Data

Author: Wigand Liesl
Publication date: 09/03/2018

Robotics relies heavily on the system's ability to perceive the world around the robot accurately and quickly. In a narrow setting, as in manufacturing, this goal is relatively simple. To make robotics feasible in more dynamic settings we must handle more objects, more attributes, and events that may be out of the scope of what a system has been exposed to previously. To this end, the present work focuses on automatic feature formation from RGB-D data, using deep convolutional neural networks, in order to recognize not only objects but also attributes that apply across objects, including objects that have not been seen previously. Progress is shown in relation to more standard systems, and near real-time classification of multiple targets is achieved.

University of Nevada, Reno ScholarWorks Repository

Bag of Tricks for Training Data Extraction from Language Models

Authors: Du Chao, Huang Yan, Kang Bingyi, Lin Min, Liu Qian, Pang Tianyu, Yan Shuicheng, Yu Weichen
Publication date: 01/06/2023

With the advance of language models, privacy protection is receiving more attention. Training data extraction is therefore of great importance, as it can serve as a potential tool to assess privacy leakage. However, due to the difficulty of this task, most of the existing methods are proof-of-concept and still not effective enough. In this paper, we investigate and benchmark tricks for improving training data extraction using a publicly available dataset. Because most existing extraction methods use a pipeline of generating-then-ranking, i.e., generating text candidates as potential training data and then ranking them based on specific criteria, our research focuses on the tricks for both text generation (e.g., sampling strategy) and text ranking (e.g., token-level criteria). The experimental results show that several previously overlooked tricks can be crucial to the success of training data extraction. Based on the GPT-Neo 1.3B evaluation results, our proposed tricks outperform the baseline by a large margin in most cases, providing a much stronger baseline for future research. The code is available at https://github.com/weichen-yu/LM-Extraction.

Comment: ICML 202
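The generating-then-ranking pipeline described above can be sketched end to end with a toy stand-in for the language model. The fixed distribution `TOY_DISTRIBUTION`, the top-k sampler, and the use of perplexity as the ranking criterion are assumptions for illustration; they are not the tricks benchmarked in the paper, and a real pipeline would query an actual model at each step.

```python
import math
import random

# Hypothetical stand-in for a language model: a fixed next-token distribution
# that ignores the prefix.
TOY_DISTRIBUTION = {"a": 0.5, "b": 0.3, "c": 0.15, "d": 0.05}

def sample_top_k(rng, distribution, k):
    """Sample one token from the k most probable tokens, weighted by probability."""
    top = sorted(distribution.items(), key=lambda item: item[1], reverse=True)[:k]
    tokens = [token for token, _ in top]
    weights = [prob for _, prob in top]
    return rng.choices(tokens, weights=weights, k=1)[0]

def generate(rng, length, k):
    """Generation step: sample a candidate and record its token log-probabilities."""
    tokens = [sample_top_k(rng, TOY_DISTRIBUTION, k) for _ in range(length)]
    logprobs = [math.log(TOY_DISTRIBUTION[token]) for token in tokens]
    return tokens, logprobs

def perplexity(logprobs):
    """Exponentiated mean negative log-probability; lower means more likely."""
    return math.exp(-sum(logprobs) / len(logprobs))

def generate_then_rank(n_candidates, length=5, k=3, seed=0):
    """Ranking step: order candidates from lowest to highest perplexity."""
    rng = random.Random(seed)
    candidates = [generate(rng, length, k) for _ in range(n_candidates)]
    return sorted(candidates, key=lambda candidate: perplexity(candidate[1]))

for tokens, logprobs in generate_then_rank(n_candidates=4):
    print("".join(tokens), round(perplexity(logprobs), 3))
```

Candidates at the top of the ranking are the ones the model finds most predictable, which is the usual heuristic for suspecting memorised training text.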

arXiv.org e-Print Archive