373 research outputs found

Budgeted Knowledge Transfer for State-wise Heterogeneous RL Agents

Author: Farshidian Farbod
Nili Ahmadabadi Majid
Talebpour Zeynab
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 17/05/2016

In this paper we introduce a budgeted knowledge transfer algorithm for non-homogeneous reinforcement learning agents. Here the source and target agents are identical except in their state representations. The algorithm uses the functional space (Q-value space) as the transfer-learning medium. In this method, the target agent’s functional points (Q-values) are estimated in an automatically selected lower-dimensional subspace in order to accelerate knowledge transfer. During knowledge transfer, the target agent searches that subspace using an exploration policy and selects actions accordingly, which helps it obtain an appropriate estimate of its Q-table. We show both analytically and empirically that this method decreases the learning budget required by the target agent.

Infoscience - École polytechnique fédérale de Lausanne

Budgeted Knowledge Transfer for State-Wise Heterogeneous RL Agents

Author: A.W. Moore
M.E. Taylor
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2012

Crossref

Advances in Intelligent Vehicle Control

Author
Publication venue: 'MDPI AG'
Publication date: 06/12/2022

This book is a printed edition of the Special Issue Advances in Intelligent Vehicle Control, which was published in the journal Sensors. It presents a collection of eleven papers that cover a range of topics, such as the development of intelligent control algorithms for active safety systems, smart sensors, and intelligent and efficient driving. The contributions presented in these papers can serve as useful tools for researchers who are interested in new vehicle technology and in the improvement of vehicle control systems.

Directory of Open Access Books (DOAB)

INDIRECT TASK-ORIENTED COMMUNICATION DESIGN FOR CONTROL AND DECISION MAKING IN MULTI-AGENT SYSTEMS

Author: Mostaani Arsham
Publication venue: University of Luxembourg, Luxembourg
Publication date: 12/05/2023

Open Repository and Bibliography - Luxembourg

Adversarial jamming attacks and defense strategies via adaptive deep reinforcement learning

Author: Gursoy M. Cenk
Velipasalar Senem
Wang Feng
Zhong Chen
Publication venue
Publication date: 12/07/2020

As the applications of deep reinforcement learning (DRL) in wireless communications grow, the sensitivity of DRL-based wireless communication strategies to adversarial attacks has started to draw increasing attention. To address this sensitivity and alleviate the resulting security concerns, in this paper we consider a victim user that performs DRL-based dynamic channel access and an attacker that executes DRL-based jamming attacks to disrupt the victim. Hence, both the victim and the attacker are DRL agents and can interact with each other, retrain their models, and adapt to their opponent's policies. In this setting, we first develop an adversarial jamming attack policy that aims to minimize the accuracy of the victim's decision making on dynamic channel access. Subsequently, we propose three defense strategies against such an attacker, namely diversified defense with proportional-integral-derivative (PID) control, diversified defense with an imitation attacker, and defense via orthogonal policies. We design these strategies to maximize the attacked victim's accuracy and evaluate their performance. Comment: 13 pages, 24 figures

arXiv.org e-Print Archive

Decision-Making with Multi-Step Expert Advice on the Web

Author: Philipp Patrick Raoul
Publication venue: KIT-Bibliothek, Karlsruhe
Publication date: 01/01/2019

This thesis deals with solving multi-step tasks by using advice from experts, which are algorithms that solve individual steps of such tasks. We contribute methods for maximizing the number of correct task solutions by selecting and combining experts for individual task instances, and methods for automating the process of solving tasks on the Web, where experts are available as Web services. Multi-step tasks frequently occur in Natural Language Processing (NLP) and Computer Vision, and as research progresses an increasing number of interchangeable experts for the same steps are available on the Web. Service provider platforms such as Algorithmia monetize expert access by making expert services available via their platform and having customers pay for single executions. Such experts can be used to solve diverse tasks, which often consist of multiple steps and thus require pipelines of experts to generate hypotheses. We identify two distinct problems in solving multi-step tasks with expert services: (1) Given that the task is sufficiently complex, no single pipeline generates correct solutions for all possible task instances. One must therefore learn how to construct individual expert pipelines for individual task instances in order to maximize the number of correct solutions, while also taking into account the costs incurred by executing an expert. (2) To automatically solve multi-step tasks with expert services, we need to discover, execute and compose expert pipelines. With mostly textual descriptions of complex functionalities and input parameters, Web automation entails integrating available expert services and data, interpreting user-specified task goals, and efficiently finding correct service configurations.
In this thesis, we present solutions to both problems: (1) We enable the learning of well-performing expert pipelines, assuming that reference data sets (comprising a number of task instances and solutions) are available, and we distinguish between centralized and decentralized decision-making. We formalize the problem as a specialization of a Markov Decision Process (MDP), which we refer to as an Expert Process (EP), and integrate techniques from Statistical Relational Learning (SRL) and Multiagent coordination. (2) We develop a framework for automatically discovering, executing and composing expert pipelines by exploiting methods developed for the Semantic Web. We lift the representations of experts with structured vocabularies modeled with the Resource Description Framework (RDF) and extend EPs to Semantic Expert Processes (SEPs) to enable the data-driven execution of experts in Web-based architectures. We evaluate our methods in different domains, namely Medical Assistance, with tasks in Image Processing and Surgical Phase Recognition, and NLP for textual data on the Web, where we deal with the task of Named Entity Recognition and Disambiguation (NERD).

KITopen

Active Learning for Reducing Labeling Effort in Text Classification Tasks

Author: Jacobs Pieter Floris
Maillette De Buy Wenniger Gideon
Schomaker Lambert
Wiering Marco
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 10/09/2021

Labeling data can be an expensive task as it is usually performed manually by domain experts. This is cumbersome for deep learning, as it is dependent on large labeled datasets. Active learning (AL) is a paradigm that aims to reduce labeling effort by using only the data that the model in use deems most informative. Little research has been done on AL in a text classification setting and next to none has involved the more recent, state-of-the-art Natural Language Processing (NLP) models. Here, we present an empirical study that compares different uncertainty-based algorithms with BERT_{base} as the classifier. We evaluate the algorithms on two NLP classification datasets: Stanford Sentiment Treebank and KvK-Frontpages. Additionally, we explore heuristics that aim to solve presupposed problems of uncertainty-based AL, namely that it is unscalable and that it is prone to selecting outliers. Furthermore, we explore the influence of the query-pool size on the performance of AL. Although the proposed heuristics did not improve the performance of AL, our results show that uncertainty-based AL with BERT_{base} outperforms random sampling of data. This difference in performance can decrease as the query-pool size gets larger. Comment: Accepted as a conference paper at the joint 33rd Benelux Conference on Artificial Intelligence and the 30th Belgian Dutch Conference on Machine Learning (BNAIC/BENELEARN 2021). This camera-ready version, submitted to BNAIC/BENELEARN, adds several improvements, including a more thorough discussion of related work and an extended discussion section. 28 pages including references and appendices

arXiv.org e-Print Archive

Proceedings - University of Groningen

University of Groningen

ARTS repository - University of Groningen

Dissertations of the University of Groningen