CopyCAT: Taking Control of Neural Policies with Constant Attacks

Author: Geist Matthieu
Hussenot Léonard
Pietquin Olivier
Publication date: 21/01/2020

We propose a new perspective on adversarial attacks against deep reinforcement learning agents. Our main contribution is CopyCAT, a targeted attack able to consistently lure an agent into following an outsider's policy. It is pre-computed, therefore fast at inference time, and could thus be usable in a real-time scenario. We show its effectiveness on Atari 2600 games in the novel read-only setting. In this setting, the adversary cannot directly modify the agent's state (its representation of the environment) but can only attack the agent's observation (its perception of the environment). Directly modifying the agent's state would require write access to the agent's inner workings, and we argue that this assumption is too strong in realistic settings. Comment: AAMAS 202

arXiv.org e-Print Archive

INRIA a CCSD electronic archive server

HAL Descartes

Hal-Diderot

Investigating Circumferential Non-Uniformities in Throughflow Calculations using an Harmonic Reconstruction

Author: Léonard Olivier
Thomas Jean-Philippe
Publication venue: 'ASME International'
Publication date: 01/06/2008

Peer reviewed. The flow field in a multistage turbomachine is very complex: it is 3D, unsteady and turbulent. Even if modern simulation tools can describe most of the flow features, the computation time needed and the extraction of useful information remain severe drawbacks to the systematic use of URANS codes in a design procedure. In this context, throughflow simulation has proved more convenient. Nevertheless, the need for empiricism limits the potential of throughflow solvers. As an alternative, Adamczyk (1984) proposed three averaging operators (ensemble, time and passage) that lead to the average-passage model, linking the unsteady turbulent flow field to a steady flow field in a typical blade passage. This model involves additional terms that respectively bring back the mean effect of turbulence, deterministic unsteadiness and aperiodicity on the mean periodic flow. These terms need to be modelled; this is the closure problem. Harmonic closure, which consists in solving a linearized perturbation system in the frequency domain, has proved to be an efficient method to approximate deterministic stresses (He and Ning, 1998; Stridh, 2005; Vilmin, 2006). A fourth averaging can be performed, a circumferential averaging, giving rise to the throughflow model. Additional terms appear: the so-called circumferential stresses. It has been proven that these terms play an important role in the description of the flow (Jennions, 1986; Perrin, 1995), being at least as significant as deterministic stresses. Introducing these terms in a throughflow simulation makes it possible to reproduce the averaged 3D steady flow field (Simon, 2007). The aim of the present contribution is to prove that the harmonic method can potentially be used to reconstruct circumferential stresses. The importance of circumferential stresses in a throughflow simulation is first highlighted on a single-stage low-speed compressor test case, for viscous and inviscid flow fields. The second step is the characterization of the frequency spectrum of the circumferential perturbation field. Next, the stresses associated with a Fourier reconstruction of the perturbation field are compared with the real ones. Finally, the approximated circumferential stresses are injected into a throughflow simulation tool to analyse and demonstrate their capability to reproduce a 3D averaged flow field.
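The comparison between the stresses of a truncated Fourier reconstruction and the real ones can be illustrated with a minimal sketch. It is not the authors' solver: the velocity signals `u` and `v`, the grid size and the function names are hypothetical, and the code only truncates a known field to its first harmonics rather than solving a linearized perturbation system.

```python
import numpy as np

def circumferential_stress(u, v):
    """Circumferential average of the product of perturbations, mean(u' v')."""
    u_pert = u - u.mean()  # perturbation about the circumferential mean
    v_pert = v - v.mean()
    return float(np.mean(u_pert * v_pert))

def truncate_harmonics(f, n_harmonics):
    """Keep the mean and the first n_harmonics circumferential Fourier modes."""
    coeffs = np.fft.rfft(f)
    coeffs[n_harmonics + 1:] = 0.0  # discard higher harmonics
    return np.fft.irfft(coeffs, n=f.size)

# Hypothetical velocity components sampled over one blade pitch.
theta = np.linspace(0.0, 2.0 * np.pi, 64, endpoint=False)
u = 1.0 + 0.3 * np.cos(theta) + 0.1 * np.cos(5 * theta)
v = 0.5 + 0.2 * np.cos(theta) + 0.05 * np.cos(5 * theta)

exact = circumferential_stress(u, v)
approx = circumferential_stress(truncate_harmonics(u, 1),
                                truncate_harmonics(v, 1))
print(exact, approx)
```

In this toy case a single harmonic recovers about 0.03 of the full stress of about 0.0325, which is the kind of gap a study of the frequency spectrum is meant to quantify.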

Open Repository and Bibliography - Liège

Compressor and Turbine Blade Design by Optimization

Author: Duysinx Pierre
Léonard Olivier
Rothilde André
Publication date: 01/05/1999

Compressor and turbine blade design involves thermodynamical, aerodynamical and mechanical aspects, resulting in a large number of iterations. Inverse methods and optimization procedures help the designer in this long and possibly frustrating process. In this paper an optimization procedure is presented which solves two types of two-dimensional or quasi-three-dimensional problems: the inverse problem, for which a target velocity distribution is imposed, and a more global problem, in which the aerodynamic load is maximized.

Open Repository and Bibliography - Liège

Compaction behavior of out-of-autoclave prepreg materials

Author: Cinquin Jacques
Olivier Philippe
Serrano Léonard
Publication venue: 'AIP Publishing'
Publication date: 01/01/2017

The main challenges with composite parts manufacturing are related to the curing means, mainly autoclaves, the length of their cycles and their operating costs. In order to decrease this dependency, out-of-autoclave materials have been considered as a solution for high-production-rate parts such as spars, flaps, etc. However, most out-of-autoclave processes do not possess the same maturity as their autoclave counterparts, especially concerning part quality. Some pre-cure processes such as compaction and ply lay-up are usually less of a concern for autoclave manufacturing: the pressure applied during the cycle helps reduce the potential defects (porosity caused by a poor-quality lay-up, bad compaction, entrapped air or humidity…). For out-of-autoclave parts, those are crucial steps which may have many consequences on the final quality of the laminate. In order to avoid this quality loss, those steps must be well understood.

Open Archive Toulouse Archive Ouverte

HAL-INSA Toulouse

Primal Wasserstein Imitation Learning

Author: Dadashi Robert
Geist Matthieu
Hussenot Léonard
Pietquin Olivier
Publication date: 17/03/2021

Imitation Learning (IL) methods seek to match the behavior of an agent with that of an expert. In the present work, we propose a new IL method based on a conceptually simple algorithm: Primal Wasserstein Imitation Learning (PWIL), which ties to the primal form of the Wasserstein distance between the expert and the agent state-action distributions. We present a reward function which is derived offline, as opposed to recent adversarial IL algorithms that learn a reward function through interactions with the environment, and which requires little fine-tuning. We show that we can recover expert behavior on a variety of continuous control tasks of the MuJoCo domain in a sample-efficient manner in terms of agent interactions and of expert interactions with the environment. Finally, we show that the behavior of the agent we train matches the behavior of the expert with the Wasserstein distance, rather than the commonly used proxy of performance. Comment: Published in International Conference on Learning Representations (ICLR 2021

arXiv.org e-Print Archive

INRIA a CCSD electronic archive server

HAL Descartes

Hal-Diderot

Vibrated polar disks: spontaneous motion, binary collisions, and collective dynamics

Author: Chaté Hugues
Dauchot Olivier
Deseigne Julien
Léonard Sébastien
Publication date: 01/01/2012

We study the spontaneous motion, binary collisions, and collective dynamics of "polar disks", i.e. purpose-built particles which, when vibrated between two horizontal plates, move coherently along a direction strongly correlated to their intrinsic polarity. The motion of our particles, although nominally three-dimensional and complicated, is well accounted for by a two-dimensional persistent random walk. Their binary collisions are spatiotemporally extended events during which multiple actual collisions happen, yielding a weak average effective alignment. We show that this well-controlled, "dry active matter" system can display collective motion with orientationally-ordered regions of the order of the system size. We provide evidence of strong number density in the most ordered regimes observed. These results are discussed in the light of the limitations of our system, notably those due to the inevitable presence of walls. Comment: 13 pages, 10 figures, 4 movie
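For readers unfamiliar with the term, a two-dimensional persistent random walk can be sketched as a particle moving at constant speed while its heading diffuses. The sketch below is a generic textbook model, not the authors' fitted description: `speed`, `dt`, `rot_diffusion` and the function name are hypothetical illustration values.

```python
import numpy as np

def persistent_random_walk(n_steps, speed=1.0, dt=0.1,
                           rot_diffusion=0.5, seed=0):
    """Return an (n_steps + 1, 2) array of positions, starting at the origin."""
    rng = np.random.default_rng(seed)
    # The heading angle performs a simple random walk (rotational diffusion).
    dtheta = rng.normal(0.0, np.sqrt(2.0 * rot_diffusion * dt), size=n_steps)
    theta = np.cumsum(dtheta)
    # Each step has fixed length speed * dt along the current heading.
    steps = speed * dt * np.stack([np.cos(theta), np.sin(theta)], axis=1)
    return np.vstack([np.zeros((1, 2)), np.cumsum(steps, axis=0)])

path = persistent_random_walk(1000)
print(path.shape)
```

A smaller `rot_diffusion` gives straighter, more persistent trajectories; a larger one approaches ordinary diffusion.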

arXiv.org e-Print Archive

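Étude et réalisation d'un système auteur avec simulateur de systèmes d'équations différentielles intégré : profCOMP et simEDO [Design and implementation of an authoring system with an integrated simulator of systems of differential equations: profCOMP and simEDO]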

Author: Léonard Claude
Marchand Olivier
Publication date: 01/01/1988

Repository of the University of Namur

Offline Reinforcement Learning as Anti-Exploration

Author: Bachem Olivier
Dadashi Robert
Geist Matthieu
Hussenot Léonard
Pietquin Olivier
Rezaeifar Shideh
Vieillard Nino
Publication date: 11/06/2021

Offline Reinforcement Learning (RL) aims at learning an optimal control from a fixed dataset, without interactions with the system. An agent in this setting should avoid selecting actions whose consequences cannot be predicted from the data. This is the converse of exploration in RL, which favors such actions. We thus take inspiration from the literature on bonus-based exploration to design a new offline RL agent. The core idea is to subtract a prediction-based exploration bonus from the reward, instead of adding it for exploration. This allows the policy to stay close to the support of the dataset. We connect this approach to a more common regularization of the learned policy towards the data. Instantiated with a bonus based on the prediction error of a variational autoencoder, we show that our agent is competitive with the state of the art on a set of continuous control locomotion and manipulation tasks.
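The core idea of subtracting a bonus from the reward can be sketched in a few lines. This is an illustration under stated assumptions, not the authors' agent: the bonus here is a nearest-neighbour distance to the dataset, used as a simple stand-in for the variational autoencoder prediction error, and the names `novelty_bonus`, `anti_exploration_reward` and `alpha` are hypothetical.

```python
import numpy as np

def novelty_bonus(dataset_sa, query_sa):
    """Distance from each query state-action pair to its nearest dataset pair.

    Assumption: this stands in for a learned prediction error; it is large
    for pairs far from the support of the dataset.
    """
    diffs = query_sa[:, None, :] - dataset_sa[None, :, :]
    return np.linalg.norm(diffs, axis=-1).min(axis=1)

def anti_exploration_reward(reward, bonus, alpha=1.0):
    """Penalize the reward with the bonus instead of adding it."""
    return reward - alpha * bonus

dataset = np.array([[0.0, 0.0], [1.0, 0.0]])  # logged state-action pairs
queries = np.array([[0.0, 0.0], [0.0, 3.0]])  # in-support vs. far from support
rewards = np.array([1.0, 1.0])

penalized = anti_exploration_reward(rewards, novelty_bonus(dataset, queries),
                                    alpha=0.5)
print(penalized)
```

The in-support pair keeps its reward, while the out-of-support pair is penalized in proportion to its distance from the data.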

arXiv.org e-Print Archive

INRIA a CCSD electronic archive server

Hal-Diderot

Association for the Advancement of Artificial Intelligence: AAAI Publications

Offline Reinforcement Learning with Pseudometric Learning

Author: Dadashi Robert
Geist Matthieu
Hussenot Léonard
Pietquin Olivier
Rezaeifar Shideh
Vieillard Nino
Publication date: 02/06/2021

Offline Reinforcement Learning methods seek to learn a policy from logged transitions of an environment, without any interaction. In the presence of function approximation, and under the assumption of limited coverage of the state-action space of the environment, it is necessary to constrain the policy to visit state-action pairs close to the support of logged transitions. In this work, we propose an iterative procedure to learn a pseudometric (closely related to bisimulation metrics) from logged transitions, and use it to define this notion of closeness. We show its convergence and extend it to the function approximation setting. We then use this pseudometric to define a new lookup-based bonus in an actor-critic algorithm: PLOFF. This bonus encourages the actor to stay close, in terms of the defined pseudometric, to the support of logged transitions. Finally, we evaluate the method on hand manipulation and locomotion tasks. Comment: ICML 202

arXiv.org e-Print Archive

INRIA a CCSD electronic archive server

Hal-Diderot

Model-based verification of a security protocol for conditional access to services

Author: Charles Pecheur
E. Koerner
Eckhart Koerner
Guy Leduc
Luc Léonard
Olivier Bonaventure
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/1999

Peer reviewed. We use the formal language LOTOS to specify and verify the robustness of the Equicrypt protocol under design in the European OKAPI project for conditional access to multimedia services. We state some desired security properties and formalize them. We describe a generic intruder process and its modelling, and show that some properties are falsified in the presence of this intruder. The diagnostic sequences can be used almost directly to exhibit the scenarios of possible attacks on the protocol. Finally, we propose an improvement of the protocol which satisfies our properties.

CiteSeerX

Open Repository and Bibliography - Liège

DIAL UCLouvain