65 research outputs found
Gradient-Informed Quality Diversity for the Illumination of Discrete Spaces
Quality Diversity (QD) algorithms have been proposed to search for a large
collection of both diverse and high-performing solutions instead of a single
set of local optima. While early QD algorithms view the objective and
descriptor functions as black-box functions, novel tools have been introduced
to use gradient information to accelerate the search and improve overall
performance of those algorithms over continuous input spaces. However, a broad
range of applications involve discrete spaces, such as drug discovery or image
generation. Exploring those spaces is challenging as they are combinatorially
large and gradients cannot be used in the same manner as in continuous spaces.
We introduce MAP-Elites with a Gradient-Informed Discrete Emitter (ME-GIDE),
which extends QD optimisation with differentiable functions over discrete
search spaces. ME-GIDE leverages the gradient information of the objective and
descriptor functions with respect to its discrete inputs to propose
gradient-informed updates that guide the search towards a diverse set of
high-quality solutions. We evaluate our method on challenging benchmarks including
protein design and discrete latent space illumination and find that our method
outperforms state-of-the-art QD algorithms on all benchmarks.
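As a rough illustration of a gradient-informed discrete update, the sketch below scores every single-token substitution of a one-hot sequence by a first-order estimate of its effect on the objective and samples one substitution proportionally. The scoring and sampling scheme is an illustrative assumption, not ME-GIDE's exact emitter.

```python
import numpy as np

def gradient_informed_mutation(x_onehot, grad, temperature=1.0, rng=None):
    """Propose a single-token substitution guided by the gradient.

    x_onehot: (L, V) one-hot encoding of a discrete sequence.
    grad:     (L, V) gradient of the objective w.r.t. x_onehot.
    A first-order estimate of the objective change when replacing the
    token at position i with value v is grad[i, v] - grad[i, current].
    """
    rng = rng or np.random.default_rng()
    L, V = x_onehot.shape
    current = x_onehot.argmax(axis=1)
    # Estimated improvement for every (position, value) substitution.
    delta = grad - grad[np.arange(L), current][:, None]
    delta[np.arange(L), current] = -np.inf  # forbid no-op substitutions
    # Sample one substitution with probability proportional to exp(delta / T).
    logits = (delta / temperature).ravel()
    logits -= logits.max()
    probs = np.exp(logits)
    probs /= probs.sum()
    idx = rng.choice(L * V, p=probs)
    i, v = divmod(idx, V)
    child = x_onehot.copy()
    child[i] = 0.0
    child[i, v] = 1.0
    return child
```

A full emitter would propose batches of such mutations and hand the offspring to the usual MAP-Elites addition step; the single-substitution proposal above only shows how gradients can bias which discrete edit is tried.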
PASTA: Pretrained Action-State Transformer Agents
Self-supervised learning has brought about a revolutionary paradigm shift in
various computing domains, including NLP, vision, and biology. Recent
approaches involve pre-training transformer models on vast amounts of unlabeled
data, serving as a starting point for efficiently solving downstream tasks. In
the realm of reinforcement learning, researchers have recently adapted these
approaches by developing models pre-trained on expert trajectories, enabling
them to address a wide range of tasks, from robotics to recommendation systems.
However, existing methods mostly rely on intricate pre-training objectives
tailored to specific downstream applications. This paper presents a
comprehensive investigation of models we refer to as Pretrained Action-State
Transformer Agents (PASTA). Our study uses a unified methodology and covers an
extensive set of general downstream tasks including behavioral cloning, offline
RL, sensor failure robustness, and dynamics change adaptation. Our goal is to
systematically compare various design choices and provide valuable insights to
practitioners for building robust models. Key highlights of our study include
tokenization at the action and state component level, using fundamental
pre-training objectives like next token prediction, training models across
diverse domains simultaneously, and using parameter efficient fine-tuning
(PEFT). The developed models in our study contain fewer than 10 million
parameters and the application of PEFT enables fine-tuning of fewer than 10,000
parameters during downstream adaptation, allowing a broad community to use
these models and reproduce our experiments. We hope that this study will
encourage further research into the use of transformers with first-principles
design choices to represent RL trajectories and contribute to robust policy
learning.
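The component-level tokenization highlighted above can be sketched as follows: each scalar component of a state or action is binned into its own discrete token, with disjoint vocabularies for states and actions, yielding a flat sequence suitable for next-token prediction. The binning scheme, vocabulary offsets, and value range are illustrative assumptions, not the paper's exact recipe.

```python
import numpy as np

def tokenize_trajectory(states, actions, n_bins=4, low=-1.0, high=1.0):
    """Tokenize each state/action *component* into its own discrete token.

    states:  (T, ds) array; actions: (T, da) array; values assumed in [low, high].
    Returns one flat token sequence per trajectory: at each timestep, all
    state-component tokens followed by all action-component tokens.
    """
    def bin_values(x):
        x = np.clip(x, low, high)
        # Map [low, high] uniformly onto integer bins 0 .. n_bins-1.
        return np.minimum(((x - low) / (high - low) * n_bins).astype(int),
                          n_bins - 1)

    s_tok = bin_values(states)            # state tokens in [0, n_bins)
    a_tok = bin_values(actions) + n_bins  # action tokens offset into [n_bins, 2*n_bins)
    return np.concatenate([np.concatenate([s, a])
                           for s, a in zip(s_tok, a_tok)])
```

A transformer trained with a plain next-token objective on such sequences sees every state and action component as a first-class token, which is the granularity the study contrasts with coarser per-state or per-transition tokenizations.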
The Quality-Diversity Transformer: Generating Behavior-Conditioned Trajectories with Decision Transformers
In the context of neuroevolution, Quality-Diversity algorithms have proven
effective in generating repertoires of diverse and efficient policies by
relying on the definition of a behavior space. A natural goal induced by the
creation of such a repertoire is trying to achieve behaviors on demand, which
can be done by running the corresponding policy from the repertoire. However,
in uncertain environments, two problems arise. First, policies can lack
robustness and repeatability, meaning that multiple episodes under slightly
different conditions often result in very different behaviors. Second, due to
the discrete nature of the repertoire, solutions vary discontinuously. Here we
present a new approach to achieve behavior-conditioned trajectory generation
based on two mechanisms: First, MAP-Elites Low-Spread (ME-LS), which constrains
the selection of solutions to those that are the most consistent in the
behavior space. Second, the Quality-Diversity Transformer (QDT), a
Transformer-based model conditioned on continuous behavior descriptors, which
trains on a dataset generated by policies from a ME-LS repertoire and learns to
autoregressively generate sequences of actions that achieve target behaviors.
Results show that ME-LS produces consistent and robust policies, and that its
combination with the QDT yields a single policy capable of achieving diverse
behaviors on demand with high accuracy.
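The low-spread selection idea behind ME-LS can be illustrated with a toy cell-update rule: a policy's behavior descriptors are collected over several episodes, their spread around the centroid is measured, and a candidate replaces the incumbent only if it improves a combined score. The fitness-minus-spread score below is a hypothetical criterion for illustration, not the paper's exact mechanism.

```python
import numpy as np

def behavior_spread(descriptors):
    """Mean Euclidean distance of per-episode descriptors to their centroid."""
    centroid = descriptors.mean(axis=0)
    return float(np.linalg.norm(descriptors - centroid, axis=1).mean())

def update_cell(cell, fitness, descriptors, spread_weight=1.0):
    """Replace the cell's incumbent if the candidate's score is higher.

    cell: None or a dict with 'fitness' and 'spread' for the incumbent.
    descriptors: (n_episodes, d) behavior descriptors of the candidate,
    one row per evaluation episode of the same policy.
    """
    spread = behavior_spread(descriptors)
    score = fitness - spread_weight * spread
    if cell is None or score > cell["fitness"] - spread_weight * cell["spread"]:
        return {"fitness": fitness, "spread": spread}
    return cell
```

Penalizing spread makes the repertoire favor policies whose behavior is repeatable across episodes, which is exactly the property the QDT needs in its training data.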
A competitive integration model of exogenous and endogenous eye movements
We present a model of the eye movement system in which the programming of an eye movement is the result of the competitive integration of information in the superior colliculi (SC). This brain area receives input from occipital cortex, the frontal eye fields, and the dorsolateral prefrontal cortex, on the basis of which it computes the location of the next saccadic target. Two critical assumptions in the model are that cortical inputs are not only excitatory, but can also inhibit saccades to specific locations, and that the SC continue to influence the trajectory of a saccade while it is being executed. With these assumptions, we account for many neurophysiological and behavioral findings from eye movement research. Interactions within the saccade map are shown to account for effects of distractors on saccadic reaction time (SRT) and saccade trajectory, including the global effect and oculomotor capture. In addition, the model accounts for express saccades, the gap effect, saccadic reaction times for antisaccades, and recorded responses from neurons in the SC and frontal eye fields in these tasks. © The Author(s) 2010
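The competitive-integration dynamics described above can be caricatured as a leaky competing accumulator over a one-dimensional saccade map: each location integrates its net cortical input (excitation minus inhibition), leaks, and is suppressed by activity elsewhere on the map, and the first location to cross threshold becomes the saccade target. All parameters and the noise model below are illustrative assumptions, not fitted values from the paper.

```python
import numpy as np

def simulate_saccade_map(excitation, inhibition, steps=200, dt=1.0,
                         leak=0.05, lateral=0.02, threshold=1.0, rng=None):
    """Leaky competitive accumulation over a 1-D saccade map (sketch).

    Returns (winning map index, step count); the step count serves as a
    stand-in for saccadic reaction time.
    """
    rng = rng or np.random.default_rng()
    n = len(excitation)
    a = np.zeros(n)
    for t in range(steps):
        # Each location is inhibited by the total activity elsewhere on the map.
        inhib_from_others = lateral * (a.sum() - a)
        da = excitation - inhibition - leak * a - inhib_from_others
        a = np.maximum(a + dt * da + 0.01 * rng.normal(size=n), 0.0)
        if a.max() >= threshold:
            return int(a.argmax()), t
    return int(a.argmax()), steps
```

In this caricature, a distractor corresponds to a second excitatory bump that slows the winner's rise (longer reaction time) or pulls the winning location toward an intermediate site (a global-effect-like average).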
Development and analysis of a meshless method for first-order hyperbolic systems
Role of group I muscle afferents in the control of posture and movement in humans
Recovery of Differentiation/Integration Compatibility of Meshless Operators via Local Adaptation of the Point Cloud in the Context of Nodal Integration
CAD-Free Soft Handle Parameterization Tool for Adjoint-Based Optimization Methods
The Soft Handle CAD-free parameterization tool aims to keep a rich design space while enforcing smoothness on the resulting shape. Handles/parameters are selected appropriately, and shape changes are driven gracefully by the movement of the handles while smoothness of the shape is enforced.