DRLViz: Understanding Decisions and Memory in Deep Reinforcement Learning
We present DRLViz, a visual analytics interface for interpreting the internal
memory of an agent (e.g. a robot) trained using deep reinforcement learning.
This memory is composed of large temporal vectors updated as the agent moves
through an environment, and it is not trivial to understand due to its
dimensionality, dependencies on past vectors, spatial/temporal correlations, and
co-correlations between dimensions. It is often referred to as a black box,
since only the inputs (images) and outputs (actions) are intelligible to humans.
DRLViz assists experts in interpreting decisions through memory-reduction
interactions, and in investigating the role of parts of the memory when errors
occur (e.g. a wrong direction). We report on DRLViz applied in the context of
video game simulators (ViZDoom) for a navigation scenario with item-gathering
tasks. We also report on expert evaluations of DRLViz, its applicability to
other scenarios and navigation problems beyond simulation games, and its
contribution to black-box model interpretability and explainability in the
field of visual analytics.
How Transferable are Reasoning Patterns in VQA?
Since its inception, Visual Question Answering (VQA) has been notorious as a
task where models tend to exploit biases in datasets to find shortcuts instead
of performing high-level reasoning. Classical methods address this by removing
biases from training data, or by adding branches to models to detect and remove
biases. In this paper, we argue that uncertainty in vision is a dominant factor
preventing the successful learning of reasoning in vision-and-language
problems. We train a visual oracle and, in a large-scale study, provide
experimental evidence that it is much less prone to exploiting spurious dataset
biases than standard models. We propose to study the attention mechanisms at
work in the visual oracle and compare them with those of a SOTA
Transformer-based model. We provide an in-depth analysis and visualizations of
reasoning patterns obtained with an online visualization tool, which we make
publicly available (https://reasoningpatterns.github.io). We exploit these
insights by transferring reasoning patterns from the oracle to a SOTA
Transformer-based VQA model that takes standard noisy visual inputs, via
fine-tuning. In experiments we report higher overall accuracy, as well as
higher accuracy on infrequent answers for each question type, which provides
evidence of improved generalization and a decreased dependency on dataset
biases.
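The attention mechanisms compared in this study are row-stochastic maps produced by transformer layers. As a minimal, hypothetical sketch (not the paper's model), a single-head self-attention map over a handful of tokens looks like this:

```python
# Hypothetical sketch: a single-head self-attention map,
# softmax(QK^T / sqrt(d)) -- the kind of quantity inspected when
# comparing an oracle's attention to a standard model's.
import numpy as np

def attention_map(q, k):
    """Row-stochastic attention matrix: each row sums to 1."""
    d = q.shape[-1]
    scores = q @ k.T / np.sqrt(d)
    scores -= scores.max(axis=-1, keepdims=True)   # numerical stability
    w = np.exp(scores)
    return w / w.sum(axis=-1, keepdims=True)

rng = np.random.default_rng(1)
tokens = rng.normal(size=(6, 32))   # 6 tokens, 32-dim embeddings (assumed)
A = attention_map(tokens, tokens)
print(A.shape)                      # (6, 6)
```

Each row of `A` is a distribution over tokens; visualizing such rows is what makes reasoning patterns observable.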
RLMViz: Interpreting the Memory of Deep Reinforcement Learning
We present RLMViz, a visual analytics interface for interpreting the internal memory of an agent (e.g., a robot) trained using deep reinforcement learning. This memory is composed of large temporal vectors updated before each action as the agent moves in an environment. It is not trivial to understand and is referred to as a black box, of which only the inputs (images) and outputs (actions) are understood, but not the inner workings. Using RLMViz, experts can form hypotheses on this memory, derive rules based on the agent's decisions to interpret them, gain an understanding of why errors have been made, and improve future training processes. We report on the main features of RLMViz, namely memory navigation and contextualization techniques using juxtaposed timelines. We also present our early findings using the ViZDoom simulator, a standard benchmark for DRL navigation scenarios.
What if we Reduce the Memory of an Artificial Doom Player?
SIM2REALVIZ: Visualizing the Sim2Real Gap for Robot Pose Estimation
The robotics community has started to rely heavily on increasingly realistic 3D simulators for large-scale training of robots on massive amounts of data. But once robots are deployed in the real world, the simulation gap, as well as changes in the real world (e.g. lighting, object displacements), leads to errors. In this paper, we introduce SIM2REALVIZ, a visual analytics tool to assist experts in understanding and reducing this gap for robot ego-pose estimation tasks, i.e. the estimation of a robot's position using trained models. SIM2REALVIZ displays details of a given model and the performance of its instances in both simulation and the real world. Experts can identify environment differences that impact model predictions at a given location, and explore hypotheses through direct interaction with the model to fix them. We detail the design of the tool, along with case studies on how the regression-to-the-mean bias is exploited and can be addressed, and on how models are perturbed by vanishing landmarks such as bikes.
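One way to picture the gap the tool surfaces: compare a model's position error on simulated versus real inputs at matching locations. The sketch below is a hypothetical illustration with synthetic numbers, not SIM2REALVIZ's actual computation:

```python
# Hypothetical sketch: quantifying a sim-to-real gap for ego-pose
# estimation as the per-location difference in position error
# between simulated and real inputs. All data here is synthetic.
import numpy as np

rng = np.random.default_rng(3)
gt = rng.uniform(0, 10, size=(50, 2))               # ground-truth (x, y) poses
pred_sim = gt + rng.normal(0, 0.1, size=gt.shape)   # small error in simulation
pred_real = gt + rng.normal(0, 0.5, size=gt.shape)  # larger error on real inputs

err_sim = np.linalg.norm(pred_sim - gt, axis=1)
err_real = np.linalg.norm(pred_real - gt, axis=1)
gap = err_real - err_sim    # positive where real-world inputs hurt the model

print(gap.shape)            # (50,)
```

Mapping `gap` back onto the environment is the kind of view that lets experts spot locations where real-world changes degrade predictions.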
SwimTrack: Swimmers and Stroke Rate Detection in Elite Race Videos
We present SwimTrack, a series of 5 multimedia tasks related to swimming-video analysis from live recordings of elite competitions. The tasks involve video, image, and audio analysis and may be tackled independently; solved together, they form a grand challenge: providing sport federations and coaches with novel methods to assess and enhance swimmers' performance, in particular through stroke-rate and stroke-length analysis. We share a unique collection of video footage covering all swimming race types, recorded from a spectator's point of view with variations such as lighting reflections, background clutter, noise from the motion of waves, and different points of view on swimmers. SwimTrack is the first challenge of this kind, covering a total of 4 elite swimming competitions. We plan to include a larger and even more diverse set of videos, as well as additional mini-challenges, once more recordings become available in a future version.
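Stroke rate is the periodicity of a repetitive motion, so one plausible building block (a hypothetical sketch, not a SwimTrack baseline) is to take a tracked 1D signal, such as a wrist's vertical position over frames, and read the dominant frequency off its spectrum:

```python
# Hypothetical sketch: estimating a stroke rate from a periodic 1D
# signal (e.g. a tracked wrist's vertical position) via FFT.
# The signal, frame rate, and rate below are synthetic stand-ins.
import numpy as np

fps = 25.0                      # assumed video frame rate
t = np.arange(0, 20, 1 / fps)   # 20 s of samples
true_rate_hz = 0.8              # 48 strokes per minute, for the demo
rng = np.random.default_rng(2)
signal = np.sin(2 * np.pi * true_rate_hz * t) + 0.3 * rng.normal(size=t.size)

spectrum = np.abs(np.fft.rfft(signal - signal.mean()))
freqs = np.fft.rfftfreq(signal.size, d=1 / fps)
est_hz = freqs[np.argmax(spectrum)]   # dominant frequency = stroke rate
print(est_hz * 60)                    # strokes per minute, ~48
```

Real race footage would of course add the detection and tracking problems the tasks are about; this only shows the final periodicity step.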
VisQA: X-raying Vision and Language Reasoning in Transformers
Visual Question Answering systems target answering open-ended textual questions given input images. They are a testbed for learning high-level reasoning, with a primary use in HCI, for instance assistance for the visually impaired. Recent research has shown that state-of-the-art models tend to produce answers that exploit biases and shortcuts in the training data, sometimes without even looking at the input image, instead of performing the required reasoning steps. We present VisQA, a visual analytics tool that explores this question of reasoning vs. bias exploitation. It exposes the key element of state-of-the-art neural models: attention maps in transformers. Our working hypothesis is that the reasoning steps leading to model predictions are observable from attention distributions, which are particularly useful for visualization. The design process of VisQA was motivated by well-known bias examples from the fields of deep learning and vision-language reasoning, and the tool was evaluated in two ways. First, as the result of a collaboration between three fields (machine learning, vision-and-language reasoning, and data analytics), the work led to a better understanding of bias exploitation by neural models for VQA, which eventually had an impact on their design and training through the proposition of a method for transferring reasoning patterns from an oracle model. Second, we report on the design of VisQA and a goal-oriented evaluation targeting the analysis of a model's decision process by multiple experts, providing evidence that it makes the inner workings of models accessible to users.