23 research outputs found
Bayesian Optimization for Developmental Robotics with Meta-Learning by Parameters Bounds Reduction
In robotics, methods and software usually require hyperparameter optimization
to be efficient for specific tasks, for instance industrial bin-picking from
homogeneous heaps of different objects. We present a developmental framework
based on long-term memory and reasoning modules (Bayesian Optimisation, visual
similarity and parameter bounds reduction) that allows a robot to use a
meta-learning mechanism to increase the efficiency of such continuous and
constrained parameter optimizations. A new optimization, viewed as learning
for the robot, can take advantage of past experiences (stored in the episodic
and procedural memories) to shrink the search space by using reduced parameter
bounds computed from the best optimizations the robot has performed on tasks
similar to the new one (e.g. bin-picking from a homogeneous heap of a similar
object, based on the visual similarity of objects stored in the semantic
memory). As an example, we applied the system to the constrained optimization
of 9 continuous hyperparameters for a professional software package (Kamido)
in industrial robotic arm bin-picking tasks, a step that is needed each time a
new object must be handled correctly. We used a simulator to create
bin-picking tasks for 8 different objects (7 in simulation and one with a real
setup, without and with meta-learning using experiences from other similar
objects), achieving good results despite a very small optimization budget,
with better performance when meta-learning is used (84.3% vs 78.9% success
overall, with a small budget of 30 iterations for each optimization) for every
object tested (p-value=0.036).
Comment: Accepted at the IEEE International Conference on Development and
Learning and Epigenetic Robotics 2020 (ICDL-Epirob 2020)
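The bounds-reduction idea in the abstract above can be illustrated with a minimal sketch. The abstract does not say how the reduced bounds are computed, so the rule used here (span of the best past parameter values, widened by a margin and clipped to the full bounds) is an assumption, and the names `reduced_bounds` and `margin` are hypothetical.

```python
from typing import List, Sequence, Tuple

Bounds = List[Tuple[float, float]]


def reduced_bounds(best_params: Sequence[Sequence[float]],
                   full_bounds: Bounds,
                   margin: float = 0.1) -> Bounds:
    """Shrink each parameter range to the span of the best past values.

    Assumed rule (not taken from the paper): take the min and max of each
    parameter over the best past optimizations on similar tasks, widen the
    span by `margin` times the full range, and clip to the full bounds.
    """
    if not best_params:
        # No similar past experience: keep the full search space.
        return list(full_bounds)
    reduced = []
    for i, (lo, hi) in enumerate(full_bounds):
        values = [params[i] for params in best_params]
        pad = margin * (hi - lo)
        reduced.append((max(lo, min(values) - pad),
                        min(hi, max(values) + pad)))
    return reduced


# Hypothetical example: two parameters, three past best configurations.
full = [(0.0, 1.0), (10.0, 50.0)]
past_best = [[0.40, 20.0], [0.55, 24.0], [0.50, 30.0]]
print(reduced_bounds(past_best, full))
```

The reduced bounds would then be passed to the optimizer in place of the full ones, so that a small iteration budget is spent in a smaller search space.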
The MediaEval 2016 Emotional Impact of Movies Task
Volume: 1739. Host publication title: MediaEval 2016 Multimedia Benchmark Workshop. Host publication sub-title: Working Notes Proceedings of the MediaEval 2016 Workshop. Non peer reviewed
Classification of Emotional Speech Based on an Automatically Elaborated Hierarchical Classifier
Current machine-based techniques for vocal emotion recognition only consider a finite number of clearly labeled emotional classes, whereas the kinds of emotional classes and their number are typically application dependent. Previous studies have shown that, because of the ambiguous nature of affect classes, multistage classification schemes help to improve emotion classification accuracy. However, these multistage classification schemes were manually elaborated by taking into account the underlying emotional classes to be discriminated. In this paper, we propose an automatically elaborated hierarchical classification scheme (ACS), which is driven by an evidence theory-based embedded feature-selection scheme (ESFS), for the purpose of application-dependent emotion recognition. Evaluated on the Berlin dataset with 68 features and six emotion states, this automatically elaborated hierarchical classifier (ACS) showed its effectiveness, displaying a 71.38% classification accuracy rate compared to the 71.52% classification rate achieved by our previous dimensional model-driven but still manually elaborated multistage classifier (DEC). On the DES dataset with five emotion states, our ACS achieved a 76.74% recognition rate compared to the 81.22% accuracy rate displayed by a manually elaborated multistage classification scheme (DEC).
Analyse de signaux sonores par les lois de Zipf et Zipf Inverse
In this article we present a set of codings for sound signals that we developed in order to adapt analysis by the Zipf and inverse Zipf laws to this type of signal. The effectiveness of these laws in describing physical phenomena is well established, and it motivated our investigation of the problem of characterizing sound signals. To validate our approach, the method was evaluated on medical sound signals corresponding to xiphoid sounds.
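As an illustration of the kind of analysis described above, here is a minimal sketch of Zipf and inverse Zipf statistics on a coded signal. The abstract does not detail the codings, so the up/down/flat coding of successive samples used here is an assumption, and all function names are hypothetical.

```python
from collections import Counter


def code_signal(samples):
    """Hypothetical coding: 'U' (up), 'D' (down) or 'F' (flat) per sample step."""
    return ["U" if b > a else "D" if b < a else "F"
            for a, b in zip(samples, samples[1:])]


def pattern_counts(symbols, n):
    """Count every pattern of n consecutive symbols."""
    return Counter(tuple(symbols[i:i + n])
                   for i in range(len(symbols) - n + 1))


def zipf_rank_frequency(symbols, n=2):
    """Zipf view: (rank, frequency) pairs, most frequent pattern first."""
    freqs = sorted(pattern_counts(symbols, n).values(), reverse=True)
    return list(enumerate(freqs, start=1))


def inverse_zipf(symbols, n=2):
    """Inverse Zipf view: (frequency, number of distinct patterns with it)."""
    by_freq = Counter(pattern_counts(symbols, n).values())
    return sorted(by_freq.items())


signal = [0, 1, 2, 1, 0, 1, 2, 1, 0]
coded = code_signal(signal)
print(zipf_rank_frequency(coded))
print(inverse_zipf(coded))
```

On real signals, the two curves would typically be plotted on log-log axes and summarized by features such as their slope.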
IRIM at TRECVID 2013: Semantic Indexing and Instance Search
The IRIM group is a consortium of French teams working on Multimedia Indexing and Retrieval. This paper describes its participation in the TRECVID 2013 semantic indexing and instance search tasks. For the semantic indexing task, our approach uses a six-stage processing pipeline to compute scores for the likelihood that a video shot contains a target concept. These scores are then used to produce a ranked list of the images or shots that are most likely to contain the target concept. The pipeline is composed of the following steps: descriptor extraction, descriptor optimization, classification, fusion of descriptor variants, higher-level fusion, and re-ranking. We evaluated a number of different descriptors and tried different fusion strategies. The best IRIM run has a Mean Inferred Average Precision of 0.2796, which ranked us 4th out of 26 participants.
IRIM at TRECVID 2012: Semantic Indexing and Instance Search
The IRIM group is a consortium of French teams working on Multimedia Indexing and Retrieval. This paper describes its participation in the TRECVID 2012 semantic indexing and instance search tasks. For the semantic indexing task, our approach uses a six-stage processing pipeline to compute scores for the likelihood that a video shot contains a target concept. These scores are then used to produce a ranked list of the images or shots that are most likely to contain the target concept. The pipeline is composed of the following steps: descriptor extraction, descriptor optimization, classification, fusion of descriptor variants, higher-level fusion, and re-ranking. We evaluated a number of different descriptors and tried different fusion strategies. The best IRIM run has a Mean Inferred Average Precision of 0.2378, which ranked us 4th out of 16 participants. For the instance search task, our approach uses two steps. First, the participants' individual methods are used to compute the similarity between an example image of an instance and the keyframes of a video clip. Then a two-step fusion method is used to combine these individual results and obtain a score for the likelihood that an instance appears in a video clip. These scores are used to obtain a ranked list of the clips most likely to contain the queried instance. The best IRIM run has a MAP of 0.1192, which ranked us 29th out of 79 fully automatic runs.
Analyse de signaux vidéos et sonores (application à l'étude de signaux médicaux)
This work deals with the study of multimedia sequences made up of images and sounds, whose correlations are studied in order to help understand the origin of the sounds. The analysis of image sequences consists in tracking moving objects so that their properties can be studied. One generic method, based on a combination of region and contour tracking, and one method adapted to homogeneous objects, based on level set theory, are proposed. The analysis of audio data consists in developing an identification system based on the study of the structure of signals through suitable codings and their modeling with Zipf laws. These methods have been evaluated on medical (acoustic-radiological) sequences within the framework of the study of gastro-oesophageal reflux pathology, in collaboration with the Acoustique et Motricité Digestive research team of the University of Tours.
ESFS: A new embedded feature selection method based on SFS
Feature subset selection is an important subject when training classifiers in Machine Learning (ML) problems. Too many input features in an ML problem may lead to the so-called "curse of dimensionality", which describes the fact that the complexity of adjusting the classifier parameters during training increases exponentially with the number of features. Thus, ML algorithms are known to suffer from a significant decrease in prediction accuracy when faced with many unnecessary features. In this paper, we introduce a novel embedded feature selection method, called ESFS, which is inspired by the wrapper method SFS in that it relies on the simple principle of incrementally adding the most relevant features. Its originality lies in the use of mass functions from evidence theory, which allow the information carried by the features to be merged elegantly, in an embedded way, leading to a lower computational cost than the original SFS. This approach has been successfully applied to the emerging domain of emotion classification in audio signals.
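The incremental principle that ESFS borrows from SFS can be sketched as follows. This is plain wrapper-style sequential forward selection with a caller-supplied scoring function, not ESFS itself: the evidence-theory mass functions are not reproduced, and the function names and toy score are hypothetical.

```python
def sequential_forward_selection(features, score, max_features=None):
    """Greedy SFS: repeatedly add the feature that most improves the score.

    `score` takes a list of feature names and returns a number (higher is
    better); in a real wrapper it would train and evaluate a classifier.
    """
    selected = []
    remaining = list(features)
    best_score = float("-inf")
    limit = len(remaining) if max_features is None else max_features
    while remaining and len(selected) < limit:
        # Evaluate each candidate feature added to the current subset.
        candidate, candidate_score = max(
            ((f, score(selected + [f])) for f in remaining),
            key=lambda pair: pair[1],
        )
        if candidate_score <= best_score:
            break  # No remaining feature improves the score: stop.
        selected.append(candidate)
        remaining.remove(candidate)
        best_score = candidate_score
    return selected, best_score


# Toy score (an assumption for illustration): sum of per-feature usefulness.
usefulness = {"pitch": 0.5, "energy": 0.3, "noise": -0.1}


def toy_score(subset):
    return sum(usefulness[f] for f in subset)


print(sequential_forward_selection(list(usefulness), toy_score))
```

With the toy score, the loop adds "pitch", then "energy", and stops because adding "noise" would lower the score.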
Discriminative Transfer Learning Using Similarities and Dissimilarities