
    Generation of Tactile Data from 3D Vision and Target Robotic Grasps

    Tactile perception is a rich source of information for robotic grasping: it allows a robot to identify a grasped object and assess the stability of a grasp, among other things. However, the tactile sensor must come into contact with the target object in order to produce readings, so tactile data can only be obtained once real contact is made. We propose to overcome this restriction with a method that models the behaviour of a tactile sensor using 3D vision and grasp information as the stimulus. Our system regresses the quantified tactile response that would be experienced if the grasp were performed on the object. We experiment with 16 items and 4 tactile data modalities to show that our proposal learns this task with low error. This work was supported in part by the Spanish Government and the FEDER Funds (BES-2016-078290, PRX19/00289, RTI2018-094279-B-100) and in part by the European Commission (COMMANDIA SOE2/P1/F0638), an action supported by Interreg-V Sudoe.
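    The abstract does not specify the architecture, so the following is only a minimal sketch of the general idea: regress a tactile reading from an object point cloud plus grasp parameters. The PointNet-style encoder, the grasp encoding as position plus quaternion, and the tactile output dimensionality are all assumptions chosen for illustration, not the paper's model.

# Illustrative sketch only: encoder choices, grasp encoding, and output size are assumptions.
import torch
import torch.nn as nn


class TactileFromVisionAndGrasp(nn.Module):
    """Regress a quantified tactile response from an object point cloud and a grasp."""

    def __init__(self, grasp_dim: int = 7, tactile_dim: int = 24):
        super().__init__()
        # Per-point features followed by max pooling (PointNet-style, assumed).
        self.point_encoder = nn.Sequential(
            nn.Linear(3, 64), nn.ReLU(),
            nn.Linear(64, 128), nn.ReLU(),
        )
        # Grasp parameters, e.g. gripper pose as position + quaternion (assumed).
        self.grasp_encoder = nn.Sequential(nn.Linear(grasp_dim, 32), nn.ReLU())
        # Joint head regressing the tactile reading (e.g. per-taxel pressure values).
        self.head = nn.Sequential(
            nn.Linear(128 + 32, 128), nn.ReLU(),
            nn.Linear(128, tactile_dim),
        )

    def forward(self, points: torch.Tensor, grasp: torch.Tensor) -> torch.Tensor:
        # points: (batch, num_points, 3), grasp: (batch, grasp_dim)
        per_point = self.point_encoder(points)        # (batch, num_points, 128)
        global_feat = per_point.max(dim=1).values     # (batch, 128)
        grasp_feat = self.grasp_encoder(grasp)        # (batch, 32)
        return self.head(torch.cat([global_feat, grasp_feat], dim=-1))


# Training would minimise a regression loss against real tactile readings, e.g.:
model = TactileFromVisionAndGrasp()
points = torch.randn(8, 1024, 3)   # dummy object point clouds
grasp = torch.randn(8, 7)          # dummy grasp poses
pred = model(points, grasp)
loss = nn.functional.mse_loss(pred, torch.randn_like(pred))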

    On discovering and learning structure under limited supervision

    Shapes, surfaces, events, and objects (living and non-living) constitute the world. The intelligence of natural agents, such as humans, goes beyond pattern recognition: we excel at building representations and distilling knowledge to understand and infer the structure of the world. Critically, the development of such reasoning capabilities can occur even with limited supervision. On the other hand, despite its phenomenal development, the major successes of machine learning, in particular deep learning models, are primarily in tasks that have access to large annotated datasets. In this dissertation, we propose novel solutions to help address this gap by enabling machine learning models to learn structure and reason effectively in weakly supervised settings. The recurring theme of the thesis revolves around the question "How can a perceptual system learn to organize sensory information into useful knowledge under limited supervision?" and covers the themes of geometry, composition, and association in four separate articles, with applications to computer vision (CV) and reinforcement learning (RL). Our first contribution, Pix2Shape, presents an analysis-by-synthesis (also referred to as inverse graphics) approach to perception. Pix2Shape leverages probabilistic generative models to learn 3D-aware representations from single 2D images. The resulting formalism allows us to perform novel view synthesis of a scene and produce powerful representations of images. We achieve this by augmenting unsupervised learning with physically based inductive biases to decompose a scene structure into geometry, pose, reflectance, and lighting. Our second contribution, MILe, addresses the ambiguity issues in single-labeled datasets such as ImageNet: it is often inappropriate to describe an image with a single label when it contains more than one prominent object. We show that integrating ideas from the cognitive linguistics literature and imposing appropriate inductive biases helps distill multiple possible descriptions from such weakly labeled datasets. Next, moving to the RL setting, we consider an agent interacting with its environment without a reward signal. Our third contribution, HaC, is a curiosity-based unsupervised approach to learning associations between visual and tactile modalities. This helps the agent explore the environment in a self-guided fashion and use the acquired knowledge to adapt to downstream tasks. When dense reward supervision is unavailable or hard to design, such intrinsic motivation is useful for generating meaningful behavior in a self-supervised manner. In our final contribution, we address the representation learning bottleneck in unsupervised RL agents, which has a detrimental effect on performance with high-dimensional pixel-based inputs. Our model-based approach combines reward-free exploration and planning to efficiently fine-tune unsupervised pre-trained models, achieving results comparable to task-specific baselines. This is a step towards building agents that can generalize quickly to more than a single task using image inputs alone.
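    To make the curiosity idea behind the HaC contribution concrete, here is a minimal sketch of an intrinsic reward based on cross-modal prediction error: a forward model tries to predict the tactile embedding from the visual embedding, and the prediction error rewards exploration of poorly associated visuo-tactile pairs. The encoders, tensor shapes, and reward definition are illustrative assumptions, not the thesis code.

# Illustrative sketch of a curiosity-style cross-modal intrinsic reward; all shapes assumed.
import torch
import torch.nn as nn


class CrossModalCuriosity(nn.Module):
    def __init__(self, feat_dim: int = 64):
        super().__init__()
        # Simple encoders for flattened visual and tactile observations (assumed shapes).
        self.visual_encoder = nn.Sequential(nn.Linear(3 * 64 * 64, 256), nn.ReLU(),
                                            nn.Linear(256, feat_dim))
        self.tactile_encoder = nn.Sequential(nn.Linear(32, 64), nn.ReLU(),
                                             nn.Linear(64, feat_dim))
        # Forward model: predict the tactile embedding from the visual embedding.
        self.predictor = nn.Sequential(nn.Linear(feat_dim, 64), nn.ReLU(),
                                       nn.Linear(64, feat_dim))

    def intrinsic_reward(self, image: torch.Tensor, touch: torch.Tensor) -> torch.Tensor:
        v = self.visual_encoder(image.flatten(start_dim=1))
        t = self.tactile_encoder(touch)
        # Prediction error between modalities drives exploration: poorly associated
        # visuo-tactile pairs are "surprising" and therefore rewarded.
        return (self.predictor(v) - t.detach()).pow(2).mean(dim=-1)


curiosity = CrossModalCuriosity()
reward = curiosity.intrinsic_reward(torch.randn(4, 3, 64, 64), torch.randn(4, 32))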

    Proceedings of the 11th International Conference on Kinanthropology

    The 11th International Conference on Kinanthropology was held from Nov 29 to Dec 1, 2017 in Brno and was organized by the Faculty of Sports Studies, Masaryk University, and the Faculty of Kinesiology, University of Zagreb. This year's conference was divided into several themes: sports medicine, sport and social science, sport training, healthy lifestyle and healthy ageing, sports management, and analysis of human movement. The conference also included the Atletika and Ortoreha symposium, which gathered specialists in physiotherapy.

    Stability of recording and neural tuning during intracortical brain-computer interface arm control

    For intracortical brain-computer interface (BCI) controlled neuroprosthetic arms to become a valuable assistive technology for people with upper-limb paralysis, they will need to adjust to a number of changes in neural activity that have not previously been well characterized. These include recording instabilities and changes in neural tuning when interacting with objects. Here I present characterizations of these problems in two human subjects, along with several solutions. I quantified the rate at which recorded units become unstable within and between days to inform the design of self-recalibrating decoders. These decoders will provide BCI users with consistent performance, even as units become unstable, by incorporating new units before too many of the original units have been lost. Using this quantification of stability, I also examined whether unit characteristics could predict how long a unit would remain stable. I found that units with high firing rates, large peak-to-peak voltages, and more accurate tuning were most likely to remain stable. Using this result, future work should be able to create decoders that preferentially rely on stable units in order to enable high-performance BCI control for longer. I also addressed difficulties the first subject had when using the hand to interact with objects. I identified the source of the problem as an increased firing rate across much of the population when the hand approached objects, and developed a method to remove this increase before decoding so that the arm kinematics remained predictable as the hand approached objects. Finally, I studied the representation of desired grasp force in primary motor cortex to enable BCI users to grasp a variety of objects, from light, fragile objects to heavy, sturdy ones. I found that primary motor cortex represents grasp force in a predictable manner during grasping, but that tuning to grasp force is not apparent while the user carries an object. These results will enable the creation of BCI decoders that can apply the appropriate amount of force when grasping and avoid dropping objects while transporting them.
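    The abstract describes removing a population-wide firing-rate increase before decoding so that object proximity does not distort the decoded kinematics. The sketch below illustrates that general idea only: the baseline estimate, the shared-offset subtraction, and the linear velocity decoder are hypothetical simplifications, not the dissertation's actual method.

# Minimal sketch: subtract a shared firing-rate offset before a linear kinematic decode.
import numpy as np

def remove_shared_offset(firing_rates: np.ndarray, baseline: np.ndarray) -> np.ndarray:
    """Subtract the across-unit mean excess over baseline from every unit.

    firing_rates: (num_units,) rates in the current bin
    baseline:     (num_units,) mean rates estimated during free reaching
    """
    shared_increase = np.mean(firing_rates - baseline)
    return firing_rates - shared_increase

def decode_velocity(firing_rates: np.ndarray, weights: np.ndarray, bias: np.ndarray) -> np.ndarray:
    """Linear decoder mapping (corrected) rates to endpoint velocity."""
    return weights @ firing_rates + bias

# Example with synthetic numbers: 100 units, 3-D velocity decoder.
rng = np.random.default_rng(0)
baseline = rng.uniform(5, 20, size=100)
weights, bias = rng.normal(size=(3, 100)) * 0.01, np.zeros(3)
rates_near_object = baseline + 8.0 + rng.normal(size=100)   # population-wide jump near an object
velocity = decode_velocity(remove_shared_offset(rates_near_object, baseline), weights, bias)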