112 research outputs found

    Audio-Haptic Physically Based Simulation and Perception of Contact Structures

    We propose a multimodal architecture in which audio and haptic textures are simulated in real time using physical models. Experiments evaluating audio-haptic interaction in texture perception show that auditory cues significantly influence the haptic perception of virtual textures.

    A Parametric Sound Object Model for Sound Texture Synthesis

    This thesis deals with the analysis and synthesis of sound textures based on parametric sound objects. An overview is provided of the acoustic and perceptual principles of textural acoustic scenes, and technical challenges for analysis and synthesis are considered. Four essential processing steps for sound texture analysis are identified, and existing sound texture systems are reviewed, using the four-step model as a guideline. A theoretical framework for analysis and synthesis is proposed. A parametric sound object synthesis (PSOS) model is introduced, which is able to describe individual recorded sounds through a fixed set of parameters. The model, which applies to harmonic and noisy sounds, is an extension of spectral modeling and uses spline curves to approximate spectral envelopes, as well as the evolution of parameters over time. In contrast to standard spectral modeling techniques, this representation uses the concept of objects instead of concatenated frames, and it provides a direct mapping between sounds of different length. Methods for automatic and manual conversion are shown. An evaluation is presented in which the ability of the model to encode a wide range of different sounds has been examined. Although there are aspects of sounds that the model cannot accurately capture, such as polyphony and certain types of fast modulation, the results indicate that high quality synthesis can be achieved for many different acoustic phenomena, including instruments and animal vocalizations. In contrast to many other forms of sound encoding, the parametric model facilitates various techniques of machine learning and intelligent processing, including sound clustering and principal component analysis. Strengths and weaknesses of the proposed method are reviewed, and possibilities for future development are discussed.

    On discovering and learning structure under limited supervision

    Shapes, surfaces, events, and objects (living and non-living) constitute the world. The intelligence of natural agents, such as humans, goes beyond pattern recognition. We excel at building representations and distilling knowledge to understand and infer the structure of the world. Critically, the development of such reasoning capabilities can occur even with limited supervision. On the other hand, despite its phenomenal development, the major successes of machine learning, in particular of deep learning models, are primarily in tasks that have access to large annotated datasets. In this dissertation, we propose novel solutions to help address this gap by enabling machine learning models to learn structure and reason effectively in weakly supervised settings. The recurring theme of the thesis revolves around the question "How can a perceptual system learn to organize sensory information into useful knowledge under limited supervision?" It discusses the themes of geometry, composition, and association in four separate articles with applications to computer vision (CV) and reinforcement learning (RL). Our first contribution, Pix2Shape, presents an analysis-by-synthesis approach (also referred to as inverse graphics) to perception. Pix2Shape leverages probabilistic generative models to learn 3D-aware representations from single 2D images. The resulting formalism allows us to perform novel view synthesis of a scene and produce powerful representations of images. We achieve this by augmenting unsupervised learning with physically based inductive biases to decompose a scene's structure into geometry, pose, reflectance, and lighting. Our second contribution, MILe, addresses the ambiguity issues in single-labeled datasets such as ImageNet. It is often inappropriate to describe an image with a single label when it is composed of more than one prominent object. We show that integrating ideas from the cognitive linguistics literature and imposing appropriate inductive biases helps in distilling multiple possible descriptions from such weakly labeled datasets. Next, moving into the RL setting, we consider an agent interacting with its environment without a reward signal. Our third contribution, HaC, is a curiosity-based unsupervised approach to learning associations between visual and tactile modalities. This helps the agent explore the environment in an autonomous, self-guided fashion and further use this knowledge to adapt to downstream tasks. In the absence of reward supervision, intrinsic motivation is useful for generating meaningful behavior in a self-supervised manner. In our final contribution, we address the representation learning bottleneck in unsupervised RL agents, which has a detrimental effect on performance with high-dimensional pixel-based inputs. Our model-based approach combines reward-free exploration and planning to efficiently fine-tune unsupervised pre-trained models, achieving results comparable to task-specific baselines. This is a step towards building agents that can generalize quickly to more than a single task using image inputs alone.

    Simulation of Portable Gamma Radiation Detectors for Virtual Reality based Training applications

    This thesis focuses on the development of a simulator of a portable gamma radiation detector. The aim is to determine the feasibility of such a software tool in a virtual reality (VR) based application, with the purpose of using it in training tasks in the framework of nuclear safeguards and security activities. The work starts with the definition of the technical requirements necessary to achieve a working prototype of such an application. To meet these requirements, a series of incremental prototypes of a VR-based simulator are devised, implemented, and tested. Each prototype tries to improve on its predecessor by introducing new concepts aimed at better satisfying the requirements. The thesis is structured in several main chapters that divide the bulk of the PhD work into independent sections. First the problem is introduced, and then the current state of the art is analysed. Next, the first solutions are explained; following these introductory chapters, the main contribution of the author is found in the development chapter. This chapter explains the ideas and methods in chronological order, taking the reader through the steps in the same order the author took them. The next chapter deals with the testing of these methods to assess their validity, and a final conclusion chapter evaluates whether the objectives set at the beginning of the thesis have been met according to the results obtained.
    Moltó Caracena, T. (2016). Simulation of Portable Gamma Radiation Detectors for Virtual Reality based Training applications [Unpublished doctoral thesis]. Universitat Politècnica de València. https://doi.org/10.4995/Thesis/10251/61301

    Virtual Reality Games for Motor Rehabilitation

    This paper presents a fuzzy logic based method to track user satisfaction without the need for devices to monitor users' physiological conditions. User satisfaction is the key to any product's acceptance; computer applications and video games provide a unique opportunity to tailor the environment to each user's needs. We have implemented a non-adaptive fuzzy logic model of emotion, based on the emotional component of the Fuzzy Logic Adaptive Model of Emotion (FLAME) proposed by El-Nasr, to estimate player emotion in Unreal Tournament 2004. In this paper we describe the implementation of this system and present the results of one of several play tests. Our research contradicts the current literature that suggests physiological measurements are needed. We show that it is possible to use a software-only method to estimate user emotion.

    Advances in Human-Robot Interaction

    Rapid advances in the field of robotics have made it possible to use robots not just in industrial automation but also in entertainment, rehabilitation, and home service. Since robots will likely affect many aspects of human existence, fundamental questions of human-robot interaction must be formulated and, if at all possible, resolved. Some of these questions are addressed in this collection of papers by leading HRI researchers.