
    Self-Supervised Visuo-Tactile Pretraining to Locate and Follow Garment Features

    Humans make extensive use of vision and touch as complementary senses, with vision providing global information about the scene and touch measuring local information during manipulation without suffering from occlusions. While prior work demonstrates the efficacy of tactile sensing for precise manipulation of deformables, it typically relies on supervised, human-labeled datasets. We propose Self-Supervised Visuo-Tactile Pretraining (SSVTP), a framework for learning multi-task visuo-tactile representations in a self-supervised manner through cross-modal supervision. We design a mechanism that enables a robot to autonomously collect precisely spatially aligned visual and tactile image pairs, then train visual and tactile encoders to embed these pairs into a shared latent space using a cross-modal contrastive loss. We apply this latent space to downstream perception and control of deformable garments on flat surfaces, and evaluate the flexibility of the learned representations without fine-tuning on 5 tasks: feature classification, contact localization, anomaly detection, feature search from a visual query (e.g., garment feature localization under occlusion), and edge following along cloth edges. The pretrained representations achieve a 73-100% success rate on these 5 tasks.
    Comment: RSS 2023, site: https://sites.google.com/berkeley.edu/ssvt
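    The cross-modal contrastive objective described above can be sketched as a symmetric InfoNCE loss over a batch of spatially aligned visual/tactile embedding pairs. This is an illustrative reconstruction, not the authors' released code; the function name, the temperature value, and the NumPy formulation are assumptions.

```python
import numpy as np

def cross_modal_infonce(z_vis, z_tac, temperature=0.1):
    """Symmetric InfoNCE loss over aligned visual/tactile embeddings.
    Rows of z_vis and z_tac are assumed to correspond (matching pairs)."""
    # L2-normalize so the dot product is cosine similarity.
    z_vis = z_vis / np.linalg.norm(z_vis, axis=1, keepdims=True)
    z_tac = z_tac / np.linalg.norm(z_tac, axis=1, keepdims=True)
    logits = z_vis @ z_tac.T / temperature   # (N, N); matching pairs on diagonal
    labels = np.arange(len(z_vis))

    def xent(lg):
        # Cross-entropy with the diagonal as the target class per row.
        lg = lg - lg.max(axis=1, keepdims=True)
        logp = lg - np.log(np.exp(lg).sum(axis=1, keepdims=True))
        return -logp[labels, labels].mean()

    # Average over both retrieval directions: vision->touch and touch->vision.
    return 0.5 * (xent(logits) + xent(logits.T))
```

    Minimizing this loss pulls each visual embedding toward its matching tactile embedding and pushes it away from the other pairs in the batch, yielding the shared latent space used by the downstream tasks.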

    Engineering data compendium. Human perception and performance. User's guide

    The concept underlying the Engineering Data Compendium was the product of a research and development program (the Integrated Perceptual Information for Designers project) aimed at facilitating the application of basic research findings in human performance to the design of military crew systems. The principal objective was to develop a workable strategy for: (1) identifying and distilling information of potential value to system design from the existing research literature, and (2) presenting this technical information in a way that would aid its accessibility, interpretability, and applicability by systems designers. The present four volumes of the Engineering Data Compendium represent the first implementation of this strategy. This is the first volume, the User's Guide, containing a description of the program and instructions for its use.

    Past, Present, and Future of Simultaneous Localization And Mapping: Towards the Robust-Perception Age

    Simultaneous Localization and Mapping (SLAM) consists of the concurrent construction of a model of the environment (the map) and the estimation of the state of the robot moving within it. The SLAM community has made astonishing progress over the last 30 years, enabling large-scale real-world applications and witnessing a steady transition of this technology to industry. We survey the current state of SLAM. We start by presenting what is now the de-facto standard formulation for SLAM. We then review related work, covering a broad set of topics including robustness and scalability in long-term mapping, metric and semantic representations for mapping, theoretical performance guarantees, active SLAM and exploration, and other new frontiers. This paper simultaneously serves as a position paper and tutorial for users of SLAM. By looking at the published research with a critical eye, we delineate open challenges and new research issues that still deserve careful scientific investigation. The paper also contains the authors' take on two questions that often animate discussions during robotics conferences: Do robots need SLAM? and Is SLAM solved?
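    The de-facto standard formulation the survey refers to poses SLAM as maximum-a-posteriori estimation over a factor graph, which under Gaussian noise reduces to (nonlinear) least squares over the factors. A minimal one-dimensional linear sketch follows; the poses, measurements, and weights are invented for illustration, and real SLAM works with poses on SE(2)/SE(3) and iterative solvers.

```python
import numpy as np

def solve_pose_graph():
    """Tiny 1-D pose graph: poses x0..x3 on a line. Odometry factors say
    each step moves +1.0; a loop-closure factor says x3 - x0 = 3.3
    (slightly inconsistent). The MAP estimate under Gaussian noise is the
    linear least-squares solution, which spreads the 0.3 discrepancy
    across all factors."""
    n = 4
    rows, rhs = [], []

    def add_factor(i, j, meas, weight=1.0):
        # Relative-position factor: weight * (x_j - x_i - meas) = residual.
        r = np.zeros(n); r[j] = 1.0; r[i] = -1.0
        rows.append(weight * r); rhs.append(weight * meas)

    add_factor(0, 1, 1.0); add_factor(1, 2, 1.0); add_factor(2, 3, 1.0)
    add_factor(0, 3, 3.3)                 # loop closure
    # Gauge freedom: pin x0 near 0 with a strong prior factor.
    p = np.zeros(n); p[0] = 1.0
    rows.append(100.0 * p); rhs.append(0.0)
    A, b = np.array(rows), np.array(rhs)
    x, *_ = np.linalg.lstsq(A, b, rcond=None)
    return x
```

    The solver stretches each unit step slightly (to about 1.075) so that the chain and the loop closure disagree as little as possible, which is the essence of graph-based SLAM back-ends.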

    Haptic SLAM: An Ideal Observer Model for Bayesian Inference of Object Shape and Hand Pose from Contact Dynamics

    Dynamic tactile exploration enables humans to seamlessly estimate the shape of objects and distinguish them from one another in the complete absence of visual information. Such blind tactile exploration allows integrating information about the hand pose and contacts on the skin to form a coherent representation of the object shape. A principled way to understand the underlying neural computations of human haptic perception is through normative modelling. We propose a Bayesian perceptual model for recursive integration of noisy proprioceptive hand pose with noisy skin-object contacts. The model simultaneously forms an optimal estimate of the true hand pose and a representation of the explored shape in an object-centred coordinate system. A classification algorithm can thus be applied to distinguish among different objects solely based on the similarity of their representations. This enables the comparison, in real time, of the shape of an object identified by human subjects with the shape of the same object predicted by our model using motion capture data. Therefore, our work provides a framework for a principled study of human haptic exploration of complex objects.
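    Recursive Bayesian integration of a noisy belief with a new noisy measurement, as in the model above, has a closed form when both are Gaussian. A one-dimensional sketch is shown below; the full model tracks hand pose and object shape jointly, which this simplification omits.

```python
def kalman_update(mean, var, meas, meas_var):
    """One recursive Bayesian (Kalman) fusion step for scalar Gaussians:
    combine the current belief N(mean, var) with a measurement
    N(meas, meas_var). The gain weights the measurement by its relative
    precision, so the posterior variance always shrinks."""
    k = var / (var + meas_var)          # Kalman gain in [0, 1]
    new_mean = mean + k * (meas - mean)
    new_var = (1.0 - k) * var
    return new_mean, new_var
```

    Applied recursively over a contact sequence, each update sharpens the pose estimate, which in turn lets new contacts be placed more accurately in the object-centred shape representation.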

    Spatial representation and visual impairment - Developmental trends and new technological tools for assessment and rehabilitation

    It is well known that perception is mediated by the five sensory modalities (sight, hearing, touch, smell and taste), which allow us to explore the world and build a coherent spatio-temporal representation of the surrounding environment. Typically, our brain collects and integrates coherent information from all the senses to build a reliable spatial representation of the world. In this sense, perception does not emerge from the individual activity of distinct sensory modalities operating as separate modules, but rather from multisensory integration processes. The interaction occurs whenever inputs from the senses are coherent in time and space (Eimer, 2004). Therefore, spatial perception emerges from the contribution of unisensory and multisensory information, with a predominant role of visual information for space processing during the first years of life. Although a growing body of research indicates that visual experience is essential to develop spatial abilities, to date very little is known about the mechanisms underpinning spatial development when the visual input is impoverished (low vision) or missing (blindness). The thesis's main aim is to increase knowledge about the impact of visual deprivation on spatial development and consolidation and to evaluate the effects of novel technological systems to quantitatively improve perceptual and cognitive spatial abilities in case of visual impairments. Chapter 1 summarizes the main research findings related to the role of vision and multisensory experience on spatial development. Overall, such findings indicate that visual experience facilitates the acquisition of allocentric spatial capabilities, namely perceiving space according to a perspective different from our body. Therefore, it might be stated that the sense of sight allows a more comprehensive representation of spatial information, since it is based on environmental landmarks that are independent of body perspective.
Chapter 2 presents original studies carried out by me as a Ph.D. student to investigate the developmental mechanisms underpinning spatial development and compare the spatial performance of individuals with affected and typical visual experience, respectively visually impaired and sighted. Overall, these studies suggest that vision facilitates the spatial representation of the environment by conveying the most reliable spatial reference, i.e., allocentric coordinates. However, when visual feedback is permanently or temporarily absent, as in the case of congenital blindness or blindfolded individuals, respectively, compensatory mechanisms might support the refinement of haptic and auditory spatial coding abilities. The studies presented in this chapter validate novel experimental paradigms to assess the role of haptic and auditory experience on spatial representation based on external (i.e., allocentric) frames of reference. Chapter 3 describes the validation process of new technological systems based on unisensory and multisensory stimulation, designed to rehabilitate spatial capabilities in case of visual impairment. Overall, the technological validation of new devices will provide the opportunity to develop an interactive platform to rehabilitate spatial impairments following visual deprivation. Finally, Chapter 4 summarizes the findings reported in the previous chapters, focusing attention on the consequences of visual impairment on the development of unisensory and multisensory spatial experience in visually impaired children and adults compared to sighted peers. It also highlights the potential of novel experimental tools to assess spatial competencies in response to unisensory and multisensory events and to train residual sensory modalities within a multisensory rehabilitation framework.

    Communication of Digital Material Appearance Based on Human Perception

    In daily life, we encounter digital materials and interact with them in numerous situations, for instance when we play computer games, watch a movie, see billboards in the metro station or buy new clothes online. While some of these virtual materials are given by computational models that describe the appearance of a particular surface based on its material and the illumination conditions, some others are presented as simple digital photographs of real materials, as is usually the case for material samples from online retailing stores. The utilization of computer-generated materials entails significant advantages over plain images, as they allow realistic experiences in virtual scenarios, cooperative product design, advertising in the prototype phase or exhibition of furniture and wearables in specific environments. However, even though exceptional material reproduction quality has been achieved in the domain of computer graphics, current technology is still far away from highly accurate photo-realistic virtual material reproductions for the wide range of existing categories and, for this reason, many material catalogs still use pictures or even physical material samples to illustrate their collections. An important reason for this gap between digital and real material appearance is that the connections between physical material characteristics and the visual quality perceived by humans are far from well understood. Our investigations intend to shed some light in this direction.
Concretely, we explore the ability of state-of-the-art digital material models to communicate physical and subjective material qualities, observing that part of the tactile/haptic information (e.g., thickness, hardness) is missing due to the geometric abstractions intrinsic to the model. Consequently, in order to account for the information deteriorated during the digitization process, we investigate the interplay between different sensing modalities (vision and hearing) and discover that particular sound cues, in combination with visual information, facilitate the estimation of such tactile material qualities. One of the shortcomings when studying material appearance is the lack of perceptually derived metrics able to answer questions like "are materials A and B more similar than C and D?", which arise in many computer graphics applications. In the absence of such metrics, our studies compare different appearance models in terms of how capable they are of depicting a collection of meaningful perceptual qualities. To address this problem, we introduce a methodology to compute the perceived pairwise similarity between textures from material samples that makes use of patch-based texture synthesis algorithms and is inspired by the notion of Just-Noticeable Differences. Our technique is able to overcome some of the issues posed by previous texture similarity collection methods and produces meaningful distances between samples. In summary, with the contents presented in this thesis we delve deeply into how humans perceive digital and real materials through different senses, acquire a better understanding of texture similarity by developing a perceptually based metric, and provide a groundwork for further investigations into the perception of digital materials.
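    As one illustrative stand-in for a pairwise texture distance (not the synthesis-based, JND-derived measure the thesis develops), one can compare simple patch statistics of two texture images. The function name, patch size, and chi-square comparison are all assumptions made for this sketch.

```python
import numpy as np

def patch_histogram_distance(tex_a, tex_b, patch=8, bins=16):
    """Toy texture distance: chi-square distance between the average
    intensity histograms of non-overlapping patches. Inputs are 2-D
    arrays with values in [0, 1]. Zero means identical statistics."""
    def mean_patch_hist(img):
        h, w = img.shape
        hists = []
        for y in range(0, h - patch + 1, patch):
            for x in range(0, w - patch + 1, patch):
                p = img[y:y + patch, x:x + patch]
                hist, _ = np.histogram(p, bins=bins, range=(0.0, 1.0),
                                       density=True)
                hists.append(hist)
        return np.mean(hists, axis=0)

    ha, hb = mean_patch_hist(tex_a), mean_patch_hist(tex_b)
    # Chi-square distance; epsilon guards against empty bins.
    return 0.5 * np.sum((ha - hb) ** 2 / (ha + hb + 1e-9))
```

    A perceptual metric like the one proposed in the thesis would instead calibrate such raw distances against human similarity judgments collected near the just-noticeable-difference threshold.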

    Sensory Communication

    Contains table of contents for Section 2, an introduction, and reports on twelve research projects.
    Support: National Institutes of Health Grant 5 R01 DC00117; National Institutes of Health Contract 2 P01 DC00361; National Institutes of Health Grant 5 R01 DC00126; National Institutes of Health Grant R01-DC00270; U.S. Air Force - Office of Scientific Research Contract AFOSR-90-0200; National Institutes of Health Grant R29-DC00625; U.S. Navy - Office of Naval Research Grant N00014-88-K-0604; U.S. Navy - Office of Naval Research Grant N00014-91-J-1454; U.S. Navy - Office of Naval Research Grant N00014-92-J-1814; U.S. Navy - Naval Training Systems Center Contract N61339-93-M-1213; U.S. Navy - Naval Training Systems Center Contract N61339-93-C-0055; U.S. Navy - Naval Training Systems Center Contract N61339-93-C-0083; U.S. Navy - Office of Naval Research Grant N00014-92-J-4005; U.S. Navy - Office of Naval Research Grant N00014-93-1-119

    Grabbing your ear: rapid auditory-somatosensory multisensory interactions in low-level sensory cortices are not constrained by stimulus alignment.

    Multisensory interactions are observed in species from single-cell organisms to humans. Important early work was primarily carried out in the cat superior colliculus, and a set of critical parameters for their occurrence was defined. Primary among these were temporal synchrony and spatial alignment of bisensory inputs. Here, we assessed whether spatial alignment was also a critical parameter for the temporally earliest multisensory interactions that are observed in lower-level sensory cortices of the human. While multisensory interactions in humans have been shown behaviorally for spatially disparate stimuli (e.g. the ventriloquist effect), it is not clear if such effects are due to early sensory level integration or later perceptual level processing. In the present study, we used psychophysical and electrophysiological indices to show that auditory-somatosensory interactions in humans occur via the same early sensory mechanism both when stimuli are in and out of spatial register. Subjects more rapidly detected multisensory than unisensory events. At just 50 ms post-stimulus, neural responses to the multisensory 'whole' were greater than the summed responses from the constituent unisensory 'parts'. For all spatial configurations, this effect followed from a modulation of the strength of brain responses, rather than the activation of regions specifically responsive to multisensory pairs. Using the local auto-regressive average source estimation, we localized the initial auditory-somatosensory interactions to auditory association areas contralateral to the side of somatosensory stimulation. Thus, multisensory interactions can occur across wide peripersonal spatial separations remarkably early in sensory processing and in cortical regions traditionally considered unisensory.
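    The "whole versus summed parts" comparison described above can be sketched as follows: subtract the sum of the unisensory responses from the multisensory response at each time point, and find the earliest latency at which the difference becomes superadditive. The function name, the zero threshold, and the toy waveforms are assumptions, not the study's analysis pipeline.

```python
import numpy as np

def interaction_onset(erp_multi, erp_aud, erp_som, times_ms, threshold=0.0):
    """Nonlinear interaction index (multisensory minus summed unisensory
    responses) per time point, plus the earliest latency at which it
    exceeds the threshold, i.e. the 'whole' > sum of the 'parts'.
    Returns (difference_waveform, onset_latency_or_None)."""
    diff = np.asarray(erp_multi) - (np.asarray(erp_aud) + np.asarray(erp_som))
    above = np.nonzero(diff > threshold)[0]
    onset = times_ms[above[0]] if above.size else None
    return diff, onset
```

    In the study's data this index first departs from additivity at about 50 ms post-stimulus, which is what licenses the claim that the interaction arises at an early sensory stage rather than in later perceptual processing.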