801 research outputs found

Active End-Effector Pose Selection for Tactile Object Recognition through Monte Carlo Tree Search

Author: Zhang Mabel M.
Atanasov Nikolay
Daniilidis Kostas
Publication date: 29/07/2017

This paper considers the problem of active object recognition using touch only. The focus is on adaptively selecting a sequence of wrist poses that achieves accurate recognition by enclosure grasps. It seeks to minimize the number of touches and maximize recognition confidence. The actions are formulated as wrist poses relative to each other, making the algorithm independent of absolute workspace coordinates. The optimal sequence is approximated by Monte Carlo tree search. We demonstrate results in a physics engine and on a real robot. In the physics engine, most object instances were recognized in at most 16 grasps. On a real robot, our method recognized objects in 2 to 9 grasps and outperformed a greedy baseline. Comment: Accepted to International Conference on Intelligent Robots and Systems (IROS) 201

arXiv.org e-Print Archive

ZENODO

FigShare

Multimodality in VR: A survey

Author: Gutierrez Diego
Malpica Sandra
Martin Daniel
Masia Belen
Serrano Ana
Publication venue: Association for Computing Machinery (ACM)
Publication date: 01/01/2022

Virtual reality (VR) is rapidly growing, with the potential to change the way we create and consume content. In VR, users integrate the multimodal sensory information they receive to create a unified perception of the virtual world. In this survey, we review the body of work addressing multimodality in VR, and its role and benefits in user experience, together with different applications that leverage multimodality in many disciplines. These works thus encompass several fields of research, and demonstrate that multimodality plays a fundamental role in VR, enhancing the experience, improving overall performance, and yielding unprecedented abilities in skill and knowledge transfer.

arXiv.org e-Print Archive

Repositorio Universidad de Zaragoza

Learning efficient haptic shape exploration with a rigid tactile sensor array

Author: Fleer Sascha
Klatzky Roberta L.
Moringen Alexandra
Ritter Helge
Publication venue: Public Library of Science (PLoS)
Publication date: 01/01/2020

Haptic exploration is a key skill for both robots and humans to discriminate and handle unknown objects or to recognize familiar objects. Its active nature is evident in humans who from early on reliably acquire sophisticated sensory-motor capabilities for active exploratory touch and directed manual exploration that associates surfaces and object properties with their spatial locations. This is in stark contrast to robotics. In this field, the relative lack of good real-world interaction models - along with very restricted sensors and a scarcity of suitable training data to leverage machine learning methods - has so far rendered haptic exploration a largely underdeveloped skill. In the present work, we connect recent advances in recurrent models of visual attention with previous insights about the organisation of human haptic search behavior, exploratory procedures and haptic glances for a novel architecture that learns a generative model of haptic exploration in a simulated three-dimensional environment. The proposed algorithm simultaneously optimizes main perception-action loop components: feature extraction, integration of features over time, and the control strategy, while continuously acquiring data online. We perform a multi-module neural network training, including a feature extractor and a recurrent neural network module aiding pose control for storing and combining sequential sensory data. The resulting haptic meta-controller for the rigid 16 × 16 tactile sensor array moving in a physics-driven simulation environment, called the Haptic Attention Model, performs a sequence of haptic glances, and outputs corresponding force measurements. The resulting method has been successfully tested with four different objects. It achieved results close to 100% while performing object contour exploration that has been optimized for its own sensor morphology.

arXiv.org e-Print Archive

Directory of Open Access Journals

Biomimetic robots as scientific models: a view from the whisker tip

Author: Mitchinson B.
Pearson M.J.
Pipe A.G.
Prescott A.J.
Publication venue: Cambridge University Press (CUP)
Publication date: 01/09/2011

White Rose Research Online

Robot in the mirror: toward an embodied computational model of mirror self-recognition

Author: Alzueta Elisabet
Hoffmann Matej
Lanillos Pablo
Outrata Vojtech
Wang Shengzhi
Publication venue: Springer Science and Business Media LLC
Publication date: 09/11/2020

Self-recognition or self-awareness is a capacity typically attributed only to humans and a few other species. The definitions of these concepts vary and little is known about the mechanisms behind them. However, there is a Turing test-like benchmark: the mirror self-recognition test, which consists of covertly putting a mark on the face of the tested subject, placing her in front of a mirror, and observing the reactions. In this work, first, we provide a mechanistic decomposition, or process model, of what components are required to pass this test. Based on these, we provide suggestions for empirical research. In particular, in our view, the way infants or animals reach for the mark should be studied in detail. Second, we develop a model to enable the humanoid robot Nao to pass the test. The core of our technical contribution is learning the appearance representation and visual novelty detection by means of learning the generative model of the face with deep auto-encoders and exploiting the prediction error. The mark is identified as a salient region on the face and a reaching action is triggered, relying on a previously learned mapping to arm joint angles. The architecture is tested on two robots with completely different faces. Comment: To appear in KI - Künstliche Intelligenz - German Journal of Artificial Intelligence - Springe

arXiv.org e-Print Archive

Radboud Repository

Multimodality in VR: A Survey

Author: Gutierrez D.
Malpica S.
Martin D.
Masia B.
Serrano A.
Publication date: 01/01/2021

Virtual reality has the potential to change the way we create and consume content in our everyday life. Entertainment, training, design and manufacturing, communication, or advertising are all applications that already benefit from this new medium reaching consumer level. VR is inherently different from traditional media: it offers a more immersive experience, and has the ability to elicit a sense of presence through the place and plausibility illusions. It also gives the user unprecedented capabilities to explore their environment, in contrast with traditional media. In VR, like in the real world, users integrate the multimodal sensory information they receive to create a unified perception of the virtual world. Therefore, the sensory cues that are available in a virtual environment can be leveraged to enhance the final experience. This may include increasing realism, or the sense of presence; predicting or guiding the attention of the user through the experience; or increasing their performance if the experience involves the completion of certain tasks. In this state-of-the-art report, we survey the body of work addressing multimodality in virtual reality, its role and benefits in the final user experience. The works here reviewed thus encompass several fields of research, including computer graphics, human computer interaction, or psychology and perception. Additionally, we give an overview of different applications that leverage multimodal input in areas such as medicine, training and education, or entertainment; we include works in which the integration of multiple sensory information yields significant improvements, demonstrating how multimodality can play a fundamental role in the way VR systems are designed, and VR experiences created and consumed

MPG.PuRe

Proceedings of the 20th BCS HCI Group conference Volume Two

Author: Fields Bob
Healey Patrick
Nickerson Louise Valgerdur
Stockman Tony
Publication date: 30/12/2013

Queen Mary Research Online

10th Tübinger Perception Conference: TWK 2007

Author: Bülthoff H.
Chatziastros A.
Mallot H.
Ulrich R.
Publication venue: Knirsch
Publication date: 01/07/2007

MPG.PuRe

Fusing Multimedia Data Into Dynamic Virtual Environments

Author: Du Ruofei
Publication date: 01/01/2018

In spite of the dramatic growth of virtual and augmented reality (VR and AR) technology, content creation for immersive and dynamic virtual environments remains a significant challenge. In this dissertation, we present our research in fusing multimedia data, including text, photos, panoramas, and multi-view videos, to create rich and compelling virtual environments. First, we present Social Street View, which renders geo-tagged social media in its natural geo-spatial context provided by 360° panoramas. Our system takes into account visual saliency and uses maximal Poisson-disc placement with spatiotemporal filters to render social multimedia in an immersive setting. We also present a novel GPU-driven pipeline for saliency computation in 360° panoramas using spherical harmonics (SH). Our spherical residual model can be applied to virtual cinematography in 360° videos. We further present Geollery, a mixed-reality platform to render an interactive mirrored world in real time with three-dimensional (3D) buildings, user-generated content, and geo-tagged social media. Our user study has identified several use cases for these systems, including immersive social storytelling, experiencing the culture, and crowd-sourced tourism. We next present Video Fields, a web-based interactive system to create, calibrate, and render dynamic videos overlaid on 3D scenes. Our system renders dynamic entities from multiple videos, using early and deferred texture sampling. Video Fields can be used for immersive surveillance in virtual environments. Furthermore, we present the VRSurus and ARCrypt projects to explore the applications of gesture recognition, haptic feedback, and visual cryptography for virtual and augmented reality. Finally, we present our work on Montage4D, a real-time system for seamlessly fusing multi-view video textures with dynamic meshes. We use geodesics on meshes with view-dependent rendering to mitigate spatial occlusion seams while maintaining temporal consistency.
Our experiments show significant enhancement in rendering quality, especially for salient regions such as faces. We believe that Social Street View, Geollery, Video Fields, and Montage4D will greatly facilitate several applications such as virtual tourism, immersive telepresence, and remote education.

Digital Repository at the University of Maryland