
    Reflection-Aware Sound Source Localization

    We present a novel, reflection-aware method for 3D sound localization in indoor environments. Unlike prior approaches, which are mainly based on continuous sound signals from a stationary source, our formulation is designed to localize the position instantaneously from the signals within a single frame. We consider direct sound and indirect sound signals that reach the microphones after reflecting off surfaces such as ceilings or walls. We then generate and trace direct and reflected acoustic paths using inverse acoustic ray tracing and combine these paths with Monte Carlo localization to estimate a 3D sound source position. We have implemented our method on a robot with a cube-shaped microphone array and tested it in different settings with continuous and intermittent sound signals from a stationary or a mobile source. Across these settings, our approach localizes the sound with an average distance error of 0.8 m in a room with a 7 m by 7 m area and a 3 m height, including with a mobile and non-line-of-sight sound source. We also show that modeling indirect rays increases localization accuracy by 40% compared to using only direct acoustic rays. Comment: Submitted to ICRA 2018. A working video is available at https://youtu.be/TkQ36lMEC-M.
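    The combination of traced acoustic rays and Monte Carlo localization described in this abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the ray representation (origin plus unit direction, with a reflected ray modeled as a ray from a mirrored microphone position), the Gaussian weighting, and the function name are all assumptions made for illustration.

    ```python
    import math
    import random

    def mcl_update(particles, rays, sigma=0.5):
        """One Monte Carlo localization step: weight each candidate 3D source
        position by its total distance to the traced acoustic rays, then
        resample positions in proportion to those weights."""
        def point_ray_dist(p, origin, direction):
            # Distance from point p to the ray, clamped to the forward half-line.
            v = [p[i] - origin[i] for i in range(3)]
            t = max(0.0, sum(v[i] * direction[i] for i in range(3)))
            closest = [origin[i] + t * direction[i] for i in range(3)]
            return math.dist(p, closest)

        weights = []
        for p in particles:
            d = sum(point_ray_dist(p, o, u) for o, u in rays)
            weights.append(math.exp(-d * d / (2 * sigma * sigma)))
        total = sum(weights) or 1.0
        weights = [w / total for w in weights]
        # Resample with replacement: good candidates survive, bad ones vanish.
        return random.choices(particles, weights=weights, k=len(particles))
    ```

    In this toy form, a reflected path contributes exactly like a direct one, which is why adding indirect rays tightens the particle cloud around the true source.
    
    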

    Robust sound source mapping using three-layered selective audio rays for mobile robots

    © 2016 IEEE. This paper investigates sound source mapping in a real environment using a mobile robot. Our approach is based on audio ray tracing, which integrates occupancy grids and sound source localization using a laser range finder and a microphone array. Previous audio ray tracing approaches rely on all observed rays and grids, so observation errors caused by sound reflection, sound occlusion, wall occlusion, sounds at misdetected grids, and so on can significantly degrade the ability to locate sound sources in a map. A three-layered selective audio ray tracing mechanism is proposed in this work. The first layer conducts frame-based unreliable ray rejection (sensory rejection), considering sound reflection and wall occlusion. The second layer introduces triangulation and audio tracing to detect falsely detected sound sources, rejecting audio rays associated with these misdetected sound sources (short-term rejection). A third layer rejects rays using the whole observation history (long-term rejection) to disambiguate sound occlusion. Experimental results under various situations are presented, demonstrating the effectiveness of our method.
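    The triangulation idea behind the short-term rejection layer can be sketched in 2D: two bearing rays that intersect in front of both sensor poses support a source hypothesis, while an intersection behind either pose is a cue for a spurious ray. The function below and its rejection rule are illustrative assumptions, not the paper's actual algorithm.

    ```python
    import math

    def triangulate(o1, th1, o2, th2):
        """Intersect two 2-D bearing rays (origin, heading in radians).
        Returns the intersection point, or None when the rays are parallel
        or meet behind either origin (a candidate for rejection)."""
        d1 = (math.cos(th1), math.sin(th1))
        d2 = (math.cos(th2), math.sin(th2))
        denom = d1[0] * d2[1] - d1[1] * d2[0]
        if abs(denom) < 1e-9:
            return None  # parallel bearings: no triangulation possible
        # Solve o1 + t1*d1 == o2 + t2*d2 for the ray parameters t1, t2.
        dx, dy = o2[0] - o1[0], o2[1] - o1[1]
        t1 = (dx * d2[1] - dy * d2[0]) / denom
        t2 = (dx * d1[1] - dy * d1[0]) / denom
        if t1 < 0 or t2 < 0:
            return None  # intersection behind a sensor pose: reject the pair
        return (o1[0] + t1 * d1[0], o1[1] + t1 * d1[1])
    ```

    Rays that fail to triangulate consistently with their neighbors would then be dropped before the map update.
    
    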

    Acoustic Echo Estimation using the model-based approach with Application to Spatial Map Construction in Robotics


    Developing a Home Service Robot Platform for Smart Homes

    The purpose of this work is to develop a testbed for a smart home environment integrated with a home service robot (ASH Testbed), as well as to build home service robot platforms. The architecture of the ASH Testbed was proposed and implemented based on ROS (Robot Operating System). In addition, two robot platforms, the ASCCHomeBots, were developed using an iRobot Create base and a Pioneer base. They are equipped with capabilities such as mapping and autonomous navigation, as well as natural human interfaces including hand-gesture recognition using an RGB-D camera, online speech recognition through cloud computing services provided by Google, and local speech recognition based on PocketSphinx. Furthermore, the Pioneer-based ASCCHomeBot was developed along with an open audition system, allowing the robot to serve the elderly living alone at home. We successfully implemented the software for this system, which realizes robot services and audition services for high-level applications such as telepresence video conferencing, sound source position estimation, multiple-source speech recognition, and human-assisted sound classification. Our experimental results validated the proposed framework and the effectiveness of the developed robots as well as the proposed testbed.

    Autonomous robot systems and competitions: proceedings of the 12th International Conference

    This is the 2012 edition of the scientific meeting of the Portuguese Robotics Open (ROBOTICA'2012). It aims to disseminate scientific contributions and to promote discussion of theories, methods, and experiences in areas of relevance to autonomous robotics and robotic competitions. All accepted contributions are included in this proceedings book. The conference program also included an invited talk by Dr. ir. Raymond H. Cuijpers, from the Department of Human Technology Interaction of Eindhoven University of Technology, Netherlands. The conference is kindly sponsored by the IEEE Portugal Section / IEEE RAS Chapter and by the SPR (Sociedade Portuguesa de Robótica).

    Adaptive and learning-based formation control of swarm robots

    Autonomous aerial and wheeled mobile robots play a major role in tasks such as search and rescue, transportation, monitoring, and inspection. However, these operations face several open challenges, including robust autonomy and adaptive coordination based on the environment and operating conditions, particularly in swarm robots with limited communication and perception capabilities. Furthermore, the computational complexity increases exponentially with the number of robots in the swarm. This thesis examines two different aspects of the formation control problem. On the one hand, we investigate how formation could be performed by swarm robots with limited communication and perception (e.g., the Crazyflie nano quadrotor). On the other hand, we explore human-swarm interaction (HSI) and different shared-control mechanisms between humans and swarm robots (e.g., the BristleBot) for artistic creation. In particular, we combine bio-inspired techniques (i.e., flocking, foraging) with learning-based control strategies (using artificial neural networks) for adaptive control of multi-robot systems. We first review how learning-based control and networked dynamical systems can be used to assign distributed and decentralized policies to individual robots such that the desired formation emerges from their collective behavior. We proceed by presenting a novel flocking control for UAV swarms using deep reinforcement learning. We formulate the flocking formation problem as a partially observable Markov decision process (POMDP) and consider a leader-follower configuration, where consensus among all UAVs is used to train a shared control policy and each UAV performs actions based on the local information it collects. In addition, to avoid collisions among UAVs and to guarantee flocking and navigation, a reward function is designed that combines global flocking maintenance, a mutual reward, and a collision penalty.
    We adapt deep deterministic policy gradient (DDPG) with centralized training and decentralized execution to obtain the flocking control policy using actor-critic networks and a global state space matrix. In the context of swarm robotics in the arts, we investigate how the formation paradigm can serve as an interaction modality for artists to aesthetically utilize swarms. In particular, we explore particle swarm optimization (PSO) and random walks to control the communication between a team of robots with swarming behavior for musical creation.
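    A reward of the kind the abstract names (flocking maintenance, a mutual reward, and a collision penalty) might be sketched as below. The function, its reference spacing, and its penalty constants are hypothetical illustrations, not the thesis's actual reward.

    ```python
    import math

    def flocking_reward(positions, leader_idx=0, d_ref=1.0, d_col=0.3):
        """Toy flocking reward over 2-D agent positions:
        - maintenance: penalize deviation from the reference spacing d_ref,
        - collision: large penalty when two agents come closer than d_col,
        - mutual: a shared bonus when every agent stays near the leader."""
        n = len(positions)
        reward = 0.0
        for i in range(n):
            for j in range(i + 1, n):
                d = math.dist(positions[i], positions[j])
                reward -= abs(d - d_ref)      # flocking maintenance
                if d < d_col:
                    reward -= 10.0            # collision penalty
        if all(math.dist(positions[leader_idx], positions[i]) < 2 * d_ref
               for i in range(n)):
            reward += 1.0                     # mutual (shared) reward
        return reward
    ```

    In a centralized-training, decentralized-execution setup, a scalar of this shape would be shared across followers while each UAV acts only on local observations.
    
    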

    Calibration of sound source localisation for robots using multiple adaptive filter models of the cerebellum

    The aim of this research was to investigate the calibration of Sound Source Localisation (SSL) for robots using the adaptive filter model of the cerebellum, and how this could be automatically adapted for multiple acoustic environments. The role of the cerebellum has mainly been identified in the context of motor control, and only in recent years has it been recognised that it has a wider role to play in the senses and cognition. The adaptive filter model of the cerebellum has been successfully applied to a number of robotics applications, but so far none involving the auditory sense. Multiple-model frameworks such as MOdular Selection And Identification for Control (MOSAIC) have also been developed in the context of motor control, and this has been the inspiration for adapting audio calibration to multiple acoustic environments; again, application of this approach in the area of the auditory sense is completely new. The thesis showed that it was possible to calibrate the output of an SSL algorithm using the adaptive filter model of the cerebellum, improving performance compared to the uncalibrated SSL. Using an adaptation of the MOSAIC framework, and specifically using responsibility estimation, a system was developed that was able to select an appropriate set of cerebellar calibration models and to combine their outputs in proportion to how well each was able to calibrate, improving the SSL estimate in multiple acoustic contexts, including novel ones. The thesis also developed a responsibility predictor, also part of the MOSAIC framework, which improved the robustness of the system to abrupt changes in context that could otherwise have resulted in a large performance error. Responsibility prediction also improved robustness to missing ground truth, which can occur in challenging environments where sensory feedback of ground truth may become impaired; this has not been addressed in the MOSAIC literature, adding to the novelty of the thesis.
    The utility of the so-called cerebellar chip has been further demonstrated through the development of a responsibility predictor that is based on the adaptive filter model of the cerebellum, rather than the more conventional function-fitting neural network used in the literature. Lastly, it was demonstrated that the multiple cerebellar calibration architecture is capable of limited self-organisation from a de novo state, with a predetermined number of models. It was also demonstrated that the responsibility predictor could learn against its model after self-organisation and, to a limited extent, during self-organisation. The thesis addresses an important question of how a robot could improve its ability to listen in multiple, challenging acoustic environments, and recommends future work to develop this ability.
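    The two ingredients the abstract combines, an adaptive filter that learns a calibration and MOSAIC-style responsibility weighting over several such filters, can be sketched minimally. The LMS update, the scalar feature vector, and the softmax form of the responsibility are illustrative assumptions, not the thesis's cerebellar model.

    ```python
    import math

    class AdaptiveFilter:
        """Least-mean-squares (LMS) filter, a simple stand-in for the
        cerebellar adaptive filter: it learns weights mapping raw SSL
        features to a calibrated bearing estimate."""
        def __init__(self, n_taps, lr=0.05):
            self.w = [0.0] * n_taps
            self.lr = lr

        def predict(self, x):
            return sum(wi * xi for wi, xi in zip(self.w, x))

        def update(self, x, target):
            # Widrow-Hoff rule: move weights along the error gradient.
            err = target - self.predict(x)
            self.w = [wi + self.lr * err * xi for wi, xi in zip(self.w, x)]
            return err

    def responsibilities(recent_errors, beta=1.0):
        """MOSAIC-style soft responsibility: models with smaller recent
        error receive larger weight (softmax over negative squared error),
        so their calibrations dominate the combined output."""
        scores = [math.exp(-beta * e * e) for e in recent_errors]
        total = sum(scores)
        return [s / total for s in scores]
    ```

    One filter per acoustic context, blended by responsibility, gives a combined calibration that tracks context changes as the error pattern shifts.
    
    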

    Advances in Robot Navigation

    Robot navigation comprises several interrelated activities: perception, obtaining and interpreting sensory information; exploration, the strategy that guides the robot in selecting the next direction to go; mapping, the construction of a spatial representation from the sensory information perceived; localization, the strategy for estimating the robot's position within the spatial map; path planning, the strategy for finding a path towards a goal location, optimal or not; and path execution, where motor actions are determined and adapted to environmental changes. This book integrates results from the research work of authors all over the world, addressing the above-mentioned activities and analyzing the critical implications of dealing with dynamic environments. Different solutions providing adaptive navigation take their inspiration from nature, and diverse applications are described in the context of an important field of study: social robotics.

    Augmented reality (AR) for surgical robotic and autonomous systems: State of the art, challenges, and solutions

    Despite the substantial progress achieved in the development and integration of augmented reality (AR) in surgical robotic and autonomous systems (RAS), the focus of most devices remains on improving end-effector dexterity and precision, as well as access to minimally invasive surgeries. This paper aims to provide a systematic review of different types of state-of-the-art surgical robotic platforms while identifying areas for technological improvement. We associate specific control features, such as haptic feedback, sensory stimuli, and human-robot collaboration, with AR technology for performing complex surgical interventions with increased user perception of the augmented world. Researchers in the field have long faced issues with low accuracy in tool placement around complex trajectories, pose estimation, and depth perception during two-dimensional medical imaging. A number of robots described in this review, such as Novarad and SpineAssist, are analyzed in terms of their hardware features, computer vision systems (such as deep learning algorithms), and the clinical relevance of the literature. We outline the shortcomings of current optimization algorithms for surgical robots (such as YOLO and LSTM) while providing mitigating solutions for internal tool-to-organ collision detection and image reconstruction. The accuracy of results in robot end-effector collisions and reduced occlusion remains promising within the scope of our research, validating the propositions made for the surgical clearance of ever-expanding AR technology in the future.

    Leveraging eXtended Reality & Human-Computer Interaction for User Experience in 360° Video

    EXtended Reality systems have resurged as a medium for work and entertainment. While 360° video has been characterized as less immersive than computer-generated VR, its realism, ease of use, and affordability mean it is in widespread commercial use. Based on the prevalence and potential of the 360° video format, this research is focused on improving and augmenting the user experience of watching 360° video. By leveraging knowledge from eXtended Reality (XR) systems and Human-Computer Interaction (HCI), this research addresses two issues affecting user experience in 360° video: attention guidance and Visually Induced Motion Sickness (VIMS). This research work relies on the construction of multiple artifacts to answer the defined research questions: (1) IVRUX, a tool for analysis of immersive VR narrative experiences; (2) Cue Control, a tool for creation of spatial audio soundtracks for 360° video that also enables the collection and analysis of metrics captured from the user experience; and (3) a VIMS mitigation pipeline, a linear sequence of modules (including optical flow and visual SLAM, among others) that controls parameters for visual modifications such as a restricted Field of View (FoV). These artifacts are accompanied by evaluation studies targeting the defined research questions. Through Cue Control, this research shows that non-diegetic music can be spatialized to act as orientation for users; a partial spatialization of music was deemed ineffective when used for orientation. Additionally, our results demonstrate that diegetic sounds are used for notification rather than orientation. Through the VIMS mitigation pipeline, this research shows that a dynamic restricted FoV is statistically significant in mitigating VIMS, while maintaining desired levels of presence. Both Cue Control and the VIMS mitigation pipeline emerged from a Research through Design (RtD) approach, where the IVRUX artifact is the product of design knowledge and gave direction to the research. The research presented in this thesis is of interest to practitioners and researchers working on 360° video and helps delineate future directions in making 360° video a rich design space for interaction and narrative.
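    The dynamic restricted FoV in such a pipeline can be sketched as a simple mapping from optical-flow magnitude (a rough proxy for the visual motion driving VIMS) to the rendered field of view. The function name, the linear gain, and the comfort clamp below are hypothetical illustrations, not the thesis's actual module.

    ```python
    def restricted_fov(base_fov_deg, flow_mag, gain=20.0, min_fov_deg=40.0):
        """Shrink the rendered field of view as average optical-flow
        magnitude grows, clamped to a minimum comfortable FoV."""
        return max(min_fov_deg, base_fov_deg - gain * flow_mag)
    ```

    Driving such a restrictor per frame from an optical-flow or visual-SLAM estimate is what makes the restriction dynamic rather than a fixed vignette.
    
    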