
    Self-Supervised Vision-Based Detection of the Active Speaker as Support for Socially-Aware Language Acquisition

    This paper presents a self-supervised method for visual detection of the active speaker in a multi-person spoken interaction scenario. Active speaker detection is a fundamental prerequisite for any artificial cognitive system attempting to acquire language in social settings. The proposed method is intended to complement the acoustic detection of the active speaker, thus improving system robustness in noisy conditions. The method can detect an arbitrary number of possibly overlapping active speakers based exclusively on visual information about their faces. Furthermore, the method does not rely on external annotations, thus remaining consistent with cognitive development. Instead, it uses information from the auditory modality to support learning in the visual domain. This paper reports an extensive evaluation of the proposed method on a large multi-person face-to-face interaction dataset. The results show good performance in a speaker-dependent setting, whereas in a speaker-independent setting the method yields significantly lower performance. We believe that the proposed method represents an essential component of any artificial cognitive system or robotic platform engaging in social interactions. Comment: 10 pages, IEEE Transactions on Cognitive and Developmental Systems.
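
    As a rough illustration of the self-supervised idea, the sketch below (Python, with hypothetical names; not the paper's actual architecture or training pipeline) trains a small face-crop classifier using pseudo-labels taken from an acoustic voice-activity detector, so that no human annotations are needed.

        import torch
        import torch.nn as nn

        # Minimal sketch (not the paper's architecture): a small CNN predicts
        # "speaking" from a face crop, trained on pseudo-labels produced by an
        # acoustic voice-activity detector rather than human annotations.
        class FaceSpeakingNet(nn.Module):
            def __init__(self):
                super().__init__()
                self.features = nn.Sequential(
                    nn.Conv2d(3, 16, 3, stride=2, padding=1), nn.ReLU(),
                    nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU(),
                    nn.AdaptiveAvgPool2d(1),
                )
                self.head = nn.Linear(32, 1)   # logit for "is speaking"

            def forward(self, x):
                return self.head(self.features(x).flatten(1))

        def train_step(model, optimiser, face_crops, vad_labels):
            """face_crops: (B, 3, H, W); vad_labels: (B,) 0/1 from the audio VAD."""
            optimiser.zero_grad()
            logits = model(face_crops).squeeze(1)
            loss = nn.functional.binary_cross_entropy_with_logits(logits, vad_labels)
            loss.backward()
            optimiser.step()
            return loss.item()

        model = FaceSpeakingNet()
        opt = torch.optim.Adam(model.parameters(), lr=1e-4)
        # Dummy batch standing in for synchronised face crops and audio VAD decisions.
        loss = train_step(model, opt, torch.randn(8, 3, 64, 64),
                          torch.randint(0, 2, (8,)).float())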

    Implementing flexible rules of interaction for object manipulation in cluttered virtual environments

    Object manipulation in cluttered virtual environments (VEs) brings additional challenges to the design of interaction algorithms compared with open virtual spaces. As the complexity of the algorithms increases, so does the flexibility with which users can interact, but at the expense of much greater implementation difficulty for developers. Three rules that increase the realism and flexibility of interaction are outlined: collision response, order of control, and physical compatibility. The implementation of each is described, highlighting the substantial increase in algorithm complexity that arises. Data are reported from an experiment in which participants manipulated a bulky virtual object through parts of a virtual building (the piano movers’ problem). These data illustrate the benefits to users that accrue from implementing flexible rules of interaction.
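
    As a minimal sketch of the first rule, collision response, the snippet below (illustrative Python, not the study's implementation) accepts a proposed object move only if the object's bounding box stays clear of obstacles, and otherwise keeps the last valid pose.

        # Illustrative collision-response rule (not the study's code): a proposed
        # object move is accepted only if the object's axis-aligned bounding box
        # does not penetrate any obstacle; otherwise the last valid pose is kept.

        def aabb_overlap(a_min, a_max, b_min, b_max):
            return all(a_min[i] <= b_max[i] and b_min[i] <= a_max[i] for i in range(3))

        def respond_to_collision(position, proposed, half_extents, obstacles):
            """position/proposed: (x, y, z); obstacles: list of (min_corner, max_corner)."""
            new_min = [proposed[i] - half_extents[i] for i in range(3)]
            new_max = [proposed[i] + half_extents[i] for i in range(3)]
            for obs_min, obs_max in obstacles:
                if aabb_overlap(new_min, new_max, obs_min, obs_max):
                    return position          # collision: stay at the last valid pose
            return proposed                  # free space: accept the move

        wall = ([1.0, -5.0, 0.0], [1.2, 5.0, 3.0])
        print(respond_to_collision((0.0, 0.0, 1.0), (1.1, 0.0, 1.0),
                                   (0.3, 0.3, 0.3), [wall]))   # move rejected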

    Place recognition using batlike sonar

    Echolocating bats have excellent spatial memory and are able to navigate to salient locations using bio-sonar. Navigating and route-following require animals to recognize places. Currently, it is mostly unknown how bats recognize places using echolocation. In this paper, we propose that template-based place recognition might underlie sonar-based navigation in bats. Under this hypothesis, bats recognize places by remembering their echo signature, rather than their 3D layout. Using a large body of ensonification data collected in three different habitats, we test the viability of this hypothesis by assessing two critical properties of the proposed echo signatures: (1) they can be uniquely classified and (2) they vary continuously across space. Based on the results presented, we conclude that the proposed echo signatures satisfy both criteria. We discuss how these two properties of the echo signatures can support navigation and the building of a cognitive map. DOI: http://dx.doi.org/10.7554/eLife.14188.00
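
    A minimal sketch of the template-matching idea is given below (illustrative Python; the signature format and distance measure are assumptions, not the paper's pipeline): each place stores one echo signature, and a query echo is assigned to the place with the closest stored template.

        import numpy as np

        # Sketch of template-based place recognition (illustrative): each place is
        # remembered as a stored echo signature, and a new echo is assigned to the
        # place whose template it matches most closely.

        def recognise_place(echo, templates):
            """echo: 1-D echo signature; templates: dict place_name -> 1-D signature."""
            def distance(a, b):
                a = (a - a.mean()) / (a.std() + 1e-9)   # normalise so only shape matters
                b = (b - b.mean()) / (b.std() + 1e-9)
                return np.linalg.norm(a - b)
            return min(templates, key=lambda name: distance(echo, templates[name]))

        rng = np.random.default_rng(0)
        templates = {f"place_{i}": rng.random(256) for i in range(3)}
        query = templates["place_1"] + 0.05 * rng.standard_normal(256)   # noisy revisit
        print(recognise_place(query, templates))   # expected: "place_1"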

    Moveable worlds/digital scenographies

    The mixed reality choreographic installation UKIYO explored in this article reflects an interest in scenographic practices that connect physical space to virtual worlds and explore how performers can move between material and immaterial spaces. The spatial design for UKIYO is inspired by Japanese hanamichi and western fashion runways, emphasizing the research production company's commitment to creative crossovers between movement languages, innovative wearable design for interactive performance, acoustic and electronic sound processing, and digital image objects that have a plastic as well as an immaterial/virtual dimension. The work integrates various forms of making art in order to visualize things that are not in themselves visual, or which connect visual and kinaesthetic/tactile/auditory experiences. The ‘Moveable Worlds’ of this essay are also reflections of the narrative spaces, subtexts and auditory relationships in the mutating matrix of an installation space that invites the audience to move around and follow its sensorial experiences, drawn near to the bodies of the dancers.

    Development of a Bio-inspired GNS methodology in the dark environment

    This research explores the connection between the gazing and locomotion of acoustically guided animals and its application to guidance and navigation strategies for autonomous vehicles. Research groups worldwide are currently investigating different technologies and autonomous-guidance algorithm strategies, and nature-inspired approaches offer both efficiency and robustness. This work addresses the lack of bio-inspired methodologies based on acoustically guided animals, as previous work has relied on visually based methodologies for a variety of tasks. It also aims to connect the results of the bat flight experiments of Moss et al. with the Tau Theory of David Lee; the link between Tau Theory and flight dynamics and manoeuvring is of interest not only for autonomous navigation but also for handling qualities and safety improvement. We analysed bat flight data recorded as the animals flew through cluttered environments and related the observed flight behaviour to the extensive literature on how perceived environmental cues, visual and acoustic, guide locomotion. This concept is at an early stage of development, so the aim is to set a baseline for further research on the topic. The results show that bats perform a controlled braking manoeuvre when closing gaps, which we term the ‘Energised Approach’. However, biased errors were found in some cases, making the results inaccurate in certain phases of the analysis. Despite these errors, the findings remain insightful; future studies should incorporate artificial intelligence algorithms in order to achieve more accurate results.
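
    For readers unfamiliar with the Tau Theory referred to above, the short worked sketch below (illustrative Python; the gap profile and the 0.5 threshold are standard textbook values, not the thesis' data) computes tau as the gap divided by its closure rate and checks that tau-dot stays roughly constant during a controlled braking approach.

        import numpy as np

        # Worked sketch of Lee's tau variable (illustrative, not the thesis' analysis):
        # tau is the gap to the obstacle divided by its closure rate.  A roughly
        # constant tau-dot with magnitude at or below ~0.5 is commonly read as a
        # controlled braking manoeuvre that closes the gap without collision.

        def tau_profile(gap, dt):
            """gap: 1-D array of obstacle distances sampled every dt seconds."""
            closure_rate = -np.gradient(gap, dt)        # positive while approaching
            tau = gap / closure_rate                    # first-order time to contact
            tau_dot = np.gradient(tau, dt)
            return tau, tau_dot

        dt = 0.01
        t = np.arange(0.0, 2.0, dt)
        gap = 4.0 * (1.0 - t / 2.0) ** 2                # smooth, decelerating approach
        tau, tau_dot = tau_profile(gap, dt)
        print(round(float(np.median(np.abs(tau_dot))), 2))   # ~0.5 for this profile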

    New Method for Localization and Human Being Detection using UWB Technology: Helpful Solution for Rescue Robots

    Two challenges for rescue robots are to detect human beings and to have an accurate positioning system. In indoor positioning, GPS receivers cannot be used due to the reflections or attenuation caused by obstacles. To detect human beings, sensors such as thermal cameras, ultrasonic sensors and microphones can be embedded on the rescue robot, but their drawback is detection range: they have to be in close proximity to the victim to detect them. UWB technology is therefore very helpful for precisely localizing the rescue robot inside the disaster site and for detecting human beings. We propose a new method that both detects human beings and locates the rescue robot at the same time. To achieve these goals we optimize the design of UWB pulses based on B-splines; the spectral effectiveness is optimized so that the symbols are easier to detect and the impact of noise is reduced. Our positioning system locates the rescue robot with an accuracy of about 2 centimeters. During testing we discovered that UWB signal characteristics change abruptly after passing through a human body, and our system uses this particular signature to detect human bodies.
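
    The snippet below is a rough sketch of shaping a short pulse from cubic B-spline basis functions (illustrative Python using scipy; the knots and coefficients are arbitrary placeholders, not the optimized design described in the paper).

        import numpy as np
        from scipy.interpolate import BSpline

        # Illustrative sketch (not the paper's optimised design): a short UWB-style
        # pulse shaped as a cubic B-spline.  In the proposed method the coefficients
        # would be optimised for spectral effectiveness; here they are arbitrary.
        k = 3                                            # cubic B-spline
        knots = np.concatenate(([0.0] * k, np.linspace(0.0, 1.0, 8), [1.0] * k))
        coeffs = np.array([0.0, 0.4, 1.0, 0.2, -0.8, -0.3, 0.5, 0.1, 0.0, 0.0])
        pulse = BSpline(knots, coeffs, k)

        t = np.linspace(0.0, 1.0, 512)                   # normalised time axis
        samples = pulse(t)
        spectrum = np.abs(np.fft.rfft(samples))          # inspect spectral occupancy
        print(samples.shape, int(spectrum.argmax()))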

    Sensorimotor Model of Obstacle Avoidance in Echolocating Bats

    Bat echolocation is an ability consisting of many subtasks, such as navigation, prey detection and object recognition. Understanding the echolocation capabilities of bats comes down to isolating the minimal set of acoustic cues needed to complete each task. For some tasks, the minimal cues have already been identified. However, while a number of possible cues have been suggested, little is known about the minimal cues supporting obstacle avoidance in echolocating bats. In this paper, we propose that the Interaural Intensity Difference (IID) and the travel time of the first millisecond of the echo train are sufficient cues for obstacle avoidance. We describe a simple control algorithm based on these cues in combination with alternating ear positions modeled after the constant frequency bat Rhinolophus rouxii. Using spatial simulations (2D and 3D), we show that simple phonotaxis can steer a bat clear of obstacles without reconstructing the 3D layout of the scene. As such, this paper presents the first computationally explicit explanation for obstacle avoidance validated in complex simulated environments. Based on additional simulations modelling the FM bat Phyllostomus discolor, we conjecture that the proposed cues can be exploited by constant frequency (CF) and frequency modulated (FM) bats alike. We hypothesize that using a low-level yet robust cue for obstacle avoidance allows bats to comply with the hard real-time constraints of this basic behaviour.
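
    The sketch below gives a simplified illustration of the proposed cue (Python; it omits the alternating ear positions and the acoustic simulation of the authors' model): integrate each ear's echo energy over the first millisecond after the earliest return and turn away from the louder side.

        import numpy as np

        # Simplified illustration of the proposed cue (not the authors' full
        # sensorimotor model): integrate the energy at each ear over the first
        # millisecond after the earliest echo and turn away from the louder side.

        FS = 400_000                  # sample rate of the simulated echo train (Hz)
        WINDOW = int(0.001 * FS)      # first millisecond of the echo train

        def steering_command(left_echo, right_echo, gain=0.5):
            """Return a turn command (positive = turn left) from one pair of echo trains."""
            onset = min(np.argmax(np.abs(left_echo) > 0), np.argmax(np.abs(right_echo) > 0))
            left_energy = np.sum(left_echo[onset:onset + WINDOW] ** 2)
            right_energy = np.sum(right_echo[onset:onset + WINDOW] ** 2)
            iid_db = 10.0 * np.log10((left_energy + 1e-12) / (right_energy + 1e-12))
            return -gain * iid_db     # louder on the left -> negative -> turn right

        # Obstacle closer on the left: the left echo arrives earlier and is stronger.
        n = int(0.005 * FS)
        left = np.zeros(n);  left[400:440] = 1.0
        right = np.zeros(n); right[500:540] = 0.4
        print(steering_command(left, right))   # negative: steer right, away from obstacle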

    Adaptations to changes in the acoustic scene of the echolocating bat

    Our natural environment is noisy and, in order to navigate it successfully, we must filter out the important components so that we may guide our next steps. In analyzing our acoustic scene, one of the most common challenges is to segregate speech communication sounds from background noise; this process is not unique to humans. Echolocating bats emit high-frequency biosonar signals and listen to echoes returning off objects in their environment. The sound wave they receive is a merging of echoes reflecting off target prey and other scattered objects, conspecific calls and echoes, and any naturally occurring environmental noises. The bat is faced with the challenge of segregating this complex sound wave into the components of interest so as to adapt its flight and echolocation behavior in response to fast and dynamic environmental changes. In this thesis, we employ two approaches to investigate the mechanisms that may aid the bat in analyzing its acoustic scene. First, we test the bat’s adaptations to changes in controlled echo-acoustic flow patterns, similar to those it may encounter when flying along forest edges and among clutter. Our findings show that big brown bats adapt their flight paths in response to the intervals between echoes, and suggest that there is a limit to how closely objects can be spaced before the bat no longer represents them as distinct. Further, we consider how bats that use different echolocation signals navigate similar environments, and provide evidence of species-specific flight and echolocation adaptations. Second, we investigate how the temporal patterning of echolocation calls is affected during competitive foraging by paired bats in open and cluttered environments. Our findings show that “silent behavior”, the ceasing of echolocation call emission, which had previously been proposed as a mechanism to avoid acoustic interference or to “eavesdrop” on another bat, may not be as common as has been reported.
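
    To make the echo-acoustic flow manipulation concrete, the small worked example below (illustrative numbers only, not the thesis' experimental parameters) computes the two-way echo delay 2d/c for objects at increasing distances, showing how object spacing sets the intervals between returning echoes.

        # Worked example of the echo timing behind echo-acoustic flow stimuli
        # (illustrative numbers, not the thesis' experimental parameters): the echo
        # from an object at distance d returns after 2 * d / c seconds.

        C = 343.0                                   # speed of sound in air, m/s

        def echo_delay_ms(distance_m):
            return 2.0 * distance_m / C * 1000.0

        distances = [0.5, 1.0, 1.5, 2.0]            # objects along a flight corridor (m)
        delays = [echo_delay_ms(d) for d in distances]
        intervals = [b - a for a, b in zip(delays, delays[1:])]
        print([round(d, 2) for d in delays])        # ~[2.92, 5.83, 8.75, 11.66] ms
        print([round(i, 2) for i in intervals])     # ~2.92 ms between successive echoes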
