Search CORE

6,081 research outputs found

Student Teaching and Research Laboratory Focusing on Brain-computer Interface Paradigms - A Creative Environment for Computer Science Students -

Author: Rutkowski Tomasz M.
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 13/06/2015
Field of study

This paper presents an applied concept of a brain-computer interface (BCI) student research laboratory (BCI-LAB) at the Life Science Center of TARA, University of Tsukuba, Japan. Several successful case studies of the student projects are reviewed together with the BCI Research Award 2014 winner case. The BCI-LAB design and project-based teaching philosophy is also explained. Future teaching and research directions summarize the review.Comment: 4 pages, 4 figures, accepted for EMBC 2015, IEEE copyrigh

arXiv.org e-Print Archive

Crossref

Acoustic Space Learning for Sound Source Separation and Localization on Binaural Manifolds

Author: Deleforge Antoine
Forbes Florence
Horaud Radu
Publication venue: 'World Scientific Pub Co Pte Lt'
Publication date: 20/03/2014
Field of study

In this paper we address the problems of modeling the acoustic space generated by a full-spectrum sound source and of using the learned model for the localization and separation of multiple sources that simultaneously emit sparse-spectrum sounds. We lay theoretical and methodological grounds in order to introduce the binaural manifold paradigm. We perform an in-depth study of the latent low-dimensional structure of the high-dimensional interaural spectral data, based on a corpus recorded with a human-like audiomotor robot head. A non-linear dimensionality reduction technique is used to show that these data lie on a two-dimensional (2D) smooth manifold parameterized by the motor states of the listener, or equivalently, the sound source directions. We propose a probabilistic piecewise affine mapping model (PPAM) specifically designed to deal with high-dimensional data exhibiting an intrinsic piecewise linear structure. We derive a closed-form expectation-maximization (EM) procedure for estimating the model parameters, followed by Bayes inversion for obtaining the full posterior density function of a sound source direction. We extend this solution to deal with missing data and redundancy in real world spectrograms, and hence for 2D localization of natural sound sources such as speech. We further generalize the model to the challenging case of multiple sound sources and we propose a variational EM framework. The associated algorithm, referred to as variational EM for source separation and localization (VESSL) yields a Bayesian estimation of the 2D locations and time-frequency masks of all the sources. Comparisons of the proposed approach with several existing methods reveal that the combination of acoustic-space learning with Bayesian inference enables our method to outperform state-of-the-art methods.Comment: 19 pages, 9 figures, 3 table

arXiv.org e-Print Archive

Crossref

Hal - Université Grenoble Alpes

INRIA a CCSD electronic archive server

HAL-Rennes 1

Aerospace Medicine and Biology. A continuing bibliography (Supplement 226)

Author
Publication venue
Publication date
Field of study

This bibliography lists 129 reports, articles, and other documents introduced into the NASA scientific and technical information system in November 1981

NASA Technical Reports Server

Sound Localization for Robot Navigation

Author: Jie Huang
Publication venue: 'IntechOpen'
Publication date: 01/03/2010
Field of study

Non

IntechOpen

Crossref

Bioinspired auditory sound localisation for improving the signal to noise ratio of socially interactive robots

Author: Erwin Harry
Murray John C.
Wermter Stefan
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2006
Field of study

In this paper we describe a bioinspired hybrid architecture for acoustic sound source localisation and tracking to increase the signal to noise ratio (SNR) between speaker and background sources for a socially interactive robot's speech recogniser system. The model presented incorporates the use of Interaural Time Differ- ence for azimuth estimation and Recurrent Neural Net- works for trajectory prediction. The results are then pre- sented showing the difference in the SNR of a localised and non-localised speaker source, in addition to presenting the recognition rates between a localised and non-localised speaker source. From the results presented in this paper it can be seen that by orientating towards the sound source of interest the recognition rates of that source can be in- creased

University of Lincoln Institutional Repository

CiteSeerX

Crossref

Self-Supervised Vision-Based Detection of the Active Speaker as Support for Socially-Aware Language Acquisition

Author: Beskow Jonas
Salvi Giampiero
Stefanov Kalin
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2019
Field of study

This paper presents a self-supervised method for visual detection of the active speaker in a multi-person spoken interaction scenario. Active speaker detection is a fundamental prerequisite for any artificial cognitive system attempting to acquire language in social settings. The proposed method is intended to complement the acoustic detection of the active speaker, thus improving the system robustness in noisy conditions. The method can detect an arbitrary number of possibly overlapping active speakers based exclusively on visual information about their face. Furthermore, the method does not rely on external annotations, thus complying with cognitive development. Instead, the method uses information from the auditory modality to support learning in the visual domain. This paper reports an extensive evaluation of the proposed method using a large multi-person face-to-face interaction dataset. The results show good performance in a speaker dependent setting. However, in a speaker independent setting the proposed method yields a significantly lower performance. We believe that the proposed method represents an essential component of any artificial cognitive system or robotic platform engaging in social interactions.Comment: 10 pages, IEEE Transactions on Cognitive and Developmental System

arXiv.org e-Print Archive

Publikationer från KTH

Digitala Vetenskapliga Arkivet - Academic Archive On-line

NORA - Norwegian Open Research Archives