Student Teaching and Research Laboratory Focusing on Brain-computer Interface Paradigms - A Creative Environment for Computer Science Students -
This paper presents an applied concept of a brain-computer interface (BCI)
student research laboratory (BCI-LAB) at the Life Science Center of TARA,
University of Tsukuba, Japan. Several successful case studies of the student
projects are reviewed together with the BCI Research Award 2014 winner case.
The BCI-LAB design and project-based teaching philosophy is also explained.
Future teaching and research directions summarize the review. Comment: 4 pages, 4 figures, accepted for EMBC 2015, IEEE copyright
Acoustic Space Learning for Sound Source Separation and Localization on Binaural Manifolds
In this paper we address the problems of modeling the acoustic space
generated by a full-spectrum sound source and of using the learned model for
the localization and separation of multiple sources that simultaneously emit
sparse-spectrum sounds. We lay theoretical and methodological grounds in order
to introduce the binaural manifold paradigm. We perform an in-depth study of
the latent low-dimensional structure of the high-dimensional interaural
spectral data, based on a corpus recorded with a human-like audiomotor robot
head. A non-linear dimensionality reduction technique is used to show that
these data lie on a two-dimensional (2D) smooth manifold parameterized by the
motor states of the listener, or equivalently, the sound source directions. We
propose a probabilistic piecewise affine mapping model (PPAM) specifically
designed to deal with high-dimensional data exhibiting an intrinsic piecewise
linear structure. We derive a closed-form expectation-maximization (EM)
procedure for estimating the model parameters, followed by Bayes inversion for
obtaining the full posterior density function of a sound source direction. We
extend this solution to deal with missing data and redundancy in real world
spectrograms, and hence for 2D localization of natural sound sources such as
speech. We further generalize the model to the challenging case of multiple
sound sources and we propose a variational EM framework. The associated
algorithm, referred to as variational EM for source separation and localization
(VESSL), yields a Bayesian estimation of the 2D locations and time-frequency
masks of all the sources. Comparisons of the proposed approach with several
existing methods reveal that the combination of acoustic-space learning with
Bayesian inference enables our method to outperform state-of-the-art methods. Comment: 19 pages, 9 figures, 3 tables
Aerospace Medicine and Biology. A continuing bibliography (Supplement 226)
This bibliography lists 129 reports, articles, and other documents introduced into the NASA scientific and technical information system in November 1981.
Bioinspired auditory sound localisation for improving the signal to noise ratio of socially interactive robots
In this paper we describe a bioinspired hybrid architecture for acoustic sound source localisation and tracking to increase the signal to noise ratio (SNR) between speaker and background sources for a socially interactive robot's speech recogniser system. The model presented incorporates the use of Interaural Time Difference for azimuth estimation and Recurrent Neural Networks for trajectory prediction. The results are then presented showing the difference in the SNR of a localised and non-localised speaker source, in addition to presenting the recognition rates between a localised and non-localised speaker source. From the results presented in this paper it can be seen that by orientating towards the sound source of interest the recognition rates of that source can be increased.
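The abstract above names Interaural Time Difference (ITD) as the cue for azimuth estimation. As a rough illustration only, and not the authors' architecture, the sketch below estimates the ITD by plain cross-correlation of two microphone signals and converts it to an azimuth with a far-field model. The function names, the 0.2 m microphone spacing, the 343 m/s speed of sound, and the sign convention (positive azimuth toward the left microphone) are all assumptions made for this example.

```python
import numpy as np

SPEED_OF_SOUND = 343.0  # m/s, assumed value for air at room temperature


def estimate_itd(left, right, sample_rate):
    """Delay of `right` relative to `left` in seconds (hypothetical helper).

    A positive value means the right channel lags, i.e. the sound
    reached the left microphone first.
    """
    # Full cross-correlation; index 0 corresponds to lag -(len(left) - 1).
    corr = np.correlate(right, left, mode="full")
    lag_samples = int(np.argmax(corr)) - (len(left) - 1)
    return lag_samples / sample_rate


def itd_to_azimuth(itd, mic_distance=0.2):
    """Far-field azimuth in radians from an ITD (hypothetical helper).

    Uses sin(theta) = itd * c / d, where d is the assumed microphone
    spacing. Positive angles point toward the left microphone.
    """
    sin_theta = np.clip(itd * SPEED_OF_SOUND / mic_distance, -1.0, 1.0)
    return float(np.arcsin(sin_theta))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    sample_rate = 16000
    left = rng.standard_normal(1024)
    # Simulate a source on the left: the right channel is delayed by 5 samples.
    right = np.concatenate([np.zeros(5), left[:-5]])
    itd = estimate_itd(left, right, sample_rate)
    print(itd, np.degrees(itd_to_azimuth(itd)))
```

A practical system would typically band-limit the signals and smooth the estimate over time; the trajectory-prediction stage mentioned in the abstract is not covered by this sketch.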
Self-Supervised Vision-Based Detection of the Active Speaker as Support for Socially-Aware Language Acquisition
This paper presents a self-supervised method for visual detection of the
active speaker in a multi-person spoken interaction scenario. Active speaker
detection is a fundamental prerequisite for any artificial cognitive system
attempting to acquire language in social settings. The proposed method is
intended to complement the acoustic detection of the active speaker, thus
improving the system robustness in noisy conditions. The method can detect an
arbitrary number of possibly overlapping active speakers based exclusively on
visual information about their faces. Furthermore, the method does not rely on
external annotations, thus complying with cognitive development. Instead, the
method uses information from the auditory modality to support learning in the
visual domain. This paper reports an extensive evaluation of the proposed
method using a large multi-person face-to-face interaction dataset. The results
show good performance in a speaker-dependent setting. However, in a
speaker-independent setting the proposed method yields significantly lower
performance. We believe that the proposed method represents an essential
component of any artificial cognitive system or robotic platform engaging in
social interactions. Comment: 10 pages, IEEE Transactions on Cognitive and Developmental Systems