12,213 research outputs found
Inferring Room Semantics Using Acoustic Monitoring
Having knowledge of the environmental context of the user i.e. the knowledge
of the users' indoor location and the semantics of their environment, can
facilitate the development of many of location-aware applications. In this
paper, we propose an acoustic monitoring technique that infers semantic
knowledge about an indoor space \emph{over time,} using audio recordings from
it. Our technique uses the impulse response of these spaces as well as the
ambient sounds produced in them in order to determine a semantic label for
them. As we process more recordings, we update our \emph{confidence} in the
assigned label. We evaluate our technique on a dataset of single-speaker human
speech recordings obtained in different types of rooms at three university
buildings. In our evaluation, the confidence\emph{ }for the true label
generally outstripped the confidence for all other labels and in some cases
converged to 100\% with less than 30 samples.Comment: 2017 IEEE International Workshop on Machine Learning for Signal
Processing, Sept.\ 25--28, 2017, Tokyo, Japa
DoubleEcho: Mitigating Context-Manipulation Attacks in Copresence Verification
Copresence verification based on context can improve usability and strengthen
security of many authentication and access control systems. By sensing and
comparing their surroundings, two or more devices can tell whether they are
copresent and use this information to make access control decisions. To the
best of our knowledge, all context-based copresence verification mechanisms to
date are susceptible to context-manipulation attacks. In such attacks, a
distributed adversary replicates the same context at the (different) locations
of the victim devices, and induces them to believe that they are copresent. In
this paper we propose DoubleEcho, a context-based copresence verification
technique that leverages acoustic Room Impulse Response (RIR) to mitigate
context-manipulation attacks. In DoubleEcho, one device emits a wide-band
audible chirp and all participating devices record reflections of the chirp
from the surrounding environment. Since RIR is, by its very nature, dependent
on the physical surroundings, it constitutes a unique location signature that
is hard for an adversary to replicate. We evaluate DoubleEcho by collecting RIR
data with various mobile devices and in a range of different locations. We show
that DoubleEcho mitigates context-manipulation attacks whereas all other
approaches to date are entirely vulnerable to such attacks. DoubleEcho detects
copresence (or lack thereof) in roughly 2 seconds and works on commodity
devices
Regression and Classification for Direction-of-Arrival Estimation with Convolutional Recurrent Neural Networks
We present a novel learning-based approach to estimate the
direction-of-arrival (DOA) of a sound source using a convolutional recurrent
neural network (CRNN) trained via regression on synthetic data and Cartesian
labels. We also describe an improved method to generate synthetic data to train
the neural network using state-of-the-art sound propagation algorithms that
model specular as well as diffuse reflections of sound. We compare our model
against three other CRNNs trained using different formulations of the same
problem: classification on categorical labels, and regression on spherical
coordinate labels. In practice, our model achieves up to 43% decrease in
angular error over prior methods. The use of diffuse reflection results in 34%
and 41% reduction in angular prediction errors on LOCATA and SOFA datasets,
respectively, over prior methods based on image-source methods. Our method
results in an additional 3% error reduction over prior schemes that use
classification based networks, and we use 36% fewer network parameters
The Audio Degradation Toolbox and its Application to Robustness Evaluation
We introduce the Audio Degradation Toolbox (ADT) for the controlled degradation of audio signals, and propose its usage as a means of evaluating and comparing the robustness of audio processing algorithms. Music recordings encountered in practical applications are subject to varied, sometimes unpredictable degradation. For example, audio is degraded by low-quality microphones, noisy recording environments, MP3 compression, dynamic compression in broadcasting or vinyl decay. In spite of this, no standard software for the degradation of audio exists, and music processing methods are usually evaluated against clean data. The ADT fills this gap by providing Matlab scripts that emulate a wide range of degradation types. We describe 14 degradation units, and how they can be chained to create more complex, `real-world' degradations. The ADT also provides functionality to adjust existing ground-truth, correcting for temporal distortions introduced by degradation. Using four different music informatics tasks, we show that performance strongly depends on the combination of method and degradation applied. We demonstrate that specific degradations can reduce or even reverse the performance difference between two competing methods. ADT source code, sounds, impulse responses and definitions are freely available for download
- …