Improving elevation perception with a tool for image-guided head-related transfer function selection
This paper proposes an image-guided HRTF selection procedure that exploits the relation between features of the pinna shape and HRTF notches. Using a 2D image of a subject's pinna, the procedure selects from a database the HRTF set that best fits the anthropometry of that subject. The procedure is designed to be quick to apply and easy to use for someone with no previous knowledge of binaural audio technologies. The entire process is evaluated by means of an auditory model for sound localization in the mid-sagittal plane available from previous literature. Using virtual subjects from an HRTF database, a virtual experiment is implemented to assess the vertical localization performance of the database subjects when they are provided with HRTF sets selected by the proposed procedure. Results report a statistically significant improvement in predicted localization performance for the selected HRTFs compared to the KEMAR HRTF, which is a commercial standard in many binaural audio solutions; moreover, the proposed analysis provides useful indications for refining the perceptually motivated metrics that guide the selection.
Frequency Estimation Of The First Pinna Notch In Head-Related Transfer Functions With A Linear Anthropometric Model
The relation between anthropometric parameters and Head-Related Transfer Function (HRTF) features, especially those due to the pinna, is not yet fully understood. In this paper we apply signal processing techniques to extract the frequencies of the main pinna notches (known as N1, N2, and N3) in the frontal part of the median plane and build a model relating them to 13 different anthropometric parameters of the pinna, some of which depend on the elevation angle of the sound source. Results show that while the considered anthropometric parameters cannot approximate either the N2 or the N3 frequency with sufficient accuracy, eight of them are sufficient for modeling the frequency of N1 within a psychoacoustically acceptable margin of error. In particular, distances between the ear canal and the outer helix border are the most important parameters for predicting N1.
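As a rough illustration of what a linear anthropometric model of the N1 frequency can look like, the sketch below fits an ordinary least-squares model with an intercept. It is not the paper's model: the two predictors (hypothetical pinna distances in cm), the N1 values (in kHz), and the coefficients used to generate them are all synthetic assumptions chosen only so the fit can be checked.

```python
def fit_linear_model(X, y):
    """Ordinary least squares with intercept, solved via the normal equations."""
    rows = [[1.0] + list(r) for r in X]          # prepend intercept column
    n = len(rows[0])
    # Augmented matrix [X^T X | X^T y]
    A = [
        [sum(r[i] * r[j] for r in rows) for j in range(n)]
        + [sum(r[i] * t for r, t in zip(rows, y))]
        for i in range(n)
    ]
    # Gauss-Jordan elimination with partial pivoting
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(A[r][c]))
        A[c], A[p] = A[p], A[c]
        pivot = A[c][c]
        A[c] = [v / pivot for v in A[c]]
        for r in range(n):
            if r != c:
                f = A[r][c]
                A[r] = [vr - f * vc for vr, vc in zip(A[r], A[c])]
    return [A[i][n] for i in range(n)]           # [intercept, w1, w2, ...]


def predict(w, x):
    """Predicted N1 frequency for one vector of anthropometric parameters."""
    return w[0] + sum(wi * xi for wi, xi in zip(w[1:], x))


# Synthetic data (assumption): N1 = 12 - 1.5*d1 - 0.8*d2, no noise.
X = [(1.0, 2.0), (1.5, 2.2), (2.0, 1.8), (2.5, 2.6), (1.2, 3.0)]
y = [12.0 - 1.5 * d1 - 0.8 * d2 for d1, d2 in X]

w = fit_linear_model(X, y)
n1_estimate = predict(w, (1.8, 2.4))
```

With real measurements the fit would not be exact, and the residual error would have to be compared against a psychoacoustic tolerance such as the one the abstract refers to.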
Charge separation: From the topology of molecular electronic transitions to the dye/semiconductor interfacial energetics and kinetics
Charge separation properties, that is the ability of a chromophore, or a
chromophore/semiconductor interface, to separate charges upon light absorption,
are crucial characteristics for an efficient photovoltaic device. Starting from
this concept, we devote the first part of this book chapter to the topological
analysis of molecular electronic transitions induced by photon capture. Such
analysis can be either qualitative or quantitative, and is presented here in
the framework of the reduced density matrix theory applied to single-reference,
multiconfigurational excited states. The qualitative strategies are separated
into density-based and wave function-based approaches, while the quantitative
methods reported here for analysing the photoinduced charge transfer nature are
either fragment-based, global or statistical. In the second part of this
chapter we extend the analysis to dye-sensitized metal oxide surface models,
discussing interfacial charge separation, energetics and electron injection
kinetics from the dye excited state to the semiconductor conduction band
states.
Direction of Arrival with One Microphone, a few LEGOs, and Non-Negative Matrix Factorization
Conventional approaches to sound source localization require at least two
microphones. It is known, however, that people with unilateral hearing loss can
also localize sounds. Monaural localization is possible thanks to the
scattering by the head, though it hinges on learning the spectra of the various
sources. We take inspiration from this human ability to propose algorithms for
accurate sound source localization using a single microphone embedded in an
arbitrary scattering structure. The structure modifies the frequency response
of the microphone in a direction-dependent way giving each direction a
signature. While knowing those signatures is sufficient to localize sources of
white noise, localizing speech is much more challenging: it is an ill-posed
inverse problem which we regularize by prior knowledge in the form of learned
non-negative dictionaries. We demonstrate a monaural speech localization
algorithm based on non-negative matrix factorization that does not depend on
sophisticated, designed scatterers. In fact, we show experimental results with
ad hoc scatterers made of LEGO bricks. Even with these rudimentary structures
we can accurately localize arbitrary speakers; that is, we do not need to learn
the dictionary for the particular speaker to be localized. Finally, we discuss
multi-source localization and the related limitations of our approach.
Comment: This article has been accepted for publication in IEEE/ACM
Transactions on Audio, Speech, and Language Processing (TASLP).
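The easier case mentioned in the abstract, localizing white noise from known directional signatures, can be sketched as a simple template match. This is only an illustration under stated assumptions, not the paper's NMF-based speech algorithm: the signature values, the four-bin spectra, and the function names are hypothetical. Because a white-noise source has a flat spectrum, the observed magnitude spectrum is taken to be a scaled copy of the signature for the source direction, so a scale-invariant similarity is enough.

```python
import math


def cosine_similarity(a, b):
    """Scale-invariant similarity between two magnitude spectra."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def localize_white_noise(observed, signatures):
    """Return the direction whose signature best matches the observed spectrum."""
    return max(signatures, key=lambda d: cosine_similarity(observed, signatures[d]))


# Hypothetical per-direction magnitude responses (direction in degrees -> bins).
signatures = {
    0: [1.0, 0.2, 0.8, 0.5],
    90: [0.3, 1.0, 0.4, 0.9],
    180: [0.7, 0.6, 0.1, 1.0],
}

# White noise arriving from 90 degrees with an unknown gain of 2.5.
observed = [2.5 * v for v in signatures[90]]
estimated = localize_white_noise(observed, signatures)
```

For speech the source spectrum is not flat and is unknown, which is why the abstract describes the problem as ill-posed and regularizes it with learned non-negative dictionaries.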
Tensorized Self-Attention: Efficiently Modeling Pairwise and Global Dependencies Together
Neural networks equipped with self-attention have parallelizable computation,
light-weight structure, and the ability to capture both long-range and local
dependencies. Further, their expressive power and performance can be boosted by
using a vector to measure pairwise dependency, but this requires expanding the
alignment matrix to a tensor, which results in memory and computation
bottlenecks. In this paper, we propose a novel attention mechanism called
"Multi-mask Tensorized Self-Attention" (MTSA), which is as fast and as
memory-efficient as a CNN, but significantly outperforms previous
CNN-/RNN-/attention-based models. MTSA 1) captures both pairwise (token2token)
and global (source2token) dependencies by a novel compatibility function
composed of dot-product and additive attentions, 2) uses a tensor to represent
the feature-wise alignment scores for better expressive power but only requires
parallelizable matrix multiplications, and 3) combines multi-head with
multi-dimensional attentions, and applies a distinct positional mask to each
head (subspace), so the memory and computation can be distributed to multiple
heads, each with sequential information encoded independently. The experiments
show that a CNN/RNN-free model based on MTSA achieves state-of-the-art or
competitive performance on nine NLP benchmarks with compelling memory- and
time-efficiency.
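For readers unfamiliar with the building blocks named above, the following is a minimal sketch of plain scaled dot-product (token2token) self-attention with a positional mask. It is not MTSA: there is no additive attention, no feature-wise tensor of scores, and no multi-head split, and the learned query/key/value projections are replaced by the identity as a simplifying assumption. The forward mask used here (each token attends to itself and earlier tokens) is just one example of a positional mask.

```python
import math


def softmax(scores):
    """Numerically stable softmax; -inf entries receive zero weight."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def masked_self_attention(x, mask):
    """x: list of token vectors; mask[i][j] is True if token i may attend to j.

    Every row of the mask must allow at least one position.
    """
    d = len(x[0])
    out = []
    for i, q in enumerate(x):
        scores = [
            sum(a * b for a, b in zip(q, k)) / math.sqrt(d)
            if mask[i][j] else float("-inf")
            for j, k in enumerate(x)
        ]
        weights = softmax(scores)
        out.append(
            [sum(weights[j] * x[j][c] for j in range(len(x))) for c in range(d)]
        )
    return out


tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
n = len(tokens)
# Forward mask: token i sees positions j <= i.
forward_mask = [[j <= i for j in range(n)] for i in range(n)]
attended = masked_self_attention(tokens, forward_mask)
```

The alignment scores here form an n-by-n matrix with one scalar per token pair; the abstract's point is that replacing each scalar with a feature-wise vector turns this matrix into a tensor, which is where the memory and computation cost arises.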
Development of an Advanced Force Field for Water using Variational Energy Decomposition Analysis
Because force fields model intermolecular interactions piecewise, they can be
difficult to parameterize: they are fit to data such as total energies, which
connect only indirectly to their separable functional
forms. Furthermore, by neglecting certain types of molecular interactions such
as charge penetration and charge transfer, most classical force fields must
rely on, but do not always demonstrate, how cancellation of errors occurs among
the remaining molecular interactions accounted for such as exchange repulsion,
electrostatics, and polarization. In this work we present the first generation
of the (many-body) MB-UCB force field that explicitly accounts for the
decomposed molecular interactions commensurate with a variational energy
decomposition analysis, including charge transfer, with force field design
choices that reduce the computational expense of the MB-UCB potential while
remaining accurate. We optimize parameters using only single water molecule and
water cluster data up through pentamers, with no fitting to condensed phase
data, and we demonstrate that high accuracy is maintained when the force field
is subsequently validated against conformational energies of larger water
cluster data sets, radial distribution functions of the liquid phase, and the
temperature dependence of thermodynamic and transport water properties. We
conclude that MB-UCB is comparable in performance to MB-Pol, but is less
expensive and more transferable by eliminating the need to represent
short-ranged interactions through large parameter fits to high order
polynomials.