ACCDIST: A Metric for comparing speakers' accents
This paper introduces a new metric for the quantitative assessment of the similarity of speakers' accents. The ACCDIST metric is based on the correlation of inter-segment distance tables across speakers or groups. Basing the metric on segment similarity within a speaker ensures that it is sensitive to the speaker's pronunciation system rather than to his or her voice characteristics. The metric is shown to have an error rate of only 11% on the accent classification of speakers into 14 English regional accents of the British Isles, half the error rate of a metric based on spectral information directly. The metric may also be useful for cluster analysis of accent groups.
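The idea of correlating inter-segment distance tables can be illustrated with a minimal sketch. This is not the authors' implementation: the two-dimensional feature vectors, the Euclidean distance, the word labels, and the function names (`distance_table`, `pearson`, `accdist_similarity`) are all assumptions chosen for illustration. It only shows why a within-speaker distance table is insensitive to a uniform shift in voice characteristics.

```python
from itertools import combinations
from math import dist, sqrt


def distance_table(segments):
    """Flattened table of pairwise distances between one speaker's segments.

    `segments` maps a segment label to a feature vector (hypothetical values).
    Labels are sorted so that tables from different speakers line up.
    """
    keys = sorted(segments)
    return [dist(segments[a], segments[b]) for a, b in combinations(keys, 2)]


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / sqrt(vx * vy)


def accdist_similarity(speaker_a, speaker_b):
    """Correlate the two speakers' inter-segment distance tables."""
    return pearson(distance_table(speaker_a), distance_table(speaker_b))


# Toy data (invented): speaker_b has the same vowel system as speaker_a,
# shifted by a constant offset (a different voice); speaker_c merges vowels.
speaker_a = {"bath": (700, 1200), "trap": (750, 1700),
             "strut": (650, 1300), "foot": (400, 1000)}
speaker_b = {k: (f1 + 50, f2 + 100) for k, (f1, f2) in speaker_a.items()}
speaker_c = {"bath": (750, 1700), "trap": (760, 1690),
             "strut": (420, 1010), "foot": (400, 1000)}

same_system = accdist_similarity(speaker_a, speaker_b)
different_system = accdist_similarity(speaker_a, speaker_c)
```

In this toy setting the shifted speaker correlates almost perfectly with the original, while the speaker with merged vowels does not, which mirrors the abstract's claim that the metric tracks the pronunciation system and not the voice.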
Regional averaging and scaling in relativistic cosmology
Averaged inhomogeneous cosmologies lie at the forefront of interest, since
cosmological parameters like the rate of expansion or the mass density are to
be considered as volume-averaged quantities and only these can be compared with
observations. For this reason the relevant parameters are intrinsically
scale-dependent and one wishes to control this dependence without restricting
the cosmological model by unphysical assumptions. In the latter respect we
contrast our way to approach the averaging problem in relativistic cosmology
with shortcomings of averaged Newtonian models. Explicitly, we investigate the
scale-dependence of Eulerian volume averages of scalar functions on Riemannian
three-manifolds. We propose a complementary view of a Lagrangian smoothing of
(tensorial) variables as opposed to their Eulerian averaging on spatial
domains. This program is realized with the help of a global Ricci deformation
flow for the metric. We explain rigorously the origin of the Ricci flow which,
on heuristic grounds, has already been suggested as a possible candidate for
smoothing the initial data set for cosmological spacetimes. The smoothing of
geometry implies a renormalization of averaged spatial variables. We discuss
the results in terms of effective cosmological parameters that would be
assigned to the smoothed cosmological spacetime.
Comment: LaTeX, IOP style, 48 pages, 11 figures; matches published version in
C.Q.
Learnable PINs: Cross-Modal Embeddings for Person Identity
We propose and investigate an identity sensitive joint embedding of face and
voice. Such an embedding enables cross-modal retrieval from voice to face and
from face to voice. We make the following four contributions: first, we show
that the embedding can be learnt from videos of talking faces, without
requiring any identity labels, using a form of cross-modal self-supervision;
second, we develop a curriculum learning schedule for hard negative mining
targeted to this task, that is essential for learning to proceed successfully;
third, we demonstrate and evaluate cross-modal retrieval for identities unseen
and unheard during training over a number of scenarios and establish a
benchmark for this novel task; finally, we show an application of using the
joint embedding for automatically retrieving and labelling characters in TV
dramas.
Comment: To appear in ECCV 201
The metaphysics of Machian frame-dragging
The paper investigates the kind of dependence relation that best portrays Machian frame-dragging in general relativity. The question is tricky because frame-dragging relates local inertial frames to distant distributions of matter in a time-independent way, thus establishing some sort of non-local link between the two. For this reason, a plain causal interpretation of frame-dragging faces huge challenges. The paper will shed light on the issue by using a generalized structural equation model analysis in terms of manipulationist counterfactuals recently applied in the context of metaphysical enquiry by Schaffer (2016) and Wilson (2017). The verdict of the analysis will be that frame-dragging is best understood in terms of a novel type of dependence relation that is half-way between causation and grounding.
Alternative conventions and geometry for special relativity
This paper argues that Einstein's conventionalist definition of time is sufficient for, but not necessary to the geometric modelling of Special Relativity. A different convention allows that any time interval t, can be measured by dc, the distance travelled from an origin by the spherical wave-front of a light pulse, c. This means that the relationships represented by the hyperbolic geometry of Minkowski can also be represented by circular function geometry (CFG), where the spherical surface of c provides both a fourth set t, of frame-dependent co-ordinate points and a parameter s, for measuring intervals that are invariant between reference frames. However, sine values under the circle range from 1-0, rather than 1-∞. This does not allow that for a reference frame velocity → c, any interval length → ∞. Furthermore, since CFG does not subdivide space-time into past and future zones, it excludes the possibility of backwards time travel for signal velocities > c.
I hear you eat and speak: automatic recognition of eating condition and food type, use-cases, and impact on ASR performance
We propose a new recognition task in the area of computational paralinguistics: automatic recognition of eating conditions in speech, i.e., whether people are eating while speaking, and what they are eating. To this end, we introduce the audio-visual iHEARu-EAT database featuring 1.6k utterances of 30 subjects (mean age: 26.1 years, standard deviation: 2.66 years, gender balanced, German speakers), six types of food (Apple, Nectarine, Banana, Haribo Smurfs, Biscuit, and Crisps), and read as well as spontaneous speech, which is made publicly available for research purposes. We start with demonstrating that for automatic speech recognition (ASR), it pays off to know whether speakers are eating or not. We also propose automatic classification both by brute-forcing of low-level acoustic features as well as higher-level features related to intelligibility, obtained from an Automatic Speech Recogniser. Prediction of the eating condition was performed with a Support Vector Machine (SVM) classifier employed in a leave-one-speaker-out evaluation framework. Results show that the binary prediction of eating condition (i.e., eating or not eating) can be easily solved independently of the speaking condition; the obtained average recalls are all above 90%. Low-level acoustic features provide the best performance on spontaneous speech, which reaches up to 62.3% average recall for multi-way classification of the eating condition, i.e., discriminating the six types of food, as well as not eating. The early fusion of features related to intelligibility with the brute-forced acoustic feature set improves the performance on read speech, reaching a 66.4% average recall for the multi-way classification task. Analysing features and classifier errors leads to a suitable ordinal scale for eating conditions, on which automatic regression can be performed with up to 56.2% determination coefficient.
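The leave-one-speaker-out protocol and the average-recall score mentioned above can be sketched as follows. This is an illustrative sketch only: a simple nearest-centroid rule stands in for the SVM used in the paper, "average recall" is assumed here to mean per-class recall averaged over classes, and the single-number feature, the sample values, and all function names are invented for the example.

```python
from collections import defaultdict


def leave_one_speaker_out(samples):
    """Yield (speaker, train, test), holding each speaker out exactly once."""
    for spk in sorted({s["speaker"] for s in samples}):
        train = [s for s in samples if s["speaker"] != spk]
        test = [s for s in samples if s["speaker"] == spk]
        yield spk, train, test


def fit_centroids(train):
    """Stand-in classifier (not an SVM): mean feature value per label."""
    sums, counts = defaultdict(float), defaultdict(int)
    for s in train:
        sums[s["label"]] += s["feature"]
        counts[s["label"]] += 1
    return {label: sums[label] / counts[label] for label in sums}


def predict(centroids, feature):
    """Assign the label whose centroid is closest to the feature."""
    return min(centroids, key=lambda label: abs(centroids[label] - feature))


def average_recall(pairs):
    """Per-class recall averaged over classes, from (true, predicted) pairs."""
    hits, totals = defaultdict(int), defaultdict(int)
    for true, pred in pairs:
        totals[true] += 1
        hits[true] += int(true == pred)
    return sum(hits[c] / totals[c] for c in totals) / len(totals)


# Invented toy data: one scalar feature per utterance.
samples = [
    {"speaker": "s1", "label": "eating", "feature": 0.9},
    {"speaker": "s1", "label": "not_eating", "feature": 0.1},
    {"speaker": "s2", "label": "eating", "feature": 0.8},
    {"speaker": "s2", "label": "not_eating", "feature": 0.2},
    {"speaker": "s3", "label": "eating", "feature": 0.7},
    {"speaker": "s3", "label": "not_eating", "feature": 0.6},
]

results = []
for _, train, test in leave_one_speaker_out(samples):
    centroids = fit_centroids(train)
    results += [(s["label"], predict(centroids, s["feature"])) for s in test]

score = average_recall(results)
```

Holding out whole speakers keeps a speaker's voice from appearing in both training and test data, so the score reflects generalisation to unseen speakers.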
Inhomogeneity and the foundations of concordance cosmology
The apparent accelerating expansion of the Universe is forcing us to examine
the foundational aspects of the standard model of cosmology -- in particular,
the fact that dark energy is a direct consequence of the homogeneity
assumption. We discuss the foundations of the assumption of spatial
homogeneity, in the case when the Copernican Principle is adopted. We present
results that show how (almost-) homogeneity follows from (almost-) isotropy of
various observables. The analysis requires the fully nonlinear field equations
-- i.e., it is not possible to use second- or higher-order perturbation theory,
since one cannot assume a homogeneous and isotropic background. Then we
consider what happens if the Copernican Principle is abandoned in our Hubble
volume. The simplest models are inhomogeneous but spherically symmetric
universes which do not require dark energy to fit the distance modulus. Key
problems in these models are to compute the CMB anisotropies and the features
of large-scale structure. We review how to construct perturbation theory on a
non-homogeneous cosmological background, and discuss the complexities that
arise in using this to determine the growth of large-scale structure.
Comment: 26 pages and 1 figure. Invited review article for the CQG special
issue on nonlinear cosmological perturbations. v2 has additional refs and
comments, minor errors corrected, version in CQ