    ACCDIST: A Metric for comparing speakers' accents

    This paper introduces a new metric for the quantitative assessment of the similarity of speakers' accents. The ACCDIST metric is based on the correlation of inter-segment distance tables across speakers or groups. Basing the metric on segment similarity within a speaker ensures that it is sensitive to the speaker's pronunciation system rather than to his or her voice characteristics. The metric is shown to have an error rate of only 11% on the accent classification of speakers into 14 English regional accents of the British Isles, half the error rate of a metric based on spectral information directly. The metric may also be useful for cluster analysis of accent groups.
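
The idea of correlating inter-segment distance tables can be illustrated with a minimal sketch. The abstract does not give the features, distance measure, or normalisation used, so everything below is an assumption made for illustration: segments are toy 2-D vectors, distances are Euclidean, the correlation is Pearson's, and the names `distance_table`, `pearson`, and `accdist_similarity` are hypothetical.

```python
from itertools import combinations
from math import dist, sqrt


def distance_table(segments):
    """Flattened table of distances between every pair of segments
    produced by ONE speaker (assumed: Euclidean distance)."""
    keys = sorted(segments)
    return [dist(segments[a], segments[b]) for a, b in combinations(keys, 2)]


def pearson(x, y):
    """Pearson correlation of two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / sqrt(vx * vy)


def accdist_similarity(speaker_1, speaker_2):
    """Correlate the two speakers' within-speaker distance tables,
    restricted to the segments both speakers produced."""
    shared = set(speaker_1) & set(speaker_2)
    t1 = distance_table({k: speaker_1[k] for k in shared})
    t2 = distance_table({k: speaker_2[k] for k in shared})
    return pearson(t1, t2)


# Toy data (invented for this sketch): one 2-D vector per vowel.
speaker_a = {"i": (300, 2300), "e": (500, 1900), "a": (750, 1300), "u": (320, 800)}
# Same pronunciation system, different voice: every value scaled by 1.15.
speaker_b = {k: (1.15 * f1, 1.15 * f2) for k, (f1, f2) in speaker_a.items()}
# Different pronunciation system: "a" is realised close to "e".
speaker_c = dict(speaker_a, a=(520, 1850))

sim_ab = accdist_similarity(speaker_a, speaker_b)
sim_ac = accdist_similarity(speaker_a, speaker_c)
print(f"same system, different voice: {sim_ab:.3f}")
print(f"different system:             {sim_ac:.3f}")
```

Because each table is built from distances within a single speaker, a uniform change of voice (here a simple scaling) leaves the correlation essentially unchanged, whereas rearranging the segments relative to one another lowers it.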

    Regional averaging and scaling in relativistic cosmology

    Averaged inhomogeneous cosmologies lie at the forefront of interest, since cosmological parameters like the rate of expansion or the mass density are to be considered as volume-averaged quantities and only these can be compared with observations. For this reason the relevant parameters are intrinsically scale-dependent and one wishes to control this dependence without restricting the cosmological model by unphysical assumptions. In the latter respect we contrast our approach to the averaging problem in relativistic cosmology with the shortcomings of averaged Newtonian models. Explicitly, we investigate the scale-dependence of Eulerian volume averages of scalar functions on Riemannian three-manifolds. We propose a complementary view of a Lagrangian smoothing of (tensorial) variables as opposed to their Eulerian averaging on spatial domains. This program is realized with the help of a global Ricci deformation flow for the metric. We explain rigorously the origin of the Ricci flow which, on heuristic grounds, has already been suggested as a possible candidate for smoothing the initial data set for cosmological spacetimes. The smoothing of geometry implies a renormalization of averaged spatial variables. We discuss the results in terms of effective cosmological parameters that would be assigned to the smoothed cosmological spacetime. Comment: LaTeX, IOPstyle, 48 pages, 11 figures; matches published version in C.Q.

    Learnable PINs: Cross-Modal Embeddings for Person Identity

    We propose and investigate an identity-sensitive joint embedding of face and voice. Such an embedding enables cross-modal retrieval from voice to face and from face to voice. We make the following four contributions: first, we show that the embedding can be learnt from videos of talking faces, without requiring any identity labels, using a form of cross-modal self-supervision; second, we develop a curriculum learning schedule for hard negative mining targeted to this task, which is essential for learning to proceed successfully; third, we demonstrate and evaluate cross-modal retrieval for identities unseen and unheard during training over a number of scenarios and establish a benchmark for this novel task; finally, we show an application of using the joint embedding for automatically retrieving and labelling characters in TV dramas. Comment: To appear in ECCV 201

    The metaphysics of Machian frame-dragging

    The paper investigates the kind of dependence relation that best portrays Machian frame-dragging in general relativity. The question is tricky because frame-dragging relates local inertial frames to distant distributions of matter in a time-independent way, thus establishing some sort of non-local link between the two. For this reason, a plain causal interpretation of frame-dragging faces huge challenges. The paper will shed light on the issue by using a generalized structural equation model analysis in terms of manipulationist counterfactuals recently applied in the context of metaphysical enquiry by Schaffer (2016) and Wilson (2017). The verdict of the analysis will be that frame-dragging is best understood in terms of a novel type of dependence relation that is halfway between causation and grounding.

    I hear you eat and speak: automatic recognition of eating condition and food type, use-cases, and impact on ASR performance

    We propose a new recognition task in the area of computational paralinguistics: automatic recognition of eating conditions in speech, i.e., whether people are eating while speaking, and what they are eating. To this end, we introduce the audio-visual iHEARu-EAT database featuring 1.6k utterances of 30 subjects (mean age: 26.1 years, standard deviation: 2.66 years, gender balanced, German speakers), six types of food (Apple, Nectarine, Banana, Haribo Smurfs, Biscuit, and Crisps), and read as well as spontaneous speech, which is made publicly available for research purposes. We start by demonstrating that for automatic speech recognition (ASR), it pays off to know whether speakers are eating or not. We also propose automatic classification both by brute-forcing of low-level acoustic features and by higher-level features related to intelligibility, obtained from an Automatic Speech Recogniser. Prediction of the eating condition was performed with a Support Vector Machine (SVM) classifier employed in a leave-one-speaker-out evaluation framework. Results show that the binary prediction of eating condition (i.e., eating or not eating) can be easily solved independently of the speaking condition; the obtained average recalls are all above 90%. Low-level acoustic features provide the best performance on spontaneous speech, which reaches up to 62.3% average recall for multi-way classification of the eating condition, i.e., discriminating the six types of food, as well as not eating. The early fusion of features related to intelligibility with the brute-forced acoustic feature set improves the performance on read speech, reaching a 66.4% average recall for the multi-way classification task. Analysing features and classifier errors leads to a suitable ordinal scale for eating conditions, on which automatic regression can be performed with up to 56.2% determination coefficient.
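
The evaluation protocol named in this abstract, leave-one-speaker-out with average recall, can be sketched as follows. This is a rough illustration under stated assumptions, not the authors' pipeline: "average recall" is taken to mean the unweighted mean of per-class recalls, a nearest-centroid rule on a single toy feature stands in for the SVM, and all data, labels, and function names are invented for the example.

```python
from collections import defaultdict


def average_recall(y_true, y_pred):
    """Unweighted mean of per-class recall (assumed meaning of 'average recall')."""
    hits, totals = defaultdict(int), defaultdict(int)
    for t, p in zip(y_true, y_pred):
        totals[t] += 1
        hits[t] += int(t == p)
    return sum(hits[c] / totals[c] for c in totals) / len(totals)


def leave_one_speaker_out(speakers):
    """Yield (train_indices, test_indices), holding out one speaker per fold."""
    for held_out in sorted(set(speakers)):
        train = [i for i, s in enumerate(speakers) if s != held_out]
        test = [i for i, s in enumerate(speakers) if s == held_out]
        yield train, test


def nearest_centroid_predict(train_x, train_y, test_x):
    """Stand-in classifier (NOT an SVM): pick the class with the closest mean."""
    sums, counts = defaultdict(float), defaultdict(int)
    for x, y in zip(train_x, train_y):
        sums[y] += x
        counts[y] += 1
    centroids = {y: sums[y] / counts[y] for y in sums}
    return [min(centroids, key=lambda c: abs(x - centroids[c])) for x in test_x]


# Toy data (invented): one scalar feature per utterance, three speakers.
features = [0.10, 0.90, 0.20, 0.80, 0.15, 0.85]
labels = ["not_eating", "eating"] * 3
speakers = ["s1", "s1", "s2", "s2", "s3", "s3"]

y_true, y_pred = [], []
for train, test in leave_one_speaker_out(speakers):
    preds = nearest_centroid_predict(
        [features[i] for i in train],
        [labels[i] for i in train],
        [features[i] for i in test],
    )
    y_true += [labels[i] for i in test]
    y_pred += preds

uar = average_recall(y_true, y_pred)
print(f"average recall over all folds: {uar:.2f}")
```

Holding out every utterance of one speaker per fold keeps the classifier from being tested on a voice it has already seen, and averaging recall per class keeps a frequent class from dominating the score.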

    Inhomogeneity and the foundations of concordance cosmology

    The apparent accelerating expansion of the Universe is forcing us to examine the foundational aspects of the standard model of cosmology -- in particular, the fact that dark energy is a direct consequence of the homogeneity assumption. We discuss the foundations of the assumption of spatial homogeneity, in the case when the Copernican Principle is adopted. We present results that show how (almost-) homogeneity follows from (almost-) isotropy of various observables. The analysis requires the fully nonlinear field equations -- i.e., it is not possible to use second- or higher-order perturbation theory, since one cannot assume a homogeneous and isotropic background. Then we consider what happens if the Copernican Principle is abandoned in our Hubble volume. The simplest models are inhomogeneous but spherically symmetric universes which do not require dark energy to fit the distance modulus. Key problems in these models are to compute the CMB anisotropies and the features of large-scale structure. We review how to construct perturbation theory on a non-homogeneous cosmological background, and discuss the complexities that arise in using this to determine the growth of large-scale structure. Comment: 26 pages and 1 figure. Invited review article for the CQG special issue on nonlinear cosmological perturbations. v2 has additional refs and comments, minor errors corrected, version in CQ