156 research outputs found

    Enumeration of m-ary cacti

    The purpose of this paper is to enumerate various classes of cyclically colored m-gonal plane cacti, called m-ary cacti. This combinatorial problem is motivated by the topological classification of complex polynomials having at most m critical values, studied by Zvonkin and others. We obtain explicit formulae for both labelled and unlabelled m-ary cacti, according to i) the number of polygons, ii) the vertex-color distribution, iii) the vertex-degree distribution of each color. We also enumerate m-ary cacti according to the order of their automorphism group. Using a generalization of Otter's formula, we express the species of m-ary cacti in terms of rooted and of pointed cacti. A variant of the m-dimensional Lagrange inversion is then used to enumerate these structures. The method of Liskovets for the enumeration of unrooted planar maps can also be adapted to m-ary cacti. Comment: LaTeX2e, 28 pages, 9 figures (eps), 3 tables

    Duration mismatch compensation using four-covariance model and deep neural network for speaker verification

    Duration mismatch between enrollment and test utterances remains a major concern for the reliability of real-life speaker recognition applications. Two approaches are proposed here to deal with this case when using the i-vector representation. The first is an adaptation of Gaussian Probabilistic Linear Discriminant Analysis (PLDA) modeling, which can be extended to the case of any shift between i-vectors drawn from two distinct distributions. The second attempts to map i-vectors of truncated segments of an utterance to the i-vector of the full segment, using deep neural networks (DNN). Our results show that both new approaches outperform the standard PLDA by about 10% relative. These back-end methods could complement those that quantify the i-vector uncertainty during its extraction process, in the case of a duration gap.

    Typicality extraction in a Speaker Binary Keys model

    In the field of speaker recognition, the recently proposed notion of "Speaker Binary Key" provides a representation of each acoustic frame in a discriminant binary space. This approach relies on a unique acoustic model composed of a large set of speaker-specific local likelihood peaks (called specificities). The model proposes a spatial coverage where each frame is characterized in terms of neighborhood. The most frequent specificities, picked to represent the whole utterance, generate a binary key vector. The flexibility of this modeling makes it possible to capture non-parametric behaviors. In this paper, we introduce a concept of "typicality" between binary keys, with a discriminant goal. We describe an algorithm able to extract such typicalities, which involves a singular value decomposition in a binary space. The theoretical aspects of this decomposition as well as its potential in terms of future developments are presented. All the propositions are also experimentally validated using the NIST SRE 2008 framework.

    Constrained discriminative speaker verification specific to normalized i-vectors

    This paper focuses on discriminative training (DT) applied to i-vectors after Gaussian probabilistic linear discriminant analysis (PLDA). Although DT has been successfully used with non-normalized vectors, it struggles to improve speaker detection when i-vectors have first been normalized, even though normalization has proven to achieve the best performance in speaker verification. We propose an additional normalization procedure which limits the number of coefficients to train discriminatively, with a minimal loss of accuracy. Adaptations of logistic-regression-based DT to this new configuration are proposed, then we introduce a discriminative classifier for speaker verification which is novel in the field.

    Exploring some limits of Gaussian PLDA modeling for i-vector distributions

    Gaussian-PLDA (G-PLDA) modeling for i-vector based speaker verification has proven to be competitive with heavy-tailed PLDA (HT-PLDA) based on Student's t-distribution, while the latter is much more computationally expensive. However, its results are achieved using length normalization, which projects i-vectors onto the non-linear and finite surface of a hypersphere. This paper investigates the limits of linear and Gaussian G-PLDA modeling when the distribution of data is spherical. In particular, assumptions of homoscedasticity are questionable: the model assumes that the within-speaker variability can be estimated by a unique and linear parameter. A non-probabilistic approach is proposed, competitive with the state of the art, which reveals some limits of the Gaussian modeling in terms of goodness of fit. We carry out a residual analysis, which reveals a relation between the dispersion of a speaker class and its location and thus shows that homoscedasticity assumptions are not fulfilled.

    Spoken Language Understanding in a Latent Topic-based Subspace

    Performance of spoken language understanding applications declines when spoken documents are automatically transcribed in noisy conditions, due to high Word Error Rates (WER). To improve robustness to transcription errors, recent solutions propose to map these automatic transcriptions into a latent space. These studies have compared classical topic-based representations such as Latent Dirichlet Allocation (LDA), supervised LDA and author-topic (AT) models. An original compact representation, called c-vector, has recently been introduced to work around the tricky choice of the number of latent topics in these topic-based representations. Moreover, c-vectors increase the robustness of document classification to transcription errors by compacting different LDA representations of the same speech document into a reduced space, which compensates for most of the noise in the document representation. The main drawback of this method is the number of sub-tasks needed to build the c-vector space. This paper proposes both to improve this compact representation (c-vector) of spoken documents and to reduce the number of sub-tasks needed, using an original framework in a robust low-dimensional space of features from a set of AT models called "Latent Topic-based Sub-space" (LTS). In comparison to LDA, the AT model considers not only the dialogue content (words), but also the class related to the document. Experiments are conducted on the DECODA corpus containing speech conversations from the call center of the RATP Paris transportation company. Results show that the original LTS representation outperforms the best previous compact representation (c-vector), with a substantial gain of more than 2.5% in terms of correctly labeled conversations.

    Occupation du premier âge du Fer sur le site de La Condamine VII à Vauvert (Gard)

    A preventive excavation conducted in 2012 in the La Condamine district of Vauvert revealed several pits from the end of the Early Iron Age. The largest, which apparently corresponds to a material extraction pit, was reused as a dump and yielded a rich assemblage of finds. The study of the fill and its context suggests that the fill comes from the burning of one or more nearby houses, which were built of earth and probably roofed with heather. The finds include several Attic cups and many grey monochrome vessels, indicating that the inhabitants had broad access to Mediterranean and regional trade, evidently favored by the proximity of the lagoon trading post of Le Cailar. Also remarkable is the quality of the bronze objects, fibulae and pins, some of which are original. It is not known, however, whether the dwellings in question were isolated within a rural estate or whether a real village existed nearby.

    I4U Submission to NIST SRE 2018: Leveraging from a Decade of Shared Experiences

    The I4U consortium was established to facilitate a joint entry to NIST speaker recognition evaluations (SRE). The latest edition of such a joint submission was in SRE 2018, in which the I4U submission was among the best-performing systems. SRE'18 also marks the 10-year anniversary of the I4U consortium's participation in the NIST SRE series of evaluations. The primary objective of the current paper is to summarize the results and lessons learned based on the twelve sub-systems and their fusion submitted to SRE'18. It is also our intention to present a shared view on the advancements, progress, and major paradigm shifts that we have witnessed as an SRE participant in the past decade from SRE'08 to SRE'18. In this regard, we have seen, among others, a paradigm shift from supervector representation to deep speaker embedding, and a switch of research challenge from channel compensation to domain adaptation. Comment: 5 pages

    I4U System Description for NIST SRE'20 CTS Challenge

    This manuscript describes the I4U submission to the 2020 NIST Speaker Recognition Evaluation (SRE'20) Conversational Telephone Speech (CTS) Challenge. The I4U submission resulted from active collaboration among researchers across eight research teams - I²R (Singapore), UEF (Finland), VALPT (Italy, Spain), NEC (Japan), THUEE (China), LIA (France), NUS (Singapore), INRIA (France) and TJU (China). The submission was based on the fusion of top-performing sub-systems and sub-fusion systems contributed by individual teams. Effort was spent on the use of common development and validation sets, a shared submission schedule and milestones, and minimizing inconsistency in trial list and score file format across sites. Comment: SRE 2021, NIST Speaker Recognition Evaluation Workshop, CTS Speaker Recognition Challenge, 14-12 December 202
