5,640 research outputs found

CLIWOC final report

Author: García-Herrera R.
Jones P. D.
Koek F. B.
Können G. P.
Prieto M. R.
Wheeler D. A.
Publication date: 01/01/2003

Electronic Publication Information Center

The Production of Speech Corpora

Author: Baumann Angela
Draxler Christoph
Ellbogen Tania
Schiel Florian
Steffen Alexander
Publication date: 21/03/2012

Open Access LMU

Decisions regarding infringements of sponsoring provisions

Author: Lievens Eva
Publication date: 01/01/2015

status: published

Lirias

Ghent University Academic Bibliography

Five broadcasters warned for non-compliance with rules on commercial communication on sugary confectionery

Author: Lievens Eva
Publication date: 01/01/2015

status: published

Lirias

Ghent University Academic Bibliography

Frame-level features conveying phonetic information for language and speaker recognition

Author: Díez Sánchez Mireia
Publication date: 04/09/2015

150 p. This thesis, developed in the Software Technologies Working Group of the Department of Electricity and Electronics of the University of the Basque Country, focuses on the research field of spoken language and speaker recognition technologies. More specifically, the research studies the design of a set of features conveying spectral acoustic and phonotactic information, searches for the optimal feature extraction parameters, and analyses the integration and usage of the features in language recognition systems, as well as the complementarity of these approaches with regard to state-of-the-art systems. The study reveals that systems trained on the proposed set of features, denoted as Phone Log-Likelihood Ratios (PLLRs), are highly competitive, outperforming other state-of-the-art systems in several benchmarks. Moreover, PLLR-based systems also provide complementary information with regard to other phonotactic and acoustic approaches, which makes them suitable in fusions to improve the overall performance of spoken language recognition systems. The usage of these features is also studied in speaker recognition tasks. In this context, the results attained by the approaches based on PLLR features are not as remarkable as those of systems based on standard acoustic features, but they still provide complementary information that can be used to enhance the overall performance of the speaker recognition systems.

Archivo Digital para la Docencia y la Investigación

The Domain Mismatch Problem in the Broadcast Speaker Attribution Task

Author: Lleida Eduardo
Miguel Antonio
Ortega Alfonso
Viñals Ignacio
Publication venue: MDPI AG
Publication date: 01/01/2021

The demand for high-quality metadata for the available multimedia content requires the development of new techniques able to correctly identify more and more information, including the speaker information. The task known as speaker attribution aims at identifying all or part of the speakers in the audio under analysis. In this work, we carry out a study of the speaker attribution problem in the broadcast domain. Through our experiments, we illustrate the positive impact of diarization on the final performance. Additionally, we show the influence of the variability present in broadcast data, depicting the broadcast domain as a collection of subdomains with particular characteristics. Taking these two factors into account, we also propose alternative approximations robust against domain mismatch. These approximations include a semisupervised alternative as well as a totally unsupervised new hybrid solution fusing diarization and speaker assignment. Thanks to these two approximations, our performance improves by around 50% in relative terms. The analysis has been carried out using the corpus for the Albayzín 2020 challenge, a diarization and speaker attribution evaluation working with broadcast data. These data, provided by Radio Televisión Española (RTVE), the Spanish public radio and TV corporation, include multiple shows and genres to analyze the impact of new speech technologies in real-world scenarios.

Multidisciplinary Digital Publishing Institute

Repositorio Universidad de Zaragoza

Directory of Open Access Journals