7 research outputs found

The MGB Challenge: Evaluating Multi-genre Broadcast Media Recognition

Author: Bell P.
Gales M.
Hain T.
Kilgour J.
Lanchantin P.
Liu A.
McParland A.
Renals S.
Saz O.
Wester M.
Woodland P.
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2015

This paper describes the Multi-Genre Broadcast (MGB) Challenge at ASRU 2015, an evaluation focused on speech recognition, speaker diarization, and "lightly supervised" alignment of BBC TV recordings. The challenge training data covered the whole range of seven weeks of BBC TV output across four channels, resulting in about 1,600 hours of broadcast audio. In addition, several hundred million words of BBC subtitle text were provided for language modelling. A novel aspect of the evaluation was the exploration of speech recognition and speaker diarization in a longitudinal setting, i.e. recognition of several episodes of the same show, and speaker diarization across these episodes, linking speakers. The longitudinal tasks also offered the opportunity for systems to make use of supplied metadata including show title, genre tag, and date/time of transmission. This paper describes the task data and evaluation process used in the MGB challenge, and summarises the results obtained.

Crossref

Edinburgh Research Explorer

White Rose Research Online

Advances in Subspace-based Solutions for Diarization in the Broadcast Domain

Author: Ortega Giménez Alfonso
Viñals Bailo Ignacio
Publication venue: Universidad de Zaragoza, Prensas de la Universidad
Publication date: 01/01/2020

The motivation for this thesis is the need for robust solutions to the diarization problem. Diarization techniques must add value to the growing amount of available multimedia data by accurately discriminating the speakers present in the audio signal. Unfortunately, until recently this type of technology was viable only under restricted conditions and therefore remained far from a general solution. There are multiple reasons behind the limited performance of diarization systems. The first cause to consider is the high complexity of human speech production, in particular the physiological processes required to embed speaker-discriminative characteristics in the speech signal. This complexity makes the inverse process, estimating those characteristics from the audio, an inefficient task for current state-of-the-art techniques. Consequently, approximations must be considered instead. Modelling efforts have produced increasingly elaborate models, although they do not seek the ultimate physiological explanation of the speech signal. Instead, these models learn relationships between acoustic signals from a large set of training data. The development of approximate models in turn gives rise to a second reason, domain variability. Because the relationships are learned from a specific training set, any domain change that alters the acoustic conditions with respect to the training data affects the assumed relationships and can cause consistent failures in the systems. Our contribution to diarization technologies has focused on the broadcast environment. This domain is still a complex environment for diarization systems, in which no simplification of the task can be assumed. Therefore, efficient modelling of the audio must be developed to extract speaker information and to infer the corresponding labelling. In addition, the presence of multiple acoustic conditions, owing to the different programmes and/or genres in the domain, requires techniques capable of adapting the knowledge acquired in a scenario where information is available to environments where that information is limited or simply unavailable. To this end, the work carried out throughout the thesis has focused on three subtasks: speaker characterization, clustering, and model adaptation. The first subtask seeks to model a fragment of audio to obtain accurate representations of the speakers involved, highlighting their discriminative properties. In this area, a study of current modelling strategies was carried out, paying particular attention to the limitations of the extracted representations and highlighting the type of errors they can produce. In addition, alternatives based on neural networks were proposed that make use of the acquired knowledge. The second task is clustering, which is responsible for developing strategies that seek the optimal labelling of speakers. The research carried out during this thesis has proposed new strategies for estimating the best speaker partition based on subspace techniques, especially PLDA. Finally, the model adaptation task seeks to transfer the knowledge obtained from a training set to alternative domains where no data are available to extract it. To this end, efforts have focused on the unsupervised extraction of speaker information from the very audio being diarized, which is subsequently used to adapt the models involved.

Repositorio Universidad de Zaragoza

CRIM and LIUM approaches for Multi-Genre Broadcast Media Transcription

Author: Boulianne Gilles
Deléglise Paul
Estève Yannick
Gupta Vishwa
Meignier Sylvain
Rousseau Anthony
Publication venue: HAL CCSD
Publication date: 01/01/2015

Central Washington University 1991/93 Undergraduate/Graduate Catalog

Author: Central Washington University
Publication venue: Central Washington University
Publication date: 01/06/1991

ScholarWorks at Central Washington University

Maritime expressions: a corpus based exploration of maritime metaphors

Author: Isserlis Simon Jonathon

This study uses a purpose-built corpus to explore the linguistic legacy of Britain’s maritime history found in the form of hundreds of specialised ‘Maritime Expressions’ (MEs), such as TAKEN ABACK, ANCHOR and ALOOF, that permeate modern English. Selecting just those expressions commencing with ‘A’, it analyses 61 MEs in detail and describes the processes by which these technical expressions, from a highly specialised occupational discourse community, have made their way into modern English. The Maritime Text Corpus (MTC) comprises 8.8 million words, encompassing a range of text types and registers, selected to provide a cross-section of ‘maritime’ writing. It is analysed using WordSmith analytical software (Scott, 2010), with the 100 million-word British National Corpus (BNC) as a reference corpus. Using the MTC, a list of keywords of specific salience within the maritime discourse has been compiled and, using frequency data, concordances and collocations, these MEs are described in detail and their use and form in the MTC and the BNC are compared. The study examines the transformation from ME to figurative use in the general discourse, in terms of form and metaphoricity. MEs are classified according to their metaphorical strength and their transference from maritime usage into new registers and domains such as those of business, politics, sports and reportage. A revised model of metaphoricity is developed and a new category of figurative expression, the ‘resonator’, is proposed. Additionally, developing the work of Lakoff and Johnson, Kövecses and others on Conceptual Metaphor Theory (CMT), a number of Maritime Conceptual Metaphors are identified and their cultural significance is discussed.

Aston Publications Explorer

Medicinal and poisonous plants 2

Author: Bunyapraphatsara N.
Valkenburg J.L.C.H., van
Publication venue: Backhuys Publishers

Wageningen University & Research Publications

Nostratic Dictionary

Author: Dolgopolsky Aharon
Publication date: 07/05/2008

A revised edition can be found at http://www.dspace.cam.ac.uk/handle/1810/244080. Aharon Dolgopolsky is the leading authority on the Nostratic macrofamily. His 'Nostratic Dictionary' presented here is, of course, something very much more than a dictionary. It is the most thorough and extensive demonstration and documentation so far of what may be termed the Nostratic hypothesis: that several of the world's best-known language families are related in their origin, their grammar and their lexicon, and that they belong together in a larger unit, of earlier origin, the Nostratic macrofamily. It should at once be noted that several elements of this enterprise are controversial. For while the Nostratic hypothesis has many supporters, it has been criticized on rather fundamental grounds by a number of distinguished linguists. The matter was reviewed some years ago in a symposium held at the McDonald Institute, and positions remain very much polarized. It was a result of that meeting that the decision was taken to invite Aharon Dolgopolsky to publish his Dictionary, a much more substantial treatise than any work hitherto undertaken on the subject, at the McDonald Institute. For it became clear that the diversities of view expressed at that symposium were not likely to be resolved by further polemical exchanges. Instead, a substantial body of data was required, whose examination and evaluation could subsequently lead to more mature judgments. Those data are presented here, and that more mature evaluation can now proceed. McDonald Institute for Archaeological Research; Alfred P. Sloan Foundation

Apollo (Cambridge)