471 research outputs found

BUCEADOR, a multi-language search engine for digital libraries

Author: Adell Mercado Jordi
Bonafonte Cávez Antonio
Cardenal Antonio
Moreno Bilbao M. Asunción
Navas Eva
Rodríguez Banga Eduardo
Rodríguez Fonollosa José Adrián
Ruiz Costa-Jussà Marta
Publication date: 01/01/2012

This paper presents a web-based multimedia search engine built within the Buceador (www.buceador.org) research project. A proof-of-concept tool has been implemented that retrieves information from a digital library of multimedia documents in the four official languages of Spain (Spanish, Basque, Catalan and Galician). The retrieved documents are presented in the user's language (one of these four languages or English) after translation and dubbing. The paper presents the tool's functionality, the architecture and the digital library, and provides some information about the technology involved in the fields of automatic speech recognition, statistical machine translation, text-to-speech synthesis and information retrieval. Each technology has been adapted to the purposes of the presented tool and to interact with the rest of the technologies involved. Peer reviewed. Postprint (published version).

UPCommons. Portal del coneixement obert de la UPC

Speech technologies for the audiovisual and multimedia interaction environments

Author: Alvarez Muniain Aitor
Publication date: 22/07/2016

361 p

Archivo Digital para la Docencia y la Investigación

Accesibilidad y multilingüismo: un estudio exploratorio sobre la traducción automática de descripciones de audio

Author: Matamala Anna
Ortiz-Boix Carla
Publication venue: Malaga University
Publication date: 01/01/2016

This article presents the results of an exploratory study which assesses the machine translation of audio descriptions as a possible solution to increase accessibility in multilingual environments. Accessibility is understood to encompass two different categories: sensorial accessibility (in this specific case, for the blind and visually impaired, who cannot access the visual content of audiovisual productions), and linguistic accessibility (for those who want to access this content in their own language). The article presents some thoughts on translation as a means of promoting multilingualism, on the feasibility of translating audio descriptions, and on machine translation as applied to this audiovisual translation mode, before summarising the findings of the present study and, most importantly, opening up new potential avenues for research.

Crossref

Diposit Digital de Documents de la UAB

Portal de Revistas OJS

Multilingual audio information management system based on semantic knowledge in complex environments

Author: Barroso Moreno Nora
Calvo Salomón Pilar María
Ezeiza Ramos Aitzol
Fernández Gómez de Segura Elsa
Hernández Gómez María del Carmen
López de Ipiña Peña Miren Karmele
Susperregui Aseguinolaza Unai
Publication venue: Springer Science and Business Media LLC
Publication date: 01/01/2020

This paper proposes a multilingual audio information management system based on semantic knowledge in complex environments. The complex environment is defined by limited resources (financial, material, human, and audio resources); the poor quality of the audio signal taken from an internet radio channel; the multilingual context (Spanish, French, and Basque, which is under-resourced in some areas); and the regular appearance of cross-lingual elements between the three languages. In addition, the system is constrained by the requirements of the local multilingual industrial sector. We present the first evolutionary system based on a scalable architecture that is able to fulfill these specifications with automatic adaptation based on automatic semantic speech recognition, folksonomies, automatic configuration selection, machine learning, neural computing methodologies, and collaborative networks. As a result, the initial goals have been accomplished and the usability of the final application has been tested successfully, even with non-experienced users.

This work is being funded by Grants: TEC201677791-C4 from Plan Nacional de I+D+i, Ministry of Economic Affairs and Competitiveness of Spain, and from the DomusVi Foundation Kms para recorder, the Basque Government (ELKARTEK KK-2018/00114, GEJ IT1189-19), the Government of Gipuzkoa (DG18/14, DG17/16), UPV/EHU (GIU19/090), and COST ACTION (CA18106, CA15225).

Archivo Digital para la Docencia y la Investigación

Multilingual sentiment analysis in social media.

Author: San Vicente Roncal Iñaki
Publication date: 01/01/2019

252 p.

This thesis addresses the task of analysing sentiment in messages coming from social media. The ultimate goal was to develop a Sentiment Analysis system for Basque. However, because of the socio-linguistic reality of the Basque language, a tool providing only analysis for Basque would not be enough for a real-world application. Thus, we set out to develop a multilingual system, including Basque, English, French and Spanish.

The thesis addresses the following challenges to build such a system:
- Analysing methods for creating sentiment lexicons suitable for less-resourced languages.
- Analysing social media (specifically Twitter): tweets pose several challenges for understanding and extracting opinions from such messages. Language identification and microtext normalization are addressed.
- Researching the state of the art in polarity classification, and developing a supervised classifier that is tested against well-known social media benchmarks.
- Developing a social media monitor capable of analysing sentiment with respect to specific events, products or organizations.

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

Archivo Digital para la Docencia y la Investigación

Multilingual sentiment analysis in social media.

Author: San Vicente Roncal Iñaki
Publication date: 11/03/2019

Archivo Digital para la Docencia y la Investigación

Sub-Sync: automatic synchronization of subtitles in the broadcasting of true live programs in Spanish

Author: González Carrasco Israel
López Cuadrado José Luis
Puente Luis
Ruiz Mecua María Belén
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)
Publication date: 08/05/2019

Individuals with sensory impairment (hearing or visual) encounter serious communication barriers within society and the world around them. These barriers hinder the communication process and make access to information an obstacle they must overcome on a daily basis. In this context, one of the most common complaints made by television (TV) users with sensory impairment is the lack of synchronism between audio and subtitles in some types of programs. In addition, synchronization remains one of the most significant factors in audience perception of quality in live-originated TV subtitles for the deaf and hard of hearing. This paper introduces the Sub-Sync framework, intended for use in automatic synchronization of audiovisual content and subtitles, taking advantage of well-known techniques used in symbol sequence alignment. In this particular case, the symbol sequences are the subtitles produced by the broadcaster's subtitling system and the word flow generated by an automatic speech recognition procedure. The goal of Sub-Sync is to address the lack of synchronism that occurs in subtitles produced during the broadcast of live TV programs or other programs that have some improvised parts. Furthermore, it also aims to resolve the problematic interface between synchronized and unsynchronized parts of mixed-type programs. In addition, the framework is able to synchronize the subtitles even when they do not correspond literally to the original audio and/or the audio cannot be completely transcribed by an automatic process. Sub-Sync has been successfully tested in different live broadcasts, including mixed programs, in which the synchronized parts (recorded, scripted) are interspersed with desynchronized (improvised) ones.

Universidad Carlos III de Madrid e-Archivo

Language report for Catalan (English version)

Author: Bel Nùria
Garcia Emília
Moreno Bilbao M. Asunción
Revilla Espí Eva
Vallverdú Bayés Sisco
Publication date: 01/01/2011

The central objective of the Metanet4u project is to contribute to the establishment of a pan-European digital platform that makes available language resources and services, encompassing both datasets and software tools, for speech and language processing, and supports a new generation of exchange facilities for them. Peer reviewed. Preprint.

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

UPCommons. Portal del coneixement obert de la UPC

On the Mono- and Cross-Language Detection of Text Re-Use and Plagiarism

Author: Barrón Cedeño Luis Alberto
Publication venue: Universitat Politecnica de Valencia
Publication date: 08/06/2012

Barrón Cedeño, LA. (2012). On the Mono- and Cross-Language Detection of Text Re-Use and Plagiarism [Unpublished doctoral thesis]. Universitat Politècnica de València. https://doi.org/10.4995/Thesis/10251/16012

RiuNet

Recommended from our members

Multilingual audio information management system based on semantic knowledge in complex environments

Author: Barroso Nora
Calvo Pilar M
Ezeiza Aitzol
Fernández Elsa
Hernandez Carmen
Lopez-de-Ipina Karmele
Susperregi Unai
Publication venue: Neural Computing and Applications
Publication date: 03/12/2020

This paper proposes a multilingual audio information management system based on semantic knowledge in complex environments. The complex environment is defined by limited resources (financial, material, human, and audio resources); the poor quality of the audio signal taken from an internet radio channel; the multilingual context (Spanish, French, and Basque, which is under-resourced in some areas); and the regular appearance of cross-lingual elements between the three languages. In addition, the system is constrained by the requirements of the local multilingual industrial sector. We present the first evolutionary system based on a scalable architecture that is able to fulfill these specifications with automatic adaptation based on automatic semantic speech recognition, folksonomies, automatic configuration selection, machine learning, neural computing methodologies, and collaborative networks. As a result, the initial goals have been accomplished and the usability of the final application has been tested successfully, even with non-experienced users.

Apollo (Cambridge)