6 research outputs found

    Unsupervised person naming in TV broadcasts: using written names, spoken names, or both?

    Person identification in TV broadcasts is a valuable tool for indexing this type of video, but using biometric models is not a viable option without prior knowledge of the people present in the videos. Spoken or written names can provide a list of hypothesis names. We compare the potential of these two modalities (spoken and written names) for extracting the names of the people who speak and/or appear. Spoken names offer a larger number of citation occurrences, but transcription and detection errors halve the potential of this modality. Written names benefit from the steadily improving quality of the videos and are more easily detected. Moreover, affiliating written names to speakers/faces remains simpler than for spoken names.

    Unsupervised automatic person annotation in TV programs

    The vast amount of visual data generated today creates the need for annotation tools that allow searching for and retrieving the desired information in videos. One of the most important pieces of information in a video is the identity of the people in it. In this context, annotation consists of determining who appears and when.

    Person annotation in video sequences

    In recent years, the demand for tools to automatically annotate and classify large audiovisual datasets has increased considerably. One specific task in this field applies to TV broadcast videos: determining who appears in a video sequence and when. This work builds on the ALBAYZIN evaluation series presented at IberSPEECH-RTVE 2018 in Barcelona, and the purpose of this thesis is to improve the results obtained and to compare different face detection and tracking methods. We will evaluate the performance of classic face detection techniques and of techniques based on machine learning on a closed dataset of 34 known people; the remaining characters in the audiovisual document will be labelled as "unknown". We will work with short videos and images of each known character to build his/her model and, finally, evaluate the performance of the ALBAYZIN algorithm on a 2-hour video called "La noche en 24H", whose format is similar to that of a news program. We will analyze the results, the types of errors and scenarios we encountered, and the solutions we propose for each of them where applicable. In this work we focus only on monomodal face recognition and tracking.
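    To make the closed-set setting concrete, below is a minimal sketch (not the thesis code) of labelling detected faces as one of the known identities or as "unknown" via a distance threshold against an enrolment gallery. The Haar cascade stands in for the "classic" detector; the embedding function, threshold and gallery structure are illustrative assumptions.

```python
# Minimal sketch: closed-set face labelling with an "unknown" reject option.
# Assumptions: OpenCV Haar cascade as the classic detector; the embedding
# below is a placeholder patch descriptor, not the method used in the thesis.
import cv2
import numpy as np

detector = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_frontalface_default.xml")

def embed(face_img):
    """Placeholder embedding: a normalised, resized grayscale patch."""
    patch = cv2.resize(face_img, (32, 32)).astype(np.float32).ravel()
    return patch / (np.linalg.norm(patch) + 1e-8)

def label_faces(frame_gray, gallery, threshold=0.6):
    """frame_gray: 2-D grayscale image.
    gallery: dict name -> list of embeddings built from the enrolment videos/images."""
    labels = []
    for (x, y, w, h) in detector.detectMultiScale(frame_gray, 1.1, 5):
        e = embed(frame_gray[y:y + h, x:x + w])
        # Nearest known identity by cosine distance over the gallery.
        best_name, best_dist = "unknown", np.inf
        for name, embs in gallery.items():
            d = min(1.0 - float(e @ g) for g in embs)
            if d < best_dist:
                best_name, best_dist = name, d
        labels.append((x, y, w, h, best_name if best_dist < threshold else "unknown"))
    return labels
```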

    Fusion of Speech, Faces and Text for Person Identification in TV Broadcast

    Poster session: WS21 - Workshop on Information Fusion in Computer Vision for Concept Recognition. The REPERE challenge is a project aiming at the evaluation of systems for supervised and unsupervised multimodal recognition of people in TV broadcasts. In this paper, we describe, evaluate and discuss the QCompere consortium submissions to the 2012 REPERE evaluation campaign dry-run. Speaker identification (and face recognition) can be greatly improved when combined with name detection through video optical character recognition. Moreover, we show that unsupervised multimodal person recognition systems can achieve performance nearly as good as that of supervised monomodal ones (which rely on several hundred identity models).
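    As a rough illustration of combining speaker, face and OCR-name cues, here is a minimal score-level late-fusion sketch; the actual QCompere systems are more involved, and the names, weights and scores below are purely illustrative assumptions.

```python
# Minimal sketch of score-level late fusion across modalities.
# Weights and per-modality scores are illustrative, not the QCompere submission.
from collections import defaultdict

def fuse_scores(modality_scores, weights):
    """modality_scores: dict modality -> dict person -> score in [0, 1]."""
    fused = defaultdict(float)
    for modality, scores in modality_scores.items():
        w = weights.get(modality, 0.0)
        for person, score in scores.items():
            fused[person] += w * score
    return max(fused, key=fused.get) if fused else None

# Illustrative usage: the name read by OCR tips the decision.
scores = {
    "speaker":  {"Alice Martin": 0.55, "Bob Durand": 0.45},
    "face":     {"Alice Martin": 0.40, "Bob Durand": 0.50},
    "ocr_name": {"Bob Durand": 1.0},
}
print(fuse_scores(scores, {"speaker": 0.4, "face": 0.3, "ocr_name": 0.3}))
```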

    Discriminative Appearance Models for Face Alignment

    The proposed face alignment algorithm uses local gradient features as the appearance representation. These features are obtained by pixel value comparison, which provides robustness against changes in illumination and, owing to their locality, against partial occlusion and local deformation. The adopted features are modeled in three discriminative methods, which correspond to different alignment cost functions. The discriminative appearance modeling alleviates the generalization problem to some extent.
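    To make the feature construction concrete, here is a minimal sketch of pixel-comparison features sampled in a patch around a landmark; the patch size, number of point pairs and sampling scheme are assumptions for illustration, not the paper's exact recipe.

```python
# Minimal sketch of local pixel-comparison (binary) features around a landmark.
# Patch size and number of point pairs are illustrative assumptions.
import numpy as np

PATCH, N_PAIRS = 16, 128
# Fixed random point pairs inside the local patch, shared by all images/landmarks.
_rng = np.random.default_rng(0)
_PAIRS = _rng.integers(-PATCH // 2, PATCH // 2 + 1, size=(N_PAIRS, 2, 2))

def pixel_comparison_features(image, landmark):
    """image: 2-D grayscale array; landmark: (x, y).
    Returns a binary vector: signs of intensity differences between fixed point pairs."""
    x, y = landmark
    h, w = image.shape
    feats = np.zeros(N_PAIRS, dtype=np.uint8)
    for i, ((dx1, dy1), (dx2, dy2)) in enumerate(_PAIRS):
        p1 = image[np.clip(y + dy1, 0, h - 1), np.clip(x + dx1, 0, w - 1)]
        p2 = image[np.clip(y + dy2, 0, h - 1), np.clip(x + dx2, 0, w - 1)]
        feats[i] = p1 > p2
    return feats
```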

    Contextual Person Identification in Multimedia Data

    We propose methods to improve automatic person identification, regardless of the visibility of a face, by integrating multiple cues, including multiple modalities and contextual information. We propose a joint learning approach that uses contextual information from videos to improve learned face models, and we integrate additional modalities in a global fusion framework. We evaluate our approaches on a novel TV series dataset consisting of over 100,000 annotated faces.