4 research outputs found

Separació de shots de vídeo amb anàlisis multimodal (Video shot separation with multimodal analysis)

Author: Palou Llobera Pere
Publication venue: Universitat Politècnica de Catalunya
Publication date: 01/01/2006

Indexing and retrieval of digital video is one of the most active areas of digital audiovisual signal processing. The amount of digital audiovisual information available in databases is growing dramatically thanks to the technological development of the information and communication society in recent years. For this reason, access to audiovisual data must be as simple and fast as possible in order to save time and resources. This requires automatic segmentation tools that split a video sequence into its elementary shots. Two histogram-based color descriptors defined in the MPEG-7 standard have been implemented: the Scalable Color Descriptor (SCD), which extracts the histogram bins in the HSV color space, and the Group-of-Frames Descriptor (GoF), which represents the content of each detected shot by accumulating three different histograms. Once the color features have been extracted, L2 distance measures between consecutive frames are computed; these provide the information needed to detect the shot boundaries (hard cuts) of a video sequence by applying algorithms based on adaptive temporal thresholds. A set of results is presented for all the video genres included in the manually segmented database. These results are evaluated, on the one hand, using the L2 distance between consecutive frames for the statistical parameters μ and σ of the HSV channels and, on the other hand, using the L2 distance between consecutive frames for the histogram bins extracted by the SCD. Recall and Precision measure the quality of the detections. For the overall assessment across video genres, the following results are obtained: Recall (bins) 97.29% > Recall (μ, σ) 92.69%; Precision (bins) 78.92% < Precision (μ, σ) 86.51%
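
The detection step described in the abstract (L2 distance between the color histograms of consecutive frames, compared against an adaptive temporal threshold, then scored with Recall and Precision) can be illustrated with a minimal sketch. This is not the thesis implementation: the function names, the sliding-window rule `mean + k * std`, and the parameters `window`, `k` and `min_dist` are all assumptions chosen for illustration, and the histograms are assumed to be precomputed lists of bin values.

```python
import math
from statistics import mean, pstdev


def l2_distance(h1, h2):
    """Euclidean (L2) distance between two equal-length histograms."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(h1, h2)))


def detect_hard_cuts(histograms, window=5, k=3.0, min_dist=0.1):
    """Return frame indices i where a cut is declared between frames i-1 and i.

    Assumed adaptive rule (hypothetical, not taken from the thesis): a cut is
    flagged when the current distance exceeds mean + k * std of the previous
    `window` distances, with `min_dist` as a floor for static scenes.
    """
    dists = [
        l2_distance(histograms[i - 1], histograms[i])
        for i in range(1, len(histograms))
    ]
    cuts = []
    for j, d in enumerate(dists):
        past = dists[max(0, j - window):j]
        if not past:
            continue  # no history yet, so no threshold can be estimated
        threshold = max(min_dist, mean(past) + k * pstdev(past))
        if d > threshold:
            cuts.append(j + 1)  # dists[j] compares frame j with frame j + 1
    return cuts


def recall_precision(detected, truth):
    """Recall and Precision of detected cut positions against ground truth."""
    hits = len(set(detected) & set(truth))
    recall = hits / len(truth) if truth else 0.0
    precision = hits / len(detected) if detected else 0.0
    return recall, precision


# Toy input: four frames of one "shot", then four frames of another.
frames = [[1.0, 0.0, 0.0]] * 4 + [[0.0, 1.0, 0.0]] * 4
cuts = detect_hard_cuts(frames)
```

Raising `k` in this sketch makes the detector more conservative, which would be expected to trade Recall for Precision, the same kind of trade-off visible in the reported figures.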

UPCommons. Portal del coneixement obert de la UPC

Word prediction for a real-time reader device for blind people

Author: Palou Llobera Pere
Publication venue: Universitat Politècnica de Catalunya
Publication date: 01/01/2008

Taking the software developed in a previous project as its starting point, this project aims to increase the reliability and robustness of character recognition. The main goal of the future overall system is to come as close as possible to the way blind people read, improving accessibility for this group. If the system can substantially help blind people to read, they would probably be better able to access new technologies; unfortunately, many blind people today do not use computers because they cannot access them. One way to increase the system's reliability is to make it more robust. The current system, based on artificial neural networks, processes one character at a time and tries to recognize it using only its image acquired from the camera. As a consequence, it ignores other information that would increase its accuracy, such as the preceding characters or orthographic knowledge of the language in use, which helps to avoid errors when a character has been misrecognized. For this reason, character-level and word-level prediction systems have been implemented. On the one hand, they add a simultaneous means of recognition; on the other hand, they are the starting point for a system able to correct characters or words in text.
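
As an illustration of how the preceding character can support an image-only recognizer, the following minimal sketch rescores candidate characters with a character bigram model. It is not the method used in the project: the bigram model, the add-one smoothing, the `^` word-start marker, and all function names are assumptions, and the recognizer scores are made-up example values.

```python
from collections import Counter, defaultdict


def train_bigrams(words):
    """Count character bigrams; '^' is an assumed word-start marker."""
    counts = defaultdict(Counter)
    for word in words:
        padded = "^" + word
        for prev, cur in zip(padded, padded[1:]):
            counts[prev][cur] += 1
    return counts


def bigram_prob(counts, prev, cur, alphabet_size=26):
    """P(cur | prev) with add-one smoothing over an assumed 26-letter alphabet."""
    total = sum(counts[prev].values())
    return (counts[prev][cur] + 1) / (total + alphabet_size)


def rescore(recognizer_scores, prev_char, counts):
    """Pick the character maximizing recognizer score times bigram prior.

    `recognizer_scores` maps candidate characters to the score an image-based
    recognizer assigned them (hypothetical values in this sketch).
    """
    combined = {
        c: score * bigram_prob(counts, prev_char, c)
        for c, score in recognizer_scores.items()
    }
    return max(combined, key=combined.get)


# Tiny illustrative vocabulary; a real system would use a large corpus.
model = train_bigrams(["the", "then", "this", "that"])

# The recognizer alone slightly prefers 'b', but 'h' is far likelier after 't'.
scores = {"h": 0.45, "b": 0.55}
best = rescore(scores, "t", model)
```

The same idea extends to the word level, for example by comparing a recognized word against a dictionary, but that step is left out here to keep the sketch short.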

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

UPCommons. Portal del coneixement obert de la UPC

Separació de shots de vídeo amb anàlisis multimodal (Video shot separation with multimodal analysis)

Author: Palou Llobera Pere
Publication venue: Universitat Politècnica de Catalunya

RECERCAT

Word prediction for a real-time reader device for blind people

Author: Palou Llobera Pere
Publication venue: Universitat Politècnica de Catalunya

RECERCAT