24 research outputs found

Automatic Sign Language Recognition from Image Data

Author: Campr Pavel
Publication venue: Západočeská univerzita v Plzni
Publication date: 12/02/2013

This thesis addresses several issues in automatic sign language recognition from image data and presents five main contributions: a vision-based sign language recognition framework, the creation of sign language corpora, feature extraction from hands and face using novel hand tracking with face occlusion handling, data-driven creation of sub-units, and a "search by example" tool that searches sign language corpora using hand images as the query. The proposed recognition framework follows a statistical approach based on hidden Markov models (HMM) and consists of video analysis, sign modeling and decoding modules. It can recognize both isolated signs and continuous utterances from video data. All experiments and evaluations were performed on the author's own corpora, UWB-06-SLR-A and UWB-07-SLR-P, the first containing 25 signs and the second 378. Low-level image features serve as the baseline feature descriptors. Better performance is obtained with higher-level features that employ hand tracking and resolve occlusions between the hands and the face. As a side effect, the occlusion handling method interpolates the face area in frames during occlusion, which makes it possible to use face feature descriptors that would otherwise fail there, for instance features extracted by an active appearance models (AAM) tracker.

Several state-of-the-art appearance-based feature descriptors were compared for tracked hands, such as local binary patterns (LBP), histogram of oriented gradients (HOG), high-level linguistic features, and the newly proposed hand shape radial distance function (hRDF), which enhances the description of hand shape features such as concave regions. The concept of sub-units, which uses HMMs based on linguistic units smaller than a whole sign and covers the inner structure of signs, was investigated with a proposed iterative method that creates the units automatically by analyzing existing data. This method is the first required step for data-driven construction of sub-units, and the results show that the concept is suitable for sign modeling and recognition. In addition to the sign language recognition experiments, a "search by example" tool was created and evaluated. The tool is a search engine for sign language videos and can be incorporated, for example, into online sign language dictionaries, where searching such data is currently difficult or impossible. It employs several methods examined in the recognition task and searches the video corpora based on a user-given query consisting of one or more images of hands, supplied for example via a webcam. The result is an ordered list of videos that contain the same or similar hand configurations.

Department of Cybernetics. Defended.

University of West Bohemia Digital Library


Comparison of Semantic Segmentation Approaches for Horizon/Sky Line Detection

Author: Ahmad Touqeer
Bebis George
Campr Pavel
Čadík Martin
Publication date: 21/05/2018

Horizon or skyline detection plays a vital role in visual geo-localization in mountainous terrain; however, most recently proposed visual geo-localization approaches rely on user-in-the-loop skyline detection methods. Detecting such a segmenting boundary fully autonomously would be a step forward for these localization approaches. This paper provides a quantitative comparison of four methods for autonomous horizon/skyline detection on an extensive data set. Specifically, we compare four recently proposed segmentation methods: one explicitly targeting the problem of horizon detection [Ahmad15], a second focused on visual geo-localization but relying on accurate detection of the skyline [Saurer16], and two proposed for general semantic segmentation, Fully Convolutional Networks (FCN) [Long15] and SegNet [Badrinarayanan15]. Each of the first two methods is trained on a common training set [Baatz12] of about 200 images, while the models for the third and fourth methods are fine-tuned for the sky segmentation problem through transfer learning on the same data set. Each method is tested on an extensive test set (about 3K images) covering various challenging geographical, weather, illumination and seasonal conditions. We report average accuracy and average absolute pixel error for each of the presented formulations.

Comment: Proceedings of the International Joint Conference on Neural Networks (IJCNN) (oral presentation), IEEE Computational Intelligence Society, 201
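The abstract above names average absolute pixel error as one of its metrics but does not define it. The following is a minimal sketch of one plausible reading, assuming a skyline is stored as one row index per image column and the error is the mean vertical distance between the predicted and ground-truth rows. The function name and the data layout are illustrative assumptions, not the authors' implementation.

```python
def average_absolute_pixel_error(predicted_rows, truth_rows):
    """Mean vertical distance, in pixels, between two skylines.

    Assumption: each skyline is a sequence with one row index per
    image column, so both sequences have the image width as length.
    """
    if len(predicted_rows) != len(truth_rows):
        raise ValueError("skylines must cover the same number of columns")
    if len(truth_rows) == 0:
        raise ValueError("skylines must not be empty")
    total = sum(abs(p - t) for p, t in zip(predicted_rows, truth_rows))
    return total / len(truth_rows)


# Hypothetical three-column image: the prediction is off by 0, 2 and 3 pixels.
error = average_absolute_pixel_error([10, 12, 15], [10, 10, 18])
```

Averaging this per-image value over a test set would give a single figure per method, which is one way such a comparison could be reported.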

arXiv.org e-Print Archive

Crossref

Sign Language Tutoring Tool

Author: Akarun Lale
Aran Oya
Ari Ismail
Benoit Alexandre
Campr Pavel
Caplier Alice
Carrillo Ana Huerta
Fanard François-Xavier
Rombaut Michele
Sankur Bulent
Publication date: 01/01/2007

In this project, we have developed a sign language tutor that lets users learn isolated signs by watching recorded videos and then trying the same signs themselves. The system records the user's video and analyses it. If the sign is recognized, both verbal and animated feedback is given to the user. The system can recognize complex signs that involve hand gestures as well as head movements and facial expressions. Our performance tests yield a 99% recognition rate on signs involving only manual gestures and an 85% recognition rate on signs that involve both manual and non-manual components, such as head movement and facial expressions.

Comment: eNTERFACE'06 Summer Workshop on Multimodal Interfaces, Dubrovnik, Croatia (2007)

arXiv.org e-Print Archive

Hal - Université Grenoble Alpes

SignTutor: An Interactive System for Sign Language Tutoring

Author: Alexandre Benoit
Alice Caplier
Ana Huerta Carrillo
Bülent Sankur
François-Xavier Fanard
Ismail Ari
Lale Akarun
Oya Aran
Pavel Campr
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)

Crossref

Návrh a záznam korpusu české znakové řeči pro automatické rozpoznávání znakové řeči [Design and recording of a Czech sign language corpus for automatic sign language recognition]

Author: Campr Pavel
Hrúz Marek
Železný Miloš
Publication venue: ELRA
Publication date: 01/01/2008

In this paper we discuss the design, acquisition and preprocessing of a Czech audio-visual speech corpus. The corpus is intended for training and testing of existing audio-visual speech recognition systems. The database is named UWB-07-ICAVR, where ICAVR stands for Impaired Condition Audio Visual speech Recognition. The corpus consists of 10000 utterances of continuous speech obtained from 50 speakers, half of them men and half women. The total length of the database is 25 hours. Each utterance is stored as a separate sentence. The corpus extends existing databases by covering conditions of variable illumination; six types of illumination were covered. Recording was done with two cameras and two microphones. The database can be used for testing visual parameterization in audio-visual speech recognition (AVSR) and can easily be split into training and testing parts. Each speaker pronounced 200 sentences: the first 50 were the same for all speakers, the rest were different. The session for one speaker fits on one DVD. All files are accompanied by visual labels, which specify the region of interest (the mouth and the area around it, given by a bounding box). The actual pronunciation of each sentence is transcribed into a text file.

University of West Bohemia Digital Library


Analýza korelace výrazu tváře a gest znakové řeči [Analysis of the correlation between facial expression and sign language gestures]

Author: Campr Pavel
Hrúz Marek
Krňoul Zdeněk
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)
Publication date: 01/01/2010

In this paper we focus on the potential correlation between the manual and non-manual components of sign language. This information is useful for sign language analysis, recognition and synthesis; we are mainly concerned with the application to sign synthesis. First, we extract features that represent the manual and non-manual components from pre-recorded image data. We present a simple but robust hand tracking method that yields a 2D trajectory representing a portion of the manual component. Head movement and facial expression are tracked via an Active Appearance Model. We introduce initial experiments to reveal the relationship between these features, carried out on a corpus of isolated signs from Czech Sign Language. The results imply that the manual and non-manual components are correlated; the most strongly correlated signals are the vertical movement of the dominant hand and the movement of the head.
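The abstract above reports a correlation between hand and head motion signals without naming the statistic used. As a hedged illustration only, the sketch below assumes a Pearson correlation coefficient computed over two per-frame vertical coordinate series. The function name and the sample trajectories are hypothetical and are not taken from the paper.

```python
import math


def pearson_correlation(xs, ys):
    """Pearson correlation coefficient of two equally long series."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two series of equal length with at least 2 samples")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of products of deviations from the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical per-frame vertical positions (pixels) of the dominant hand
# and of the head; here the head follows the hand at one fifth of the range.
hand_y = [100, 110, 125, 140]
head_y = [50, 52, 55, 58]
r = pearson_correlation(hand_y, head_y)
```

A coefficient close to 1 or -1 would indicate strongly coupled motion, while a value near 0 would indicate little linear relationship between the two signals.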

Crossref

University of West Bohemia Digital Library


Metodika pro automatizovanou tvorbu slovníku znakového jazyka [Methodology for the automated creation of a sign language dictionary]

Author: Campr Pavel
Hrúz Marek
Kanis Jakub
Publication venue: BMI sdružení
Publication date: 01/01/2011

This article describes a methodology that aims to speed up and simplify the creation of sign language dictionaries. The methodology is based on advanced methods of digital image processing. These enable, for example, automatic detection of the boundaries of individual signs in a video recording containing a sequence of signs, as well as automatic categorization of signs based on their automatically recognized attributes.

University of West Bohemia Digital Library


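Tvorba a předzpracování korpusu českého znakového jazyka pro automatické rozpoznávání znakového jazyka [Creation and preprocessing of a Czech sign language corpus for automatic sign language recognition]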

Author: Campr Pavel
Hrúz Marek
Trojanová Jana
Publication venue: ELRA
Publication date: 01/01/2008

This paper discusses the design, recording and preprocessing of a Czech sign language corpus. The corpus is intended for training and testing of sign language recognition (SLR) systems. The UWB-07-SLR-P corpus contains video data of 4 signers recorded from 3 different perspectives. Two of the perspectives capture the whole body and provide 3D motion data; the third is focused on the signer's face and provides data for facial expression and lip feature extraction. Each signer performed 378 signs with 5 repetitions. The corpus consists of several types of signs: numbers (35 signs), one- and two-handed finger alphabet (64), town names (35) and other signs (244). Each sign is stored in a separate AVI file. In total the corpus consists of 21853 video files with a total length of 11.1 hours. Additionally, each sign is preprocessed, and basic features such as 3D hand and head trajectories are available. The corpus is mainly focused on feature extraction and isolated SLR rather than continuous SLR experiments.

University of West Bohemia Digital Library


Metodika pro automatizovanou tvorbu slovníku znakového jazyka [Methodology for the automated creation of a sign language dictionary]

Author: Kanis Jakub
Hrúz Marek
Campr Pavel
Publication venue: BMI sdružení
Publication date: 01/01/2011

University of West Bohemia Digital Library