
    Automatic Sign Language Recognition from Image Data

    This thesis addresses several issues of automatic sign language recognition, namely the creation of a vision-based sign language recognition framework, sign language corpora creation, feature extraction making use of novel hand tracking with face-occlusion handling, data-driven creation of sub-units, and a "search by example" tool for searching sign language corpora using hand images as a query. The proposed sign language recognition framework, based on a statistical approach incorporating hidden Markov models (HMMs), consists of video analysis, sign modeling, and decoding modules. The framework is able to recognize both isolated signs and continuous utterances from video data. All experiments and evaluations were performed on two in-house corpora, UWB-06-SLR-A and UWB-07-SLR-P, the first containing 25 signs and the second 378. As baseline feature descriptors, low-level image features are used. It is shown that better performance is gained by higher-level features that employ hand tracking and resolve occlusions of the hands and face. As a side effect, the occlusion-handling method interpolates the face area in occluded frames and thus allows the use of face feature descriptors that would otherwise fail in such cases, for instance features extracted by an active appearance models (AAM) tracker. Several state-of-the-art appearance-based feature descriptors were compared for the tracked hands, such as local binary patterns (LBP), histogram of oriented gradients (HOG), high-level linguistic features, and the newly proposed hand shape radial distance function (hRDF), which enhances the description of concave regions of the hand shape. The concept of sub-units, which uses HMMs based on linguistic units smaller than whole signs and covers the inner structure of signs, was investigated through a proposed iterative method that constitutes the first required step towards data-driven construction of sub-units; the results show that this concept is suitable for sign modeling and recognition tasks. In addition to the recognition experiments, a "search by example" tool was created and evaluated. This tool is a search engine for sign language videos and can be incorporated into, for example, an online sign language dictionary, where it is currently difficult or impossible to search the sign language data. The tool employs several methods examined in the recognition task and allows searching the video corpora with a user-given query consisting of one or more images of hands, captured for instance with a webcam. The result is an ordered list of videos that contain the same or a similar hand configuration.
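    The abstract names the hand shape radial distance function (hRDF) as a descriptor that emphasizes concave regions of the hand silhouette but does not spell out its formulation here. The sketch below is only a minimal illustration of a radial-distance-style descriptor for a segmented hand, written in Python with OpenCV; the angular resolution, normalization, and function names are assumptions, not the thesis' actual hRDF.

```python
import cv2
import numpy as np

def radial_distance_descriptor(hand_mask, n_angles=64):
    """Toy radial-distance descriptor for a binary hand mask.

    For each of n_angles directions around the hand centroid, keep the largest
    centroid-to-contour distance, then normalize for scale invariance.
    Concave regions of the silhouette show up as low values at some angles.
    """
    mask = hand_mask.astype(np.uint8)
    contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_NONE)
    contour = max(contours, key=cv2.contourArea).reshape(-1, 2)

    m = cv2.moments(mask, binaryImage=True)
    cx, cy = m["m10"] / m["m00"], m["m01"] / m["m00"]

    dx, dy = contour[:, 0] - cx, contour[:, 1] - cy
    angles = np.arctan2(dy, dx)          # angle of every contour point
    dists = np.hypot(dx, dy)             # its distance to the centroid

    bins = np.clip(((angles + np.pi) / (2 * np.pi) * n_angles).astype(int),
                   0, n_angles - 1)
    descriptor = np.zeros(n_angles)
    for b, d in zip(bins, dists):
        descriptor[b] = max(descriptor[b], d)

    return descriptor / (descriptor.max() + 1e-9)
```

    A descriptor computed this way could then be fed to the HMM-based sign models in place of raw low-level image features.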

    System for Subject Detection based on Image Signal Recognition

    This diploma thesis deals with image processing, specifically with the detection of objects based on the image signal. The basic properties of digital images, the theory of digital image preprocessing, and methods for segmenting objects in an image are described. The practical part covers the implementation of a C++ image-processing library in the Microsoft Visual Studio development environment and the subsequent use of the library to implement a visual referee system for the Qwirkle board game. The visual referee system segments the playing tiles by colour thresholding and then recognizes the shapes of the tiles by their similarity to reference patterns. The system is complemented by an appropriate user interface, and the results of testing the algorithm are included in the thesis.
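    As a rough illustration of the described pipeline (colour thresholding of the tiles followed by shape recognition against stored patterns), the sketch below shows a common OpenCV formulation: threshold in HSV space, extract contours, and compare them to a template contour with Hu-moment shape matching. The colour ranges, thresholds, and function names are illustrative assumptions, not the thesis' implementation.

```python
import cv2
import numpy as np

def find_tiles(frame_bgr, hsv_lo, hsv_hi, template_contour, max_dissimilarity=0.2):
    """Segment game tiles of one colour by HSV thresholding and keep those
    whose contour is similar to a reference template contour."""
    hsv = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2HSV)
    mask = cv2.inRange(hsv, np.array(hsv_lo), np.array(hsv_hi))
    mask = cv2.morphologyEx(mask, cv2.MORPH_OPEN, np.ones((5, 5), np.uint8))

    contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
    tiles = []
    for c in contours:
        if cv2.contourArea(c) < 100:     # ignore small noise blobs
            continue
        # Hu-moment based shape dissimilarity: lower means more similar.
        score = cv2.matchShapes(c, template_contour, cv2.CONTOURS_MATCH_I1, 0.0)
        if score < max_dissimilarity:
            tiles.append((c, score))
    return tiles
```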

    Querying out-of-vocabulary words in lexicon-based keyword spotting

    The final publication is available at Springer via http://dx.doi.org/10.1007/s00521-016-2197-8

    Lexicon-based handwritten text keyword spotting (KWS) has proven to be a faster and more accurate alternative to lexicon-free methods. Nevertheless, since lexicon-based KWS relies on a predefined vocabulary, fixed in the training phase, it does not support queries involving out-of-vocabulary (OOV) keywords. In this paper, we outline previous work aimed at solving this problem and present a new approach based on smoothing the (null) scores of OOV keywords by means of the information provided by "similar" in-vocabulary words. The good results achieved with this approach are compared with previously published alternatives on different data sets.

    This work was partially supported by the Spanish MEC under FPU Grant FPU13/06281, by the Generalitat Valenciana under the Prometeo/2009/014 Project Grant ALMA-MATER, and through the EU projects HIMANIS (JPICH programme, Spanish grant Ref. PCIN-2015-068) and READ (Horizon 2020 programme, grant Ref. 674943).

    Puigcerver, J.; Toselli, A. H.; Vidal, E. (2016). Querying out-of-vocabulary words in lexicon-based keyword spotting. Neural Computing and Applications, 1-10. https://doi.org/10.1007/s00521-016-2197-8
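    The core idea summarized above is to replace the null score of an out-of-vocabulary keyword with information borrowed from "similar" in-vocabulary words. A minimal sketch of that idea, assuming a precomputed table of per-line spotting scores for in-vocabulary keywords and a simple string-similarity weighting (the paper's actual smoothing function and similarity measure may differ):

```python
from difflib import SequenceMatcher

def oov_score(oov_word, line_scores, top_k=5):
    """Smooth the (null) score of an out-of-vocabulary keyword for one text line.

    line_scores maps each in-vocabulary keyword to its spotting score (e.g. a
    posterior probability) on the line. The OOV keyword inherits a
    similarity-weighted combination of the scores of its most similar
    in-vocabulary words.
    """
    def similarity(a, b):
        # Ratio in [0, 1]; 1.0 means identical strings.
        return SequenceMatcher(None, a, b).ratio()

    weighted = sorted(
        ((similarity(oov_word, w), s) for w, s in line_scores.items()),
        reverse=True,
    )[:top_k]

    total_weight = sum(w for w, _ in weighted)
    if total_weight == 0.0:
        return 0.0
    return sum(w * s for w, s in weighted) / total_weight


# Example: "recognise" is OOV, but "recognize" and "recognition" are in-vocabulary.
scores = {"recognize": 0.83, "recognition": 0.40, "keyword": 0.05}
print(oov_score("recognise", scores))
```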

    Probabilistic Evaluation of 3D Surfaces Using Statistical Shape Models (SSM)

    Inspecting a 3D object whose shape has elastic manufacturing tolerances in order to find defects is a challenging and time-consuming task. This task usually involves humans, either in the specification stage followed by some automatic measurements, or at other points along the process. Even when a detailed inspection is performed, the measurements are limited to a few dimensions rather than a complete examination of the object. In this work, a probabilistic method to evaluate 3D surfaces is presented. The algorithm relies on a training stage in which the shape of the object is learned by building a statistical shape model. Using this model, any inspected object can be evaluated to obtain the probability that the whole object, or any of its dimensions, is compatible with the model, which makes it easy to find defective objects. Results in simulated and real environments are presented and compared to two different alternatives.

    This work was partially funded by the Generalitat Valenciana through IVACE (Valencian Institute of Business Competitiveness), distributed nominatively to Valencian technological innovation centres under project expedient IMAMCN/2020/1.

    Pérez, J.; Guardiola Garcia, J. L.; Pérez Jiménez, A. J.; Perez-Cortes, J. (2020). Probabilistic Evaluation of 3D Surfaces Using Statistical Shape Models (SSM). Sensors, 20(22), 1-16. https://doi.org/10.3390/s20226554
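    At a high level, a statistical shape model of this kind is usually built by putting training surfaces into point correspondence, applying PCA, and then scoring a new object by how probable its shape coefficients are under the learned Gaussian. The NumPy sketch below illustrates that generic recipe only; the paper's alignment, correspondence, and per-dimension evaluation are more involved, and every name here is an assumption.

```python
import numpy as np
from scipy.stats import chi2

class SimpleSSM:
    """Toy statistical shape model: PCA over flattened, pre-aligned 3D points."""

    def fit(self, shapes, var_kept=0.95):
        # shapes: (n_samples, n_points, 3), already aligned and in correspondence
        X = shapes.reshape(len(shapes), -1)
        self.mean_ = X.mean(axis=0)
        U, S, Vt = np.linalg.svd(X - self.mean_, full_matrices=False)
        var = S**2 / (len(shapes) - 1)
        k = int(np.searchsorted(np.cumsum(var) / var.sum(), var_kept)) + 1
        self.components_, self.var_ = Vt[:k], var[:k]
        return self

    def compatibility(self, shape):
        """Probability-like score that `shape` is compatible with the model
        (p-value of the squared Mahalanobis distance in PCA space)."""
        b = self.components_ @ (shape.reshape(-1) - self.mean_)
        d2 = np.sum(b**2 / self.var_)
        return chi2.sf(d2, df=len(self.var_))   # high: compatible, low: likely defect
```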

    Extreme 3D Face Reconstruction: Seeing Through Occlusions

    Existing single-view 3D face reconstruction methods can produce beautifully detailed 3D results, but typically only for near-frontal, unobstructed viewpoints. We describe a system designed to provide detailed 3D reconstructions of faces viewed under extreme conditions, out-of-plane rotations, and occlusions. Motivated by the concept of bump mapping, we propose a layered approach which decouples estimation of a global shape from its mid-level details (e.g., wrinkles). We estimate a coarse 3D face shape which acts as a foundation and then separately layer this foundation with details represented by a bump map. We show how a deep convolutional encoder-decoder can be used to estimate such bump maps. We further show how this approach naturally extends to generate plausible details for occluded facial regions. We test our approach and its components extensively, quantitatively demonstrating the invariance of our estimated facial details. We further provide numerous qualitative examples showing that our method produces detailed 3D face shapes in viewing conditions where the existing state of the art often breaks down. Comment: Accepted to CVPR'18. Previously titled: "Extreme 3D Face Reconstruction: Looking Past Occlusions".
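    The layering step described above, a coarse global shape used as a foundation plus a bump map carrying mid-level detail, can be illustrated in its simplest geometric form: displace each vertex of the coarse mesh along its normal by the predicted bump value. This is a generic sketch of bump-map layering, not the authors' pipeline, which predicts the bump map with a deep convolutional encoder-decoder.

```python
import numpy as np

def layer_details(vertices, normals, bump_values, scale=1.0):
    """Layer mid-level detail onto a coarse 3D face mesh.

    vertices:    (N, 3) coarse face shape acting as the foundation
    normals:     (N, 3) unit vertex normals of the coarse shape
    bump_values: (N,)   per-vertex offsets (e.g. wrinkles) predicted by a network

    Each vertex is displaced along its own normal, so the global shape is kept
    and only mid-level detail is added on top of it.
    """
    return vertices + scale * bump_values[:, None] * normals


# Hypothetical usage with a flat patch standing in for a coarse face mesh.
verts = np.stack(np.meshgrid(np.arange(4.0), np.arange(4.0)), -1).reshape(-1, 2)
verts = np.hstack([verts, np.zeros((len(verts), 1))])        # z = 0 plane
normals = np.tile([0.0, 0.0, 1.0], (len(verts), 1))          # normals point up
wrinkles = 0.05 * np.sin(verts[:, 0])                        # toy bump pattern
detailed = layer_details(verts, normals, wrinkles)
```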

    A 3D Face Modelling Approach for Pose-Invariant Face Recognition in a Human-Robot Environment

    Face analysis techniques have become a crucial component of human-machine interaction in the fields of assistive and humanoid robotics. However, the variations in head pose that arise naturally in these environments are still a great challenge. In this paper, we present a real-time capable 3D face modelling framework for 2D in-the-wild images that is applicable to robotics. The fitting of the 3D Morphable Model is based exclusively on automatically detected landmarks. After fitting, the face can be corrected in pose and transformed back to a frontal 2D representation that is more suitable for face recognition. We conduct face recognition experiments with non-frontal images from the MUCT database and uncontrolled, in-the-wild images from the PaSC database, the most challenging face recognition database to date, showing improved performance. Finally, we present our SCITOS G5 robot system, which incorporates our framework as a means of image pre-processing for face analysis.
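    The pre-processing described here fits a 3D model to automatically detected 2D landmarks and then re-renders the face frontally. The snippet below sketches only the pose-estimation part of such a pipeline, using OpenCV's solvePnP on 2D-3D landmark correspondences with a mean face model; the actual framework also fits 3D Morphable Model shape coefficients, and the names and camera assumptions here are illustrative.

```python
import cv2
import numpy as np

def estimate_head_pose(landmarks_2d, model_points_3d, image_size):
    """Estimate head rotation and translation from detected 2D facial landmarks
    and the corresponding 3D points of a mean face model."""
    h, w = image_size
    focal = w  # rough focal-length guess; a calibrated camera would be better
    camera_matrix = np.array([[focal, 0, w / 2],
                              [0, focal, h / 2],
                              [0, 0, 1]], dtype=np.float64)
    dist_coeffs = np.zeros(4)  # assume no lens distortion

    ok, rvec, tvec = cv2.solvePnP(model_points_3d.astype(np.float64),
                                  landmarks_2d.astype(np.float64),
                                  camera_matrix, dist_coeffs,
                                  flags=cv2.SOLVEPNP_ITERATIVE)
    if not ok:
        raise RuntimeError("pose estimation failed")
    rotation, _ = cv2.Rodrigues(rvec)   # 3x3 rotation; its inverse frontalizes the face
    return rotation, tvec
```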

    Pigment Melanin: Pattern for Iris Recognition

    Recognition of the iris based on Visible Light (VL) imaging is a difficult problem because of the light reflection from the cornea. Nonetheless, pigment melanin provides a rich feature source in VL that is unavailable in Near-Infrared (NIR) imaging. This is due to the biological spectroscopy of eumelanin, a chemical not stimulated in NIR. In this case, a plausible solution to observe such patterns may be provided by an adaptive procedure using a variational technique on the image histogram. To describe the patterns, a shape analysis method is used to derive a feature code for each subject. An important question is how much the melanin patterns, extracted from VL, are independent of the iris texture in NIR. With this question in mind, the present investigation proposes fusion of features extracted from NIR and VL to boost the recognition performance. We have collected our own database (UTIRIS) consisting of both NIR and VL images of 158 eyes of 79 individuals. This investigation demonstrates that the proposed algorithm is highly sensitive to the patterns of chromophores and improves the iris recognition rate. Comment: To be published in the Special Issue on Biometrics, IEEE Transactions on Instrumentation and Measurement, Volume 59, Issue 4, April 201
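    The fusion mentioned above, combining features from NIR and VL images of the same eye, can be illustrated with a simple score-level fusion, a common baseline in multimodal biometrics. The weighting and the fractional Hamming distance matcher below are illustrative assumptions, not the paper's exact scheme.

```python
import numpy as np

def hamming_distance(code_a, code_b):
    """Fractional Hamming distance between two binary iris feature codes."""
    return np.count_nonzero(code_a != code_b) / code_a.size

def fused_match_score(nir_probe, nir_gallery, vl_probe, vl_gallery, w_nir=0.6):
    """Score-level fusion of NIR texture and VL melanin-pattern codes.

    Each modality yields a dissimilarity in [0, 1]; the fused score is their
    weighted sum, so complementary information from both spectra is used.
    Lower fused score means a more likely identity match.
    """
    d_nir = hamming_distance(nir_probe, nir_gallery)
    d_vl = hamming_distance(vl_probe, vl_gallery)
    return w_nir * d_nir + (1.0 - w_nir) * d_vl

# Hypothetical usage with random binary codes standing in for real features.
rng = np.random.default_rng(0)
a, b = rng.integers(0, 2, 2048), rng.integers(0, 2, 2048)
print(fused_match_score(a, a, b, b))   # identical codes give a fused distance of 0.0
```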