78 research outputs found

    Reliable pattern recognition system with novel semi-supervised learning approach

    Over the past decade, there has been considerable progress in the design of statistical machine learning strategies, including Semi-Supervised Learning (SSL) approaches. However, researchers still have difficulties in applying most of these learning strategies when two or more classes overlap, and/or when each class has a bimodal/multimodal distribution. In this thesis, an efficient, robust, and reliable recognition system with a novel SSL scheme has been developed to overcome overlapping problems between two classes and bimodal distribution within each class. This system was based on the nature of category learning and recognition to enhance the system's performance in relevant applications. In the training procedure, besides the supervised learning strategy, the unsupervised learning approach was applied to retrieve the "extra information" that could not be obtained from the images themselves. This approach was very helpful for the classification between two confusing classes. In this SSL scheme, both the training data and the test data were utilized in the final classification. In this thesis, the design of a promising supervised learning model with advanced state-of-the-art technologies is firstly presented, and a novel rejection measurement for verification of rejected samples, namely Linear Discriminant Analysis Measurement (LDAM), is defined. Experiments on CENPARMI's Hindu-Arabic Handwritten Numeral Database, CENPARMI's Numerals Database, and NIST's Numerals Database were conducted in order to evaluate the efficiency of LDAM. Moreover, multiple verification modules, including a Writing Style Verification (WSV) module, have been developed according to four newly defined error categories. The error categorization was based on the different costs of misclassification. 
The WSV module was developed with an unsupervised learning approach that automatically retrieves each person's writing style, so that rejected samples can be classified and verified accordingly. As a result, errors on CENPARMI's Hindu-Arabic Handwritten Numeral Database (24,784 training samples, 6,199 testing samples) were reduced drastically from 397 to 59, and the final recognition rate of this HAHNR reached 99.05%, significantly higher than in other experiments on the same database. When the rejection option was applied to this database, the recognition rate, error rate, and reliability were 97.89%, 0.63%, and 99.28%, respectivel
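The quoted 99.05% can be checked against the error counts with a few lines of arithmetic. This is only a sketch: it assumes both error counts (397 and 59) refer to the same 6,199 test samples and that no samples are rejected, and the function name `recognition_rate` is hypothetical.

```python
def recognition_rate(errors: int, total: int) -> float:
    """Return the recognition rate in percent for a test set with no rejection."""
    if total <= 0:
        raise ValueError("total must be positive")
    return 100.0 * (total - errors) / total


TEST_SAMPLES = 6199  # test-set size quoted in the abstract

# Assumption: both error counts were measured on the same 6,199 test samples.
before = recognition_rate(397, TEST_SAMPLES)
after = recognition_rate(59, TEST_SAMPLES)

print(f"before verification: {before:.2f}%")
print(f"after verification:  {after:.2f}%")
```

Under these assumptions the second figure rounds to the 99.05% reported in the abstract.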

    Incorporation of relational information in feature representation for online handwriting recognition of Arabic characters

    Interest in online handwriting recognition is increasing due to market demand for both improved performance and support for additional scripts on digital devices. Robust handwriting recognition of complex patterns of arbitrary scale, orientation and location remains elusive because reaching a target recognition rate is not trivial for most applications in this field. Cursive scripts such as Arabic and Persian, with their complex character shapes, make the recognition task even more difficult. The discrimination capability of handwriting recognition systems depends heavily on the effectiveness of the features used to represent the data, the types of classifiers deployed, and the use of inclusive databases for learning and recognition that cover the variations in writing styles which introduce natural deformations in character shapes. This thesis aims to improve the efficiency of online recognition systems for Persian and Arabic characters by presenting new formal feature representations, algorithms, and a comprehensive database for online Arabic characters. The thesis includes the development of the first public collection of online handwritten data for the Arabic complete-shape character set. New ideas for incorporating relational information in a feature representation for this type of data are presented. The proposed techniques are computationally efficient and provide compact, yet representative, feature vectors. For the first time, a hybrid classifier is used for recognition of online Arabic complete-shape characters, based on the idea of decomposing the input data into variables representing factors of the complete-shape characters and on the combined use of Bayesian network inference and support vector machines. We demonstrate the usefulness and practicality of the features and recognition methods with respect to conventional metrics, such as accuracy and timeliness, as well as unconventional metrics. 
In particular, we evaluate a feature representation for different character class instances by its level of separation in the feature space. Our evaluation results for the available databases and for our own database of the characters' main shapes confirm a higher efficiency than previously reported techniques with respect to all metrics analyzed. For the complete-shape characters, our techniques resulted in a unique recognition efficiency comparable with the state-of-the-art results for main shape characters

    Searching heterogeneous document image collections

    A decrease in data storage costs and the widespread use of scanning devices have led to massive quantities of scanned digital documents in corporations, organizations, and governments around the world. Automatically processing these large heterogeneous collections can be difficult due to considerable variation in resolution, quality, font, layout, noise, and content. Making this data available to a wide audience requires methods for efficient retrieval and analysis from large collections of document images, which remain an open and important area of research. In this proposal, we present research in three areas that augment the current state of the art in the retrieval and analysis of large heterogeneous document image collections. First, we explore an efficient approach to document image retrieval, which allows users to perform retrieval against large image collections in a query-by-example manner. Our approach is compared to text retrieval on OCR output for a collection of 7 million document images collected from lawsuits against tobacco companies. Next, we present research in document verification and change detection, where one may want to quickly determine whether two document images contain any differences (document verification) and, if so, to determine precisely what has changed and where (change detection). A motivating example is legal contracts, where scanned images are often e-mailed back and forth and small changes can have severe ramifications. Finally, approaches useful for exploiting the biometric properties of handwriting in order to perform writer identification and retrieval in document images are examined

    Text-line and word detection in document images using optimization methods

    Doctoral dissertation, Seoul National University Graduate School, Department of Electrical and Computer Engineering, August 2015. 조남익 (Nam Ik Cho). Locating text-lines and segmenting words in a document image are important processes for various document image processing applications such as optical character recognition, document rectification, layout analysis and document image compression. Thus, there has been a great deal of research in this area, and the segmentation of machine-printed documents scanned by flatbed scanners has matured to some extent. However, for handwritten documents it is still considered a challenging problem, since the features of handwritten documents are irregular and vary with the writer and the language. To address this problem, this dissertation presents new segmentation algorithms which extract text-lines and words from a document image based on a new super-pixel representation method and a new energy minimization framework built on its characteristics. The overview of the proposed algorithms is as follows. First, this dissertation presents a text-line extraction algorithm for handwritten documents based on an energy minimization framework with a new super-pixel representation scheme. In order to deal with documents in various languages, a language-independent text-line extraction algorithm is developed based on the super-pixel representation with normalized connected components (CCs). Due to this normalization, the proposed method is able to estimate the states of super-pixels for a range of different languages and writing styles. From the estimated states, an energy function is formulated whose minimization yields text-lines. Experimental results show that the proposed method yields state-of-the-art performance on various handwritten databases. Second, a preprocessing method for text-line detection in historical documents is presented. Unlike modern handwritten documents, historical documents suffer from various types of degradation. 
To alleviate these problems, a preprocessing algorithm that includes robust binarization and noise removal is introduced in this dissertation. For the robust binarization of historical documents, global and local thresholding binarization methods are combined to deal with various degradations such as stains and faint characters. Also, the energy minimization framework is modified to fit the characteristics of historical documents. Experimental results on two historical databases show that the proposed preprocessing method combined with the text-line detection algorithm achieves the best detection performance on severely degraded historical documents. Third, this dissertation presents a word segmentation algorithm based on a structured learning framework. In this dissertation, the word segmentation problem is formulated as a labeling problem that assigns a label (intra-word/inter-word gap) to each gap between the characters in a given text-line. In order to address the feature irregularities found especially in handwritten documents, the word segmentation problem is formulated as a binary quadratic assignment problem that considers pairwise correlations between the gaps as well as the likelihoods of individual gaps, based on the proposed text-line extraction results. Even though many parameters are involved in the formulation, all parameters are estimated within the structured SVM framework, so that the proposed method works well regardless of writing style and written language without user-defined parameters. 
Experimental results on ICDAR 2009/2013 handwriting segmentation databases show that the proposed method achieves state-of-the-art performance on Latin-based and Indian languages.
Contents: 1 Introduction (1.1 Text-line Detection of Document Images; 1.2 Word Segmentation of Document Images; 1.3 Summary of Contribution). 2 Related Work (2.1 Text-line Detection; 2.2 Word Segmentation). 3 Text-line Detection of Handwritten Document Images based on Energy Minimization (3.1 Proposed Approach for Text-line Detection: 3.1.1 State Estimation of a Document Image, 3.1.2 Problems with Under-segmented Super-pixels for Estimating States, 3.1.3 A New Super-pixel Representation Method based on CC Partitioning, 3.1.4 Cost Function for Text-line Segmentation, 3.1.5 Minimization of Cost Function; 3.2 Experimental Results of Various Handwritten Databases: 3.2.1 Evaluation Measure, 3.2.2 Parameter Selection, 3.2.3 Experiment on HIT-MW Database, 3.2.4 Experiment on ICDAR 2009/2013 Handwriting Segmentation Databases, 3.2.5 Experiment on IAM Handwriting Database, 3.2.6 Experiment on UMD Handwritten Arabic Database, 3.2.7 Limitations). 4 Preprocessing Method of Historical Document for Text-line Detection (4.1 Characteristics of Historical Documents; 4.2 A Combined Approach for the Binarization of Historical Documents; 4.3 Experimental Results of Text-line Detection for Historical Documents: 4.3.1 Evaluation Measure and Configurations, 4.3.2 George Washington Database, 4.3.3 ICDAR 2015 ANDAR Datasets). 5 Word Segmentation Method for Handwritten Documents based on Structured Learning (5.1 Proposed Approach for Word Segmentation: 5.1.1 Text-line Segmentation and Super-pixel Representation, 5.1.2 Proposed Energy Function for Word Segmentation; 5.2 Structured Learning Framework: 5.2.1 Feature Vector, 5.2.2 Parameter Estimation by Structured SVM; 5.3 Experimental Results). 6 Conclusions.
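The gap-labeling formulation in the abstract above (one binary label per gap, with individual likelihoods plus pairwise terms) can be illustrated with a toy energy minimization. The sketch below is not the dissertation's method: the cost values, the disagreement penalty, the function name `minimize_gap_energy`, and the brute-force search are all assumptions chosen to keep the example small, and the thesis learns its parameters with a structured SVM rather than fixing them by hand.

```python
from itertools import product


def minimize_gap_energy(unary, pairwise):
    """Brute-force a binary labeling of gaps (0 = intra-word, 1 = inter-word).

    unary:    list of (cost_if_0, cost_if_1), one pair per gap.
    pairwise: dict mapping (i, j) to a penalty paid when gaps i and j
              receive different labels (a hypothetical correlation term).
    Returns (best_labels, best_energy). Only practical for a handful of gaps.
    """
    best_labels, best_energy = None, float("inf")
    for labels in product((0, 1), repeat=len(unary)):
        energy = sum(unary[i][x] for i, x in enumerate(labels))
        energy += sum(w for (i, j), w in pairwise.items() if labels[i] != labels[j])
        if energy < best_energy:
            best_labels, best_energy = labels, energy
    return best_labels, best_energy


# Hypothetical costs for three gaps in one text-line.
unary = [(0.2, 1.0), (0.9, 0.1), (0.4, 0.6)]
pairwise = {(0, 2): 0.5}  # gaps 0 and 2 are encouraged to share a label

labels, energy = minimize_gap_energy(unary, pairwise)
print(labels, round(energy, 2))
```

With these made-up costs the minimum is reached when only the middle gap is labeled as an inter-word gap.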

    AutoGraff: towards a computational understanding of graffiti writing and related art forms.

    The aim of this thesis is to develop a system that generates letters and pictures with a style that is immediately recognizable as graffiti art or calligraphy. The proposed system can be used similarly to, and in tight integration with, conventional computer-aided geometric design tools and can be used to generate synthetic graffiti content for urban environments in games and in movies, and to guide robotic or fabrication systems that can materialise the output of the system with physical drawing media. The thesis is divided into two main parts. The first part describes a set of stroke primitives, building blocks that can be combined to generate different designs that resemble graffiti or calligraphy. These primitives mimic the process typically used to design graffiti letters and exploit well known principles of motor control to model the way in which an artist moves when incrementally tracing stylised letter forms. The second part demonstrates how these stroke primitives can be automatically recovered from input geometry defined in vector form, such as the digitised traces of writing made by a user, or the glyph outlines in a font. This procedure converts the input geometry into a seed that can be transformed into a variety of calligraphic and graffiti stylisations, which depend on parametric variations of the strokes

    Exploiting Spatio-Temporal Coherence for Video Object Detection in Robotics

    This paper proposes a method to enhance video object detection for indoor environments in robotics. Concretely, it exploits knowledge about the camera motion between frames to propagate previously detected objects to successive frames. The proposal relies on planar homography to propose regions of interest in which to look for objects, and on recursive Bayesian filtering to integrate observations over time. The proposal is evaluated on six virtual indoor environments, covering the detection of nine object classes over a total of about 7k frames. Results show that our proposal improves the recall and the F1-score by a factor of 1.41 and 1.27, respectively, and achieves a significant reduction of the object categorization entropy (58.8%) when compared to a two-stage video object detection method used as baseline, at the cost of small time overheads (120 ms) and precision loss (0.92).
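The idea of integrating observations over time with a recursive Bayesian filter, and of measuring the result as categorization entropy, can be sketched for a single tracked object. This is an illustration under stated assumptions, not the paper's implementation: the three-class setup, the per-frame likelihood values, and the names `bayes_update` and `entropy_bits` are hypothetical, and the homography-based propagation step is omitted.

```python
import math


def bayes_update(prior, likelihood):
    """One recursive Bayesian step: posterior is proportional to prior * likelihood."""
    unnormalized = [p * l for p, l in zip(prior, likelihood)]
    total = sum(unnormalized)
    if total == 0:
        raise ValueError("likelihood is incompatible with the prior")
    return [u / total for u in unnormalized]


def entropy_bits(dist):
    """Shannon entropy of a discrete distribution, in bits."""
    return -sum(p * math.log2(p) for p in dist if p > 0)


# Hypothetical example: three object classes, uniform initial belief.
belief = [1 / 3, 1 / 3, 1 / 3]
initial_entropy = entropy_bits(belief)

# Assumed per-frame detector scores for the same propagated object.
frame_likelihoods = [[0.6, 0.3, 0.1], [0.6, 0.3, 0.1]]
for likelihood in frame_likelihoods:
    belief = bayes_update(belief, likelihood)

final_entropy = entropy_bits(belief)
print([round(b, 3) for b in belief], round(initial_entropy, 3), round(final_entropy, 3))
```

Each consistent observation concentrates the belief on one class, so the entropy falls as frames accumulate; this is the quantity the abstract reports as reduced.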

    Contributions to non-conventional biometric systems: improvements on the fingerprint, facial and handwriting recognition approach

    Thesis (doctorate), Universidade de Brasília, Faculdade de Tecnologia, Departamento de Engenharia Mecânica, 2021. Coordenação de Aperfeiçoamento de Pessoal de Nível Superior (CAPES). Biometric systems are widely used by society. Most applications are associated with civil identification and criminal investigation. However, over time, traditional methods of performing biometrics have been reaching their limits. 
In this context, emerging or non-conventional biometric systems (NCBS) are gaining ground. Although promising, new systems, like any new technology, bring not only potential but also weaknesses. This work presents contributions to three important non-conventional biometric systems: fingerprint, facial, and handwriting recognition. With regard to fingerprints, this work presents a novel method for liveness detection on Touchless Multi-view Fingerprint Devices, using Texture Descriptors and Artificial Neural Networks. With regard to face recognition, a facial recognition method is presented, based on Scale Invariant Feature Algorithms (SIFT and SURF), that operates without the need for prior training of a classifier and can be used to track individuals in an unconstrained environment. Finally, a low-cost online handwritten signature recognition method is presented that uses accelerometer and gyroscope signals obtained from a sensor coupled to conventional pens to identify individuals in real time. Results show that the proposed methods are promising and that together may contribute to the improvement of the NCB