2,358 research outputs found

    An IoT System for Converting Handwritten Text to Editable Format via Gesture Recognition

    The evolution of the traditional classroom has led to the electronic classroom, i.e. e-learning. That evolution does not stop at e-learning or distance learning: the next step is the smart classroom. Among the most popular features of an electronic classroom are capturing video or photos of lecture content and extracting the handwriting for note-taking. Numerous techniques have been implemented to extract handwriting from lecture video or photos, but several still have deficiencies which, once resolved, can turn an electronic classroom into a smart classroom. In this thesis, we present a real-time IoT system that converts handwritten text into an editable format by implementing hand gesture recognition (HGR) with a Raspberry Pi and a camera. HGR is built on an edge detection algorithm and is used in this system to reduce the computational complexity of previous systems, namely the removal of redundant images and of the lecturer's body from the image, and the recollection of text from previous images to fill the area from which the lecturer's body has been removed. The Raspberry Pi is used to retrieve and perceive hand gestures and to build an IoT-based smart classroom. Handwritten images are converted into an editable format using OpenCV and machine learning algorithms. Text conversion covers the recognition of uppercase and lowercase letters, numbers, special characters, mathematical symbols, equations, graphs and figures, together with the recognition of words, lines, blocks and paragraphs. With the help of the Raspberry Pi and IoT, the editable lecture notes are delivered to students through a desktop application in which they can edit notes and images as needed.

    Information Preserving Processing of Noisy Handwritten Document Images

    Many pre-processing techniques that normalize artifacts and clean noise induce anomalies due to discretization of the document image. Important information that could be used at later stages may be lost. A proposed composite-model framework takes into account pre-printed information, user-added data, and digitization characteristics. Its benefits are demonstrated by experiments with statistically significant results. Separating pre-printed ruling lines from user-added handwriting shows how ruling lines impact people's handwriting and how they can be exploited for identifying writers. Ruling line detection based on multi-line linear regression reduces the mean error of counting them from 0.10 to 0.03, 6.70 to 0.06, and 0.13 to 0.02, compared to an HMM-based approach on three standard test datasets, thereby reducing human correction time by 50%, 83%, and 72% on average. On 61 page images from 16 rule-form templates, the precision and recall of form cell recognition are increased by 2.7% and 3.7%, compared to a cross-matrix approach. Compensating for and exploiting ruling lines during feature extraction rather than pre-processing raises the writer identification accuracy from 61.2% to 67.7% on a 61-writer noisy Arabic dataset. Similarly, counteracting page-wise skew by subtracting it or transforming contours in a continuous coordinate system during feature extraction improves the writer identification accuracy. An implementation study of contour-hinge features reveals that utilizing the full probabilistic probability distribution function matrix improves the writer identification accuracy from 74.9% to 79.5%.
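The abstract does not spell out what "multi-line linear regression" involves. One plausible reading, sketched here purely as an assumption, is a least-squares fit in which all ruling lines on a page share a single slope (the page skew) while each line keeps its own intercept. The function name `fit_parallel_lines` and the synthetic points are hypothetical and do not come from the thesis.

```python
def fit_parallel_lines(groups):
    """Least-squares fit of y = slope * x + intercept_k with one shared slope.

    groups: one list of (x, y) points per ruling line (assumed already grouped).
    Returns (slope, [intercept_k, ...]).
    """
    num = 0.0
    den = 0.0
    means = []
    for pts in groups:
        n = len(pts)
        mx = sum(x for x, _ in pts) / n
        my = sum(y for _, y in pts) / n
        means.append((mx, my))
        # Centering each group removes its intercept from the slope estimate.
        num += sum((x - mx) * (y - my) for x, y in pts)
        den += sum((x - mx) ** 2 for x, _ in pts)
    if den == 0:
        raise ValueError("need at least two distinct x values on some line")
    slope = num / den
    intercepts = [my - slope * mx for mx, my in means]
    return slope, intercepts


# Three synthetic ruling lines with slope 0.01 and a 40-pixel spacing.
lines = [[(x, 0.01 * x + 100 + 40 * k) for x in range(0, 500, 50)] for k in range(3)]
slope, intercepts = fit_parallel_lines(lines)
```

Pooling the points of every line into one slope estimate is what would make such a fit more stable than regressing each line separately, since a short or partly occluded line still inherits the page-wide skew.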

    Word Extraction Associated with a Confidence Index for On-Line Handwritten Sentence Recognition

    This paper presents a word extraction approach that uses a confidence index to limit the total number of segmentation hypotheses, with the aim of extending our on-line sentence recognition system to on-the-fly recognition. Our initial word extraction task is based on characterizing the gap between each pair of consecutive strokes in the on-line signal of the handwritten sentence. A confidence index is associated with the gap classification result in order to evaluate its reliability. A reconsideration process is then performed to create additional segmentation hypotheses, so that the correct segmentation is present among the hypotheses. In this process, we control the total number of segmentation hypotheses to limit the complexity of the recognition process and thus the execution time. The approach is evaluated on a test set of 425 English sentences written by 17 writers, using different metrics to analyze the impact of the word extraction task on the performance of the whole sentence recognition system. The word extraction task using the best reconsideration strategy achieves a 97.94% word extraction rate and an 84.85% word recognition rate, which represents a 33.1% decrease in word error rate relative to the initial word extraction task (with no segmentation hypothesis reconsideration).
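The idea of reconsidering only unreliable gaps can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's classifier: the width threshold, the linear confidence formula, and the names `classify_gaps` and `segmentation_hypotheses` are all hypothetical.

```python
from itertools import product


def classify_gaps(widths, threshold, margin):
    """Label each gap (True = inter-word) and attach a confidence in [0, 1].

    Assumed rule: a gap wider than `threshold` separates two words, and the
    confidence grows with the distance from the threshold, saturating at `margin`.
    """
    results = []
    for w in widths:
        label = w > threshold
        confidence = min(1.0, abs(w - threshold) / margin)
        results.append((label, confidence))
    return results


def segmentation_hypotheses(classified, min_confidence, max_uncertain):
    """Enumerate labelings in which only the least reliable gaps are flipped.

    At most `max_uncertain` gaps are reconsidered, so at most
    2 ** max_uncertain hypotheses are produced.
    """
    uncertain = sorted(
        (i for i, (_, c) in enumerate(classified) if c < min_confidence),
        key=lambda i: classified[i][1],
    )[:max_uncertain]
    base = [label for label, _ in classified]
    hypotheses = []
    for flips in product([False, True], repeat=len(uncertain)):
        labels = list(base)
        for i, flip in zip(uncertain, flips):
            if flip:
                labels[i] = not labels[i]
        hypotheses.append(labels)
    return hypotheses


widths = [2.0, 9.5, 30.0, 11.0, 3.0]
classified = classify_gaps(widths, threshold=10.0, margin=10.0)
hyps = segmentation_hypotheses(classified, min_confidence=0.2, max_uncertain=2)
```

With these example widths, only the two gaps closest to the threshold are reconsidered, giving four hypotheses instead of the 32 that exhaustive enumeration over five gaps would produce.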

    Text-Line and Word Detection in Document Images Using Optimization Methods

    Doctoral dissertation, Graduate School of Seoul National University, Department of Electrical and Computer Engineering, August 2015. 조남익 (Nam Ik Cho). Locating text-lines and segmenting words in a document image are important processes for various document image processing applications such as optical character recognition, document rectification, layout analysis and document image compression. Thus, there has been much research in this area, and the segmentation of machine-printed documents scanned by flatbed scanners has matured to some extent. However, the segmentation of handwritten documents is still considered a challenging problem, since the features of a handwritten document are irregular and diverse, depending on the writer and the language. To address this problem, this dissertation presents new segmentation algorithms that extract text-lines and words from a document image, based on a new super-pixel representation method and a new energy minimization framework derived from its characteristics. The overview of the proposed algorithms is as follows. First, this dissertation presents a text-line extraction algorithm for handwritten documents based on an energy minimization framework with a new super-pixel representation scheme. In order to deal with documents in various languages, a language-independent text-line extraction algorithm is developed based on the super-pixel representation with normalized connected components (CCs). Due to this normalization, the proposed method is able to estimate the states of super-pixels for a range of different languages and writing styles. From the estimated states, an energy function is formulated whose minimization yields text-lines. Experimental results show that the proposed method yields state-of-the-art performance on various handwritten databases. Second, a preprocessing method of historical documents for text-line detection is presented. Unlike modern handwritten documents, historical documents suffer from various types of degradation.
To alleviate these problems, a preprocessing algorithm including robust binarization and noise removal is introduced in this dissertation. For the robust binarization of historical documents, global and local thresholding binarization methods are combined to deal with various degradations such as stains and faint characters. Also, the energy minimization framework is modified to fit the characteristics of historical documents. Experimental results on two historical databases show that the proposed preprocessing method with the text-line detection algorithm achieves the best detection performance on severely degraded historical documents. Third, this dissertation presents a word segmentation algorithm based on a structured learning framework. The word segmentation problem is formulated as a labeling problem that assigns a label (intra-word or inter-word gap) to each gap between the characters in a given text-line. In order to address the feature irregularities found especially in handwritten documents, the word segmentation problem is formulated as a binary quadratic assignment problem that considers pairwise correlations between the gaps as well as the likelihoods of individual gaps, based on the proposed text-line extraction results. Even though many parameters are involved in the formulation, all parameters are estimated within the structured SVM framework, so that the proposed method works well regardless of writing styles and written languages without user-defined parameters.
Experimental results on the ICDAR 2009/2013 handwriting segmentation databases show that the proposed method achieves state-of-the-art performance on Latin-based and Indian languages.
    Contents: Abstract; Contents; List of Figures; List of Tables
    1 Introduction: 1.1 Text-line Detection of Document Images; 1.2 Word Segmentation of Document Images; 1.3 Summary of Contribution
    2 Related Work: 2.1 Text-line Detection; 2.2 Word Segmentation
    3 Text-line Detection of Handwritten Document Images based on Energy Minimization
        3.1 Proposed Approach for Text-line Detection: 3.1.1 State Estimation of a Document Image; 3.1.2 Problems with Under-segmented Super-pixels for Estimating States; 3.1.3 A New Super-pixel Representation Method based on CC Partitioning; 3.1.4 Cost Function for Text-line Segmentation; 3.1.5 Minimization of Cost Function
        3.2 Experimental Results of Various Handwritten Databases: 3.2.1 Evaluation Measure; 3.2.2 Parameter Selection; 3.2.3 Experiment on HIT-MW Database; 3.2.4 Experiment on ICDAR 2009/2013 Handwriting Segmentation Databases; 3.2.5 Experiment on IAM Handwriting Database; 3.2.6 Experiment on UMD Handwritten Arabic Database; 3.2.7 Limitations
    4 Preprocessing Method of Historical Document for Text-line Detection
        4.1 Characteristics of Historical Documents
        4.2 A Combined Approach for the Binarization of Historical Documents
        4.3 Experimental Results of Text-line Detection for Historical Documents: 4.3.1 Evaluation Measure and Configurations; 4.3.2 George Washington Database; 4.3.3 ICDAR 2015 ANDAR Datasets
    5 Word Segmentation Method for Handwritten Documents based on Structured Learning
        5.1 Proposed Approach for Word Segmentation: 5.1.1 Text-line Segmentation and Super-pixel Representation; 5.1.2 Proposed Energy Function for Word Segmentation
        5.2 Structured Learning Framework: 5.2.1 Feature Vector; 5.2.2 Parameter Estimation by Structured SVM
        5.3 Experimental Results
    6 Conclusions
    Bibliography; Abstract (Korean)
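The word segmentation step described in this abstract labels every gap as intra-word or inter-word by minimizing an energy with per-gap and pairwise terms. The sketch below illustrates that kind of objective only: the cost values are invented, the pairwise term is assumed to be a simple disagreement penalty, and brute-force enumeration stands in for whatever solver the dissertation uses. The structured SVM that learns the parameters is not shown.

```python
from itertools import product


def segment_gaps(unary, pairwise):
    """Minimize E(y) = sum_i unary[i][y_i] + sum_(i,j) w_ij * [y_i != y_j].

    unary[i] = (cost of label 0 = intra-word, cost of label 1 = inter-word).
    pairwise maps a gap pair (i, j) to a penalty paid when their labels differ.
    Exhaustive search over 2 ** n labelings, so only suitable for short lines.
    """
    n = len(unary)
    best_labels, best_energy = None, float("inf")
    for labels in product((0, 1), repeat=n):
        energy = sum(unary[i][labels[i]] for i in range(n))
        energy += sum(w for (i, j), w in pairwise.items() if labels[i] != labels[j])
        if energy < best_energy:
            best_labels, best_energy = labels, energy
    return list(best_labels), best_energy


# Hypothetical costs for four gaps; gaps 0, 1 and 3 are assumed to look alike.
unary = [(0.1, 2.0), (1.2, 1.0), (2.0, 0.2), (0.3, 1.5)]
pairwise = {(0, 1): 0.5, (1, 3): 0.5}
labels, energy = segment_gaps(unary, pairwise)
```

In this example gap 1 on its own slightly prefers the inter-word label, but the pairwise terms tie it to gaps 0 and 3, so the minimizer labels it intra-word. That is the effect pairwise correlations are meant to have on ambiguous gaps.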

    Advances in Character Recognition

    This book presents advances in character recognition. It consists of 12 chapters that cover a wide range of topics on different aspects of character recognition. The book is intended to serve as a reference source for academic research, for professionals working in the character recognition field, and for all those interested in the subject.

    Design of an Offline Handwriting Recognition System Tested on the Bangla and Korean Scripts

    This dissertation presents a flexible and robust offline handwriting recognition system which is tested on the Bangla and Korean scripts. Offline handwriting recognition is one of the most challenging and still unsolved problems in machine learning. While a few popular scripts (like Latin) have received a lot of attention, many other widely used scripts (like Bangla) have seen very little progress. Features such as connectedness and vowels structured as diacritics make Bangla a challenging script to recognize. A simple and robust design for offline recognition is presented which not only works reliably but can also be used for almost any alphabetic writing system. The framework has been rigorously tested on Bangla, and experiments on the Korean script, whose two-dimensional arrangement of characters makes it a challenge to recognize, demonstrate how the framework can be adapted to other scripts. The base of this design is a character spotting network which detects the location of different script elements (such as characters and diacritics) in an unsegmented word image. A transcript is formed from the detected classes based on their corresponding location information. This is the first reported lexicon-free offline recognition system for Bangla and achieves a Character Recognition Accuracy (CRA) of 94.8%. This is also one of the most flexible architectures ever presented. Recognition of Korean was achieved with a 91.2% CRA. Also, a powerful technique of autonomous tagging was developed which can drastically reduce the effort of preparing a dataset for any script. The combination of the character spotting method and the autonomous tagging brings the entire offline recognition problem very close to a singular solution. Additionally, a database named the Boise State Bangla Handwriting Dataset was developed.
This is one of the richest offline datasets currently available for Bangla and this has been made publicly accessible to accelerate the research progress. Many other tools were developed and experiments were conducted to more rigorously validate this framework by evaluating the method against external datasets (CMATERdb 1.1.1, Indic Word Dataset and REID2019: Early Indian Printed Documents). Offline handwriting recognition is an extremely promising technology and the outcome of this research moves the field significantly ahead

    Automated framework for robust content-based verification of print-scan degraded text documents

    Fraudulent documents frequently cause severe financial damage and security breaches for civil and government organizations. The rapid advances in technology and the widespread availability of personal computers have not reduced the use of printed documents. While digital documents can be verified by many robust and secure methods such as digital signatures and digital watermarks, verification of printed documents still relies on manual inspection of embedded physical security mechanisms. The objective of this thesis is to propose an efficient automated framework for robust content-based verification of printed documents. The principal issue is to achieve robustness with respect to the degradations and increased levels of noise that result from multiple cycles of printing and scanning. It is shown that classic OCR systems fail under such conditions. Moreover, OCR systems typically rely heavily on high-level linguistic structures to improve recognition rates. However, inferring knowledge about the contents of the document image from a-priori statistics is contrary to the nature of document verification. Instead, a system is proposed that utilizes specific knowledge of the document to perform highly accurate content verification based on a Print-Scan degradation model and character shape recognition. Such specific knowledge of the document is a reasonable choice for the verification domain, since the document contents must already be known in order to verify them. The system analyses digital multi-font PDF documents to generate a descriptive summary of the document, referred to as the "Document Description Map" (DDM). The DDM is later used for verifying the content of printed and scanned copies of the original documents. The system utilizes 2-D Discrete Cosine Transform based features and an adaptive hierarchical classifier trained with synthetic data generated by a Print-Scan degradation model.
The system is tested with varying degrees of Print-Scan Channel corruption on a variety of documents with corruption produced by repetitive printing and scanning of the test documents. Results show the approach achieves excellent accuracy and robustness despite the high level of noise
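For readers unfamiliar with 2-D Discrete Cosine Transform features, the sketch below computes the orthonormal 2-D DCT-II of a small glyph bitmap and keeps the low-frequency corner as a feature vector. The glyph, the choice of a top-left k x k block, and the names `dct2` and `low_frequency_features` are assumptions for illustration. The thesis does not state which coefficients it selects.

```python
import math


def dct2(block):
    """Orthonormal 2-D DCT-II of a rectangular block given as a list of rows."""
    rows, cols = len(block), len(block[0])

    def alpha(k, n):
        # Scaling that makes the transform orthonormal.
        return math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)

    out = [[0.0] * cols for _ in range(rows)]
    for u in range(rows):
        for v in range(cols):
            total = 0.0
            for y in range(rows):
                for x in range(cols):
                    total += (block[y][x]
                              * math.cos(math.pi * (2 * y + 1) * u / (2 * rows))
                              * math.cos(math.pi * (2 * x + 1) * v / (2 * cols)))
            out[u][v] = alpha(u, rows) * alpha(v, cols) * total
    return out


def low_frequency_features(block, k):
    """Flatten the top-left k x k coefficients into a feature vector."""
    coeffs = dct2(block)
    return [coeffs[u][v] for u in range(k) for v in range(k)]


# A crude 4 x 4 bitmap standing in for a character image.
glyph = [[0, 1, 1, 0],
         [1, 0, 0, 1],
         [1, 1, 1, 1],
         [1, 0, 0, 1]]
features = low_frequency_features(glyph, 2)
```

Low-frequency coefficients capture the coarse shape of a character and change little under fine-grained noise, which is one common reason for using them on degraded scans.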

    Mathematical Expression Recognition based on Probabilistic Grammars

    Mathematical notation is well known and used all over the world. Humankind has evolved from simple methods for representing counts to the current well-defined mathematical notation, which is able to account for complex problems. Furthermore, mathematical expressions constitute a universal language in scientific fields, and many information resources containing mathematics have been created during the last decades. However, in order to access all that information efficiently, scientific documents have to be digitized or produced directly in electronic formats. Although most people are able to understand and produce mathematical information, introducing math expressions into electronic devices requires learning specific notations or using editors. Automatic recognition of mathematical expressions aims at filling this gap between the knowledge of a person and the input accepted by computers. This way, printed documents containing math expressions could be automatically digitized, and handwriting could be used for direct input of math notation into electronic devices. This thesis is devoted to developing an approach for mathematical expression recognition. In this document we propose an approach for recognizing any type of mathematical expression (printed or handwritten) based on probabilistic grammars. To do so, we develop the formal statistical framework from which several probability distributions are derived. Throughout the document, we deal with the definition and estimation of all these probabilistic sources of information. Finally, we define the parsing algorithm that globally computes the most probable mathematical expression for a given input according to the statistical framework. An important point in this study is to provide objective performance evaluation and to report results using public data and standard metrics. We inspected the problems of automatic evaluation in this field and looked for the best solutions. We also report several experiments using public databases, and we participated in several international competitions. Furthermore, we have released most of the software developed in this thesis as open source. We also explore some of the applications of mathematical expression recognition. In addition to the direct applications of transcription and digitization, we report two important proposals. First, we developed mucaptcha, a method to tell humans and computers apart by means of math handwriting input, which represents a novel application of math expression recognition. Second, we tackled the problem of layout analysis of structured documents using the statistical framework developed in this thesis, because both are two-dimensional problems that can be modeled with probabilistic grammars. The approach developed in this thesis for mathematical expression recognition has obtained good results at different levels. It has produced several scientific publications in international conferences and journals, and has received awards in international competitions.
    Álvaro Muñoz, F. (2015). Mathematical Expression Recognition based on Probabilistic Grammars [Unpublished doctoral thesis]. Universitat Politècnica de València. https://doi.org/10.4995/Thesis/10251/51665
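The notion of a parsing algorithm that returns the most probable expression under a probabilistic grammar can be illustrated with a Viterbi-style CYK parser. This sketch is a deliberate simplification: it parses a one-dimensional token sequence with a toy grammar in Chomsky normal form, whereas the thesis addresses two-dimensional layouts. The grammar, its probabilities, and the function name `most_probable_parse` are hypothetical.

```python
import math


def most_probable_parse(tokens, lexical, binary, start):
    """Viterbi CYK for a probabilistic grammar in Chomsky normal form.

    lexical maps (nonterminal, token) to a probability.
    binary maps (parent, left, right) to a probability.
    Returns (log_probability, tree) for `start`, or None if no parse exists.
    """
    n = len(tokens)
    # chart[i][j] maps a nonterminal to its best (log_prob, tree) over tokens[i:j].
    chart = [[{} for _ in range(n + 1)] for _ in range(n + 1)]
    for i, tok in enumerate(tokens):
        for (lhs, word), p in lexical.items():
            if word == tok:
                chart[i][i + 1][lhs] = (math.log(p), (lhs, tok))
    for span in range(2, n + 1):
        for i in range(n - span + 1):
            j = i + span
            for k in range(i + 1, j):
                for (lhs, left, right), p in binary.items():
                    if left in chart[i][k] and right in chart[k][j]:
                        lp = math.log(p) + chart[i][k][left][0] + chart[k][j][right][0]
                        if lhs not in chart[i][j] or lp > chart[i][j][lhs][0]:
                            tree = (lhs, chart[i][k][left][1], chart[k][j][right][1])
                            chart[i][j][lhs] = (lp, tree)
    return chart[0][n].get(start)


# Toy grammar: Expr -> Expr PlusExpr | 'x' | 'y', PlusExpr -> Plus Expr, Plus -> '+'.
lexical = {("Expr", "x"): 0.4, ("Expr", "y"): 0.3, ("Plus", "+"): 1.0}
binary = {("Expr", "Expr", "PlusExpr"): 0.3, ("PlusExpr", "Plus", "Expr"): 1.0}
result = most_probable_parse(["x", "+", "y"], lexical, binary, "Expr")
```

Working in log-probabilities avoids numerical underflow on longer inputs, and storing only the best entry per nonterminal and span is what makes the search a global maximization instead of an enumeration of all parses.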