2,116 research outputs found

    Impact of Features and Classifiers Combinations on the Performances of Arabic Recognition Systems

    Arabic recognition is a very challenging task that is beginning to draw the attention of the OCR community. This work presents our latest contributions to this task, exploring the impact of several feature and classifier combinations on the performance of the systems we developed. Different types of writing were considered (machine-printed, multi-font, handwritten, unconstrained, multi-writer, bi-dimensional, large-vocabulary, ancient manuscripts). For each type of writing, we considered the most appropriate features and classifiers: contextual primitives to compensate for Arabic morphological variation, statistical features to recognize mathematical symbols, and spectral features, mainly run-length histogram-based features and histogram of oriented gradients-based descriptors, to discriminate between machine-printed/handwritten and Arabic/Latin words. We also used the shape context descriptor for touching-character segmentation, which was useful for training the models in the template-based recognition system. We took advantage of the generalized Hough transform to spot separator words in ancient Arabic manuscripts. In addition, Bayesian networks are used to capture writing uncertainty, and transparent neural networks to exploit the morphological aspect of the Arabic language and integrate linguistic knowledge into the recognition process. The proposed systems are designed based on the characteristics, similarities, and differences of Arabic writings

    Recognition of Arabic Mathematical Formulas by a Syntax-Directed System

    This contribution presents a syntax-directed system that recognizes Arabic mathematical formulas and returns the recognition results in MathML format. A set of replacement rules is defined by a coordinate grammar to analyze Arabic mathematical formulas. This grammar is applied by relying on symbol recognition and on the analysis of the symbols' spatial arrangement. We used k-nearest neighbors to recognize Arabic mathematical symbols, and a parser that is both top-down and bottom-up, based on operator dominance, to recursively divide the formula into simpler subformulas. In the proposed system, the symbol recognition and structural analysis modules interact closely. It is thus possible to use structural information to help resolve ambiguous or confused symbols. This syntax-directed recognition system has been demonstrated successfully on several types of formulas found in various Arabic scientific documents
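    The k-nearest-neighbors step mentioned in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the feature vectors, the labels, the Euclidean distance, and the name `knn_classify` are all assumptions made for illustration.

```python
import math
from collections import Counter

def knn_classify(train, query, k=3):
    """Label `query` by majority vote among its k nearest training samples.

    `train` is a list of (feature_vector, label) pairs. The features are
    hypothetical; the abstract does not say which symbol features were used.
    """
    # Sort training samples by Euclidean distance to the query, keep the k closest.
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    # Count the labels of those neighbours and return the most frequent one.
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

# Toy 2-D feature vectors standing in for two symbol classes (assumed data).
samples = [
    ((0.0, 0.0), "plus"),
    ((0.0, 1.0), "plus"),
    ((5.0, 5.0), "sigma"),
    ((6.0, 5.0), "sigma"),
    ((5.0, 6.0), "sigma"),
]
predicted = knn_classify(samples, (0.2, 0.3), k=3)
```

    In a system like the one described, a structural-analysis stage could override such a vote when the spatial context makes another symbol more plausible.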

    Offline Recognition of Handwritten Mathematical Notation

    Automatic recognition of mathematical expressions is one of the problems of pattern recognition, because mathematics represents a valuable source of information in many research areas. Writing mathematical expressions by hand is a means of communication used to transmit information and knowledge, with which documents containing mathematical notation can be produced easily. This process can become tedious when the notation is written in a typesetting language that a computer can process, such as LaTeX or MathML, among others. Mathematical expression recognition systems use two different methods: offline and online. This thesis studies the performance of an offline system and describes the basic steps for achieving better recognition accuracy, which are divided into two main stages: recognition of the symbols of the mathematical equations and analysis of the structure in which they are arranged. The goal is to convert a handwritten mathematical expression into an equivalent expression in a word-processing system such as TeX

    Features and Algorithms for Visual Parsing of Handwritten Mathematical Expressions

    Math expressions are an essential part of scientific documents. Handwritten math expression recognition can benefit human-computer interaction, especially in the education domain, and is a critical part of document recognition and analysis. Parsing the spatial arrangement of symbols is an essential part of math expression recognition. A variety of parsing techniques have been developed during the past three decades, and they fall into two groups. The first group is graph-based parsing, which selects a path or sub-graph that obeys some rule to form a possible interpretation of the given expression. The second group is grammar-driven parsing, in which grammars and related parameters are defined manually for different tasks. The time complexity of both groups of parsers is high, and they often impose strict constraints to reduce the computation. The aim of this thesis is to work towards building a straightforward and effective parser with as few constraints as possible. First, we propose using a line of sight graph to represent the layout of strokes and symbols in math expressions. It achieves a higher F-score than other graph representations and reduces the search space for parsing. Second, we modify the shape context feature with Parzen window density estimation. This feature set works well for symbol segmentation, symbol classification, and symbol layout analysis. We get a higher symbol segmentation F-score than other systems on the CROHME 2014 dataset. Finally, we develop a Maximum Spanning Tree (MST) based parser using Edmonds' algorithm, which extracts an MST from the directed line of sight graph in two passes: first symbols are segmented, and then symbols and spatial relationships are labeled. The time complexity of our MST-based parsing is lower than that of CYK parsing with context-free grammars. Our MST-based parsing also obtains a higher structure rate and expression rate than CYK parsing when symbol segmentation is accurate. A correct structure means that the structure of the symbol layout tree is correct, even though the edge labels in the symbol layout tree might be wrong. The performance of our math expression recognition system with MST-based parsing is competitive on the CROHME 2012 and 2014 datasets. For future work, how to incorporate symbol classifier results and correct segmentation errors in MST-based parsing needs more research
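    To make the MST step concrete, here is a minimal sketch of the Chu-Liu/Edmonds procedure for a maximum spanning arborescence on a small directed graph. It is not the thesis's parser: the node numbering, the edge weights, and the function name `max_arborescence_weight` are assumptions, and for brevity it returns only the total weight rather than the selected edges.

```python
import math

def max_arborescence_weight(n, root, edges):
    """Total weight of a maximum spanning arborescence rooted at `root`.

    `edges` is a list of (u, v, w) directed edges over nodes 0..n-1.
    Returns None if some node cannot be reached from the root.
    """
    # Negate weights so the classic minimum-cost formulation can be used.
    edges = [(u, v, -w) for u, v, w in edges]
    total = 0
    while True:
        # Step 1: pick the cheapest incoming edge for every node.
        in_w = [math.inf] * n
        pre = [-1] * n
        for u, v, w in edges:
            if u != v and w < in_w[v]:
                in_w[v], pre[v] = w, u
        if any(v != root and in_w[v] == math.inf for v in range(n)):
            return None
        in_w[root] = 0
        # Step 2: look for cycles among the chosen edges.
        comp = [-1] * n
        seen = [-1] * n
        count = 0
        for v in range(n):
            total += in_w[v]
            x = v
            while seen[x] != v and comp[x] == -1 and x != root:
                seen[x] = v
                x = pre[x]
            if x != root and comp[x] == -1:
                # x lies on a new cycle: give all its nodes one component id.
                y = pre[x]
                while y != x:
                    comp[y] = count
                    y = pre[y]
                comp[x] = count
                count += 1
        if count == 0:
            return -total  # no cycle left: the chosen edges form the tree
        # Step 3: contract each cycle into one node and reweight edges.
        for v in range(n):
            if comp[v] == -1:
                comp[v] = count
                count += 1
        edges = [(comp[u], comp[v], w - in_w[v])
                 for u, v, w in edges if comp[u] != comp[v]]
        root, n = comp[root], count

# Toy graph with assumed weights; nodes 1 and 2 form a cycle under greedy choice.
toy_edges = [(0, 1, 5), (0, 2, 1), (1, 2, 4), (2, 1, 6), (1, 3, 2), (2, 3, 3)]
best = max_arborescence_weight(4, 0, toy_edges)
```

    In the toy graph the greedy choice creates a cycle between nodes 1 and 2, which the contraction step resolves by entering the cycle through the edge from node 0 to node 1.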

    Special Libraries, October 1955

    Volume 46, Issue 8 https://scholarworks.sjsu.edu/sla_sl_1955/1007/thumbnail.jp

    Latin as the Language of Science and Learning

    This book approaches its topic from three angles: diachronic semantics, a panorama of science and scientific writing in Latin from antiquity up to the present, and linguistic forays that try to demarcate scientific Latin from other registers and to identify groups of different language use among the authors introduced in the panorama. The transition of science from Greek through Latin to the modern vernaculars is especially emphasised