3,830 research outputs found

    HAMEX - a Handwritten and Audio Dataset of Mathematical Expressions

    In this paper, we present HAMEX, a new public dataset that contains mathematical expressions available in their on-line handwritten form and in their audio spoken form. We have designed this dataset so that, given a mathematical expression, its handwritten signal and its audio signal can be used jointly to design multimodal recognition systems. Here, we describe the different steps that allowed us to acquire this dataset, from the creation of the mathematical expression corpora (including expressions from Wikipedia pages) to the segmentation and the transcription of the collected data, via the data collection process itself. Currently, the dataset contains 4,350 on-line handwritten mathematical expressions written by 58 writers, and the corresponding audio expressions (in French) spoken by 58 speakers. The ground truth is also provided both for the handwritten expressions (as INKML files with the digital ink, the symbol segmentation, and the MATHML structure) and for the audio expressions (as XML files with the transcriptions of the spoken expressions).

    Math Search for the Masses: Multimodal Search Interfaces and Appearance-Based Retrieval

    We summarize math search engines and search interfaces produced by the Document and Pattern Recognition Lab in recent years, and in particular the min math search interface and the Tangent search engine. Source code for both systems is publicly available. "The Masses" refers to our emphasis on creating systems for mathematical non-experts, who may be looking to define unfamiliar notation, or browse documents based on the visual appearance of formulae rather than their mathematical semantics. Comment: Paper for Invited Talk at 2015 Conference on Intelligent Computer Mathematics (July, Washington DC).

    Probabilistic mathematical formula recognition using a 2D context-free graph grammar

    We present a probabilistic framework for the mathematical expression recognition problem. The developed system is flexible in that its grammar can be extended easily, thanks to its graph grammar, which eliminates the need for specifying rule precedence. It is also optimal in the sense that all possible interpretations of the expressions are expanded without making early commitments or hard decisions. In this paper, we give an overview of the whole system and describe in detail the graph grammar and the parsing process used in the system, along with some preliminary results on character, structure, and expression recognition performance.

    Query-Driven Global Graph Attention Model for Visual Parsing: Recognizing Handwritten and Typeset Math Formulas

    We present a new visual parsing method based on standard Convolutional Neural Networks (CNNs) for handwritten and typeset mathematical formulas. The Query-Driven Global Graph Attention (QD-GGA) parser employs multi-task learning, using a single feature representation for locating, classifying, and relating symbols. QD-GGA parses formulas by first constructing a Line-Of-Sight (LOS) graph over the input primitives (e.g., handwritten strokes or connected components in images). Second, class distributions for LOS nodes and edges are obtained using query-specific feature filters (i.e., attention) in a single feed-forward pass. This allows end-to-end structure learning using a joint loss over primitive node and edge class distributions. Finally, a Maximum Spanning Tree (MST) is extracted from the weighted graph using Edmonds' Arborescence Algorithm. The model may be run recurrently over the input graph, updating attention to focus on symbols detected in the previous iteration. QD-GGA does not require additional grammar rules and the language model is learned from the sets of symbols/relationships and the statistics over them in the training set. We benchmark our system against both handwritten and typeset state-of-the-art math recognition systems. Our preliminary results show that this is a promising new approach for visual parsing of math formulas. Using recurrent execution, symbol detection is near perfect for both handwritten and typeset formulas: we obtain a symbol f-measure of over 99.4% for both the CROHME (handwritten) and INFTYMCCDB-2 (typeset formula image) datasets. Our method is also much faster in both training and execution than state-of-the-art RNN-based formula parsers. The unlabeled structure detection of QD-GGA is competitive with encoder-decoder models, but QD-GGA symbol and relationship classification is weaker. We believe this may be addressed through increased use of spatial features and global context.

    Offline Recognition of Handwritten Mathematical Notation

    Automatic recognition of mathematical expressions is one of the problems of pattern recognition, because mathematics is a valuable source of information in many research areas. Handwriting mathematical expressions is a means of communication used to transmit information and knowledge, and it makes it easy to produce documents that contain mathematical notation. The process can become tedious when the expressions must be written in a typesetting language that a computer can process, such as LaTeX or MathML, among others. Mathematical expression recognition systems follow two different methods: offline and online. This thesis studies the performance of an offline system and describes the basic steps for achieving better recognition accuracy, which fall into two main stages: recognition of the symbols in the mathematical equations, and analysis of the structure in which they are arranged. The goal is to convert a handwritten mathematical expression into an equivalent expression in a text-processing system such as TeX.