2,628 research outputs found
IMEGE: Image-based Mathematical Expression Global Error
Mathematical expression recognition is an active research field related to document image analysis and typesetting. Several approaches have been proposed to tackle this problem, and automatic methods for performance evaluation are required. For evaluation purposes, mathematical expressions are usually represented as a coded string such as LaTeX or MathML. This representation is ambiguous, because the same expression can be coded in several ways. For that reason, previous approaches either analyzed recognition results manually or reported partial errors such as symbol error rate. In this study, we present a novel global performance evaluation measure for mathematical expressions based on image matching. Using an image representation resolves the representation ambiguity, much as human beings do. The proposed evaluation method is a global error measure that also provides local information about the recognition result.
Álvaro Muñoz, F.; Sánchez Peiró, JA.; Benedí Ruiz, JM. (2011). IMEGE: Image-based Mathematical Expression Global Error. http://hdl.handle.net/10251/1308
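The ambiguity problem described in this abstract can be illustrated with a small sketch. It is not the IMEGE measure itself, only a demonstration of why string-level comparison is unreliable: two LaTeX strings that typeset the same expression still differ as strings. The function name `edit_distance` and the example strings are assumptions chosen for illustration.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance between two strings (row-by-row dynamic programming)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


# Two encodings that typeset identically: x with superscript 2 and subscript i.
reference = r"x^{2}_{i}"
hypothesis = r"x_{i}^{2}"

distance = edit_distance(reference, hypothesis)
print(f"string-level distance: {distance}")
```

A string-based metric reports a nonzero error here even though the rendered expressions are the same, which is the motivation the abstract gives for comparing images instead.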
Recognition of on-line handwritten mathematical expressions using 2D stochastic context-free grammars and hidden Markov models
[EN] This paper describes a formal model for the recognition of on-line handwritten mathematical expressions
using 2D stochastic context-free grammars and hidden Markov models. Hidden Markov models are used
to recognize mathematical symbols, and a stochastic context-free grammar is used to model the relation
between these symbols. This formal model makes it possible to use classic algorithms for parsing and stochastic
estimation. In this way, first, the model is able to capture many of the variability phenomena that
appear in on-line handwritten mathematical expressions during the training process. And second, the
parsing process can make decisions taking into account only stochastic information, and avoiding heuristic
decisions. The proposed model participated in a contest of mathematical expression recognition and it
obtained the best results at different levels.
© 2012 Elsevier B.V. All rights reserved.
Work supported by the EC (FEDER/FSE) and the Spanish MEC/MICINN under the MIPRCV "Consolider Ingenio 2010" program (CSD2007-00018), the MITTRAL (TIN2009-14633-C03-01) project, the FPU Grant (AP2009-4363), and by the Generalitat Valenciana under the Grant Prometeo/2009/014.
Álvaro Muñoz, F.; Sánchez Peiró, JA.; Benedí Ruiz, JM. (2014). Recognition of on-line handwritten mathematical expressions using 2D stochastic context-free grammars and hidden Markov models. Pattern Recognition Letters. 35:58-67. https://doi.org/10.1016/j.patrec.2012.09.023
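The abstract above pairs symbol-level models with a stochastic grammar and classic parsing algorithms. As a rough illustration of the parsing side only, the following sketch runs a Viterbi-style CYK parser over a one-dimensional token sequence. It is a simplification, not the authors' 2D model: the toy grammar, its probabilities, and the name `viterbi_cyk` are assumptions made for this example, and the lexical probabilities stand in for scores that a symbol recognizer such as an HMM would supply.

```python
from collections import defaultdict

# Hypothetical toy grammar in Chomsky normal form (probabilities are made up).
LEXICAL = {            # (nonterminal, token) -> probability
    ("Expr", "x"): 0.3,
    ("Expr", "y"): 0.3,
    ("Op", "+"): 1.0,
}
BINARY = {             # (parent, left child, right child) -> probability
    ("Expr", "Expr", "OpExpr"): 0.4,
    ("OpExpr", "Op", "Expr"): 1.0,
}


def viterbi_cyk(tokens, lexical, binary, start="Expr"):
    """Return the probability of the most likely derivation of tokens from start."""
    n = len(tokens)
    # chart[(i, j)] maps a nonterminal to its best probability over tokens[i:j].
    chart = defaultdict(dict)
    for i, tok in enumerate(tokens):
        for (lhs, word), p in lexical.items():
            if word == tok:
                cell = chart[(i, i + 1)]
                cell[lhs] = max(p, cell.get(lhs, 0.0))
    for span in range(2, n + 1):
        for i in range(n - span + 1):
            j = i + span
            for k in range(i + 1, j):          # split point
                for (lhs, left, right), p in binary.items():
                    lp = chart[(i, k)].get(left)
                    rp = chart[(k, j)].get(right)
                    if lp is None or rp is None:
                        continue
                    cand = p * lp * rp
                    if cand > chart[(i, j)].get(lhs, 0.0):
                        chart[(i, j)][lhs] = cand
    return chart[(0, n)].get(start, 0.0)


best = viterbi_cyk(["x", "+", "y"], LEXICAL, BINARY)
print(f"best derivation probability: {best:.3f}")
```

Because every decision is a product of rule and symbol probabilities, the parser chooses the interpretation purely from stochastic information, which is the property the abstract emphasizes.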
Mathematical Expression Recognition based on Probabilistic Grammars
[EN] Mathematical notation is well-known and used all over the
world. Humankind has evolved from simple methods of representing
counts to the current well-defined math notation, able to account for
complex problems. Furthermore, mathematical expressions constitute a
universal language in scientific fields, and many information
resources containing mathematics have been created during the last
decades. However, in order to efficiently access all that information,
scientific documents have to be digitized or produced directly in
electronic formats.
Although most people are able to understand and produce mathematical
information, introducing math expressions into electronic devices
requires learning specific notations or using editors. Automatic
recognition of mathematical expressions aims at filling this gap
between the knowledge of a person and the input accepted by
computers. This way, printed documents containing math expressions
could be automatically digitized, and handwriting could be used for
direct input of math notation into electronic devices.
This thesis is devoted to developing an approach for mathematical
expression recognition. In this document we propose an approach for
recognizing any type of mathematical expression (printed or
handwritten) based on probabilistic grammars. To do so, we
develop a formal statistical framework from which several
probability distributions are derived. Throughout the document, we deal with the
definition and estimation of all these probabilistic sources of
information. Finally, we define the parsing algorithm that globally
computes the most probable mathematical expression for a given input
according to the statistical framework.
An important point in this study is to provide objective performance
evaluation and report results using public data and standard
metrics. We inspected the problems of automatic evaluation in this
field and looked for the best solutions. We also report several
experiments using public databases and we participated in several
international competitions. Furthermore, we have released most of the
software developed in this thesis as open source.
We also explore some of the applications of mathematical expression
recognition. In addition to the direct applications of transcription
and digitization, we report two important proposals. First, we
developed mucaptcha, a method to tell humans and computers apart by
means of math handwriting input, which represents a novel application
of math expression recognition. Second, we tackled the problem of
layout analysis of structured documents using the statistical
framework developed in this thesis, because both are two-dimensional
problems that can be modeled with probabilistic grammars.
The approach developed in this thesis for mathematical expression
recognition has obtained good results at different levels. It has
produced several scientific publications in international conferences
and journals, and has been awarded in international competitions.
Álvaro Muñoz, F. (2015). Mathematical Expression Recognition based on Probabilistic Grammars [Unpublished doctoral thesis]. Universitat Politècnica de València. https://doi.org/10.4995/Thesis/10251/51665
Features and Algorithms for Visual Parsing of Handwritten Mathematical Expressions
Math expressions are an essential part of scientific documents. Handwritten math expression recognition can benefit human-computer interaction, especially in the education domain, and is a critical part of document recognition and analysis.
Parsing the spatial arrangement of symbols is an essential part of math expression recognition. A variety of parsing techniques have been developed during the past three decades, and they fall into two groups. The first group is graph-based parsing, which selects a path or sub-graph that obeys some rule to form a possible interpretation of the given expression. The second group is grammar-driven parsing, in which grammars and related parameters are defined manually for different tasks. The time complexity of both groups of techniques is high, and they often impose strict constraints to reduce the computation.
This thesis works towards a straightforward and effective parser with as few constraints as possible. First, we propose using a line-of-sight graph to represent the layout of strokes and symbols in math expressions. It achieves a higher F-score than other graph representations and reduces the search space for parsing. Second, we modify the shape context feature with Parzen window density estimation. This feature set works well for symbol segmentation, symbol classification, and symbol layout analysis. We obtain a higher symbol segmentation F-score than other systems on the CROHME 2014 dataset. Finally, we develop a Maximum Spanning Tree (MST) based parser using Edmonds' algorithm, which extracts an MST from the directed line-of-sight graph in two passes: first symbols are segmented, and then symbols and spatial relationships are labeled. The time complexity of our MST-based parsing is lower than that of CYK parsing with context-free grammars. Our MST-based parsing also obtains a higher structure rate and expression rate than CYK parsing when symbol segmentation is accurate. A structure counts as correct when the structure of the symbol layout tree is right, even if the labels of edges in the tree are wrong. The performance of our math expression recognition system with MST-based parsing is competitive on the CROHME 2012 and 2014 datasets.
Future work should investigate how to incorporate symbol classifier results and how to correct segmentation errors in MST-based parsing.
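The MST extraction step mentioned in this abstract can be sketched with a compact recursive version of the Chu-Liu/Edmonds algorithm for a maximum spanning arborescence. This is a minimal sketch, not the thesis implementation: the function names, the symbol graph for "x^2 + 1", and the edge weights (standing in for classifier confidences on line-of-sight edges) are all hypothetical, and the code assumes every node is reachable from the root.

```python
from typing import Dict, Hashable, List, Optional, Set, Tuple

Edge = Tuple[Hashable, Hashable]


def _find_cycle(parent: Dict[Hashable, Hashable]) -> Optional[List[Hashable]]:
    """Return the nodes of one cycle in the child -> parent map, or None."""
    for start in parent:
        path, seen, node = [], set(), start
        while node in parent and node not in seen:
            seen.add(node)
            path.append(node)
            node = parent[node]
        if node in seen:
            return path[path.index(node):]
    return None


def max_spanning_arborescence(edges: Dict[Edge, float], root: Hashable) -> Set[Edge]:
    """Chu-Liu/Edmonds; assumes every node is reachable from root."""
    # Greedy step: keep the heaviest incoming edge of every non-root node.
    parent: Dict[Hashable, Hashable] = {}
    for (u, v), w in edges.items():
        if v == root or u == v:
            continue
        if v not in parent or w > edges[(parent[v], v)]:
            parent[v] = u
    cycle = _find_cycle(parent)
    if cycle is None:
        return {(u, v) for v, u in parent.items()}

    # Contract the cycle into one super node and re-score entering edges.
    in_cycle = set(cycle)
    super_node = ("contracted", tuple(cycle))
    contracted: Dict[Edge, float] = {}
    origin: Dict[Edge, Edge] = {}
    for (u, v), w in edges.items():
        if u == v or (u in in_cycle and v in in_cycle):
            continue
        if v in in_cycle:
            key, score = (u, super_node), w - edges[(parent[v], v)]
        elif u in in_cycle:
            key, score = (super_node, v), w
        else:
            key, score = (u, v), w
        if key not in contracted or score > contracted[key]:
            contracted[key] = score
            origin[key] = (u, v)

    # Solve the smaller problem, then expand the super node again.
    tree = {origin[e] for e in max_spanning_arborescence(contracted, root)}
    entry = next(v for (_, v) in tree if v in in_cycle)  # where the tree enters the cycle
    tree.update((parent[v], v) for v in cycle if v != entry)
    return tree


# Hypothetical symbol graph for "x^2 + 1"; weights are made-up confidences.
graph = {
    ("x", "2"): 0.9, ("x", "+"): 0.8, ("2", "+"): 0.4, ("+", "2"): 0.3,
    ("+", "1"): 0.95, ("2", "1"): 0.2, ("x", "1"): 0.1,
}
layout_tree = max_spanning_arborescence(graph, root="x")
print(sorted(layout_tree))
```

The greedy step alone suffices when the best incoming edges happen to form a tree; the contraction step handles the case where they form a cycle, which is what distinguishes the directed problem from an undirected MST.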
Reconocimiento de notación matemática escrita a mano fuera de línea
Automatic recognition of mathematical expressions is one of the problems of pattern recognition, because mathematics represents a valuable source of information in many research areas. Writing mathematical expressions by hand is a means of communication used to transmit information and knowledge, and it makes it easy to produce documents that contain mathematical notation. The process can become tedious, however, when the expressions must be written in a typesetting language that a computer can process, such as LaTeX or MathML, among others. Mathematical expression recognition systems follow one of two methods: off-line or on-line. This thesis studies the performance of an off-line system and describes the basic steps for achieving better recognition accuracy, which are divided into two main stages: recognition of the symbols in the mathematical equations and analysis of the structure in which they are arranged. In order to convert a handwritten mathematical expression into an equivalent expression in a word-processing system, such as TeX
Applying Hierarchical Contextual Parsing with Visual Density and Geometric Features to Typeset Formula Recognition
We demonstrate that recognition of scanned typeset mathematical expression images can be done by extracting maximum spanning trees from line-of-sight graphs weighted using geometric and visual density features. The approach used is hierarchical contextual parsing (HCP): hierarchical in the sense of starting with connected components and building up to the symbol level using visual, spatial, and contextual features of connected components. Once connected components have been segmented into symbols, a new set of spatial, visual, and contextual features is extracted. One set of visual features is used for symbol classification, and another for parsing. The features are used in parsing to assign classifications and confidences to edges in a line-of-sight symbol graph. Layout trees describe expression structure in terms of spatial relations between symbols, such as horizontal, subscript, and superscript. Edmonds' algorithm is used to extract a maximum spanning tree from the weighted graph. Segmentation and parsing are done without using symbol classification information, and symbol classification is done independently of expression structure recognition. What the recognition processes have in common is the type of features they use, the visual densities. These visual densities are used for shape, spatial, and contextual information. The contextual information is shown to help in segmentation, parsing, and symbol recognition.
The hierarchical contextual parsing has been implemented in the Python and Graph-based Online/Offline Recognizer for Math (Pythagor^m) system and tested on the InftyMCCDB-2 dataset. We created InftyMCCDB-2 from InftyCDB-2 as an open-source dataset for scanned typeset math expression recognition. In building InftyMCCDB-2, modified formula structure representations were used to better capture the spatial positioning of symbols in the expression structures. Namely, baseline punctuation and symbol accents were moved out of horizontal baselines, as their positions are not horizontally aligned with symbols on a writing line. With the transformed spatial layouts and HCP, 95.97% of expressions were parsed correctly when given symbols, and 93.95% were parsed correctly when symbol segmentation from connected components was required. Overall, HCP reached a 90.83% expression recognition rate from connected components.
Understanding Optical Music Recognition
For over 50 years, researchers have been trying to teach computers to read music notation, a task referred to as Optical Music Recognition (OMR). However, this field is still difficult to access for new researchers, especially those without a significant musical background: few introductory materials are available, and the field has struggled with defining itself and building a shared terminology. In this work, we address these shortcomings by (1) providing a robust definition of OMR and its relationship to related fields, (2) analyzing how OMR inverts the music encoding process to recover the musical notation and the musical semantics from documents, and (3) proposing a taxonomy of OMR, most notably including a novel taxonomy of applications. Additionally, we discuss how deep learning affects modern OMR research, as opposed to the traditional pipeline. Based on this work, the reader should be able to attain a basic understanding of OMR: its objectives, its inherent structure, its relationship to other fields, the state of the art, and the research opportunities it affords.
Advances in Manipulation and Recognition of Digital Ink
Handwriting is one of the most natural ways for a human to record knowledge. Recently, this type of human-computer interaction has received increasing attention due to the rapid evolution of touch-based hardware and software. While hardware support for digital ink has reached maturity, algorithms for handwriting recognition in certain domains, including mathematics, lack robustness. At the same time, users may possess several pen-based devices, and sharing training data in an adaptive recognition setting can be challenging. In addition, the resolution of pen-based devices keeps improving, making the ink cumbersome to process and store. This thesis develops several advances for efficient processing, storage, and recognition of handwriting, which are applicable to classification methods based on functional approximation. In particular, we propose improvements to the classification of isolated characters and groups of rotated characters, as well as symbols of substantially different size. We then develop an algorithm for adaptive classification of a user's handwritten mathematical characters. The adaptive algorithm can be especially useful in the cloud-based recognition framework, which is described further in the thesis. We investigate whether the training data available in the cloud can be useful to a new writer during the training phase by extracting the styles of individuals with similar handwriting and recommending styles to the writer. We also perform a factorial analysis of the algorithm for recognition of n-grams of rotated characters. Finally, we show a fast method for compression of linear pieces of handwritten strokes and compare it with an enhanced version of the algorithm based on functional approximation of strokes. Experimental results demonstrate the validity of the theoretical contributions, which form a solid foundation for next-generation handwriting recognition systems.
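The idea of compressing linear pieces of a stroke can be illustrated with the well-known Ramer-Douglas-Peucker polyline simplification. This is a swapped-in technique used only for illustration, not the method developed in the thesis; the function name `simplify_stroke`, the tolerance value, and the sample stroke are assumptions.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float]


def simplify_stroke(points: List[Point], epsilon: float) -> List[Point]:
    """Ramer-Douglas-Peucker: drop points closer than epsilon to the local chord."""
    if len(points) < 3:
        return list(points)
    (x1, y1), (x2, y2) = points[0], points[-1]
    chord = math.hypot(x2 - x1, y2 - y1)
    best_dist, best_idx = -1.0, 0
    for idx in range(1, len(points) - 1):
        px, py = points[idx]
        if chord == 0.0:
            # Degenerate chord (closed piece): fall back to distance from the endpoint.
            dist = math.hypot(px - x1, py - y1)
        else:
            # Perpendicular distance from the point to the line through the endpoints.
            dist = abs((x2 - x1) * (y1 - py) - (x1 - px) * (y2 - y1)) / chord
        if dist > best_dist:
            best_dist, best_idx = dist, idx
    if best_dist <= epsilon:
        return [points[0], points[-1]]          # the whole piece is nearly linear
    left = simplify_stroke(points[: best_idx + 1], epsilon)
    right = simplify_stroke(points[best_idx:], epsilon)
    return left[:-1] + right                    # avoid duplicating the split point


# Hypothetical pen samples: an almost straight horizontal run, then a vertical run.
stroke = [(0.0, 0.0), (1.0, 0.01), (2.0, 0.0), (3.0, 0.0), (3.0, 1.0), (3.0, 2.0)]
compressed = simplify_stroke(stroke, epsilon=0.05)
print(compressed)
```

Only the corner and the two endpoints survive, so the two nearly linear runs are stored with three points instead of six.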