    Recognition of on-line handwritten mathematical expressions using 2D stochastic context-free grammars and hidden Markov models

    This paper describes a formal model for the recognition of on-line handwritten mathematical expressions using 2D stochastic context-free grammars and hidden Markov models. Hidden Markov models are used to recognize mathematical symbols, and a stochastic context-free grammar is used to model the relation between these symbols. This formal model makes it possible to use classic algorithms for parsing and stochastic estimation. As a result, first, the model is able to capture many of the variability phenomena that appear in on-line handwritten mathematical expressions during the training process. Second, the parsing process can make decisions based only on stochastic information, avoiding heuristic decisions. The proposed model participated in a mathematical expression recognition contest, where it obtained the best results at different levels. 2012 Elsevier B.V. All rights reserved.
    Work supported by the EC (FEDER/FSE) and the Spanish MEC/MICINN under the MIPRCV "Consolider Ingenio 2010" program (CSD2007-00018), the MITTRAL (TIN2009-14633-C03-01) project, the FPU Grant (AP2009-4363), and by the Generalitat Valenciana under the Grant Prometeo/2009/014.
    Álvaro Muñoz, F.; Sánchez Peiró, JA.; Benedí Ruiz, JM. (2014). Recognition of on-line handwritten mathematical expressions using 2D stochastic context-free grammars and hidden Markov models. Pattern Recognition Letters. 35:58-67. https://doi.org/10.1016/j.patrec.2012.09.023
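    The idea of letting a parser decide from stochastic information alone can be illustrated with a minimal sketch. This is not the paper's 2D model: it is a plain one-dimensional probabilistic CKY (Viterbi) parser, and the toy grammar, the rule probabilities, and the per-stroke symbol probabilities (which stand in for HMM symbol scores) are all hypothetical values chosen for illustration.

```python
from collections import defaultdict

# Hypothetical toy grammar in Chomsky normal form (not taken from the paper).
# Lexical rules: terminal symbol -> [(nonterminal, rule probability)]
LEXICAL = {
    "x": [("Expr", 0.3)],
    "y": [("Expr", 0.3)],
    "+": [("Op", 0.7)],
    "-": [("Op", 0.3)],
}
# Binary rules: (left child, right child) -> [(parent, rule probability)]
BINARY = {
    ("Expr", "OpExpr"): [("Expr", 0.4)],
    ("Op", "Expr"): [("OpExpr", 1.0)],
}

# Hypothetical classifier output: for each input position, candidate symbols
# with probabilities. The middle stroke is ambiguous between "+" and "x".
OBSERVATIONS = [
    {"x": 0.9, "y": 0.1},
    {"+": 0.6, "x": 0.4},
    {"y": 0.8, "x": 0.2},
]


def read_symbols(best, i, j, label):
    """Follow backpointers to recover the symbol sequence of the best parse."""
    _, back = best[i, j][label]
    if isinstance(back, str):  # lexical entry: backpointer is the symbol itself
        return [back]
    k, left, right = back
    return read_symbols(best, i, k, left) + read_symbols(best, k, j, right)


def viterbi_cky(observations, lexical, binary, start="Expr"):
    """Return (probability, symbols) of the most likely parse, or (0.0, None)."""
    n = len(observations)
    # best[i, j][A] = (probability of the best A spanning i..j, backpointer)
    best = defaultdict(dict)
    for i, hypotheses in enumerate(observations):
        for symbol, p_symbol in hypotheses.items():
            for parent, p_rule in lexical.get(symbol, []):
                p = p_rule * p_symbol
                if p > best[i, i + 1].get(parent, (0.0, None))[0]:
                    best[i, i + 1][parent] = (p, symbol)
    for span in range(2, n + 1):
        for i in range(n - span + 1):
            j = i + span
            for k in range(i + 1, j):  # split point
                for left, (p_left, _) in best[i, k].items():
                    for right, (p_right, _) in best[k, j].items():
                        for parent, p_rule in binary.get((left, right), []):
                            p = p_rule * p_left * p_right
                            if p > best[i, j].get(parent, (0.0, None))[0]:
                                best[i, j][parent] = (p, (k, left, right))
    if start not in best[0, n]:
        return 0.0, None
    return best[0, n][start][0], read_symbols(best, 0, n, start)


prob, symbols = viterbi_cky(OBSERVATIONS, LEXICAL, BINARY)
print(symbols, prob)
```

    In this sketch the ambiguous middle stroke is resolved by the grammar and the probabilities together, with no hand-written rule. A practical implementation would likely work with log-probabilities to avoid underflow on longer inputs.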

    A Global Online Handwriting Recognition Approach Based on Frequent Patterns

    In this article, handwriting signals are represented by geometric and spatio-temporal characteristics to increase the relevance of each object's feature vector. The main goal was to extract features in the form of a numeric vector based on frequent patterns. We used two types of frequent patterns (closed frequent patterns and maximal frequent patterns) that can represent handwritten characters pertinently. These common feature patterns are generated from a raw data transformation method to achieve high relevance. A database of words consisting of two different letters was created. The proposed application gives promising results and highlights the advantages that frequent pattern extraction algorithms can offer, as well as the central role played by the "minimum threshold" parameter in the overall description of the characters.
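    The difference between closed and maximal frequent patterns, and the effect of the minimum threshold, can be shown with a small brute-force sketch. The stroke-feature tokens ("up", "loop", "down") and the four transactions are hypothetical and are not the article's data or its extraction algorithm; the sketch only applies the standard definitions.

```python
from itertools import combinations

# Hypothetical data: each transaction is the set of feature tokens
# observed in one handwritten sample.
transactions = [
    {"up", "loop", "down"},
    {"up", "loop"},
    {"up", "down"},
    {"up", "loop", "down"},
]


def frequent_itemsets(transactions, min_support):
    """Return {itemset: support count} for every itemset meeting min_support."""
    items = sorted(set().union(*transactions))
    frequent = {}
    for size in range(1, len(items) + 1):
        found_any = False
        for candidate in combinations(items, size):
            candidate = frozenset(candidate)
            support = sum(1 for t in transactions if candidate <= t)
            if support >= min_support:
                frequent[candidate] = support
                found_any = True
        if not found_any:
            break  # no frequent itemset of this size, so none of a larger size
    return frequent


def closed_patterns(frequent):
    """Frequent itemsets with no proper superset of equal support."""
    return {
        s for s, support in frequent.items()
        if not any(s < other and support == frequent[other] for other in frequent)
    }


def maximal_patterns(frequent):
    """Frequent itemsets with no frequent proper superset."""
    return {s for s in frequent if not any(s < other for other in frequent)}


for threshold in (2, 3):
    freq = frequent_itemsets(transactions, threshold)
    print("min_support =", threshold)
    print("  closed: ", sorted(sorted(s) for s in closed_patterns(freq)))
    print("  maximal:", sorted(sorted(s) for s in maximal_patterns(freq)))
```

    Raising the threshold from 2 to 3 removes the three-token pattern, so the maximal patterns change from one large set to two smaller ones. Every maximal pattern is also closed, but the reverse does not hold.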

    Page Segmentation of Structured Documents Using 2D Stochastic Context-Free Grammars

    In this paper we define a bidimensional extension of Stochastic Context-Free Grammars for page segmentation of structured documents. Two sets of text classification features are used to perform an initial classification of each zone of the page. Then, the page segmentation is obtained as the most likely hypothesis according to a grammar. This approach is compared to Conditional Random Fields, and the results show significant improvements in several cases. Furthermore, grammars provide a detailed segmentation that allowed a semantic evaluation, which also validates this model.
    The final publication is available at Springer via http://dx.doi.org/10.1007/978-3-642-38628-2_15
    Work partially supported by the Spanish MEC under the STraDA research project (TIN2012-37475-C02-01), the MITTRAL (TIN2009-14633-C03-01) project, the Spanish projects TIN2009-14633-C03-01/03 and 2010-CONES-00029, the FPU grant (AP2009-4363), by the Generalitat Valenciana under the grant Prometeo/2009/014, and through the EU 7th Framework Programme grant tranScriptorium (Ref: 600707).
    Álvaro Muñoz, F.; Cruz Fernández, F.; Sánchez Peiró, JA.; Ramos Terrades, O.; Benedí Ruiz, JM. (2013). Page Segmentation of Structured Documents Using 2D Stochastic Context-Free Grammars. In Pattern Recognition and Image Analysis. Springer. 133-140. https://doi.org/10.1007/978-3-642-38628-2_15

    Data-free metrics for Dirichlet and generalized Dirichlet mixture-based HMMs - A practical study.

    Approaches to designing metrics between hidden Markov models (HMMs) can be divided into two classes: data-based and parameter-based. The latter has the clear advantage of being deterministic and faster, but only very few similarity measures that can be applied to mixture-based HMMs have been proposed so far. Most of these metrics apply to discrete or Gaussian HMMs, and to the best of our knowledge no comparative study has been conducted. With the recent development of HMMs based on the Dirichlet and generalized Dirichlet distributions for proportional data modeling, we propose three new parametric similarity measures between these HMMs. Extensive experiments on synthetic data show the reliability of these new measures, whereas the existing ones fail to give the expected results when some parameters vary. An illustration on real data shows the clustering capability of these measures and their potential applications.