44 research outputs found

    Keyword spotting in historical handwritten documents based on graph matching

    In recent decades, historical handwritten documents have become increasingly available in digital form. Yet the accessibility of these documents with respect to browsing and searching has remained limited, as fully automatic transcription is often not possible or not sufficiently accurate. This paper proposes a novel, reliable approach for template-based keyword spotting in historical handwritten documents. In particular, our framework makes use of different graph representations for segmented word images and a sophisticated matching procedure. Moreover, we extend our method to a spotting ensemble. In an exhaustive experimental evaluation on four widely used benchmark datasets, we show that the proposed approach is able to keep up with or even outperform several state-of-the-art methods for template- and learning-based keyword spotting. The Hasler Foundation Switzerland; http://www.elsevier.com/locate/patcog; 2019-09-01

    Out of vocabulary queries for word graph-based keyword spotting

    In this master's thesis, several approaches are presented to support out-of-vocabulary queries in a Word Graph (WG)-based Keyword Spotting (KWS) application for handwritten text lines. Generally, KWS assigns a score that estimates how likely it is that a given keyword is present in a certain line image. WG-based KWS offers very fast search times but assumes a closed vocabulary and assigns null scores to any word not included in that vocabulary. This work aims to give WG-based KWS the flexibility of searches not restricted to the training vocabulary, together with the speed achieved by using WGs. Puigcerver I Pérez, J. (2014). Out of vocabulary queries for word graph-based keyword spotting. http://hdl.handle.net/10251/53360

    Spotting Keywords in Offline Handwritten Documents Using Hausdorff Edit Distance

    Keyword spotting has become a crucial topic in handwritten document recognition, as it enables content-based retrieval of scanned documents using search terms. With a query keyword, one can search and index the digitized handwriting, which in turn facilitates the understanding of manuscripts. Common automated techniques address the keyword spotting problem through statistical representations. Structural representations such as graphs capture the complex structure of handwriting. However, they are rarely used, particularly for keyword spotting, due to high computational costs. The graph edit distance, a powerful and versatile method for matching any type of labeled graph, has exponential time complexity when computing the similarity of graphs. Hence, the use of graph edit distance is constrained to small graphs. The recently developed Hausdorff edit distance algorithm approximates the graph edit distance with quadratic time complexity by efficiently matching local substructures. This dissertation hypothesizes that the Hausdorff edit distance could be a promising alternative to other template-based keyword spotting approaches in terms of computational time and accuracy. Accordingly, the core contribution of this thesis is the investigation and development of a graph-based keyword spotting technique based on the Hausdorff edit distance algorithm. The high representational power of graphs, combined with the efficiency of the Hausdorff edit distance for graph matching, achieves a remarkable speedup as well as high accuracy. In a comprehensive experimental evaluation, we demonstrate the solid performance of the proposed graph-based method when compared with the state of the art in terms of both precision and speed. The second contribution of this thesis is a keyword spotting technique that combines the dynamic time warping and Hausdorff edit distance approaches. The structural representation of the graph-based approach and the statistical representation based on geometric features complement each other and yield a more accurate system. The proposed system has been extensively evaluated with four types of handwriting graphs and geometric feature vectors on benchmark datasets. The experiments demonstrate a performance boost: the combined system outperforms the individual systems.
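    The quadratic-time matching mentioned in the abstract above can be illustrated with a minimal sketch. It is not the thesis's implementation: it assumes graphs whose nodes are labelled with (x, y) coordinates, ignores edges entirely, and uses a hypothetical constant cost `tau` for deleting or inserting a node. The function name `hausdorff_edit_distance` is chosen here for illustration only.

```python
import math


def hausdorff_edit_distance(nodes_a, nodes_b, tau=1.0):
    """Node-only Hausdorff-style edit distance between two graphs.

    nodes_a, nodes_b: lists of (x, y) node labels (an assumption of this sketch).
    tau: assumed constant cost for deleting or inserting a single node.
    Edge costs are left out to keep the example short.
    """

    def cheapest(node, others):
        # Each node is either removed (cost tau) or substituted by its
        # nearest counterpart. The substitution cost is halved because
        # a substitution is counted once from each of the two graphs.
        best = tau
        for other in others:
            best = min(best, math.dist(node, other) / 2.0)
        return best

    # Every node is matched independently, so the work is two nested
    # loops over the node sets: quadratic in the number of nodes.
    cost_a = sum(cheapest(u, nodes_b) for u in nodes_a)
    cost_b = sum(cheapest(v, nodes_a) for v in nodes_b)
    return cost_a + cost_b
```

    In a template-based spotting setting, one would compute this distance between the graph of a query word and the graph of each candidate word image and rank candidates by ascending distance. Because nodes are matched independently rather than one-to-one, the result is a lower-cost approximation of the graph edit distance, not the exact value.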

    Arabic Manuscripts Analysis and Retrieval
