    Spotting Keywords in Offline Handwritten Documents Using Hausdorff Edit Distance

    Keyword spotting has become a crucial topic in handwritten document recognition because it enables content-based retrieval of scanned documents using search terms. With a query keyword, one can search and index digitized handwriting, which in turn facilitates the understanding of manuscripts. Common automated techniques address the keyword spotting problem through statistical representations. Structural representations such as graphs capture the complex structure of handwriting. However, they are rarely used, particularly for keyword spotting, due to high computational costs. The graph edit distance, a powerful and versatile method for matching any type of labeled graph, has exponential time complexity, so its use is constrained to small graphs. The recently developed Hausdorff edit distance algorithm approximates the graph edit distance in quadratic time by efficiently matching local substructures. This dissertation hypothesizes that Hausdorff edit distance could be a promising alternative to other template-based keyword spotting approaches in terms of computational time and accuracy. Accordingly, the core contribution of this thesis is the investigation and development of a graph-based keyword spotting technique based on the Hausdorff edit distance algorithm. The high representational power of graphs, combined with the efficiency of the Hausdorff edit distance for graph matching, achieves a remarkable speedup as well as high accuracy. In a comprehensive experimental evaluation, we demonstrate the solid performance of the proposed graph-based method when compared with the state of the art, in terms of both precision and speed. The second contribution of this thesis is a keyword spotting technique that combines the dynamic time warping and Hausdorff edit distance approaches. The structural representation of the graph-based approach and the statistical representation based on geometric features complement each other to provide a more accurate system. The proposed system has been extensively evaluated with four types of handwriting graphs and geometric feature vectors on benchmark datasets. The experiments demonstrate a performance boost: the combined system outperforms the individual systems.

    Graph-based keyword spotting in historical manuscripts using Hausdorff edit distance

    Keyword spotting enables content-based retrieval of scanned historical manuscripts using search terms, which, in turn, facilitates indexing in digital libraries. Recent approaches include graph-based representations that capture the complex structure of handwriting. However, the high representational power of graphs comes at the cost of high computational complexity for graph matching. In this article, we investigate the potential of Hausdorff edit distance (HED) for keyword spotting. It is an efficient quadratic-time approximation of the graph edit distance. In a comprehensive experimental evaluation with four types of handwriting graphs and four benchmark datasets (George Washington, Parzival, Botany, and Alvermann Konzilsprotokolle), we demonstrate a strong performance of the proposed HED-based method when compared with the state of the art, in terms of both precision and speed.