Introduction of statistical information in a syntactic analyser for document image recognition

Abstract

This paper presents an improvement to document layout analysis systems, offering a possible solution to Sayre's paradox (which states that an element must be recognized before it can be segmented, and must be segmented before it can be recognized). This improvement, based on stochastic parsing, allows statistical information obtained from recognizers to be integrated during syntactic layout analysis. We show how this fusion of numeric and symbolic information in a feedback loop can be applied to syntactic methods to improve the expressiveness of document descriptions. To limit combinatorial explosion during the exploration of solutions, we devised an operator that allows optional activation of the stochastic parsing mechanism. Our evaluation on 1250 handwritten business letters shows that this method improves global recognition scores.
