In this paper, Slavonic manuscripts from the 11th
century written in Glagolitic script are
investigated. State-of-the-art optical character recognition methods produce poor results
for degraded handwritten document images. This is largely due to a lack of suitable
results from basic pre-processing steps such as binarization and image segmentation.
Therefore, a new, binarization-free approach will be presented that is independent of
pre-processing deficiencies. It additionally incorporates local information in order to
recognize also fragmented or faded characters. The proposed algorithm consists of
two steps: character classification and character localization. Firstly scale invariant
feature transform features are extracted and classified using support vector machines.
On this basis interest points are clustered according to their spatial information. Then,
characters are localized and eventually recognized by a weighted voting scheme of
pre-classified local descriptors. Preliminary results show that the proposed system can
handle highly degraded manuscript images with background noise, e.g. stains, tears,
and faded characters