Ankara : The Department of Computer Engineering and the Graduate School of Engineering and Science of Bilkent University, 2013.
Thesis (Master's) -- Bilkent University, 2013.
Includes bibliographical references leaves 80-88.

Large archives of historical documents attract many researchers from all around
the world. The increasing demand to access those archives makes automatic retrieval
and recognition of historical documents crucial. Ottoman archives are one
of the largest collections of historical documents. Although Ottoman is not a
currently spoken language, many researchers from all around the world are interested
in accessing the archived material. This thesis presents two studies on Ottoman
document analysis. The first is document segmentation, a crucial pre-processing task
for retrieval and recognition. The second is a more specific retrieval and recognition
problem: matching Islamic patterns in Kufic images. For the segmentation task, layout,
line, and word segmentation are studied. Layout segmentation is obtained via Log-Gabor
filtering, four different algorithms are proposed for line segmentation, and a simple
morphological method is preferred for word segmentation. Datasets are constructed
with documents from both Ottoman and other languages (English, Greek and
Bangla) to test the script independence of the methods. Experiments show that
our segmentation steps give satisfactory results. The second task aims to detect
Islamic patterns in Kufic images. Sub-patterns are treated as the basic units
of analysis and are represented as graphs, so that they can be matched using graph
and sub-graph isomorphism.
Kufic images are analyzed in three different ways. Given a query pattern, all the
instances of the query can be found through retrieval. Going further, images across
the entire dataset can be labeled automatically using known patterns. Finally,
patterns that repeat inside an image can be automatically discovered. As
there is no existing Kufic dataset, a new one is constructed by collecting images
from the Internet, and promising results are obtained on this dataset.

Adıgüzel, Hande
M.S.