866 research outputs found

    How to separate between Machine-Printed/Handwritten and Arabic/Latin Words?

    This paper gathers some contributions to identifying a script and its nature (handwritten or machine-printed). Different sets of features have been employed successfully for discriminating between handwritten and machine-printed Arabic and Latin scripts. They include some well-established features, previously used in the literature, and new structural features which are intrinsic to Arabic and Latin scripts. The performance of these features is studied in this paper. We also compared the performance of five classifiers used to identify the script at word level: Bayes (AODEsr), k-Nearest Neighbor (k-NN), Decision Tree (J48), Support Vector Machine (SVM) and Multilayer Perceptron (MLP). These classifiers were chosen to be sufficiently different from one another to test the feature contributions. Experiments have been conducted with handwritten and machine-printed words, covering a wide range of fonts. Experimental results show the capability of the proposed features to capture differences between scripts and the effectiveness of the three classifiers. Average identification precision and recall rates of 98.72% were achieved using a set of 58 features and the AODEsr classifier, which is slightly better than the rates reported in similar works.
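The abstract reports average precision and recall rates without saying how the average is taken. As a minimal sketch, assuming a macro average over four word classes (the labels `AH`, `AP`, `LH`, `LP` for Arabic/Latin handwritten/printed and the function name are illustrative, not taken from the paper), the two rates could be computed like this:

```python
from collections import Counter


def macro_precision_recall(y_true, y_pred):
    """Return (macro precision, macro recall) over the classes in y_true."""
    labels = sorted(set(y_true))
    true_count = Counter(y_true)   # how many words truly belong to each class
    pred_count = Counter(y_pred)   # how many words were assigned to each class
    tp = Counter()                 # correctly classified words per class
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1
    # A class that was never predicted contributes a precision of 0.
    precisions = [tp[c] / pred_count[c] if pred_count[c] else 0.0 for c in labels]
    recalls = [tp[c] / true_count[c] for c in labels]
    n = len(labels)
    return sum(precisions) / n, sum(recalls) / n


# Hypothetical toy labels, only to show the call.
truth = ["AH", "AH", "AP", "LH", "LP", "LP"]
guess = ["AH", "AP", "AP", "LH", "LP", "LH"]
precision, recall = macro_precision_recall(truth, guess)
```

Macro averaging weights every class equally, so a small class counts as much as a large one. The paper may well use a different averaging scheme.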

    A Novel Method for Multifont Arabic Characters Features Extraction


    Agendas for Digital Palaeography in an Archaeological Context: Egypt 1800 BC

    Handwriting raises issues that are alive in archaeological debates, both philosophical and historical. In turn, by their extreme fragmentariness, the earliest archaeological manuscripts could generate usefully different questions for the field of palaeography. Here, digitisation offers new common ground for the separate disciplines in the study of the past. For current archaeological discussions of structure and agency, manuscripts pose the act of writing, between social and individual. For debates over literacy and power in part-literate societies, an archaeological hoard of manuscript fragments offers opportunities to assess our chances of knowing, for one time and place, how many writings and writers there were. The largest early group of writing on papyrus-paper comprises several thousand small fragments from Lahun in Egypt (about 1850–1750 BC). Traditional methods of recording similarity and difference across the collection can now be accelerated to a point of qualitative change by applying image-matching software. This paper considers the potential of computer-aided palaeography for generating new research agendas.

    Arabic/Latin and Machine-printed/Handwritten Word Discrimination using HOG-based Shape Descriptor

    In this paper, we present an approach for identifying Arabic and Latin script and its type, based on Histogram of Oriented Gradients (HOG) descriptors. HOGs are first applied at word level based on writing orientation analysis. Then, they are extended to word image partitions to capture fine and discriminative details. Pyramid HOGs are also used to study their effects at different observation levels of the image. Finally, co-occurrence matrices of HOG are computed to consider spatial information between pairs of pixels, which is not taken into account in basic HOG. A genetic algorithm is applied to select the potentially informative feature combinations that maximize the classification accuracy. The output is a relatively short descriptor that provides an effective input to a Bayes-based classifier. Experimental results on a set of words, extracted from standard databases, show that our identification system is robust and provides good word script and type identification: 99.07% of words are correctly classified.
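To make the idea of HOG over word image partitions concrete, here is a minimal sketch in plain Python. It covers only basic HOG computed per partition; the pyramid levels, co-occurrence matrices and genetic feature selection described in the abstract are not reproduced. The function name, the 2x1 partition grid, the 9 unsigned orientation bins and the per-cell L2 normalisation are all assumptions chosen for illustration, not the authors' settings.

```python
import math


def hog_descriptor(image, cells_x=2, cells_y=1, bins=9):
    """Concatenate per-cell orientation histograms of a grayscale image.

    image is a list of equally long rows of numbers; the result has
    cells_x * cells_y * bins entries.
    """
    h, w = len(image), len(image[0])
    hist = [[0.0] * bins for _ in range(cells_x * cells_y)]
    for y in range(h):
        for x in range(w):
            # Central differences, clamped at the image borders.
            gx = image[y][min(x + 1, w - 1)] - image[y][max(x - 1, 0)]
            gy = image[min(y + 1, h - 1)][x] - image[max(y - 1, 0)][x]
            mag = math.hypot(gx, gy)
            if mag == 0:
                continue
            # Unsigned gradient orientation in [0, 180) degrees.
            angle = math.degrees(math.atan2(gy, gx)) % 180.0
            b = min(int(angle / (180.0 / bins)), bins - 1)
            # Index of the partition (cell) that contains this pixel.
            cell = (y * cells_y // h) * cells_x + (x * cells_x // w)
            hist[cell][b] += mag  # magnitude-weighted vote
    descriptor = []
    for cell_hist in hist:
        # L2-normalise each cell; an empty cell stays all zeros.
        norm = math.sqrt(sum(v * v for v in cell_hist)) or 1.0
        descriptor.extend(v / norm for v in cell_hist)
    return descriptor


# Hypothetical 4x4 "word image" with a single vertical edge.
word = [[0, 0, 1, 1] for _ in range(4)]
features = hog_descriptor(word)
```

The resulting vector could be fed to any classifier. Splitting the word into more partitions lengthens the descriptor but captures more local stroke detail, which is the trade-off that a feature-selection step is meant to manage.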