Spatial and Textural Aspects for Arabic Handwritten Characters Recognition
The purpose of the present paper is the recognition of handwritten Arabic characters in their isolated form. The specificity of Arabic characters is taken into consideration: each of the proposed feature extraction methods integrates one of two aspects, spatial or textural. In the first step, a modified Bitmap Sampling method is proposed, which converts the character images into binary matrices and then constructs a mask for each class. A matching rate between the input binary matrix and the masks is used to determine the corresponding class. In the second step, we investigate the use of an Artificial Neural Network as a classifier, first with the binary matrices as features and then with histograms of Local Binary Patterns to capture the texture aspect of the characters. Finally, the results of these two methods are combined to take both aspects into consideration at the same time. Tested on the Arabic set of the Isolated Farsi Handwritten Character Database, the proposed method achieves a 2.82% error rate
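The mask-matching step described in this abstract can be illustrated with a minimal sketch. The abstract does not say how a mask is built or how the matching rate is computed, so the per-cell majority vote and the fraction-of-agreeing-cells measure below are assumptions, and the names `build_mask`, `matching_rate` and `classify` are hypothetical.

```python
from typing import Dict, List

Matrix = List[List[int]]  # binary character image: rows of 0/1 values


def build_mask(samples: List[Matrix]) -> Matrix:
    """Build one class mask by per-cell majority vote (assumed rule)."""
    rows, cols = len(samples[0]), len(samples[0][0])
    return [
        [1 if 2 * sum(s[r][c] for s in samples) >= len(samples) else 0
         for c in range(cols)]
        for r in range(rows)
    ]


def matching_rate(matrix: Matrix, mask: Matrix) -> float:
    """Fraction of cells where the input agrees with the mask (assumed measure)."""
    total = len(mask) * len(mask[0])
    agree = sum(
        1
        for r, row in enumerate(mask)
        for c, value in enumerate(row)
        if matrix[r][c] == value
    )
    return agree / total


def classify(matrix: Matrix, masks: Dict[str, Matrix]) -> str:
    """Return the label of the mask with the highest matching rate."""
    return max(masks, key=lambda label: matching_rate(matrix, masks[label]))
```

Under these assumptions, classification reduces to building one mask per class from its training matrices and picking the class whose mask agrees with the input on the most cells.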
Deep Adaptive Learning for Writer Identification based on Single Handwritten Word Images
There are two types of information in each handwritten word image: explicit
information which can be easily read or derived directly, such as lexical
content or word length, and implicit attributes such as the author's identity.
Whether features learned by a neural network for one task can be used for
another task remains an open question. In this paper, we present a deep
adaptive learning method for writer identification based on single-word images
using multi-task learning. An auxiliary task is added to the training process
to enforce the emergence of reusable features. Our proposed method transfers
the benefits of the learned features of a convolutional neural network from an
auxiliary task such as explicit content recognition to the main task of writer
identification in a single procedure. Specifically, we propose a new adaptive
convolutional layer to exploit the learned deep features. A multi-task neural
network with one or several adaptive convolutional layers is trained
end-to-end, to exploit robust generic features for a specific main task, i.e.,
writer identification. Three auxiliary tasks, corresponding to three explicit
attributes of handwritten word images (lexical content, word length and
character attributes), are evaluated. Experimental results on two benchmark
datasets show that the proposed deep adaptive learning method can improve the
performance of writer identification based on single-word images, compared to
non-adaptive and simple linear-adaptive approaches. Comment: Under view of Pattern Recognitio
Digital Paleography: Using the Digital Representation of Jawi Manuscripts to Support Paleographic Analysis
Palaeography is the study of ancient handwritten manuscripts in order to date them and to localize ancient and medieval scripts. It also deals with analysing the development of letter shapes. Ancient Jawi manuscripts are one of the least studied areas. Nowadays, over 7789 known Jawi manuscripts are kept in the custody of various libraries in Malaysia. Most of these manuscripts are undated, with unknown authors and places of origin. Analysing the different types of writing styles and recognizing the manuscript illuminations can uncover this important information. In this paper, we discuss palaeographical analysis from the perspective of computer science and propose a general framework for it. This process involves investigation of Arabic influence on the Jawi manuscript writings, establishing the palaeographical type of the script, and classification of writing styles based on local and global Jawi image features
Writer identification using curvature-free features
Feature engineering plays a very important role in writer identification, which has been widely studied in the literature. Previous works have shown that the joint feature distribution of two properties can improve performance. The joint feature distribution makes feature relationships explicit instead of hoping that a trained classifier picks up a non-linear relation present in the data. In this paper, we propose two novel curvature-free features for writer identification: run-lengths of local binary patterns (LBPruns) and cloud of line distribution (COLD). LBPruns is the joint distribution of the traditional run-length and local binary pattern (LBP) methods; it computes the run-lengths of local binary patterns on both binarized and gray-scale images. The COLD feature is the joint distribution of the relation between orientation and length of line segments obtained from writing contours in handwritten documents. Our proposed LBPruns and COLD are texture-based, curvature-free features that capture the line information of handwritten texts instead of the curvature information. The combination of the LBPruns and COLD features provides a significant improvement on the CERUG data set, whose handwritten documents contain a large number of irregular-curvature strokes. Evaluation of the proposed features on two other widely used data sets (Firemaker and IAM) shows promising results
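As a rough illustration of the "traditional run-length" ingredient mentioned in this abstract, the sketch below counts horizontal runs of one pixel value in a binary image. It is not the LBPruns feature itself, which the abstract describes only at a high level. The function name `row_run_lengths` and the row-wise scan direction are assumptions made for the example.

```python
from collections import Counter
from itertools import groupby
from typing import List


def row_run_lengths(image: List[List[int]], value: int = 1) -> Counter:
    """Histogram of horizontal run lengths of `value` in a binary image.

    Keys are run lengths, counts are how many runs of that length occur.
    """
    hist: Counter = Counter()
    for row in image:
        # groupby yields maximal runs of equal consecutive pixels
        for pixel, run in groupby(row):
            if pixel == value:
                hist[len(list(run))] += 1
    return hist
```

A joint feature in the spirit of the abstract would apply a run-length count of this kind to local binary pattern codes rather than to raw pixels; that step is left out because the abstract gives no details of it.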
Cross-document word matching for segmentation and retrieval of Ottoman divans
Motivated by the need for the automatic
indexing and analysis of a huge number of documents in
Ottoman divan poetry, and for discovering new knowledge
to preserve this heritage and keep it alive, in this study we
propose a novel method for segmenting and retrieving
words in Ottoman divans. Documents in Ottoman are difficult
to segment into words without prior knowledge of
the word. In this study, using the idea that divans have
multiple copies (versions) by different writers in different
writing styles, and word segmentation in some of those
versions may be relatively easier to achieve than in other
versions, segmentation of the harder versions (which is difficult,
if not impossible, with traditional techniques) is performed
using information carried from the simpler version. One
version of a document is used as the source dataset and the
other version of the same document is used as the target
dataset. Words in the source dataset are automatically
extracted and used as queries to be spotted in the target
dataset for detecting word boundaries. We present the idea
of cross-document word matching for a novel task of
segmenting historical documents into words. We propose a
matching scheme based on possible combinations of
sequences of sub-words. We improve the performance of
simple features by considering the words in context.
The method is applied to two versions of the Layla and
Majnun divan by Fuzuli. The results show that the proposed
word-matching-based segmentation method is
promising in finding the word boundaries and in retrieving
the words across documents
Off-line Arabic Handwriting Recognition System Using Fast Wavelet Transform
In this research, an off-line handwriting recognition system for the Arabic alphabet is
introduced. The system contains three main stages: preprocessing, segmentation and
recognition. In the preprocessing stage, the Radon transform was used in the design
of algorithms for page, line and word skew correction as well as for word slant
correction. In the segmentation stage, a Hough transform approach was used for line
extraction. For line-to-word and word-to-character segmentation, a statistical method
using a mathematical representation of the binary images of lines and words was used.
Unlike most current handwriting recognition systems, our system simulates the
human mechanism for image recognition, where images are encoded and saved in
memory as groups according to their similarity to each other. Characters are
decomposed into coefficient vectors using the fast wavelet transform; the vectors
that represent a character in its different possible shapes are then saved as groups, with one
representative for each group. Recognition is achieved by comparing the vector of
the character to be recognized with the group representatives.
Experiments showed that the proposed system is able to achieve the recognition task
with 90.26% accuracy. The system needs at most 3.41 seconds to recognize a
single character in a text of 15 lines where each line has 10 words on average
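The recognition stage in this abstract (wavelet coefficient vectors compared against group representatives) can be sketched as follows. The abstract does not name the wavelet, the distance measure, or how representatives are chosen, so the averaging (unnormalized) Haar wavelet and Euclidean distance used here are assumptions, and `haar_fwt` and `nearest_group` are hypothetical names.

```python
import math
from typing import Dict, List


def haar_fwt(signal: List[float]) -> List[float]:
    """Full 1-D Haar decomposition using pairwise averages and half-differences.

    The input length must be a power of two. The output holds the overall
    average first, followed by detail coefficients from coarse to fine.
    """
    n = len(signal)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    out = [float(x) for x in signal]
    while n > 1:
        half = n // 2
        approx = [(out[2 * i] + out[2 * i + 1]) / 2 for i in range(half)]
        detail = [(out[2 * i] - out[2 * i + 1]) / 2 for i in range(half)]
        out[:n] = approx + detail
        n = half
    return out


def nearest_group(vector: List[float],
                  representatives: Dict[str, List[float]]) -> str:
    """Label of the representative closest to `vector` (Euclidean distance, assumed)."""
    return min(representatives,
               key=lambda label: math.dist(vector, representatives[label]))
```

In this sketch a character would first be flattened to a fixed-length vector, transformed with `haar_fwt`, and then labelled with the group whose representative is nearest.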
Advances in Image Processing, Analysis and Recognition Technology
For many decades, researchers have been trying to make computers’ analysis of images as effective as the human visual system. For this purpose, many algorithms and systems have been created. The whole process covers various stages, including image processing, representation and recognition. The results of this work can be applied to many computer-assisted areas of everyday life. They improve particular activities and provide handy tools, which are sometimes only for entertainment but quite often significantly increase our safety. In fact, the range of practical applications of image processing algorithms is particularly wide. Moreover, the rapid growth of computational complexity and computer efficiency has allowed for the development of more sophisticated and effective algorithms and tools. Although significant progress has been made so far, many issues still remain, resulting in the need for the development of novel approaches