Discerning Structure from Freeform Handwritten Notes
This paper presents an integrated approach to parsing textual structure in freeform handwritten notes. Text/graphics classification and text layout analysis are classical problems in printed document analysis, but the irregularity of handwriting and content in freeform notes reveals the limitations of existing approaches. We advocate an integrated technique that solves the layout analysis and classification problems simultaneously: the two are so tightly coupled that neither can be solved without the other for real user notes. We tune and evaluate our approach on a large corpus of unscripted user files and reflect on the difficult recognition scenarios we have encountered in practice.
Neural Computing for Online Arabic Handwriting Character Recognition using Hard Stroke Features Mining
Online Arabic cursive character recognition remains a major challenge due to complexities such as Arabic cursive script styles, writing speed, and writer mood. Because of these unavoidable constraints, the accuracy of online Arabic character recognition is still low and leaves room for improvement. In this research, an enhanced method is proposed for detecting the desired critical points from the vertical and horizontal direction-lengths of handwriting stroke features in online Arabic script recognition. Each extracted stroke feature divides every isolated character into meaningful patterns known as tokens. A minimal feature set is extracted from these tokens for character classification using a multilayer perceptron with a back-propagation learning algorithm and a modified sigmoid activation function. Two milestones are achieved in this work: first, attaining a fixed number of tokens; second, minimizing the number of the most repetitive tokens. For the experiments, handwritten Arabic characters are selected from the OHASD benchmark dataset to test and evaluate the proposed method. The proposed method achieves an average accuracy of 98.6%, comparable to state-of-the-art character recognition techniques.
Comment: 16 pages
Text Extraction From Natural Scene: Methodology And Application
With the popularity of the Internet and smart mobile devices, there is an increasing demand for techniques and applications of image/video-based analytics and information retrieval. Most of these applications can benefit from text information extraction in natural scenes. However, scene text extraction is a challenging problem, due to the cluttered backgrounds of natural scenes and the multiple patterns of scene text itself. To address these problems, this dissertation proposes a framework for scene text extraction.
Scene text extraction in our framework is divided into two components: detection and recognition. Scene text detection finds the regions containing text in camera-captured images/videos. Text layout analysis based on gradient and color analysis is performed to extract candidate text strings from the cluttered background of the natural scene. Text structural analysis is then performed to design effective structural features that distinguish text from non-text outliers among the candidate text strings. Scene text recognition transforms the image-based text in the detected regions into readable text codes. The most basic and significant step in text recognition is scene text character (STC) prediction, a multi-class classification over a set of text character categories. We design robust and discriminative feature representations of STC structure by integrating multiple feature descriptors, coding/pooling schemes, and learning models. Experimental results on benchmark datasets demonstrate the effectiveness and robustness of our proposed framework, which obtains better performance than previously published methods.
Our proposed scene text extraction framework is applied to four scenarios: 1) reading print labels on grocery packages for hand-held object recognition; 2) combining with car detection to localize license plates in camera-captured natural scene images; 3) reading indicative signage for assisted navigation in indoor environments; and 4) combining with object tracking to perform scene text extraction in video-based natural scenes. The proposed prototype systems and the associated evaluation results show that our framework is able to meet the challenges of real applications.
A visual approach to sketched symbol recognition
There is increasing interest in building systems that can automatically interpret hand-drawn sketches. However, many challenges remain in terms of recognition accuracy, robustness to different drawing styles, and the ability to generalize across multiple domains. To address these challenges, we propose a new approach to sketched symbol recognition that focuses on the visual appearance of the symbols. This allows us to better handle the range of visual and stroke-level variations found in freehand drawings. We also present a new symbol classifier that is computationally efficient and invariant to rotation and local deformations. We show that our method exceeds state-of-the-art performance on all three domains we evaluated: handwritten digits, PowerPoint shapes, and electrical circuit symbols.
A Study of Sindhi-Related and Arabic-Script-Adapted Languages Recognition
A large number of publications are available for Optical Character Recognition (OCR). Significant research, as well as many articles, exists for the Latin, Chinese, and Japanese scripts. Arabic is also a mature script from the OCR perspective. However, the adapted languages that share the Arabic script or its extended characters still lack OCRs of their own. In this paper we survey the efforts of researchers on Arabic and its related and adapted languages. The survey is organized into several sections: the introduction is followed by the properties of the Sindhi language, then the OCR processing techniques and methods used by various researchers are presented, and the last section is dedicated to future work and the conclusion.
Comment: 11 pages, 8 figures, Sindh Univ. Res. Jour. (Sci. Ser.)
Joint Energy-based Detection and Classification of Multilingual Text Lines
This paper proposes a new hierarchical MDL-based model for the joint detection and classification of multilingual text lines in images taken by hand-held cameras. The majority of related text detection methods assume alphabet-based writing in a single language, e.g. Latin. They use simple clustering heuristics specific to such texts: proximity between letters within one line, larger distances between separate lines, etc. We are interested in a significantly more ambiguous problem where images combine alphabetic and logographic characters from multiple languages and typographic rules vary widely (e.g. English, Korean, and Chinese). The complexity of detecting and classifying text lines in multiple languages calls for a more principled approach based on information-theoretic principles. Our new MDL model includes data costs combining geometric errors with classification likelihoods, and a hierarchical sparsity term based on label costs. This energy model can be efficiently minimized by fusion moves. We demonstrate the robustness of the proposed algorithm on a large new database of multilingual text images collected in the public transit system of Seoul.
A New Approach in Persian Handwritten Letters Recognition Using Error Correcting Output Coding
Classification ensembling, which uses weighted voting over outputs, is the art of combining a set of base classifiers to generate high-performance, robust, and more stable results. This study aims to improve the identification of Persian handwritten letters using the Error-Correcting Output Coding (ECOC) ensemble method; furthermore, feature selection is used to reduce the cost of errors in our proposed method. ECOC decomposes a multi-way classification problem into many binary classification tasks, and then combines the results of the subtasks into a hypothesized solution to the original problem. Firstly, image features are extracted by Principal Component Analysis (PCA). After that, ECOC, with a Support Vector Machine (SVM) as the base classifier, is used to identify the Persian handwritten letters. The empirical results of applying this ensemble method to 10 real-world datasets of Persian handwritten letters indicate that it identifies the Persian handwritten letters better than other ensemble methods and single classifiers. Moreover, by testing a number of different features, this paper finds that the additional cost of the feature selection stage can be reduced by using this method.
Comment: Journal of Advances in Computer Research
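The PCA-plus-ECOC-of-SVMs pipeline the abstract describes can be sketched with scikit-learn. The digits dataset stands in for Persian handwritten letters (which are not bundled with the library), and the component count and code size are illustrative choices, not the paper's settings:

```python
from sklearn.datasets import load_digits
from sklearn.decomposition import PCA
from sklearn.model_selection import train_test_split
from sklearn.multiclass import OutputCodeClassifier
from sklearn.pipeline import make_pipeline
from sklearn.svm import SVC

# Stand-in data: 10-class handwritten digits instead of Persian letters.
X, y = load_digits(return_X_y=True)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

# PCA feature extraction, then an ECOC ensemble that decomposes the
# multi-way problem into binary SVM subtasks and recombines their votes.
clf = make_pipeline(
    PCA(n_components=30),
    OutputCodeClassifier(SVC(kernel="rbf"), code_size=2, random_state=0),
)
clf.fit(X_tr, y_tr)
print(round(clf.score(X_te, y_te), 3))
```

`OutputCodeClassifier` assigns each class a binary codeword and predicts the class whose codeword is nearest to the concatenated binary outputs, which is exactly the decompose-then-recombine scheme the abstract outlines.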
Active Scene Learning
Sketch recognition allows natural and efficient interaction in pen-based
interfaces. A key obstacle to building accurate sketch recognizers has been the
difficulty of creating large amounts of annotated training data. Several
authors have attempted to address this issue by creating synthetic data, and by
building tools that support efficient annotation. Two prominent sets of
approaches stand out from the rest of the crowd. They use interim classifiers
trained with a small set of labeled data to aid the labeling of the remainder
of the data. The first set of approaches uses a classifier trained with a
partially labeled dataset to automatically label unlabeled instances. The
others, based on active learning, save annotation effort by giving priority to
labeling informative data instances. The former is sub-optimal since it doesn't
prioritize the order of labeling to favor informative instances, while the
latter makes the strong assumption that unlabeled data comes in an already
segmented form (i.e. the ink in the training data is already assembled into
groups forming isolated object instances). In this paper, we propose an active
learning framework that combines the strengths of these methods, while
addressing their weaknesses. In particular, we propose two methods for deciding
how batches of unsegmented sketch scenes should be labeled. The first method,
scene-wise selection, assesses the informativeness of each drawing (sketch
scene) as a whole, and asks the user to annotate all objects in the drawing.
The latter, segment-wise selection, attempts more precise targeting to locate
informative fragments of drawings for user labeling. We show that both
selection schemes outperform random selection. Furthermore, we demonstrate that
precise targeting yields superior performance. Overall, our approach allows
reaching top accuracy figures with up to 30% savings in annotation cost.
Comment: To be submitted to the Pattern Recognition Journal
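Scene-wise selection as described above can be sketched as ranking whole scenes by the average uncertainty of the interim classifier over their segments. The entropy criterion below is an illustrative informativeness measure; the paper's exact measure may differ:

```python
import numpy as np

def scene_wise_selection(scene_probs, batch_size):
    """Rank sketch scenes by mean per-segment prediction entropy and
    return the indices of the most informative scenes to label next.

    scene_probs: one (n_segments, n_classes) array of interim-classifier
    posteriors per unsegmented scene.
    """
    def mean_entropy(p):
        p = np.clip(p, 1e-12, 1.0)
        return float(np.mean(-np.sum(p * np.log(p), axis=1)))

    scores = [mean_entropy(p) for p in scene_probs]
    # Highest mean uncertainty first; user annotates these scenes in full.
    return sorted(range(len(scores)), key=lambda i: -scores[i])[:batch_size]

# Scene 0: confident predictions; scene 1: near-uniform, hence uncertain.
confident = np.array([[0.95, 0.03, 0.02], [0.90, 0.05, 0.05]])
uncertain = np.array([[0.40, 0.30, 0.30], [0.34, 0.33, 0.33]])
print(scene_wise_selection([confident, uncertain], batch_size=1))  # [1]
```

Segment-wise selection would instead score individual fragments (rows of these arrays) and ask the user to label only the highest-entropy ones.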
An improved sex specific and age dependent classification model for Parkinson's diagnosis using handwriting measurement
Accurate diagnosis is crucial for preventing the progression of Parkinson's disease, as well as for improving the quality of life of individuals with the disease. In this paper, we develop a sex-specific and age-dependent classification method to diagnose Parkinson's disease using online handwriting recorded from individuals with Parkinson's (n=37; m/f 19/18; age 69.3±10.9 years) and healthy controls (n=38; m/f 20/18; age 62.4±11.3 years). The sex-specific and age-dependent classifier was observed to significantly outperform the generalized classifier: an improved accuracy of 83.75% (SD=1.63) with the female-specific classifier and 79.55% (SD=1.58) with the old-age-dependent classifier was observed, compared to 75.76% (SD=1.17) with the generalized classifier. Finally, combining the age and sex information proved encouraging for classification. We performed a rigorous analysis of the dominance of sex-specific and age-dependent features for Parkinson's detection and ranked them using the support vector machine (SVM) ranking method. Distinct sets of features were observed to dominate, yielding higher classification accuracy in the different categories of classification.
Comment: Journal of Computer Methods and Programs in Biomedicine (accepted 27 December 2019)
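One common form of SVM-based feature ranking, ordering features by the absolute weights of a linear SVM, can be sketched as follows. The synthetic data and feature layout are illustrative stand-ins for the handwriting kinematics, and this weight-based scheme is an assumption about the ranking method, not the paper's exact procedure:

```python
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import LinearSVC

# Synthetic stand-in: feature 0 separates the classes, features 1-2 are noise.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
y = (X[:, 0] > 0).astype(int)

pipe = make_pipeline(StandardScaler(), LinearSVC(C=1.0, random_state=0))
pipe.fit(X, y)

# Rank features by the absolute weight the linear SVM assigns to each one.
weights = np.abs(pipe.named_steps["linearsvc"].coef_[0])
ranking = np.argsort(-weights)
print(ranking[0])  # the discriminative feature ranks first
```

On real data this would be applied separately per sex and age group to see which handwriting features dominate in each category, as the abstract describes.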
Scene Text Detection via Holistic, Multi-Channel Prediction
Recently, scene text detection has become an active research topic in
computer vision and document analysis, because of its great importance and
significant challenge. However, the vast majority of existing methods detect
text within local regions, typically through extracting character, word or line
level candidates followed by candidate aggregation and false positive
elimination, which potentially excludes the effect of wide-scope and long-range
contextual cues in the scene. To take full advantage of the rich information
available in the whole natural image, we propose to localize text in a holistic
manner, by casting scene text detection as a semantic segmentation problem. The
proposed algorithm directly runs on full images and produces global, pixel-wise
prediction maps, in which detections are subsequently formed. To better make
use of the properties of text, three types of information regarding text
region, individual characters and their relationship are estimated, with a
single Fully Convolutional Network (FCN) model. With such predictions of text
properties, the proposed algorithm can simultaneously handle horizontal,
multi-oriented and curved text in real-world natural images. The experiments on
standard benchmarks, including ICDAR 2013, ICDAR 2015 and MSRA-TD500,
demonstrate that the proposed algorithm substantially outperforms previous
state-of-the-art approaches. Moreover, we report the first baseline result on the recently released, large-scale COCO-Text dataset.
Comment: 10 pages, 9 figures, 5 tables
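The step in which "detections are subsequently formed" from the pixel-wise prediction maps can be sketched as thresholding a text-region probability map and grouping pixels via connected components. This is a minimal sketch of that single step; the full method additionally fuses the character and relationship maps from the FCN:

```python
import numpy as np
from scipy import ndimage

def regions_from_text_map(prob_map, threshold=0.5):
    """Turn a pixel-wise text-region probability map (as an FCN would emit)
    into axis-aligned bounding boxes via thresholding + connected components.
    """
    mask = prob_map > threshold
    labeled, n_regions = ndimage.label(mask)  # 4-connected components
    boxes = []
    for sl in ndimage.find_objects(labeled):
        ys, xs = sl
        boxes.append((xs.start, ys.start, xs.stop, ys.stop))  # x0, y0, x1, y1
    return boxes

# Toy map with two high-probability "text line" blobs.
pred = np.zeros((8, 8))
pred[1:3, 1:5] = 0.9
pred[5:7, 2:6] = 0.8
print(regions_from_text_map(pred))  # [(1, 1, 5, 3), (2, 5, 6, 7)]
```

Axis-aligned boxes only cover the horizontal case; the multi-oriented and curved text the paper handles would require fitting rotated or polygonal regions to the same components.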