17 research outputs found

Segmentation of the overlapping Kannada Characters

Author: Soumyadeep Sinha
Publication venue: 'Auricle Technologies, Pvt., Ltd.'
Publication date: 30/09/2016

Kannada is a widely spoken language in the southern part of India. Character segmentation of Kannada text is difficult because adjacent characters sometimes overlap in the vertical projection profile. In such cases, the usual method of character segmentation using the projection profile is not effective. In this paper we present a segmentation method in which overlapped characters are separated by connected component analysis.

International Journal on Recent and Innovation Trends in Computing and Communication
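
As a rough illustration of the idea in the abstract above, the sketch below labels connected components on a tiny binary image whose two shapes overlap in the vertical projection profile. It is not the authors' implementation: the 8-connectivity choice, the sample image, and the names `vertical_projection` and `connected_components` are assumptions made for this example.

```python
from collections import deque

# Hypothetical 4x6 binary image (1 = ink). The top stroke of the left shape
# reaches over the right shape, so no column is completely empty.
IMAGE = [
    [1, 1, 1, 1, 0, 0],
    [1, 0, 0, 0, 0, 0],
    [1, 0, 0, 1, 1, 1],
    [0, 0, 0, 0, 0, 1],
]


def vertical_projection(grid):
    """Return the number of ink pixels in each column."""
    return [sum(column) for column in zip(*grid)]


def connected_components(grid):
    """Return bounding boxes (left, top, right, bottom) of 8-connected ink
    regions, sorted from left to right."""
    rows, cols = len(grid), len(grid[0])
    seen = [[False] * cols for _ in range(rows)]
    boxes = []
    for r in range(rows):
        for c in range(cols):
            if grid[r][c] != 1 or seen[r][c]:
                continue
            # Breadth-first flood fill starting from an unvisited ink pixel.
            queue = deque([(r, c)])
            seen[r][c] = True
            top, bottom, left, right = r, r, c, c
            while queue:
                y, x = queue.popleft()
                top, bottom = min(top, y), max(bottom, y)
                left, right = min(left, x), max(right, x)
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = y + dy, x + dx
                        if (0 <= ny < rows and 0 <= nx < cols
                                and grid[ny][nx] == 1 and not seen[ny][nx]):
                            seen[ny][nx] = True
                            queue.append((ny, nx))
            boxes.append((left, top, right, bottom))
    return sorted(boxes)


if __name__ == "__main__":
    print("column sums:", vertical_projection(IMAGE))
    print("component boxes:", connected_components(IMAGE))
```

On this sample every column contains ink, so a cut based on the projection profile finds no gap, while component labelling still separates the two shapes.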

Text Line Segmentation of Historical Documents: a Survey

Author: A. Amin
A. Bozzi
A. Downton
A. Jain
A. Kolcz
Abderrazak Zahour
Bruno Taconet
C.L. Tan
C.V. Lakshmi
E. Cohen
E. Oztop
G. Seni
I.-K. Kim
K. Wong
L. Likforman-Sulem
L. O’Gorman
L.A. Fletcher
Laurence Likforman-Sulem
R. Plamondon
R.D. Lins
U. Pal
V. Shapiro
Ventadert Gusnard de de
Y. Solihin
Y.H. Tseng
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 10/04/2007

A huge number of historical documents in libraries and in various National Archives have not been exploited electronically. Although automatic reading of complete pages remains, in most cases, a long-term objective, tasks such as word spotting, text/image alignment, authentication and extraction of specific fields are in use today. For all these tasks, a major step is document segmentation into text lines. Because of the low quality and the complexity of these documents (background noise, artifacts due to aging, interfering lines), automatic text line segmentation remains an open research field. The objective of this paper is to present a survey of existing methods, developed during the last decade, and dedicated to documents of historical interest. Comment: 25 pages, submitted version. To appear in International Journal on Document Analysis and Recognition. Online version available at http://www.springerlink.com/content/k2813176280456k3

arXiv.org e-Print Archive

Crossref

Research and Development of Feature Extraction from Myanmar Palm Leaf Manuscripts for the Myanmar Character Recognition System

Author: Htay Win
Soe Nwe Nwe
Publication venue: 'Insight Society'
Publication date: 31/12/2019

This paper proposes an OCR system for handwritten Myanmar palm-leaf manuscripts. Each text area in the manuscript is segmented, and the segmented character images must then be recognized as Myanmar handwritten characters, which carry precious and invaluable historical information. The paper covers two essential steps: preprocessing and feature extraction. Preprocessing automatically extracts the palm-leaf manuscript region of interest from images taken by a camera and supplies enhanced images to the subsequent stages of Myanmar character recognition. A one-dimensional segmentation approach is used to crop the leaf area from the high-resolution image. Line count analysis is also performed to extract a region with a sufficient number of lines. Line segmentation is then carried out using an Object Frequency Histogram along the horizontal lines, which finds the best cut points between the lines. The same technique is applied vertically to obtain each character or the smallest group of characters. In total, 18 features are extracted to recognize the Myanmar palm-leaf manuscript characters. Although the experimental results are good, some difficulties related to connected components still need to be addressed.

International Journal on Advanced Science, Engineering and Information Technology
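
The horizontal line-segmentation step described in the abstract above can be sketched with a plain row-count histogram. The abstract does not define its Object Frequency Histogram, so this example substitutes a simple count of ink pixels per row and cuts wherever the count drops to a threshold. The names `row_histogram` and `split_lines`, the `threshold` parameter, and the sample page are all assumptions.

```python
# Hypothetical 6x4 binary page (1 = ink) with two text lines.
PAGE = [
    [0, 0, 0, 0],
    [1, 1, 0, 1],
    [1, 0, 1, 1],
    [0, 0, 0, 0],
    [0, 1, 1, 0],
    [0, 0, 0, 0],
]


def row_histogram(grid):
    """Return the number of ink pixels in each row."""
    return [sum(row) for row in grid]


def split_lines(grid, threshold=0):
    """Return (first_row, last_row) pairs, inclusive, for each run of
    consecutive rows whose ink count exceeds the threshold."""
    lines = []
    start = None
    for index, count in enumerate(row_histogram(grid)):
        if count > threshold and start is None:
            start = index                     # a text line begins
        elif count <= threshold and start is not None:
            lines.append((start, index - 1))  # the text line just ended
            start = None
    if start is not None:                     # line touching the bottom edge
        lines.append((start, len(grid) - 1))
    return lines


if __name__ == "__main__":
    print(split_lines(PAGE))
```

Applying the same function to a transposed line image, for example `split_lines(list(zip(*line_rows)))`, gives the vertical variant that the abstract uses to isolate characters or small character groups.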

Multi-Oriented Text Line Extraction from Handwritten Arabic Documents

Author: Belaïd Abdel
Ouwayed Nazih
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 16/09/2008

In this paper, we present a novel approach for multi-oriented text line extraction from handwritten Arabic documents. After image pre-processing, the local orientations are determined in small windows obtained by image paving. The orientation of the text within each window is estimated using the projection profile technique with several projection angles. Then, the windows with close angles are gathered into larger zones. We use the Wigner-Ville Distribution (WVD) to estimate the global orientation of each zone. The WVD is more precise than the classical projection profile technique. Afterwards, the text lines are extracted in each zone based on the follow-up of the baselines and the proximity of connected components. The experimental results demonstrate the efficiency of the proposed scheme. It has been evaluated on 50 documents, reaching an accuracy of about 97.6%.

INRIA a CCSD electronic archive server

HAL Descartes

Hal-Diderot
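
The window-level step in the abstract above, estimating text orientation by trying several projection angles, can be sketched as follows. This covers only the classical projection profile idea, not the Wigner-Ville refinement. The scoring rule (sum of squared bin counts), the angle sign convention, the sample points, and the names `profile_energy` and `estimate_orientation` are assumptions, not details taken from the paper.

```python
import math


def profile_energy(points, angle_deg):
    """Project (x, y) ink pixels along a direction of angle_deg degrees and
    return the sum of squared bin counts. The score is high when pixels pile
    up in few bins, which happens when the direction follows the text."""
    theta = math.radians(angle_deg)
    bins = {}
    for x, y in points:
        # Distance of the pixel from a line through the origin at angle theta.
        offset = round(y * math.cos(theta) - x * math.sin(theta))
        bins[offset] = bins.get(offset, 0) + 1
    return sum(count * count for count in bins.values())


def estimate_orientation(points, candidate_angles):
    """Return the candidate angle whose projection profile is most peaked."""
    return max(candidate_angles, key=lambda a: profile_energy(points, a))


# Hypothetical window: two parallel strokes running at 45 degrees.
POINTS = [(i, i) for i in range(20)] + [(i, i + 10) for i in range(20)]

if __name__ == "__main__":
    print(estimate_orientation(POINTS, range(-90, 91, 15)))
```

A finer angle step gives a more precise estimate at a proportionally higher cost per window.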

Model-Based Approach for Extracting Femur Contours in X-ray Images

Author: CHEN YING
Publication date: 30/11/2005

Master's, Master of Science

ScholarBank@NUS

The application of new methods for offline recognition in printed Arabic documents

Author: Bouressace Hassina
Publication date: 29/05/2020

SZTE Doktori Értekezések Repozitórium (SZTE Repository of Dissertations)

Off-line Arabic Handwriting Recognition System Using Fast Wavelet Transform

Author: E. GUMAH MOHAMED
Publication date: 01/01/2010

In this research, an off-line handwriting recognition system for the Arabic alphabet is introduced. The system contains three main stages: preprocessing, segmentation and recognition. In the preprocessing stage, the Radon transform was used in the design of algorithms for page, line and word skew correction as well as for word slant correction. In the segmentation stage, a Hough transform approach was used for line extraction. For line-to-word and word-to-character segmentation, a statistical method using a mathematical representation of the binary images of lines and words was used. Unlike most current handwriting recognition systems, our system simulates the human mechanism for image recognition, where images are encoded and saved in memory as groups according to their similarity to each other. Characters are decomposed into coefficient vectors using the fast wavelet transform; the vectors that represent a character in its different possible shapes are then saved as groups, with one representative for each group. Recognition is achieved by comparing the vector of the character to be recognized with the group representatives. Experiments showed that the proposed system is able to achieve the recognition task with 90.26% accuracy. The system needs at most 3.41 seconds to recognize a single character in a text of 15 lines where each line has 10 words on average.

UTPedia
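
The recognition step in the abstract above, comparing a wavelet coefficient vector against one representative per group, can be sketched in a few lines. The abstract does not name the wavelet family or the distance measure, so this example assumes an unnormalized Haar transform on a one-dimensional signal and squared Euclidean distance. The names `haar_transform` and `nearest_group`, the labels, and the sample signals are hypothetical.

```python
def haar_transform(signal):
    """Full Haar decomposition of a sequence whose length is a power of two.
    Each pass replaces pairs by their average and half-difference."""
    coeffs = list(signal)
    n = len(coeffs)
    while n > 1:
        half = n // 2
        averages = [(coeffs[2 * i] + coeffs[2 * i + 1]) / 2 for i in range(half)]
        details = [(coeffs[2 * i] - coeffs[2 * i + 1]) / 2 for i in range(half)]
        coeffs[:n] = averages + details
        n = half
    return coeffs


def nearest_group(vector, representatives):
    """Return the label of the representative closest to the given vector
    (squared Euclidean distance)."""
    def distance(label):
        return sum((a - b) ** 2 for a, b in zip(vector, representatives[label]))
    return min(representatives, key=distance)


# Hypothetical group representatives built from toy 4-sample signals.
REPRESENTATIVES = {
    "group_a": haar_transform([0, 1, 1, 0]),
    "group_b": haar_transform([1, 1, 0, 0]),
}

if __name__ == "__main__":
    query = haar_transform([0, 1, 1, 0.2])
    print(nearest_group(query, REPRESENTATIVES))
```

Comparing against one representative per group, rather than against every stored shape, keeps the number of distance computations equal to the number of groups.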

Arabic Manuscripts Analysis and Retrieval