Neighborhood Label Extension for Handwritten/Printed Text Separation in Arabic Documents
This paper addresses the problem of separating handwritten and printed text in Arabic document images. The objective is to extract the handwritten text from the other parts of the document, so that specialized processing can subsequently be applied to the extracted handwritten part, or to the printed one. Documents are first preprocessed to remove any noise and correct the document orientation. Each document is then segmented into pseudo-lines, which are in turn segmented into pseudo-words. A local classification step, using a Gaussian-kernel SVM, assigns each pseudo-word to the handwritten or printed class. This label is then propagated through the pseudo-word's neighborhood in order to recover from classification errors. The proposed methodology has been tested on a set of public real Arabic documents, achieving a separation rate of around 90%.
On the Role of Context at Different Scales in Scene Parsing
Scene parsing can be formulated as a labeling problem where each
visual data element, e.g., each pixel of an image or each 3D
point in a point cloud, is assigned a semantic class label. One
can approach this problem by training a classifier and predicting
a class label for the data elements purely based on their local
properties. This approach, however, does not take into account
any kind of contextual information between different elements in
the image or point cloud. For example, in an application where we
are interested in labeling roadside objects, the fact that most
of the utility poles are connected to some power wires can be
very helpful in disambiguating them from other similar looking
classes. Recurrence of certain class combinations can also be
considered a good contextual hint, since these classes are very
likely to co-occur again. These forms of high-level contextual
information are often formulated using pairwise and higher-order
Conditional Random Fields (CRFs). A CRF is a probabilistic
graphical model that encodes the contextual relationships between
the data elements in a scene. In this thesis, we study the
potential of contextual information at different scales (ranges)
in scene parsing problems.
First, we propose a model that utilizes the local context of the
scene via a pairwise CRF. Our model acquires contextual
interactions between different classes by assessing their
misclassification rates using only the local properties of data.
In other words, no extra training is required for obtaining the
class interaction information.
Next, we expand the context field of view from a local range to a
longer range, and make use of higher-order models to encode more
complex contextual cues. More specifically, we introduce a new
model to employ geometric higher-order terms in a CRF for
semantic labeling of 3D point cloud data.
Despite the potential of the above models for capturing the
contextual cues in the scene, there are higher-level context cues
that cannot be encoded via pairwise and higher-order CRFs. For
instance, a vehicle is very unlikely to appear in a sea scene, or
buildings are frequently observed in a street scene. Such
information can be described as scene context and is modeled
using global image descriptors. In particular, through an image
retrieval procedure, we find images whose content is similar to
that of the query image, and use them for scene parsing. Another
problem of the above methods is that they rely on a
computationally expensive training process for the classification
using the local properties of data elements, which needs to be
repeated every time the training data is modified. We address
this issue by proposing a fast and efficient approach that
exempts us from the cumbersome training task, by transferring the
ground-truth information directly from the training data to the
test data.
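
The retrieval-and-transfer idea in the last paragraph can be sketched as follows. This is an illustrative toy under stated assumptions, not the thesis's method: the descriptors are made-up two-dimensional vectors, Euclidean distance stands in for whatever similarity measure is actually used, and the names `retrieve` and `transfer_labels` are hypothetical.

```python
import math

def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def retrieve(query_descriptor, training_set, k=2):
    """Return the k training images whose global descriptor is closest to the query."""
    ranked = sorted(
        training_set,
        key=lambda image: euclidean(query_descriptor, image["descriptor"]),
    )
    return ranked[:k]

def transfer_labels(query_descriptor, query_elements, training_set, k=2):
    """Label each query element with the label of its nearest element
    among the retrieved images only; no classifier is trained."""
    candidates = [
        element
        for image in retrieve(query_descriptor, training_set, k)
        for element in image["elements"]
    ]
    return [
        min(candidates, key=lambda cand: euclidean(feature, cand[0]))[1]
        for feature in query_elements
    ]

# Each training image: a global descriptor plus (local feature, ground-truth label) pairs.
training = [
    {"descriptor": [0.9, 0.1], "elements": [([0.2, 0.8], "sea"), ([0.7, 0.3], "boat")]},
    {"descriptor": [0.1, 0.9], "elements": [([0.3, 0.7], "road"), ([0.8, 0.2], "car")]},
    {"descriptor": [0.2, 0.8], "elements": [([0.5, 0.5], "building")]},
]

predicted = transfer_labels([0.1, 0.85], [[0.75, 0.25], [0.3, 0.75]], training, k=2)
print(predicted)  # ['car', 'road']
```

The first query element is locally about as close to "boat" as to "car", but the sea image is never retrieved for this street-like query, so only "car" is a candidate. That is the role the scene-level context plays here.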
Segmentation and labeling of documents using Conditional Random Fields
The paper describes the use of Conditional Random Fields (CRFs), which exploit contextual information, to automatically label extracted segments of scanned documents as Machine-print, Handwriting, or Noise. The result of such labeling can serve as an indexing step for a context-based image retrieval system or a biometric signature verification system. A simple region-growing algorithm is first used to segment the document into a number of patches. A label for each segmented patch is then inferred using a CRF model. The model is flexible enough to include signatures as a type of handwriting and to isolate them from machine-print and noise. The robustness of the model comes from the way a CRF models neighboring spatial dependencies in the labels as well as in the observed data. Maximum pseudo-likelihood estimates of the CRF parameters are learned using conjugate gradient descent. Labels are inferred by computing their probability under the model with Gibbs sampling. Experimental results show that this approach assigns correct labels to 95.75% of the data. The CRF-based model is shown to be superior to Neural Networks and Naive Bayes. Keywords: Conditional Random Field (CRF); labeling scanned documents; handwritten text extraction
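As a rough illustration of label inference by Gibbs sampling, the sketch below estimates per-patch label marginals on a tiny chain of three patches. The unary scores, the Potts-style `coupling` weight, and the function name `gibbs_marginals` are assumptions made for this example; the paper learns its parameters by maximum pseudo-likelihood, which is not reproduced here.

```python
import math
import random

LABELS = ["machine-print", "handwriting", "noise"]

def gibbs_marginals(unary, edges, coupling=1.0, sweeps=500, burn_in=100, seed=0):
    """Estimate per-patch label marginals of a pairwise model by Gibbs sampling.

    unary[i][l] is the log-score of label l for patch i; each edge (a, b)
    adds `coupling` to the log-score whenever patches a and b share a label.
    """
    rng = random.Random(seed)
    n, num_labels = len(unary), len(LABELS)
    neighbors = [[] for _ in range(n)]
    for a, b in edges:
        neighbors[a].append(b)
        neighbors[b].append(a)
    state = [rng.randrange(num_labels) for _ in range(n)]
    counts = [[0] * num_labels for _ in range(n)]
    for sweep in range(sweeps):
        for i in range(n):
            # Conditional log-score of each label given the neighbors' current labels.
            scores = [
                unary[i][l] + coupling * sum(1 for j in neighbors[i] if state[j] == l)
                for l in range(num_labels)
            ]
            top = max(scores)  # subtract the maximum for numerical stability
            weights = [math.exp(s - top) for s in scores]
            state[i] = rng.choices(range(num_labels), weights=weights)[0]
        if sweep >= burn_in:
            for i in range(n):
                counts[i][state[i]] += 1
    kept = sweeps - burn_in
    return [[c / kept for c in row] for row in counts]

# Three patches in a row: the outer two look like handwriting, the middle one is ambiguous.
unary = [[0.0, 5.0, 0.0], [0.0, 0.0, 0.0], [0.0, 5.0, 0.0]]
marginals = gibbs_marginals(unary, edges=[(0, 1), (1, 2)], coupling=1.5)
labels = [LABELS[max(range(len(LABELS)), key=row.__getitem__)] for row in marginals]
print(labels)
```

The middle patch has no evidence of its own, so its estimated marginal is driven by its two neighbors. This is the kind of spatial dependency between labels that the abstract credits for the model's robustness.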