Reports of the DAS02 Working Groups

Author: Barney Smith Elisa
El-Nasan Adnan
Ingold Rolf
Kise Koichi
Malizia Alessio
Monn David
Todoran Leon
Veeramachaneni Harsha
Publication venue: 'IUScholarWorks'
Publication date: 16/03/2004

This document is a collection of four working group reports in the areas of digital libraries, document image retrieval, layout analysis, and Web document analysis. These reports were the outcome of discussions by participants at the Fifth IAPR International Workshop on Document Analysis Systems, held in Princeton, NJ on 19-21 August 2002.

Boise State University - ScholarWorks

Colour Text Segmentation in Web Images Based on Human Perception

Author: A. Antonacopoulos
Bedford
Clark
D. Karatzas
Jain
Kim
Lopresti
Messelodi
Moghaddamzadeh
Silverstein
Wyszecki
Publication date: 01/01/2007

There is a significant need to extract and analyse the text in images in Web documents, for effective indexing, semantic analysis and even presentation by non-visual means (e.g., audio). This paper argues that the challenging segmentation stage for such images benefits from a human perspective of colour perception in preference to RGB colour space analysis. The proposed approach enables the segmentation of text in complex situations such as in the presence of varying colour and texture (characters and background). More precisely, characters are segmented as distinct regions with separate chromaticity and/or lightness by performing a layer decomposition of the image. The method described here is a result of the authors’ systematic approach to approximating human colour perception characteristics for the identification of character regions. In this instance, the image is decomposed by performing histogram analysis of Hue and Lightness in the HLS colour space, and the resulting layers are merged using information on human discrimination of wavelength and luminance.
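
The histogram step mentioned in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `hls_histograms`, the bin counts, and the choice to skip hue for achromatic pixels are assumptions made for illustration, and the perception-based merging stage is not covered. It only shows how 8-bit RGB pixels might be converted to HLS and counted into Hue and Lightness histograms using Python's standard `colorsys` module.

```python
import colorsys


def hls_histograms(pixels, hue_bins=12, lightness_bins=8):
    """Count Hue and Lightness values for an iterable of 8-bit RGB tuples.

    Hypothetical helper: the bin counts are arbitrary illustration values.
    """
    hue_hist = [0] * hue_bins
    light_hist = [0] * lightness_bins
    for r, g, b in pixels:
        # colorsys expects channels in [0, 1] and returns (hue, lightness, saturation)
        h, l, s = colorsys.rgb_to_hls(r / 255.0, g / 255.0, b / 255.0)
        # Hue carries no information for achromatic pixels (saturation 0),
        # so those pixels contribute to the lightness histogram only.
        if s > 0:
            hue_hist[min(int(h * hue_bins), hue_bins - 1)] += 1
        light_hist[min(int(l * lightness_bins), lightness_bins - 1)] += 1
    return hue_hist, light_hist
```

Peaks in the two histograms would then be candidates for separate layers; how peaks are selected and merged is specific to the paper and is not reproduced here.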

CiteSeerX

Southampton (e-Prints Soton)

University of Salford Institutional Repository

Crossref

Visual Representation of Text in Web Documents and Its Interpretation

Author: Antonacopoulos Apostolos
Karatzas Dimosthenis
Publication venue: 'Elsevier BV'
Publication date: 01/01/2004

This paper examines the uses of text and its representation in Web documents in terms of the challenges in its interpretation. Particular attention is paid to the significant problem of non-uniform representation of text. This non-uniformity is mainly due to the presence of semantically important text in image form as opposed to the standard encoded text. The issues surrounding text representation in Web documents are discussed in the context of colour perception and spatial representation. The characteristics of the representation of text in image form are examined, and research towards interpreting these images of text is briefly described.

Southampton (e-Prints Soton)

University of Salford Institutional Repository

Text Extraction from Web Images Based on A Split-and-Merge Segmentation Method Using Color Perception

Author: Antonacopoulos Apostolos
Karatzas Dimosthenis
Publication date: 01/01/2004

This paper describes a complete approach to the segmentation and extraction of text from Web images for subsequent recognition, to ultimately achieve both effective indexing and presentation by non-visual means (e.g., audio). The method described here (the first in the authors’ systematic approach to exploit human colour perception) enables the extraction of text in complex situations such as in the presence of varying colour (characters and background). More precisely, in addition to using structural features, the segmentation follows a split-and-merge strategy based on the Hue-Lightness-Saturation (HLS) representation of colour as a first approximation of an anthropocentric expression of the differences in chromaticity and lightness. Character-like components are then extracted as forming textlines in a number of orientations and along curves.

Southampton (e-Prints Soton)

A Fuzzy Approach to Text Segmentation in Web Images Based on Human Colour Perception

Author: Antonacopoulos Apostolos
Karatzas Dimosthenis
Publication venue: World Scientific Publishing Company
Publication date: 01/12/2003

This chapter describes a new approach for the segmentation of text in images on Web pages. In the same spirit as the authors’ previous work on this subject, this approach attempts to model the ability of humans to differentiate between colours. In this case, pixels of similar colour are first grouped using a colour distance defined in a perceptually uniform colour space (as opposed to the commonly used RGB). The resulting colour connected components are then grouped to form larger (character-like) regions with the aid of a propinquity measure, which is the output of a fuzzy inference system. This measure expresses the likelihood for merging two components based on two features. The first feature is the colour distance between the components in the L*a*b* colour space. The second feature expresses the topological relationship of the two components. The results indicate better performance than the authors’ previous methods and possibly better performance than other existing methods, although a direct comparison is not really possible due to the differences in application domain characteristics between this and previous methods.
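
As a rough illustration of the first feature, a colour distance in L*a*b*, the sketch below converts two 8-bit colours to CIE L*a*b* and takes their Euclidean distance. Several things here are assumptions rather than details from the abstract: that the input pixels are sRGB with a D65 white point, that the distance is plain Euclidean (the CIE76 colour difference), and the helper names `srgb_to_lab` and `lab_distance`. The fuzzy inference system and the topological feature are not modelled.

```python
import math

# Assumed reference white: D65, 2-degree observer
_WHITE = (0.95047, 1.0, 1.08883)


def srgb_to_lab(rgb):
    """Convert an 8-bit sRGB triple to CIE L*a*b* (hypothetical helper)."""
    def linearise(c):
        # Undo the sRGB transfer curve
        c = c / 255.0
        return c / 12.92 if c <= 0.04045 else ((c + 0.055) / 1.055) ** 2.4

    r, g, b = (linearise(c) for c in rgb)
    # Linear sRGB to CIE XYZ
    x = 0.4124 * r + 0.3576 * g + 0.1805 * b
    y = 0.2126 * r + 0.7152 * g + 0.0722 * b
    z = 0.0193 * r + 0.1192 * g + 0.9505 * b

    def f(t):
        delta = 6 / 29
        return t ** (1 / 3) if t > delta ** 3 else t / (3 * delta ** 2) + 4 / 29

    fx, fy, fz = (f(v / w) for v, w in zip((x, y, z), _WHITE))
    return (116 * fy - 16, 500 * (fx - fy), 200 * (fy - fz))


def lab_distance(rgb1, rgb2):
    """Euclidean distance between two colours in L*a*b* (CIE76 colour difference)."""
    return math.dist(srgb_to_lab(rgb1), srgb_to_lab(rgb2))
```

In a pipeline like the one the abstract describes, a value such as `lab_distance(c1, c2)` for the mean colours of two components would be one of the two inputs to the merging decision; `math.dist` requires Python 3.8 or later.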

Southampton (e-Prints Soton)

Informatics Research Institute (IRIS) October 2005 newsletter

Author: Rezgui Y
Publication venue: University of Salford, UK
Publication date: 01/10/2005

University of Salford Institutional Repository