Text Segmentation in Web Images Using Colour Perception and Topological Features

Author: Karatzas Dimosthenis
Publication date: 01/01/2002

The research presented in this thesis addresses the problem of text segmentation in Web images. Text is routinely created in image form (headers, banners, etc.) on Web pages in an attempt to overcome the stylistic limitations of HTML. This text, however, has a potentially high semantic value in terms of indexing and searching for the corresponding Web pages. As current search engine technology does not allow for text extraction and recognition in images, text in image form is ignored. Moreover, it is desirable to obtain a uniform representation of all visible text of a Web page (for applications such as voice browsing or automated content analysis). This thesis presents two methods for text segmentation in Web images using colour perception and topological features. The nature of Web images and the inherent problems for text segmentation are described, and a study is performed to assess the magnitude of the problem and establish the need for automated text segmentation methods. Two segmentation methods are subsequently presented: the Split-and-Merge segmentation method and the Fuzzy segmentation method. Although each method approaches it in a distinctly different way, the safe assumption that a human being should be able to read the text in any given Web image is the foundation of both methods' reasoning. This anthropocentric character, together with the use of topological features of connected components, constitutes the underlying working principle of the methods. An approach for classifying the connected components resulting from the segmentation methods as either characters or parts of the background is also presented.

CiteSeerX

Southampton (e-Prints Soton)

OpenGrey Repository

Tandem 2.0: Image and Text Data Generation Application

Author: Vitale Christopher J
Publication venue: CUNY Academic Works
Publication date: 01/02/2017

First created as part of the Digital Humanities Praxis course in the spring of 2012 at the CUNY Graduate Center, Tandem explores the generation of datasets composed of text and image data by leveraging Optical Character Recognition (OCR), Natural Language Processing (NLP) and Computer Vision (CV). This project builds upon that earlier work in a new programming framework. While other developers and digital humanities scholars have created similar tools specifically geared toward NLP (e.g. Voyant-Tools), as well as algorithms for image processing and feature extraction on the CV side, Tandem explores the process of developing a more robust and user-friendly web-based multimodal data generator using modern development processes, with the intention of expanding the use of the tool among interested academics. Tandem functions as a full-stack JavaScript in-browser web application that allows a user to log in and upload a corpus of image files for OCR, NLP, and CV-based image processing to facilitate data generation. The corpora intended for this tool include picture books, comics, and other types of image- and text-based manuscripts, and are discussed in detail. Once images are processed, the application provides some key initial insights and data, lightly visualized in a dashboard view for the user. As a research question, this project explores the viability of full-stack JavaScript application development for academic end products by looking at a variety of courses and literature that inspired the work, alongside the documented process of development of the application and proposed future enhancements for the tool. For those interested in further research or development, the full codebase for this project is available for download.

City University of New York

Text Extraction from Web Images Based on A Split-and-Merge Segmentation Method Using Color Perception

Author: Antonacopoulos Apostolos
Karatzas Dimosthenis
Publication date: 01/01/2004

This paper describes a complete approach to the segmentation and extraction of text from Web images for subsequent recognition, to ultimately achieve both effective indexing and presentation by non-visual means (e.g., audio). The method described here (the first in the authors' systematic approach to exploit human colour perception) enables the extraction of text in complex situations such as in the presence of varying colour (in both characters and background). More precisely, in addition to using structural features, the segmentation follows a split-and-merge strategy based on the Hue-Lightness-Saturation (HLS) representation of colour as a first approximation of an anthropocentric expression of the differences in chromaticity and lightness. Character-like components are then extracted as forming textlines in a number of orientations and along curves.
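
As a rough illustration of what comparing colours in HLS rather than RGB can look like, the sketch below converts two RGB pixels to HLS with Python's standard `colorsys` module and decides whether they are close enough to be merged. It is not the authors' method: the function names (`to_hls`, `hue_distance`, `similar`) and every threshold value are assumptions chosen only for illustration.

```python
import colorsys


def to_hls(pixel):
    """Convert an (R, G, B) tuple with 0-255 channels to (H, L, S) in [0, 1]."""
    r, g, b = (channel / 255.0 for channel in pixel)
    return colorsys.rgb_to_hls(r, g, b)


def hue_distance(h1, h2):
    """Distance between two hues on the colour wheel (hue wraps around at 1.0)."""
    d = abs(h1 - h2)
    return min(d, 1.0 - d)


def similar(p, q, hue_tol=0.05, light_tol=0.10, grey_sat=0.10):
    """Hypothetical merge test: True if two pixels look alike in HLS terms.

    All tolerances are illustrative assumptions, not values from the paper.
    """
    h1, l1, s1 = to_hls(p)
    h2, l2, s2 = to_hls(q)
    lightness_close = abs(l1 - l2) <= light_tol
    # Hue carries no information for near-grey pixels, so when both pixels
    # are unsaturated only lightness is compared.
    if s1 < grey_sat and s2 < grey_sat:
        return lightness_close
    return lightness_close and hue_distance(h1, h2) <= hue_tol
```

Under these assumed tolerances, `similar((255, 0, 0), (250, 10, 10))` treats two shades of red as mergeable, while `similar((255, 0, 0), (0, 0, 255))` keeps red and blue apart.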

Southampton (e-Prints Soton)

Colour Text Segmentation in Web Images Based on Human Perception

Author: A. Antonacopoulos
D. Karatzas
Publication date: 01/01/2007

There is a significant need to extract and analyse the text in images in Web documents, for effective indexing, semantic analysis and even presentation by non-visual means (e.g., audio). This paper argues that the challenging segmentation stage for such images benefits from a human perspective of colour perception in preference to RGB colour space analysis. The proposed approach enables the segmentation of text in complex situations such as in the presence of varying colour and texture (in both characters and background). More precisely, characters are segmented as distinct regions with separate chromaticity and/or lightness by performing a layer decomposition of the image. The method described here is a result of the authors' systematic approach to approximate the human colour perception characteristics for the identification of character regions. In this instance, the image is decomposed by performing histogram analysis of Hue and Lightness in the HLS colour space and merging using information on human discrimination of wavelength and luminance.
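
To make the idea of histogram analysis of Hue more concrete, here is a minimal sketch that bins the hues of a list of RGB pixels using Python's standard `colorsys` module. It only shows the histogram step, not the layer decomposition or the perceptual merging described in the abstract. The function name `hue_histogram`, the bin count, and the saturation cut-off are all assumptions made for this example.

```python
import colorsys


def hue_histogram(pixels, bins=10, min_saturation=0.1):
    """Count pixels per hue bin, skipping near-grey pixels.

    pixels: iterable of (R, G, B) tuples with 0-255 channels.
    bins and min_saturation are illustrative assumptions.
    """
    hist = [0] * bins
    for r, g, b in pixels:
        h, _lightness, s = colorsys.rgb_to_hls(r / 255.0, g / 255.0, b / 255.0)
        if s < min_saturation:
            continue  # hue is unreliable for unsaturated (grey) pixels
        index = min(int(h * bins), bins - 1)  # clamp in case h rounds to 1.0
        hist[index] += 1
    return hist
```

Peaks in such a histogram would indicate dominant hues, which a layer-decomposition step could then treat as candidate colour layers; grey pixels would need a separate analysis based on lightness.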

CiteSeerX

Southampton (e-Prints Soton)

University of Salford Institutional Repository

Crossref

Visual Representation of Text in Web Documents and Its Interpretation

Author: Antonacopoulos Apostolos
Karatzas Dimosthenis
Publication venue: Elsevier BV
Publication date: 01/01/2004

This paper examines the uses of text and its representation in Web documents in terms of the challenges in its interpretation. Particular attention is paid to the significant problem of non-uniform representation of text. This non-uniformity is mainly due to the presence of semantically important text in image form as opposed to the standard encoded text. The issues surrounding text representation in Web documents are discussed in the context of colour perception and spatial representation. The characteristics of the representation of text in image form are examined and research towards interpreting these images of text is briefly described.

Southampton (e-Prints Soton)

University of Salford Institutional Repository

Two Approaches for Text Segmentation in Web Images

Author: Antonacopoulos Apostolos
Karatzas Dimosthenis
Publication date: 01/01/2003

There is a significant need to recognise the text in images on web pages, both for effective indexing and for presentation by non-visual means (e.g., audio). This paper presents and compares two novel methods for the segmentation of characters for subsequent extraction and recognition. The novelty of both approaches is the combination of topological features of characters (different in each case) with an anthropocentric perspective of colour perception, in preference to RGB space analysis. Both approaches enable the extraction of text in complex situations such as in the presence of varying colour and texture (in both characters and background).

Southampton (e-Prints Soton)

University of Salford Institutional Repository

Image processing for the extraction of nutritional information from food labels

Author: Matsunaga Nate
Sullivan Rick
Publication venue: Scholar Commons
Publication date: 09/06/2015

Current techniques for tracking nutritional data require undesirable amounts of either time or manpower. People must choose between tediously recording and updating dietary information or depending on databases that are either unreliable and crowd-sourced or costly to maintain. Our project looks to overcome these pitfalls by providing a programming interface for image analysis that will read and report the information present on a nutrition label directly. Our solution involves a C++ library that combines image pre-processing, optical character recognition, and post-processing techniques to pull the relevant information from an image of a nutrition label. We apply an understanding of a nutrition label's content and data organization to approach the accuracy of traditional data-entry methods. Our system currently provides around 80% accuracy for most label images, and we will continue to work to improve our accuracy.

Scholar Commons - Santa Clara University