
12,618 research outputs found

Text Line Segmentation of Historical Documents: a Survey

Author: A. Amin
A. Bozzi
A. Downton
A. Jain
A. Kolcz
Abderrazak Zahour
Bruno Taconet
C.L. Tan
C.V. Lakshmi
E. Cohen
E. Oztop
G. Seni
I.-K. Kim
K. Wong
L. Likforman-Sulem
L. O’Gorman
L.A. Fletcher
Laurence Likforman-Sulem
R. Plamondon
R.D. Lins
U. Pal
V. Shapiro
Ventadert Gusnard de de
Y. Solihin
Y.H. Tseng
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 10/04/2007

Libraries and various national archives hold a huge number of historical documents that have not been exploited electronically. Although automatic reading of complete pages remains, in most cases, a long-term objective, tasks such as word spotting, text/image alignment, authentication and extraction of specific fields are in use today. For all these tasks, a major step is document segmentation into text lines. Because of the low quality and the complexity of these documents (background noise, artifacts due to aging, interfering lines), automatic text line segmentation remains an open research field. The objective of this paper is to present a survey of existing methods, developed during the last decade, and dedicated to documents of historical interest. Comment: 25 pages, submitted version, to appear in International Journal on Document Analysis and Recognition; online version available at http://www.springerlink.com/content/k2813176280456k3

arXiv.org e-Print Archive

Crossref

Non-Visual Representation of Complex Documents for Use in Digital Talking Books

Author: Nazemi Azadeh
Publication venue: Curtin University
Publication date: 01/01/2015

Essential written information such as textbooks, bills, and catalogues needs to be accessible to everyone. However, access is not always available to vision-impaired people, as they require electronic documents to be available in specific formats. To address the accessibility issues of electronic documents, this research aims to design an affordable, portable, standalone and simple-to-use complete reading system that will convert and describe complex components in electronic documents for print-disabled users.

Irish Universities

DCU Online Research Access Service

espace@Curtin


Simultaneous mesoscopic and two-photon imaging of neuronal activity in cortical circuits.

Author: Barson Daniel
Cardin Jessica A
Constable R Todd
Crair Michael C
Hamodi Ali S
Higley Michael J
Lur Gyorgy
Shen Xilin
Publication venue: eScholarship, University of California
Publication date: 01/01/2020

Spontaneous and sensory-evoked activity propagates across varying spatial scales in the mammalian cortex, but technical challenges have limited conceptual links between the function of local neuronal circuits and brain-wide network dynamics. We present a method for simultaneous cellular-resolution two-photon calcium imaging of a local microcircuit and mesoscopic widefield calcium imaging of the entire cortical mantle in awake mice. Our multi-scale approach involves a microscope with an orthogonal axis design where the mesoscopic objective is oriented above the brain and the two-photon objective is oriented horizontally, with imaging performed through a microprism. We also introduce a viral transduction method for robust and widespread gene delivery in the mouse brain. These approaches allow us to identify the behavioral state-dependent functional connectivity of pyramidal neurons and vasoactive intestinal peptide-expressing interneurons with long-range cortical networks. Our imaging system provides a powerful strategy for investigating cortical architecture across a wide range of spatial scales

eScholarship - University of California

Information Preserving Processing of Noisy Handwritten Document Images

Author: Chen Jin
Publication venue: Lehigh Preserve
Publication date

Many pre-processing techniques that normalize artifacts and clean noise induce anomalies due to discretization of the document image. Important information that could be used at later stages may be lost. A proposed composite-model framework takes into account pre-printed information, user-added data, and digitization characteristics. Its benefits are demonstrated by experiments with statistically significant results. Separating pre-printed ruling lines from user-added handwriting shows how ruling lines impact people's handwriting and how they can be exploited for identifying writers. Ruling line detection based on multi-line linear regression reduces the mean error of counting them from 0.10 to 0.03, 6.70 to 0.06, and 0.13 to 0.02, compared to an HMM-based approach on three standard test datasets, thereby reducing human correction time by 50%, 83%, and 72% on average. On 61 page images from 16 rule-form templates, the precision and recall of form cell recognition are increased by 2.7% and 3.7%, compared to a cross-matrix approach. Compensating for and exploiting ruling lines during feature extraction rather than pre-processing raises the writer identification accuracy from 61.2% to 67.7% on a 61-writer noisy Arabic dataset. Similarly, counteracting page-wise skew by subtracting it or transforming contours in a continuous coordinate system during feature extraction improves the writer identification accuracy. An implementation study of contour-hinge features reveals that utilizing the full probabilistic probability distribution function matrix improves the writer identification accuracy from 74.9% to 79.5%.

Lehigh University: Lehigh Preserve

Web-Based Visualization of Very Large Scientific Astronomy Imagery

Author: Bertin Emmanuel
Marmo Chiara
Pillay Ruven
Publication venue: 'Elsevier BV'
Publication date: 01/01/2015

Visualizing and navigating through large astronomy images from a remote location with current astronomy display tools can be a frustrating experience in terms of speed and ergonomics, especially on mobile devices. In this paper, we present a high-performance, versatile and robust client-server system for remote visualization and analysis of extremely large scientific images. Applications of this work include survey image quality control, interactive data query and exploration, citizen science, as well as public outreach. The proposed software is entirely open source and is designed to be generic and applicable to a variety of datasets. It provides access to floating-point data at terabyte scales, with the ability to precisely adjust image settings in real time. The proposed clients are lightweight, platform-independent web applications built on standard HTML5 web technologies and compatible with both touch and mouse-based devices. We assess the performance of the system and show that a single server can comfortably handle more than a hundred simultaneous users accessing full-precision 32-bit astronomy data. Comment: Published in Astronomy & Computing. IIPImage server available from http://iipimage.sourceforge.net. Visiomatic code and demos available from http://www.visiomatic.org

arXiv.org e-Print Archive

CiteSeerX

HAL-INSU

Hal-Diderot

Investigation of techniques for inventorying forested regions. Volume 2: Forestry information system requirements and joint use of remotely sensed and ancillary data

Author: Cicone R. C.
Crist E. P.
Malila W. A.
Nalepka R. F.
Publication venue
Publication date

The author has identified the following significant results. Effects of terrain topography in mountainous forested regions on LANDSAT signals and classifier training were found to be significant. The aspect of sloping terrain relative to the sun's azimuth was the major cause of variability. A relative insolation factor could be defined which, in a single variable, represents the joint effects of slope, aspect, and solar geometry on irradiance. Forest canopy reflectances were found, both through simulation and empirically, to have nondiffuse reflectance characteristics. Training procedures could be improved by stratifying in the space of ancillary variables and training in each stratum. Application of the Tasselled-Cap transformation to LANDSAT data acquired over forested terrain could provide a viable technique for data compression and convenient physical interpretations.

NASA Technical Reports Server

Structure Diagram Recognition in Financial Announcements

Author: Hou Qiyu
Li Ruixuan
Qiao Meixuan
Wang Jun
Xiang Junfu
Publication venue
Publication date: 25/04/2023

Accurately extracting structured data from structure diagrams in financial announcements is of great practical importance for building financial knowledge graphs and further improving the efficiency of various financial applications. First, we proposed a new method for recognizing structure diagrams in financial announcements, which can better detect and extract different types of connecting lines, including straight lines, curves, and polylines of different orientations and angles. Second, we developed a two-stage method to efficiently generate the industry's first benchmark of structure diagrams from Chinese financial announcements. In the first stage, a large number of diagrams were synthesized and annotated using an automated tool to train a preliminary recognition model with fairly good performance. In the second stage, a high-quality benchmark was obtained by automatically annotating real-world structure diagrams with the preliminary model and then making a few manual corrections. Finally, we experimentally verified the significant performance advantage of our structure diagram recognition method over previous methods.

arXiv.org e-Print Archive

Modern Information Systems

Author
Publication venue: 'IntechOpen'
Publication date: 20/04/2021

The development of modern information systems is a demanding task. New technologies and tools are designed, implemented and presented in the market on a daily basis. User needs change dramatically fast, and the IT industry strives to reach the level of efficiency and adaptability its systems need in order to be competitive and up-to-date. Thus, the realization of modern information systems with strong characteristics and functionalities, implemented for specific areas of interest, is a fact of our modern and demanding digital society, and this is the main scope of this book. The book therefore aims to present a number of innovative and recently developed information systems. It is titled "Modern Information Systems" and includes 8 chapters. This book may assist researchers in studying the innovative functions of modern systems in various areas such as health, telematics, and knowledge management. It can also assist young students in capturing the new research tendencies in information systems development.

Directory of Open Access Books (DOAB)