Screen Content Image Segmentation Using Sparse-Smooth Decomposition
Sparse decomposition has been extensively used for different applications,
including signal compression, denoising, and document analysis. In this
paper, sparse decomposition is used for image segmentation. The proposed
algorithm separates the background and foreground using a sparse-smooth
decomposition technique such that the smooth and sparse components correspond
to the background and foreground respectively. This algorithm is tested on
several test images from HEVC test sequences and is shown to have superior
performance over other methods, such as the hierarchical k-means clustering in
DjVu. This segmentation algorithm can also be used for text extraction, video
compression and medical image segmentation.
Comment: Asilomar Conference on Signals, Systems and Computers, IEEE, 2015
(to appear)
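The abstract above does not spell out the decomposition itself, so the following is only a minimal one-dimensional sketch of the general idea, under stated assumptions: the smooth background is modelled as a least-squares line, and the sparse foreground is whatever residual exceeds a threshold. The function names, the line model, and the threshold are illustrative choices, not the authors' algorithm.

```python
def fit_line(values, keep):
    """Least-squares line fitted only to samples where keep[i] is True."""
    xs = [i for i, k in enumerate(keep) if k]
    ys = [values[i] for i in xs]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx if sxx else 0.0
    return [mean_y + slope * (i - mean_x) for i in range(len(values))]


def sparse_smooth_split(values, threshold, iterations=5):
    """Alternate between fitting the smooth part and thresholding the residual."""
    keep = [True] * len(values)  # samples currently treated as background
    smooth = fit_line(values, keep)
    for _ in range(iterations):
        keep = [abs(v - s) <= threshold for v, s in zip(values, smooth)]
        if not any(keep):
            break  # nothing left to fit; stop with the last estimate
        smooth = fit_line(values, keep)
    # Residuals above the threshold form the sparse (foreground) component.
    sparse = [v - s if abs(v - s) > threshold else 0.0
              for v, s in zip(values, smooth)]
    return smooth, sparse


# Hypothetical pixel row: a gentle ramp with one bright "foreground" sample.
row = [10, 11, 12, 13, 200, 15, 16, 17]
smooth, sparse = sparse_smooth_split(row, threshold=50)
foreground = [i for i, s in enumerate(sparse) if s != 0.0]
```

Here `foreground` holds the indices assigned to the sparse component and `smooth` holds the background estimate at every sample. A two-dimensional version would replace the line with a richer smooth basis.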
Logical segmentation for article extraction in digitized old newspapers
Newspapers are documents made of news items and informative articles. They are
not meant to be read sequentially: the reader can pick his items in any order he
fancies. Ignoring this structural property, most digitized newspaper archives
only offer access by issue or at best by page to their content. We have built a
digitization workflow that automatically extracts newspaper articles from
images, which allows indexing and retrieval of information at the article
level. Our back-end system extracts the logical structure of the page to
produce the informative units: the articles. Each image is labelled at the
pixel level by a machine-learning-based method; the logical structure of the
page is then built up from there by detecting structuring entities
such as horizontal and vertical separators, titles and text lines. This logical
structure is stored in a METS wrapper associated with the ALTO file produced by
the system, which includes the OCRed text. Our front-end system provides
high-definition web visualisation of images, textual indexing and retrieval
facilities, and searching and reading at the article level. Article
transcriptions can be collaboratively corrected, which in turn improves indexing.
We are currently testing our system on the archives of the Journal de Rouen,
one of France's oldest local newspapers. These 250 years of publication amount to
300 000 pages of highly variable image quality and layout complexity. Test year
1808 can be consulted at plair.univ-rouen.fr.
Comment: ACM Document Engineering, France (2012)
A Deep Learning Approach to Denoise Optical Coherence Tomography Images of the Optic Nerve Head
Purpose: To develop a deep learning approach to de-noise optical coherence
tomography (OCT) B-scans of the optic nerve head (ONH).
Methods: Volume scans consisting of 97 horizontal B-scans were acquired
through the center of the ONH using a commercial OCT device (Spectralis) for
both eyes of 20 subjects. For each eye, single-frame (without signal
averaging), and multi-frame (75x signal averaging) volume scans were obtained.
A custom deep learning network was then designed and trained with 2,328 "clean
B-scans" (multi-frame B-scans), and their corresponding "noisy B-scans" (clean
B-scans + Gaussian noise) to de-noise the single-frame B-scans. The performance
of the de-noising algorithm was assessed qualitatively, and quantitatively on
1,552 B-scans using the signal to noise ratio (SNR), contrast to noise ratio
(CNR), and mean structural similarity index metrics (MSSIM).
Results: The proposed algorithm successfully denoised unseen single-frame OCT
B-scans. The denoised B-scans were qualitatively similar to their corresponding
multi-frame B-scans, with enhanced visibility of the ONH tissues. The mean SNR
increased from […] dB (single-frame) to […] dB (denoised). For all the ONH
tissues, the mean CNR increased from […] (single-frame) to […] (denoised).
The MSSIM increased from […] (single-frame) to […] (denoised) when compared with
the corresponding multi-frame B-scans.
Conclusions: Our deep learning algorithm can denoise a single-frame OCT
B-scan of the ONH in under 20 ms, thus offering a framework to obtain superior
quality OCT B-scans with reduced scanning times and minimal patient discomfort.
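The abstract names SNR and CNR as evaluation metrics but does not define them. As a rough illustration only, the sketch below uses one common pair of definitions: SNR in dB as reference energy over error energy, and CNR as the absolute mean difference between a tissue region and the background, divided by their pooled standard deviation. These formulas and the function names are assumptions and may differ from the ones used in the paper. MSSIM is omitted.

```python
import math
import statistics


def snr_db(reference, estimate):
    """SNR in dB: reference energy over squared-error energy (assumed definition)."""
    signal = sum(r * r for r in reference)
    noise = sum((r - e) ** 2 for r, e in zip(reference, estimate))
    # Raises ZeroDivisionError if the estimate equals the reference exactly.
    return 10.0 * math.log10(signal / noise)


def cnr(region, background):
    """CNR: |mean difference| over pooled standard deviation (assumed definition)."""
    contrast = abs(statistics.mean(region) - statistics.mean(background))
    pooled = math.sqrt(0.5 * (statistics.pvariance(region)
                              + statistics.pvariance(background)))
    return contrast / pooled


# Hypothetical flattened pixel intensities, for illustration only.
clean = [1.0, 1.0, 1.0, 1.0]
denoised = [1.0, 1.0, 1.0, 0.9]
snr_value = snr_db(clean, denoised)
cnr_value = cnr([10, 12], [2, 4])
```

In practice the inputs would be flattened pixel intensities from a B-scan and from manually or automatically selected tissue and background regions.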
Text Line Segmentation of Historical Documents: a Survey
A huge number of historical documents in libraries and in various
National Archives have not been exploited electronically. Although
automatic reading of complete pages remains, in most cases, a long-term
objective, tasks such as word spotting, text/image alignment, authentication
and extraction of specific fields are in use today. For all these tasks, a
major step is document segmentation into text lines. Because of the low quality
and the complexity of these documents (background noise, artifacts due to
aging, interfering lines), automatic text line segmentation remains an open
research field. The objective of this paper is to present a survey of existing
methods, developed during the last decade, and dedicated to documents of
historical interest.
Comment: 25 pages, submitted version. To appear in International Journal on
Document Analysis and Recognition. Online version available at
http://www.springerlink.com/content/k2813176280456k3
Pointwise Convolutional Neural Networks
Deep learning with 3D data such as reconstructed point clouds and CAD models
has received great research interest recently. However, the use of point
clouds with convolutional neural networks has so far not been fully
explored. In this paper, we present a convolutional neural network for semantic
segmentation and object recognition with 3D point clouds. At the core of our
network is pointwise convolution, a new convolution operator that can be
applied at each point of a point cloud. Our fully convolutional network design,
while being surprisingly simple to implement, can yield competitive accuracy in
both semantic segmentation and object recognition tasks.
Comment: 10 pages, 6 figures, 10 tables. Paper accepted to CVPR 201
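The abstract says only that pointwise convolution is applied at each point of a point cloud. The sketch below is a simplified illustration of that general idea, not the authors' operator. For every point it gathers neighbours within a radius, bins them into eight octant cells around the point, averages the scalar feature in each cell, and takes a weighted sum. The octant binning, the scalar features, and all names are assumptions made for this example.

```python
import math


def pointwise_conv(points, feats, weights, radius):
    """Per-point convolution: mean neighbour feature per octant, weighted and summed."""
    out = []
    for cx, cy, cz in points:
        sums = [0.0] * 8
        counts = [0] * 8
        for (x, y, z), f in zip(points, feats):
            dx, dy, dz = x - cx, y - cy, z - cz
            if math.sqrt(dx * dx + dy * dy + dz * dz) > radius:
                continue  # outside the kernel support
            # Octant index from the sign of each offset (the centre falls in cell 7).
            cell = (dx >= 0) * 4 + (dy >= 0) * 2 + (dz >= 0)
            sums[cell] += f
            counts[cell] += 1
        # Empty cells contribute nothing to the response.
        out.append(sum(w * s / c for w, s, c in zip(weights, sums, counts) if c))
    return out


# Hypothetical toy cloud: two nearby points and one isolated point.
points = [(0, 0, 0), (1, 0, 0), (5, 5, 5)]
feats = [1.0, 3.0, 10.0]
response = pointwise_conv(points, feats, weights=[1.0] * 8, radius=1.5)
```

With all weights set to one, `response` is just the sum of per-cell mean features. In a trained network the weights would be learned and the features would be vectors rather than scalars.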