Scene Text Eraser
Text in natural scene images often carries personal
information, such as telephone numbers and home addresses. Publishing such
images carries a high risk of leaking that information. In this paper, we propose a
scene text erasing method that hides the information using an inpainting
convolutional neural network (CNN) model. The input is a scene text image, and
the expected output is a text-erased image in which all character regions are
filled with the colors of the surrounding background pixels. The CNN model
performs convolution followed by deconvolution, with interconnections
between the two stages. The training samples are paired with their corresponding inpainted
images, which serve as teaching signals during training. To evaluate
erasing performance, a novel scene text detection method is applied to the
output images. The same text detection measurement is then
applied to the images of the ICDAR2013 benchmark dataset. Compared with
direct text detection on the original images, detection after scene text erasing shows a
drastic decrease in precision, recall, and F-score, which demonstrates the
effectiveness of the proposed method for erasing text in natural scene images.
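The evaluation idea above (an effective eraser should make a text detector fail) can be sketched with the standard definitions of precision, recall, and F-score. This is a minimal illustration, not the paper's evaluation code: the function name `detection_scores` and all counts are hypothetical and are not results from the paper.

```python
def detection_scores(true_positives, num_detections, num_ground_truth):
    """Return (precision, recall, f_score) for a text detector.

    true_positives: detections that match a ground-truth text region
    num_detections: all regions reported by the detector
    num_ground_truth: all annotated text regions
    """
    precision = true_positives / num_detections if num_detections else 0.0
    recall = true_positives / num_ground_truth if num_ground_truth else 0.0
    denom = precision + recall
    f_score = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f_score


# Hypothetical counts for illustration only (not taken from the paper).
before = detection_scores(true_positives=80, num_detections=100, num_ground_truth=100)
after = detection_scores(true_positives=10, num_detections=40, num_ground_truth=100)

# A successful eraser lowers all three scores on the erased images.
print("original images:", before)
print("erased images:  ", after)
```

Lower scores on the erased images indicate that less text remains detectable, which is the sense in which the abstract treats a drop in detection performance as evidence of effective erasing.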
WordFences: Text localization and recognition
In collaboration with the Universitat de Barcelona (UB) and the Universitat Rovira i Virgili (URV). In recent years, text recognition has achieved remarkable success in recognizing scanned
document text. However, word recognition in natural images is still an open problem,
which generally requires time-consuming post-processing steps. We present a novel architecture
for individual word detection in scene images based on semantic segmentation.
Our contributions are twofold: the concept of WordFence, which detects border areas
surrounding each individual word, and a unique pixelwise weighted softmax loss function
which penalizes background and emphasizes small text regions. WordFence ensures that
each word is detected individually, and the new loss function provides a strong training
signal to both text and word border localization. The proposed technique avoids intensive
post-processing by combining semantic word segmentation with a voting scheme
for merging segmentations of multiple scales, producing an end-to-end word detection
system. We achieve superior localization recall on common benchmark datasets: 92%
recall on ICDAR11 and ICDAR13 and 63% recall on SVT. Furthermore, end-to-end
word recognition achieves a state-of-the-art 86% F-Score on ICDAR13.
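The general idea of a pixelwise weighted softmax loss can be sketched as follows. The abstract does not give the actual weighting scheme, so everything here is an assumption made for illustration: the three classes (background, text, word border), the per-pixel weight values, the normalisation by the weight sum, and the function name `weighted_softmax_loss`.

```python
import math


def weighted_softmax_loss(logits, labels, weights):
    """Weighted mean cross-entropy over pixels.

    logits:  one list of class scores per pixel
    labels:  one class index per pixel
    weights: one non-negative weight per pixel (assumed weighting scheme)
    """
    total = 0.0
    weight_sum = 0.0
    for scores, label, w in zip(logits, labels, weights):
        # Numerically stable log-sum-exp of the class scores.
        m = max(scores)
        log_z = m + math.log(sum(math.exp(s - m) for s in scores))
        # Cross-entropy of this pixel: -log softmax(scores)[label].
        total += w * (log_z - scores[label])
        weight_sum += w
    return total / weight_sum if weight_sum else 0.0


# Assumed classes: 0 = background, 1 = text, 2 = word border.
# Both pixels are predicted as background; the second is actually text.
logits = [[2.0, 0.0, 0.0], [2.0, 0.0, 0.0]]
labels = [0, 1]

uniform = weighted_softmax_loss(logits, labels, weights=[1.0, 1.0])
emphasised = weighted_softmax_loss(logits, labels, weights=[1.0, 5.0])
print(uniform, emphasised)
```

With a larger weight on the text pixel, the misclassification contributes more to the loss, which is one way a loss can emphasise small text regions relative to the dominant background.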
Automatic detection and extraction of artificial text in video
A significant challenge in large multimedia databases is the
provision of efficient means for semantic indexing and retrieval of visual information. Artificial text in video is normally generated in order to supplement or summarise the visual content and thus is an important carrier of information that is highly relevant to the content of the video. As such, it is a potential ready-to-use source of semantic information. In this paper we present an algorithm for detection and localisation of artificial text in video using a horizontal difference magnitude measure and morphological processing. The result of character segmentation, based on a modified version of the Wolf-Jolion
algorithm [1][2], is enhanced using smoothing and multiple
binarisation. The resulting text is passed to an "off-the-shelf" noncommercial OCR engine. Detection, localisation and recognition results for a 20-minute MPEG-1 encoded television programme are presented.
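A horizontal difference measure of the kind mentioned above can be sketched in a few lines. This is a simplified illustration, not the paper's algorithm: the exact measure, the threshold, and the function names are assumptions, and the morphological processing step is omitted. The intuition is that overlaid text produces many strong intensity changes along a row.

```python
def horizontal_difference(image):
    """Absolute intensity difference between horizontally adjacent pixels."""
    return [
        [abs(row[x + 1] - row[x]) for x in range(len(row) - 1)]
        for row in image
    ]


def candidate_text_rows(image, threshold):
    """Indices of rows whose summed horizontal difference exceeds threshold."""
    diff = horizontal_difference(image)
    return [y for y, row in enumerate(diff) if sum(row) > threshold]


# Toy greyscale frame: rows 1 and 2 alternate sharply, as text strokes would.
frame = [
    [10, 10, 10, 10, 10, 10],
    [10, 200, 10, 200, 10, 200],
    [10, 200, 10, 200, 10, 200],
    [10, 10, 10, 10, 10, 10],
]

rows = candidate_text_rows(frame, threshold=500)  # threshold chosen arbitrarily
print(rows)
```

In a fuller pipeline, the thresholded response would typically be cleaned up with morphological operations before the candidate regions are segmented and passed to OCR.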