COCO_TS Dataset: Pixel-level Annotations Based on Weak Supervision for Scene Text Segmentation
The absence of large-scale datasets with pixel-level supervision is a
significant obstacle for the training of deep convolutional networks for scene
text segmentation. For this reason, synthetic data generation is normally
employed to enlarge the training dataset. Nonetheless, synthetic data cannot
reproduce the complexity and variability of natural images. In this paper, a
weakly supervised learning approach is used to reduce the shift between
training on real and synthetic data. Pixel-level supervision for a text
detection dataset (i.e., one where only bounding-box annotations are available) is
generated. In particular, the COCO-Text-Segmentation (COCO_TS) dataset, which
provides pixel-level supervision for the COCO-Text dataset, is created and
released. The generated annotations are used to train a deep convolutional
neural network for semantic segmentation. Experiments show that the proposed
dataset can be used instead of synthetic data, allowing us to use only a
fraction of the training samples while significantly improving performance.
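The supervision-generation idea can be caricatured in a few lines: in its crudest form, every pixel inside a text bounding box is treated as a (noisy) text label. This is only a sketch of the general principle — the paper's actual pipeline refines boxes into true pixel-level masks, and `boxes_to_mask` and its box format are assumptions, not the released code.

```python
import numpy as np

def boxes_to_mask(shape, boxes):
    """Rasterize bounding boxes into a binary pixel mask.

    Crude stand-in for weak supervision: every pixel inside a
    text bounding box is marked as text. `boxes` is a list of
    (x0, y0, x1, y1) tuples in pixel coordinates (assumed format).
    """
    mask = np.zeros(shape, dtype=np.uint8)
    for x0, y0, x1, y1 in boxes:
        mask[y0:y1, x0:x1] = 1
    return mask

mask = boxes_to_mask((100, 200), [(10, 20, 60, 40), (80, 50, 150, 70)])
print(mask.sum())  # 2400 pixels labelled as text
```

A segmentation network trained on such masks over-labels the box interiors; the paper's contribution is precisely to turn these coarse labels into clean per-pixel ones.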
e-Counterfeit: a mobile-server platform for document counterfeit detection
This paper presents a novel application to detect counterfeit identity
documents forged by a scan-printing operation. Texture analysis approaches are
proposed to extract validation features from the security background that is
usually printed in documents such as IDs or banknotes. The main contribution of this
work is the end-to-end mobile-server architecture, which provides a service for
non-expert users and therefore can be used in several scenarios. The system
also provides a crowdsourcing mode so labeled images can be gathered,
generating databases for incremental training of the algorithms.
Comment: 6 pages, 5 figures
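The abstract does not specify which texture features are extracted; local binary patterns (LBP) are one common texture descriptor and are shown below purely as an assumed illustration of what "validation features from the security background" could look like — this is not the paper's method.

```python
import numpy as np

def lbp_histogram(gray):
    """Toy texture descriptor (local binary patterns, assumed example).

    For each interior pixel, an 8-bit code records which of its 8
    neighbours are >= the centre value; the normalized histogram of
    codes over the image is the texture feature vector.
    """
    c = gray[1:-1, 1:-1]
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    code = np.zeros_like(c, dtype=np.uint8)
    for bit, (dy, dx) in enumerate(offsets):
        nb = gray[1 + dy:gray.shape[0] - 1 + dy,
                  1 + dx:gray.shape[1] - 1 + dx]
        code |= (nb >= c).astype(np.uint8) << bit
    hist = np.bincount(code.ravel(), minlength=256).astype(float)
    return hist / hist.sum()
```

A classifier comparing such histograms from a scanned document against those of genuine security backgrounds is one plausible shape for the validation step the abstract describes.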
Screen Content Image Segmentation Using Sparse-Smooth Decomposition
Sparse decomposition has been extensively used for different applications
including signal compression and denoising and document analysis. In this
paper, sparse decomposition is used for image segmentation. The proposed
algorithm separates the background and foreground using a sparse-smooth
decomposition technique, such that the smooth and sparse components correspond
to the background and foreground, respectively. This algorithm is tested on
several test images from HEVC test sequences and is shown to have superior
performance over other methods, such as the hierarchical k-means clustering in
DjVu. This segmentation algorithm can also be used for text extraction, video
compression, and medical image segmentation.
Comment: Asilomar Conference on Signals, Systems and Computers, IEEE, 2015 (to appear)
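The decomposition idea can be illustrated with a toy stand-in: a local-mean "smooth" background plus a thresholded residual as the "sparse" foreground. This is an assumed simplification — the paper solves a proper sparse-smooth optimization, and `sparse_smooth_split`, the box filter, and the threshold here are hypothetical choices.

```python
import numpy as np

def sparse_smooth_split(img, win=5, thresh=0.05):
    """Toy sparse-smooth decomposition (not the paper's optimizer).

    Smooth (background) component: local mean over a win x win box.
    Sparse (foreground) component: residual kept only where its
    magnitude exceeds `thresh`, zero elsewhere.
    """
    pad = win // 2
    padded = np.pad(img, pad, mode="edge")
    smooth = np.zeros_like(img, dtype=float)
    for dy in range(win):          # box filter via shifted sums
        for dx in range(win):
            smooth += padded[dy:dy + img.shape[0], dx:dx + img.shape[1]]
    smooth /= win * win
    residual = img - smooth
    sparse = np.where(np.abs(residual) > thresh, residual, 0.0)
    return smooth, sparse

# Flat background with a bright "text stroke".
img = np.full((20, 20), 0.5)
img[8:12, 4:16] = 1.0
smooth, sparse = sparse_smooth_split(img)
```

The box-filter halo leaks some energy into the sparse component around edges; the optimization-based formulation in the paper avoids this by fitting both components jointly.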
Efficient Scene Text Localization and Recognition with Local Character Refinement
An unconstrained end-to-end text localization and recognition method is
presented. The method detects initial text hypotheses in a single pass using an
efficient region-based method, and subsequently refines the text hypotheses
using a more robust local text model, which deviates from the common assumption
of region-based methods that all characters are detected as connected
components.
Additionally, a novel feature based on character stroke area estimation is
introduced. The feature is efficiently computed from a region distance map; it
is invariant to scaling and rotation, and allows text regions to be detected
efficiently regardless of what portion of the text they capture.
The method runs in real time and achieves state-of-the-art text localization
and recognition results on the ICDAR 2013 Robust Reading dataset.
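The intuition behind a distance-map stroke feature can be shown with a minimal sketch (assuming SciPy's Euclidean distance transform): inside a stroke, the distance to the nearest background pixel peaks at roughly half the stroke width. Note the paper's actual feature estimates stroke *area*, computed differently — this is only the underlying idea.

```python
import numpy as np
from scipy.ndimage import distance_transform_edt

def stroke_width_estimate(char_mask):
    """Estimate a character's stroke width from its region distance map.

    `distance_transform_edt` gives, for each foreground pixel, the
    distance to the nearest background pixel; inside a stroke this
    peaks near half the stroke width, so twice the maximum interior
    distance approximates the stroke width.
    """
    dist = distance_transform_edt(char_mask)
    return 2.0 * dist.max()

# A synthetic vertical bar 4 pixels wide.
bar = np.zeros((20, 20), dtype=bool)
bar[2:18, 8:12] = True
print(stroke_width_estimate(bar))  # 4.0
```

Because the distance map only measures geometry relative to the region itself, such a feature is indifferent to how much of a word or line the region happens to cover, which matches the abstract's claim.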
Learning to Generate Posters of Scientific Papers
Researchers often summarize their work in the form of posters. Posters
provide a coherent and efficient way to convey core ideas from scientific
papers. Generating a good scientific poster, however, is a complex and
time-consuming cognitive task, since such posters need to be readable, informative,
and visually aesthetic. In this paper, for the first time, we study the
challenging problem of learning to generate posters from scientific papers. To
this end, a data-driven framework that utilizes graphical models is proposed.
Specifically, given content to display, the key elements of a good poster,
including panel layout and attributes of each panel, are learned and inferred
from data. Then, given inferred layout and attributes, composition of graphical
elements within each panel is synthesized. To learn and validate our model, we
collect and make public a Poster-Paper dataset, which consists of scientific
papers and corresponding posters with exhaustively labelled panels and
attributes. Qualitative and quantitative results indicate the effectiveness of
our approach.
Comment: in Proceedings of the 30th AAAI Conference on Artificial Intelligence
(AAAI'16), Phoenix, AZ, 2016