An Approach to a Comprehensive Test Framework for Analysis and Evaluation of Text Line Segmentation Algorithms
The paper introduces a testing framework for the evaluation and validation of text line segmentation algorithms. Text line segmentation is the key step for correct optical character recognition. Many of the tests for evaluating text line segmentation algorithms use text databases as reference templates. Because of the mismatch, a reliable testing framework is required. Hence, a new approach to a comprehensive experimental framework for the evaluation of text line segmentation algorithms is proposed. It consists of synthetic multi-like text samples as well as real handwritten text. Although the tests are mutually independent, the results are cross-linked. The proposed method can be used for different types of scripts and languages. Furthermore, two different procedures for evaluating algorithm efficiency, based on the obtained error type classification, are proposed. The first is based on the segmentation line error description, while the second incorporates well-known signal detection theory. Each has different capabilities and conveniences, but they can be used to supplement each other and make the evaluation process efficient. Overall, the proposed procedure based on the segmentation line error description has some advantages, characterized by five measures that describe measurement procedures.
Image operator learning coupled with CNN classification and its application to staff line removal
Many image transformations can be modeled by image operators that are
characterized by pixel-wise local functions defined on a finite support window.
In image operator learning, these functions are estimated from training data
using machine learning techniques. Input size is usually a critical issue when
using learning algorithms, and it limits the size of practicable windows. We
propose the use of convolutional neural networks (CNNs) to overcome this
limitation. The problem of removing staff-lines in music score images is chosen
to evaluate the effects of window and convolutional mask sizes on the learned
image operator performance. Results show that the CNN based solution
outperforms previous ones obtained using conventional learning algorithms or
heuristic algorithms, indicating the potential of CNNs as base classifiers in
image operator learning. The implementations will be made available on the
TRIOSlib project site. Comment: To appear in ICDAR 201
Veni Vidi Vici, A Three-Phase Scenario For Parameter Space Analysis in Image Analysis and Visualization
Automatic analysis of enormous sets of images is a critical task in the life
sciences. It faces many challenges: algorithms are highly
parameterized, significant human input is intertwined, and a standard
meta-visualization approach is lacking. This paper proposes an alternative iterative
approach for optimizing input parameters, saving time by minimizing the user
involvement, and allowing for understanding the workflow of algorithms and
discovering new ones. The main focus is on developing an interactive
visualization technique that enables users to analyze the relationships between
sampled input parameters and corresponding output. This technique is
implemented as a prototype called Veni Vidi Vici, or "I came, I saw, I
conquered." This strategy is inspired by the mathematical formulas of numbering
computable functions and is developed atop ImageJ, a scientific image
processing program. A case study is presented to investigate the proposed
framework. Finally, the paper explores some potential future issues in the
application of the proposed approach in parameter space analysis in
visualization.
Salient Objects in Clutter: Bringing Salient Object Detection to the Foreground
We provide a comprehensive evaluation of salient object detection (SOD)
models. Our analysis identifies a serious design bias of existing SOD datasets
which assumes that each image contains at least one clearly outstanding salient
object in low clutter. The design bias has led to a saturated high performance
for state-of-the-art SOD models when evaluated on existing datasets. The
models, however, still perform far from satisfactorily when applied to
real-world daily scenes. Based on our analyses, we first identify 7 crucial
aspects that a comprehensive and balanced dataset should fulfill. Then, we
propose a new high quality dataset and update the previous saliency benchmark.
Specifically, our SOC (Salient Objects in Clutter) dataset includes images
with salient and non-salient objects from daily object categories. Beyond
object category annotations, each salient image is accompanied by attributes
that reflect common challenges in real-world scenes. Finally, we report
attribute-based performance assessment on our dataset. Comment: ECCV 201
CHORUS Deliverable 2.1: State of the Art on Multimedia Search Engines
Based on the information provided by European projects and national initiatives related to multimedia search, as well as domain experts who participated in the CHORUS Think-tanks and workshops, this document reports on the state of the art in multimedia content search from a technical and a socio-economic perspective.
The technical perspective includes an up-to-date view of content-based indexing and retrieval technologies, multimedia search in the context of mobile devices and peer-to-peer networks, and an overview of current evaluation and benchmark initiatives to measure the performance of multimedia search engines.
From a socio-economic perspective, we take inventory of the impact and legal consequences of these technical advances and point out future directions of research.