3,136 research outputs found
Query by String word spotting based on character bi-gram indexing
In this paper we propose a segmentation-free query by string word spotting
method. Both the documents and query strings are encoded using a recently
proposed word representa- tion that projects images and strings into a common
atribute space based on a pyramidal histogram of characters(PHOC). These
attribute models are learned using linear SVMs over the Fisher Vector
representation of the images along with the PHOC labels of the corresponding
strings. In order to search through the whole page, document regions are
indexed per character bi- gram using a similar attribute representation. On top
of that, we propose an integral image representation of the document using a
simplified version of the attribute model for efficient computation. Finally we
introduce a re-ranking step in order to boost retrieval performance. We show
state-of-the-art results for segmentation-free query by string word spotting in
single-writer and multi-writer standard datasetsComment: To be published in ICDAR201
Rotation-invariant features for multi-oriented text detection in natural images.
Texts in natural scenes carry rich semantic information, which can be used to assist a wide range of applications, such as object recognition, image/video retrieval, mapping/navigation, and human computer interaction. However, most existing systems are designed to detect and recognize horizontal (or near-horizontal) texts. Due to the increasing popularity of mobile-computing devices and applications, detecting texts of varying orientations from natural images under less controlled conditions has become an important but challenging task. In this paper, we propose a new algorithm to detect texts of varying orientations. Our algorithm is based on a two-level classification scheme and two sets of features specially designed for capturing the intrinsic characteristics of texts. To better evaluate the proposed method and compare it with the competing algorithms, we generate a comprehensive dataset with various types of texts in diverse real-world scenes. We also propose a new evaluation protocol, which is more suitable for benchmarking algorithms for detecting texts in varying orientations. Experiments on benchmark datasets demonstrate that our system compares favorably with the state-of-the-art algorithms when handling horizontal texts and achieves significantly enhanced performance on variant texts in complex natural scenes
"'Who are you?' - Learning person specific classifiers from video"
We investigate the problem of automatically labelling
faces of characters in TV or movie material with their
names, using only weak supervision from automaticallyaligned
subtitle and script text. Our previous work (Everingham
et al. [8]) demonstrated promising results on the
task, but the coverage of the method (proportion of video
labelled) and generalization was limited by a restriction to
frontal faces and nearest neighbour classification.
In this paper we build on that method, extending the coverage
greatly by the detection and recognition of characters
in profile views. In addition, we make the following contributions:
(i) seamless tracking, integration and recognition
of profile and frontal detections, and (ii) a character specific
multiple kernel classifier which is able to learn the features
best able to discriminate between the characters.
We report results on seven episodes of the TV series
“Buffy the Vampire Slayer”, demonstrating significantly increased
coverage and performance with respect to previous
methods on this material
- …