Soft-PHOC Descriptor for End-to-End Word Spotting in Egocentric Scene Images
Word spotting in natural scene images has many applications in scene
understanding and visual assistance. In this paper we propose a technique to
create and exploit an intermediate representation of images based on text
attributes, namely character probability maps. Our representation extends the
concept of the Pyramidal Histogram Of Characters (PHOC) by exploiting Fully
Convolutional Networks to derive a pixel-wise mapping of the character
distribution within candidate word regions. We call this representation the
Soft-PHOC. Furthermore, we show how to use Soft-PHOC descriptors for word
spotting tasks in egocentric camera streams through an efficient text line
proposal algorithm. This is based on the Hough Transform over character
attribute maps followed by scoring using Dynamic Time Warping (DTW). We
evaluate our results on the ICDAR 2015 Challenge 4 dataset of incidental scene text
captured by an egocentric camera.
Comment: 9 pages, 10 figures, The Third International Workshop on Egocentric
Perception, Interaction and Computing (EPIC) at ECCV201
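
For readers unfamiliar with the PHOC that the abstract builds on, the following is a minimal sketch of the classic binary descriptor, not the authors' Soft-PHOC. It assumes a 36-symbol alphabet (lowercase letters and digits), pyramid levels 1 to 3, and the usual rule that a character belongs to a region when at least half of its span falls inside it; the function name `phoc` and these parameter choices are illustrative assumptions. Soft-PHOC, as described above, replaces these hard per-word bits with pixel-wise character probability maps predicted by a Fully Convolutional Network.

```python
import string
from typing import List, Sequence

# Assumed alphabet: 26 lowercase letters + 10 digits (illustrative choice).
ALPHABET = string.ascii_lowercase + string.digits


def phoc(word: str, levels: Sequence[int] = (1, 2, 3)) -> List[int]:
    """Binary PHOC sketch: one bit per (level, region, character).

    At pyramid level L the word is split into L equal regions. A character
    is marked in a region when at least 50% of its span overlaps the region.
    Spans are measured in units of 1/(n*L) so all arithmetic stays integer.
    """
    word = word.lower()
    n = len(word)
    vec: List[int] = []
    for L in levels:
        for r in range(L):
            region_start, region_end = r * n, (r + 1) * n
            bits = [0] * len(ALPHABET)
            for i, ch in enumerate(word):
                idx = ALPHABET.find(ch)
                if idx < 0:
                    continue  # skip symbols outside the assumed alphabet
                char_start, char_end = i * L, (i + 1) * L
                overlap = min(char_end, region_end) - max(char_start, region_start)
                # Character span has length L, so this tests overlap >= 50%.
                if 2 * overlap >= L:
                    bits[idx] = 1
            vec.extend(bits)
    return vec


if __name__ == "__main__":
    descriptor = phoc("ab", levels=(1, 2))
    print(len(descriptor), sum(descriptor))
```

With levels (1, 2) the descriptor has three regions of 36 bits each: the whole word, its left half, and its right half. A character that straddles a region boundary exactly in the middle is marked in both neighbouring regions.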