A fine-grained approach to scene text script identification
This paper focuses on the problem of script identification in unconstrained
scenarios. Script identification is an important prerequisite to recognition,
and an indispensable condition for automatic text understanding systems
designed for multi-language environments. Although widely studied for document
images and handwritten documents, it remains an almost unexplored territory for
scene text images.
We detail a novel method for script identification in natural images that
combines convolutional features and the Naive-Bayes Nearest Neighbor
classifier. The proposed framework efficiently exploits the discriminative
power of small stroke-parts, in a fine-grained classification framework.
In addition, we propose a new public benchmark dataset for the evaluation of
joint text detection and script identification in natural scenes. Experiments
on this new dataset demonstrate that the proposed method yields
state-of-the-art results, while generalizing well to different datasets and a
variable number of scripts. The evidence provided shows that multi-lingual
scene text recognition in the wild is a viable proposition. Source code of the
proposed method is made available online.
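The Naive-Bayes Nearest Neighbor decision rule mentioned above can be sketched as follows. This is a minimal illustration with NumPy, not the paper's exact implementation: the function name is hypothetical, squared Euclidean distance is an assumed choice, and the descriptors stand in for the convolutional stroke-part features the paper actually uses.

```python
import numpy as np

def nbnn_classify(query_descriptors, class_descriptors):
    """NBNN: assign the image to the class that minimizes the total
    distance from each of its local descriptors to that descriptor's
    nearest neighbor among the class's training descriptors.

    query_descriptors: array of shape (n_query, dim)
    class_descriptors: dict mapping class label -> array (n_class, dim)
    """
    best_label, best_cost = None, float("inf")
    for label, descs in class_descriptors.items():
        # Pairwise squared Euclidean distances, shape (n_query, n_class).
        d2 = ((query_descriptors[:, None, :] - descs[None, :, :]) ** 2).sum(-1)
        # Image-to-class cost: nearest-neighbor distance per query descriptor.
        cost = d2.min(axis=1).sum()
        if cost < best_cost:
            best_label, best_cost = label, cost
    return best_label
```

The key property NBNN exploits, and the reason it suits fine-grained stroke-part classification, is that distances are computed descriptor-to-class rather than image-to-image, so no descriptor quantization step discards discriminative detail.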
The Robust Reading Competition Annotation and Evaluation Platform
The ICDAR Robust Reading Competition (RRC), initiated in 2003 and
re-established in 2011, has become a de facto evaluation standard for robust
reading systems and algorithms. Concurrent with its second incarnation in 2011,
a continuous effort started to develop an on-line framework to facilitate the
hosting and management of competitions. This paper outlines the Robust Reading
Competition Annotation and Evaluation Platform, the backbone of the
competitions. The RRC Annotation and Evaluation Platform is a modular
framework, fully accessible through on-line interfaces. It comprises a
collection of tools and services for managing all processes involved with
defining and evaluating a research task, from dataset definition to annotation
management, evaluation specification and results analysis. Although the
framework has been designed with robust reading research in mind, many of the
provided tools are generic by design. All aspects of the RRC Annotation and
Evaluation Framework are available for research use. Comment: 6 pages, accepted to DAS 201
Multimedia information technology and the annotation of video
The state of the art in multimedia information technology has not progressed to the point where a single solution is available to meet all reasonable needs of documentalists and users of video archives. In general, we do not have an optimistic view of the usability of new technology in this domain, but digitization and digital power can be expected to cause a small revolution in the area of video archiving. The volume of data leads to two views of the future: on the pessimistic side, overload of data will cause a lack of annotation capacity; on the optimistic side, there will be enough data from which to learn selected concepts that can be deployed to support automatic annotation. At the threshold of this interesting era, we make an attempt to describe the state of the art in technology. We sample the progress in text, sound, and image processing, as well as in machine learning.
Story Development in Cinematography
First off, I’ve got to argue for the use of the word “cinematography” over “camera”. One is a word I would like to further unpack; the other simply implies a relationship to another art form entirely – photography. I often say to my students that some cinematographers initially come from the lighting point of view and some come from the camera, but ultimately what great cinematographers do is understand a story (not just a moment that tells a story – there is a significant difference) – and tell it. If storytelling is the primary function of a cinematographer, then how do we teach storytelling to our students in a classroom? Obviously it is possible to teach them the tools of “photography” – lenses/optics, composition, chemistry, sensitometry, etc. – and of lighting – this is an HMI, this is flicker, memorize WAV, etc. However, how do we teach them to tell a story with these tools? I have been working for the last few years on teaching my students story development tools that are appropriate for cinematographers – tools which, as the students carry them forward into their own practice, have begun to give real results not only in storytelling, but in the students creating their own relevant visual styles. To utilize these tools they need to engage not only in pre-production time, but in story development time – a period rarely engaged in at the student level, but crucial if we want them to become anything other than takers of pretty pictures.