4,194 research outputs found
WordSup: Exploiting Word Annotations for Character based Text Detection
Imagery texts are usually organized as a hierarchy of several visual
elements, i.e. characters, words, text lines and text blocks. Among these
elements, character is the most basic one for various languages such as
Western, Chinese, Japanese, mathematical expression and etc. It is natural and
convenient to construct a common text detection engine based on character
detectors. However, training character detectors requires a vast of location
annotated characters, which are expensive to obtain. Actually, the existing
real text datasets are mostly annotated in word or line level. To remedy this
dilemma, we propose a weakly supervised framework that can utilize word
annotations, either in tight quadrangles or the more loose bounding boxes, for
character detector training. When applied in scene text detection, we are thus
able to train a robust character detector by exploiting word annotations in the
rich large-scale real scene text datasets, e.g. ICDAR15 and COCO-text. The
character detector acts as a key role in the pipeline of our text detection
engine. It achieves the state-of-the-art performance on several challenging
scene text detection benchmarks. We also demonstrate the flexibility of our
pipeline by various scenarios, including deformed text detection and math
expression recognition.Comment: 2017 International Conference on Computer Visio
Driver Distraction Identification with an Ensemble of Convolutional Neural Networks
The World Health Organization (WHO) reported 1.25 million deaths yearly due
to road traffic accidents worldwide and the number has been continuously
increasing over the last few years. Nearly fifth of these accidents are caused
by distracted drivers. Existing work of distracted driver detection is
concerned with a small set of distractions (mostly, cell phone usage).
Unreliable ad-hoc methods are often used.In this paper, we present the first
publicly available dataset for driver distraction identification with more
distraction postures than existing alternatives. In addition, we propose a
reliable deep learning-based solution that achieves a 90% accuracy. The system
consists of a genetically-weighted ensemble of convolutional neural networks,
we show that a weighted ensemble of classifiers using a genetic algorithm
yields in a better classification confidence. We also study the effect of
different visual elements in distraction detection by means of face and hand
localizations, and skin segmentation. Finally, we present a thinned version of
our ensemble that could achieve 84.64% classification accuracy and operate in a
real-time environment.Comment: arXiv admin note: substantial text overlap with arXiv:1706.0949
- …