Object Detection in 20 Years: A Survey
Object detection, as one of the most fundamental and challenging problems in
computer vision, has received great attention in recent years. Its development
in the past two decades can be regarded as an epitome of computer vision
history. If we think of today's object detection as a technical aesthetics
under the power of deep learning, then turning back the clock 20 years we would
witness the wisdom of the cold weapon era. This paper extensively reviews 400+
papers on object detection in the light of its technical evolution, spanning
over a quarter-century's time (from the 1990s to 2019). A number of topics have
been covered in this paper, including the milestone detectors in history,
detection datasets, metrics, fundamental building blocks of the detection
system, speed-up techniques, and the recent state-of-the-art detection methods.
This paper also reviews some important detection applications, such as
pedestrian detection, face detection, text detection, etc., and makes an in-depth
analysis of their challenges as well as technical improvements in recent years.
Comment: This work has been submitted to the IEEE TPAMI for possible publication
Show, Attend and Read: A Simple and Strong Baseline for Irregular Text Recognition
Recognizing irregular text in natural scene images is challenging due to the
large variance in text appearance, such as curvature, orientation and
distortion. Most existing approaches rely heavily on sophisticated model
designs and/or extra fine-grained annotations, which, to some extent, increase
the difficulty in algorithm implementation and data collection. In this work,
we propose an easy-to-implement strong baseline for irregular scene text
recognition, using off-the-shelf neural network components and only word-level
annotations. It is composed of a -layer ResNet, an LSTM-based
encoder-decoder framework and a 2-dimensional attention module. Despite its
simplicity, the proposed method is robust and achieves state-of-the-art
performance on both regular and irregular scene text recognition benchmarks.
Code is available at: https://tinyurl.com/ShowAttendRead
Comment: Accepted to Proc. AAAI Conference on Artificial Intelligence 201
WordSup: Exploiting Word Annotations for Character based Text Detection
Imagery texts are usually organized as a hierarchy of several visual
elements, i.e. characters, words, text lines and text blocks. Among these
elements, the character is the most basic one for various languages such as
Western, Chinese, Japanese, and mathematical expressions. It is natural and
convenient to construct a common text detection engine based on character
detectors. However, training character detectors requires a vast number of
location-annotated characters, which are expensive to obtain. In practice, the existing
real text datasets are mostly annotated at the word or line level. To remedy this
dilemma, we propose a weakly supervised framework that can utilize word
annotations, either in tight quadrangles or in looser bounding boxes, for
character detector training. When applied in scene text detection, we are thus
able to train a robust character detector by exploiting word annotations in the
rich large-scale real scene text datasets, e.g. ICDAR15 and COCO-text. The
character detector plays a key role in the pipeline of our text detection
engine. It achieves state-of-the-art performance on several challenging
scene text detection benchmarks. We also demonstrate the flexibility of our
pipeline in various scenarios, including deformed text detection and math
expression recognition.
Comment: 2017 International Conference on Computer Vision
A Review of Recent Advances and Challenges in Grocery Label Detection and Recognition
When compared with traditional local shops where the customer has a personalised service,
in large retail departments, the client has to make his purchase decisions independently, mostly
supported by the information available in the package. Additionally, people are becoming more
aware of the importance of the food ingredients and demanding about the type of products they buy
and the information provided in the package, despite it often being hard to interpret. Big shops such
as supermarkets have also introduced important challenges for the retailer due to the large number
of different products in the store, heterogeneous affluence and the daily needs of item repositioning.
In this scenario, the automatic detection and recognition of products on the shelves or off the shelves
has gained increased interest as the application of these technologies may improve the shopping
experience through self-assisted shopping apps and autonomous shopping, or even benefit stock
management with real-time inventory, automatic shelf monitoring and product tracking. These
solutions can also have an important impact on customers with visual impairments. Despite recent
developments in computer vision, automatic grocery product recognition is still very challenging,
with most works focusing on the detection or recognition of a small number of products, often under
controlled conditions. This paper discusses the challenges related to this problem and presents a
review of proposed methods for retail product label processing, with a special focus on assisted
analysis for customer support, including for the visually impaired. Moreover, it details the public
datasets used in this topic and identifies their limitations, and discusses future research directions of
related fields.