Feature Enhancement Network: A Refined Scene Text Detector
In this paper, we propose a refined scene text detector with a \textit{novel}
Feature Enhancement Network (FEN) for Region Proposal and Text Detection
Refinement. Retrospectively, both region proposal with \textit{only} sliding-window features and text detection refinement with \textit{single-scale}
high-level features are insufficient, especially for smaller scene text.
Therefore, we design a new FEN that fuses \textit{task-specific},
\textit{low}- and \textit{high}-level semantic features to improve the
performance of text detection. In addition, since \textit{unitary}
position-sensitive RoI pooling in general object detection is ill-suited to
variable text regions, an \textit{adaptively weighted} position-sensitive RoI
pooling layer is devised to further enhance detection accuracy. To
tackle the \textit{sample-imbalance} problem during the refinement stage, we
also propose an effective \textit{positives mining} strategy for efficiently
training our network. Experiments on ICDAR 2011 and 2013 robust text detection
benchmarks demonstrate that our method can achieve state-of-the-art results,
outperforming all reported methods in terms of F-measure.
Comment: 8 pages, 5 figures, 2 tables. This paper is accepted to appear in AAAI 201
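Since the abstract above ranks methods by F-measure, a minimal sketch of that metric may help. It assumes the plain harmonic mean of precision and recall and does not reproduce the box-matching protocol of the ICDAR benchmarks. The function name `f_measure` and the input values are illustrative assumptions, not results from the paper.

```python
def f_measure(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (F1).

    Assumption: precision and recall have already been computed by
    whatever matching protocol the benchmark prescribes.
    """
    if precision + recall == 0:
        # Avoid division by zero when nothing was detected or matched.
        return 0.0
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    # Hypothetical precision/recall values, chosen only for illustration.
    print(f_measure(0.9, 0.8))
```

Because the harmonic mean is dominated by the smaller of the two values, a detector cannot score well by trading away recall for precision or the reverse.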
SCUT-FBP5500: A Diverse Benchmark Dataset for Multi-Paradigm Facial Beauty Prediction
Facial beauty prediction (FBP) is a significant visual recognition problem:
assessing facial attractiveness in a way that is consistent with human
perception. To tackle this problem, various data-driven models, especially
state-of-the-art deep learning techniques, have been introduced, and benchmark
datasets have become one of the essential elements of FBP. Previous works
have formulated the recognition of facial beauty as a specific supervised
learning problem of classification, regression or ranking, which indicates that
FBP is intrinsically a computation problem with multiple paradigms. However,
most FBP benchmark datasets were built under specific computation
constraints, which limits the performance and flexibility of the computational
models trained on them. In this paper, we argue that FBP is a
multi-paradigm computation problem, and propose a new diverse benchmark
dataset, called SCUT-FBP5500, to achieve multi-paradigm facial beauty
prediction. The SCUT-FBP5500 dataset contains a total of 5500 frontal faces with
diverse properties (male/female, Asian/Caucasian, ages) and diverse labels
(face landmarks, beauty scores within [1, 5], beauty score distribution), which
allow computational models under different FBP paradigms, such as
appearance-based or shape-based facial beauty classification or regression
models for male or female, Asian or Caucasian faces. We evaluated the SCUT-FBP5500 dataset for FBP
using different combinations of features and predictors, and various deep
learning methods. The results indicate improved FBP performance and point to
potential applications based on SCUT-FBP5500.
Comment: 6 pages, 14 figures, conference pape
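To illustrate how one set of labels could serve the classification, regression and ranking paradigms named in the abstract above, here is a minimal sketch. It assumes each face comes with a count of ratings per score from 1 to 5. The function names, the rounding rule and the sample counts are hypothetical and are not taken from the SCUT-FBP5500 release.

```python
from itertools import combinations


def targets_from_distribution(counts):
    """Derive a regression and a classification target from rating counts.

    Assumption: counts[i] is the number of raters who gave score i + 1,
    so a list of five counts covers scores 1 to 5.
    """
    total = sum(counts)
    if total == 0:
        raise ValueError("no ratings")
    # Regression target: mean score over all raters.
    mean = sum((i + 1) * c for i, c in enumerate(counts)) / total
    # Classification target: mean rounded to a class in 1..5.
    # Note that Python's round() rounds exact halves to the even integer.
    label = min(5, max(1, round(mean)))
    return mean, label


def ranking_pairs(mean_scores):
    """Build (higher, lower) pairs for a ranking paradigm; ties are skipped."""
    pairs = []
    for a, b in combinations(sorted(mean_scores), 2):
        if mean_scores[a] > mean_scores[b]:
            pairs.append((a, b))
        elif mean_scores[b] > mean_scores[a]:
            pairs.append((b, a))
    return pairs


if __name__ == "__main__":
    # Hypothetical rating counts for two faces.
    print(targets_from_distribution([0, 0, 2, 6, 2]))
    print(ranking_pairs({"face_a": 4.0, "face_b": 3.0}))
```

Keeping the full score distribution rather than only its mean is what lets the three targets be derived on demand.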