Although a polygon is a more accurate representation than an upright bounding
box for text detection, polygon annotations are extremely expensive and
challenging to produce. Unlike existing works that employ fully supervised training with
polygon annotations, we propose a novel text detection system termed SelfText
Beyond Polygon (SBP) with Bounding Box Supervision (BBS) and Dynamic Self
Training (DST), which trains a polygon-based text detector with only a
limited set of upright bounding box annotations. For BBS, we first use
synthetic data with character-level annotations to train a Skeleton
Attention Segmentation Network (SASN). Then the box-level annotations are
adopted to guide the generation of high-quality polygon-like pseudo labels,
which can be used to train any detector. In this way, our method achieves the
same performance as text detectors trained with polygon annotations (i.e., both
are 85.0% F-score for PSENet on ICDAR2015). For DST, dynamically removing
false alarms allows the detector to leverage limited labeled data as well as
massive unlabeled data to further outperform the costly polygon-supervised
baseline. We hope SBP can provide a new perspective for text detection that
saves substantial labeling costs. Code is available at: github.com/weijiawu/SBP
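
As an illustration of how a box-level annotation can constrain pseudo-label
generation, the following minimal sketch thresholds a text-probability map
inside one upright box and traces the outline of the surviving pixels as a
polygon. It is not the SBP implementation: the function name
`box_guided_polygon`, the fixed threshold, and the row-wise outline tracing are
assumptions made for illustration only, and the probability map stands in for
the output of a segmentation network such as SASN.

```python
import numpy as np


def box_guided_polygon(prob_map, box, threshold=0.5):
    """Trace a polygon around text pixels inside one upright box.

    Hypothetical helper for illustration; not taken from the SBP code.

    prob_map: (H, W) float array of per-pixel text probabilities.
    box: (x_min, y_min, x_max, y_max), max coordinates exclusive.
    Returns an (N, 2) integer array of (x, y) vertices, or None if no
    pixel inside the box reaches the threshold.
    """
    x0, y0, x1, y1 = box
    # Pixels outside the box are ignored, so false positives elsewhere
    # in the image cannot leak into this pseudo label.
    region = prob_map[y0:y1, x0:x1] >= threshold
    rows = np.flatnonzero(region.any(axis=1))
    if rows.size == 0:
        return None

    left, right = [], []
    for r in rows:
        cols = np.flatnonzero(region[r])
        # Leftmost and rightmost text pixel in this row, in image coordinates.
        left.append((x0 + cols[0], y0 + r))
        right.append((x0 + cols[-1], y0 + r))

    # Walk down the left edge, then back up the right edge.
    return np.array(left + right[::-1])
```

The row-wise tracing assumes each row of the text region is a single run of
pixels; a contour-following routine would be needed for more complex shapes.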