AON: Towards Arbitrarily-Oriented Text Recognition
Recognizing text from natural images is a hot research topic in computer
vision due to its various applications. Despite the enduring research of
several decades on optical character recognition (OCR), recognizing texts from
natural images is still a challenging task. This is because scene texts are
often in irregular (e.g. curved, arbitrarily-oriented or seriously distorted)
arrangements, which have not yet been well addressed in the literature.
Existing methods on text recognition mainly work with regular (horizontal and
frontal) texts and cannot be trivially generalized to handle irregular texts.
In this paper, we develop the arbitrary orientation network (AON) to directly
capture the deep features of irregular texts, which are fed into an
attention-based decoder to generate the character sequence. The whole network can
be trained end-to-end by using only images and word-level annotations.
Extensive experiments on various benchmarks, including the CUTE80,
SVT-Perspective, IIIT5k, SVT and ICDAR datasets, show that the proposed
AON-based method achieves state-of-the-art performance on irregular
datasets and is comparable to major existing methods on regular datasets.
Comment: Accepted by CVPR201
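The abstract above mentions an attention-based decoder but does not give its form. As a minimal sketch of what one decoding step of soft attention might look like, the function below scores each feature vector against a decoder query with a dot product and returns the softmax weights and the weighted context vector. The name `attention_step`, the dot-product scoring, and the toy inputs are assumptions for illustration, not the AON architecture itself.

```python
import math


def attention_step(features, query):
    """One soft-attention step (illustrative, not the AON decoder).

    features: list of equal-length feature vectors (lists of floats).
    query: decoder state vector of the same length.
    Returns (weights, context).
    """
    # Dot-product relevance score for each feature vector (assumed scoring).
    scores = [sum(f_i * q_i for f_i, q_i in zip(f, query)) for f in features]
    # Numerically stable softmax over the scores.
    max_score = max(scores)
    exps = [math.exp(s - max_score) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    # Context vector: attention-weighted sum of the feature vectors.
    dim = len(features[0])
    context = [sum(w * f[d] for w, f in zip(weights, features)) for d in range(dim)]
    return weights, context


if __name__ == "__main__":
    w, c = attention_step([[1.0, 0.0], [0.0, 1.0]], [2.0, 0.0])
    print(w, c)
```

In a full decoder, the context vector would typically be combined with the previous decoder state to predict the next character; that part is omitted here.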
Dynamic Low-Resolution Distillation for Cost-Efficient End-to-End Text Spotting
End-to-end text spotting has attracted great attention recently due to its
benefits for global optimization and high maintainability in real applications.
However, the input scale has always been a tough trade-off since recognizing a
small text instance usually requires enlarging the whole image, which brings
high computational costs. In this paper, to address this problem, we propose a
novel cost-efficient Dynamic Low-resolution Distillation (DLD) text spotting
framework, which aims to infer images in different small but recognizable
resolutions and achieve a better balance between accuracy and efficiency.
Concretely, we adopt a resolution selector to dynamically decide the input
resolutions for different images, which is constrained by both inference
accuracy and computational cost. In addition, a sequential knowledge distillation
strategy is applied to the text recognition branch, so that a low-res input
obtains performance comparable to that of a high-res image. The proposed method can be
optimized end-to-end and adopted in any current text spotting framework to
improve the practicability. Extensive experiments on several text spotting
benchmarks show that the proposed method vastly improves the usability of
low-res models. The code is available at
https://github.com/hikopensource/DAVAR-Lab-OCR/.
Comment: Accepted by ECCV202
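The trade-off that the resolution selector has to balance can be illustrated with a small hand-written stand-in. In the sketch below, `select_resolution`, `predicted_error`, and `cost_weight` are hypothetical names, and the quadratic cost model is an assumption; in the paper the selector is learned jointly with the spotter, which this example does not attempt to reproduce.

```python
def select_resolution(candidates, predicted_error, cost_weight=0.5):
    """Pick the input scale with the best accuracy/cost balance (illustrative).

    candidates: iterable of scale factors, e.g. [0.25, 0.5, 1.0].
    predicted_error: dict mapping each scale to an estimated recognition error.
    cost_weight: how strongly computation is penalized (assumed hyperparameter).
    """
    def objective(scale):
        # Assumption: compute cost grows with pixel count, i.e. scale squared.
        relative_cost = scale ** 2
        return predicted_error[scale] + cost_weight * relative_cost

    return min(candidates, key=objective)


if __name__ == "__main__":
    errors = {0.25: 0.9, 0.5: 0.2, 1.0: 0.1}
    print(select_resolution([0.25, 0.5, 1.0], errors))
```

With `cost_weight=0` the function always prefers the scale with the lowest estimated error; raising the weight pushes the choice toward smaller inputs.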
E2-AEN: End-to-End Incremental Learning with Adaptively Expandable Network
Expandable networks have demonstrated their advantages in dealing with
the catastrophic forgetting problem in incremental learning. Considering that
different tasks may need different structures, recent methods design dynamic
structures adapted to different tasks via sophisticated techniques. Their routine
is to search expandable structures first and then train on the new tasks,
which, however, breaks tasks into multiple training stages, leading to
suboptimal results or excessive computational cost. In this paper, we propose an
end-to-end trainable adaptively expandable network named E2-AEN, which
dynamically generates lightweight structures for new tasks without any accuracy
drop in previous tasks. Specifically, the network contains a series of powerful
feature adapters for augmenting the previously learned representations to new
tasks while avoiding task interference. These adapters are controlled via an
adaptive gate-based pruning strategy which decides whether the expanded
structures can be pruned, making the network structure dynamically changeable
according to the complexity of the new tasks. Moreover, we introduce a novel
sparsity-activation regularization to encourage the model to learn
discriminative features with limited parameters. E2-AEN reduces cost and can be
built upon any feed-forward architecture in an end-to-end manner. Extensive
experiments on both classification (i.e., CIFAR and VDD) and detection (i.e.,
COCO, VOC and ICCV2021 SSLAD challenge) benchmarks demonstrate the
effectiveness of the proposed method, which achieves remarkable new
results.
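The gate-based pruning idea described above can be pictured with a toy example: each expanded adapter carries a gate value, and adapters whose gate falls below a threshold are dropped. The function name `prune_adapters`, the fixed threshold, and the gate values are assumptions for illustration; the abstract does not specify how the gates are parameterized or trained.

```python
def prune_adapters(gates, threshold=0.5):
    """Return the names of adapters that survive pruning (illustrative).

    gates: dict mapping adapter name to a gate value in [0, 1].
    threshold: adapters with a gate below this value are pruned (assumed rule).
    """
    return [name for name, gate in gates.items() if gate >= threshold]


if __name__ == "__main__":
    # Hypothetical gate values after training on a new task.
    learned_gates = {"block1": 0.9, "block2": 0.1, "block3": 0.5}
    print(prune_adapters(learned_gates))
```

A simple new task would leave few gates open and therefore add few parameters, while a harder task would keep more adapters.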
A Kind of New Surface Modeling Method Based on DEM Data
Surface elevation changes greatly in river erosion areas. Owing to the limitations and cost of acquisition equipment, traditional seismic acquisition data have sparse physical points both horizontally and longitudinally, so the density of surface measurement data is not sufficient to survey the surface structure in detail. With the development of science and technology and the application of satellite technology, the DEM elevation data obtained from geographic information systems (GIS) are becoming more and more accurate. In this paper, a precise surface model is built from the geographic information of a river erosion area, combined with the results of surface survey control points, and good results are achieved.
Key words: River erosion area; Geographic information; Similarity coefficient; Kriging interpolation; Surface modeling; High and low frequency static