AON: Towards Arbitrarily-Oriented Text Recognition
Recognizing text from natural images is a hot research topic in computer
vision due to its various applications. Despite several decades of research on
optical character recognition (OCR), recognizing text in
natural images is still a challenging task. This is because scene texts are
often in irregular (e.g. curved, arbitrarily-oriented or seriously distorted)
arrangements, which have not yet been well addressed in the literature.
Existing methods on text recognition mainly work with regular (horizontal and
frontal) texts and cannot be trivially generalized to handle irregular texts.
In this paper, we develop the arbitrary orientation network (AON) to directly
capture the deep features of irregular text, which are then fed into an
attention-based decoder to generate character sequences. The whole network can
be trained end-to-end using only images and word-level annotations.
Extensive experiments on various benchmarks, including the CUTE80,
SVT-Perspective, IIIT5k, SVT and ICDAR datasets, show that the proposed
AON-based method achieves state-of-the-art performance on irregular
datasets and is comparable to major existing methods on regular datasets.
Comment: Accepted by CVPR 2018
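As a concrete (and deliberately generic) illustration of the attention-based decoding the abstract describes, here is a minimal PyTorch sketch: attention weights over a sequence of image features produce a glimpse that drives a recurrent cell emitting one character per step. The module and its dimensions are our own assumptions, not AON's actual architecture.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class AttentionDecoder(nn.Module):
    """Generic additive-attention decoder over a sequence of image features."""
    def __init__(self, feat_dim=512, hidden_dim=256, num_classes=37):
        super().__init__()
        self.score = nn.Linear(feat_dim + hidden_dim, 1)  # additive attention scorer
        self.rnn = nn.GRUCell(feat_dim, hidden_dim)
        self.classifier = nn.Linear(hidden_dim, num_classes)

    def forward(self, feats, steps=25):
        # feats: (B, T, feat_dim) sequence of features extracted from the image
        B, T, _ = feats.shape
        h = feats.new_zeros(B, self.rnn.hidden_size)
        logits = []
        for _ in range(steps):
            # score every feature vector against the current decoder state
            e = self.score(torch.cat([feats, h.unsqueeze(1).expand(-1, T, -1)], dim=-1))
            a = F.softmax(e, dim=1)                   # (B, T, 1) attention weights
            glimpse = (a * feats).sum(dim=1)          # (B, feat_dim) attended feature
            h = self.rnn(glimpse, h)
            logits.append(self.classifier(h))         # one character per step
        return torch.stack(logits, dim=1)             # (B, steps, num_classes)
```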
Learned Quality Enhancement via Multi-Frame Priors for HEVC Compliant Low-Delay Applications
Networked video applications, e.g., video conferencing, often suffer from
poor visual quality due to unexpected network fluctuation and limited
bandwidth. In this paper, we develop a Quality Enhancement Network
(QENet) to reduce video compression artifacts, leveraging spatial priors
generated by multi-scale convolutions and temporal priors generated by
recurrently warped temporal predictions. We integrate this QENet as a
stand-alone post-processing subsystem for the High
Efficiency Video Coding (HEVC) compliant decoder. Experimental results show
that QENet outperforms both the default in-loop filters in HEVC and other
deep learning based methods, with noticeable objective gains in Peak
Signal-to-Noise Ratio (PSNR) and clear subjective gains in visual quality.
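The pairing of multi-scale spatial convolutions with recurrently warped temporal predictions can be sketched as follows. This is a minimal illustration under our own assumptions (single-channel frames, a dense flow field supplied from outside), not the authors' QENet.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def warp(prev, flow):
    """Backward-warp the previous enhanced frame with a dense (dx, dy) flow field."""
    B, _, H, W = prev.shape
    ys, xs = torch.meshgrid(torch.arange(H), torch.arange(W), indexing="ij")
    grid = torch.stack((xs, ys), dim=-1).float().to(prev.device)   # (H, W, 2)
    grid = grid.unsqueeze(0) + flow.permute(0, 2, 3, 1)            # add per-pixel offsets
    # normalise pixel coordinates to [-1, 1] as grid_sample expects
    grid[..., 0] = 2 * grid[..., 0] / (W - 1) - 1
    grid[..., 1] = 2 * grid[..., 1] / (H - 1) - 1
    return F.grid_sample(prev, grid, mode="bilinear", align_corners=True)

class EnhanceBlock(nn.Module):
    def __init__(self, ch=32):
        super().__init__()
        # parallel dilated convolutions stand in for the multi-scale spatial priors
        self.scales = nn.ModuleList(
            nn.Conv2d(2, ch, 3, padding=d, dilation=d) for d in (1, 2, 4))
        self.fuse = nn.Conv2d(3 * ch, 1, 3, padding=1)

    def forward(self, decoded, prev_enhanced, flow):
        temporal = warp(prev_enhanced, flow)           # temporal prior (recurrent input)
        x = torch.cat([decoded, temporal], dim=1)      # (B, 2, H, W)
        feats = torch.cat([F.relu(s(x)) for s in self.scales], dim=1)
        return decoded + self.fuse(feats)              # residual enhancement
```

Feeding each enhanced output back in as `prev_enhanced` for the next frame gives the recurrent temporal behavior the abstract describes.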
Spin gap and magnetic resonance in superconducting BaFeNiAs
We use neutron spectroscopy to determine the nature of the magnetic
excitations in superconducting BaFeNiAs ($T_c$ = K).
Above $T_c$ the excitations are gapless and centered at the commensurate
antiferromagnetic wave vector of the parent compound, while the intensity
exhibits a sinusoidal modulation along the c-axis. As the superconducting state
is entered, a spin gap gradually opens, whose magnitude tracks the
$T$-dependence of the superconducting gap observed by angle-resolved
photoemission. Both the spin gap and magnetic resonance energies are
temperature and wave vector dependent, but their ratio is the same
within uncertainties. These results suggest that the spin resonance is a
singlet-triplet excitation related to electron pairing and superconductivity.
Comment: 4 pages, 4 figures
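The central quantitative observation can be stated compactly; the symbols below are our notation ($E_{\mathrm{res}}$ for the resonance energy, $\Delta_{\mathrm{spin}}$ for the spin gap), not necessarily the paper's:

```latex
% Both quantities vary with temperature T and wave vector Q,
% yet their ratio stays constant within experimental uncertainty:
\[
  \frac{E_{\mathrm{res}}(T,\mathbf{Q})}{\Delta_{\mathrm{spin}}(T,\mathbf{Q})}
  \approx \mathrm{const.}
\]
```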
E2-AEN: End-to-End Incremental Learning with Adaptively Expandable Network
Expandable networks have demonstrated their advantages in dealing with
catastrophic forgetting problem in incremental learning. Considering that
different tasks may need different structures, recent methods design dynamic
structures adapted to different tasks via sophisticated techniques. Their
routine is to first search for expandable structures and then train on the new
tasks, which, however, splits learning into multiple training stages, leading
to suboptimal performance or excessive computational cost. In this paper, we
propose an
end-to-end trainable adaptively expandable network named E2-AEN, which
dynamically generates lightweight structures for new tasks without any accuracy
drop on previous tasks. Specifically, the network contains a series of powerful
feature adapters that augment the previously learned representations for new
tasks while avoiding task interference. These adapters are controlled via an
adaptive gate-based pruning strategy which decides whether the expanded
structures can be pruned, making the network structure dynamically changeable
according to the complexity of the new tasks. Moreover, we introduce a novel
sparsity-activation regularization to encourage the model to learn
discriminative features with limited parameters. E2-AEN reduces cost and can be
built upon any feed-forward architectures in an end-to-end manner. Extensive
experiments on both classification (i.e., CIFAR and VDD) and detection (i.e.,
COCO, VOC and ICCV2021 SSLAD challenge) benchmarks demonstrate the
effectiveness of the proposed method, which achieves remarkable new results.
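To make the adapter-plus-gate idea concrete, here is a small PyTorch sketch (names and sizes hypothetical, not the authors' code): a lightweight residual adapter whose learned gate can collapse toward zero, at which point the expansion is prunable with no effect on earlier tasks.

```python
import torch
import torch.nn as nn

class GatedAdapter(nn.Module):
    """Lightweight bottleneck adapter on a frozen backbone feature map."""
    def __init__(self, channels, bottleneck=8):
        super().__init__()
        self.down = nn.Conv2d(channels, bottleneck, 1)
        self.up = nn.Conv2d(bottleneck, channels, 1)
        self.gate = nn.Parameter(torch.zeros(1))  # sigmoid(gate) ~ keep strength

    def forward(self, x):
        g = torch.sigmoid(self.gate)
        # the identity path leaves previously learned behavior untouched
        return x + g * self.up(torch.relu(self.down(x)))

    def prunable(self, threshold=0.05):
        # if the gate has collapsed, the whole branch can be removed
        return torch.sigmoid(self.gate).item() < threshold
```

An L1-style penalty on the gate activations, e.g. `loss = task_loss + lam * torch.sigmoid(adapter.gate)`, would play the role of the sparsity-activation regularization mentioned above, pushing unneeded adapters toward the prunable regime.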
MANGO: A Mask Attention Guided One-Stage Scene Text Spotter
Recently end-to-end scene text spotting has become a popular research topic
due to its advantages of global optimization and high maintainability in real
applications. Most methods attempt to develop various region of interest (RoI)
operations to concatenate the detection part and the sequence recognition part
into a two-stage text spotting framework. However, in such a framework, the
recognition part is highly sensitive to the detection results (e.g., the
compactness of text contours). To address this problem, in this paper we
propose a novel Mask AttentioN Guided One-stage text spotting framework named
MANGO, in which character sequences can be directly recognized without RoI
operation. Concretely, a position-aware mask attention module is developed to
generate attention weights on each text instance and its characters. It allows
different text instances in an image to be allocated on different feature map
channels which are further grouped as a batch of instance features. Finally, a
lightweight sequence decoder is applied to generate the character sequences. It
is worth noting that MANGO inherently adapts to arbitrary-shaped text spotting
and can be trained end-to-end with only coarse position information (e.g., a
rectangular bounding box) and text annotations. Experimental results show that
the proposed method achieves competitive and even new state-of-the-art
performance on both regular and irregular text spotting benchmarks, i.e., ICDAR
2013, ICDAR 2015, Total-Text, and SCUT-CTW1500.
Comment: Accepted to AAAI 2021. Code is available at
https://davar-lab.github.io/publication.html or
https://github.com/hikopensource/DAVAR-Lab-OCR
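Reading between the lines, the position-aware mask attention can be sketched as predicting one attention map per (instance, character-position) slot and pooling the shared feature map into a batch of per-instance sequence features. Everything below (module name, slot counts) is our illustrative assumption, not the released DAVAR-Lab code.

```python
import torch
import torch.nn as nn

class MaskAttentionPool(nn.Module):
    """Pool a shared feature map into per-instance character-sequence features."""
    def __init__(self, channels=256, max_insts=4, max_chars=25):
        super().__init__()
        self.max_insts, self.max_chars = max_insts, max_chars
        self.slots = max_insts * max_chars
        # one attention map per (instance, character-position) slot
        self.att = nn.Conv2d(channels, self.slots, 1)

    def forward(self, fmap):
        # fmap: (B, C, H, W) shared feature map for the whole image
        B, C, H, W = fmap.shape
        a = self.att(fmap).flatten(2).softmax(dim=-1)        # (B, slots, H*W)
        feats = torch.einsum("bsn,bcn->bsc", a, fmap.flatten(2))
        # regroup so each text instance carries its own character sequence,
        # ready for a lightweight sequence decoder
        return feats.view(B, self.max_insts, self.max_chars, C)
```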
Deep Active Learning for Computer Vision: Past and Future
As an important data selection scheme, active learning is an essential
component when iterating on an Artificial Intelligence (AI) model. It becomes
even more critical given the dominance in applications of deep neural network
based models, which have large numbers of parameters and are data hungry.
Despite its indispensable role in developing AI models, research on active
learning is not as intensive as that on other research directions. In this
paper, we present a review of active learning through deep active learning
approaches from the following perspectives: 1) technical advancements in active
learning, 2) applications of active learning in computer vision, 3) industrial
systems leveraging or with potential to leverage active learning for data
iteration, 4) current limitations and future research directions. We expect
this paper to clarify the significance of active learning in a modern AI model
manufacturing process and to bring additional research attention to active
learning. By addressing data automation challenges and integrating with
automated machine learning systems, active learning will help democratize AI
technologies by boosting model production at scale.
Comment: Accepted by APSIPA Transactions on Signal and Information Processing
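For readers new to the area, the pool-based loop that most surveyed methods instantiate looks like the sketch below; `train` and `predict_proba` are stand-in callables for any model API, and entropy-based uncertainty sampling is just one of the query strategies such a review covers.

```python
import numpy as np

def active_learning_loop(X_pool, y_pool, train, predict_proba,
                         seed_size=100, query_size=50, rounds=10):
    """Pool-based active learning with entropy (uncertainty) sampling."""
    rng = np.random.default_rng(0)
    labeled = list(rng.choice(len(X_pool), seed_size, replace=False))
    unlabeled = [i for i in range(len(X_pool)) if i not in set(labeled)]
    model = None
    for _ in range(rounds):
        model = train(X_pool[labeled], y_pool[labeled])
        probs = predict_proba(model, X_pool[unlabeled])        # (n, n_classes)
        entropy = -(probs * np.log(probs + 1e-12)).sum(axis=1)
        picked = np.argsort(-entropy)[:query_size]             # most uncertain first
        newly = [unlabeled[i] for i in picked]
        labeled += newly                                       # oracle labels them
        unlabeled = [i for i in unlabeled if i not in set(newly)]
    return model, labeled
```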