49 research outputs found

    Decoupled Attention Network for Text Recognition

    Text recognition has attracted considerable research interest because of its various applications. The cutting-edge text recognition methods are based on attention mechanisms. However, most attention methods suffer from a serious alignment problem caused by their recurrent alignment operation, in which the alignment relies on historical decoding results. To remedy this issue, we propose a decoupled attention network (DAN), which decouples the alignment operation from the historical decoding results. DAN is an effective, flexible and robust end-to-end text recognizer, which consists of three components: 1) a feature encoder that extracts visual features from the input image; 2) a convolutional alignment module that performs the alignment operation based on visual features from the encoder; and 3) a decoupled text decoder that makes the final prediction by jointly using the feature map and attention maps. Experimental results show that DAN achieves state-of-the-art performance on multiple text recognition tasks, including offline handwritten text recognition and regular/irregular scene text recognition. Comment: 9 pages, 8 figures, 6 tables, accepted by AAAI-202
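    The decoupling described in this abstract can be illustrated with a small sketch. It assumes that the decoder combines the feature map and each attention map by a weighted sum over spatial positions; the abstract only says the two are used jointly, so that combination rule, the function name `context_vectors`, and the toy values are illustrative assumptions rather than the authors' implementation.

```python
def context_vectors(feature_map, attention_maps):
    """Combine a C x H x W feature map with T attention maps of size H x W.

    Assumption for illustration: each decoding step t gets a context vector
    whose c-th entry is sum over (h, w) of attention[t][h][w] * feature[c][h][w].
    The attention maps are an input here, i.e. they come from visual features
    only, so every step is computed without any earlier decoding result.
    """
    contexts = []
    for attention in attention_maps:          # one map per decoding step
        context = []
        for channel in feature_map:           # one value per feature channel
            total = 0.0
            for f_row, a_row in zip(channel, attention):
                for f, a in zip(f_row, a_row):
                    total += f * a
            context.append(total)
        contexts.append(context)
    return contexts


# Toy input: 2 channels over a 1 x 3 grid, and 2 decoding steps.
toy_features = [
    [[1.0, 2.0, 3.0]],
    [[10.0, 20.0, 30.0]],
]
toy_attention = [
    [[1.0, 0.0, 0.0]],   # step 0 attends to the first position
    [[0.0, 0.5, 0.5]],   # step 1 splits attention over the last two
]
toy_contexts = context_vectors(toy_features, toy_attention)
```

    Because no step reads the output of a previous step, the loop over decoding steps could run in any order, which is the property the abstract contrasts with recurrent alignment.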

    Towards Robust Visual Information Extraction in Real World: New Dataset and Novel Solution

    Visual information extraction (VIE) has attracted considerable attention recently owing to its various advanced applications such as document understanding, automatic marking and intelligent education. Most existing works decouple this problem into several independent sub-tasks of text spotting (text detection and recognition) and information extraction, which completely ignores the high correlation among them during optimization. In this paper, we propose a robust visual information extraction system (VIES) towards real-world scenarios, which is a unified end-to-end trainable framework for simultaneous text detection, recognition and information extraction that takes a single document image as input and outputs the structured information. Specifically, the information extraction branch collects abundant visual and semantic representations from text spotting for multimodal feature fusion and, conversely, provides higher-level semantic clues that contribute to the optimization of text spotting. Moreover, given the shortage of public benchmarks, we construct a fully-annotated dataset called EPHOIE (https://github.com/HCIILAB/EPHOIE), which is the first Chinese benchmark for both text spotting and visual information extraction. EPHOIE consists of 1,494 images of examination paper heads with complex layouts and backgrounds, including a total of 15,771 Chinese handwritten or printed text instances. Compared with the state-of-the-art methods, our VIES shows significantly superior performance on the EPHOIE dataset and achieves a 9.01% F-score gain on the widely used SROIE dataset under the end-to-end scenario. Comment: 8 pages, 5 figures, to be published in AAAI 202

    A New Viscoelastic Model for Polycarbonate Compressing Flow

    To overcome the weakness of conventional models in describing compressing flow, especially in the start and end stages, the shear rate derivative was added to the right side of the PTT constitutive equation. The ability of the model to describe the well-known 'shear thinning' and 'stretch hardening' phenomena was first illustrated by theoretical analysis. Then the governing equations for compressing flow were established for an incompressible and isothermal fluid, and a numerical method was constructed to discretize the equations and obtain the compressing flow solutions. Experiments at four melt temperatures were conducted and the corresponding simulations were performed. The better agreement with experimental data indicates that the modified PTT model is superior to the original PTT model in predicting compressing flow. In addition, the proposed model is also validated with low and high compressing speed experiments.

    Structural and magnetic characterization of Al microalloying nanocrystalline FeSiBNbCu alloys

    The magnetic properties of nanocrystalline Fe77Si10B9Cu1Nb3-xAlx (x = 0 and 1) alloys have been investigated and their structural and electromagnetic parameters have been quantitatively studied by X-ray diffraction, transmission electron microscopy and Mössbauer spectra under one-step and two-step annealing processes. The nanocrystalline structure consists of a single α-Fe(Si) phase embedded in a residual amorphous phase. Both the saturation magnetic flux density (B_s) and the permeability (μ) of the nanocrystalline alloys are increased by substituting 1 at% Al for Nb and one-step annealing, from B_s = 1.41 T to 1.47 T and from μ = 18,000 to 23,000 at 1 kHz, respectively. The two-step annealing has little effect on the B_s, coercivity (H_c) and grain size of the nanocrystalline alloys, but greatly improves the μ of the Al-doped alloy, which reaches up to 28,000 at 1 kHz. The improved μ can be attributed to the increased magnetic moment and exchange stiffness constant, homogeneous chemical structure and reduced magnetostriction. The Al-doped nanocrystalline alloy, with high B_s, high μ, low H_c and good frequency stability, is a good candidate for magnetic shielding pieces for wireless charging.

    Temperature Forecasting Correction Based on Operational GRAPES-3km Model Using Machine Learning Methods

    Postprocessing correction is essential to improving model forecasting results, and machine learning methods play an increasingly important role in it. In this study, three machine learning (ML) methods (Linear Regression, LSTM-FCN and LightGBM) were used to correct the temperature forecasts of an operational high-resolution model, GRAPES-3km. The input parameters include 2 m temperature, relative humidity, local pressure and wind speed forecasting and observation data in Shaanxi province of China from 1 January 2019 to 31 December 2020. The dataset from September 2018 was used for model evaluation using the metrics of root mean square error (RMSE), mean absolute error (MAE) and coefficient of determination (R²). All three machine learning methods perform very well in correcting the temperature forecast of the GRAPES-3km model. The RMSE decreased by 33%, 32% and 40%, respectively, the MAE decreased by 33%, 34% and 41%, respectively, and the R² increased by 21.4%, 21.5% and 25.2%, respectively. Among the three methods, LightGBM performed the best, with the forecast accuracy rate reaching above 84%.
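    The three evaluation metrics named in this abstract have standard definitions, sketched below in plain Python. The temperature values are made-up toy numbers, not data from the study, and the helper `relative_decrease` is a hypothetical name for the percentage change that the abstract reports.

```python
import math


def rmse(observed, predicted):
    """Root mean square error."""
    n = len(observed)
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)


def mae(observed, predicted):
    """Mean absolute error."""
    n = len(observed)
    return sum(abs(o - p) for o, p in zip(observed, predicted)) / n


def r2(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


def relative_decrease(before, after):
    """Percentage by which a metric dropped after correction (hypothetical helper)."""
    return 100.0 * (before - after) / before


# Toy values only: observed 2 m temperatures, a raw forecast with a +2 degree
# bias, and a "corrected" forecast with a +1 degree bias.
observed = [10.0, 12.0, 14.0, 16.0]
raw_forecast = [12.0, 14.0, 16.0, 18.0]
corrected_forecast = [11.0, 13.0, 15.0, 17.0]

rmse_drop = relative_decrease(rmse(observed, raw_forecast),
                              rmse(observed, corrected_forecast))
mae_drop = relative_decrease(mae(observed, raw_forecast),
                             mae(observed, corrected_forecast))
```

    In this toy case both errors halve, so `rmse_drop` and `mae_drop` are 50%; the percentages quoted in the abstract would be obtained the same way from the real forecast and observation series.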

    Data_Sheet_1_Structure and assembly process of fungal communities in the Yangtze River Estuary.pdf

    Marine fungi are essential for the ecological function of estuarine ecosystems. However, few studies have reported on the structure and assembly pattern of the fungal communities in estuaries. The purpose of this study is to reveal the structure and the ecological processes of the fungal community in the Yangtze River Estuary (YRE) by using the amplicon sequencing method. The phyla Ascomycota, Basidiomycota, and Chytridiomycota were dominant in the seawater and sediment samples from the YRE. The null model analysis, neutral community model (NCM), and phylogenetic normalized stochasticity ratio (pNST) showed that stochastic processes dominated the assembly of fungal communities in the YRE. Drift and homogeneous dispersal were the predominant stochastic processes for the fungal community assembly in seawater and sediment samples, respectively. The co-occurrence network analysis showed that fungal communities were more complex and closely connected in the sediment than in the seawater samples. The phyla Ascomycota, Basidiomycota, and Mucoromycota were the potential keystone taxa in the network. These findings demonstrate the importance of stochastic processes for the fungal community assembly, thereby widening our knowledge of the community structure and dynamics of fungi for future study and utilization in the YRE ecosystem.