205 research outputs found

    Study on Modern Port Logistics (MPL) and forecasting container throughput of Shanghai port

    Get PDF

    Chinese Character Recognition with Radical-Structured Stroke Trees

    Full text link
    The flourishing blossom of deep learning has witnessed the rapid development of Chinese character recognition. However, it remains a great challenge that the characters for testing may have different distributions from those of the training dataset. Existing methods based on a single-level representation (character-level, radical-level, or stroke-level) may be either too sensitive to distribution changes (e.g., induced by blurring, occlusion, and zero-shot problems) or too tolerant to one-to-many ambiguities. In this paper, we represent each Chinese character as a stroke tree, which is organized according to its radical structures, to fully exploit the merits of both radical and stroke levels in a decent way. We propose a two-stage decomposition framework, where a Feature-to-Radical Decoder perceives radical structures and radical regions, and a Radical-to-Stroke Decoder further predicts the stroke sequences according to the features of radical regions. The generated radical structures and stroke sequences are encoded as a Radical-Structured Stroke Tree (RSST), which is fed to a Tree-to-Character Translator based on the proposed Weighted Edit Distance to match the closest candidate character in the RSST lexicon. Our extensive experimental results demonstrate that the proposed method outperforms the state-of-the-art single-level methods by increasing margins as the distribution difference becomes more severe in the blurring, occlusion, and zero-shot scenarios, which indeed validates the robustness of the proposed method

    TOE: A Grid-Tagging Discontinuous NER Model Enhanced by Embedding Tag/Word Relations and More Fine-Grained Tags

    Full text link
    So far, discontinuous named entity recognition (NER) has received increasing research attention and many related methods have surged such as hypergraph-based methods, span-based methods, and sequence-to-sequence (Seq2Seq) methods, etc. However, these methods more or less suffer from some problems such as decoding ambiguity and efficiency, which limit their performance. Recently, grid-tagging methods, which benefit from the flexible design of tagging systems and model architectures, have shown superiority to adapt for various information extraction tasks. In this paper, we follow the line of such methods and propose a competitive grid-tagging model for discontinuous NER. We call our model TOE because we incorporate two kinds of Tag-Oriented Enhancement mechanisms into a state-of-the-art (SOTA) grid-tagging model that casts the NER problem into word-word relationship prediction. First, we design a Tag Representation Embedding Module (TREM) to force our model to consider not only word-word relationships but also word-tag and tag-tag relationships. Concretely, we construct tag representations and embed them into TREM, so that TREM can treat tag and word representations as queries/keys/values and utilize self-attention to model their relationships. On the other hand, motivated by the Next-Neighboring-Word (NNW) and Tail-Head-Word (THW) tags in the SOTA model, we add two new symmetric tags, namely Previous-Neighboring-Word (PNW) and Head-Tail-Word (HTW), to model more fine-grained word-word relationships and alleviate error propagation from tag prediction. In the experiments of three benchmark datasets, namely CADEC, ShARe13 and ShARe14, our TOE model pushes the SOTA results by about 0.83%, 0.05% and 0.66% in F1, demonstrating its effectiveness
    corecore