7 research outputs found

    A Survey on Arabic Named Entity Recognition: Past, Recent Advances, and Future Trends

    As more and more Arabic text emerges on the Internet, extracting important information from it becomes especially useful. As a fundamental technology, named entity recognition (NER) serves as the core component of information extraction, while also playing a critical role in many other Natural Language Processing (NLP) systems, such as question answering and knowledge graph building. In this paper, we provide a comprehensive review of the development of Arabic NER, especially the recent advances brought by deep learning and pre-trained language models. Specifically, we first introduce the background of Arabic NER, including the characteristics of Arabic and existing resources for Arabic NER. Then, we systematically review the development of Arabic NER methods. Traditional Arabic NER systems focus on feature engineering and the design of domain-specific rules. In recent years, deep learning methods have achieved significant progress by representing texts via continuous vector representations. With the growth of pre-trained language models, Arabic NER has yielded further performance gains. Finally, we summarize the methodological gap between Arabic NER and NER in other languages, which helps outline future directions for Arabic NER. Comment: Accepted by IEEE TKD

    Transformer-based NMT : modeling, training and implementation

    International trade and industrial collaborations enable countries and regions to concentrate their development on specific industries while making the most of other countries' specializations, which significantly accelerates global development. However, globalization also increases the demand for cross-region communication. Language barriers between the many languages worldwide make deep collaboration between groups speaking different languages challenging, increasing the need for translation. Language technology, specifically Machine Translation (MT), holds the promise of enabling efficient, real-time communication between languages at minimal cost. Even though computers can now perform computation in parallel very fast, providing machine translation users with very low-latency translations, and although the evolution from Statistical Machine Translation (SMT) to Neural Machine Translation (NMT) with advanced deep learning algorithms has significantly boosted translation quality, current machine translation algorithms are still far from translating all input accurately. Thus, how to further improve the performance of state-of-the-art NMT algorithms remains a valuable open research question that has received wide attention. In the research presented in this thesis, we first investigate the long-distance relation modeling ability of the state-of-the-art NMT model, the Transformer. We propose to learn source phrase representations and incorporate them into the Transformer translation model, aiming to enhance its ability to capture long-distance dependencies. Second, though previous work (Bapna et al., 2018) suggests that deep Transformers have difficulty converging, we empirically find that the convergence of deep Transformers depends on the interaction between the layer normalization and residual connections employed to stabilize training.
    We conduct a theoretical study of how to ensure the convergence of Transformers, especially deep Transformers, and propose to ensure the convergence of deep Transformers by placing a Lipschitz constraint on their parameter initialization. Finally, we investigate how to dynamically determine proper and efficient batch sizes during the training of the Transformer model. We find that the gradient direction stabilizes with increasing batch size during gradient accumulation. Thus we propose to dynamically adjust batch sizes during training by monitoring the change in gradient direction within gradient accumulation, and to achieve a proper and efficient batch size by stopping the gradient accumulation when the gradient direction starts to fluctuate. For the research in this thesis, we also implement our own NMT toolkit, the Neutron implementation of the Transformer and its variants. In addition to providing fundamental features as the basis of our implementations of the approaches presented in this thesis, it supports many advanced features from recent cutting-edge research. Implementations of all our approaches in this thesis are also included and open-sourced in the toolkit. To compare with previous approaches, we mainly conducted our experiments on data from the WMT 14 English to German (En-De) and English to French (En-Fr) news translation tasks, except when studying the convergence of deep Transformers, where we replaced the WMT 14 En-Fr task with the WMT 15 Czech to English (Cs-En) news translation task to compare with Bapna et al. (2018). The sizes of these datasets vary from medium (WMT 14 En-De, ~4.5M sentence pairs) to very large (WMT 14 En-Fr, ~36M sentence pairs), so we expect our approaches to help improve translation quality between popular language pairs that are widely used and have sufficient data. China Scholarship Council
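    The batch-size criterion described in the abstract above is concrete enough to sketch. The following is a minimal illustration, not the thesis's actual implementation: it accumulates micro-batch gradients (plain NumPy arrays standing in for flattened model gradients) and stops once adding another micro-batch no longer meaningfully changes the direction of the accumulated gradient, measured by cosine similarity. The function name and the stability threshold are assumptions for illustration only.

    ```python
    import numpy as np

    def cosine(a, b):
        """Cosine similarity between two flattened gradient vectors."""
        denom = np.linalg.norm(a) * np.linalg.norm(b)
        return float(np.dot(a, b) / denom) if denom else 0.0

    def accumulate_until_stable(micro_grads, threshold=0.999):
        """Accumulate micro-batch gradients until the direction of the
        running sum stops changing, i.e. the cosine similarity between
        the sum before and after adding a micro-batch exceeds `threshold`.
        Returns the averaged gradient and the number of micro-batches used."""
        total = np.zeros_like(micro_grads[0])
        used = 0
        for g in micro_grads:
            prev = total.copy()
            total = total + g
            used += 1
            # Direction stable: further accumulation is wasted compute.
            if used > 1 and cosine(prev, total) > threshold:
                break
        return total / used, used
    ```

    In a real training loop the micro-batch gradients would come from successive backward passes, and the stopping point would determine when to apply the optimizer step; the threshold trades off gradient-estimate quality against wasted accumulation steps.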

    A study of the translation of sentiment in user-generated text

    A thesis submitted in partial fulfilment of the requirements of the University of Wolverhampton for the degree of Doctor of Philosophy. Emotions are biological states of feeling that humans may verbally express to communicate their negative or positive mood, influence others, or even inflict harm. Although emotions such as anger, happiness, affection, or fear are supposedly universal experiences, the lingual realisation of the emotional experience may vary in subtle ways across different languages. For this reason, preserving the original sentiment of the source text has always been a challenging task that draws on a translator's competence and finesse. In the professional translation industry, an incorrect translation of the sentiment-carrying lexicon is considered a critical error, as it can be misleading or in some cases harmful, since it misses a fundamental aspect of the source text, i.e. the author's sentiment. Since the advent of Neural Machine Translation (NMT), there has been a tremendous improvement in the quality of automatic translation. This has led to extensive use of online NMT tools to translate User-Generated Text (UGT) such as reviews, tweets, and social media posts, where the main message is often the author's positive or negative attitude towards an entity. In such scenarios, the process of translating the user's sentiment is entirely automatic, with no human intervention for either post-editing or accuracy checking. However, NMT output still lacks accuracy in some low-resource languages and sometimes makes critical translation errors that may not only distort the sentiment but at times flip the polarity of the source text to its exact opposite. In this thesis, we tackle the translation of sentiment in UGT by NMT systems from two perspectives: analytical and experimental.
    First, the analytical approach introduces a list of linguistic features that can lead to mistranslation of fine-grained emotions between different language pairs in the UGT domain. It also presents an error typology specific to Arabic UGT, illustrating the main linguistic phenomena that can cause mistranslation of sentiment polarity when translating Arabic UGT into English with NMT systems. Second, the experimental approach attempts to improve the translation of sentiment by addressing some of the linguistic challenges identified in the analysis as causing mistranslation of sentiment, both at the word level and at the sentence level. At the word level, we propose a Transformer NMT model trained on a sentiment-oriented vector space model (VSM) of UGT data that is capable of translating the correct sentiment polarity of challenging contronyms. At the sentence level, we propose a semi-supervised approach to overcome the problem of translating sentiment expressed in dialectal language in UGT data. We take the translation of dialectal Arabic UGT into English as a case study. Our semi-supervised AR-EN NMT model shows improved performance over the online MT Twitter tool in translating dialectal Arabic UGT, not only in terms of translation quality but also in the preservation of the sentiment polarity of the source text. The experimental section also presents an empirical method to quantify the notion of sentiment transfer by an MT system and, more concretely, to modify automatic metrics such that their MT rankings come closer to human judgements of a poor or good translation of sentiment.