416 research outputs found

    Machine translation of morphologically rich languages using deep neural networks

    This thesis addresses some of the challenges of translating morphologically rich languages (MRLs). Words in MRLs have more complex structures than those in other languages, so that a word can be viewed as a hierarchical structure with several internal subunits. Accordingly, word-based models in which words are treated as atomic units are not suitable for this set of languages. As a commonly used and effective solution, morphological decomposition is applied to segment words into atomic and meaning-preserving units, but this raises other types of problems, some of which we study here. We mainly use neural networks (NNs) to perform machine translation (MT) in our research and study their different properties. However, our research is not limited to neural models alone, as we also consider some of the difficulties of conventional MT methods. First we try to model morphologically complex words (MCWs) and provide better word-level representations. Words are symbolic concepts which are represented numerically in order to be used in NNs. Our first goal is to tackle this problem and find the best representation for MCWs. In the next step we focus on language modeling (LM) and work at the sentence level. We propose new morpheme-segmentation models by which we finetune existing LMs for MRLs. In this part of our research we try to find the most efficient neural language model for MRLs. After providing word- and sentence-level neural information in the first two steps, we try to use such information to enhance the translation quality in the statistical machine translation (SMT) pipeline using several different models. Accordingly, the main goal in this part is to find methods by which deep neural networks (DNNs) can improve SMT. One of the main interests of the thesis is to study neural machine translation (NMT) engines from different perspectives, and finetune them to work with MRLs. In the last step we target this problem and perform end-to-end sequence modeling via NN-based models. NMT engines have recently improved significantly and perform as well as state-of-the-art systems, but still have serious problems with morphologically complex constituents. This shortcoming of NMT is studied in two separate chapters in the thesis, where in one chapter we investigate the impact of different non-linguistic morpheme-segmentation models on the NMT pipeline, and in the other one we benefit from a linguistically motivated morphological analyzer and propose a novel neural architecture particularly for translating from MRLs. Our overall goal for this part of the research is to find the most suitable neural architecture to translate MRLs. We evaluated our models on different MRLs such as Czech, Farsi, German, Russian, and Turkish, and observed significant improvements. The main goal targeted in this research was to incorporate morphological information into MT and define architectures which are able to model the complex nature of MRLs. The results obtained from our experimental studies confirm that we were able to achieve our goal.

    A Deep Network Model for Paraphrase Detection in Short Text Messages

    This paper is concerned with paraphrase detection. The ability to detect similar sentences written in natural language is crucial for several applications, such as text mining, text summarization, plagiarism detection, authorship authentication and question answering. Given two sentences, the objective is to detect whether they are semantically identical. An important insight from this work is that existing paraphrase systems perform well when applied to clean texts, but they do not necessarily deliver good performance on noisy texts. Challenges with paraphrase detection on user-generated short texts, such as Twitter, include language irregularity and noise. To cope with these challenges, we propose a novel deep neural network-based approach that relies on coarse-grained sentence modeling using a convolutional neural network and a long short-term memory model, combined with a specific fine-grained word-level similarity matching model. Our experimental results show that the proposed approach outperforms existing state-of-the-art approaches on user-generated noisy social media data, such as Twitter texts, and achieves highly competitive performance on a cleaner corpus.

    From feature to paradigm: deep learning in machine translation

    In recent years, deep learning algorithms have revolutionized several areas, including speech, image and natural language processing. The specific field of Machine Translation (MT) has been no exception. Integration of deep learning in MT varies from re-modeling existing features into standard statistical systems to the development of a new architecture. Among the different neural networks, research works use feed-forward neural networks, recurrent neural networks and the encoder-decoder schema. These architectures are able to tackle challenges such as low-resource settings or morphological variation. This manuscript focuses on describing how these neural networks have been integrated to enhance different aspects and models from statistical MT, including language modeling, word alignment, translation, reordering, and rescoring. Then, we report the new neural MT approach together with a description of the foundational related works and recent approaches using subwords, characters and multilingual training, among others. Finally, we include an analysis of the corresponding challenges and future work in using deep learning in MT. Postprint (author's final draft).

    Word Sequence Modeling using Deep Learning: an End-to-end Approach and its Applications

    For a long time, natural language processing (NLP) has relied on generative models with task-specific and manually engineered features. Recently, there has been a resurgence of interest in neural networks in the machine learning community, obtaining state-of-the-art results in various fields such as computer vision, speech processing and natural language processing. The central idea behind these approaches is to learn features and models simultaneously, in an end-to-end manner, while making as few assumptions as possible. In NLP, word embeddings, which map words in a dictionary onto a continuous low-dimensional vector space, have proven to be very efficient for a large variety of tasks while requiring almost no a priori linguistic assumptions. In this thesis, we investigate continuous representations of segments in a sentence for the purpose of solving NLP tasks that involve complex sentence-level relationships. Our sequence modelling approach is based on neural networks and takes advantage of word embeddings. A first approach models words in context in the form of continuous vector representations which are used to solve the task of interest. With the use of a compositional procedure, allowing arbitrarily-sized segments to be compressed onto continuous vectors, the model is able to consider long-range word dependencies as well. We first validate our approach on the task of bilingual word alignment, which consists in finding word correspondences between a sentence in two different languages. Source and target words in context are modeled using convolutional neural networks, obtaining representations that are later used to compute alignment scores. An aggregation operation enables unsupervised training for this task. We show that our model outperforms a standard generative model. The model above is extended to tackle phrase prediction tasks where phrases rather than single words are to be tagged. These tasks have typically been cast as classic word tagging problems using special tagging schemes to identify the segment boundaries. The proposed neural model focuses on learning fixed-size representations of arbitrarily-sized chunks of words that are used to solve the tagging task. A compositional operation is introduced in this work for the purpose of computing these representations. We demonstrate the viability of the proposed representations by evaluating the approach on the multiword expression tagging task. The remainder of this thesis addresses the task of syntactic constituency parsing which, as opposed to the above tasks, aims at producing a structured output, in the form of a tree, of an input sentence. Syntactic parsing is cast as multiple phrase prediction problems that are solved recursively in a greedy manner. An extension using recursive compositional vector representations, allowing for lexical information to be propagated from early stages, is explored as well. This approach is evaluated on a standard corpus, obtaining performance comparable to generative models with much shorter computation time. Finally, morphological tags are included as additional features, using a similar composition procedure, to improve the parsing performance for morphologically rich languages. State-of-the-art results were obtained for these tasks and languages.

    A review of sentiment analysis research in Arabic language

    Sentiment analysis is a task of natural language processing which has recently attracted increasing attention. However, sentiment analysis research has mainly been carried out for the English language. Although Arabic is ramping up as one of the most used languages on the Internet, only a few studies have focused on Arabic sentiment analysis so far. In this paper, we carry out an in-depth qualitative study of the most important research works in this context by presenting the limits and strengths of existing approaches. In particular, we survey both approaches that leverage machine translation or transfer learning to adapt English resources to Arabic and approaches that stem directly from the Arabic language.

    Proposed Hybrid model for Sentiment Classification using CovNet-DualLSTM Techniques

    The fast growth of the Internet and social media has resulted in a significant quantity of text-based reviews being posted on social media platforms. In the age of social media, analyzing the emotional context of comments using machine learning technology helps in understanding the Quality of Service (QoS) of any product or service. Analysis and classification of users' reviews helps in improving the QoS. Machine learning techniques have evolved as a great tool for performing sentiment analysis of users' reviews. In contrast to traditional classification models, Bidirectional Long Short-Term Memory (BiLSTM) has obtained substantial outcomes and the Convolutional Neural Network (CNN) has shown promising outcomes in sentiment classification. A CNN can successfully retrieve local information by utilizing convolution and pooling layers. BiLSTM employs dual LSTM orientations to increase the background knowledge accessible to deep learning based models. The hybrid model proposed here utilizes the advantages of these two deep learning based models. Users' tweets reviewing Indian Railway Services have been used as the data source for analysis and classification. The Keras Embedding technique is used as the input source to the proposed hybrid model. The proposed model receives inputs and generates features with lower dimensions, which generate a classification result. The performance of the proposed hybrid model was compared using Keras and Word2Vec, and an effective improvement in the response of the proposed model was observed, with an accuracy of 95.19%.

    Representation Learning for Natural Language Processing

    This open access book provides an overview of the recent advances in representation learning theory, algorithms and applications for natural language processing (NLP). It is divided into three parts. Part I presents the representation learning techniques for multiple language entries, including words, phrases, sentences and documents. Part II then introduces the representation techniques for those objects that are closely related to NLP, including entity-based world knowledge, sememe-based linguistic knowledge, networks, and cross-modal entries. Lastly, Part III provides open resource tools for representation learning techniques, and discusses the remaining challenges and future research directions. The theories and algorithms of representation learning presented can also benefit other related domains such as machine learning, social network analysis, semantic Web, information retrieval, data mining and computational biology. This book is intended for advanced undergraduate and graduate students, post-doctoral fellows, researchers, lecturers, and industrial engineers, as well as anyone interested in representation learning and natural language processing.