365 research outputs found

    Neural Graph Transfer Learning in Natural Language Processing Tasks

    Get PDF
    Natural language is essential in our daily lives, as we rely on language to communicate and exchange information. A fundamental goal of natural language processing (NLP) is to let machines understand natural language in order to help or replace human experts in mining knowledge and completing tasks. Many NLP tasks deal with sequential data; for example, a sentence is treated as a sequence of words. Recently, deep learning-based language models (e.g., BERT \citep{devlin2018bert}) have achieved significant improvements on many existing tasks, including text classification and natural language inference. However, not all tasks can be formulated with sequence models. Graph-structured data is also fundamental in NLP, including entity linking, entity classification, relation extraction, abstract meaning representation, and knowledge graphs \citep{santoro2017simple,hamilton2017representation,kipf2016semi}. In these scenarios, BERT-based pretrained models may not be suitable. The Graph Convolutional Network (GCN) \citep{kipf2016semi} is a deep neural network model designed for graphs, and it has shown great potential in text classification, link prediction, and question answering. This dissertation presents novel graph models for NLP tasks, including text classification, prerequisite chain learning, and coreference resolution. We focus on different perspectives of graph convolutional network modeling: for text classification, we propose a novel graph construction method that makes predictions interpretable; for prerequisite chain learning, we propose multiple aggregation functions that exploit neighboring nodes for better information exchange; for coreference resolution, we study how graph pretraining can help when labeled data is limited. Moreover, an important branch of this work applies pretrained language models to these tasks, so the dissertation also focuses on transfer learning methods that generalize pretrained models to other domains, including medical, cross-lingual, and web data. Finally, we propose a new task, unsupervised cross-domain prerequisite chain learning, and study novel graph-based methods to transfer knowledge over graphs.
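
    Since this abstract builds on the GCN of \citep{kipf2016semi}, a minimal sketch of that model's layer-wise propagation rule, H' = ReLU(D^{-1/2}(A + I)D^{-1/2} H W), may help; the toy graph and dimensions below are illustrative and not taken from the dissertation.

    ```python
    import numpy as np

    def gcn_layer(A, H, W):
        """One GCN layer: H' = ReLU(D^-1/2 (A + I) D^-1/2 H W).

        A: (n, n) adjacency matrix (e.g., a word/document graph)
        H: (n, d_in) node features; W: (d_in, d_out) learned weights
        """
        A_hat = A + np.eye(A.shape[0])            # add self-loops
        d_inv_sqrt = np.diag(1.0 / np.sqrt(A_hat.sum(axis=1)))
        H_next = d_inv_sqrt @ A_hat @ d_inv_sqrt @ H @ W
        return np.maximum(H_next, 0.0)            # ReLU

    # toy path graph with 3 nodes, 2 input features, 4 output features
    rng = np.random.default_rng(0)
    A = np.array([[0., 1., 0.], [1., 0., 1.], [0., 1., 0.]])
    print(gcn_layer(A, rng.normal(size=(3, 2)), rng.normal(size=(2, 4))).shape)  # (3, 4)
    ```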

    Overview of the Multi-Task Mutual Learning Technique: A Comparative Analysis of Different Models for Sentiment Analysis and Topic Detection

    Get PDF
    This research aims to provide a clearer overview of a new technique called Multi-task Mutual Learning in the field of Natural Language Processing, specifically in sentiment analysis and topic detection. The objective is to understand whether employing different models within this technique impacts its performance. With the growing collection of natural language-based data, private companies, public organizations, and other entities increasingly seek to extract information from this vast amount of data, which can take the form of audio, text, or video. This underscores the need for systems that can analyze such data effectively and in the shortest possible time, providing a competitive advantage in the private sector and a social analysis of the current historical moment in the public domain. The method employed is Mutual Learning, and within this technique we analyzed specific models, including the Variational Autoencoder, the Dirichlet Variational Autoencoder, the Recurrent Neural Network, and Bidirectional Encoder Representations from Transformers. These methods were run on two datasets: Yelp, containing reviews of businesses, and IMDB, containing reviews of films. The main findings highlight the trade-offs among model complexity, the computational power required, and how far each model can be customized to specific needs.
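
    The abstract does not spell out the training objective, but in the standard mutual-learning formulation (Zhang et al., 2018) two peer models fit the labels while each also mimics the other's predicted distribution. The sketch below shows that generic loss; the toy batch, shapes, and model pairing are invented for illustration and may differ from the thesis.

    ```python
    import torch
    import torch.nn.functional as F

    def mutual_learning_losses(logits_a, logits_b, labels):
        """Generic two-peer mutual learning: each model minimizes
        cross-entropy on the labels plus KL toward its peer's predictions."""
        log_p_a = F.log_softmax(logits_a, dim=-1)
        log_p_b = F.log_softmax(logits_b, dim=-1)
        # Peer predictions are detached so each model only updates itself.
        kl_a = F.kl_div(log_p_a, log_p_b.detach(), reduction="batchmean", log_target=True)
        kl_b = F.kl_div(log_p_b, log_p_a.detach(), reduction="batchmean", log_target=True)
        loss_a = F.cross_entropy(logits_a, labels) + kl_a
        loss_b = F.cross_entropy(logits_b, labels) + kl_b
        return loss_a, loss_b

    # toy batch: 8 reviews, 3 sentiment classes (shapes are illustrative)
    logits_a = torch.randn(8, 3, requires_grad=True)  # e.g., a VAE-based classifier head
    logits_b = torch.randn(8, 3, requires_grad=True)  # e.g., a BERT classifier head
    labels = torch.randint(0, 3, (8,))
    loss_a, loss_b = mutual_learning_losses(logits_a, logits_b, labels)
    print(float(loss_a), float(loss_b))
    ```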

    AaKOS: Aspect-adaptive Knowledge-based Opinion Summarization

    Full text link
    The rapid growth of information on the Internet has led to an overwhelming amount of opinions and comments on various activities, products, and services. This makes it difficult and time-consuming for users to process all the available information when making decisions. Text summarization, a Natural Language Processing (NLP) task, has been widely explored to help users quickly retrieve relevant information by generating short and salient content from long or multiple documents. Recent advances in pre-trained language models, such as ChatGPT, have demonstrated the potential of Large Language Models (LLMs) in text generation. However, LLMs require massive amounts of data and resources and are challenging to deploy as offline applications. Furthermore, existing text summarization approaches often lack the "adaptive" nature required to capture diverse aspects in opinion summarization, which is particularly detrimental to users with specific requirements or preferences. In this paper, we propose an Aspect-adaptive Knowledge-based Opinion Summarization model for product reviews, which effectively captures the adaptive nature required for opinion summarization. Given a set of reviews for a particular product, the model generates aspect-oriented summaries, efficiently providing users with useful information on the specific aspects they are interested in and ensuring the generated summaries are more personalized and informative. Extensive experiments have been conducted on real-world datasets to evaluate the proposed model. The results demonstrate that our model outperforms state-of-the-art approaches and is adaptive and efficient in generating summaries that focus on particular aspects, enabling users to make well-informed decisions and catering to their diverse interests and preferences. Comment: 21 pages, 4 figures, 7 tables.
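
    The AaKOS architecture itself is not reproduced here; as a rough illustration of the general idea of aspect-oriented summarization, the sketch below conditions an off-the-shelf summarizer by prefixing the target aspect to the pooled reviews. The model choice, prompt format, and review texts are all assumptions, not the paper's method.

    ```python
    from transformers import pipeline

    # Off-the-shelf summarizer used purely for illustration; AaKOS is a
    # different, knowledge-based model.
    summarizer = pipeline("summarization", model="facebook/bart-large-cnn")

    reviews = [
        "Battery easily lasts two days, though the screen scratches fast.",
        "Charging is quick and the battery life is outstanding.",
        "Mediocre screen, but the battery never lets me down.",
    ]
    aspect = "battery"  # the aspect the user cares about (assumed prompt format)
    text = f"Aspect: {aspect}. " + " ".join(reviews)
    print(summarizer(text, max_length=40, min_length=8)[0]["summary_text"])
    ```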

    Unsupervised Opinion Summarization with Noising and Denoising

    Full text link
    The supervised training of high-capacity models on large datasets containing hundreds of thousands of document-summary pairs is critical to the recent success of deep learning techniques for abstractive summarization. Unfortunately, in most domains (other than news) such training data is not available and cannot be easily sourced. In this paper we enable the use of supervised learning in the setting where only documents are available (e.g., product or business reviews) without ground-truth summaries. We create a synthetic dataset from a corpus of user reviews by sampling a review, pretending it is a summary, and generating noisy versions thereof, which we treat as pseudo-review input. We introduce several linguistically motivated noise generation functions and a summarization model that learns to denoise the input and generate the original review. At test time, the model accepts genuine reviews and generates a summary containing salient opinions, treating those that do not reach consensus as noise. Extensive automatic and human evaluation shows that our model brings substantial improvements over both abstractive and extractive baselines. Comment: ACL 2020.
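
    The paper's exact noise functions are not given in the abstract; the sketch below shows the general recipe it describes (sample a review as the pseudo-summary target, corrupt it into pseudo-review inputs for a denoising summarizer). The two noise functions and all names are illustrative stand-ins.

    ```python
    import random

    def token_drop(tokens, p=0.15, rng=random):
        """Token-level noise: delete each word with probability p."""
        kept = [t for t in tokens if rng.random() > p]
        return kept or tokens  # never emit an empty pseudo-review

    def chunk_shuffle(tokens, n_chunks=3, rng=random):
        """Document-level noise: split into chunks and reorder them."""
        size = max(1, len(tokens) // n_chunks)
        chunks = [tokens[i:i + size] for i in range(0, len(tokens), size)]
        rng.shuffle(chunks)
        return [t for chunk in chunks for t in chunk]

    def make_training_pair(review, n_noisy=8, rng=random):
        """Sample a review as the pseudo-summary target and corrupt it
        into n_noisy pseudo-review inputs for the denoising model."""
        tokens = review.split()
        inputs = [" ".join(chunk_shuffle(token_drop(tokens, rng=rng), rng=rng))
                  for _ in range(n_noisy)]
        return inputs, review

    inputs, target = make_training_pair("the pasta was fresh and the staff were friendly")
    ```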

    Deep Learning for Text Style Transfer: A Survey

    Full text link
    Text style transfer is an important task in natural language generation which aims to control certain attributes of the generated text, such as politeness, emotion, and humor. It has a long history in the field of natural language processing, and it has recently regained significant attention thanks to the promising performance brought by deep neural models. In this paper, we present a systematic survey of the research on neural text style transfer, spanning over 100 representative articles since the first neural text style transfer work in 2017. We discuss the task formulation, existing datasets and subtasks, evaluation, and the rich methodologies in the presence of parallel and non-parallel data. We also discuss a variety of important topics regarding the future development of this task. Our curated paper list is at https://github.com/zhijing-jin/Text_Style_Transfer_Survey. Comment: Computational Linguistics Journal, 2022.
    • …