365 research outputs found
Neural Graph Transfer Learning in Natural Language Processing Tasks
Natural language is essential in our daily lives, as we rely on language to communicate and exchange information. A fundamental goal of natural language processing (NLP) is to enable machines to understand natural language so they can help or replace human experts in mining knowledge and completing tasks. Many NLP tasks deal with sequential data; for example, a sentence is treated as a sequence of words. Recently, deep learning-based language models (e.g., BERT \citep{devlin2018bert}) have achieved significant improvements on many existing tasks, including text classification and natural language inference. However, not all tasks can be formulated with sequence models. Graph-structured data is also fundamental in NLP, arising in entity linking, entity classification, relation extraction, Abstract Meaning Representation, and knowledge graphs \citep{santoro2017simple,hamilton2017representation,kipf2016semi}. In this scenario, BERT-based pretrained models may not be suitable. The Graph Convolutional Network (GCN) \citep{kipf2016semi} is a deep neural network model designed for graphs, and it has shown great potential in text classification, link prediction, question answering, and more. This dissertation presents novel graph models for NLP tasks, including text classification, prerequisite chain learning, and coreference resolution. We focus on different perspectives of graph convolutional network modeling: for text classification, we propose a novel graph construction method that makes the predictions interpretable; for prerequisite chain learning, we propose multiple aggregation functions that exploit neighbor information for better information exchange; for coreference resolution, we study how graph pretraining can help when labeled data is limited.
An important related branch is applying pretrained language models to these tasks, so this dissertation also focuses on transfer learning methods that generalize pretrained models to other domains, including medical, cross-lingual, and web data. Finally, we propose a new task called unsupervised cross-domain prerequisite chain learning, and study novel graph-based methods to transfer knowledge across graphs.
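As background for the GCN model cited above, a single graph-convolution layer can be sketched in plain NumPy. This is an illustrative sketch of the propagation rule from Kipf & Welling (self-loops plus symmetric degree normalization), not the dissertation's code; the toy graph and weights are made up for demonstration.

```python
import numpy as np

def gcn_layer(adj, features, weights):
    """One graph convolution layer:
    H' = ReLU(D^{-1/2} (A + I) D^{-1/2} H W)."""
    a_hat = adj + np.eye(adj.shape[0])            # add self-loops
    deg = a_hat.sum(axis=1)                       # node degrees of A + I
    d_inv_sqrt = np.diag(deg ** -0.5)             # D^{-1/2}
    a_norm = d_inv_sqrt @ a_hat @ d_inv_sqrt      # symmetric normalization
    return np.maximum(a_norm @ features @ weights, 0.0)  # ReLU

# toy graph: 3 nodes in a path, edges 0-1 and 1-2
adj = np.array([[0, 1, 0],
                [1, 0, 1],
                [0, 1, 0]], dtype=float)
feats = np.eye(3)              # one-hot node features
w = np.full((3, 2), 0.5)       # toy weight matrix
out = gcn_layer(adj, feats, w)
print(out.shape)               # (3, 2): 2-dim embedding per node
```

Stacking such layers lets each node aggregate information from progressively larger neighborhoods, which is what the aggregation-function variants for prerequisite chain learning build on.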
Overview of the Multi-Task Mutual Learning Technique: A Comparative Analysis of Different Models for Sentiment Analysis and Topic Detection
This research aims to provide a clearer overview of a new technique called Multi-task Mutual Learning in the field of Natural Language Processing, specifically in sentiment analysis and topic detection. The objective is to understand whether employing different models within this technique affects its performance. With the growing collection of natural language-based data, private companies, public organizations, and various entities increasingly seek to extract information from this vast amount of data, which can be in the form of audio, text, or video. This underscores the need for systems that can analyze this data effectively and in the shortest possible time, providing a competitive advantage in the private sector and a social analysis of the current historical moment in the public domain. The method employed is Mutual Learning, and within this technique we analyzed specific models, including the Variational Autoencoder, Dirichlet Variational Autoencoder, Recurrent Neural Network, and Bidirectional Encoder Representations from Transformers. These methods were run on two datasets: YELP, containing reviews of businesses, and IMDB, containing reviews of films. The main findings highlight the complexity of the models, the computational power required, and the need to customize the model to specific requirements.
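For readers unfamiliar with the technique, the core of mutual learning is a pair of peer models, each trained with its own task loss plus a KL term pulling it toward the other model's predictions. The NumPy sketch below illustrates that objective in the style of deep mutual learning; it is an assumption-laden illustration, not the study's multi-task implementation.

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def mutual_learning_losses(logits_a, logits_b, labels):
    """Losses for two peer models: each gets its own cross-entropy
    plus a KL term toward the other model's predictive distribution."""
    p_a, p_b = softmax(logits_a), softmax(logits_b)
    n = len(labels)
    ce_a = -np.log(p_a[np.arange(n), labels]).mean()   # model A task loss
    ce_b = -np.log(p_b[np.arange(n), labels]).mean()   # model B task loss
    kl_ba = (p_b * np.log(p_b / p_a)).sum(axis=-1).mean()  # KL(p_b || p_a)
    kl_ab = (p_a * np.log(p_a / p_b)).sum(axis=-1).mean()  # KL(p_a || p_b)
    return ce_a + kl_ba, ce_b + kl_ab
```

In the multi-task setting studied here, the peers additionally share signal across the sentiment and topic objectives, so each model's loss would combine both task heads.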
AaKOS: Aspect-adaptive Knowledge-based Opinion Summarization
The rapid growth of information on the Internet has led to an overwhelming
amount of opinions and comments on various activities, products, and services.
This makes it difficult and time-consuming for users to process all the
available information when making decisions. Text summarization, a Natural
Language Processing (NLP) task, has been widely explored to help users quickly
retrieve relevant information by generating short and salient content from long
or multiple documents. Recent advances in pre-trained language models, such as
ChatGPT, have demonstrated the potential of Large Language Models (LLMs) in
text generation. However, LLMs require massive amounts of data and resources
and are challenging to implement as offline applications. Furthermore, existing
text summarization approaches often lack the "adaptive" nature required to
capture diverse aspects in opinion summarization, which is particularly
detrimental to users with specific requirements or preferences. In this paper,
we propose an Aspect-adaptive Knowledge-based Opinion Summarization model for
product reviews, which effectively captures the adaptive nature required for
opinion summarization. The model generates aspect-oriented summaries given a
set of reviews for a particular product, efficiently providing users with
useful information on specific aspects they are interested in, ensuring the
generated summaries are more personalized and informative. Extensive
experiments have been conducted using real-world datasets to evaluate the
proposed model. The results demonstrate that our model outperforms
state-of-the-art approaches and is adaptive and efficient in generating
summaries that focus on particular aspects, enabling users to make
well-informed decisions and catering to their diverse interests and
preferences. Comment: 21 pages, 4 figures, 7 tables
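As a rough intuition for aspect-oriented summarization, the sketch below simply keeps review sentences that mention a user's aspect terms. This is a naive extractive stand-in written for illustration; the AaKOS model itself is knowledge-based and abstractive, and the sample reviews and aspect terms are invented.

```python
def aspect_filter(reviews, aspect_terms):
    """Keep sentences that mention any of the user's aspect terms.
    A toy extractive baseline, not the AaKOS model."""
    picked = []
    for review in reviews:
        for sent in review.split(". "):
            if any(term in sent.lower() for term in aspect_terms):
                picked.append(sent.strip())
    return picked

reviews = ["Battery life is great. The screen scratches easily",
           "Nice screen, but battery drains fast"]
print(aspect_filter(reviews, ["battery"]))
# only the battery-related content survives
```

An adaptive model goes further by generating fluent summaries conditioned on the requested aspect rather than copying sentences verbatim.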
Unsupervised Opinion Summarization with Noising and Denoising
The supervised training of high-capacity models on large datasets containing
hundreds of thousands of document-summary pairs is critical to the recent
success of deep learning techniques for abstractive summarization.
Unfortunately, in most domains (other than news) such training data is not
available and cannot be easily sourced. In this paper we enable the use of
supervised learning for the setting where there are only documents available
(e.g.,~product or business reviews) without ground truth summaries. We create a
synthetic dataset from a corpus of user reviews by sampling a review,
pretending it is a summary, and generating noisy versions thereof which we
treat as pseudo-review input. We introduce several linguistically motivated
noise generation functions and a summarization model which learns to denoise
the input and generate the original review. At test time, the model accepts
genuine reviews and generates a summary containing salient opinions, treating
those that do not reach consensus as noise. Extensive automatic and human
evaluation shows that our model brings substantial improvements over both
abstractive and extractive baselines. Comment: ACL 2020
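The synthetic-dataset construction described above (sample a review, treat it as the summary, and noise it into pseudo-review inputs) can be sketched as follows. The two noise functions here, token dropout and light local shuffling, are simplified stand-ins for the paper's linguistically motivated noising; function names and parameters are illustrative.

```python
import random

def noisy_copy(tokens, drop_prob=0.1, shuffle_dist=3):
    """Make one pseudo-review from a sampled review via token dropout
    and light local reordering (simplified noise functions)."""
    kept = [t for t in tokens if random.random() > drop_prob] or tokens[:1]
    # local shuffle: sort by position plus a small random jitter
    keys = [i + random.uniform(0, shuffle_dist) for i in range(len(kept))]
    return [t for _, t in sorted(zip(keys, kept))]

def make_synthetic_pair(corpus, n_noisy=8, seed=0):
    """Sample a review as the pseudo-summary target; its noisy copies
    become the multi-document pseudo-review input."""
    random.seed(seed)
    summary = random.choice(corpus)
    inputs = [noisy_copy(summary.split()) for _ in range(n_noisy)]
    return inputs, summary
```

Training a summarizer to map the noisy copies back to the original review teaches it to discard non-consensus content, which is exactly the behavior exploited at test time on genuine review sets.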
Deep Learning for Text Style Transfer: A Survey
Text style transfer is an important task in natural language generation,
which aims to control certain attributes in the generated text, such as
politeness, emotion, humor, and many others. It has a long history in the field
of natural language processing, and recently has re-gained significant
attention thanks to the promising performance brought by deep neural models. In
this paper, we present a systematic survey of the research on neural text style
transfer, spanning over 100 representative articles since the first neural text
style transfer work in 2017. We discuss the task formulation, existing datasets
and subtasks, evaluation, as well as the rich methodologies in the presence of
parallel and non-parallel data. We also provide discussions on a variety of
important topics regarding the future development of this task. Our curated
paper list is at https://github.com/zhijing-jin/Text_Style_Transfer_Survey. Comment: Computational Linguistics Journal, 2022