365 research outputs found
Neural Graph Transfer Learning in Natural Language Processing Tasks
Natural language is essential in our daily lives, as we rely on language to communicate and exchange information. A fundamental goal of natural language processing (NLP) is to enable machines to understand natural language so they can help or replace human experts in mining knowledge and completing tasks. Many NLP tasks deal with sequential data; for example, a sentence is treated as a sequence of words. Recently, deep learning-based language models (e.g., BERT \citep{devlin2018bert}) have achieved significant improvements on many existing tasks, including text classification and natural language inference. However, not all tasks can be formulated with sequence models. Graph-structured data is also fundamental in NLP, arising in entity linking, entity classification, relation extraction, Abstract Meaning Representation, and knowledge graphs \citep{santoro2017simple,hamilton2017representation,kipf2016semi}. In this scenario, BERT-based pretrained models may not be suitable. The Graph Convolutional Network (GCN) \citep{kipf2016semi} is a deep neural network model designed for graphs, and it has shown great potential in text classification, link prediction, question answering, and more. This dissertation presents novel graph models for NLP tasks, including text classification, prerequisite chain learning, and coreference resolution. We focus on different perspectives of graph convolutional network modeling: for text classification, we propose a novel graph construction method that makes the predictions interpretable; for prerequisite chain learning, we propose multiple aggregation functions that exploit neighbor information for better information exchange; for coreference resolution, we study how graph pretraining can help when labeled data is limited.
An important related branch is applying pretrained language models to these tasks, so this dissertation also focuses on transfer learning methods that generalize pretrained models to other domains, including medical, cross-lingual, and web data. Finally, we propose a new task called unsupervised cross-domain prerequisite chain learning, and study novel graph-based methods to transfer knowledge across graphs.
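As background for the GCN model cited above, a single graph-convolution layer can be sketched in plain NumPy. This is an illustrative sketch of the propagation rule from Kipf & Welling (self-loops plus symmetric degree normalization), not the dissertation's code; the toy graph and weights are made up for demonstration.

```python
import numpy as np

def gcn_layer(adj, features, weights):
    """One graph convolution layer:
    H' = ReLU(D^{-1/2} (A + I) D^{-1/2} H W)."""
    a_hat = adj + np.eye(adj.shape[0])            # add self-loops
    deg = a_hat.sum(axis=1)                       # node degrees of A + I
    d_inv_sqrt = np.diag(deg ** -0.5)             # D^{-1/2}
    a_norm = d_inv_sqrt @ a_hat @ d_inv_sqrt      # symmetric normalization
    return np.maximum(a_norm @ features @ weights, 0.0)  # ReLU

# toy graph: 3 nodes in a path, edges 0-1 and 1-2
adj = np.array([[0, 1, 0],
                [1, 0, 1],
                [0, 1, 0]], dtype=float)
feats = np.eye(3)              # one-hot node features
w = np.full((3, 2), 0.5)       # toy weight matrix
out = gcn_layer(adj, feats, w)
print(out.shape)               # (3, 2): 2-dim embedding per node
```

Stacking such layers lets each node aggregate information from progressively larger neighborhoods, which is what the aggregation-function variants for prerequisite chain learning build on.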
Overview of the Multi-Task Mutual Learning Technique: A Comparative Analysis of Different Models for Sentiment Analysis and Topic Detection
This research aims to provide a clearer overview of a new technique called Multi-task Mutual Learning in the field of Natural Language Processing, specifically in sentiment analysis and topic detection. The objective is to understand whether employing different models within this technique affects its performance. With the growing collection of natural language-based data, private companies, public organizations, and various entities increasingly seek to extract information from this vast amount of data, which can be in the form of audio, text, or video. This underscores the need for systems that can analyze this data effectively and in the shortest possible time, providing a competitive advantage in the private sector and a social analysis of the current historical moment in the public domain. The method employed is Mutual Learning, and within this technique we analyzed specific models, including the Variational Autoencoder, Dirichlet Variational Autoencoder, Recurrent Neural Network, and Bidirectional Encoder Representations from Transformers. These methods were run on two datasets: YELP, containing reviews of businesses, and IMDB, containing reviews of films. The main findings highlight the complexity of the models, the computational power required, and the need to customize the model to specific requirements.
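For readers unfamiliar with the technique, the core of mutual learning is a pair of peer models, each trained with its own task loss plus a KL term pulling it toward the other model's predictions. The NumPy sketch below illustrates that objective in the style of deep mutual learning; it is an assumption-laden illustration, not the study's multi-task implementation.

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def mutual_learning_losses(logits_a, logits_b, labels):
    """Losses for two peer models: each gets its own cross-entropy
    plus a KL term toward the other model's predictive distribution."""
    p_a, p_b = softmax(logits_a), softmax(logits_b)
    n = len(labels)
    ce_a = -np.log(p_a[np.arange(n), labels]).mean()   # model A task loss
    ce_b = -np.log(p_b[np.arange(n), labels]).mean()   # model B task loss
    kl_ba = (p_b * np.log(p_b / p_a)).sum(axis=-1).mean()  # KL(p_b || p_a)
    kl_ab = (p_a * np.log(p_a / p_b)).sum(axis=-1).mean()  # KL(p_a || p_b)
    return ce_a + kl_ba, ce_b + kl_ab
```

In the multi-task setting studied here, the peers additionally share signal across the sentiment and topic objectives, so each model's loss would combine both task heads.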
AaKOS: Aspect-adaptive Knowledge-based Opinion Summarization
The rapid growth of information on the Internet has led to an overwhelming
amount of opinions and comments on various activities, products, and services.
This makes it difficult and time-consuming for users to process all the
available information when making decisions. Text summarization, a Natural
Language Processing (NLP) task, has been widely explored to help users quickly
retrieve relevant information by generating short and salient content from long
or multiple documents. Recent advances in pre-trained language models, such as
ChatGPT, have demonstrated the potential of Large Language Models (LLMs) in
text generation. However, LLMs require massive amounts of data and resources
and are challenging to implement as offline applications. Furthermore, existing
text summarization approaches often lack the "adaptive" nature required to
capture diverse aspects in opinion summarization, which is particularly
detrimental to users with specific requirements or preferences. In this paper,
we propose an Aspect-adaptive Knowledge-based Opinion Summarization model for
product reviews, which effectively captures the adaptive nature required for
opinion summarization. The model generates aspect-oriented summaries given a
set of reviews for a particular product, efficiently providing users with
useful information on specific aspects they are interested in, ensuring the
generated summaries are more personalized and informative. Extensive
experiments have been conducted using real-world datasets to evaluate the
proposed model. The results demonstrate that our model outperforms
state-of-the-art approaches and is adaptive and efficient in generating
summaries that focus on particular aspects, enabling users to make
well-informed decisions and catering to their diverse interests and
preferences. Comment: 21 pages, 4 figures, 7 tables
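As a rough intuition for aspect-oriented summarization, the sketch below simply keeps review sentences that mention a user's aspect terms. This is a naive extractive stand-in written for illustration; the AaKOS model itself is knowledge-based and abstractive, and the sample reviews and aspect terms are invented.

```python
def aspect_filter(reviews, aspect_terms):
    """Keep sentences that mention any of the user's aspect terms.
    A toy extractive baseline, not the AaKOS model."""
    picked = []
    for review in reviews:
        for sent in review.split(". "):
            if any(term in sent.lower() for term in aspect_terms):
                picked.append(sent.strip())
    return picked

reviews = ["Battery life is great. The screen scratches easily",
           "Nice screen, but battery drains fast"]
print(aspect_filter(reviews, ["battery"]))
# only the battery-related content survives
```

An adaptive model goes further by generating fluent summaries conditioned on the requested aspect rather than copying sentences verbatim.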
Unsupervised Opinion Summarization with Noising and Denoising
The supervised training of high-capacity models on large datasets containing
hundreds of thousands of document-summary pairs is critical to the recent
success of deep learning techniques for abstractive summarization.
Unfortunately, in most domains (other than news) such training data is not
available and cannot be easily sourced. In this paper we enable the use of
supervised learning for the setting where there are only documents available
(e.g.,~product or business reviews) without ground truth summaries. We create a
synthetic dataset from a corpus of user reviews by sampling a review,
pretending it is a summary, and generating noisy versions thereof which we
treat as pseudo-review input. We introduce several linguistically motivated
noise generation functions and a summarization model which learns to denoise
the input and generate the original review. At test time, the model accepts
genuine reviews and generates a summary containing salient opinions, treating
those that do not reach consensus as noise. Extensive automatic and human
evaluation shows that our model brings substantial improvements over both
abstractive and extractive baselines. Comment: ACL 2020
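The synthetic-dataset construction described above (sample a review, treat it as the summary, and noise it into pseudo-review inputs) can be sketched as follows. The two noise functions here, token dropout and light local shuffling, are simplified stand-ins for the paper's linguistically motivated noising; function names and parameters are illustrative.

```python
import random

def noisy_copy(tokens, drop_prob=0.1, shuffle_dist=3):
    """Make one pseudo-review from a sampled review via token dropout
    and light local reordering (simplified noise functions)."""
    kept = [t for t in tokens if random.random() > drop_prob] or tokens[:1]
    # local shuffle: sort by position plus a small random jitter
    keys = [i + random.uniform(0, shuffle_dist) for i in range(len(kept))]
    return [t for _, t in sorted(zip(keys, kept))]

def make_synthetic_pair(corpus, n_noisy=8, seed=0):
    """Sample a review as the pseudo-summary target; its noisy copies
    become the multi-document pseudo-review input."""
    random.seed(seed)
    summary = random.choice(corpus)
    inputs = [noisy_copy(summary.split()) for _ in range(n_noisy)]
    return inputs, summary
```

Training a summarizer to map the noisy copies back to the original review teaches it to discard non-consensus content, which is exactly the behavior exploited at test time on genuine review sets.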
Deep Learning for Text Style Transfer: A Survey
Text style transfer is an important task in natural language generation,
which aims to control certain attributes in the generated text, such as
politeness, emotion, humor, and many others. It has a long history in the field
of natural language processing, and recently has re-gained significant
attention thanks to the promising performance brought by deep neural models. In
this paper, we present a systematic survey of the research on neural text style
transfer, spanning over 100 representative articles since the first neural text
style transfer work in 2017. We discuss the task formulation, existing datasets
and subtasks, evaluation, as well as the rich methodologies in the presence of
parallel and non-parallel data. We also provide discussions on a variety of
important topics regarding the future development of this task. Our curated
paper list is at https://github.com/zhijing-jin/Text_Style_Transfer_Survey. Comment: Computational Linguistics Journal, 2022