Adversarial Multi-task Learning for Text Classification
Neural network models have shown promise for multi-task learning, which
focuses on learning shared layers that extract common, task-invariant
features. However, in most existing approaches, the extracted shared features
are prone to contamination by task-specific features or by noise introduced
by other tasks. In this paper, we propose an adversarial multi-task learning
framework that prevents the shared and private latent feature spaces from
interfering with each other. We conduct extensive experiments on 16 different
text classification tasks, which demonstrate the benefits of our approach.
Moreover, we show that the shared knowledge learned by our model can be
treated as off-the-shelf knowledge and easily transferred to new tasks. The
datasets for all 16 tasks are publicly available
at \url{http://nlp.fudan.edu.cn/data/}
Comment: Accepted by ACL 2017
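To make the shared-private architecture concrete, here is a minimal PyTorch sketch (mine, not the authors' code; the module sizes and the use of a gradient-reversal layer as the adversarial mechanism are illustrative assumptions). A task discriminator tries to recover the task identity from the shared features, and gradient reversal trains the shared encoder to defeat it, pushing the shared space toward task invariance:

import torch
import torch.nn as nn

class GradReverse(torch.autograd.Function):
    # Identity in the forward pass; negated gradient in the backward pass.
    @staticmethod
    def forward(ctx, x):
        return x

    @staticmethod
    def backward(ctx, grad_output):
        return -grad_output

class SharedPrivateModel(nn.Module):
    def __init__(self, vocab_size, emb=128, hid=64, n_tasks=16, n_classes=2):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb)
        self.shared = nn.LSTM(emb, hid, batch_first=True)   # task-invariant space
        self.private = nn.ModuleList(
            [nn.LSTM(emb, hid, batch_first=True) for _ in range(n_tasks)])
        self.heads = nn.ModuleList(
            [nn.Linear(2 * hid, n_classes) for _ in range(n_tasks)])
        self.task_disc = nn.Linear(hid, n_tasks)             # the adversary

    def forward(self, tokens, task_id):
        e = self.embed(tokens)
        shared_seq, _ = self.shared(e)
        private_seq, _ = self.private[task_id](e)
        s, p = shared_seq[:, -1], private_seq[:, -1]         # last time steps
        class_logits = self.heads[task_id](torch.cat([s, p], dim=-1))
        # Adversarial branch: the discriminator sees shared features through
        # the reversal layer, so fooling it makes s task-invariant.
        task_logits = self.task_disc(GradReverse.apply(s))
        return class_logits, task_logits

Training would add the cross-entropy of task_logits against the true task id to the per-task classification loss; the reversal layer makes that term adversarial for the shared encoder, matching the abstract's goal of keeping the shared and private spaces from interfering.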
A Curriculum Learning Approach for Multi-domain Text Classification Using Keyword Weight Ranking
Text classification is a classic NLP task, but it has two prominent
shortcomings. On the one hand, text classification is deeply
domain-dependent: a classifier trained on the corpus of one domain may not
perform well in another. On the other hand, text classification models
require a lot of annotated data for training, and for some domains there may
not be enough annotated data. It is therefore valuable to investigate how to
efficiently use text data from different domains to improve model performance
across domains. Some multi-domain text classification models are trained
adversarially to extract features shared among all domains alongside the
specific features of each domain. We observed that domain-specific features
differ in how distinctive they are, so in this paper we propose a curriculum
learning strategy based on keyword weight ranking to improve the performance
of multi-domain text classification models. Experimental results on the
Amazon review and FDU-MTL datasets show that our curriculum learning strategy
effectively improves the performance of multi-domain text classification
models based on adversarial learning and outperforms state-of-the-art
methods.
Comment: Submitted to ICASSP 2023 (currently under review)
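The abstract does not spell out the keyword-weighting scheme, so the following Python sketch is only one plausible reading, with hypothetical names throughout: rank domains by the mean weight of their heaviest keywords (TF-IDF as a stand-in for the paper's keyword weights) and present them to the adversarial model in that order.

from sklearn.feature_extraction.text import TfidfVectorizer
import numpy as np

def curriculum_order(domain_texts, top_k=50):
    # domain_texts: dict mapping domain name -> list of raw documents.
    # Returns domain names sorted from most to least distinctive, using the
    # mean TF-IDF weight of each domain's top_k keywords as the ranking score.
    names = list(domain_texts)
    corpus = [" ".join(docs) for docs in domain_texts.values()]
    weights = TfidfVectorizer(max_features=20000).fit_transform(corpus)
    scores = [np.sort(row)[-top_k:].mean() for row in weights.toarray()]
    # Higher score = heavier domain-specific vocabulary.
    return [name for _, name in sorted(zip(scores, names), reverse=True)]

Whether the most distinctive domains should be scheduled first or last is a design choice the abstract leaves open; this sketch puts the most distinctive first.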
DEFTri: A Few-Shot Label Fused Contextual Representation Learning For Product Defect Triage in e-Commerce
Defect Triage is a time-sensitive and critical process in a large-scale agile
software development lifecycle for e-commerce. Inefficiencies arising from
human and process dependencies in this domain have motivated research in
automated approaches using machine learning to accurately assign defects to
qualified teams. This work proposes a novel framework for automated defect
triage (DEFTri) that uses a fine-tuned, state-of-the-art pre-trained BERT on
label-fused text embeddings to improve contextual representations of
human-generated product defect reports. For our multi-label text
classification defect triage task, we also introduce a proprietary Walmart
dataset of product defects built using weak supervision and adversarial
learning in a few-shot setting.
Comment: In Proceedings of the Fifth Workshop on e-Commerce and NLP (ECNLP 5), 2022, Pages 1-
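As a rough illustration of label fusion (not the DEFTri implementation; the label strings, separator choice, and model checkpoint are assumptions), one can prepend the candidate team labels to the defect text before encoding with a pre-trained BERT via Hugging Face transformers:

import torch
from transformers import AutoModel, AutoTokenizer

tok = AutoTokenizer.from_pretrained("bert-base-uncased")
bert = AutoModel.from_pretrained("bert-base-uncased")

TEAM_LABELS = ["checkout", "search", "payments"]   # hypothetical label set

def label_fused_embedding(defect_text):
    # Fuse the label vocabulary into the input so the [CLS] representation
    # is contextualized by the label space as well as the defect text.
    fused = " ".join(TEAM_LABELS) + " [SEP] " + defect_text
    batch = tok(fused, truncation=True, return_tensors="pt")
    with torch.no_grad():
        out = bert(**batch)
    return out.last_hidden_state[:, 0]             # [CLS] vector

vec = label_fused_embedding("Cart total not updating after coupon applied")

A multi-label classification head over this embedding would then score every team, matching the multi-label triage setting the abstract describes.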
Adversarial Connective-exploiting Networks for Implicit Discourse Relation Classification
Implicit discourse relation classification is highly challenging due to the
lack of connectives as strong linguistic cues, which motivates the use of
annotated implicit connectives to improve recognition. We propose a feature
imitation framework in which an implicit relation network is driven to learn
from another neural network with access to connectives, and thus encouraged to
extract similarly salient features for accurate classification. We develop an
adversarial model to enable an adaptive imitation scheme through competition
between the implicit network and a rival feature discriminator. Our method
effectively transfers discriminability of connectives to the implicit features,
and achieves state-of-the-art performance on the PDTB benchmark.
Comment: To appear in ACL 2017
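A compact sketch of that imitation scheme (architectures, dimensions, and the GAN-style loss split are my placeholders, not the paper's exact design): a teacher encoder sees the arguments plus the annotated connective, the implicit encoder sees the arguments alone, and a discriminator drives the implicit features to become indistinguishable from the teacher's.

import torch
import torch.nn as nn
import torch.nn.functional as F

enc_implicit = nn.GRU(300, 128, batch_first=True)  # arguments only
enc_teacher = nn.GRU(300, 128, batch_first=True)   # arguments + connective
disc = nn.Sequential(nn.Linear(128, 64), nn.ReLU(), nn.Linear(64, 1))

def adversarial_step(args_emb, args_with_conn_emb):
    # Final hidden states serve as the feature vectors to be matched.
    f_imp = enc_implicit(args_emb)[1][-1]
    f_tea = enc_teacher(args_with_conn_emb)[1][-1]
    ones = torch.ones(f_tea.size(0), 1)
    zeros = torch.zeros(f_imp.size(0), 1)
    # Discriminator: tell connective-informed features from implicit ones.
    d_loss = (F.binary_cross_entropy_with_logits(disc(f_tea), ones)
              + F.binary_cross_entropy_with_logits(disc(f_imp.detach()), zeros))
    # Implicit encoder: fool the discriminator, i.e., imitate the teacher.
    g_loss = F.binary_cross_entropy_with_logits(disc(f_imp), ones)
    return d_loss, g_loss

In training, d_loss would update the discriminator while g_loss, added to the usual relation classification loss, would update the implicit encoder.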
SMART: Robust and Efficient Fine-Tuning for Pre-trained Natural Language Models through Principled Regularized Optimization
Transfer learning has fundamentally changed the landscape of natural language
processing (NLP) research. Many existing state-of-the-art models are first
pre-trained on a large text corpus and then fine-tuned on downstream tasks.
However, due to limited data resources from downstream tasks and the extremely
large capacity of pre-trained models, aggressive fine-tuning often causes the
adapted model to overfit the data of downstream tasks and forget the knowledge
of the pre-trained model. To address the above issue in a more principled
manner, we propose a new computational framework for robust and efficient
fine-tuning for pre-trained language models. Specifically, our proposed
framework contains two key ingredients: (1) smoothness-inducing
regularization, which effectively controls the capacity of the model, and
(2) Bregman proximal point optimization, a class of trust-region methods that
prevents knowledge forgetting. Our experiments demonstrate that the proposed
method achieves state-of-the-art performance on multiple NLP benchmarks.
Comment: The 58th Annual Meeting of the Association for Computational Linguistics (ACL 2020)
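In LaTeX, the two ingredients as the abstract describes them (notation mine; the symmetric divergence \ell_s and the \ell_\infty perturbation ball are assumptions about the standard formulation):

% Smoothness-inducing adversarial regularizer: penalize how much the model's
% predictions change under small perturbations of each input x_i.
\mathcal{R}_s(\theta) = \frac{1}{n} \sum_{i=1}^{n}
    \max_{\lVert \tilde{x}_i - x_i \rVert_\infty \le \epsilon}
    \ell_s\bigl( f(\tilde{x}_i; \theta),\, f(x_i; \theta) \bigr)

% Bregman proximal point update: a trust-region-style step that keeps the
% new iterate close to the previous one, preventing knowledge forgetting.
\theta_{t+1} = \operatorname*{argmin}_{\theta}\;
    \mathcal{L}(\theta) + \lambda_s \mathcal{R}_s(\theta)
    + \mu \, \mathcal{D}_{\mathrm{Breg}}(\theta, \theta_t)

A large \mu keeps each update near the previous iterate (and, across iterations, near the pre-trained initialization), which is how the proximal term counteracts aggressive fine-tuning.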