A Contrastive Cross-Channel Data Augmentation Framework for Aspect-based Sentiment Analysis
Aspect-Based Sentiment Analysis (ABSA) is a fine-grained sentiment analysis
task that focuses on detecting the sentiment polarity towards a given aspect
in a sentence. However, it is sensitive to the multi-aspect challenge, where
the features of multiple aspects in a sentence affect each other. To mitigate
this issue, we design a novel training framework, called Contrastive
Cross-Channel Data Augmentation (C3DA). A source sentence is fed into a
domain-specific generator to obtain synthetic sentences, and is then
concatenated with these generated sentences for supervised training and our
proposed contrastive training. Specifically, considering the limited amount of
labeled ABSA data, we also introduce parameter-efficient approaches for
sentence generation. This generation method consists of an Aspect Augmentation
Channel (AAC), which generates aspect-specific sentences, and a Polarity
Augmentation Channel (PAC), which generates polarity-inverted sentences. In
extensive experiments, our C3DA framework outperforms baselines without any
augmentation by about 1% in accuracy and Macro-F1.
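The contrastive objective described above can be sketched as follows. This is an illustrative InfoNCE-style loss over sentence embeddings, where the augmented sentences act as positives for their source sentence; the similarity function, temperature value, and exact loss form are assumptions, not the paper's formulation:

```python
import numpy as np

def contrastive_loss(anchor, positives, negatives, temperature=0.1):
    """InfoNCE-style loss: pull augmented positives toward the anchor
    sentence embedding, push unrelated (negative) embeddings away."""
    def sim(a, b):  # cosine similarity
        return np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))
    pos = np.array([np.exp(sim(anchor, p) / temperature) for p in positives])
    neg = np.array([np.exp(sim(anchor, n) / temperature) for n in negatives])
    # average over positives of -log( pos / (pos + sum of negatives) )
    return float(np.mean(-np.log(pos / (pos + neg.sum()))))
```

When a positive is close to the anchor and the negatives are far, the loss is near zero; it grows as that ordering is violated.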
Self-Evolution Learning for Discriminative Language Model Pretraining
Masked language modeling, widely used in pretraining discriminative language
models (e.g., BERT), commonly adopts a random masking strategy. However,
random masking does not consider the importance of different words to the
sentence meaning, where some of them are more worth predicting than others.
Therefore, various masking strategies (e.g., entity-level masking) have been
proposed, but most of them require expensive prior knowledge and generally
train from scratch without reusing existing model weights. In this paper, we
present Self-Evolution learning (SE), a simple and effective token masking and
learning method to fully and wisely exploit the knowledge in the data. SE
focuses on learning the informative yet under-explored tokens and adaptively
regularizes the training by introducing a novel Token-specific Label Smoothing
approach. Experiments on 10 tasks show that our SE brings consistent and
significant improvements (+1.43~2.12 average scores) across different PLMs.
In-depth analyses demonstrate that SE improves linguistic knowledge learning
and generalization.
Comment: Accepted to Findings of ACL 2023
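The token-specific label smoothing idea can be sketched as follows. The sketch interpolates each token's one-hot gold label with the model's own predicted distribution, using the model's confidence on the gold token to set a per-token smoothing weight; the confidence-based weighting is an illustrative assumption, and the paper's exact formulation may differ:

```python
import numpy as np

def token_specific_soft_target(one_hot, model_probs):
    """Token-specific label smoothing sketch: instead of a fixed global
    smoothing factor, mix the gold one-hot label with the model's own
    predicted distribution. Tokens the model is unsure about receive
    more smoothing (a softer, more regularizing target)."""
    gold = int(np.argmax(one_hot))
    alpha = 1.0 - model_probs[gold]  # low gold confidence -> more smoothing
    return (1.0 - alpha) * one_hot + alpha * model_probs
```

The result is always a valid distribution, and the gold token keeps the largest share of probability mass.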
Revisiting Token Dropping Strategy in Efficient BERT Pretraining
Token dropping is a recently proposed strategy to speed up the pretraining of
masked language models, such as BERT, by skipping the computation of a subset
of the input tokens at several middle layers. It can effectively reduce
training time without much degradation of downstream-task performance.
However, we empirically find that token dropping is prone to a semantic-loss
problem and falls short on semantics-intensive tasks. Motivated by this, we
propose a simple yet effective semantic-consistent learning method (ScTD) to
improve token dropping. ScTD encourages the model to learn how to preserve
semantic information in the representation space. Extensive experiments on 12
tasks show that, with the help of our ScTD, token dropping achieves consistent
and significant performance gains across all task types and model sizes. More
encouragingly, ScTD saves up to 57% of pretraining time and brings up to
+1.56% average improvement over vanilla token dropping.
Comment: Accepted to ACL 2023 Main Conference
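The token-dropping mechanism itself can be sketched as follows. In this minimal version, all tokens pass through the first and last layers, but the middle layers only process a kept subset; hidden-state norm stands in for the importance score used in practice, and the layer function is a placeholder, so this is a shape-level illustration rather than a BERT implementation:

```python
import numpy as np

def token_dropping_forward(hidden, layer_fn, n_layers, keep_ratio=0.5):
    """Sketch of token dropping: the full sequence goes through the
    first and last layers, while middle layers process only the kept
    subset of tokens, cutting middle-layer compute roughly by the
    drop ratio. `hidden` is a (seq_len, dim) array."""
    seq_len = hidden.shape[0]
    n_keep = max(1, int(seq_len * keep_ratio))
    hidden = layer_fn(hidden)                      # first layer: all tokens
    # keep the "important" tokens (here: largest hidden-state norm)
    keep = np.argsort(-np.linalg.norm(hidden, axis=1))[:n_keep]
    sub = hidden[keep]
    for _ in range(n_layers - 2):                  # middle layers: subset only
        sub = layer_fn(sub)
    hidden = hidden.copy()
    hidden[keep] = sub                             # merge dropped tokens back
    return layer_fn(hidden)                        # last layer: all tokens
```

Dropped tokens thus skip the middle layers entirely, which is exactly where the semantic-loss problem described above can arise.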
Self-Evolution Learning for Mixup: Enhance Data Augmentation on Few-Shot Text Classification Tasks
Text classification tasks often encounter few-shot scenarios with limited
labeled data, making it crucial to address data scarcity. Data augmentation
with mixup has been shown to be effective on various text classification
tasks. However, most mixup methods do not consider the varying degree of
learning difficulty at different stages of training, and they generate new
samples with one-hot labels, resulting in model overconfidence. In this paper,
we propose a self-evolution learning (SE) based mixup approach for data
augmentation in text classification, which can generate more adaptive and
model-friendly pseudo samples for model training. SE focuses on the variation
of the model's learning ability. To alleviate model overconfidence, we
introduce a novel instance-specific label smoothing approach, which linearly
interpolates the model's output and the one-hot labels of the original samples
to generate new soft labels for mixing up. Through experimental analysis, we
demonstrate that SE not only improves classification accuracy but also
enhances the model's generalization ability.
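The two-step recipe above (instance-specific label smoothing, then mixup) can be sketched as follows. The smoothing weight `beta` is an assumed hyperparameter name, and the fixed interpolation is illustrative; the paper adapts it to the model's learning stage:

```python
import numpy as np

def se_mixup(x1, y1_onehot, p1, x2, y2_onehot, p2, lam=0.5, beta=0.5):
    """Mixup with instance-specific label smoothing: each sample's
    one-hot label is first softened by linear interpolation with the
    model's own output distribution (p1, p2), then inputs and soft
    labels are mixed with coefficient lam."""
    soft1 = (1 - beta) * y1_onehot + beta * p1   # smooth sample 1's label
    soft2 = (1 - beta) * y2_onehot + beta * p2   # smooth sample 2's label
    x_mix = lam * x1 + (1 - lam) * x2            # standard input mixup
    y_mix = lam * soft1 + (1 - lam) * soft2      # mix the soft labels
    return x_mix, y_mix
```

Because the targets are soft rather than one-hot, training on the mixed samples penalizes overconfident predictions.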
Towards Making the Most of ChatGPT for Machine Translation
ChatGPT shows remarkable capabilities for machine translation (MT). Several
prior studies have shown that it achieves results comparable to commercial
systems for high-resource languages, but lags behind in complex tasks, e.g.,
low-resource and distant-language-pair translation. However, these studies
usually adopt simple prompts that cannot fully elicit ChatGPT's capability. In
this report, we aim to further mine ChatGPT's translation ability by
revisiting several aspects: temperature, task information, and domain
information, and correspondingly propose two simple but effective prompts:
Task-Specific Prompts (TSP) and Domain-Specific Prompts (DSP). We show that:
1) the performance of ChatGPT depends largely on temperature, and a lower
temperature usually achieves better performance; 2) emphasizing the task
information further improves ChatGPT's performance, particularly in complex MT
tasks; 3) introducing domain information can elicit ChatGPT's generalization
ability and improve its performance in the specific domain; 4) ChatGPT tends
to generate hallucinations for non-English-centric MT tasks, which can be
partially addressed by our proposed prompts but still need to be highlighted
for the MT/NLP community. We also explore the effects of advanced in-context
learning strategies and find a (negative but interesting) observation: the
powerful chain-of-thought prompt leads to word-by-word translation behavior,
bringing significant translation degradation.
Comment: Work in progress, 9 pages
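The two prompt styles can be sketched as a simple template builder. The wording below is illustrative, not the paper's exact TSP/DSP templates: a task-specific prompt states the translation task explicitly, and a domain-specific prompt additionally names the text's domain:

```python
def build_mt_prompt(src_text, src_lang, tgt_lang, domain=None):
    """Build a translation prompt. Without `domain` this resembles a
    Task-Specific Prompt (the task is stated explicitly); with `domain`
    it resembles a Domain-Specific Prompt (the text's domain is named
    so the model can adapt its register and terminology)."""
    task = "You are a machine translation system."
    if domain:
        task += " The following text is from the {} domain.".format(domain)
    return ("{}\nTranslate the following {} text into {}:\n{}"
            .format(task, src_lang, tgt_lang, src_text))
```

Per the findings above, such prompts would be paired with a low sampling temperature for best results.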
Knowledge Graph Augmented Network Towards Multiview Representation Learning for Aspect-based Sentiment Analysis
Aspect-based sentiment analysis (ABSA) is a fine-grained sentiment analysis
task. To better comprehend long, complicated sentences and obtain accurate
aspect-specific information, linguistic and commonsense knowledge are
generally required. However, most methods employ complicated and inefficient
approaches to incorporate external knowledge, e.g., directly searching graph
nodes. Additionally, the complementarity between external knowledge and
linguistic information has not been thoroughly studied. To this end, we
propose a knowledge graph augmented network (KGAN), which aims to effectively
incorporate external knowledge together with explicit syntactic and contextual
information. In particular, KGAN captures sentiment feature representations
from multiple perspectives, i.e., context-, syntax-, and knowledge-based.
First, KGAN learns the contextual and syntactic representations in parallel to
fully extract the semantic features. Then, KGAN integrates the knowledge
graphs into the embedding space, based on which the aspect-specific knowledge
representations are further obtained via an attention mechanism. Last, we
propose a hierarchical fusion module to complement these multiview
representations in a local-to-global manner. Extensive experiments on three
popular ABSA benchmarks demonstrate the effectiveness and robustness of our
KGAN. Notably, with the help of pretrained RoBERTa, KGAN achieves new
state-of-the-art performance.
Comment: Under review
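The attention step described above (aspect-specific knowledge representations over knowledge-graph embeddings) can be sketched as follows. Plain dot-product attention stands in for the paper's exact mechanism, and the temperature parameter is an assumption:

```python
import numpy as np

def aspect_knowledge_attention(aspect_vec, kg_node_embs, temperature=1.0):
    """Score each knowledge-graph node embedding against the aspect
    representation and return the attention-weighted knowledge vector.
    `kg_node_embs` is a (num_nodes, dim) array of node embeddings."""
    scores = kg_node_embs @ aspect_vec / temperature
    scores -= scores.max()                      # for numerical stability
    weights = np.exp(scores) / np.exp(scores).sum()
    return weights @ kg_node_embs, weights
```

Nodes most relevant to the aspect dominate the weighted sum, giving an aspect-specific summary of the external knowledge.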
Surface modification induced by perovskite quantum dots for triple-cation perovskite solar cells
Organic-inorganic hybrid perovskite solar cells are regarded as the most promising new-generation photovoltaic technology, owing to their high power conversion efficiencies and low cost. However, surface imperfections in perovskite films impede improvement of device performance, since they can introduce undesired energy losses under sunlight illumination. Here, we show that incorporating zero-dimensional perovskite quantum dots into three-dimensional perovskite films can heal surface imperfections in the films. Introducing perovskite quantum dots also leads to a more uniform surface topography and potential, along with improved crystal quality of the triple-cation perovskite films, benefiting charge carrier kinetics between the perovskite films and the charge extraction layers. Ultimately, we achieve a power conversion efficiency exceeding 21% in triple-cation perovskite solar cells.