140 research outputs found
Active Relation Discovery: Towards General and Label-aware Open Relation Extraction
Open Relation Extraction (OpenRE) aims to discover novel relations from open
domains. Previous OpenRE methods mainly suffer from two problems: (1)
insufficient capacity to discriminate between known and novel relations: when
the conventional test setting is extended to a more general one in which test
data may also come from seen classes, existing approaches suffer a significant
performance decline; (2) secondary labeling must be performed before practical
application: existing methods cannot assign human-readable and meaningful types
to novel relations, which is urgently required by downstream tasks. To
address these issues, we propose the Active Relation Discovery (ARD) framework,
which utilizes relational outlier detection for discriminating known and novel
relations and involves active learning for labeling novel relations. Extensive
experiments on three real-world datasets show that ARD significantly
outperforms previous state-of-the-art methods on both conventional and our
proposed general OpenRE settings. The source code and datasets will be
available for reproducibility.
Comment: This work has been submitted to the IEEE for possible publication.
Copyright may be transferred without notice, after which this version may no
longer be accessible.
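The two components the ARD abstract describes, relational outlier detection for separating known from novel relations and an active-learning step for picking instances a human should name, can be illustrated with a minimal sketch. The thresholding rule and density heuristic below are hypothetical stand-ins, not the paper's actual algorithms; `embeddings` are assumed to be relation representations from some encoder.

```python
import numpy as np

def split_known_novel(embeddings, known_centroids, threshold=1.5):
    """Flag an instance as novel if its distance to the nearest known-relation
    centroid is an outlier (a simple mean + threshold * std cutoff; the
    paper's detector is more involved)."""
    dists = np.linalg.norm(
        embeddings[:, None, :] - known_centroids[None, :, :], axis=-1
    ).min(axis=1)
    cutoff = dists.mean() + threshold * dists.std()
    return dists > cutoff  # True -> candidate novel relation

def select_for_labeling(embeddings, novel_mask, budget=2):
    """Active-learning step: pick the most central novel instances so a human
    can assign a readable relation type (a density heuristic, illustrative
    only)."""
    novel = embeddings[novel_mask]
    center = novel.mean(axis=0)
    order = np.argsort(np.linalg.norm(novel - center, axis=1))
    return np.flatnonzero(novel_mask)[order[:budget]]
```

The budget parameter reflects the usual active-learning constraint: only a few novel instances can be sent to annotators, so the heuristic favors instances likely to represent a whole novel cluster.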
Bidirectional End-to-End Learning of Retriever-Reader Paradigm for Entity Linking
Entity Linking (EL) is a fundamental task for Information Extraction and
Knowledge Graphs. The general form of EL (i.e., end-to-end EL) aims to first
find mentions in the given input document and then link the mentions to
corresponding entities in a specific knowledge base. Recently, the paradigm of
retriever-reader promotes the progress of end-to-end EL, benefiting from the
advantages of dense entity retrieval and machine reading comprehension.
However, existing studies train the retriever and the reader only separately,
in a pipeline manner, ignoring the benefit that interaction between the
retriever and the reader can bring to the task. To help the retriever-reader
paradigm perform even better on end-to-end EL, we propose BEER, a
Bidirectional End-to-End training framework for Retriever and Reader. Through
our designed bidirectional end-to-end training, BEER guides the retriever and
the reader to learn from each other, make progress
together, and ultimately improve EL performance. Extensive experiments on
benchmarks of multiple domains demonstrate the effectiveness of our proposed
BEER.
Comment: This work has been submitted to the IEEE for possible publication.
Copyright may be transferred without notice, after which this version may no
longer be accessible.
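One plausible way two modules can "learn from each other" as the BEER abstract describes is mutual distillation: each module's distribution over candidate entities is pushed toward the other's. The symmetric-KL objective below is an illustrative formulation of that idea, not necessarily BEER's actual loss; the score vectors are assumed to come from a retriever and a reader scoring the same candidate set.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())  # shift for numerical stability
    return e / e.sum()

def kl(p, q):
    """KL divergence between two discrete distributions."""
    return float(np.sum(p * np.log(p / q)))

def mutual_distillation_loss(retriever_scores, reader_scores):
    """Symmetric KL between the two modules' candidate-entity distributions,
    so gradients flow both ways and each module learns from the other's
    ranking (an illustrative bidirectional objective)."""
    p = softmax(np.asarray(retriever_scores, dtype=float))
    q = softmax(np.asarray(reader_scores, dtype=float))
    return 0.5 * (kl(p, q) + kl(q, p))
```

The loss is zero exactly when the two modules agree on the candidate ranking, and grows as their distributions diverge, which is the sense in which training becomes bidirectional rather than pipelined.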
A Survey of Natural Language Generation
This paper offers a comprehensive review of the research on Natural Language
Generation (NLG) over the past two decades, especially in relation to
data-to-text generation and text-to-text generation deep learning methods, as
well as new applications of NLG technology. This survey aims to (a) give the
latest synthesis of deep learning research on the NLG core tasks, as well as
the architectures adopted in the field; (b) detail meticulously and
comprehensively various NLG tasks and datasets, and draw attention to the
challenges in NLG evaluation, focusing on different evaluation methods and
their relationships; (c) highlight future research directions and relatively
recent issues that arise from the increasing synergy between NLG and other
artificial intelligence areas, such as computer vision, text, and computational
creativity.
Comment: Accepted by ACM Computing Surveys (CSUR), 202
Towards All-around Knowledge Transferring: Learning From Task-irrelevant Labels
Deep neural models have achieved strong performance on numerous classification
tasks, but require sufficient manually annotated data. Since it is extremely
time-consuming and expensive to annotate adequate data for each classification
task, learning an empirically effective model that generalizes from small
datasets has received increasing attention. Existing efforts mainly focus on
transferring task-relevant knowledge from other similar data to tackle the
issue. These approaches have yielded remarkable improvements, yet neglect the
fact that task-irrelevant features can induce substantial negative transfer
effects. To date, no large-scale studies have been performed to investigate
the impact of task-irrelevant features, let alone the utilization of such
features.
In this paper, we first propose Task-Irrelevant Transfer Learning (TIRTL) to
exploit task-irrelevant features, which are mainly extracted from
task-irrelevant labels. In particular, we suppress the expression of
task-irrelevant information and facilitate the learning process of
classification. We also provide a theoretical explanation of our method. In
addition, TIRTL does not conflict with methods that exploit task-relevant
knowledge and can be combined with them, enabling the simultaneous utilization
of task-relevant and task-irrelevant features for the first time. To verify
the effectiveness of our theory and method, we conduct extensive experiments
on facial expression recognition and digit recognition tasks. Our source code
will also be made available for reproducibility.
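"Suppressing the expression of task-irrelevant information" can be illustrated with a gradient-reversal-style objective: the model is rewarded for predicting the task label and penalized when its features still predict the task-irrelevant label. The combined loss below is a hypothetical formulation in that spirit; TIRTL's exact objective may differ.

```python
import numpy as np

def cross_entropy(logits, label):
    """Standard cross-entropy from raw logits for a single example."""
    z = logits - logits.max()  # shift for numerical stability
    logp = z - np.log(np.exp(z).sum())
    return float(-logp[label])

def tirtl_style_loss(task_logits, task_label,
                     irrelevant_logits, irrelevant_label, lam=0.1):
    """Encourage correct task prediction while discouraging features that
    still encode the task-irrelevant label: the irrelevant-head loss enters
    with a negative sign, so confident irrelevant predictions raise the total
    loss (an illustrative adversarial-suppression term)."""
    return (cross_entropy(task_logits, task_label)
            - lam * cross_entropy(irrelevant_logits, irrelevant_label))
```

Under this formulation, a feature extractor minimizing the total loss is driven to make the irrelevant head's predictions uninformative, which is one concrete reading of "suppressing" task-irrelevant information.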
Accelerating Primal Solution Findings for Mixed Integer Programs Based on Solution Prediction
Mixed Integer Programming (MIP) is one of the most widely used modeling
techniques for combinatorial optimization problems. In many applications, a
similar MIP model is solved on a regular basis, maintaining remarkable
similarities in model structures and solution appearances but differing in
formulation coefficients. This offers the opportunity for machine learning
methods to explore the correlations between model structures and the resulting
solution values. To exploit these correlations, we propose to represent an MIP
instance as a tripartite graph, based on which a Graph Convolutional Network (GCN) is
constructed to predict solution values for binary variables. The predicted
solutions are used to generate a local branching type cut which can be either
treated as a global (invalid) inequality in the formulation resulting in a
heuristic approach to solve the MIP, or as a root branching rule resulting in
an exact approach. Computational evaluations on 8 distinct types of MIP
problems show that the proposed framework improves the primal solution finding
performance significantly on a state-of-the-art open-source MIP solver.
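The local branching cut mentioned in the abstract has a standard linear form: given a predicted binary point x̂, the constraint sum over {i : x̂_i = 0} of x_i plus sum over {i : x̂_i = 1} of (1 - x_i) <= k bounds the Hamming distance to the prediction by k. A small sketch of how such a cut is assembled (the function name and interface are my own, not from the paper):

```python
import numpy as np

def local_branching_cut(x_hat, k):
    """Return (coeffs, rhs) such that  coeffs @ x <= rhs  encodes
        sum_{i: x_hat_i = 0} x_i + sum_{i: x_hat_i = 1} (1 - x_i) <= k,
    i.e. the Hamming distance from x to the predicted point is at most k."""
    x_hat = np.asarray(x_hat)
    coeffs = np.where(x_hat == 0, 1.0, -1.0)  # +x_i vs. -x_i per prediction
    rhs = float(k - x_hat.sum())  # constant sum(x_hat) moved to the RHS
    return coeffs, rhs
```

Added as a hard constraint this yields the heuristic ("invalid inequality") variant, since the optimum might lie outside the neighborhood; branching on the cut versus its complement at the root instead keeps the approach exact.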
Automatic Context Pattern Generation for Entity Set Expansion
Entity Set Expansion (ESE) is a valuable task that aims to find entities of
the target semantic class described by given seed entities. Various NLP and IR
downstream applications have benefited from ESE due to its ability to discover
knowledge. Although existing bootstrapping methods have achieved great
progress, most of them still rely on manually pre-defined context patterns. A
non-negligible shortcoming of the pre-defined context patterns is that they
cannot be flexibly generalized to all kinds of semantic classes, a phenomenon
we call "semantic sensitivity". To address this problem, we devise a
context pattern generation module that utilizes autoregressive language models
(e.g., GPT-2) to automatically generate high-quality context patterns for
entities. In addition, we propose GAPA, a novel ESE framework that
leverages the aforementioned GenerAted PAtterns to expand target entities.
Extensive experiments and detailed analyses on three widely used datasets
demonstrate the effectiveness of our method. All the code of our experiments
will be available for reproducibility.
Comment: This work has been submitted to the IEEE for possible publication.
Copyright may be transferred without notice, after which this version may no
longer be accessible.
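The expansion step that consumes generated context patterns can be illustrated with a toy example: each pattern carries an entity slot, and entities that fill the same slots as the seeds in a corpus become expansion candidates. The pattern strings and the `<ENTITY>` placeholder below are hand-written stand-ins for model-generated patterns, and the regex-matching retrieval is a simplification of how GAPA would actually use them.

```python
import re

def expand_with_patterns(patterns, corpus, seeds):
    """Toy expansion step: turn each context pattern's <ENTITY> slot into a
    capture group, scan the corpus, and return new entities occurring in the
    same contexts as the seeds."""
    found = set()
    for pattern in patterns:
        # Escape the literal text, then restore the slot as a capture group.
        regex = re.escape(pattern).replace(
            re.escape("<ENTITY>"), r"(\w[\w ]*?)")
        for sentence in corpus:
            for match in re.findall(regex, sentence):
                found.add(match.strip())
    return sorted(found - set(seeds))
```

In the full framework the autoregressive language model supplies many such patterns per semantic class, which is what lets the approach adapt to classes where pre-defined patterns fail.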
Towards Real-World Writing Assistance: A Chinese Character Checking Benchmark with Faked and Misspelled Characters
Writing assistance is an application closely related to human life and is
also a fundamental Natural Language Processing (NLP) research field. Its aim is
to improve the correctness and quality of input texts, with character checking
being crucial in detecting and correcting wrong characters. In the real world,
where handwriting accounts for the vast majority of written text, the
characters that humans get wrong include faked characters (i.e., untrue
characters created through writing errors) and misspelled characters (i.e.,
true characters used incorrectly due to spelling errors). However, existing
datasets and related studies focus only on misspelled characters, mainly
caused by phonological or visual confusion, and thereby ignore faked
characters, which are more common and more difficult. To address this gap, we
present
Visual-C, a human-annotated Visual Chinese Character Checking dataset with
faked and misspelled Chinese characters. To the best of our knowledge,
Visual-C is the first real-world visual and the largest human-crafted
dataset for the Chinese character checking scenario. Additionally, we propose
and evaluate novel baseline methods on Visual-C. Extensive empirical results
and analyses show that Visual-C is high-quality yet challenging. The Visual-C
dataset and the baseline methods will be publicly available to facilitate
further research in the community.
Comment: Work in progress.
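Character checking benchmarks like this one are typically scored on how well systems detect the positions of wrong characters before correcting them. The position-set F1 below is a common formulation of that detection metric, not necessarily the exact protocol used for this benchmark.

```python
def detection_f1(gold_positions, pred_positions):
    """F1 over the character positions flagged as wrong: precision is the
    fraction of predicted positions that are truly wrong, recall the fraction
    of truly wrong positions that were found."""
    gold, pred = set(gold_positions), set(pred_positions)
    tp = len(gold & pred)
    precision = tp / len(pred) if pred else 0.0
    recall = tp / len(gold) if gold else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)
```

For faked characters the detection step is the harder half of the task, since the wrong character is not a valid character that a confusion set could enumerate.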
SNARE Zippering Is Suppressed by a Conformational Constraint that Is Removed by v-SNARE Splitting
Intracellular vesicle fusion is catalyzed by soluble N-ethylmaleimide-sensitive factor attachment protein receptors (SNAREs). Vesicle-anchored v-SNAREs pair with target membrane-associated t-SNAREs to form trans-SNARE complexes, releasing free energy to drive membrane fusion. However, trans-SNARE complexes are unable to assemble efficiently unless activated by Sec1/Munc18 (SM) proteins. Here, we demonstrate that SNAREs become fully active when the v-SNARE is split into two fragments, eliminating the requirement of SM protein activation. Mechanistically, v-SNARE splitting accelerates the zippering of trans-SNARE complexes, mimicking the stimulatory function of SM proteins. Thus, SNAREs possess the full potential to drive efficient membrane fusion but are suppressed by a conformational constraint. This constraint is removed by SM protein activation or v-SNARE splitting. We suggest that ancestral SNAREs originally evolved to be fully active in the absence of SM proteins. Later, a conformational constraint coevolved with SM proteins to achieve the vesicle fusion specificity demanded by complex endomembrane systems.
SeqGPT: An Out-of-the-box Large Language Model for Open Domain Sequence Understanding
Large language models (LLMs) have shown impressive ability for open-domain
NLP tasks. However, LLMs are sometimes too unconstrained for natural language
understanding (NLU) tasks, which always have restricted input and output
formats. Their performance on NLU tasks is highly sensitive to prompts or
demonstrations, and they are shown to perform poorly at several representative
NLU tasks, such as event extraction and entity typing. To this end, we present
SeqGPT, a
bilingual (i.e., English and Chinese) open-source autoregressive model
specially enhanced for open-domain natural language understanding. We express
all NLU tasks via two atomic tasks, which define fixed instructions that
restrict the input and output format while remaining ``open'' to arbitrarily
varied label sets. The model is first instruction-tuned with extremely
fine-grained labeled data synthesized by ChatGPT and then further fine-tuned on
233 different atomic tasks from 152 datasets across various domains.
experimental results show that SeqGPT has decent classification and extraction
ability, and is capable of performing language understanding tasks on unseen
domains. We also conduct empirical studies on the scaling of data and model
size as well as on the transfer across tasks. Our model is accessible at
https://github.com/Alibaba-NLP/SeqGPT.
Comment: Initial version of SeqGPT.
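The idea of expressing every NLU task through two atomic tasks, classification and extraction, with fixed instructions but an open label set, can be sketched as a prompt-rendering function. The template strings and bracket markers below are illustrative, not SeqGPT's real prompt format.

```python
def to_atomic_prompt(task, text, labels):
    """Render one NLU example in a fixed instruction format: the task header
    and field layout are constant, while the label list is free-form, which
    is what keeps the format 'open' to arbitrary label sets."""
    if task not in ("classification", "extraction"):
        raise ValueError("atomic task must be classification or extraction")
    header = ("Classify the input" if task == "classification"
              else "Extract spans from the input")
    return (f"[Task] {header}\n"
            f"[Labels] {', '.join(labels)}\n"
            f"[Input] {text}\n"
            f"[Output]")
```

Any concrete task (entity typing, event extraction, sentiment, and so on) reduces to one of the two atomic forms by choosing the task type and supplying its own label inventory.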