Attention-Based Capsule Networks with Dynamic Routing for Relation Extraction
A capsule is a group of neurons whose activity vector represents the
instantiation parameters of a specific type of entity. In this paper, we
explore capsule networks for relation extraction in a multi-instance
multi-label learning framework and propose a novel neural approach based on
capsule networks with attention mechanisms. We evaluate our method on
different benchmarks and demonstrate that it improves the precision of the
predicted relations. In particular, we show that capsule networks improve
relation extraction for multiple entity pairs.
Comment: To be published in EMNLP 2018.
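For readers unfamiliar with the routing step the paper builds on, the sketch below implements vanilla dynamic routing-by-agreement (Sabour et al., 2017) in plain NumPy; the paper's attention mechanism and relation-extraction architecture are not reproduced, and all shapes are illustrative.

```python
# Vanilla dynamic routing-by-agreement in NumPy (illustrative shapes only).
import numpy as np

def squash(s, axis=-1, eps=1e-8):
    """Squashing non-linearity: preserves direction, bounds length in [0, 1)."""
    sq_norm = np.sum(s ** 2, axis=axis, keepdims=True)
    return (sq_norm / (1.0 + sq_norm)) * s / np.sqrt(sq_norm + eps)

def dynamic_routing(u_hat, num_iters=3):
    """u_hat: prediction vectors, shape (num_in_capsules, num_out_capsules, dim)."""
    num_in, num_out, _ = u_hat.shape
    b = np.zeros((num_in, num_out))                          # routing logits
    for _ in range(num_iters):
        c = np.exp(b) / np.exp(b).sum(axis=1, keepdims=True)  # coupling coefficients
        s = np.einsum('io,iod->od', c, u_hat)                # weighted sum over inputs
        v = squash(s)                                        # output capsule activities
        b += np.einsum('iod,od->io', u_hat, v)               # update by agreement
    return v

print(dynamic_routing(np.random.randn(8, 4, 16)).shape)  # (4, 16)
```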
Long-tail Relation Extraction via Knowledge Graph Embeddings and Graph Convolution Networks
We propose a distantly supervised relation extraction approach for
long-tailed, imbalanced data, which are prevalent in real-world settings. Here,
the challenge is to learn accurate "few-shot" models for classes existing at
the tail of the class distribution, for which little data is available.
Inspired by the rich semantic correlations between classes at the long tail and
those at the head, we take advantage of the knowledge from data-rich classes at
the head of the distribution to boost the performance of the data-poor classes
at the tail. First, we propose to leverage implicit relational knowledge among
class labels from knowledge graph embeddings and learn explicit relational
knowledge using graph convolution networks. Second, we integrate that
relational knowledge into the relation extraction model via a coarse-to-fine
knowledge-aware attention mechanism. Results on a large-scale benchmark
dataset show that our approach significantly outperforms other baselines,
especially for long-tail relations.
Comment: To be published in NAACL 2019.
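As a rough illustration of the attention idea (a single-level simplification, not the paper's exact coarse-to-fine mechanism), the sketch below uses a relation's KG embedding as a query over a bag of sentence representations; all shapes and names are hypothetical.

```python
# Toy knowledge-aware attention: a class-label embedding (e.g., from pre-trained
# KG embeddings, optionally refined by a GCN over the label hierarchy) acts as
# a query over the sentence bag of one entity pair.
import torch
import torch.nn.functional as F

def knowledge_aware_attention(sent_reprs, rel_embedding):
    """sent_reprs: (num_sents, dim) bag of sentence vectors for one entity pair.
    rel_embedding: (dim,) embedding of the candidate relation from the KG."""
    scores = sent_reprs @ rel_embedding   # (num_sents,) relevance to the relation
    alpha = F.softmax(scores, dim=0)      # attention weights over the bag
    return alpha @ sent_reprs             # (dim,) relation-aware bag representation

bag = torch.randn(5, 64)   # five distantly supervised sentences
rel = torch.randn(64)      # hypothetical KG embedding of the relation
print(knowledge_aware_attention(bag, rel).shape)  # torch.Size([64])
```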
Generative Knowledge Graph Construction: A Review
Generative Knowledge Graph Construction (KGC) refers to methods that
leverage the sequence-to-sequence framework to build knowledge graphs, an
approach that is flexible and adaptable to a wide range of tasks. In this study, we
summarize the recent compelling progress in generative knowledge graph
construction. We present the advantages and weaknesses of each paradigm in
terms of different generation targets and provide theoretical insight and
empirical analysis. Based on the review, we suggest promising research
directions for the future. Our contributions are threefold: (1) We present a
detailed, complete taxonomy for the generative KGC methods; (2) We provide a
theoretical and empirical analysis of the generative KGC methods; (3) We
propose several research directions that can be developed in the future.
Comment: Accepted to EMNLP 2022 (oral); a public repository is available at
https://github.com/zjunlp/Generative_KG_Construction_Paper.
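As a minimal illustration of the generative paradigm the survey covers, the sketch below frames KGC as sequence-to-sequence generation of linearized triples; the prompt, the target format, and the untuned t5-small checkpoint are assumptions for illustration only.

```python
# Illustrative only: a vanilla (not fine-tuned) T5 maps text to a linearized
# triple format. Surveyed systems differ in generation targets and training.
from transformers import T5ForConditionalGeneration, T5Tokenizer

tokenizer = T5Tokenizer.from_pretrained("t5-small")
model = T5ForConditionalGeneration.from_pretrained("t5-small")

text = "Barack Obama was born in Honolulu."
# A model fine-tuned for generative KGC would be trained to emit, e.g.:
#   "(Barack Obama | place_of_birth | Honolulu)"
inputs = tokenizer("extract triples: " + text, return_tensors="pt")
output_ids = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```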
Domain-Agnostic Molecular Generation with Self-feedback
The generation of molecules with desired properties has gained tremendous
popularity, revolutionizing the way scientists design molecular structures and
providing valuable support for chemical and drug design. However, despite the
potential of language models in molecule generation, they face numerous
challenges such as the generation of syntactically or chemically flawed
molecules, narrow domain focus, and limitations in creating diverse and
directionally feasible molecules due to a dearth of annotated data or external
molecular databases. To this end, we introduce MolGen, a pre-trained molecular
language model tailored specifically for molecule generation. MolGen acquires
intrinsic structural and grammatical insights by reconstructing over 100
million molecular SELFIES, while facilitating knowledge transfer between
different domains through domain-agnostic molecular prefix tuning. Moreover, we
present a self-feedback paradigm that inspires the pre-trained model to align
with the ultimate goal of producing molecules with desirable properties.
Extensive experiments on well-known benchmarks confirm MolGen's optimization
capabilities, encompassing penalized logP, QED, and molecular docking
properties. Further analysis shows that MolGen can accurately capture molecule
distributions, implicitly learn their structural characteristics, and
efficiently explore chemical space. The pre-trained model, codes, and datasets
are publicly available for future research at https://github.com/zjunlp/MolGen.
Comment: Work in progress. Added results of binding affinity.
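As a brief aside on why SELFIES-based pre-training sidesteps invalid outputs, the snippet below round-trips a molecule through the `selfies` package: every SELFIES string decodes to a syntactically valid molecule, so a model reconstructing SELFIES cannot emit structurally broken strings. MolGen's actual tokenization, prefix tuning, and self-feedback loop are not shown.

```python
# Round-trip a molecule through SELFIES using the `selfies` package.
import selfies as sf

smiles = "CC(=O)OC1=CC=CC=C1C(=O)O"   # aspirin, as SMILES
encoded = sf.encoder(smiles)           # robust string representation
decoded = sf.decoder(encoded)          # decoding always yields a valid molecule
print(encoded)
print(decoded)
```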
Towards Realistic Low-resource Relation Extraction: A Benchmark with Empirical Baseline Study
This paper presents an empirical study to build relation extraction systems
in low-resource settings. Based upon recent pre-trained language models, we
comprehensively investigate three schemes to evaluate the performance in
low-resource settings: (i) different types of prompt-based methods with
few-shot labeled data; (ii) diverse balancing methods to address the
long-tailed distribution issue; (iii) data augmentation technologies and
self-training to generate more labeled in-domain data. We create a benchmark
of 8 relation extraction (RE) datasets covering different languages, domains,
and contexts, and perform extensive comparisons of the proposed schemes and
their combinations. Our experiments illustrate: (i) though prompt-based tuning is
beneficial in low-resource RE, there is still much potential for improvement,
especially in extracting relations from cross-sentence contexts with multiple
relational triples; (ii) balancing methods are not always helpful for RE with
a long-tailed distribution; (iii) data augmentation complements existing
baselines and can bring large performance gains, while self-training may not
consistently improve low-resource RE. Code and datasets are available at
https://github.com/zjunlp/LREBench.
Comment: Accepted to EMNLP 2022 (Findings); the project website is
https://zjunlp.github.io/project/LREBench.
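A minimal sketch of scheme (i), prompt-based RE with a masked language model, is given below; the template, verbalizer, and model choice are illustrative assumptions rather than the benchmark's actual prompts.

```python
# Illustrative cloze-style prompt for relation extraction with a masked LM.
from transformers import pipeline

fill = pipeline("fill-mask", model="bert-base-uncased")
sentence = "Steve Jobs founded Apple in 1976."
prompt = sentence + " The relation between Steve Jobs and Apple is [MASK]."
for cand in fill(prompt, top_k=3):
    print(cand["token_str"], round(cand["score"], 3))
# A verbalizer then maps predicted tokens (e.g., "founder") to RE labels.
```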
Revisiting k-NN for Pre-trained Language Models
Pre-trained Language Models (PLMs), as parametric eager learners, have
become the de-facto choice for current paradigms of Natural Language Processing
(NLP). In contrast, k-Nearest-Neighbor (k-NN) classifiers, as lazy learners,
tend to mitigate over-fitting and isolated noise. In this paper, we
revisit k-NN classifiers for augmenting the PLMs-based classifiers. From the
methodological level, we propose to adopt k-NN with textual representations of
PLMs in two steps: (1) Utilize k-NN as prior knowledge to calibrate the
training process. (2) Linearly interpolate the probability distribution
predicted by k-NN with that of the PLMs' classifier. At the heart of our
approach is the implementation of k-NN-calibrated training, which treats
predicted results as indicators for easy versus hard examples during the
training process. From the perspective of the diversity of application
scenarios, we conduct extensive experiments on both the fine-tuning and
prompt-tuning paradigms, under zero-shot, few-shot, and fully-supervised
settings, across eight diverse end-tasks. We hope our exploration will encourage the
community to revisit the power of classical methods for efficient NLP. Code
and datasets are available at https://github.com/zjunlp/Revisit-KNN.
Comment: Work in progress.
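The interpolation in step (2) can be made concrete with a short sketch: blend a distance-weighted k-NN label distribution, computed over a datastore of stored PLM features, with the classifier's output distribution. The temperature and mixing weight below are assumed values, not the paper's.

```python
# Interpolating a k-NN label distribution with the PLM classifier's output.
import torch
import torch.nn.functional as F

def knn_probs(query, keys, key_labels, num_classes, k=8, temp=10.0):
    """Distance-weighted k-NN distribution from a datastore of PLM features."""
    dists = torch.cdist(query.unsqueeze(0), keys).squeeze(0)  # (num_keys,)
    knn_d, knn_i = dists.topk(k, largest=False)               # nearest neighbors
    weights = F.softmax(-knn_d / temp, dim=0)                 # closer = heavier
    probs = torch.zeros(num_classes)
    probs.scatter_add_(0, key_labels[knn_i], weights)         # sum weight per label
    return probs

def interpolate(plm_probs, knn_p, lam=0.3):
    """Linear interpolation of the two distributions; lam is an assumed value."""
    return lam * knn_p + (1.0 - lam) * plm_probs

keys = torch.randn(100, 32)               # stored [CLS] features
key_labels = torch.randint(0, 4, (100,))  # their gold labels
query = torch.randn(32)
plm_probs = F.softmax(torch.randn(4), dim=0)
print(interpolate(plm_probs, knn_probs(query, keys, key_labels, 4)))
```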
Schema-adaptable Knowledge Graph Construction
Conventional Knowledge Graph Construction (KGC) approaches typically follow
the static information extraction paradigm with a closed set of pre-defined
schema. As a result, such approaches fall short when applied to dynamic
scenarios or domains where new types of knowledge emerge. This
necessitates a system that can handle evolving schema automatically to extract
information for KGC. To address this need, we propose a new task called
schema-adaptable KGC, which aims to continually extract entities, relations,
and events based on a dynamically changing schema graph without re-training. We
first split and convert existing datasets based on three principles to build a
benchmark, i.e., horizontal schema expansion, vertical schema expansion, and
hybrid schema expansion; then investigate the schema-adaptable performance of
several well-known approaches such as Text2Event, TANL, UIE, and GPT-3.5. We
further propose a simple yet effective baseline dubbed AdaKGC, which
contains a schema-enriched prefix instructor and schema-conditioned dynamic
decoding to better handle evolving schema. Comprehensive experimental results
illustrate that AdaKGC outperforms the baselines but still has room for
improvement. We hope the proposed work can deliver benefits to the community.
Code and datasets are available at https://github.com/zjunlp/AdaKGC.
Comment: EMNLP 2023 (Findings).
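As an illustration of what schema-conditioned decoding might look like (a simplification, not AdaKGC's implementation), the sketch below masks next-token logits so the decoder can only emit labels from the currently active schema.

```python
# Simplified schema-conditioned decoding: mask next-token logits so only the
# currently active schema's label tokens can be generated.
import torch

def schema_constrained_step(logits, allowed_token_ids):
    """logits: (vocab_size,) next-token scores; keep only schema-label tokens."""
    mask = torch.full_like(logits, float("-inf"))
    mask[allowed_token_ids] = 0.0
    return logits + mask

logits = torch.randn(1000)                    # hypothetical vocabulary of 1000
schema_tokens = torch.tensor([17, 42, 256])   # ids of the current schema labels
next_id = int(schema_constrained_step(logits, schema_tokens).argmax())
print(next_id in schema_tokens.tolist())      # True: decoding stays in-schema
```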
Relation Adversarial Network for Low Resource Knowledge Graph Completion
Knowledge Graph Completion (KGC) has been proposed to improve Knowledge
Graphs by filling in missing connections via link prediction or relation
extraction. One of the main difficulties for KGC is the low-resource problem.
Previous approaches assume sufficient training triples to learn versatile
vectors for entities and relations, or a satisfactory number of labeled
sentences to train a competent relation extraction model. However, low resource
relations are very common in KGs, and those newly added relations often do not
have many known samples for training. In this work, we aim to predict new
facts under a challenging setting where only limited training instances are
available. We propose a general framework called Weighted Relation Adversarial
Network, which utilizes an adversarial procedure to help adapt
knowledge/features learned from high resource relations to different but
related low resource relations. Specifically, the framework takes advantage of
a relation discriminator to distinguish between samples from different
relations, and helps learn relation-invariant features that transfer from
source relations to target relations. Experimental results show that the
proposed approach outperforms previous methods in low-resource settings
for both link prediction and relation extraction.
Comment: WWW 2020.
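A compact sketch of the adversarial core, using a gradient reversal layer so the feature extractor learns to fool a relation discriminator, is shown below; the relation-weighting scheme of the full framework is omitted and all dimensions are illustrative.

```python
# Gradient reversal + relation discriminator: reversed gradients push the
# encoder toward relation-invariant, transferable features.
import torch
import torch.nn as nn

class GradReverse(torch.autograd.Function):
    @staticmethod
    def forward(ctx, x, lam):
        ctx.lam = lam
        return x.view_as(x)

    @staticmethod
    def backward(ctx, grad_out):
        return -ctx.lam * grad_out, None  # flip gradients toward the extractor

features = torch.randn(16, 64, requires_grad=True)  # encoder outputs
discriminator = nn.Linear(64, 2)                     # source vs. target relation
rel_domain = torch.randint(0, 2, (16,))              # which relation each came from
logits = discriminator(GradReverse.apply(features, 1.0))
loss = nn.functional.cross_entropy(logits, rel_domain)
loss.backward()                                      # extractor receives reversed grads
print(features.grad.shape)
```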