35 research outputs found
Attention-Based Capsule Networks with Dynamic Routing for Relation Extraction
A capsule is a group of neurons whose activity vector represents the
instantiation parameters of a specific type of entity. In this paper, we
explore capsule networks for relation extraction in a multi-instance
multi-label learning framework and propose a novel neural approach based on
capsule networks with attention mechanisms. We evaluate our method with
different benchmarks, and it is demonstrated that our method improves the
precision of the predicted relations. In particular, we show that capsule
networks improve relation extraction for multiple entity pairs.
Comment: To be published in EMNLP 201
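The dynamic routing at the core of capsule networks can be sketched in NumPy. This is a minimal rendering of the generic routing-by-agreement procedure (Sabour et al.), not the paper's attention-augmented variant; the shapes and iteration count are illustrative assumptions.

```python
import numpy as np

def squash(s, axis=-1, eps=1e-8):
    """Squash non-linearity: keeps vector direction, maps length into (0, 1)."""
    norm_sq = np.sum(s ** 2, axis=axis, keepdims=True)
    return (norm_sq / (1.0 + norm_sq)) * s / np.sqrt(norm_sq + eps)

def dynamic_routing(u_hat, n_iters=3):
    """Routing-by-agreement between lower and higher capsules.

    u_hat: (n_in, n_out, d) prediction vectors from lower-level capsules.
    Returns the (n_out, d) activity vectors of the output capsules.
    """
    n_in, n_out, _ = u_hat.shape
    b = np.zeros((n_in, n_out))                               # routing logits
    for _ in range(n_iters):
        c = np.exp(b) / np.exp(b).sum(axis=1, keepdims=True)  # coupling coeffs
        s = (c[..., None] * u_hat).sum(axis=0)                # weighted sum
        v = squash(s)                                         # output capsules
        b = b + (u_hat * v[None]).sum(axis=-1)                # agreement update
    return v
```

Because of the squash, each output capsule's length stays below 1 and can be read as the probability that the corresponding entity (here, relation) is present.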
Long-tail Relation Extraction via Knowledge Graph Embeddings and Graph Convolution Networks
We propose a distance supervised relation extraction approach for
long-tailed, imbalanced data, which are prevalent in real-world settings. Here,
the challenge is to learn accurate "few-shot" models for classes existing at
the tail of the class distribution, for which little data is available.
Inspired by the rich semantic correlations between classes at the long tail and
those at the head, we take advantage of the knowledge from data-rich classes at
the head of the distribution to boost the performance of the data-poor classes
at the tail. First, we propose to leverage implicit relational knowledge among
class labels from knowledge graph embeddings and learn explicit relational
knowledge using graph convolution networks. Second, we integrate this
relational knowledge into the relation extraction model via a coarse-to-fine
knowledge-aware attention mechanism. Our results on a large-scale benchmark
dataset show that our approach significantly outperforms other baselines,
especially for long-tail relations.
Comment: To be published in NAACL 201
Schema-adaptable Knowledge Graph Construction
Conventional Knowledge Graph Construction (KGC) approaches typically follow
the static information extraction paradigm with a closed set of pre-defined
schema. As a result, such approaches fall short when applied to dynamic
scenarios or domains where new types of knowledge emerge. This
necessitates a system that can handle evolving schema automatically to extract
information for KGC. To address this need, we propose a new task called
schema-adaptable KGC, which aims to continually extract entities, relations,
and events based on a dynamically changing schema graph without re-training. We
first split and convert existing datasets according to three principles
(horizontal, vertical, and hybrid schema expansion) to build a benchmark, then
investigate the schema-adaptable performance of
several well-known approaches such as Text2Event, TANL, UIE and GPT-3.5. We
further propose a simple yet effective baseline dubbed \textsc{AdaKGC}, which
contains schema-enriched prefix instructor and schema-conditioned dynamic
decoding to better handle evolving schema. Comprehensive experimental results
show that AdaKGC outperforms the baselines but still has room for
improvement. We hope this work can benefit the community.
Code and datasets are available at https://github.com/zjunlp/AdaKGC.
Comment: EMNLP 2023 (Findings)
Conformal Prediction for Deep Classifier via Label Ranking
Conformal prediction is a statistical framework that generates prediction
sets containing ground-truth labels with a desired coverage guarantee. The
predicted probabilities produced by machine learning models are generally
miscalibrated, leading to large prediction sets in conformal prediction. In
this paper, we empirically and theoretically show that disregarding the
probability values mitigates the undesirable effect of miscalibration. We then
propose a novel algorithm named SAPS, which discards all the probability values
except for the maximum softmax probability. The key idea behind SAPS is to
minimize the dependence of the non-conformity score on the probability values
while retaining the uncertainty information. In this manner, SAPS can produce
sets of small size and communicate instance-wise uncertainty. Theoretically, we
provide a finite-sample coverage guarantee for SAPS and show that its expected
set size is always smaller than that of APS. Extensive experiments validate
that SAPS not only reduces prediction set size but also broadly improves the
conditional coverage rate and adaptivity of prediction sets.
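The core idea, keeping only the maximum softmax probability and replacing the rest with rank information, can be sketched as a small conformal pipeline. This is a hedged reconstruction from the abstract: the exact score, the penalty weight `lam`, and the randomization term `u` are assumptions, not the paper's definitive formulation.

```python
import numpy as np

def saps_score(probs, label, lam=0.1, u=None):
    """Non-conformity score in the spirit of SAPS: use only the max softmax
    probability plus a rank-based penalty. `lam` is a hypothetical weight;
    `u` is the usual uniform randomization term."""
    if u is None:
        u = np.random.rand()
    order = np.argsort(-probs)                       # labels by probability
    rank = int(np.where(order == label)[0][0]) + 1   # 1-indexed rank of label
    p_max = probs.max()
    if rank == 1:
        return u * p_max
    return p_max + lam * (rank - 2 + u)

def conformal_threshold(cal_probs, cal_labels, alpha=0.1, lam=0.1):
    """Conformal quantile of calibration scores for ~(1 - alpha) coverage."""
    n = len(cal_labels)
    scores = sorted(saps_score(p, y, lam) for p, y in zip(cal_probs, cal_labels))
    k = int(np.ceil((n + 1) * (1 - alpha))) - 1      # conformal quantile index
    return scores[min(k, n - 1)]

def prediction_set(probs, q, lam=0.1):
    """All labels whose non-conformity score falls below the threshold."""
    return [y for y in range(len(probs)) if saps_score(probs, y, lam) <= q]
```

Because the score depends on the probabilities only through the maximum and the label's rank, miscalibrated tail probabilities cannot inflate the prediction sets.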
Towards Realistic Low-resource Relation Extraction: A Benchmark with Empirical Baseline Study
This paper presents an empirical study to build relation extraction systems
in low-resource settings. Based upon recent pre-trained language models, we
comprehensively investigate three schemes to evaluate the performance in
low-resource settings: (i) different types of prompt-based methods with
few-shot labeled data; (ii) diverse balancing methods to address the
long-tailed distribution issue; (iii) data augmentation technologies and
self-training to generate more labeled in-domain data. We create a benchmark
with 8 relation extraction (RE) datasets covering different languages, domains
and contexts and perform extensive comparisons over the proposed schemes with
combinations. Our experiments illustrate: (i) Though prompt-based tuning is
beneficial in low-resource RE, there is still much potential for improvement,
especially in extracting relations from cross-sentence contexts with multiple
relational triples; (ii) Balancing methods are not always helpful for RE with
long-tailed distributions; (iii) Data augmentation complements existing
baselines and can bring substantial performance gains, while self-training may
not consistently improve low-resource RE.
Code and datasets are available at https://github.com/zjunlp/LREBench.
Comment: Accepted to EMNLP 2022 (Findings); project website:
https://zjunlp.github.io/project/LREBench
Optimization-Free Test-Time Adaptation for Cross-Person Activity Recognition
Human Activity Recognition (HAR) models often suffer from performance
degradation in real-world applications due to distribution shifts in activity
patterns across individuals. Test-Time Adaptation (TTA) is an emerging learning
paradigm that aims to utilize the test stream to adjust predictions in
real-time inference, which has not been explored in HAR before. However, the
high computational cost of optimization-based TTA algorithms makes it
intractable to run on resource-constrained edge devices. In this paper, we
propose an Optimization-Free Test-Time Adaptation (OFTTA) framework for
sensor-based HAR. OFTTA adjusts the feature extractor and linear classifier
simultaneously in an optimization-free manner. For the feature extractor, we
propose Exponential Decay Test-time Normalization (EDTN) to replace the
conventional batch normalization (CBN) layers. EDTN combines CBN and Test-time
batch Normalization (TBN) to extract reliable features against domain shifts
with TBN's influence decreasing exponentially in deeper layers. For the
classifier, we adjust the prediction by computing the distance between the
feature and class prototypes, which are calculated from a maintained support
set. The support set is in turn updated with pseudo labels, which benefit from
the reliable features extracted by EDTN. Extensive experiments on
three public cross-person HAR datasets and two different TTA settings
demonstrate that OFTTA outperforms the state-of-the-art TTA approaches in both
classification performance and computational efficiency. Finally, we verify the
superiority of our proposed OFTTA on edge devices, indicating possible
deployment in real applications. Our code is available at
https://github.com/Claydon-Wang/OFTTA.
Comment: To be presented at UbiComp 2024; accepted by Proceedings of the ACM
on Interactive, Mobile, Wearable and Ubiquitous Technologies (IMWUT)
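The EDTN idea described above can be sketched as follows. The exponential decay base `decay` and the exact per-layer mixing are assumptions read off the abstract, not the paper's definitive schedule.

```python
import numpy as np

def edtn_normalize(x, running_mean, running_var, layer_idx, decay=0.94, eps=1e-5):
    """Exponential Decay Test-time Normalization (sketch).

    Mixes Test-time Batch Normalization statistics (computed on the current
    test batch) with the Conventional Batch Normalization running statistics,
    with the test-batch influence decaying exponentially in deeper layers.

    x: (batch, features) activations at layer `layer_idx` (0-based).
    """
    lam = decay ** layer_idx             # TBN weight shrinks with depth
    tbn_mean = x.mean(axis=0)            # test-batch statistics
    tbn_var = x.var(axis=0)
    mean = lam * tbn_mean + (1 - lam) * running_mean
    var = lam * tbn_var + (1 - lam) * running_var
    return (x - mean) / np.sqrt(var + eps)
```

Early layers thus adapt aggressively to the shifted test distribution, while deep layers, whose statistics are more class-discriminative, stay close to the training-time running statistics.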
Social Media Meets Big Urban Data: A Case Study of Urban Waterlogging Analysis
With the design and development of smart cities, both opportunities and
challenges arise. Large amounts of data are needed for this purpose;
however, conditions vary across cities due to differing infrastructures and
populations, which leads to data sparsity. In this paper, we propose a
transfer learning method for urban waterlogging disaster analysis, which
gives traffic management agencies a basis for generating proactive traffic
operation strategies to alleviate congestion. Existing work on urban
waterlogging mostly relies on past and current conditions captured by
sensors and cameras, yet a city may not have enough sensors to cover all
relevant areas. To this end, it would be helpful to transfer waterlogging
knowledge across cities. We examine whether the copious amounts of
information from social media and satellite data can improve urban
waterlogging analysis, and we analyze the correlation between waterlogging
severity, road networks, terrain, and precipitation. We then use a multiview
discriminant transfer learning method to transfer knowledge to small cities.
Experimental results involving cities in China and India show that our
proposed framework is effective.
EasyEdit: An Easy-to-use Knowledge Editing Framework for Large Language Models
Large Language Models (LLMs) usually suffer from knowledge cutoff or fallacy
issues: they are unaware of unseen events or generate text with incorrect
facts owing to outdated or noisy data. To this end, many knowledge
editing approaches for LLMs have emerged -- aiming to subtly inject/edit
updated knowledge or adjust undesired behavior while minimizing the impact on
unrelated inputs. Nevertheless, due to significant differences among various
knowledge editing methods and the variations in task setups, there is no
standard implementation framework available to the community, which hinders
practitioners from applying knowledge editing to real applications. To address these
issues, we propose EasyEdit, an easy-to-use knowledge editing framework for
LLMs. It supports various cutting-edge knowledge editing approaches and can be
readily applied to many well-known LLMs such as T5, GPT-J, and LLaMA.
Empirically, we report knowledge editing results on LLaMA-2 with EasyEdit,
demonstrating that knowledge editing surpasses traditional fine-tuning in terms
of reliability and generalization. We have released the source code on GitHub
at https://github.com/zjunlp/EasyEdit, along with Google Colab tutorials and
comprehensive documentation for beginners. In addition, we present an online
system for real-time knowledge editing and a demo video at
http://knowlm.zjukg.cn/easyedit.mp4.
Comment: The project website is https://github.com/zjunlp/EasyEdit
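The reliability and generalization criteria mentioned above are commonly measured as success rates over edited prompts, their paraphrases, and unrelated inputs. A minimal sketch, assuming a generic `model` callable and hypothetical data structures (this is not EasyEdit's actual API):

```python
def knowledge_editing_metrics(model, edits, paraphrases, unrelated):
    """Standard knowledge-editing metrics (sketch).

    model:       callable prompt -> answer, the post-edit model.
    edits:       list of (prompt, target) pairs that were edited.
    paraphrases: list of (paraphrased_prompt, target) for the same facts.
    unrelated:   list of (prompt, original_answer) that must stay unchanged.
    """
    reliability = sum(model(p) == t for p, t in edits) / len(edits)
    generalization = sum(model(p) == t for p, t in paraphrases) / len(paraphrases)
    locality = sum(model(p) == a for p, a in unrelated) / len(unrelated)
    return {"reliability": reliability,
            "generalization": generalization,
            "locality": locality}
```

Reliability checks the edited prompt itself, generalization checks paraphrases of it, and locality checks that unrelated behavior is preserved, which is the trade-off knowledge editing aims to balance.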