DORE: Document Ordered Relation Extraction based on Generative Framework
In recent years, there has been a surge of generation-based information extraction
work, which allows a more direct use of pre-trained language models and
efficiently captures output dependencies. However, previous generative methods
using lexical representation do not naturally fit document-level relation
extraction (DocRE) where there are multiple entities and relational facts. In
this paper, we investigate the root cause of the underwhelming performance of
the existing generative DocRE models and discover that the culprit is the
inadequacy of the training paradigm, instead of the capacities of the models.
We propose to generate a symbolic and ordered sequence from the relation matrix
which is deterministic and easier for the model to learn. Moreover, we design a
parallel row generation method to process overlong target sequences. Besides,
we introduce several negative sampling strategies to improve the performance
with balanced signals. Experimental results on four datasets show that our
proposed method can improve the performance of the generative DocRE models. We
have released our code at https://github.com/ayyyq/DORE.
Comment: Findings of EMNLP 202
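The idea of turning a relation matrix into a symbolic, ordered target sequence can be illustrated with a minimal sketch. This is not the authors' implementation: the function name, the `<e0>`-style head markers, the `NA` placeholder, and the example relation labels are all hypothetical and chosen only for illustration. The sketch walks the matrix in a fixed row-major order, so the same set of facts always yields the same target, and it emits one row per head entity so that rows could be generated independently.

```python
from typing import Dict, List, Tuple


def linearize_relation_matrix(
    num_entities: int,
    facts: Dict[Tuple[int, int], str],
    no_relation: str = "NA",  # hypothetical placeholder for an empty cell
) -> List[List[str]]:
    """Return one symbolic token row per head entity, in fixed row-major order."""
    rows = []
    for head in range(num_entities):
        row = [f"<e{head}>"]  # hypothetical marker naming the head entity
        for tail in range(num_entities):
            if tail == head:
                continue  # skip the diagonal: no self-relations in this sketch
            row.append(facts.get((head, tail), no_relation))
        rows.append(row)
    return rows


# Toy document with three entities and two relational facts (made-up labels).
facts = {(0, 1): "founded_by", (2, 0): "located_in"}
rows = linearize_relation_matrix(3, facts)
for row in rows:
    print(" ".join(row))
```

Because each row depends only on its head entity, a long target can be split into per-row sequences instead of one overlong sequence, which is the intuition behind generating rows in parallel.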
Alignment for Honesty
Recent research has made significant strides in applying alignment techniques
to enhance the helpfulness and harmlessness of large language models (LLMs) in
accordance with human intentions. In this paper, we argue for the importance of
alignment for honesty, ensuring that LLMs proactively refuse to answer
questions when they lack knowledge, while still not being overly conservative.
However, a pivotal aspect of alignment for honesty involves discerning the
limits of an LLM's knowledge, which is far from straightforward. This challenge
demands comprehensive solutions in terms of metric development, benchmark
creation, and training methodologies. In this paper, we address these
challenges by first establishing a precise problem definition and defining
``honesty'' inspired by the Analects of Confucius. This serves as a cornerstone
for developing metrics that effectively measure an LLM's honesty by quantifying
its progress post-alignment. Furthermore, we introduce a flexible training
framework which is further instantiated by several efficient fine-tuning
techniques that emphasize honesty without sacrificing performance on other
tasks. Our extensive experiments reveal that these aligned models show a marked
increase in honesty, as indicated by our proposed metrics. We open-source a
wealth of resources to facilitate future research at
https://github.com/GAIR-NLP/alignment-for-honesty, including honesty-aligned
models, training and evaluation datasets for honesty alignment, concept
glossary, as well as all relevant source code.
An AMR-based Link Prediction Approach for Document-level Event Argument Extraction
Recent works have introduced Abstract Meaning Representation (AMR) for
Document-level Event Argument Extraction (Doc-level EAE), since AMR provides a
useful interpretation of complex semantic structures and helps to capture
long-distance dependency. However, in these works AMR is used only implicitly,
for instance, as additional features or training signals. Motivated by the fact
that all event structures can be inferred from AMR, this work reformulates EAE
as a link prediction problem on AMR graphs. Since AMR is a generic structure
and does not perfectly suit EAE, we propose a novel graph structure, Tailored
AMR Graph (TAG), which compresses less informative subgraphs and edge types,
integrates span information, and highlights surrounding events in the same
document. With TAG, we further propose a novel method using graph neural
networks as a link prediction model to find event arguments. Our extensive
experiments on WikiEvents and RAMS show that this simpler approach outperforms
the state-of-the-art models by 3.63pt and 2.33pt F1, respectively, and does so
with inference time reduced by 56%. The code is available at
https://github.com/ayyyq/TARA.
Comment: Accepted to ACL 202
LongLLMLingua: Accelerating and Enhancing LLMs in Long Context Scenarios via Prompt Compression
In long context scenarios, large language models (LLMs) face three main
challenges: higher computational/financial cost, longer latency, and inferior
performance. Some studies reveal that the performance of LLMs depends on both
the density and the position of the key information (question relevant) in the
input prompt. Inspired by these findings, we propose LongLLMLingua for prompt
compression towards improving LLMs' perception of the key information to
simultaneously address the three challenges. We conduct evaluation on a wide
range of long context scenarios including single-/multi-document QA, few-shot
learning, summarization, synthetic tasks, and code completion. The experimental
results show that prompts compressed by LongLLMLingua can achieve higher performance
with much less cost. The latency of the end-to-end system is also reduced. For
example, on the NaturalQuestions benchmark, LongLLMLingua gains a performance boost
of up to 17.1% over the original prompt with ~4x fewer tokens as input to
GPT-3.5-Turbo. It can derive cost savings of $28.5 and $27.4 per 1,000
samples from the LongBench and ZeroScrolls benchmarks, respectively.
Additionally, when compressing prompts of ~10k tokens at a compression rate of
2x-10x, LongLLMLingua can speed up the end-to-end latency by 1.4x-3.8x. Our
code is available at https://aka.ms/LLMLingua
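To make the compression rates quoted above concrete, the following is a small arithmetic sketch, not part of LongLLMLingua itself. The helper name is hypothetical, and it assumes that a compression rate of Nx means the compressed prompt keeps roughly 1/N of the original tokens.

```python
def compressed_token_budget(prompt_tokens: int, compression_rate: float) -> int:
    """Approximate token count left after compressing at the given rate.

    Assumes 'Nx compression' means keeping about 1/N of the original tokens.
    """
    if compression_rate < 1:
        raise ValueError("compression_rate must be at least 1")
    return round(prompt_tokens / compression_rate)


# A ~10k-token prompt at a few rates within the 2x-10x range from the abstract.
budgets = {rate: compressed_token_budget(10_000, rate) for rate in (2, 4, 10)}
print(budgets)
```

Under this assumption, a 10,000-token prompt shrinks to about 5,000 tokens at 2x and about 1,000 tokens at 10x.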
Evaluation of the effectiveness of EFL online teaching during the COVID-19 pandemic
Online teaching was conducted on a massive scale worldwide during the novel coronavirus period, and how to evaluate online teaching has been increasingly researched. This study looked at how English as a foreign language (EFL) teaching was delivered online by university teachers during the COVID-19 pandemic. We investigated university teachers' and students' perceptions of effective EFL online teaching and learning based on several evaluation modes in using technology in education. Data were collected using questionnaires and interviews from teachers and students in a variety of provinces in Mainland China. The results showed that various methods were used to deliver online EFL courses and that these approaches were found to correlate with each other. Teachers and students provided positive comments on online teaching and were satisfied with their online teaching and learning. Participants also noted effective approaches to online EFL teaching. The findings indicated that when teachers have more training, more skills, and more confidence, they can deliver more effective online teaching and learning.
BigDataBench: a Big Data Benchmark Suite from Internet Services
As architecture, systems, and data management communities pay greater
attention to innovative big data systems and architectures, the pressure of
benchmarking and evaluating these systems rises. Considering the broad use of
big data systems, big data benchmarks must include diversity of data and
workloads. Most of the state-of-the-art big data benchmarking efforts target
evaluating specific types of applications or system software stacks, and hence
they are not qualified for serving the purposes mentioned above. This paper
presents our joint research efforts on this issue with several industrial
partners. Our big data benchmark suite BigDataBench not only covers broad
application scenarios, but also includes diverse and representative data sets.
BigDataBench is publicly available from http://prof.ict.ac.cn/BigDataBench .
Also, we comprehensively characterize 19 big data workloads included in
BigDataBench with varying data inputs. On a typical state-of-practice
processor, Intel Xeon E5645, we have the following observations: First, in
comparison with the traditional benchmarks, including PARSEC, HPCC, and
SPECCPU, big data applications have very low operation intensity; Second, the
volume of data input has non-negligible impact on micro-architecture
characteristics, which may impose challenges for simulation-based big data
architecture research; Last but not least, corroborating the observations in
CloudSuite and DCBench (which use smaller data inputs), we find that the
numbers of L1 instruction cache misses per 1000 instructions of the big data
applications are higher than in the traditional benchmarks; also, we find that
L3 caches are effective for the big data applications, corroborating the
observation in DCBench.
Comment: 12 pages, 6 figures, The 20th IEEE International Symposium On High
Performance Computer Architecture (HPCA-2014), February 15-19, 2014, Orlando,
Florida, US
ImageBrush: Learning Visual In-Context Instructions for Exemplar-Based Image Manipulation
While language-guided image manipulation has made remarkable progress, the
challenge of how to instruct the manipulation process faithfully reflecting
human intentions persists. An accurate and comprehensive description of a
manipulation task using natural language is laborious and sometimes even
impossible, primarily due to the inherent uncertainty and ambiguity present in
linguistic expressions. Is it feasible to accomplish image manipulation without
resorting to external cross-modal language information? If this possibility
exists, the inherent modality gap would be effortlessly eliminated. In this
paper, we propose a novel manipulation methodology, dubbed ImageBrush, that
learns visual instructions for more accurate image editing. Our key idea is to
employ a pair of transformation images as visual instructions, which not only
precisely captures human intention but also facilitates accessibility in
real-world scenarios. Capturing visual instructions is particularly challenging
because it involves extracting the underlying intentions solely from visual
demonstrations and then applying this operation to a new image. To address this
challenge, we formulate visual instruction learning as a diffusion-based
inpainting problem, where the contextual information is fully exploited through
an iterative process of generation. A visual prompting encoder is carefully
devised to enhance the model's capacity in uncovering human intent behind the
visual instructions. Extensive experiments show that our method generates
engaging manipulation results conforming to the transformations entailed in
demonstrations. Moreover, our model exhibits robust generalization capabilities
on various downstream tasks such as pose transfer, image translation and video
inpainting.