Fact-based Text Editing
We propose a novel text editing task, referred to as \textit{fact-based text
editing}, in which the goal is to revise a given document to better describe
the facts in a knowledge base (e.g., several triples). The task is important in
practice because reflecting the truth is a common requirement in text editing.
First, we propose a method for automatically generating a dataset for research
on fact-based text editing, where each instance consists of a draft text, a
revised text, and several facts represented as triples. We apply the method
to two public table-to-text datasets, obtaining two new datasets consisting
of 233k and 37k instances, respectively. Next, we propose a new neural network
architecture for fact-based text editing, called \textsc{FactEditor}, which
edits a draft text by referring to given facts using a buffer, a stream, and a
memory. A straightforward approach to address the problem would be to employ an
encoder-decoder model. Our experimental results on the two datasets show that
\textsc{FactEditor} outperforms the encoder-decoder approach in terms of
fidelity and fluency. The results also show that \textsc{FactEditor} conducts
inference faster than the encoder-decoder approach.
Comment: ACL 202
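The buffer-and-stream mechanism described above can be caricatured as a tag-then-apply loop: the draft sits in a buffer, and each action either keeps a draft token, drops it, or generates a new token drawn from the facts. The Keep/Drop/Gen action names follow the abstract's description of \textsc{FactEditor}; the applier below is our own illustrative sketch, not the paper's implementation.

```python
# Toy illustration of an action-based text editor in the spirit of
# FactEditor's buffer/stream design. The draft sits in a buffer; actions
# move tokens to the output stream (Keep), discard them (Drop), or
# generate new tokens taken from the facts (Gen).
from collections import deque

def apply_actions(draft_tokens, actions):
    """Apply (action, token) pairs to a draft, returning the revised text."""
    buffer = deque(draft_tokens)  # unprocessed draft tokens
    stream = []                   # revised text being built
    for action, token in actions:
        if action == "KEEP":      # copy the next draft token to the output
            stream.append(buffer.popleft())
        elif action == "DROP":    # discard the next draft token
            buffer.popleft()
        elif action == "GEN":     # emit a token supplied by the facts
            stream.append(token)
    stream.extend(buffer)         # flush any remaining draft tokens
    return " ".join(stream)

draft = "Alice was born in Paris".split()
actions = [("KEEP", None), ("KEEP", None), ("KEEP", None),
           ("KEEP", None), ("DROP", None), ("GEN", "Lyon")]
print(apply_actions(draft, actions))  # → Alice was born in Lyon
```

Because such a model mostly copies, it can run faster than full sequence-to-sequence decoding, which is consistent with the speed result reported above.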
Edit-A-Video: Single Video Editing with Object-Aware Consistency
Despite the fact that text-to-video (TTV) models have recently achieved
remarkable success, few approaches have extended TTV to video editing.
Motivated by approaches that adapt TTV models from diffusion-based
text-to-image (TTI) models, we suggest a video editing framework given only a
pretrained TTI model and a single text-video pair,
which we term Edit-A-Video. The framework consists of two stages: (1) inflating
the 2D model into a 3D model by appending temporal modules and tuning on the
source video, and (2) inverting the source video into noise and editing it
with the target text prompt and attention map injection. Each stage enables temporal
modeling and preservation of semantic attributes of the source video. One of
the key challenges in video editing is background inconsistency, where
regions not targeted by the edit suffer from undesirable and inconsistent
temporal alterations. To mitigate this issue, we also introduce a novel mask
blending method, termed sparse-causal blending (SC Blending). We improve on
previous mask blending methods to reflect temporal consistency, so that the
edited area exhibits smooth transitions while the unedited regions retain
spatio-temporal consistency. We present extensive experimental results over
various types of text
and videos, and demonstrate the superiority of the proposed method compared to
baselines in terms of background consistency, text alignment, and video editing
quality.
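The mask blending idea can be sketched as compositing each edited frame over its source frame with a per-pixel mask, where "sparse-causal" suggests softening each frame's mask with masks from the first and previous frames. This is our loose reading of SC Blending for illustration, not the paper's exact formulation.

```python
# Toy sketch of temporally smoothed mask blending for video editing:
# edited pixels come from the edited frames, the background from the
# source frames. Each frame's mask is averaged with the first and
# previous frames' masks (a sparse-causal reference set) before blending.
import numpy as np

def sc_blend(src, edit, masks):
    """src, edit: (T, H, W) videos; masks: (T, H, W) values in [0, 1]."""
    out = np.empty_like(src, dtype=float)
    for t in range(len(src)):
        refs = [masks[t], masks[0], masks[max(t - 1, 0)]]
        m = np.clip(np.mean(refs, axis=0), 0.0, 1.0)  # temporally smoothed mask
        out[t] = m * edit[t] + (1.0 - m) * src[t]     # edit inside, source outside
    return out
```

With an all-zero mask the source video passes through untouched, which is exactly the background-preservation behavior the abstract argues for.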
Eye-tracking as a measure of cognitive effort for post-editing of machine translation
The three measurements for post-editing effort as proposed by Krings (2001) have been adopted by many researchers in subsequent studies and publications. These measurements comprise temporal effort (the speed or productivity rate of post-editing, often measured in words per second or per minute at the segment level), technical effort (the number of actual edits performed by the post-editor, sometimes approximated using the Translation Edit Rate metric (Snover et al. 2006), again usually at the segment level), and cognitive effort. Cognitive effort has been measured using Think-Aloud Protocols, pause measurement, and, increasingly, eye-tracking. This chapter provides a review of studies of post-editing effort using eye-tracking, noting the influence of publications by Danks et al. (1997), and O'Brien (2006, 2008), before describing a single study in detail.
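The Translation Edit Rate mentioned above normalizes the number of edits needed to turn a hypothesis into a reference by the reference length. Real TER (Snover et al. 2006) also counts block shifts as single edits; the sketch below is a simplified word-level approximation using plain edit distance, offered only to make the metric's shape concrete.

```python
# Simplified TER-style score: word-level edit distance between a
# hypothesis and a reference, divided by the reference length.
# (True TER additionally treats phrase shifts as single edits.)
def ter_approx(hyp, ref):
    h, r = hyp.split(), ref.split()
    # classic dynamic-programming Levenshtein distance over words
    d = [[0] * (len(r) + 1) for _ in range(len(h) + 1)]
    for i in range(len(h) + 1):
        d[i][0] = i
    for j in range(len(r) + 1):
        d[0][j] = j
    for i in range(1, len(h) + 1):
        for j in range(1, len(r) + 1):
            cost = 0 if h[i - 1] == r[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,         # deletion
                          d[i][j - 1] + 1,         # insertion
                          d[i - 1][j - 1] + cost)  # substitution
    return d[len(h)][len(r)] / len(r)

print(ter_approx("the cat sat", "the cat sat on the mat"))  # → 0.5
```

A score of 0 means the post-editor's output already matches the reference; higher scores indicate more technical effort per reference word.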
The detailed study examines whether predicted effort indicators affect post-editing effort; results were previously published as Moorkens et al. (2015). Most of the eye-tracking data analysed were unused in the previous…