Defining Textual Entailment
Textual entailment is a relationship that obtains between fragments of text when one fragment in some sense implies the other. The automation of textual entailment recognition supports a wide variety of text-based tasks, including information retrieval, information extraction, question answering, text summarization, and machine translation. Much ingenuity has been devoted to developing algorithms for identifying textual entailments, but relatively little to saying what textual entailment actually is. This article is a review of the logical and philosophical issues involved in providing an adequate definition of textual entailment. We show that many natural definitions of textual entailment are refuted by counterexamples, including the most widely cited definition of Dagan et al. We then articulate and defend the following revised definition: T textually entails H = df typically, a human reading T would be justified in inferring the proposition expressed by H from the proposition expressed by T. We also show that textual entailment is context-sensitive, nontransitive, and nonmonotonic.
A Context-theoretic Framework for Compositionality in Distributional Semantics
Techniques in which words are represented as vectors have proved useful in
many applications in computational linguistics, however there is currently no
general semantic formalism for representing meaning in terms of vectors. We
present a framework for natural language semantics in which words, phrases and
sentences are all represented as vectors, based on a theoretical analysis which
assumes that meaning is determined by context.
In the theoretical analysis, we define a corpus model as a mathematical
abstraction of a text corpus. The meaning of a string of words is assumed to be
a vector representing the contexts in which it occurs in the corpus model.
Based on this assumption, we can show that the vector representations of words
can be considered as elements of an algebra over a field. We note that in
applications of vector spaces to representing meanings of words there is an
underlying lattice structure; we interpret the partial ordering of the lattice
as describing entailment between meanings. We also define the context-theoretic
probability of a string, and, based on this and the lattice structure, a degree
of entailment between strings.
We relate the framework to existing methods of composing vector-based
representations of meaning, and show that our approach generalises many of
these, including vector addition, component-wise multiplication, and the tensor
product.

Comment: Submitted to Computational Linguistics on 20th January 2010 for review
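The composition operations the abstract says the framework generalises can be sketched in a few lines. This is an illustrative toy, not the paper's formalism; the entailment-degree definition below is our own lattice-flavoured stand-in (meet weighted against the first vector), labelled as such.

```python
# Sketch of the three vector-composition operations the abstract mentions:
# addition, component-wise (Hadamard) multiplication, and the tensor (outer)
# product. Pure Python; vectors are lists of floats over a shared context basis.

def add(u, v):
    """Vector addition: context counts of the parts simply sum."""
    return [a + b for a, b in zip(u, v)]

def hadamard(u, v):
    """Component-wise multiplication: only shared contexts survive."""
    return [a * b for a, b in zip(u, v)]

def tensor(u, v):
    """Tensor (outer) product: records which word contributed which context."""
    return [[a * b for b in v] for a in u]

def entailment_degree(u, v):
    """Illustrative degree of entailment (our choice, not the paper's exact
    definition): the lattice meet (component-wise min) of u and v, weighted
    against u itself, so u fully entails v iff u's contexts lie under v's."""
    meet = [min(a, b) for a, b in zip(u, v)]
    return sum(meet) / sum(u) if sum(u) else 0.0
```

The meet-based degree reflects the abstract's idea that the partial order of the underlying lattice can be read as entailment between meanings.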
I Spy a Metaphor: Large Language Models and Diffusion Models Co-Create Visual Metaphors
Visual metaphors are powerful rhetorical devices used to persuade or
communicate creative ideas through images. Similar to linguistic metaphors,
they convey meaning implicitly through symbolism and juxtaposition of the
symbols. We propose a new task of generating visual metaphors from linguistic
metaphors. This is a challenging task for diffusion-based text-to-image models,
such as DALL·E 2, since it requires the ability to model implicit meaning
and compositionality. We propose to solve the task through the collaboration
between Large Language Models (LLMs) and Diffusion Models: Instruct GPT-3
(davinci-002) with Chain-of-Thought prompting generates text that represents a
visual elaboration of the linguistic metaphor containing the implicit meaning
and relevant objects, which is then used as input to the diffusion-based
text-to-image models. Using a human-AI collaboration framework, where humans
interact both with the LLM and the top-performing diffusion model, we create a
high-quality dataset containing 6,476 visual metaphors for 1,540 linguistic
metaphors and their associated visual elaborations. Evaluation by professional
illustrators shows the promise of LLM-Diffusion Model collaboration for this
task. To evaluate the utility of our Human-AI collaboration framework and the
quality of our dataset, we perform both an intrinsic human-based evaluation and
an extrinsic evaluation using visual entailment as a downstream task.

Comment: ACL 2023 (Findings)
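The two-stage pipeline the abstract describes — an LLM first elaborates the linguistic metaphor into an explicit visual description, which then becomes the prompt for a diffusion model — can be sketched as below. The model calls are stand-in callables, not real API signatures, and the prompt wording is our own illustration of the chain-of-thought step.

```python
# Minimal sketch of the LLM -> diffusion-model collaboration: the LLM turns
# the implicit meaning of a metaphor into concrete visual content, and that
# elaboration (not the raw metaphor) drives image generation.

def visual_metaphor_pipeline(linguistic_metaphor, llm, diffusion_model):
    # Stage 1: elaborate the metaphor's implicit meaning into a scene
    # description listing the relevant objects (chain-of-thought style).
    elaboration_prompt = (
        "Explain the implicit meaning of this metaphor, then describe a "
        "concrete scene that depicts it, listing the objects involved:\n"
        + linguistic_metaphor
    )
    visual_elaboration = llm(elaboration_prompt)
    # Stage 2: the visual elaboration is the text-to-image prompt.
    return diffusion_model(visual_elaboration)
```

Passing any text-in/text-out callable for `llm` and any text-in/image-out callable for `diffusion_model` makes the stages swappable, which mirrors the paper's comparison of several diffusion models on the same elaborations.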
A Continuously Growing Dataset of Sentential Paraphrases
A major challenge in paraphrase research is the lack of parallel corpora. In
this paper, we present a new method to collect large-scale sentential
paraphrases from Twitter by linking tweets through shared URLs. The main
advantage of our method is its simplicity: it removes the classifier or
human in the loop that previous work needed to select data before annotation
and before applying paraphrase identification algorithms. We
present the largest human-labeled paraphrase corpus to date of 51,524 sentence
pairs and the first cross-domain benchmarking for automatic paraphrase
identification. In addition, we show that more than 30,000 new sentential
paraphrases can be easily and continuously captured every month at ~70%
precision, and demonstrate their utility for downstream NLP tasks through
phrasal paraphrase extraction. We make our code and data freely available.

Comment: 11 pages, accepted to EMNLP 2017
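The collection idea — tweets that share a URL are grouped, and pairs within a group become candidate paraphrases for annotation — can be sketched as follows. The tuple schema is illustrative, not the authors' data format.

```python
# Sketch of URL-linked paraphrase candidate collection: group tweet texts by
# the URL they share, then emit all within-group sentence pairs as candidates.
from collections import defaultdict
from itertools import combinations

def candidate_pairs(tweets):
    """tweets: iterable of (text, url) tuples; returns candidate pairs."""
    by_url = defaultdict(list)
    for text, url in tweets:
        by_url[url].append(text)
    pairs = []
    for texts in by_url.values():
        # Every unordered pair of tweets sharing a URL is a candidate.
        pairs.extend(combinations(texts, 2))
    return pairs
```

Because grouping needs no trained classifier and no human pre-selection, the same loop can run continuously over the tweet stream, which is what lets the dataset keep growing month over month.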
Metaphors of a conflicted self in the journals of Sylvia Plath
This paper presents some of the results of a study that aims to investigate how mental states can be conveyed linguistically in texts of a personal nature. Figurative language, in particular metaphor and metonymy, is generally understood to play an important role in the expression of such complex phenomena (Lakoff and Johnson, 1999; Kövecses, 2000; Meier and Robinson, 2005). The study therefore looks at the metaphors used to convey mental states in the Smith Journal of ‘The Unabridged Journals of Sylvia Plath’. Mental state here refers to various aspects of cognitive functioning, but the focus, in particular, is on mental states of affect, i.e., those mental states that are intrinsically valenced (Ortony and Turner, 1990). Sylvia Plath’s journal provides particularly rich data due to the writer’s linguistic creativity and documented mental health issues, the experience of which she continually explores. Specifically, then, this paper focuses on metaphors of motion (or lack thereof) and so-called split-self metaphors.
Both intensive manual analysis and automated corpus methodologies are employed in the investigation: the Wmatrix corpus tool (Rayson, 2009) is used to identify semantic fields that are potential source and target domains in order to obtain a comprehensive picture of metaphor use. In-depth analysis is then conducted manually on a sample of journal entries. The MIP procedure (Pragglejaz, 2007) is used for metaphor identification, and interpretations draw on research in other fields, especially psychology, on representations of affect. Metaphors of mental state are analyzed in terms of their implications for conveying a sense of intensity, valency and creativity.
The Effects of a Curriculum Sequence on the Emergence of Reading Comprehension Involving Derived Relations in First Grade Students
I conducted 2 experiments to analyze the effects of a reading curriculum, Corrective Reading, which has a sequence that trains derived relations, on first grade students' emission of (a) derived relations, defined as combinatorial entailment in Relational Frame Theory, and (b) metaphors. In Experiment 1, I compared the curriculum, which has the sequence to train derived relations, to a well-known reading curriculum, RAZ Kids. RAZ Kids served as the content control. I used an experimental group design with a simultaneous treatment and a crossover feature. I selected 14 participants, who were matched and then randomly assigned into 2 groups of 7. Both groups received matched instructional trials in either the Corrective Reading or the RAZ Kids condition, and each group was post-tested. Upon completion of the Post-intervention 1 probes, each group was placed in the alternative condition, where Group 1 received the content control intervention and Group 2 received instruction from the curriculum that has the sequence to train derived relations. Both groups increased in number of correct responses following the Corrective Reading intervention. Two kinds of analyses were conducted: small-group and individual. In Experiment 2, I replicated Experiment 1 using a delayed multiple probe design across 2 first-grade dyads without a content control curriculum. I tested the effects of 5 lessons of the curriculum that has the sequence to train derived relations on the same dependent measures, with the addition of implicit/explicit reading comprehension probes. The results showed that the curriculum sequence found within Corrective Reading was effective in increasing the number of correct derived relation responses, while also improving reading comprehension responses.