
    Detecting and Preventing Hallucinations in Large Vision Language Models

    Instruction-tuned Large Vision Language Models (LVLMs) have made significant advancements in generalizing across a diverse set of multimodal tasks, especially for Visual Question Answering (VQA). However, generating detailed responses that are visually grounded is still a challenging task for these models. We find that even the current state-of-the-art LVLM (InstructBLIP) still produces a staggering 30 percent of hallucinatory text in the form of non-existent objects, unfaithful descriptions, and inaccurate relationships. To address this, we introduce M-HalDetect, a Multimodal Hallucination Detection dataset that can be used to train and benchmark models for hallucination detection and prevention. M-HalDetect consists of 16k fine-grained labels on VQA examples, making it the first comprehensive multimodal hallucination detection dataset for detailed image descriptions. Unlike previous work that considers only object hallucination, we additionally annotate entity descriptions and relationships that are unfaithful. To demonstrate the potential of this dataset for preference alignment, we propose fine-grained Direct Preference Optimization (DPO), and we also train fine-grained multimodal reward models and evaluate their effectiveness with best-of-n rejection sampling. Human evaluation of both DPO and rejection sampling shows that they reduce hallucination rates by 41% and 55% respectively, a significant improvement over the baseline.
    Comment: preprint
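
    The best-of-n rejection sampling evaluated in this abstract is simple to sketch. Below is a minimal, hypothetical Python version: `generate` and `reward` are stand-ins for the paper's LVLM sampler and fine-grained reward model, neither of which is specified here.

```python
# Minimal sketch of best-of-n rejection sampling against a reward model.
# `generate` and `reward` are hypothetical stand-ins, not the paper's models.
from typing import Callable, List

def best_of_n(
    generate: Callable[[str], str],       # samples one candidate response
    reward: Callable[[str, str], float],  # scores a (prompt, response) pair
    prompt: str,
    n: int = 8,
) -> str:
    """Sample n responses and keep the one the reward model scores highest."""
    candidates: List[str] = [generate(prompt) for _ in range(n)]
    return max(candidates, key=lambda resp: reward(prompt, resp))

# Call shape only; real use would plug in an LVLM sampler and a reward model
# trained on hallucination labels (higher reward = fewer hallucinated spans).
best = best_of_n(lambda p: p + " ...", lambda p, r: float(len(r)), "Describe the image.", n=4)
```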

    Textual Economy through Close Coupling of Syntax and Semantics

    We focus on the production of efficient descriptions of objects, actions and events. We define a type of efficiency, textual economy, that exploits the hearer's recognition of inferential links to material elsewhere within a sentence. Textual economy leads to efficient descriptions because the material that supports such inferences has been included to satisfy independent communicative goals, and is therefore overloaded in Pollack's sense. We argue that achieving textual economy imposes strong requirements on the representation and reasoning used in generating sentences. The representation must support the generator's simultaneous consideration of syntax and semantics. Reasoning must enable the generator to assess quickly and reliably at any stage how the hearer will interpret the current sentence, with its (incomplete) syntax and semantics. We show that these representational and reasoning requirements are met in the SPUD system for sentence planning and realization.
    Comment: 10 pages, uses QobiTree.tex
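
    The requirement that the generator assess, mid-derivation, how the hearer will interpret a partial description can be illustrated with a toy referent-resolution check. The sketch below is an invented simplification (SPUD itself plans over lexicalized TAG derivations): properties are added to a description only until the hearer's candidate set shrinks to the intended referent.

```python
# Toy illustration of the interpretation check such a generator needs: after
# each extension of a description, ask which referents the hearer could still
# mean. All structures are invented; SPUD itself plans over LTAG derivations.
WORLD = {"book1": {"book", "red"}, "book2": {"book", "blue"}, "cup1": {"cup", "red"}}

def hearer_referents(properties: set) -> set:
    """Entities the hearer could resolve a description with these properties to."""
    return {e for e, props in WORLD.items() if properties <= props}

def plan_description(target: str) -> set:
    """Greedily add properties until the description uniquely picks out target."""
    description: set = set()
    for prop in sorted(WORLD[target]):
        if hearer_referents(description) == {target}:
            break  # already unambiguous; stop early (textual economy)
        description.add(prop)
    return description

print(plan_description("book1"))  # {'book', 'red'}: 'red' alone would leave cup1
```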

    Metamodel-based model conformance and multiview consistency checking

    Model-driven development, using languages such as UML and BON, often makes use of multiple diagrams (e.g., class and sequence diagrams) when modeling systems. These diagrams, presenting different views of a system of interest, may be inconsistent. A metamodel provides a unifying framework in which to ensure and check consistency, while at the same time providing the means to distinguish between valid and invalid models, that is, conformance. Two formal specifications of the metamodel for an object-oriented modeling language are presented, and it is shown how to use these specifications for model conformance and multiview consistency checking. Comparisons are made in terms of completeness and the level of automation each provides for checking multiview consistency and model conformance. The lessons learned from applying formal techniques to the problems of metamodeling, model conformance, and multiview consistency checking are summarized.
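
    One concrete multiview consistency rule makes the idea tangible: a message in a sequence diagram should invoke an operation that the receiving object's class actually declares in the class diagram. The Python sketch below is illustrative only; the paper works with formal metamodel specifications, not ad hoc checks like this.

```python
# Illustrative only: one multiview consistency rule, checking that every
# sequence-diagram message names an operation declared on the receiver's
# class. The paper uses formal metamodel specifications, not ad hoc code.
from dataclasses import dataclass
from typing import Dict, List, Set

@dataclass
class ClassModel:
    operations: Dict[str, Set[str]]  # class name -> declared operation names

@dataclass
class Message:
    receiver_class: str
    operation: str

def check_messages_conform(cm: ClassModel, messages: List[Message]) -> List[str]:
    """Report each message whose operation is missing from the receiver's class."""
    errors = []
    for m in messages:
        if m.operation not in cm.operations.get(m.receiver_class, set()):
            errors.append(f"{m.receiver_class} declares no operation '{m.operation}'")
    return errors

cm = ClassModel(operations={"Account": {"deposit", "withdraw"}})
print(check_messages_conform(cm, [Message("Account", "close")]))
# ["Account declares no operation 'close'"]
```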

    Syn-QG: Syntactic and Shallow Semantic Rules for Question Generation

    Question Generation (QG) is fundamentally a simple syntactic transformation; however, many aspects of semantics influence what questions are good to form. We implement this observation by developing Syn-QG, a set of transparent syntactic rules leveraging universal dependencies, shallow semantic parsing, lexical resources, and custom rules, which transform declarative sentences into question-answer pairs. We utilize PropBank argument descriptions and VerbNet state predicates to incorporate shallow semantic content, which helps generate questions of a descriptive nature and produce inferential and semantically richer questions than existing systems. To improve syntactic fluency and eliminate grammatically incorrect questions, we employ back-translation over the output of these syntactic rules. Crowd-sourced evaluations show that our system can generate a larger number of highly grammatical and relevant questions than previous QG systems, and that back-translation drastically improves grammaticality at a slight cost of generating irrelevant questions.
    Comment: Some of the results in the paper were incorrect
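
    A flavor of the rule-based transformation can be given with subject-auxiliary inversion, one of the standard syntactic patterns such systems use. The snippet below is a toy illustration, not Syn-QG's implementation; the pre-split (subject, auxiliary, rest) inputs stand in for what a dependency parse would supply.

```python
# Toy illustration, not Syn-QG's implementation: with a declarative sentence
# pre-split into (subject, auxiliary, rest) -- in practice supplied by a
# dependency parse -- two classic rules yield question-answer pairs.
def yes_no_question(subject: str, aux: str, rest: str) -> str:
    """Subject-auxiliary inversion turns a declarative into a yes/no question."""
    return f"{aux.capitalize()} {subject} {rest}?"

def subject_wh_question(subject: str, aux: str, rest: str, animate: bool):
    """Replace the subject with who/what; the subject becomes the answer."""
    wh = "Who" if animate else "What"
    return f"{wh} {aux} {rest}?", subject

print(yes_no_question("the committee", "has", "approved the proposal"))
# Has the committee approved the proposal?
print(subject_wh_question("the committee", "has", "approved the proposal", animate=False))
# ('What has approved the proposal?', 'the committee')
```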

    Generating natural language specifications from UML class diagrams

    Early phases of software development are known to be problematic and difficult to manage, and errors occurring during these phases are expensive to correct. Many systems have been developed to aid the transition from informal Natural Language requirements to semi-structured or formal specifications. Furthermore, consistency checking is seen by many software engineers as the solution to reducing the number of errors occurring during the software development life cycle and to allowing early verification and validation of software systems. However, this checking is confined to the models developed during analysis and design and fails to include the early Natural Language requirements. This excludes proper user involvement and creates a gap between the original requirements and the updated and modified models and implementations of the system. To improve this process, we propose a system that generates Natural Language specifications from UML class diagrams. We first investigate the variation of the input language used in naming the components of a class diagram, based on a study of a large number of examples from the literature, and then develop rules for removing ambiguities in the subset of Natural Language used within UML. We use WordNet, a linguistic ontology, to disambiguate the lexical structures of the UML string names and generate semantically sound sentences. Our system is developed in Java and has been tested on an independent, albeit academic, case study.
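
    The core generation step can be pictured as verbalizing each class-diagram association as an English sentence. The sketch below is a hypothetical simplification: the camel-case splitting and the multiplicity phrasing are invented stand-ins for the kind of naming rules the abstract describes, not the paper's actual rules.

```python
# Hypothetical simplification of the generation step: verbalize one UML
# association as an English sentence. The camel-case splitting and the
# multiplicity phrasing are invented stand-ins for the paper's naming rules.
import re

def split_camel(name: str) -> str:
    """'PurchaseOrder' -> 'purchase order'."""
    return re.sub(r"(?<!^)(?=[A-Z])", " ", name).lower()

def multiplicity_phrase(mult: str) -> str:
    return {"1": "exactly one", "0..1": "at most one",
            "*": "zero or more", "1..*": "one or more"}.get(mult, mult)

def verbalize_association(src: str, assoc: str, tgt: str, mult: str) -> str:
    return (f"Each {split_camel(src)} {split_camel(assoc)} "
            f"{multiplicity_phrase(mult)} {split_camel(tgt)}(s).")

print(verbalize_association("Customer", "places", "PurchaseOrder", "1..*"))
# Each customer places one or more purchase order(s).
```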

    Clinical data wrangling using Ontological Realism and Referent Tracking

    Ontological realism aims at the development of high-quality ontologies that faithfully represent what is general in reality, and at the use of these ontologies to render heterogeneous data collections comparable. To achieve this second goal for clinical research datasets presupposes not merely (1) that the requisite ontologies already exist, but also (2) that the datasets in question are faithful to reality in the dual sense that (a) they denote only particulars and relationships between particulars that do in fact exist and (b) they do this in terms of the types and type-level relationships described in these ontologies. While much attention has been devoted to (1), work on (2), which is the topic of this paper, is comparatively rare. Using Referent Tracking as a basis, we describe a technical data-wrangling strategy which consists in creating for each dataset a template that, when applied to each particular record in the dataset, leads to the generation of a collection of Referent Tracking Tuples (RTTs) built out of unique identifiers for the entities described by means of the data items in the record. The proposed strategy is based on (i) the distinction between data and what data are about, and (ii) the explicit descriptions of portions of reality which RTTs provide and which range not only over the particulars described by data items in a dataset, but also over these data items themselves. This last feature allows us to describe particulars that are only implicitly referred to by the dataset, to provide information about correspondences between data items in a dataset, and to assert which data items are unjustifiably or redundantly present in or absent from the dataset. The approach has been tested on a dataset collected from patients seeking treatment for orofacial pain at two German universities and made available for the NIDCR-funded OPMQoL project.
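
    The template idea can be sketched as a function from a dataset record to a set of tuples keyed by freshly minted identifiers. Everything in the sketch below is invented for illustration: the identifier scheme stands in for Referent Tracking's instance unique identifiers, and the ontology terms are placeholders rather than real IRIs.

```python
# Everything here is invented for illustration: the IUI-style identifiers
# stand in for Referent Tracking's instance unique identifiers, and the
# 'onto:*' strings are placeholder ontology terms, not real IRIs.
import itertools
from typing import Dict, List, Tuple

_counter = itertools.count(1)

def mint_identifier() -> str:
    """Mint a fresh unique identifier for a newly described particular."""
    return f"IUI-{next(_counter):06d}"

def apply_template(template: Dict[str, str], record: Dict[str, str]) -> List[Tuple]:
    """Turn one dataset record into (identifier, type, data_item, value) tuples."""
    tuples = []
    for column, onto_type in template.items():
        if column in record:  # an absent data item simply yields no tuple
            tuples.append((mint_identifier(), onto_type, column, record[column]))
    return tuples

template = {"pain_score": "onto:PainIntensity", "visit_date": "onto:ClinicalEncounter"}
print(apply_template(template, {"pain_score": "7", "visit_date": "2010-03-14"}))
```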