24 research outputs found

    Don't Add, don't Miss: Effective Content Preserving Generation from Pre-Selected Text Spans

    The recently introduced Controlled Text Reduction (CTR) task isolates the text generation step within typical summarization-style tasks. It does so by challenging models to generate coherent text conforming to pre-selected content within the input text ("highlights"). This framing enables increased modularity in summarization-like tasks, allowing a single CTR model to be coupled with various content-selection setups and modules. However, there are currently no reliable CTR models, and the performance of the existing baseline for the task is mediocre, falling short of practical utility. Here, we address this gap by introducing a high-quality, open-source CTR model that tackles two key prior limitations: inadequate enforcement of the content-preservation constraint, and suboptimal silver training data. Addressing these, we amplify the content-preservation constraint in both training, via RL, and inference, via a controlled decoding strategy. Further, we substantially improve the silver training data quality via GPT-4 distillation. Overall, pairing the distilled dataset with the highlight-adherence strategies yields marked gains over the current baseline, of up to 30 ROUGE-L points, providing a reliable CTR model for downstream use. Comment: EMNLP 2023, Findings.
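
    The abstract refers to enforcing highlight adherence at inference time through a controlled decoding strategy. Purely as an illustration of that general idea (not the authors' implementation), the sketch below adds a fixed bonus to the logits of tokens that occur in the pre-selected highlights before taking a greedy argmax; the function names and the bonus value are assumptions.

```python
# Minimal sketch (not the paper's method) of a highlight-aware greedy decoding
# step: tokens appearing in the pre-selected highlight spans receive a logit
# bonus, nudging generation toward preserving the highlighted content.
from typing import Dict, Set

def decode_step(logits: Dict[str, float],
                highlight_tokens: Set[str],
                highlight_bonus: float = 2.0) -> str:
    """Bias logits toward highlight tokens, then pick the argmax token."""
    biased = {tok: score + (highlight_bonus if tok in highlight_tokens else 0.0)
              for tok, score in logits.items()}
    return max(biased, key=biased.get)

if __name__ == "__main__":
    # Toy per-token logits at one decoding step.
    logits = {"the": 1.2, "report": 0.9, "weather": 1.1, "budget": 0.8}
    highlights = {"report", "budget"}       # tokens from the highlighted spans
    print(decode_step(logits, highlights))  # -> "report" (highlight outweighs "weather")
```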

    The Curious Case of Hallucinatory (Un)answerability: Finding Truths in the Hidden States of Over-Confident Large Language Models

    Large language models (LLMs) have been shown to possess impressive capabilities, while also raising crucial concerns about the faithfulness of their responses. A primary issue arising in this context is the management of (un)answerable queries by LLMs, which often results in hallucinatory behavior due to overconfidence. In this paper, we explore the behavior of LLMs when presented with (un)answerable queries. We ask: do models represent the fact that the question is (un)answerable when generating a hallucinatory answer? Our results show strong indications that such models encode the answerability of an input query, with the representation of the first decoded token often being a strong indicator. These findings shed new light on the spatial organization within the latent representations of LLMs, unveiling previously unexplored facets of these models. Moreover, they pave the way for the development of improved decoding techniques with better adherence to factual generation, particularly in scenarios where query (un)answerability is a concern. Comment: EMNLP 2023.
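
    Since the finding is that the hidden state of the first decoded token often signals answerability, one natural reading is a linear probe trained on that representation. The snippet below is a hedged sketch of such a probe; the random feature matrix stands in for hidden states that would, in practice, be extracted from the LLM, and all names and sizes are illustrative assumptions rather than the paper's code.

```python
# Sketch of a linear probe over the hidden state of the first decoded token,
# predicting whether the input query was answerable. Features here are random
# stand-ins; real ones would come from an LLM forward pass.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n_examples, hidden_size = 200, 768                    # illustrative sizes
hidden_first_token = rng.normal(size=(n_examples, hidden_size))
answerable = rng.integers(0, 2, size=n_examples)      # gold labels (0/1)

probe = LogisticRegression(max_iter=1000).fit(hidden_first_token, answerable)
# Accuracy is meaningless on random stand-ins; shown only to complete the sketch.
print("train accuracy:", probe.score(hidden_first_token, answerable))
```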

    Peek Across: Improving Multi-Document Modeling via Cross-Document Question-Answering

    The integration of multi-document pre-training objectives into language models has resulted in remarkable improvements in multi-document downstream tasks. In this work, we propose extending this idea by pre-training a generic multi-document model from a novel cross-document question answering pre-training objective. To that end, given a set (or cluster) of topically-related documents, we systematically generate semantically-oriented questions from a salient sentence in one document and challenge the model, during pre-training, to answer these questions while "peeking" into other topically-related documents. In a similar manner, the model is also challenged to recover the sentence from which the question was generated, again while leveraging cross-document information. This novel multi-document QA formulation directs the model to better recover cross-text informational relations, and introduces a natural augmentation that artificially increases the pre-training data. Further, unlike prior multi-document models that focus on either classification or summarization tasks, our pre-training objective formulation enables the model to perform tasks that involve both short text generation (e.g., QA) and long text generation (e.g., summarization). Following this scheme, we pre-train our model -- termed QAmden -- and evaluate its performance across several multi-document tasks, including multi-document QA, summarization, and query-focused summarization, yielding improvements of up to 7% and significantly outperforming zero-shot GPT-3.5 and GPT-4. Comment: Accepted at ACL 2023; camera-ready version.
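
    To make the cross-document QA pre-training objective more concrete, here is a small sketch of how a single training instance might be assembled: a question generated from a salient sentence in one document is paired, as input, with the other documents in the cluster, and the target asks the model to produce both the answer and the originating sentence. The format, tags, and function name are invented for illustration and are not QAmden's actual scheme.

```python
# Illustrative construction of one cross-document QA pre-training instance
# (assumed format, not QAmden's). The model must answer the question by
# "peeking" into the other documents and also recover the source sentence.
from typing import Dict, List

def make_instance(cluster: List[str], salient_idx: int, salient_sentence: str,
                  question: str, answer: str) -> Dict[str, str]:
    peek_docs = [doc for i, doc in enumerate(cluster) if i != salient_idx]
    source = "question: " + question + " context: " + " <doc> ".join(peek_docs)
    target = "answer: " + answer + " <sent> " + salient_sentence
    return {"source": source, "target": target}

if __name__ == "__main__":
    cluster = ["Doc A reports a storm hitting the coast.",
               "Doc B: the storm caused flooding in two towns.",
               "Doc C: residents were evacuated on Tuesday."]
    inst = make_instance(cluster, salient_idx=1,
                         salient_sentence="The storm caused flooding in two towns.",
                         question="What damage did the storm cause?",
                         answer="flooding in two towns")
    print(inst["source"])
    print(inst["target"])
```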

    QASem Parsing: Text-to-text Modeling of QA-based Semantics

    Several recent works have suggested representing semantic relations with questions and answers, decomposing textual information into separate interrogative natural language statements. In this paper, we consider three QA-based semantic tasks - namely, QA-SRL, QANom and QADiscourse, each targeting a certain type of predication - and propose to regard them as jointly providing a comprehensive representation of textual information. To promote this goal, we investigate how to best utilize the power of sequence-to-sequence (seq2seq) pre-trained language models, within the unique setup of semi-structured outputs, consisting of an unordered set of question-answer pairs. We examine different input and output linearization strategies, and assess the effect of multitask learning and of simple data augmentation techniques in the setting of imbalanced training data. Consequently, we release the first unified QASem parsing tool, practical for downstream applications that can benefit from an explicit, QA-based account of information units in a text.
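
    The setup hinges on linearizing an unordered set of question-answer pairs into a single target string that a seq2seq model can emit, and parsing it back afterwards. Below is a minimal sketch of one possible linearization and its inverse; the separators ";;" and "??" are invented for illustration and are not the released tool's format.

```python
# Toy linearization of QA pairs into one seq2seq target string, plus the
# inverse parse. Separators are illustrative assumptions only.
from typing import List, Tuple

def linearize(qa_pairs: List[Tuple[str, str]]) -> str:
    """Canonicalize order (the set is unordered), then join into one string."""
    return " ;; ".join(f"{q} ?? {a}" for q, a in sorted(qa_pairs))

def delinearize(text: str) -> List[Tuple[str, str]]:
    pairs = []
    for chunk in text.split(" ;; "):
        q, _, a = chunk.partition(" ?? ")
        pairs.append((q.strip(), a.strip()))
    return pairs

if __name__ == "__main__":
    qas = [("who ate something?", "the fox"), ("what was eaten?", "the grapes")]
    encoded = linearize(qas)
    print(encoded)
    assert delinearize(encoded) == sorted(qas)  # round-trip up to ordering
```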

    Association between translation efficiency and horizontal gene transfer within microbial communities

    Horizontal gene transfer (HGT) is a major force in microbial evolution. Previous studies have suggested that a variety of factors, including restricted recombination and toxicity of foreign gene products, may act as barriers to the successful integration of horizontally transferred genes. This study identifies an additional central barrier to HGT: the lack of co-adaptation between the codon usage of the transferred gene and the tRNA pool of the recipient organism. Analyzing the genomic sequences of more than 190 microorganisms and the HGT events that have occurred between them, we show that the number of genes that were horizontally transferred between organisms is positively correlated with the similarity between their tRNA pools. Those genes that are better adapted to the tRNA pools of the target genomes tend to undergo more frequent HGT. At the community (or environment) level, organisms that share a common ecological niche tend to have similar tRNA pools. These results remain significant after controlling for diverse ecological and evolutionary parameters. Our analysis demonstrates that there are bi-directional associations between the similarity in the tRNA pools of organisms and the number of HGT events occurring between them. Similar tRNA pools between a donor and a host tend to increase the probability that a horizontally acquired gene will become fixed in its new genome. Our results also suggest that frequent HGT may be a homogenizing force that increases the similarity in the tRNA pools of organisms within the same community.
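
    The central quantitative claim is a positive correlation between the similarity of two organisms' tRNA pools and the number of HGT events between them. As a rough illustration of that style of analysis (not the study's pipeline), the snippet below encodes each organism's tRNA pool as an anticodon-count vector, scores organism pairs with cosine similarity, and correlates those scores with toy HGT counts using Spearman's rank correlation.

```python
# Toy example (not the study's code): correlate tRNA-pool similarity with
# HGT event counts across organism pairs.
import numpy as np
from scipy.stats import spearmanr

def cosine(u: np.ndarray, v: np.ndarray) -> float:
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

# Anticodon-count vectors per organism (made-up numbers).
trna_pools = {
    "orgA": np.array([10, 4, 7, 1]),
    "orgB": np.array([9, 5, 6, 2]),
    "orgC": np.array([1, 8, 2, 9]),
}
# Made-up HGT event counts per organism pair.
hgt_counts = {("orgA", "orgB"): 14, ("orgA", "orgC"): 3, ("orgB", "orgC"): 4}

pairs = list(hgt_counts)
similarity = [cosine(trna_pools[a], trna_pools[b]) for a, b in pairs]
events = [hgt_counts[p] for p in pairs]
rho, pval = spearmanr(similarity, events)
print(f"Spearman rho = {rho:.2f} (p = {pval:.3f})")
```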

    Pneumococcal Meningitis in Adults after Introduction of PCV7 and PCV13, Israel, July 2009–June 2015

    The indirect effect of pneumococcal conjugate vaccine on adult pneumococcal meningitis has not been thoroughly investigated. We present data from active surveillance on pneumococcal meningitis in adults in Israel occurring during July 2009–June 2015. Pneumococcal meningitis was diagnosed for 221 patients, 9.4% of all invasive pneumococcal disease (IPD) cases. Although overall IPD incidence decreased during the study period, incidence of meningitis increased nonsignificantly from 0.66 to 0.85 cases/100,000 population. Incidence of vaccine-type (VT13) pneumococcal meningitis decreased by 70%, but incidence of non-VT13 pneumococcal meningitis increased from 0.32 to 0.75 cases/100,000 population (incidence rate ratio 2.35, 95% CI 1.27–4.35). Pneumococcal meningitis patients were younger and healthier than nonmeningitis IPD patients, and 20.2% had a history of previous head surgery or cerebrospinal fluid leak, compared with <2.0% of nonmeningitis patients (p<0.0001). Non-VT13 serotypes that rarely cause IPD (15B/C, 6C, 23A, 23B, 24F) seem to be emerging as common causes of meningitis.
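
    As a quick arithmetic check of the reported rise in non-VT13 meningitis, the incidence rate ratio is the later rate divided by the earlier one: 0.75 / 0.32 ≈ 2.34, consistent with the reported 2.35 once rounding of the underlying rates is taken into account (the confidence interval requires the raw case counts and is not reproduced here).

```python
# Point estimate of the incidence rate ratio (IRR) from the quoted rates;
# the 95% CI would require the underlying case counts, which are not given.
rate_start = 0.32  # non-VT13 meningitis cases per 100,000, start of period
rate_end = 0.75    # non-VT13 meningitis cases per 100,000, end of period
print(f"IRR = {rate_end / rate_start:.2f}")  # ~2.34, matching the reported 2.35 up to rounding
```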