17 research outputs found

On Improving Summarization Factual Consistency from Natural Language Feedback

Author: Awadallah Ahmed H.
Deb Budhaditya
Halfaker Aaron
Liu Yixin
Radev Dragomir
Teruel Milagro
Publication date: 16/10/2023

Despite the recent progress in language generation models, their outputs may not always meet user expectations. In this work, we study whether informational feedback in natural language can be leveraged to improve generation quality and user preference alignment. To this end, we consider factual consistency in summarization, the quality that the summary should only contain information supported by the input documents, as the user-expected preference. We collect a high-quality dataset, DeFacto, containing human demonstrations and informational natural language feedback consisting of corrective instructions, edited summaries, and explanations with respect to the factual consistency of the summary. Using our dataset, we study three natural language generation tasks: (1) editing a summary by following the human feedback, (2) generating human feedback for editing the original summary, and (3) revising the initial summary to correct factual errors by generating both the human feedback and edited summary. We show that DeFacto can provide factually consistent human-edited summaries and further insights into summarization factual consistency thanks to its informational natural language feedback. We further demonstrate that fine-tuned language models can leverage our dataset to improve summary factual consistency, while large language models lack the zero-shot learning ability in our proposed tasks that require controllable text generation.

Comment: ACL 2023 Camera Ready, GitHub Repo: https://github.com/microsoft/DeFact

arXiv.org e-Print Archive

Using Markov Models and Statistics to Learn, Extract, Fuse, and Detect Patterns in Raw Data

Author: Bao Ly Van
Bart Vanluyten
Budhaditya Deb
Chen Lu
Christopher Griffin
Harakrishnan Bhanu
J M Schwier
J.M. Schwier
L.R. Rabiner
Lu Yu
Micha Ober
R. Vilim
R.R. Brooks
S.M. Brennan
Sean R Eddy
Simon Barber
X Zhong
Xiaohong Sheng
Yu Fu
Publication venue: Springer Science and Business Media LLC
Publication date: 21/09/2017

Many systems are partially stochastic in nature. We have derived data-driven approaches for extracting stochastic state machines (Markov models) directly from observed data. This chapter provides an overview of our approach with numerous practical applications. We have used this approach for inferring shipping patterns, exploiting computer system side-channel information, and detecting botnet activities. For contrast, we include a related data-driven statistical inferencing approach that detects and localizes radiation sources.

Comment: Accepted by 2017 International Symposium on Sensor Networks, Systems and Securit

arXiv.org e-Print Archive

Crossref

On the node-scheduling approach to topology control in ad hoc networks

Author: Budhaditya Deb
Publication date: 01/01/2005

In this paper, we analyze the node-scheduling approach to topology control in the context of reliable packet delivery. In node scheduling, only a minimum set of nodes needed for routing purposes (usually determined by a minimum connected dominating set, MCDS) is kept active. However, the very low density that results from switching off nodes can adversely affect the performance of data delivery due to three factors. First, our analysis shows that at low density, the average path length increases by a larger factor than previously thought. Second, protocols such as the Hop-By-Hop Broadcast (HHB) reliability scheme, which relies on high network degree for optimum performance, suffer. Third, with limited buffers at nodes, the overhead is more pronounced, to the extent of making the network unstable. Using probabilistic models, we derive the relationship between network density and overhead based on the above factors and find the density conditions for minimum power consumption. We also propose a fully distributed and message-optimal node-scheduling algorithm with a constant approximation bound, based on the concept of Virtual Connected Dominating Sets. The scheme can asymptotically achieve optimal density conditions while adapting to different network parameters.

CiteSeerX

ReInForM: Reliable Information Forwarding Using Multiple Paths in Sensor Networks

Author: Badri Nath
Budhaditya Deb
Sudeept Bhatnagar
Publication date: 01/01/2003

Sensor networks are meant for sensing and disseminating information about the environment they sense. The criticality of a sensed phenomenon determines its importance to the end user. Hence data dissemination in a sensor network should be information-aware.

CiteSeerX