On Improving Summarization Factual Consistency from Natural Language Feedback
Despite the recent progress in language generation models, their outputs may
not always meet user expectations. In this work, we study whether informational
feedback in natural language can be leveraged to improve generation quality and
user preference alignment. To this end, we consider factual consistency in
summarization, the quality that the summary should only contain information
supported by the input documents, as the user-expected preference. We collect a
high-quality dataset, DeFacto, containing human demonstrations and
informational natural language feedback consisting of corrective instructions,
edited summaries, and explanations with respect to the factual consistency of
the summary. Using our dataset, we study three natural language generation
tasks: (1) editing a summary by following the human feedback, (2) generating
human feedback for editing the original summary, and (3) revising the initial
summary to correct factual errors by generating both the human feedback and
edited summary. We show that DeFacto can provide factually consistent
human-edited summaries and further insights into summarization factual
consistency thanks to its informational natural language feedback. We further
demonstrate that fine-tuned language models can leverage our dataset to improve
the summary factual consistency, while large language models lack the zero-shot
learning ability in our proposed tasks that require controllable text
generation.
Comment: ACL 2023 Camera Ready, GitHub Repo:
https://github.com/microsoft/DeFact
Using Markov Models and Statistics to Learn, Extract, Fuse, and Detect Patterns in Raw Data
Many systems are partially stochastic in nature. We have derived data driven
approaches for extracting stochastic state machines (Markov models) directly
from observed data. This chapter provides an overview of our approach with
numerous practical applications. We have used this approach for inferring
shipping patterns, exploiting computer system side-channel information, and
detecting botnet activities. For contrast, we include a related data-driven
statistical inferencing approach that detects and localizes radiation sources.
Comment: Accepted by 2017 International Symposium on Sensor Networks, Systems
and Securit
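As a rough illustration of what extracting a Markov model from observed data can mean in the simplest case, the sketch below estimates first-order transition probabilities by counting consecutive state pairs. It is not the chapter's method: the function name `estimate_transitions`, the toy state labels, and the assumption that the raw data has already been discretized into symbols are all ours.

```python
from collections import Counter, defaultdict


def estimate_transitions(sequence):
    """Estimate first-order Markov transition probabilities by counting.

    `sequence` is a list of already-discretized state labels (an assumption;
    the source does not say how raw data is mapped to states).
    """
    counts = defaultdict(Counter)
    # Count every observed (current state, next state) pair.
    for current, nxt in zip(sequence, sequence[1:]):
        counts[current][nxt] += 1

    model = {}
    for state, successors in counts.items():
        total = sum(successors.values())
        # Normalize counts so each state's outgoing probabilities sum to 1.
        model[state] = {nxt: n / total for nxt, n in successors.items()}
    return model


# Hypothetical observation trace, for illustration only.
observed = ["idle", "idle", "send", "idle", "send", "send", "idle"]
model = estimate_transitions(observed)
```

A new trace could then be scored against `model` to flag transitions that were never or rarely observed, which is one plausible way a learned model supports pattern detection.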
On the node-scheduling approach to topology control in ad hoc networks
In this paper, we analyze the node scheduling approach of topology control in the context of reliable packet delivery. In node scheduling, only a minimum set of nodes needed for routing purposes (usually determined by a minimum connected dominating set, MCDS) are kept active. However, a very low density resulting from switching off nodes can adversely affect the performance of data delivery due to three factors. First, our analysis shows that at low density, the average path length increases by a larger factor than previously thought. Second, protocols such as the Hop-By-Hop Broadcast (HHB) reliability scheme (which relies on high network degree for optimum performance) suffer. Third, with limited buffers at nodes, the overhead is more pronounced to the extent of making the network unstable. Using probabilistic models, we derive the relationship between network density and overhead based on the above factors and find the density conditions for minimum power consumption. We also propose a fully distributed and message-optimal node scheduling algorithm with a constant approximation bound based on the concept of Virtual Connected Dominating Sets. The scheme can asymptotically achieve optimal density conditions while adapting to different network parameters.
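To make the idea of keeping only a connected dominating set active more concrete, here is a minimal centralized sketch of a standard greedy heuristic. It is not the paper's distributed Virtual Connected Dominating Set algorithm and does not guarantee a minimum set; the function name `greedy_cds` and the toy topology are assumptions made for illustration.

```python
def greedy_cds(adj):
    """Greedily build a connected dominating set (not necessarily minimum).

    `adj` maps each node to the set of its neighbours in an undirected,
    connected graph. Nodes outside the returned set could be switched off.
    """
    if not adj:
        return set()
    # Start from a highest-degree node; it covers itself and its neighbours.
    start = max(adj, key=lambda v: len(adj[v]))
    active = {start}
    covered = {start} | adj[start]

    while len(covered) < len(adj):
        # Candidates are covered but inactive nodes, so the set stays connected.
        frontier = covered - active
        best = max(frontier, key=lambda v: len(adj[v] - covered), default=None)
        if best is None or not adj[best] - covered:
            raise ValueError("graph is not connected")
        active.add(best)
        covered |= adj[best]
    return active


# Hypothetical five-node chain topology: a - b - c - d - e.
adj = {
    "a": {"b"},
    "b": {"a", "c"},
    "c": {"b", "d"},
    "d": {"c", "e"},
    "e": {"d"},
}
active = greedy_cds(adj)
```

On this chain the three interior nodes stay active and the two end nodes can sleep, which also shows how sparse the remaining topology becomes and why path length and per-node load can grow.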
ReInForM: Reliable Information Forwarding Using Multiple Paths in Sensor Networks
Sensor networks are meant for sensing and disseminating information about the environment they sense. The criticality of a sensed phenomenon determines its importance to the end user. Hence data dissemination in a sensor network should be information-aware