Generative Language Models for Personalized Information Understanding
A major challenge in information understanding stems from the diversity of the audience: individuals differ in their preferences, experiences, and educational and cultural backgrounds. Consequently, a one-size-fits-all approach to providing information may prove suboptimal. While prior research has predominantly focused on delivering pre-existing content to users with potential interests, this thesis explores generative language models for personalized information understanding. By harnessing generative language models, our objective is to generate novel personalized content for individual users. As a result, users from diverse backgrounds can be provided with content that is tailored to their needs and better aligned with their interests. This research addresses two aspects: 1. Personalized Content: how to harness user profiles to create tailored content for individual users; 2. Effective Communication: how to engage with users in order to convey information proficiently. For the first aspect, personalized content, we explore personalized news headline generation. By analyzing users' reading history, our proposed framework identifies the perspectives that users are interested in, which then guide the generation of news headlines that are attractive to those users. For the second aspect, effective communication, we develop a personalized reading assistive agent, which helps users understand complex information in news articles or academic documents through conversation. Compared to reading, obtaining information through conversation is more interactive and requires a shorter attention span. We further incorporate both aspects into personalized information systems in a real-life scenario, patient education. Specifically, we propose a novel after-visit summary (AVS) writing assistant.
After-visit summaries are documents given to patients to help them understand their clinical visits and disease self-management. Our approach not only automatically generates AVS drafts but also detects potential errors in the generated drafts, allowing physicians to revise and produce AVS notes with higher efficiency and accuracy. Moreover, we present PaniniQA, a patient-centric interactive question answering system designed to help patients understand their discharge instructions. PaniniQA first identifies important clinical content from patients’ discharge instructions and then formulates personalized educational questions for individual patients. PaniniQA is also equipped with answer verification functionality to provide timely feedback that corrects patients’ misunderstandings. Overall, we aspire to contribute to the advancement of information dissemination techniques, promoting a more inclusive and effective means of communication in our information-driven world.
On Improving Summarization Factual Consistency from Natural Language Feedback
Despite the recent progress in language generation models, their outputs may
not always meet user expectations. In this work, we study whether informational
feedback in natural language can be leveraged to improve generation quality and
user preference alignment. To this end, we consider factual consistency in
summarization, the quality that the summary should only contain information
supported by the input documents, as the user-expected preference. We collect a
high-quality dataset, DeFacto, containing human demonstrations and
informational natural language feedback consisting of corrective instructions,
edited summaries, and explanations with respect to the factual consistency of
the summary. Using our dataset, we study three natural language generation
tasks: (1) editing a summary by following the human feedback, (2) generating
human feedback for editing the original summary, and (3) revising the initial
summary to correct factual errors by generating both the human feedback and
edited summary. We show that DeFacto can provide factually consistent
human-edited summaries and further insights into summarization factual
consistency thanks to its informational natural language feedback. We further
demonstrate that fine-tuned language models can leverage our dataset to improve
the summary factual consistency, while large language models lack the zero-shot
learning ability in our proposed tasks that require controllable text
generation. Comment: ACL 2023 Camera Ready, GitHub Repo:
https://github.com/microsoft/DeFact
A reinforcement learning formulation to the complex question answering problem
We use extractive multi-document summarization techniques to perform complex question answering and formulate it as a reinforcement learning problem. Given a set of complex questions, a list of relevant documents per question, and the corresponding human-generated summaries (i.e., answers to the questions) as training data, the reinforcement learning module iteratively learns a number of feature weights in order to facilitate the automatic generation of summaries, i.e., answers to previously unseen complex questions. A reward function measures the similarity between the candidate (machine-generated) summary sentences and the abstract summaries. In the training stage, the learner iteratively selects the important document sentences to be included in the candidate summary, evaluates the reward function, and updates the related feature weights accordingly. The final weights are used to generate summaries as answers to unseen complex questions in the testing stage. Evaluation results show the effectiveness of our system. We also incorporate user interaction into the reinforcement learner to guide the candidate summary sentence selection process. Experiments reveal the positive impact of the user interaction component on the reinforcement learning framework.
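The training loop described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the three features, the word-overlap reward, the epsilon-greedy selection, and the linear weight update are all assumptions chosen for illustration, and every function name here is hypothetical.

```python
import random


def features(sentence, question):
    """Hypothetical features: question-term overlap, scaled length, bias."""
    s = set(sentence.lower().split())
    q = set(question.lower().split())
    overlap = len(s & q) / max(len(q), 1)
    length = min(len(sentence.split()) / 20.0, 1.0)
    return [overlap, length, 1.0]


def reward(sentence, reference):
    """Assumed reward: share of the sentence's distinct words found in the
    human-written reference summary."""
    s = set(sentence.lower().split())
    r = set(reference.lower().split())
    return len(s & r) / max(len(s), 1)


def score(weights, feats):
    """Linear score of a sentence under the current feature weights."""
    return sum(w * f for w, f in zip(weights, feats))


def train(examples, episodes=200, alpha=0.1, epsilon=0.2, budget=2, seed=0):
    """Learn feature weights from (question, sentences, reference) triples."""
    rng = random.Random(seed)
    weights = [0.0, 0.0, 0.0]
    for _ in range(episodes):
        question, sentences, reference = rng.choice(examples)
        remaining = list(sentences)
        for _ in range(min(budget, len(remaining))):
            # Epsilon-greedy: mostly pick the best-scoring sentence,
            # occasionally explore a random one.
            if rng.random() < epsilon:
                chosen = rng.choice(remaining)
            else:
                chosen = max(
                    remaining,
                    key=lambda s: score(weights, features(s, question)),
                )
            remaining.remove(chosen)
            feats = features(chosen, question)
            # Move the predicted score toward the observed reward.
            error = reward(chosen, reference) - score(weights, feats)
            weights = [w + alpha * error * f for w, f in zip(weights, feats)]
    return weights


def summarize(weights, question, sentences, budget=2):
    """Testing stage: answer with the top-scoring sentences."""
    ranked = sorted(
        sentences,
        key=lambda s: score(weights, features(s, question)),
        reverse=True,
    )
    return ranked[:budget]
```

Calling `train` on a list of `(question, sentences, reference)` triples returns the learned weights, which `summarize` then applies to the sentences of an unseen question. The user-interaction component mentioned in the abstract is not modeled in this sketch.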
A survey on opinion summarization techniques for social media
The volume of data on social media is huge and keeps increasing. The need to process this extensive information efficiently has increased research interest in knowledge engineering tasks such as opinion summarization. This survey presents the current challenges of opinion summarization for social media, followed by the necessary pre-summarization steps such as preprocessing, feature extraction, noise elimination, and handling of synonym features. Next, it covers the various approaches used in opinion summarization, such as Visualization, Abstractive, Aspect-based, Query-focused, Real-Time, and Update Summarization, and highlights other opinion summarization approaches such as Contrastive, Concept-based, Community Detection, Domain-Specific, Bilingual, Social Bookmarking, and Social Media Sampling. It covers the different datasets used in opinion summarization and the future work suggested for each technique. Finally, it describes different ways of evaluating opinion summarization.
Factually Consistent Summarization via Reinforcement Learning with Textual Entailment Feedback
Despite the seeming success of contemporary grounded text generation systems,
they often tend to generate factually inconsistent text with respect to their
input. This phenomenon is emphasized in tasks like summarization, in which the
generated summaries should be corroborated by their source article. In this
work, we leverage recent progress on textual entailment models to directly
address this problem for abstractive summarization systems. We use
reinforcement learning with reference-free, textual entailment rewards to
optimize for factual consistency and explore the ensuing trade-offs, as
improved consistency may come at the cost of less informative or more
extractive summaries. Our results, according to both automatic metrics and
human evaluation, show that our method considerably improves the faithfulness,
salience, and conciseness of the generated summaries. Comment: ACL 202