RTSUM: Relation Triple-based Interpretable Summarization with Multi-level Salience Visualization
In this paper, we present RTSUM, an unsupervised summarization framework that
utilizes relation triples as the basic unit for summarization. Given an input
document, RTSUM first selects salient relation triples via multi-level salience
scoring and then generates a concise summary from the selected relation triples
by using a text-to-text language model. On the basis of RTSUM, we also develop
a web demo for an interpretable summarizing tool, providing fine-grained
interpretations with the output summary. With support for customization
options, our tool visualizes the salience for textual units at three distinct
levels: sentences, relation triples, and phrases. The code is publicly
available.
Comment: 8 pages, 2 figures
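The two-stage pipeline the abstract describes (score relation triples for salience, then summarize the selected ones) can be sketched as follows. The frequency-based scorer and all function names here are illustrative stand-ins, not the paper's actual multi-level salience model:

```python
from collections import Counter

def salience_scores(triples, document_tokens):
    # Toy phrase-level salience: average the document-frequency mass of the
    # subject, relation, and object phrases of each triple.
    freq = Counter(document_tokens)
    total = sum(freq.values())
    scores = []
    for subj, rel, obj in triples:
        phrase_scores = [
            sum(freq[tok] for tok in part.split()) / total
            for part in (subj, rel, obj)
        ]
        scores.append(sum(phrase_scores) / 3)
    return scores

def select_salient_triples(triples, document_tokens, k=2):
    # Stage 1 of the pipeline: keep the k highest-scoring triples; a
    # text-to-text LM would then verbalize them into a summary (stage 2).
    scores = salience_scores(triples, document_tokens)
    ranked = sorted(zip(scores, triples), key=lambda x: -x[0])
    return [t for _, t in ranked[:k]]
```

In the real system the per-level scores (sentence, triple, phrase) are also what the demo visualizes for interpretability.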
Coffee: Boost Your Code LLMs by Fixing Bugs with Feedback
Code editing is an essential step towards reliable program synthesis to
automatically correct critical errors generated from code LLMs. Recent studies
have demonstrated that closed-source LLMs (i.e., ChatGPT and GPT-4) are capable
of generating corrective feedback to edit erroneous inputs. However, it remains
challenging for open-source code LLMs to generate feedback for code editing,
since these models tend to adhere to the superficial formats of feedback and
provide feedback with misleading information. Hence, the focus of our work is
to leverage open-source code LLMs to generate helpful feedback with correct
guidance for code editing. To this end, we present Coffee, a collected dataset
specifically designed for code fixing with feedback. Using this dataset, we
construct CoffeePots, a framework for COde Fixing with FEEdback via
Preference-Optimized Tuning and Selection. The proposed framework aims to
automatically generate helpful feedback for code editing while minimizing the
potential risk of superficial feedback. The combination of Coffee and
CoffeePots marks a significant advancement, achieving state-of-the-art
performance on the HumanEvalFix benchmark. Code and model checkpoints are publicly
available at https://github.com/Lune-Blue/COFFEE.
Comment: Work in progress
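The selection half of the framework (pick the most helpful feedback among candidates) can be sketched as follows; the specificity heuristic below is a made-up stand-in for the preference-optimized scorer, used only to illustrate the rank-and-select step:

```python
def specificity_score(feedback: str) -> int:
    # Stand-in for a learned preference model: reward feedback that names
    # concrete code elements over vague remarks (superficial feedback).
    vague = {"fix", "the", "bug", "please", "code", "your"}
    return sum(1 for tok in feedback.lower().split() if tok not in vague)

def select_feedback(candidates, score=specificity_score):
    # Selection stage: rank candidate feedback messages generated by the
    # tuned open-source model and keep the highest-scoring one.
    return max(candidates, key=score)
```

The selected feedback would then be fed back to the code LLM as guidance for producing the corrected program.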
Dialogue Chain-of-Thought Distillation for Commonsense-aware Conversational Agents
Human-like chatbots necessitate the use of commonsense reasoning in order to
effectively comprehend and respond to implicit information present within
conversations. Achieving such coherence and informativeness in responses,
however, is a non-trivial task. Even for large language models (LLMs),
identifying and aggregating key evidence within a single hop presents a
substantial challenge, because such evidence is scattered across multiple
turns in a conversation and must therefore be integrated over multiple hops.
Hence, our focus is to facilitate such
multi-hop reasoning over a dialogue context, namely dialogue chain-of-thought
(CoT) reasoning. To this end, we propose a knowledge distillation framework
that leverages LLMs as unreliable teachers and selectively distills consistent
and helpful rationales via alignment filters. We further present DOCTOR, a
DialOgue Chain-of-ThOught Reasoner that provides reliable CoT rationales for
response generation. We conduct extensive experiments to show that enhancing
dialogue agents with high-quality rationales from DOCTOR significantly improves
the quality of their responses.
Comment: 25 pages, 8 figures, Accepted to EMNLP 202
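The alignment-filtering step of the distillation framework can be sketched as follows; the threshold values and the two scoring callbacks are illustrative assumptions, not the paper's actual filters:

```python
def alignment_filter(rationales, consistency_fn, helpfulness_fn,
                     c_thresh=0.5, h_thresh=0.5):
    # Selectively keep teacher rationales that both (a) stay consistent
    # with the dialogue context and (b) actually help predict the gold
    # response; everything else from the unreliable teacher is discarded.
    return [r for r in rationales
            if consistency_fn(r) >= c_thresh and helpfulness_fn(r) >= h_thresh]
```

The surviving rationales form the distillation data used to train the student reasoner (DOCTOR).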
Understanding Emerging Spatial Entities
In Foursquare or Google+ Local, emerging spatial entities, such as new businesses or venues, are reported to grow by 1% every day. As information on such spatial entities is initially limited (e.g., only the name), we need to quickly harvest related information from social media such as Flickr photos. In particular, achieving high recall in photo population is essential for emerging spatial entities, which suffer from data sparseness (e.g., 71% of TripAdvisor restaurants in Seattle do not have any photo, as of Sep 03, 2015). Our goal is thus to address this limitation by identifying effective linking techniques for emerging spatial entities and photos. Compared with state-of-the-art baselines, our proposed approach improves recall and F1 score by up to 24% and 18%, respectively. To show the effectiveness and robustness of our approach, we have conducted extensive experiments in three different cities, Seattle, Washington D.C., and Taipei, of varying characteristics such as geographical density and language.
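A minimal, recall-oriented sketch of the linking problem (match an entity known only by name to candidate photos via their text metadata); the Jaccard-overlap heuristic and threshold are assumptions for illustration, not the paper's method:

```python
def link_photos(entity_name, photos, threshold=0.3):
    # Recall-oriented candidate linking: token-set Jaccard overlap between
    # the entity name and each photo's text metadata (title + tags).
    name_tokens = set(entity_name.lower().split())
    linked = []
    for photo_id, text in photos:
        tokens = set(text.lower().split())
        overlap = len(name_tokens & tokens) / len(name_tokens | tokens)
        if overlap >= threshold:
            linked.append(photo_id)
    return linked
```

A lower threshold trades precision for the high recall that sparsely documented emerging entities need.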
TrustAL: Trustworthy Active Learning Using Knowledge Distillation
Active learning can be defined as iterations of data labeling, model training, and data acquisition, until sufficient labels are acquired. A traditional view of data acquisition is that, through iterations, knowledge from human labels and models is implicitly distilled to monotonically increase accuracy and label consistency. Under this assumption, the most recently trained model is a good surrogate for the current labeled data, from which data acquisition is requested based on uncertainty/diversity. Our contribution is debunking this myth and proposing a new objective for distillation. First, we identify example forgetting, which indicates a loss of knowledge learned across iterations. Second, for this reason, the last model is no longer the best teacher: to mitigate such forgotten knowledge, we select one of its predecessor models as a teacher, using our proposed notion of "consistency". We show that this novel distillation is distinctive in three aspects. First, consistency helps avoid forgetting labels. Second, consistency improves both the uncertainty and diversity of labeled data. Lastly, consistency redeems defective labels produced by human annotators.
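The core idea (the most consistent predecessor snapshot, not the most recent one, becomes the teacher) can be sketched as follows, with label agreement as a simple stand-in for the paper's consistency measure:

```python
def label_consistency(predictions, labels):
    # Fraction of currently labeled examples a snapshot still gets right;
    # a low value signals example forgetting.
    return sum(p == y for p, y in zip(predictions, labels)) / len(labels)

def select_teacher(snapshots, labels):
    # snapshots: list of (name, predictions) across active-learning rounds.
    # Pick the most consistent snapshot as teacher, which need not be the
    # most recent model.
    return max(snapshots, key=lambda s: label_consistency(s[1], labels))[0]
```

The selected teacher is then used to distill its retained knowledge into the next student model.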
TUTORING: Instruction-Grounded Conversational Agent for Language Learners
In this paper, we propose Tutoring bot, a generative chatbot trained on a large-scale corpus of tutor-student conversations for English-language learning. To mimic a human tutor's behavior in language education, the tutor bot leverages diverse educational instructions and grounds its responses in each instruction, which serves as additional input context for tutor response generation. As a single instruction generally involves multiple dialogue turns to give the student sufficient speaking practice, the tutor bot is required to monitor and capture when the current instruction should be kept or switched to the next one. To this end, the tutor bot is trained not only to generate responses but also to simultaneously infer its teaching action and progress in the current conversation via a multi-task learning scheme. Our Tutoring bot is deployed under a non-commercial use license at https://tutoringai.com.
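The joint training and the keep/switch decision can be sketched as follows; the loss weights and the progress threshold are illustrative assumptions, not the paper's actual hyperparameters:

```python
def teaching_action(progress: float, threshold: float = 0.8) -> str:
    # Inferred teaching action: switch to the next instruction once the
    # student has had sufficient practice on the current one.
    return "switch" if progress >= threshold else "keep"

def multitask_loss(response_loss, action_loss, progress_loss,
                   weights=(1.0, 0.5, 0.5)):
    # Hypothetical weighted sum of the three objectives (response
    # generation, teaching-action prediction, progress estimation)
    # optimized jointly in the multi-task scheme.
    w_r, w_a, w_p = weights
    return w_r * response_loss + w_a * action_loss + w_p * progress_loss
```

Weighting the auxiliary tasks below the response objective keeps generation quality primary while still shaping the action/progress heads.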
Factors Associated with Long-Term Dietary Supplement Use among Korean Breast Cancer Survivors: A Cross-Sectional Study
Purpose: The factors associated with the dietary supplement (DS) use of Asian breast cancer survivors, in consideration of the duration of use and types of DS, have not been well established. Methods: We recruited 693 Korean female breast cancer survivors at two university-affiliated hospitals and collected study data through a self-administered questionnaire and a review of medical records. A multiple logistic regression analysis was conducted to evaluate the multivariable-adjusted association between DS use and study variables. Results: The prevalence of any (≥2 weeks) and long-term (≥6 months) DS use among study participants was 48.2% and 12.0%, respectively. Education level, alcohol use, adequate physical activity (≥150 min/week), and time since cancer diagnosis were positively associated with any DS use. Among DS users, as compared with short-term (≥2 weeks and <6 months) users, long-term users were more likely to have a higher cancer stage, more diverse cancer treatment modalities, a shorter time since cancer diagnosis, and lower fear of cancer recurrence. When we repeated the analysis for each DS type, time since cancer diagnosis showed a consistently inverse association with long-term use of the most frequently consumed DS (multivitamins, followed by vitamin D/calcium, vitamin C, and omega-3). The number of cancer treatment modalities was positively associated with the long-term use of multivitamins and vitamin D/calcium. Alcohol consumption and low bone mineral density were positively associated with long-term vitamin D/calcium use. Conclusions: The factors associated with DS use differed by the duration of DS use and specific DS type. Long-term DS use was more frequently associated with cancer-related factors.
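The reported associations come from multivariable-adjusted logistic regression; as a minimal illustration of the underlying quantity, an unadjusted odds ratio for one binary factor versus long-term DS use can be computed from a 2x2 table (the counts below are made up for illustration, not study data):

```python
def odds_ratio(exposed_cases, exposed_controls,
               unexposed_cases, unexposed_controls):
    # Unadjusted (crude) odds ratio from a 2x2 contingency table:
    # (a / b) / (c / d) = (a * d) / (b * c).
    return (exposed_cases * unexposed_controls) / (
        exposed_controls * unexposed_cases)
```

An OR above 1 indicates a positive association; the multiple regression in the study additionally adjusts each OR for the other covariates.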
Dual Task Framework for Improving Persona-Grounded Dialogue Dataset
This paper introduces a simple yet effective data-centric approach to improving persona-conditioned dialogue agents. Prior model-centric approaches unquestioningly depend on raw crowdsourced benchmark datasets such as Persona-Chat. In contrast, we aim to fix annotation artifacts in the benchmark, which is orthogonally applicable to any dialogue model. Specifically, we augment relevant personas to improve the dialogue dataset and agents, by leveraging the primal-dual structure of the two tasks: predicting dialogue responses and personas based on each other. Experiments on Persona-Chat show that our approach outperforms pre-trained LMs by 11.7 points in accuracy.
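The primal-dual augmentation loop can be sketched as follows: the dual model proposes a persona from a response, and the proposal is kept only if the primal direction agrees. Both callbacks are hypothetical stand-ins for the paper's trained models:

```python
def augment_personas(dialogues, persona_from_response,
                     response_matches_persona):
    # dialogues: list of (context, response, personas).
    # Dual task: propose a persona explaining the observed response;
    # primal task: check the response is predictable from that persona.
    augmented = []
    for context, response, personas in dialogues:
        candidate = persona_from_response(response)
        if (candidate not in personas
                and response_matches_persona(candidate, response)):
            personas = personas + [candidate]
        augmented.append((context, response, personas))
    return augmented
```

Requiring agreement in both directions is what filters out spurious persona candidates (annotation artifacts) rather than adding noise.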