Mental-LLM: Leveraging Large Language Models for Mental Health Prediction via Online Text Data
Advances in large language models (LLMs) have empowered a variety of
applications. However, there is still a significant gap in research when it
comes to understanding and enhancing the capabilities of LLMs in the field of
mental health. In this work, we present the first comprehensive evaluation of
multiple LLMs, including Alpaca, Alpaca-LoRA, FLAN-T5, GPT-3.5, and GPT-4, on
various mental health prediction tasks via online text data. We conduct a broad
range of experiments, covering zero-shot prompting, few-shot prompting, and
instruction fine-tuning. The results indicate a promising yet limited
performance of LLMs with zero-shot and few-shot prompt designs for the mental
health tasks. More importantly, our experiments show that instruction
fine-tuning can significantly boost the performance of LLMs for all tasks
simultaneously. Our best fine-tuned models, Mental-Alpaca and Mental-FLAN-T5,
outperform the best prompt design of GPT-3.5 (25 and 15 times bigger, respectively) by 10.9%
on balanced accuracy, and the best of GPT-4 (250 and 150 times bigger, respectively) by 4.8%.
They further perform on par with the state-of-the-art task-specific language
model. We also conduct an exploratory case study on LLMs' capability for
mental health reasoning tasks, illustrating the promising capability of certain
models such as GPT-4. We summarize our findings into a set of action guidelines
for potential methods to enhance LLMs' capability for mental health tasks.
Meanwhile, we also emphasize important limitations that remain before these
models are deployable in real-world mental health settings, such as known
racial and gender bias, and we highlight the ethical risks that accompany this
line of research.
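The zero-shot prompting setup described above can be illustrated with a minimal sketch using an off-the-shelf FLAN-T5 checkpoint; the checkpoint size, prompt wording, and label set below are illustrative assumptions, not the prompts or data used in the paper.

```python
# Minimal zero-shot prompting sketch with an off-the-shelf FLAN-T5 checkpoint.
# The prompt template and the yes/no label set are illustrative assumptions,
# not the paper's actual prompts or datasets.
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

model_name = "google/flan-t5-base"  # smaller stand-in for the evaluated models
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSeq2SeqLM.from_pretrained(model_name)

post = "I haven't slept properly in weeks and nothing feels worth doing anymore."
prompt = (
    "Read the following social media post and decide whether the author "
    "shows signs of depression. Answer with 'yes' or 'no'.\n\n"
    f"Post: {post}\nAnswer:"
)

inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=5)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

Few-shot prompting follows the same pattern with labeled examples prepended to the prompt, while instruction fine-tuning updates the model weights on task-specific instruction data.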
Beyond Labels: Empowering Human Annotators with Natural Language Explanations through a Novel Active-Learning Architecture
Real-world domain experts (e.g., doctors) rarely annotate only a decision
label in their day-to-day workflow without providing explanations. Yet,
existing low-resource learning techniques, such as Active Learning (AL), that
aim to support human annotators mostly focus on the label while neglecting the
natural language explanation of a data point. This work proposes a novel AL
architecture to support experts' real-world need for label and explanation
annotations in low-resource scenarios. Our AL architecture leverages an
explanation-generation model to produce explanations guided by human
explanations, a prediction model that utilizes generated explanations toward
prediction faithfully, and a novel data diversity-based AL sampling strategy
that benefits from the explanation annotations. Automated and human evaluations
demonstrate the effectiveness of incorporating explanations into AL sampling
and the improved human annotation efficiency and trustworthiness with our AL
architecture. Additional ablation studies illustrate the potential of our AL
architecture for transfer learning, generalizability, and integration with
large language models (LLMs). While LLMs exhibit exceptional
explanation-generation capabilities for relatively simple tasks, their
effectiveness in complex real-world tasks warrants further in-depth study.
Comment: Accepted to EMNLP 2023 Findings
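The data diversity-based AL sampling idea above can be illustrated with a generic sketch that clusters embeddings of the unlabeled pool and picks one representative per cluster; the k-means clustering choice and the helper `select_diverse_batch` are assumptions for illustration, not the sampling strategy proposed in the paper.

```python
# Hypothetical diversity-based AL sampling sketch: cluster unlabeled examples
# in an embedding space and pick one representative per cluster, so the batch
# sent to annotators covers diverse regions of the data. This is a generic
# illustration, not the architecture proposed in the paper.
import numpy as np
from sklearn.cluster import KMeans

def select_diverse_batch(embeddings: np.ndarray, batch_size: int) -> list[int]:
    """Return indices of `batch_size` examples, one nearest to each k-means centroid."""
    km = KMeans(n_clusters=batch_size, n_init=10, random_state=0).fit(embeddings)
    selected = []
    for centroid in km.cluster_centers_:
        distances = np.linalg.norm(embeddings - centroid, axis=1)
        selected.append(int(np.argmin(distances)))
    return selected

# Toy usage: 100 unlabeled examples represented by 32-dimensional embeddings.
rng = np.random.default_rng(0)
pool = rng.normal(size=(100, 32))
print(select_diverse_batch(pool, batch_size=5))
```

In the paper's setting, the embeddings could additionally encode the generated explanations rather than the inputs alone, which is how the explanation annotations can inform sampling.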
Are Human Explanations Always Helpful? Towards Objective Evaluation of Human Natural Language Explanations
Human-annotated labels and explanations are critical for training explainable
NLP models. However, unlike human-annotated labels whose quality is easier to
calibrate (e.g., with a majority vote), human-crafted free-form explanations
can be quite subjective, as some recent works have discussed. Before blindly
using them as ground truth to train ML models, a vital question needs to be
asked: How do we evaluate a human-annotated explanation's quality? In this
paper, we build on the view that the quality of a human-annotated explanation
can be measured based on its helpfulness (or impairment) to the ML models'
performance for the desired NLP tasks for which the annotations were collected.
In comparison to the commonly used Simulatability score, we define a new metric
that can take into consideration the helpfulness of an explanation for model
performance at both fine-tuning and inference. With the help of a unified
dataset format, we evaluated the proposed metric on five datasets (e.g.,
e-SNLI) against two model architectures (T5 and BART), and the results show
that our proposed metric can objectively evaluate the quality of
human-annotated explanations, while Simulatability falls short.
Comment: Accepted to ACL 2023
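To make the notion of scoring an explanation by its helpfulness (or impairment) to model performance concrete, here is a minimal sketch that averages the gain over a no-explanation baseline at the fine-tuning and inference stages; the aggregation and the function `helpfulness` are assumptions, not the metric defined in the paper.

```python
# Hypothetical sketch of an "explanation helpfulness" score: compare task
# performance with vs. without human explanations at fine-tuning and at
# inference. The averaging below is an assumption, not the paper's metric.

def helpfulness(acc_base: float,
                acc_finetune_with_expl: float,
                acc_infer_with_expl: float) -> float:
    """Average gain over a no-explanation baseline across the two stages."""
    finetune_gain = acc_finetune_with_expl - acc_base
    inference_gain = acc_infer_with_expl - acc_base
    return (finetune_gain + inference_gain) / 2

# Toy numbers: explanations help fine-tuning but slightly hurt inference.
print(helpfulness(acc_base=0.80, acc_finetune_with_expl=0.86, acc_infer_with_expl=0.78))
# -> 0.02 (net helpful); a negative value would indicate impairment.
```

A Simulatability-style score, by contrast, only asks whether the explanation helps a model (or human) reproduce the label at inference, which is why it can miss fine-tuning-time effects.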
PaperPuppy: Sniffing the Trail of Semantic Web Publications
PaperPuppy is a system designed to allow the exploration and analysis of a network of ISWC conference authors and publications. Using proceedings data from the first four ISWC conferences, we show co-authorship, citation, institutional affiliation, co-depiction in photographs, and other connections supported by the underlying ontologies. We demonstrate the rich browsing experience with the PaperPuppy site and support it with a variety of cutting-edge Semantic Web tools.
Relationship or revenue: potential management conflicts between customer relationship management and hotel revenue management
The concepts of customer relationship management (CRM) and revenue management (RevM) have been embraced by managers in the hospitality industry although, in practice, companies may find it difficult to accommodate both fully. This paper examines the compatibility between the two practices and discusses the possible management conflicts that occur from both account managers' and revenue managers' viewpoints. Findings gathered from an international hotel company reveal several causes of potential management conflict, including: management goals, management timescales, perceived business assets, performance indicators, and management foci between CRM and RevM, due to divergence occurring in managers' priorities and in their approaches to achieving their individual set goals. These differences have rarely been comprehensively investigated in previous studies, yet they are vital to integrating CRM and RevM practices.