21 research outputs found

    Mental-LLM: Leveraging Large Language Models for Mental Health Prediction via Online Text Data

    Advances in large language models (LLMs) have empowered a variety of applications. However, there is still a significant research gap in understanding and enhancing the capabilities of LLMs in the field of mental health. In this work, we present the first comprehensive evaluation of multiple LLMs, including Alpaca, Alpaca-LoRA, FLAN-T5, GPT-3.5, and GPT-4, on various mental health prediction tasks using online text data. We conduct a broad range of experiments covering zero-shot prompting, few-shot prompting, and instruction fine-tuning. The results indicate promising yet limited performance of LLMs with zero-shot and few-shot prompt designs on the mental health tasks. More importantly, our experiments show that instruction fine-tuning can significantly boost the performance of LLMs for all tasks simultaneously. Our best fine-tuned models, Mental-Alpaca and Mental-FLAN-T5, outperform the best prompt design of GPT-3.5 (25 and 15 times bigger) by 10.9% on balanced accuracy and the best of GPT-4 (250 and 150 times bigger) by 4.8%. They further perform on par with the state-of-the-art task-specific language model. We also conduct an exploratory case study on LLMs' capability for mental health reasoning, illustrating the promising capability of certain models such as GPT-4. We summarize our findings into a set of action guidelines for potential methods to enhance LLMs' capability for mental health tasks. At the same time, we emphasize the important limitations that remain before deployment in real-world mental health settings, such as known racial and gender bias, and we highlight the ethical risks accompanying this line of research.
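    As a loose illustration of the zero-shot and few-shot prompting setups the abstract refers to, the sketch below builds prompts for a binary mental-health classification task. The prompt wording, example posts, and the call_llm() helper are hypothetical placeholders for illustration, not the paper's actual prompts or models.

```python
# Minimal sketch of zero-shot vs. few-shot prompting for a binary
# mental-health prediction task over online text. The prompt wording,
# example posts, and call_llm() helper are hypothetical placeholders.

def call_llm(prompt: str) -> str:
    """Placeholder for a call to an LLM such as Alpaca, FLAN-T5, or GPT-4."""
    raise NotImplementedError

def zero_shot_prompt(post: str) -> str:
    # No labelled examples: the model relies only on its pretrained knowledge.
    return (
        "Decide whether the author of the following post shows signs of "
        "depression. Answer 'yes' or 'no'.\n\n"
        f"Post: {post}\nAnswer:"
    )

def few_shot_prompt(post: str, examples: list[tuple[str, str]]) -> str:
    # A handful of labelled (post, answer) pairs are prepended to the query.
    shots = "\n\n".join(f"Post: {p}\nAnswer: {y}" for p, y in examples)
    return (
        "Decide whether the author of each post shows signs of depression. "
        "Answer 'yes' or 'no'.\n\n"
        f"{shots}\n\nPost: {post}\nAnswer:"
    )

examples = [
    ("Nothing feels worth doing anymore and I can't get out of bed.", "yes"),
    ("Had a great hike this weekend, feeling refreshed.", "no"),
]
print(few_shot_prompt("I feel numb all the time lately.", examples))
```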

    Beyond Labels: Empowering Human Annotators with Natural Language Explanations through a Novel Active-Learning Architecture

    Real-world domain experts (e.g., doctors) rarely annotate only a decision label in their day-to-day workflow without providing explanations. Yet existing low-resource learning techniques, such as Active Learning (AL), that aim to support human annotators mostly focus on the label while neglecting the natural language explanation of a data point. This work proposes a novel AL architecture to support experts' real-world need for label and explanation annotations in low-resource scenarios. Our AL architecture leverages an explanation-generation model that produces explanations guided by human explanations, a prediction model that faithfully uses the generated explanations for prediction, and a novel data diversity-based AL sampling strategy that benefits from the explanation annotations. Automated and human evaluations demonstrate the effectiveness of incorporating explanations into AL sampling, as well as the improved human annotation efficiency and trustworthiness achieved with our AL architecture. Additional ablation studies illustrate the potential of our AL architecture for transfer learning, generalizability, and integration with large language models (LLMs). While LLMs exhibit exceptional explanation-generation capabilities for relatively simple tasks, their effectiveness on complex real-world tasks warrants further in-depth study. Comment: Accepted to EMNLP 2023 Findings.
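    The data diversity-based sampling idea lends itself to a short sketch. The following is a generic greedy farthest-point selection over candidate embeddings, offered as an illustrative stand-in rather than the paper's actual AL architecture; the embedding pool and batch size are assumed inputs.

```python
# Generic sketch of diversity-based active-learning sampling: greedily pick
# k mutually distant points (by embedding distance) to send to human
# annotators for a label plus a natural language explanation. This is an
# illustrative stand-in, not the paper's sampling strategy.
import numpy as np

def farthest_point_sample(embeddings: np.ndarray, k: int) -> list[int]:
    selected = [0]  # seed with an arbitrary point
    dists = np.linalg.norm(embeddings - embeddings[0], axis=1)
    for _ in range(k - 1):
        nxt = int(np.argmax(dists))  # farthest from everything chosen so far
        selected.append(nxt)
        dists = np.minimum(dists, np.linalg.norm(embeddings - embeddings[nxt], axis=1))
    return selected

# Usage with random stand-in embeddings of an unlabeled pool:
pool = np.random.default_rng(0).normal(size=(500, 64))
to_annotate = farthest_point_sample(pool, k=10)  # indices to query for label + explanation
```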

    Care of Women with Obesity in Pregnancy: Green-top Guideline No. 72


    Beyond Structural Genomics for Plant Science


    Are Human Explanations Always Helpful? Towards Objective Evaluation of Human Natural Language Explanations

    Human-annotated labels and explanations are critical for training explainable NLP models. However, unlike human-annotated labels, whose quality is easier to calibrate (e.g., with a majority vote), human-crafted free-form explanations can be quite subjective, as some recent works have discussed. Before blindly using them as ground truth to train ML models, a vital question needs to be asked: how do we evaluate a human-annotated explanation's quality? In this paper, we build on the view that the quality of a human-annotated explanation can be measured by its helpfulness (or impairment) to the ML models' performance on the NLP tasks for which the annotations were collected. In comparison to the commonly used Simulatability score, we define a new metric that takes into account an explanation's helpfulness to model performance at both fine-tuning and inference. With the help of a unified dataset format, we evaluate the proposed metric on five datasets (e.g., e-SNLI) against two model architectures (T5 and BART), and the results show that our proposed metric can objectively evaluate the quality of human-annotated explanations, while Simulatability falls short. Comment: Accepted to ACL 2023.
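    To make the notion concrete, a simplified helpfulness-style score can be written as the average performance gain from adding explanations at fine-tuning and at inference. This is a sketch of the general idea only, not the metric defined in the paper, and the accuracy values below are invented.

```python
# Simplified illustration: score an explanation set by how much it helps
# (or hurts) task performance when used at fine-tuning and at inference.
# Generic sketch, not the paper's metric; all numbers are invented.

def helpfulness(acc_ft_with: float, acc_ft_without: float,
                acc_inf_with: float, acc_inf_without: float) -> float:
    """Average gain from explanations across fine-tuning and inference."""
    finetune_gain = acc_ft_with - acc_ft_without
    inference_gain = acc_inf_with - acc_inf_without
    return (finetune_gain + inference_gain) / 2

# Hypothetical accuracies on an e-SNLI-style task:
score = helpfulness(0.86, 0.82, 0.84, 0.83)
print(f"helpfulness score: {score:+.3f}")  # positive => explanations helped
```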

    PaperPuppy: Sniffing the Trail of Semantic Web Publications

    PaperPuppy is a system designed to allow the exploration and analysis of a network of ISWC conference authors and publications. Using proceedings data from the first four ISWC conferences, we show co-authorship, citation, institutional affiliation, co-depiction in photographs, and other connections supported by the underlying ontologies. We demonstrate the rich browsing experience with the PaperPuppy site and support it with a variety of cutting-edge Semantic Web tools.
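    As a rough illustration of the kind of ontology-backed query such a browser might run, here is a small Python/rdflib sketch of a co-authorship lookup over toy proceedings data; the ex: namespace and predicates are invented for the example and are not PaperPuppy's actual ontology.

```python
# Toy rdflib example of an ontology-backed co-authorship query, loosely in
# the spirit of the browsing PaperPuppy supports. The ex: namespace and
# predicates are hypothetical.
from rdflib import Graph, Literal, Namespace

EX = Namespace("http://example.org/iswc#")
g = Graph()

# Minimal proceedings data: one paper with two authors.
paper = EX.paper1
g.add((paper, EX.hasAuthor, Literal("Alice")))
g.add((paper, EX.hasAuthor, Literal("Bob")))

# Co-authorship: distinct authors who share a paper.
query = """
PREFIX ex: <http://example.org/iswc#>
SELECT DISTINCT ?a ?b WHERE {
    ?p ex:hasAuthor ?a .
    ?p ex:hasAuthor ?b .
    FILTER (str(?a) < str(?b))
}
"""
for a, b in g.query(query):
    print(f"{a} co-authored with {b}")
```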

    Relationship or revenue: potential management conflicts between customer relationship management and hotel revenue management

    The concepts of customer relationship management (CRM) and revenue management (RevM) have been embraced by managers in the hospitality industry, although in practice companies may find it difficult to accommodate both fully. This paper examines the compatibility of the two practices and discusses the potential management conflicts that arise from both account managers' and revenue managers' viewpoints. Findings gathered from an international hotel company reveal several causes of potential management conflict, including differences in management goals, management timescales, perceived business assets, performance indicators, and management foci between CRM and RevM, stemming from divergence in managers' priorities and in their approaches to achieving their individual goals. These differences have rarely been comprehensively investigated in previous studies, yet they are vital to integrating CRM and RevM practices.