
175 research outputs found

ChatGPT as a Factual Inconsistency Evaluator for Abstractive Text Summarization

Author: Ananiadou Sophia
Luo Zheheng
Xie Qianqian
Publication date: 27/03/2023

The performance of abstractive text summarization has recently been greatly boosted by pre-trained language models. The main concern with existing abstractive summarization methods is the factual inconsistency of their generated summaries. To alleviate the problem, many efforts have focused on developing effective factuality evaluation metrics based on natural language inference, question answering, and similar techniques. However, these metrics suffer from high computational complexity and reliance on annotated data. Most recently, large language models such as ChatGPT have shown strong ability in not only natural language understanding but also natural language inference. In this paper, we study the factual inconsistency evaluation ability of ChatGPT under the zero-shot setting by evaluating it on coarse-grained and fine-grained factuality evaluation tasks, including binary natural language inference (NLI), summary ranking, and consistency rating. Experimental results show that ChatGPT outperforms previous SOTA evaluation metrics on 6/9 datasets across three tasks, demonstrating its great potential for assessing factual inconsistency in the zero-shot setting. The results also highlight the importance of prompt design and the need for future efforts to address ChatGPT's limitations on evaluation bias, wrong reasoning, and hallucination. Comment: ongoing work, 12 pages, 4 figures

arXiv.org e-Print Archive

LongDocFACTScore: Evaluating the Factuality of Long Document Abstractive Summarisation

Author: Ananiadou Sophia
Bishop Jennifer A
Xie Qianqian
Publication date: 21/09/2023

Maintaining factual consistency is a critical issue in abstractive text summarisation. However, it cannot be assessed by traditional automatic metrics used for evaluating text summarisation, such as ROUGE scoring. Recent efforts have been devoted to developing improved metrics for measuring factual consistency using pre-trained language models, but these metrics have restrictive token limits and are therefore not suitable for evaluating long document text summarisation. Moreover, there is limited research evaluating whether existing automatic evaluation metrics are fit for purpose when applied to long document data sets. In this work, we evaluate the efficacy of automatic metrics at assessing factual consistency in long document text summarisation and propose a new evaluation framework, LongDocFACTScore. This framework allows metrics to be extended to documents of any length. It outperforms existing state-of-the-art metrics in its ability to correlate with human measures of factuality when used to evaluate long document summarisation data sets. Furthermore, we show that LongDocFACTScore has performance comparable to state-of-the-art metrics when evaluated against human measures of factual consistency on short document data sets. We make our code and annotated data publicly available: https://github.com/jbshp/LongDocFACTScore. Comment: 12 pages, 5 figures

arXiv.org e-Print Archive

A Survey on Biomedical Text Summarization with Pre-trained Language Model

Author: Ananiadou Sophia
Luo Zheheng
Wang Benyou
Xie Qianqian
Publication date: 18/04/2023

The exponential growth of biomedical texts, such as biomedical literature and electronic health records (EHRs), makes it a major challenge for clinicians and researchers to access clinical information efficiently. To address the problem, biomedical text summarization has been proposed to support clinical information retrieval and management, aiming at generating concise summaries that distill key information from single or multiple biomedical documents. In recent years, pre-trained language models (PLMs) have been the de facto standard for various natural language processing tasks in the general domain. Most recently, PLMs have been further investigated in the biomedical field and have brought new insights into the biomedical text summarization task. In this paper, we systematically summarize recent advances that explore PLMs for biomedical text summarization, to help understand recent progress, challenges, and future directions. We categorize PLM-based approaches according to how they utilize PLMs and which PLMs they use. We then review available datasets, recent approaches, and evaluation metrics for the task. We finally discuss existing challenges and promising future directions. To support the research community, we collect open resources, including available datasets, recent approaches, code, evaluation metrics, and the leaderboard, in a public project: https://github.com/KenZLuo/Biomedical-Text-Summarization-Survey/tree/master. Comment: 19 pages, 6 figures, TKDE under review

arXiv.org e-Print Archive

MentalLLaMA: Interpretable Mental Health Analysis on Social Media with Large Language Models

Author: Ananiadou Sophia
Kuang Ziyan
Xie Qianqian
Yang Kailai
Zhang Tianlin
Publication date: 24/09/2023

With the development of web technology, social media texts are becoming a rich source for automatic mental health analysis. As traditional discriminative methods suffer from low interpretability, recent large language models have been explored for interpretable mental health analysis on social media, which aims to provide detailed explanations along with predictions. The results show that ChatGPT can generate approaching-human explanations for its correct classifications. However, LLMs still achieve unsatisfactory classification performance in a zero-shot/few-shot manner. Domain-specific finetuning is an effective solution, but faces two challenges: 1) a lack of high-quality training data; and 2) no open-source LLMs for interpretable mental health analysis have been released to lower the finetuning cost. To alleviate these problems, we build the first multi-task and multi-source interpretable mental health instruction (IMHI) dataset on social media, with 105K data samples. The raw social media data are collected from 10 existing sources covering 8 mental health analysis tasks. We use expert-written few-shot prompts and collected labels to prompt ChatGPT and obtain explanations from its responses. To ensure the reliability of the explanations, we perform strict automatic and human evaluations on the correctness, consistency, and quality of generated data. Based on the IMHI dataset and LLaMA2 foundation models, we train MentalLLaMA, the first open-source LLM series for interpretable mental health analysis with instruction-following capability. We also evaluate the performance of MentalLLaMA on the IMHI evaluation benchmark with 10 test sets, where the correctness of its predictions and the quality of its explanations are examined. The results show that MentalLLaMA approaches state-of-the-art discriminative methods in correctness and generates high-quality explanations. Comment: Work in progress

arXiv.org e-Print Archive

Television Viewing Time in Hong Kong Adult Population: Associations with Body Mass Index and Obesity

Author: Chan Sophia S.
Lam Tai Hing
Stewart Sunita M.
Viswanath Kasisomayajula
Xie Yao Jie
Publication venue: 'Public Library of Science (PLoS)'
Publication date: 01/01/2014

Background: Obesity is increasing dramatically in the Asia-Pacific region, particularly in China. The population of Hong Kong was exposed to modernization far earlier than the rest of China, reflecting conditions that are likely to be replicated as other Chinese cities undergo rapid change. This study examined the relationship between television viewing and obesity in a Hong Kong sample. Information about the relationship between a key sedentary behavior, TV viewing, and obesity, and its moderation by demographic characteristics, may identify sectors of the population at highest risk for excess weight. Methods: Data were from the Hong Kong Family and Health Information Trends Survey (2009–2010), a population-based survey on the public's use of media for health information and family communication, conducted by telephone interviews with 3,016 Hong Kong adults (age≥18 years). TV viewing time, body mass index (BMI), physical activity and other lifestyle variables were analyzed. Results: Viewing time was longer in women, and increased with age but decreased with education level and vigorous physical activity (all P<0.01). Longer TV viewing time was significantly associated with higher BMI (coefficient B = 0.17, 95% CI: 0.11, 0.24) after adjusting for age, gender, employment status, marital status, education level, smoking activity and vigorous physical activity. This association was stronger in women than in men (coefficient B: 0.19 versus 0.15) and strongest in those aged 18 to 34 years (coefficient B = 0.35). Furthermore, a one-hour increase in daily TV viewing was associated with 10% greater odds of being obese. Conclusions: A significant socioeconomic gradient in television viewing time was observed. TV viewing time was positively associated with BMI and obesity. The TV viewing – BMI associations were strongest in women and young adults, suggesting vulnerable groups to target for obesity prevention by decreasing TV viewing.

Harvard University - DASH

Directory of Open Access Journals

PubMed Central

HKU Scholars Hub

Overview of the BioLaySumm 2023 Shared Task on Lay Summarization of Biomedical Research Articles

Author: Ananiadou Sophia
Goldsack Tomas
Lin Chenghua
Luo Zheheng
Scarton Carolina
Shardlow Matthew
Xie Qianqian
Publication date: 25/10/2023

This paper presents the results of the shared task on Lay Summarisation of Biomedical Research Articles (BioLaySumm), hosted at the BioNLP Workshop at ACL 2023. The goal of this shared task is to develop abstractive summarisation models capable of generating "lay summaries" (i.e., summaries that are comprehensible to non-technical audiences) in both a controllable and non-controllable setting. There are two subtasks: 1) Lay Summarisation, where the goal is for participants to build models for lay summary generation only, given the full article text and the corresponding abstract as input; and 2) Readability-controlled Summarisation, where the goal is for participants to train models to generate both the technical abstract and the lay summary, given an article's main text as input. In addition to overall results, we report on the setup and insights from the BioLaySumm shared task, which attracted a total of 20 participating teams across both subtasks. Comment: Published at BioNLP@ACL2023

arXiv.org e-Print Archive

Deep learning based single image super-resolution : a survey

Author: Ha Viet Khanh
Hussain Amir
Masero Valentin
Ren Jin Chang
Xie Gang
Xu Xin Ying
Zhao Sophia
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 19/07/2019

Single image super-resolution has attracted increasing attention and has a wide range of applications in satellite imaging, medical imaging, computer vision, security surveillance imaging, remote sensing, object detection, and recognition. Recently, deep learning techniques have emerged and blossomed, producing "the state-of-the-art" in many domains. Owing to their capability in feature extraction and mapping, they are very helpful for predicting high-frequency details lost in low-resolution images. In this paper, we give an overview of recent advances in deep learning-based models and methods that have been applied to single image super-resolution tasks. We also summarize, compare and discuss various models from the past and present for comprehensive understanding, and finally provide open problems and possible directions for future research.

University of Strathclyde Institutional Repository

Open Access Institutional Repository at Robert Gordon University

Identifying neuropsychiatric disorders in the Medicare Current Beneficiary Survey: the benefits of combining health survey and claims data

Author: Dawei Xie
Joel E. Streim
Margaret G. Stineman
Pui L. Kwong
Qiang Pan
Sophia Miryam Schüssler-Fiorenza Rose
Publication venue: Springer Nature
Publication date: 01/01/2016

Springer - Publisher Connector

FoodWise: Food Waste Reduction and Behavior Change on Campus with Data Visualization and Gamification

Author: Cheng Kwang-Ting
Lo Leo Yu-Ho
Nan Xi
Qu Huamin
Shigyo Kento
Wicaksana Jeffry
Xie Liwenhan
Yi Sophia
Yu Yue
Publication date: 24/07/2023

Food waste presents a substantial challenge with significant environmental and economic ramifications, and its severity in campus environments is of particular concern. In response, we introduce FoodWise, a dual-component system tailored to inspire and incentivize campus communities to reduce food waste. The system consists of a data storytelling dashboard that graphically displays food waste information from university canteens, coupled with a mobile web application that encourages users to log their food waste reduction actions and rewards active participants for their efforts. Deployed during a two-week food-saving campaign at The Hong Kong University of Science and Technology (HKUST) in March 2023, FoodWise engaged over 200 participants from the university community, resulting in the logging of over 800 daily food-saving actions. Feedback collected post-campaign underscores the system's efficacy in raising user awareness of food waste and prompting behavioral shifts towards a more sustainable campus. This paper also provides insights for enhancing our system, contributing to a broader discourse on sustainable campus initiatives.

arXiv.org e-Print Archive

Strong structural and electronic coupling in metavalent PbS moire superlattices

Author: Betzler Sophia
Bustillo Karen C.
Ercius Peter
Ophus Colin
Song Zhigang
Wan Jiawei
Wang Lin-Wang
Wang Yu
Xie Yujun
Zheng Haimei
Publication date: 22/07/2022

Moire superlattices are twisted bilayer materials, in which the tunable interlayer quantum confinement offers access to new physics and novel device functionalities. Previously, moire superlattices were built exclusively using materials with weak van der Waals interactions, and synthesizing moire superlattices with strong interlayer chemical bonding was considered to be impractical. Here, using lead sulfide (PbS) as an example, we report a strategy for synthesizing moire superlattices coupled by strong chemical bonding. We use water-soluble ligands as a removable template to obtain free-standing ultra-thin PbS nanosheets and assemble them into direct-contact bilayers with various twist angles. Atomic-resolution imaging shows the moire periodic structural reconstruction at the superlattice interface, due to the strong metavalent coupling. Electron energy loss spectroscopy and theoretical calculations collectively reveal the twist-angle-dependent electronic structure, especially the emergent separation of flat bands at small twist angles. The localized states of flat bands are similar to well-arranged quantum dots, promising an application in devices. This study opens a new door to the exploration of deep energy modulations within moire superlattices as an alternative to van der Waals twistronics.

arXiv.org e-Print Archive