4 research outputs found
Better Zero-Shot Reasoning with Role-Play Prompting
Modern large language models (LLMs), such as ChatGPT, exhibit a remarkable
capacity for role-playing, enabling them to embody not only human characters
but also non-human entities like a Linux terminal. This versatility allows them
to simulate complex human-like interactions and behaviors within various
contexts, as well as to emulate specific objects or systems. While these
capabilities have enhanced user engagement and introduced novel modes of
interaction, the influence of role-playing on LLMs' reasoning abilities remains
underexplored. In this study, we introduce a strategically designed role-play
prompting methodology and assess its performance under the zero-shot setting
across twelve diverse reasoning benchmarks, encompassing arithmetic,
commonsense reasoning, symbolic reasoning, and more. Leveraging models such as
ChatGPT and Llama 2, our empirical results illustrate that role-play prompting
consistently surpasses the standard zero-shot approach across most datasets.
Notably, accuracy on AQuA rises from 53.5% to 63.8%, and on Last Letter from
23.8% to 84.2%. Beyond enhancing contextual understanding, we posit that
role-play prompting serves as an implicit Chain-of-Thought (CoT) trigger,
thereby improving the quality of reasoning. By comparing our approach with the
Zero-Shot-CoT technique, which prompts the model to "think step by step", we
further demonstrate that role-play prompting can generate a more effective CoT.
This highlights its potential to augment the reasoning capabilities of LLMs
PromptRank: Unsupervised Keyphrase Extraction Using Prompt
The keyphrase extraction task refers to the automatic selection of phrases
from a given document to summarize its core content. State-of-the-art (SOTA)
performance has recently been achieved by embedding-based algorithms, which
rank candidates according to how similar their embeddings are to document
embeddings. However, such solutions either struggle with the document and
candidate length discrepancies or fail to fully utilize the pre-trained
language model (PLM) without further fine-tuning. To this end, in this paper,
we propose a simple yet effective unsupervised approach, PromptRank, based on
the PLM with an encoder-decoder architecture. Specifically, PromptRank feeds
the document into the encoder and calculates the probability of generating the
candidate with a designed prompt by the decoder. We extensively evaluate the
proposed PromptRank on six widely used benchmarks. PromptRank outperforms the
SOTA approach MDERank, improving the F1 score relatively by 34.18%, 24.87%, and
17.57% for 5, 10, and 15 returned results, respectively. This demonstrates the
great potential of using prompt for unsupervised keyphrase extraction. We
release our code at https://github.com/HLT-NLP/PromptRank.Comment: ACL 2023 main conferenc
Nomogram development and external validation for predicting overall survival and cancer-specific survival in patients with primary retroperitoneal sarcoma: a retrospective cohort study
Abstract Background Primary retroperitoneal sarcoma (RPS) comprises over 70 histologic subtypes, yet there are limited studies that have developed prognostic nomograms for RPS patients to predict overall survival (OS) and cancer-specific survival (CSS). The objective of this study was to construct prognostic nomograms for predicting OS and CSS in RPS patients. Methods We identified a total of 1166 RPS patients from the Surveillance, Epidemiology and End Results (SEER) database, and an additional 261 cases were collected from a tertiary cancer center. The study incorporated various clinicopathological and epidemiologic features as variables, and prediction windows for overall survival (OS) and cancer-specific survival (CSS) were set at 3, 5, and 7 years. Multivariable Cox models were utilized to develop the nomograms, and variable selection was performed using a backward procedure based on the Akaike Information Criterion. To evaluate the performance of the nomograms in terms of calibration and discrimination, we used calibration plots, coherence index, and area under the curve. Findings The study included 818 patients in the development cohort, 348 patients in the internal validation cohort, and 261 patients in the external validation cohort. The backward procedure selected the following variables: age, French Federation of Cancer Centers Sarcoma Group (FNCLCC) grade, pre-/postoperative chemotherapy, tumor size, primary site surgery, and tumor multifocality. The validation results demonstrated that the nomograms had good calibration and discrimination, with C-indices of 0.76 for OS and 0.81 for CSS. Calibration plots also showed good consistency between the predicted and actual survival rates. Furthermore, the areas under the time-dependent receiver operating characteristic curves for the 3-, 5-, and 7-year OS (0.84, 0.82, and 0.78, respectively) and CSS (0.88, 0.88, and 0.85, respectively) confirmed the accuracy of the nomograms. Interpretation Our study developed accurate nomograms to predict OS and CSS in patients with RPS. These nomograms have important clinical implications and can assist healthcare providers in making informed decisions regarding patient care and treatment options. They may also aid in patient counseling and stratification in clinical trials