67 research outputs found

    Development of a new stroke scale in an emergency setting

    Background: Early identification of stroke is crucial to maximize early management benefits in emergency departments. This study aimed to develop and validate a new stroke recognition instrument for differentiating acute stroke from stroke mimics in an emergency setting. Methods: A prospective observational cohort study among suspected stroke patients presenting to the Emergency Department of the Second Affiliated Hospital of Guangzhou Medical University was conducted from May 2012 to March 2013. The symptoms and signs of suspected stroke patients were collected. Logistic regression analysis was used to identify the factors associated with acute stroke. The symptoms and signs closely associated with acute stroke were selected to develop the new stroke scale, the Guangzhou Stroke Scale (GZSS). The diagnostic value of GZSS was then compared with ROSIER, FAST and LAPSS. The primary outcome was stroke confirmed by CT within 24 h. Results: Four hundred and sixteen suspected stroke patients (247 ischemia, 107 hemorrhage, 4 transient ischemic attack, 58 non-stroke) were assessed. A new stroke scale, GZSS (total score from −1 to 8.5), was developed and consisted of nine parameters: vertigo (−1), GCS ≤ 8 (+2), facial paralysis (+1), asymmetric arm weakness (+1), asymmetric leg weakness (+1), speech disturbance (+0.5), visual field defect (+1), systolic blood pressure ≥145 mmHg (+1) and diastolic blood pressure ≥95 mmHg (+1). Among the four scales, GZSS had the best discriminatory value (C-statistic), with an AUC of 0.871 (p < 0.001), compared with ROSIER (0.772), LAPSS (0.722) and FAST (0.699). At an optimal cut-off score of >1.5 on a scale from −1 to 8.5, the sensitivity and specificity of GZSS were 83.2% and 74.1%, whilst those of ROSIER were 77.7% and 70.7%, of FAST 76.0% and 63.8%, and of LAPSS 56.4% and 87.9%. Conclusion: GZSS had better sensitivity than existing stroke scales in Chinese patients with suspected stroke. Further studies should be conducted to confirm its effectiveness in the initial differentiation of acute stroke from stroke mimics. Keywords: Diagnosis, Stroke, Stroke mimics, ROSIER scale, FAST scale, LAPSS scale, Emergency department, China. Abbreviations: AUC, area under the ROC curve; CT, computed tomography; DWI, diffusion weighted imaging; FAST, the Face Arm Speech Test; GCS, Glasgow Coma Scale; IQR, interquartile range; LAPSS, the Los Angeles Prehospital Stroke Screen; MRI, magnetic resonance imaging; NIHSS, National Institutes of Health Stroke Scale; OR, odds ratio; ROC, receiver operating characteristic; ROSIER, the Recognition of Stroke in the Emergency Room scale; TIA, transient ischemic attack.
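
The nine GZSS parameters and the >1.5 cut-off reported in the abstract above can be written out as a small scoring function. This is only an illustrative sketch: the point values and thresholds are taken from the abstract, while the `Findings` record, its field names and default values, and the function names are assumptions introduced here and are not from the paper. It is not a clinical tool.

```python
from dataclasses import dataclass


@dataclass
class Findings:
    """Hypothetical record of bedside findings (field names are assumptions)."""
    vertigo: bool = False
    gcs: int = 15                      # Glasgow Coma Scale, 3 to 15
    facial_paralysis: bool = False
    asymmetric_arm_weakness: bool = False
    asymmetric_leg_weakness: bool = False
    speech_disturbance: bool = False
    visual_field_defect: bool = False
    systolic_bp: float = 120.0         # mmHg
    diastolic_bp: float = 80.0         # mmHg


def gzss_score(f: Findings) -> float:
    """Sum the nine item weights listed in the abstract (range -1 to 8.5)."""
    score = 0.0
    score += -1.0 if f.vertigo else 0.0
    score += 2.0 if f.gcs <= 8 else 0.0
    score += 1.0 if f.facial_paralysis else 0.0
    score += 1.0 if f.asymmetric_arm_weakness else 0.0
    score += 1.0 if f.asymmetric_leg_weakness else 0.0
    score += 0.5 if f.speech_disturbance else 0.0
    score += 1.0 if f.visual_field_defect else 0.0
    score += 1.0 if f.systolic_bp >= 145 else 0.0
    score += 1.0 if f.diastolic_bp >= 95 else 0.0
    return score


def gzss_positive(f: Findings, cutoff: float = 1.5) -> bool:
    """Screen positive when the score is strictly above the reported cut-off."""
    return gzss_score(f) > cutoff
```

For example, `gzss_score(Findings(facial_paralysis=True, speech_disturbance=True))` gives 1.5, which does not exceed the cut-off, whereas adding asymmetric arm weakness raises the score to 2.5 and the screen becomes positive.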

    Learning Robust Representations for Continual Relation Extraction via Adversarial Class Augmentation

    Continual relation extraction (CRE) aims to continually learn new relations from a class-incremental data stream. CRE models usually suffer from the catastrophic forgetting problem, i.e., performance on old relations degrades seriously when the model learns new relations. Most previous work attributes catastrophic forgetting to the corruption of the learned representations as new relations arrive, with an implicit assumption that the CRE models have adequately learned the old relations. In this paper, through empirical studies we argue that this assumption may not hold, and that an important reason for catastrophic forgetting is that the learned representations are not robust to the appearance of analogous relations in the subsequent learning process. To address this issue, we encourage the model to learn more precise and robust representations through a simple yet effective adversarial class augmentation mechanism (ACA), which is easy to implement and model-agnostic. Experimental results show that ACA can consistently improve the performance of state-of-the-art CRE models on two popular benchmarks. Comment: Accepted by EMNLP 202

    Making Large Language Models Better Reasoners with Alignment

    Reasoning is a cognitive process of using evidence to reach a sound conclusion. The reasoning capability is essential for large language models (LLMs) to serve as the brain of the artificial general intelligence agent. Recent studies reveal that fine-tuning LLMs on data with the chain of thought (COT) reasoning process can significantly enhance their reasoning capabilities. However, we find that the fine-tuned LLMs suffer from an Assessment Misalignment problem, i.e., they frequently assign higher scores to subpar COTs, leading to potential limitations in their reasoning abilities. To address this problem, we introduce an Alignment Fine-Tuning (AFT) paradigm, which involves three steps: 1) fine-tuning LLMs with COT training data; 2) generating multiple COT responses for each question, and categorizing them into positive and negative ones based on whether they achieve the correct answer; 3) calibrating the scores of positive and negative responses given by LLMs with a novel constraint alignment loss. Specifically, the constraint alignment loss has two objectives: a) Alignment, which guarantees that positive scores surpass negative scores to encourage answers with high-quality COTs; b) Constraint, which keeps the negative scores confined to a reasonable range to prevent model degradation. Beyond binary positive and negative feedback, the constraint alignment loss can be seamlessly adapted to ranking situations when ranking feedback is accessible. Furthermore, we also delve deeply into recent ranking-based alignment methods, such as DPO, RRHF, and PRO, and discover that the constraint, which has been overlooked by these approaches, is also crucial for their performance. Extensive experiments on four reasoning benchmarks with both binary and ranking feedback demonstrate the effectiveness of AFT. Comment: Large Language Models; Reasoning; Alignment

    STEMI outcomes in Guangzhou and Hong Kong: two-centre retrospective interregional study

    BACKGROUND AND OBJECTIVES: Healthcare systems are organized very differently in Hong Kong (HK) and Guangzhou (GZ). This study compared emergency department (ED) management and one-year mortality of ST-segment elevation myocardial infarction (STEMI) patients in two teaching hospitals in Guangzhou and Hong Kong. METHODS: A retrospective observational study of STEMI mortality and treatment in the Prince of Wales Hospital (PWH) and the Second Affiliated Hospital of Guangzhou Medical University (AHGZMU) was conducted between January and December 2010. The primary outcome was one-year all-cause mortality. RESULTS: Univariate analysis of 76 cases from PWH and 111 cases from AHGZMU showed similar clinical characteristics, except for lower proportions of males (74% vs 92%, P = 0.002), hyperlipidemia (5% vs 25%, P67 years) and hyperglycemia (>10 mmol/L). Aged over 65 years, presence of anterior wall infarct, body weight ≤65 kg, SBP 10 mmol/L were the independent predictors of in-hospital MACE. CONCLUSION: There was no statistically significant difference in standardized one-year all-cause mortality of STEMI patients between the setting mainly using thrombolysis with shorter door-to-treatment time and the setting mainly using PCI with longer door-to-treatment time. Age over 67 years and glucose level over 10 mmol/L were independent predictors of one-year mortality. Older age, presence of anterior wall infarct, lower body weight, lower SBP at ED and hyperglycemia were independent predictors of in-hospital MACE.

    Large Language Models are not Fair Evaluators

    We uncover a systematic bias in the evaluation paradigm of adopting large language models (LLMs), e.g., GPT-4, as a referee to score the quality of responses generated by candidate models. We find that the quality ranking of candidate responses can be easily hacked by simply altering their order of appearance in the context. This manipulation allows us to skew the evaluation result, making one model appear considerably superior to the other, e.g., vicuna could beat ChatGPT on 66 out of 80 tested queries. To address this issue, we propose two simple yet effective calibration strategies: 1) Multiple Evidence Calibration, which requires the evaluator model to generate multiple detailed pieces of evidence before assigning ratings; 2) Balanced Position Calibration, which aggregates results across various orders to determine the final score. Extensive experiments demonstrate that our approach successfully mitigates evaluation bias, resulting in closer alignment with human judgments. To facilitate future research on more robust large language model comparison, we integrate the techniques in the paper into an easy-to-use toolkit, FairEval (https://github.com/i-Eval/FairEval), along with the human annotations. Comment: work in progress
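
The idea behind Balanced Position Calibration, as the abstract above describes it, is to query the evaluator with the candidate responses in each order and aggregate the scores. A minimal sketch follows, assuming the aggregation is a plain average over the two orders of a pair. The `Judge` signature, the function names, and the toy position-biased judge are all hypothetical and are not the FairEval toolkit's API.

```python
from typing import Callable, Tuple

# Hypothetical evaluator interface: given a question and two responses in the
# order shown, return (score_of_first_shown, score_of_second_shown).
Judge = Callable[[str, str, str], Tuple[float, float]]


def balanced_position_scores(
    judge: Judge, question: str, response_a: str, response_b: str
) -> Tuple[float, float]:
    """Average each response's score over both presentation orders."""
    a_shown_first, b_shown_second = judge(question, response_a, response_b)
    b_shown_first, a_shown_second = judge(question, response_b, response_a)
    score_a = (a_shown_first + a_shown_second) / 2
    score_b = (b_shown_first + b_shown_second) / 2
    return score_a, score_b


def toy_biased_judge(question: str, first: str, second: str) -> Tuple[float, float]:
    """Toy stand-in for an LLM judge: scores by length, +2 bonus for first slot."""
    return float(len(first)) + 2.0, float(len(second))
```

With the toy judge, a single query in the order (A, B) for a 7-character response A and an 8-character response B scores them 9 and 8, so A appears to win. Averaging over both orders gives 8 and 9, which removes the position bonus from the comparison.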

    Comparison of outcomes in emergency department patients with suspected cardiac chest pain: two-centre prospective observational study in Southern China

    Background: Hong Kong (HK) and Guangzhou (GZ) are cities in China with different healthcare systems. This study aimed to compare 30-day and 6-month mortality and characteristics of patients with suspected cardiac chest pain admitted to two emergency departments (ED) in HK and GZ. Methods: A prospective observational study enrolled patients with suspected cardiac chest pain presenting to EDs in the Prince of Wales Hospital (PWH), HK and the Second Affiliated Hospital of Guangzhou Medical University (AHGZMU), GZ. The primary outcome was 30-day and 6-month mortality. Results: In total, 996 patients were recruited, 407 cases from GZ and 589 cases from HK. The 30-day and 6-month mortality of chest pain patients were 3.7% and 4.7% in GZ and 0.3% and 1.9% in HK, respectively. Serum creatinine level (Cr) was an independent factor for 30-day mortality whilst Cr and systolic blood pressure (SBP) were independent factors for 6-month mortality. In Cox regression analysis, unadjusted and adjusted hazard ratios for 30-day and 6-month mortality in GZ were significantly increased. Conclusion: The 30-day and 6-month mortality of patients with suspected cardiac chest pain in Guangzhou were higher than in Hong Kong due to different baseline clinical characteristics of patients and different distributions of diagnoses, which were associated with different healthcare systems. Serum creatinine and SBP were independent factors for 30-day and 6-month mortality.

    Derivation of a prediction rule for unfavorable outcome after ischemic stroke in the Chinese population

    Background: Efficient assessment of patients after ischemic stroke has important reference value for doctors choosing appropriate treatment. Our study aimed to develop a new prognostic model for predicting outcomes 3 months after ischemic stroke in the Chinese population. Methods: A prospective observational cohort study among ischemic stroke patients presenting to the Emergency Department of the Second Affiliated Hospital of Guangzhou Medical University was conducted from May 2012 to June 2013. Demographic data of ischemic stroke patients, NIHSS assessments and laboratory results were collected. Based on the 3-month modified Rankin Scale (mRS), ischemic stroke patients were divided into favorable outcome (mRS: 0-2) and unfavorable outcome (mRS: 3-6) groups. The variables closely associated with prognosis of ischemic stroke were selected to develop the new prognostic model (NAAP), consisting of four parameters: NIHSS, age, atrial fibrillation, and prealbumin. The prognostic value of the modified prognostic model was then compared with NIHSS alone. Results: A total of 454 patients with suspected stroke were recruited. One hundred eighty-six patients with ischemic stroke were included in the final analysis. A new prognostic model, NAAP, was developed. The area under the curve (AUC) of NAAP was .861 (95% confidence interval: .803-.907), whilst the AUC of NIHSS was .783 (95% CI: .717-.840), (P = .0048). Decision curve analysis showed that NAAP had a higher net benefit for threshold probabilities of 65% for predictive risk of poor outcomes. Conclusions: The modified prognostic model, NAAP, may be a better prognostic tool for predicting 3-month unfavorable outcomes for ischemic stroke than NIHSS alone.

    The Chinese pine genome and methylome unveil key features of conifer evolution

    Conifers dominate the world's forest ecosystems and are the most widely planted tree species. Their giant and complex genomes present great challenges for assembling a complete reference genome for evolutionary and genomic studies. We present a 25.4-Gb chromosome-level assembly of Chinese pine (Pinus tabuliformis) and reveal that its genome size is mostly attributable to huge intergenic regions and long introns with high transposable element (TE) content. Large genes with long introns exhibited higher expression levels. Despite a lack of recent whole-genome duplication, 91.2% of genes were duplicated through dispersed duplication, and expanded gene families are mainly related to stress responses, which may underpin conifers' adaptation, particularly in cold and/or arid conditions. The reproductive regulation network is distinct compared with angiosperms. Slow removal of TEs with high-level methylation may have contributed to genomic expansion. This study provides insights into conifer evolution and resources for advancing research on conifer adaptation and development.