Development of a new stroke scale in an emergency setting
Background: Early identification of stroke is crucial to maximize early management benefits in emergency
departments. This study aimed to develop and validate a new stroke recognition instrument for differentiating
acute stroke from stroke mimics in an emergency setting.
Methods: A prospective observational cohort study of suspected stroke patients presenting to the Emergency
Department of the Second Affiliated Hospital of Guangzhou Medical University was conducted from May 2012 to
March 2013. The symptoms and signs of suspected stroke patients were collected. Logistic regression analysis was
used to identify the factors associated with acute stroke. The symptoms and signs closely associated with acute
stroke were selected to develop the new stroke scale, Guangzhou Stroke Scale (GZSS). The diagnostic value of GZSS
was then compared with ROSIER, FAST and LAPSS. The primary outcome was stroke confirmed by CT within 24 h.
Results: Four hundred and sixteen suspected stroke patients (247 ischemia, 107 hemorrhage, 4 transient ischemic
attack, 58 non-stroke) were assessed. A new stroke scale, GZSS (total score from −1 to 8.5), was developed and
consisted of nine parameters: vertigo (−1), GCS ≤ 8 (+2), facial paralysis (+1), asymmetric arm weakness (+1),
asymmetric leg weakness (+1), speech disturbance (+0.5), visual field defect (+1), systolic blood pressure ≥145 mmHg
(+1) and diastolic blood pressure ≥95 mmHg (+1). Among the four scales, the discriminatory value (C-statistic) of GZSS
was the best (AUC 0.871, p < 0.001) compared with ROSIER (0.772), LAPSS (0.722) and FAST (0.699). At an optimal
cut-off score of >1.5 on a scale from −1 to 8.5, the sensitivity and specificity of GZSS were 83.2 % and 74.1 %, whilst those
of ROSIER were 77.7 % and 70.7 %, those of FAST 76.0 % and 63.8 %, and those of LAPSS 56.4 % and 87.9 %.
Conclusion: GZSS had better sensitivity than existing stroke scales in Chinese patients with suspected stroke. Further
studies should be conducted to confirm its effectiveness in the initial differentiation of acute stroke from stroke mimics.
Keywords: Diagnosis, Stroke, Stroke mimics, ROSIER scale, FAST scale, LAPSS scale, Emergency department, China
Abbreviations: AUC, area under the ROC curve; CT, computed tomography; DWI, diffusion weighted imaging; FAST,
the face arm speech test; GCS, Glasgow Coma Scale; IQR, inter quartile range; LAPSS, the Los Angeles Prehospital Stroke
Screen; MRI, magnetic resonance imaging; NIHSS, National Institute of Health stroke scale; OR, odds ratio; ROC, receiver
operating characteristic; ROSIER, the Recognition of Stroke in the Emergency Room scale; TIA, transient ischemic attack
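
The GZSS scoring rule reported in the abstract above can be illustrated with a short Python sketch. The nine weights and the >1.5 cut-off are taken from the abstract; the class name `Findings`, the field names, and the default values (GCS 15, blood pressure 120/80 mmHg) are hypothetical choices made here for illustration only. This is not the authors' implementation and is not intended for clinical use.

```python
from dataclasses import dataclass


@dataclass
class Findings:
    """Bedside findings used by GZSS (field names and defaults are illustrative)."""
    vertigo: bool = False
    gcs: int = 15                      # Glasgow Coma Scale, 3 to 15
    facial_paralysis: bool = False
    asymmetric_arm_weakness: bool = False
    asymmetric_leg_weakness: bool = False
    speech_disturbance: bool = False
    visual_field_defect: bool = False
    systolic_bp: float = 120.0         # mmHg
    diastolic_bp: float = 80.0         # mmHg


def gzss_score(f: Findings) -> float:
    """Sum the nine GZSS items with the weights listed in the abstract."""
    score = 0.0
    score += -1.0 if f.vertigo else 0.0
    score += 2.0 if f.gcs <= 8 else 0.0
    score += 1.0 if f.facial_paralysis else 0.0
    score += 1.0 if f.asymmetric_arm_weakness else 0.0
    score += 1.0 if f.asymmetric_leg_weakness else 0.0
    score += 0.5 if f.speech_disturbance else 0.0
    score += 1.0 if f.visual_field_defect else 0.0
    score += 1.0 if f.systolic_bp >= 145 else 0.0
    score += 1.0 if f.diastolic_bp >= 95 else 0.0
    return score


def gzss_positive(f: Findings, cutoff: float = 1.5) -> bool:
    """Apply the reported cut-off: a score strictly greater than 1.5 is positive."""
    return gzss_score(f) > cutoff


if __name__ == "__main__":
    patient = Findings(asymmetric_arm_weakness=True,
                       speech_disturbance=True,
                       systolic_bp=150)
    print(gzss_score(patient), gzss_positive(patient))
```

With every positive item present and no vertigo, the weights sum to 8.5, and vertigo alone gives −1, which matches the score range stated in the abstract.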
Learning Robust Representations for Continual Relation Extraction via Adversarial Class Augmentation
Continual relation extraction (CRE) aims to continually learn new relations
from a class-incremental data stream. CRE models usually suffer from the
catastrophic forgetting problem, i.e., the performance on old relations
seriously degrades when the model learns new relations. Most previous work
attributes catastrophic forgetting to the corruption of the learned
representations as new relations come, with an implicit assumption that the CRE
models have adequately learned the old relations. In this paper, through
empirical studies we argue that this assumption may not hold, and an important
reason for catastrophic forgetting is that the learned representations do not
have good robustness against the appearance of analogous relations in the
subsequent learning process. To address this issue, we encourage the model to
learn more precise and robust representations through a simple yet effective
adversarial class augmentation mechanism (ACA), which is easy to implement and
model-agnostic. Experimental results show that ACA can consistently improve the
performance of state-of-the-art CRE models on two popular benchmarks. Comment: Accepted by EMNLP 202
Making Large Language Models Better Reasoners with Alignment
Reasoning is a cognitive process of using evidence to reach a sound
conclusion. The reasoning capability is essential for large language models
(LLMs) to serve as the brain of the artificial general intelligence agent.
Recent studies reveal that fine-tuning LLMs on data with the chain of thought
(COT) reasoning process can significantly enhance their reasoning capabilities.
However, we find that the fine-tuned LLMs suffer from an Assessment
Misalignment problem, i.e., they frequently assign higher scores to subpar
COTs, leading to potential limitations in their reasoning abilities. To address
this problem, we introduce an Alignment Fine-Tuning (AFT) paradigm,
which involves three steps: 1) fine-tuning LLMs with COT training data; 2)
generating multiple COT responses for each question, and categorizing them into
positive and negative ones based on whether they achieve the correct answer; 3)
calibrating the scores of positive and negative responses given by LLMs with a
novel constraint alignment loss. Specifically, the constraint alignment loss
has two objectives: a) Alignment, which guarantees that positive scores surpass
negative scores to encourage answers with high-quality COTs; b) Constraint,
which keeps the negative scores confined to a reasonable range to prevent
model degradation. Beyond just the binary positive and negative feedback, the
constraint alignment loss can be seamlessly adapted to ranking situations
when ranking feedback is accessible. Furthermore, we also delve deeply into
recent ranking-based alignment methods, such as DPO, RRHF, and PRO, and
discover that the constraint, which has been overlooked by these approaches, is
also crucial for their performance. Extensive experiments on four reasoning
benchmarks with both binary and ranking feedback demonstrate the effectiveness
of AFT. Comment: Large Language Models; Reasoning; Alignment
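
The two objectives of the constraint alignment loss described in the abstract above (positive scores should surpass negative scores, and negative scores should stay within a reasonable range) can be illustrated with a minimal Python sketch. The functional form below, a log-sum-exp over pairwise margins plus a lower-bound penalty, is an assumption chosen for illustration; the abstract does not give the formula, so this should not be read as the paper's exact loss. The function name, the `lower_bound` parameter, and the use of plain floats as scores are all hypothetical.

```python
import math
from typing import Sequence


def constraint_alignment_loss(pos_scores: Sequence[float],
                              neg_scores: Sequence[float],
                              lower_bound: float) -> float:
    """Illustrative loss with an alignment term and a constraint term.

    Alignment: exp(n - p) grows when a negative score n exceeds a positive score p.
    Constraint: exp(lower_bound - n) grows when n is pushed below lower_bound,
    which discourages driving negative scores arbitrarily low.
    """
    total = 0.0
    for n in neg_scores:
        for p in pos_scores:
            total += math.exp(n - p)           # alignment term
        total += math.exp(lower_bound - n)     # constraint term
    return math.log1p(total)


if __name__ == "__main__":
    # Well ordered, negative score inside the allowed range: small loss.
    print(constraint_alignment_loss([0.0], [-1.0], lower_bound=-2.0))
    # Misordered (negative above positive): larger loss.
    print(constraint_alignment_loss([-1.0], [0.0], lower_bound=-2.0))
    # Negative pushed far below the bound: larger loss despite correct order.
    print(constraint_alignment_loss([0.0], [-10.0], lower_bound=-2.0))
```

The third call shows why an alignment term alone is not enough: the ordering is correct, yet the loss is high because the negative score has drifted far below the bound.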
STEMI outcomes in Guangzhou and Hong Kong: two-centre retrospective interregional study
Background and objectives: Healthcare systems are organized very differently in Hong Kong (HK) and Guangzhou (GZ). This study compared emergency department (ED) management and one-year mortality of ST-segment elevation myocardial infarction (STEMI) patients in two teaching hospitals in Guangzhou and Hong Kong.
Methods: A retrospective observational study of STEMI mortality and treatment in the Prince of Wales Hospital (PWH) and the Second Affiliated Hospital of Guangzhou Medical University (AHGZMU) was conducted between January and December 2010. The primary outcome was one-year all-cause mortality.
Results: Univariate analysis of 76 cases from PWH and 111 cases from AHGZMU showed similar clinical characteristics, except for lower proportions of males (74% vs 92%, P = 0.002), hyperlipidemia (5% vs 25%, P67 years) and hyperglycemia (>10 mmol/L). Age over 65 years, presence of anterior wall infarct, body weight ≤65 kg, SBP 10 mmol/L were the independent predictors of in-hospital MACE.
Conclusion: There was no statistically significant difference between the standardized one-year all-cause mortality of STEMI patients in the setting mainly using thrombolysis with shorter door-to-treatment time and that in the setting mainly using PCI with longer door-to-treatment time. Age over 67 years and glucose level over 10 mmol/L were the independent predictors of one-year mortality. Older age, presence of anterior wall infarct, lower body weight, lower SBP at ED and hyperglycemia were the independent predictors of in-hospital MACE.
Large Language Models are not Fair Evaluators
We uncover a systematic bias in the evaluation paradigm of adopting large
language models (LLMs), e.g., GPT-4, as a referee to score the quality of
responses generated by candidate models. We find that the quality ranking of
candidate responses can be easily hacked by simply altering their order of
appearance in the context. This manipulation allows us to skew the evaluation
result, making one model appear considerably superior to the other, e.g.,
Vicuna could beat ChatGPT on 66 of 80 tested queries. To address this issue,
we propose two simple yet effective calibration strategies: 1) Multiple
Evidence Calibration, which requires the evaluator model to generate multiple
detailed pieces of evidence before assigning ratings; 2) Balanced Position
Calibration, which aggregates results across various orders to determine the
final score. Extensive experiments demonstrate that our approach successfully
mitigates evaluation bias, resulting in closer alignment with human judgments.
To facilitate future research on more robust large language model comparison,
we integrate the techniques in the paper into an easy-to-use toolkit
FairEval, along with the human
annotations (https://github.com/i-Eval/FairEval). Comment: work in progress
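
The Balanced Position Calibration idea from the abstract above, aggregating an evaluator's scores across both presentation orders, can be sketched in a few lines of Python. The `Judge` callable signature, the function names, and the toy position-biased judge are hypothetical and exist only to make the sketch runnable; they are not the FairEval toolkit's API, and simple averaging is assumed as the aggregation rule.

```python
from typing import Callable, Tuple

# Hypothetical evaluator interface: (question, first_response, second_response)
# -> (score for the first-shown response, score for the second-shown response).
Judge = Callable[[str, str, str], Tuple[float, float]]


def balanced_position_scores(judge: Judge, question: str,
                             resp_a: str, resp_b: str) -> Tuple[float, float]:
    """Score both orderings and average, so that position bias cancels out."""
    a_first, b_second = judge(question, resp_a, resp_b)
    b_first, a_second = judge(question, resp_b, resp_a)
    return (a_first + a_second) / 2, (b_first + b_second) / 2


def toy_biased_judge(question: str, first: str, second: str) -> Tuple[float, float]:
    """Toy stand-in for an LLM referee: adds +1 to whichever response is shown first."""
    quality = {"A": 7.0, "B": 7.5}     # made-up underlying quality values
    return quality[first] + 1.0, quality[second]


if __name__ == "__main__":
    print(toy_biased_judge("q", "A", "B"))   # A shown first: A appears to win
    print(toy_biased_judge("q", "B", "A"))   # B shown first: B appears to win
    print(balanced_position_scores(toy_biased_judge, "q", "A", "B"))
```

With the toy judge, the single-order verdict flips with the order of appearance, while the averaged scores rank the two responses the same way regardless of which was shown first.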
Comparison of outcomes in emergency department patients with suspected cardiac chest pain: two-centre prospective observational study in Southern China
Background
Hong Kong (HK) and Guangzhou (GZ) are cities in China with different healthcare systems. This study aimed to compare 30-day and 6-month mortality and characteristics of patients with suspected cardiac chest pain admitted to two emergency departments (ED) in HK and GZ.
Methods
A prospective observational study enrolled patients with suspected cardiac chest pain presenting to EDs in the Prince of Wales Hospital (PWH), HK and the Second Affiliated Hospital of Guangzhou Medical University (AHGZMU), GZ. The primary outcome was 30-day and 6-month mortality.
Results
In total, 996 patients were recruited, 407 cases from GZ and 589 cases from HK. The 30-day and 6-month mortality of chest pain patients were 3.7% and 4.7% in GZ and 0.3% and 1.9% in HK, respectively. Serum creatinine level (Cr) was an independent factor for 30-day mortality whilst Cr and systolic blood pressure (SBP) were independent factors for 6-month mortality. In Cox regression analysis, unadjusted and adjusted hazard ratios for 30-day and 6-month mortality in GZ were significantly increased.
Conclusion
The 30-day and 6-month mortality of patients with suspected cardiac chest pain in Guangzhou were higher than in Hong Kong due to different baseline clinical characteristics of patients and different distributions of diagnoses, which were associated with different healthcare systems. Serum creatinine and SBP were independent factors for 30-day and 6-month mortality.
Derivation of a prediction rule for unfavorable outcome after ischemic stroke in the Chinese population
Background
Efficient assessment of patients after ischemic stroke helps doctors choose appropriate treatment. Our study aimed to develop a new prognostic model for predicting outcomes 3 months after ischemic stroke in the Chinese population.
Methods
A prospective observational cohort study of ischemic stroke patients presenting to the Emergency Department of the Second Affiliated Hospital of Guangzhou Medical University was conducted from May 2012 to June 2013. Demographic data, NIHSS assessments and laboratory results of ischemic stroke patients were collected. Based on the 3-month modified Rankin Scale (mRS), ischemic stroke patients were divided into favorable (mRS: 0-2) or unfavorable (mRS: 3-6) outcome groups. The variables closely associated with prognosis of ischemic stroke were selected to develop the new prognostic model (NAAP), which consisted of 4 parameters: NIHSS, age, atrial fibrillation, and prealbumin. The prognostic value of the modified prognostic model was then compared with NIHSS alone.
Results
A total of 454 patients with suspected stroke were recruited. One hundred eighty-six patients with ischemic stroke were included in the final analysis. A new prognostic model, NAAP, was developed. The area under curve (AUC) of NAAP was .861 (95% confidence interval: .803-.907), whilst the AUC of NIHSS was .783 (95% CI: .717-.840), (P = .0048). Decision curve analysis showed that NAAP had a higher net benefit for threshold probabilities of 65% for predictive risk of poor outcomes.
Conclusions
The modified prognostic model, NAAP, may be a better prognostic tool than NIHSS alone for predicting 3-month unfavorable outcomes after ischemic stroke.
The Chinese pine genome and methylome unveil key features of conifer evolution
Conifers dominate the world's forest ecosystems and are the most widely planted tree species. Their giant and complex genomes present great challenges for assembling a complete reference genome for evolutionary and genomic studies. We present a 25.4-Gb chromosome-level assembly of Chinese pine (Pinus tabuliformis) and reveal that its genome size is mostly attributable to huge intergenic regions and long introns with high transposable element (TE) content. Large genes with long introns exhibited higher expression levels. Despite a lack of recent whole-genome duplication, 91.2% of genes were duplicated through dispersed duplication, and expanded gene families are mainly related to stress responses, which may underpin conifers' adaptation, particularly in cold and/or arid conditions. The reproductive regulation network is distinct compared with angiosperms. Slow removal of TEs with high-level methylation may have contributed to genomic expansion. This study provides insights into conifer evolution and resources for advancing research on conifer adaptation and development.