Adding Logical Operators to Tree Pattern Queries on Graph-Structured Data
As data are increasingly modeled as graphs for expressing complex
relationships, the tree pattern query on graph-structured data becomes an
important type of query in real-world applications. Most practical query
languages, such as XQuery and SPARQL, support logical expressions using
logical-AND/OR/NOT operators to define structural constraints of tree patterns.
In this paper, (1) we propose generalized tree pattern queries (GTPQs) over
graph-structured data, which fully support propositional logic of structural
constraints. (2) We make a thorough study of fundamental problems including
satisfiability, containment and minimization, and analyze the computational
complexity and the decision procedures of these problems. (3) We propose a
compact graph representation of intermediate results and a pruning approach to
reduce the size of intermediate results and the number of join operations --
two factors that often impair the efficiency of traditional algorithms for
evaluating tree pattern queries. (4) We present an efficient algorithm for
evaluating GTPQs using 3-hop as the underlying reachability index. (5)
Experiments on both real-life and synthetic data sets demonstrate the
effectiveness and efficiency of our algorithm, from several times to orders of
magnitude faster than state-of-the-art algorithms in terms of evaluation time,
even for traditional tree pattern queries with only conjunctive operations. Comment: 16 pages
Effective Natural Language Processing Algorithms for Early Alerts of Gout Flares from Chief Complaints
Early identification of acute gout is crucial, enabling healthcare professionals to implement targeted interventions for rapid pain relief and preventing disease progression, ensuring improved long-term joint function. In this study, we comprehensively explored the potential for early detection of gout flares (GFs) based on nurses’ chief complaint notes in the Emergency Department (ED). Addressing the challenge of identifying GFs prospectively during an ED visit, where documentation is typically minimal, our research focused on employing alternative Natural Language Processing (NLP) techniques to enhance detection accuracy. We investigated GF detection algorithms using both sparse representations by traditional NLP methods and dense encodings by medical domain-specific Large Language Models (LLMs), distinguishing between generative and discriminative models. Three methods were used to alleviate the issue of severe data imbalance: oversampling, class weights, and focal loss. Extensive empirical studies were performed on the Gout Emergency Department Chief Complaint Corpora. Sparse text representations like tf-idf produced strong performance, achieving F1 scores higher than 0.75. The best deep learning models were RoBERTa-large-PM-M3-Voc and BioGPT, which achieved the best F1 scores on each dataset: 0.8 on the 2019 dataset and 0.85 on the 2020 dataset, respectively. We concluded that although discriminative LLMs performed better for this classification task when compared to generative LLMs, a combination of using generative models as feature extractors and employing a support vector machine for classification yielded promising results comparable to those obtained with discriminative models.
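As an illustration of one of the imbalance remedies named in this abstract, the following is a minimal sketch of binary focal loss in plain Python. It is not the authors' implementation: the function name `binary_focal_loss` is hypothetical, and the defaults `gamma=2.0` and `alpha=0.25` are commonly used values, not settings reported in the abstract.

```python
import math


def binary_focal_loss(p, y, gamma=2.0, alpha=0.25):
    """Focal loss for a single example.

    p     -- predicted probability of the positive class (0..1)
    y     -- true label, 1 (positive) or 0 (negative)
    gamma -- focusing parameter (assumed default; gamma=0 disables focusing)
    alpha -- weight for the positive class (assumed default)
    """
    eps = 1e-12
    p = min(max(p, eps), 1.0 - eps)       # clamp to avoid log(0)
    p_t = p if y == 1 else 1.0 - p        # probability assigned to the true class
    alpha_t = alpha if y == 1 else 1.0 - alpha
    # The (1 - p_t)^gamma factor shrinks the loss of well-classified examples,
    # so hard (often minority-class) examples dominate the total.
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)


if __name__ == "__main__":
    easy = binary_focal_loss(0.9, 1)  # confident and correct
    hard = binary_focal_loss(0.1, 1)  # confident and wrong
    print(f"easy example loss: {easy:.6f}")
    print(f"hard example loss: {hard:.6f}")
```

With `gamma=0` the expression reduces to class-weighted cross-entropy, which is one way to see how focal loss relates to the class-weight approach also mentioned above.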
Effective Natural Language Processing Algorithms for Gout Flare Early Alert from Chief Complaints
In this study, we extend the exploration of gout flare detection initiated by Osborne, J. D. et al. through the utilization of their dataset of Emergency Department (ED) triage nurse chief complaint notes. Addressing the challenge of identifying gout flares prospectively during an ED visit, where documentation is typically minimal, our research focuses on employing alternative Natural Language Processing (NLP) techniques to enhance the detection accuracy. This study investigates the application of medical domain-specific Large Language Models (LLMs), distinguishing between generative and discriminative models. Models such as BioGPT, RoBERTa-large-PubMed-M3, and BioElectra were implemented to compare their efficacy with the original implementation by Osborne, J. D. et al. The best model was RoBERTa-large-PM-M3 with a 0.8 F1 Score on the Gout-CC-2019 dataset followed by BioElectra with 0.76 F1 Score. We concluded that discriminative LLMs performed better for this classification task compared to generative LLMs. However, a combination of using generative models as feature extractors and employing SVM for the classification of embeddings yielded promising results comparable to those obtained with discriminative models. Nevertheless, all our implementations surpassed the results obtained in the original publication.
Distinct miRNAs associated with various clinical presentations of SARS-CoV-2 infection.
MicroRNAs (miRNAs) have been shown to play important roles in viral infections, but their associations with SARS-CoV-2 infection remain poorly understood. Here, we detected 85 differentially expressed miRNAs (DE-miRNAs) from 2,336 known and 361 novel miRNAs that were identified in 233 plasma samples from 61 healthy controls and 116 patients with COVID-19 using high-throughput sequencing and computational analysis. These DE-miRNAs were associated with SARS-CoV-2 infection, disease severity, and viral persistence in the patients with COVID-19, respectively. Gene ontology and KEGG pathway analyses of the DE-miRNAs revealed their connections to viral infections, immune responses, and lung diseases. Finally, we established a machine learning model using the DE-miRNAs between various groups for classification of COVID-19 cases with different clinical presentations. Our findings may help understand the contribution of miRNAs to the pathogenesis of COVID-19 and identify potential biomarkers and molecular targets for diagnosis and treatment of SARS-CoV-2 infection.