124 research outputs found
ScienceBenchmark: A Complex Real-World Benchmark for Evaluating Natural Language to SQL Systems
Natural Language to SQL systems (NL-to-SQL) have recently shown a significant
increase in accuracy for natural language to SQL query translation. This
improvement is due to the emergence of transformer-based language models, and
the popularity of the Spider benchmark - the de-facto standard for evaluating
NL-to-SQL systems. The top NL-to-SQL systems reach accuracies of up to 85%.
However, Spider mainly contains simple databases with few tables, columns, and
entries, which does not reflect a realistic setting. Moreover, complex
real-world databases with domain-specific content have little to no training
data available in the form of NL/SQL-pairs leading to poor performance of
existing NL-to-SQL systems.
In this paper, we introduce ScienceBenchmark, a new complex NL-to-SQL
benchmark for three real-world, highly domain-specific databases. For this new
benchmark, SQL experts and domain experts created high-quality NL/SQL-pairs for
each domain. To garner more data, we extended the small amount of
human-generated data with synthetic data generated using GPT-3. We show that
our benchmark is highly challenging, as the top performing systems on Spider
achieve a very low performance on our benchmark. Thus, the challenge is
many-fold: creating NL-to-SQL systems for highly complex domains with a small
amount of hand-made training data augmented with synthetic data. To our
knowledge, ScienceBenchmark is the first NL-to-SQL benchmark designed with
complex real-world scientific databases, containing challenging training and
test data carefully validated by domain experts. Comment: 12 pages, 2 figures, 5 tables
Translating Natural Language to SQL using Deep Learning
Databases contain a vast amount of data, used to support a wide range of activities, from business operations and scientific experiments to activities in our everyday lives. However, they remain inaccessible to non-technical users without knowledge of Structured Query Language (SQL). Natural language interfaces to databases remove these obstacles for such users and have recently flourished. In this thesis, we start by presenting the NL2SQL problem (translating natural language to Structured Query Language), its most important aspects, and the anatomy of an NL2SQL system. We compare several systems and examine how each of them chooses to tackle the problem. In the main part of this work, we focus on SQLNet, a system that uses deep learning methods to tackle the NL2SQL problem. We also test our own implementation of the system, investigate possible improvements, and evaluate how well it works in various cases
oxLDL Downregulates the Dendritic Cell Homing Factors CCR7 and CCL21
Introduction. Dendritic cells (DCs) and oxLDL play an important role in the atherosclerotic process with DCs accumulating in the plaques during plaque progression. Our aim was to investigate the role of oxLDL in the modulation of the DC homing-receptor CCR7 and endothelial-ligand CCL21. Methods and Results. The expression of the DC homing-receptor CCR7 and its endothelial-ligand CCL21 was examined on atherosclerotic carotid plaques of 47 patients via qRT-PCR and immunofluorescence. In vitro, we studied the expression of CCR7 on DCs and CCL21 on human microvascular endothelial cells (HMECs) in response to oxLDL. CCL21- and CCR7-mRNA levels were significantly downregulated in atherosclerotic plaques versus non-atherosclerotic controls [90% for CCL21 and 81% for CCR7 (P < 0.01)]. In vitro, oxLDL reduced CCR7 mRNA levels on DCs by 30% and protein levels by 46%. Furthermore, mRNA expression of CCL21 was significantly reduced by 50% (P < 0.05) and protein expression by 24% in HMECs by oxLDL (P < 0.05). Conclusions. The accumulation of DCs in atherosclerotic plaques appears to be related to a downregulation of chemokines and their ligands, which are known to regulate DC migration. oxLDL induces an in vitro downregulation of CCR7 and CCL21, which may play a role in the reduction of DC migration from the plaques
High resolution carotid black-blood 3T MR with parallel imaging and dedicated 4-channel surface coils
Background: Most of the carotid plaque MR studies have been performed using black-blood protocols at 1.5 T without parallel imaging techniques. The purpose of this study was to evaluate a multi-sequence, black-blood MR protocol using parallel imaging and a dedicated 4-channel surface coil for vessel wall imaging of the carotid arteries at 3 T. Materials and methods: 14 healthy volunteers and 14 patients with intimal thickening as proven by duplex ultrasound had their carotid arteries imaged at 3 T using a multi-sequence protocol (time-of-flight MR angiography, pre-contrast T1w-, PDw- and T2w sequences in the volunteers, additional post-contrast T1w- and dynamic contrast enhanced sequences in patients). To assess intrascan reproducibility, 10 volunteers were scanned twice within 2 weeks. Results: Intrascan reproducibility for quantitative measurements of lumen, wall and outer wall areas was excellent with Intraclass Correlation Coefficients >0.98 and measurement errors of 1.5%, 4.5% and 1.9%, respectively. Patients had larger wall areas than volunteers in both common carotid and internal carotid arteries and smaller lumen areas in internal carotid arteries (p < 0.001). Positive correlations were found between wall area and cardiovascular risk factors such as age, hypertension, coronary heart disease and hypercholesterolemia (Spearman's r = 0.45-0.76, p < 0.05). No significant correlations were found between wall area and body mass index, gender, diabetes or a family history of cardiovascular disease. Conclusion: The findings of this study indicate that high resolution carotid black-blood 3 T MR with parallel imaging is a fast, reproducible and robust method to assess carotid atherosclerotic plaque in vivo and this method is ready to be used in clinical practice
ScienceBenchmark : a complex real-world benchmark for evaluating natural language to SQL systems
Natural Language to SQL systems (NL-to-SQL) have recently shown improved accuracy (exceeding 80%) for natural language to SQL query translation due to the emergence of transformer-based language models, and the popularity of the Spider benchmark. However, Spider mainly contains simple databases with few tables, columns, and entries, which do not reflect a realistic setting. Moreover, complex real-world databases with domain-specific content have little to no training data available in the form of NL/SQL-pairs leading to poor performance of existing NL-to-SQL systems.
In this paper, we introduce ScienceBenchmark, a new complex NL-to-SQL benchmark for three real-world, highly domain-specific databases. For this new benchmark, SQL experts and domain experts created high-quality NL/SQL-pairs for each domain. To garner more data, we extended the small amount of human-generated data with synthetic data generated using GPT-3. We show that our benchmark is highly challenging, as the top performing systems on Spider achieve a very low performance on our benchmark. Thus, the challenge is many-fold: creating NL-to-SQL systems for highly complex domains with a small amount of hand-made training data augmented with synthetic data. To our knowledge, ScienceBenchmark is the first NL-to-SQL benchmark designed with complex real-world scientific databases, containing challenging training and test data carefully validated by domain experts
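As a rough illustration of how NL-to-SQL output is often scored, the sketch below compares the rows returned by a predicted query with those returned by a reference query on an in-memory SQLite database. The table, its rows, and the function name `execution_match` are hypothetical and are not taken from ScienceBenchmark; the abstract does not state which metric the authors used.

```python
import sqlite3
from collections import Counter


def execution_match(conn, predicted_sql, gold_sql):
    """Return True if both queries yield the same multiset of rows.

    Assumption: row order is ignored, which is only appropriate when the
    reference query has no ORDER BY clause.
    """
    try:
        predicted_rows = conn.execute(predicted_sql).fetchall()
    except sqlite3.Error:
        # A prediction that fails to execute counts as a miss.
        return False
    gold_rows = conn.execute(gold_sql).fetchall()
    return Counter(predicted_rows) == Counter(gold_rows)


# Hypothetical toy schema and data, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE projects (id INTEGER, title TEXT, year INTEGER)")
conn.executemany(
    "INSERT INTO projects VALUES (?, ?, ?)",
    [(1, "alpha", 2019), (2, "beta", 2021), (3, "gamma", 2021)],
)

gold = "SELECT title FROM projects WHERE year = 2021"
# Different SQL text, same rows: counted as a match.
same_result = execution_match(conn, "SELECT title FROM projects WHERE year > 2020", gold)
# Valid SQL, different rows: counted as a miss.
different_result = execution_match(conn, "SELECT title FROM projects WHERE year = 2019", gold)
# Misspelled column name: the query fails and is counted as a miss.
broken_query = execution_match(conn, "SELECT titel FROM projects", gold)
```

Comparing executed results rather than query strings lets two differently written but equivalent queries both count as correct, at the cost of occasionally accepting a wrong query that happens to return the same rows on the test database.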
Non-Contrast-Enhanced MR Angiography at 3 Tesla in Patients with Advanced Peripheral Arterial Occlusive Disease
Purpose: The aim of this study was to assess the diagnostic performance of ECG-gated non-contrast-enhanced quiescent interval single-shot (QISS) magnetic resonance angiography at a magnetic field strength of 3 Tesla in patients with advanced peripheral arterial occlusive disease (PAOD). Method and Materials: A total of 21 consecutive patients with advanced PAOD (Fontaine stage IIb and higher) referred for peripheral magnetic resonance angiography (MRA) were included. Imaging was performed on a 3 T whole body MR. Image quality and stenosis diameter were evaluated in comparison to contrast-enhanced continuous table and TWIST MRA (CE-MRA) as standard of reference. QISS images were acquired with a thickness of 1.5 mm each (high-resolution QISS, HR-QISS). Two blinded readers rated the image quality and the degree of stenosis for both HR-QISS and CE-MRA in 26 predefined arterial vessel segments on 5-point Likert scales. Results: With CE-MRA as the reference standard, HR-QISS showed high sensitivity (94.1%), specificity (97.8%), positive (95.1%), and negative predictive value (97.2%) for the detection of significant (>= 50%) stenosis. Interreader agreement for stenosis assessment of both HR-QISS and CE-MRA was excellent (kappa-values of 0.951 and 0.962, respectively). As compared to CE-MRA, image quality of HR-QISS was significantly lower for the distal aorta, the femoral and iliac arteries (each with p < 0.01), while no significant difference was found in the popliteal (p = 0.09) and lower leg arteries (p = 0.78). Conclusion: Non-enhanced ECG-gated HR-QISS performs very well in subjects with severe PAOD and is a good alternative for patients with a high risk of nephrogenic systemic fibrosis
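For readers less familiar with the four diagnostic measures reported above, the following minimal sketch shows how each is derived from confusion-matrix counts. The function name `diagnostic_metrics` and the counts passed to it are hypothetical; they are not the segment counts from the study, which the abstract does not report.

```python
def diagnostic_metrics(tp, fp, tn, fn):
    """Compute sensitivity, specificity, PPV and NPV from confusion-matrix counts."""
    return {
        # Share of truly diseased segments that the test flags.
        "sensitivity": tp / (tp + fn),
        # Share of truly healthy segments that the test clears.
        "specificity": tn / (tn + fp),
        # Share of positive test results that are correct.
        "ppv": tp / (tp + fp),
        # Share of negative test results that are correct.
        "npv": tn / (tn + fn),
    }


# Hypothetical counts, chosen only to make the arithmetic easy to follow.
metrics = diagnostic_metrics(tp=45, fp=5, tn=95, fn=5)
```

Sensitivity and specificity describe the test against the reference standard, whereas the predictive values also depend on how common significant stenosis is in the examined segments.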
Relationship between H.Pylori infection and clinicopathological features and prognosis of gastric cancer
Background: This study aimed to assess the relationship between H. pylori and the clinicopathological features and prognosis of gastric cancer by quantitative detection of H. pylori. Methods: 157 patients were enrolled, all with recorded clinicopathological parameters. Tumor and non-neoplastic tissue specimens were tested for H. pylori by real-time PCR, and clinical data were analyzed retrospectively. Variables independently affecting prognosis were investigated by multivariate analysis using the Cox proportional hazards model. Results: H. pylori infection was greater in non-neoplastic tissue than in tumor tissue (p < 0.05), and H. pylori infection and copy number were related to tumor site and N staging (p < 0.05). Overall survival (OS) in all 157 patients showed no correlation with H. pylori infection status (p = 0.715). Among patients who underwent curative surgery, relapse-free survival (RFS) showed no correlation with H. pylori infection status (p = 0.639). Among H. pylori-positive patients, OS and RFS were longer in those with higher copy numbers than in those with low copy numbers, but the difference was not statistically significant. Conclusions: H. pylori infection status and copy number were related to N staging. OS and RFS in H. pylori-positive patients did not differ significantly from those in H. pylori-negative patients
- …