284 research outputs found
Model of a fluid at small and large length scales and the hydrophobic effect
We present a statistical field theory to describe large length scale effects
induced by solutes in a cold and otherwise placid liquid. The theory divides
space into a cubic grid of cells. The side length of each cell is of the order
of the correlation length of the bulk liquid. Large length scale states of
the cells are specified with an Ising variable. Finer length scale effects are
described with a Gaussian field, with mean and variance affected by both the
large length scale field and by the constraints imposed by solutes. In the
absence of solutes and corresponding constraints, integration over the Gaussian
field yields an effective lattice gas Hamiltonian for the large length scale
field. In the presence of solutes, the integration adds additional terms to
this Hamiltonian. We identify these terms analytically. They can provoke large
length scale effects, such as the formation of interfaces and depletion layers.
We apply our theory to compute the reversible work to form a bubble in liquid
water, as a function of the bubble radius. Comparison with molecular simulation
results for the same function indicates that the theory is reasonably accurate.
Importantly, simulating the large length scale field involves binary arithmetic
only. It thus provides a computationally convenient scheme to incorporate
explicit solvent dynamics and structure in simulation studies of large
molecular assemblies.
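The coarse-grained part of such a model can be illustrated with a plain nearest-neighbour lattice gas sampled by Metropolis moves. The sketch below is not the Hamiltonian derived in the paper: the coupling `eps`, the chemical potential `mu`, the inverse temperature `beta`, and all function names are hypothetical, and the Gaussian-field and solute terms are left out entirely. It only shows that each cell carries a binary occupation number and that a move is a single flip.

```python
import math
import random


def make_lattice(size, fill=0):
    """Return a size x size x size nested list of occupation numbers (0 or 1)."""
    return [[[fill for _ in range(size)] for _ in range(size)] for _ in range(size)]


def total_energy(n, eps, mu):
    """H = -eps * sum over neighbour pairs of n_i n_j - mu * sum_i n_i.

    Periodic boundaries; assumes size >= 3 so each bond is counted once.
    """
    size = len(n)
    energy = 0.0
    for x in range(size):
        for y in range(size):
            for z in range(size):
                # Look only in the +x, +y, +z directions to avoid double counting.
                forward = (n[(x + 1) % size][y][z]
                           + n[x][(y + 1) % size][z]
                           + n[x][y][(z + 1) % size])
                energy += -eps * n[x][y][z] * forward - mu * n[x][y][z]
    return energy


def flip_energy_change(n, x, y, z, eps, mu):
    """Energy change caused by flipping the occupation of cell (x, y, z)."""
    size = len(n)
    occupied = (n[(x + 1) % size][y][z] + n[(x - 1) % size][y][z]
                + n[x][(y + 1) % size][z] + n[x][(y - 1) % size][z]
                + n[x][y][(z + 1) % size] + n[x][y][(z - 1) % size])
    dn = 1 - 2 * n[x][y][z]  # +1 when filling an empty cell, -1 when emptying
    return -dn * (eps * occupied + mu)


def metropolis_sweep(n, eps, mu, beta, rng):
    """Attempt size**3 single-cell flips with the Metropolis acceptance rule."""
    size = len(n)
    for _ in range(size ** 3):
        x, y, z = (rng.randrange(size) for _ in range(3))
        delta = flip_energy_change(n, x, y, z, eps, mu)
        if delta <= 0.0 or rng.random() < math.exp(-beta * delta):
            n[x][y][z] = 1 - n[x][y][z]
```

With `mu = -3 * eps` the completely full and completely empty lattices have the same energy in this toy model, which makes that value a convenient starting point for experimenting with interfaces.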
Metasql: A Generate-then-Rank Framework for Natural Language to SQL Translation
The Natural Language Interface to Databases (NLIDB) empowers non-technical
users with database access through intuitive natural language (NL)
interactions. Advanced approaches, utilizing neural sequence-to-sequence models
or large-scale language models, typically employ auto-regressive decoding to
generate unique SQL queries sequentially. While these translation models have
greatly improved the overall translation accuracy, surpassing 70% on NLIDB
benchmarks, the use of auto-regressive decoding to generate single SQL queries
may result in sub-optimal outputs, potentially leading to erroneous
translations. In this paper, we propose Metasql, a unified generate-then-rank
framework that can be flexibly incorporated with existing NLIDBs to
consistently improve their translation accuracy. Metasql introduces query
metadata to control the generation of better SQL query candidates and uses
learning-to-rank algorithms to retrieve globally optimized queries.
Specifically, Metasql first breaks down the meaning of the given NL query into
a set of possible query metadata, representing the basic concepts of the
semantics. These metadata are then used as language constraints to steer the
underlying translation model toward generating a set of candidate SQL queries.
Finally, Metasql ranks the candidates to identify the best matching one for the
given NL query. Extensive experiments are performed to study Metasql on two
public NLIDB benchmarks. The results show that the performance of the
translation models can be effectively improved using Metasql.
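The generate-then-rank control flow described above can be sketched in a few lines. Everything here is a placeholder rather than Metasql's actual components: `toy_generate` stands in for a metadata-conditioned translation model, `toy_score` stands in for a learned ranker, and the metadata labels, table name, and question are invented for illustration.

```python
def generate_then_rank(question, metadata_options, generate, score):
    """Collect candidates for every metadata option, then sort by ranker score."""
    candidates = []
    for metadata in metadata_options:
        for sql in generate(question, metadata):
            if sql not in candidates:  # drop duplicate candidates
                candidates.append(sql)
    # sorted() is stable, so equally scored candidates keep generation order.
    return sorted(candidates, key=lambda sql: score(question, sql), reverse=True)


def toy_generate(question, metadata):
    """Hypothetical stand-in for a translation model steered by one metadata label."""
    templates = {
        "project": ["SELECT name FROM singer"],
        "where": ["SELECT name FROM singer WHERE age > 30"],
        "order": ["SELECT name FROM singer ORDER BY age DESC"],
    }
    return templates.get(metadata, [])


def toy_score(question, sql):
    """Hypothetical stand-in for a ranker: number of shared lowercase tokens."""
    question_tokens = set(question.lower().split())
    sql_tokens = set(sql.lower().split())
    return len(question_tokens & sql_tokens)


ranked = generate_then_rank(
    "name of each singer older than 30",
    ["project", "where", "order"],
    toy_generate,
    toy_score,
)
best = ranked[0]
```

Separating `generate` from `score` keeps the ranking step independent of the model that produced the candidates, which is the property that lets such a framework wrap an existing translator.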
Aniline hydrogenolysis on nickel: effects of surface hydrogen and surface structure
Fluorescence yield near-edge spectroscopy (FYNES) above the carbon K edge and temperature programmed reaction spectroscopy (TPRS) have been used as the methods for characterizing the reactivity and structure of adsorbed aniline and aniline derived species on the Ni(100) and Ni(111) surfaces over an extended range of temperatures and hydrogen pressures. The Ni(100) surface shows appreciably higher hydrogenolysis activity towards adsorbed aniline than the Ni(111) surface. Hydrogenolysis of aniline on the Ni(100) surface results in benzene formation at 470 K, both in reactive hydrogen atmospheres and in vacuum. External hydrogen significantly enhances the hydrogenolysis activity for aniline on the Ni(100) surface. Based on spectroscopic evidence, we believe that the dominant aniline hydrogenolysis reaction is preceded by partial hydrogenation of the aromatic ring of aniline in the presence of 0.001 Torr of external hydrogen on the (100) surface. In contrast, very little adsorbed aniline undergoes hydrogen induced C-N bond activation on the Ni(111) surface for hydrogen pressures as high as 10⁻⁷ Torr below 500 K. Thermal dehydrogenation of aniline dominates with increasing temperature on the Ni(111) surface, resulting in the formation of a previously observed polymeric layer which is stable up to 820 K. Aniline is adsorbed at a smaller angle relative to the Ni(111) surface than the Ni(100) surface at temperatures below the hydrogenolysis temperature. We believe that the proximity and strong π-interaction between the aromatic ring of the aniline and the surface is one major factor which controls the competition between dehydrogenation and hydrogen addition. In this case the result is a substantial enhancement of aniline dehydrogenation relative to hydrogenation on the Ni(111) surface.
Peer Reviewed
http://deepblue.lib.umich.edu/bitstream/2027.42/44251/1/10562_2004_Article_BF00806886.pd
PURPLE: Making a Large Language Model a Better SQL Writer
Large Language Model (LLM) techniques play an increasingly important role in
Natural Language to SQL (NL2SQL) translation. LLMs trained by extensive corpora
have strong natural language understanding and basic SQL generation abilities
without additional tuning specific to NL2SQL tasks. Existing LLMs-based NL2SQL
approaches try to improve the translation by enhancing the LLMs with an
emphasis on user intention understanding. However, LLMs sometimes fail to
generate appropriate SQL due to their lack of knowledge in organizing complex
logical operator composition. A promising method is to input the LLMs with
demonstrations, which include known NL2SQL translations from various databases.
LLMs can learn to organize operator compositions from the input demonstrations
for the given task. In this paper, we propose PURPLE (Pre-trained models
Utilized to Retrieve Prompts for Logical Enhancement), which improves accuracy
by retrieving demonstrations containing the requisite logical operator
composition for the NL2SQL task on hand, thereby guiding LLMs to produce better
SQL translation. PURPLE achieves a new state-of-the-art performance of 80.5%
exact-set match accuracy and 87.8% execution match accuracy on the validation
set of the popular NL2SQL benchmark Spider. PURPLE maintains high accuracy
across diverse benchmarks, budgetary constraints, and various LLMs, showing
robustness and cost-effectiveness.
Comment: 12 pages, accepted by ICDE 2024 (40th IEEE International Conference
on Data Engineering)
Sarcoplasmic reticular Ca2+-ATPase inhibition paradoxically upregulates murine skeletal muscle Nav1.4 function
Abstract: Skeletal muscle Na+ channels possess Ca2+- and calmodulin-binding sites implicated in Nav1.4 current (INa) downregulation following ryanodine receptor (RyR1) activation produced by exchange protein directly activated by cyclic AMP or caffeine challenge, effects abrogated by the RyR1-antagonist dantrolene which itself increased INa. These findings were attributed to actions of consequently altered cytosolic Ca2+, [Ca2+]i, on Nav1.4. We extend the latter hypothesis employing cyclopiazonic acid (CPA) challenge, which similarly increases [Ca2+]i, but through contrastingly inhibiting sarcoplasmic reticular (SR) Ca2+-ATPase. Loose patch clamping determined Na+ current (INa) families in intact native murine gastrocnemius skeletal myocytes, minimising artefactual [Ca2+]i perturbations. A bespoke flow system permitted continuous INa comparisons through graded depolarizing steps in identical stable membrane patches before and following solution change. In contrast to the previous studies modifying RyR1 activity, and imposing control solution changes, CPA (0.1 and 1 µM) produced persistent increases in INa within 1–4 min of introduction. CPA pre-treatment additionally abrogated previously reported reductions in INa produced by 0.5 mM caffeine. Plots of peak current against voltage excursion demonstrated that 1 µM CPA increased maximum INa by ~30%. It only slightly decreased half-maximal activating voltages (V0.5) and steepness factors (k), by 2 mV and 0.7, in contrast to the V0.5 and k shifts reported with direct RyR1 modification. These paradoxical findings complement previously reported downregulatory effects on Nav1.4 of RyR1-agonist mediated increases in bulk cytosolic [Ca2+]. They implicate possible local tubule-sarcoplasmic triadic domains containing reduced [Ca2+]TSR in the observed upregulation of Nav1.4 function following CPA-induced SR Ca2+ depletion.
Diimide formation on the Ni(100) surface
Diimide (N2H2), an extremely reactive species, is observed as a gas phase product from the Ni(100) surface in the 200 to 450 K range during hydrazine thermal decomposition and during thermal desorption of predissociated ammonia. These results suggest that the primary mechanism for diimide formation is recombination of an adsorbed NH surface intermediate. The observation that diimide can be formed from predissociated ammonia illustrates that a nitrogen-nitrogen bond in the precursor is not required for diimide formation. Diimide formation from predissociated ammonia is enhanced by coadsorbed hydrogen, which we believe stabilizes NH on the Ni(100) surface. In addition, the direct decomposition of adsorbed N2H4 contributes to the production of diimide at 230 K.
Peer Reviewed
http://deepblue.lib.umich.edu/bitstream/2027.42/30742/1/0000392.pd
MUC1-associated proliferation signature predicts outcomes in lung adenocarcinoma patients
Background: MUC1 protein is highly expressed in lung cancer. The cytoplasmic domain of MUC1 (MUC1-CD) induces tumorigenesis and resistance to DNA-damaging agents. We characterized MUC1-CD-induced transcriptional changes and examined their significance in lung cancer patients. Methods: Using DNA microarrays, we identified 254 genes that were differentially expressed in cell lines transformed by MUC1-CD compared to control cell lines. We then examined expression of these genes in 441 lung adenocarcinomas from a publicly available database. We employed statistical analyses independent of clinical outcomes, including hierarchical clustering, Student's t-tests and receiver operating characteristic (ROC) analysis, to select a seven-gene MUC1-associated proliferation signature (MAPS). We demonstrated the prognostic value of MAPS in this database using Kaplan-Meier survival analysis, log-rank tests and Cox models. The MAPS was further validated for prognostic significance in 84 lung adenocarcinoma patients from an independent database. Results: MAPS genes were found to be associated with proliferation and cell cycle regulation and included CCNB1, CDC2, CDC20, CDKN3, MAD2L1, PRC1 and RRM2. MAPS expressors (MAPS+) had inferior survival compared to non-expressors (MAPS-). In the initial data set, 5-year survival was 65% (MAPS-) vs. 45% (MAPS+, p < 0.0001). Similarly, in the validation data set, 5-year survival was 57% (MAPS-) vs. 28% (MAPS+, p = 0.005). Conclusions: The MAPS signature, comprised of MUC1-CD-dependent genes involved in the control of cell cycle and proliferation, is associated with poor outcomes in patients with adenocarcinoma of the lung. These data provide potential new prognostic biomarkers and treatment targets for lung adenocarcinoma
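For readers unfamiliar with the survival figures quoted above, the Kaplan-Meier product-limit estimate can be computed as in the following minimal sketch. The function name and the four-patient input in the usage line are invented toy values, not data from the study, and the code does not reproduce the authors' analysis pipeline.

```python
def kaplan_meier(times, events):
    """Return [(time, survival)] at each time where at least one event occurred.

    times  : follow-up time for each subject
    events : 1 if the event (death) was observed at that time, 0 if censored
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        leaving = 0
        # Group all subjects that share the same follow-up time.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            leaving += 1
            i += 1
        if deaths > 0:
            survival *= 1.0 - deaths / at_risk
            curve.append((t, survival))
        # Both deaths and censored subjects leave the risk set after time t.
        at_risk -= leaving
    return curve


toy_curve = kaplan_meier([1, 2, 3, 4], [1, 0, 1, 1])
```

Censored subjects reduce the number at risk without lowering the curve, which is why the estimate differs from a simple fraction of survivors.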
Heading Down the Wrong Pathway: on the Influence of Correlation within Gene Sets
Background: Analysis of microarray experiments often involves testing for the overrepresentation of pre-defined sets of genes among lists of genes deemed individually significant. Most popular gene set testing methods assume the independence of genes within each set, an assumption that is seriously violated, as extensive correlation between genes is a well-documented phenomenon. Results: We conducted a meta-analysis of over 200 datasets from the Gene Expression Omnibus in order to demonstrate the practical impact of strong gene correlation patterns that are highly consistent across experiments. We show that a common independence assumption-based gene set testing procedure produces very high false positive rates when applied to data sets for which treatment groups have been randomized, and that gene sets with high internal correlation are more likely to be declared significant. A reanalysis of the same datasets using an array resampling approach properly controls false positive rates, leading to more parsimonious and high-confidence gene set findings, which should facilitate pathway-based interpretation of the microarray data. Conclusions: These findings call into question many of the gene set testing results in the literature and argue strongly for the adoption of resampling based gene set testing criteria in the peer reviewed biomedical literature.
Real-Time Imaging of HIF-1α Stabilization and Degradation
HIF-1α is overexpressed in many human cancers compared to normal tissues due to the interaction of a multiplicity of factors and pathways that reflect specific genetic alterations and extracellular stimuli. We developed two HIF-1α chimeric reporter systems, HIF-1α/FLuc and HIF-1α(ΔODDD)/FLuc, to investigate the tightly controlled level of HIF-1α protein in normal (NIH3T3 and HEK293) and glioma (U87) cells. These reporter systems provided an opportunity to investigate the degradation of HIF-1α in different cell lines, both in culture and in xenografts. Using immunofluorescence microscopy, we observed different patterns of subcellular localization of HIF-1α/FLuc fusion protein between normal cells and cancer cells; similar differences were observed for HIF-1α in non-transduced, wild-type cells. A dynamic cytoplasmic-nuclear exchange of the fusion protein and HIF-1α was observed in NIH3T3 and HEK293 cells under different conditions (normoxia, CoCl2 treatment and hypoxia). In contrast, U87 cells showed a more persistent nuclear localization pattern that was less affected by different growing conditions. Employing a kinetic model for protein degradation, we were able to distinguish two components of HIF-1α/FLuc protein degradation and quantify the half-life of HIF-1α fusion proteins. The rapid clearance component (t1/2 ~4–6 min) was abolished by the hypoxia-mimetic CoCl2, MG132 treatment and deletion of ODD domain, and reflects the oxygen/VHL-dependent degradation pathway. The slow clearance component (t1/2 ~200 min) is consistent with other unidentified non-oxygen/VHL-dependent degradation pathways. Overall, the continuous bioluminescence readout of HIF-1α/FLuc stabilization in vitro and in vivo will facilitate the development and validation of therapeutics that affect the stability and accumulation of HIF-1α.
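A two-component clearance of this kind is often written as a sum of two first-order decays. The sketch below assumes exactly that form, which the abstract does not spell out, so it should be read as an illustration and not as the authors' kinetic model. The function names and the amplitudes 0.7 and 0.3 are hypothetical; the half-lives of 5 and 200 min are picked from the ranges quoted above.

```python
import math


def rate_from_half_life(t_half):
    """First-order rate constant k = ln(2) / t_half."""
    return math.log(2.0) / t_half


def two_component_signal(t, a_fast, t_half_fast, a_slow, t_half_slow):
    """Signal remaining at time t for two independent first-order decays."""
    k_fast = rate_from_half_life(t_half_fast)
    k_slow = rate_from_half_life(t_half_slow)
    return a_fast * math.exp(-k_fast * t) + a_slow * math.exp(-k_slow * t)


# Hypothetical amplitudes; times in minutes.
remaining_after_30_min = two_component_signal(30.0, 0.7, 5.0, 0.3, 200.0)
```

After several fast half-lives the first term is negligible, so the late part of a measured curve constrains the slow half-life almost on its own.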