286 research outputs found

    Compositional Semantic Parsing with Large Language Models

    Humans can reason compositionally when presented with new tasks. Previous research shows that appropriate prompting techniques enable large language models (LLMs) to solve artificial compositional generalization tasks such as SCAN. In this work, we identify additional challenges in more realistic semantic parsing tasks with larger vocabulary and refine these prompting techniques to address them. Our best method is based on least-to-most prompting: it decomposes the problem using prompting-based syntactic parsing, then uses this decomposition to select appropriate exemplars and to sequentially generate the semantic parse. This method allows us to set a new state of the art for CFQ while requiring only 1% of the training data used by traditional approaches. Due to the general nature of our approach, we expect similar efforts will lead to new results in other tasks and domains, especially for knowledge-intensive applications.

    Large Language Models Can Be Easily Distracted by Irrelevant Context

    Large language models have achieved impressive performance on various natural language processing tasks. However, so far they have been evaluated primarily on benchmarks where all information in the input context is relevant for solving the task. In this work, we investigate the distractibility of large language models, i.e., how model problem-solving accuracy can be influenced by irrelevant context. In particular, we introduce Grade-School Math with Irrelevant Context (GSM-IC), an arithmetic reasoning dataset with irrelevant information in the problem description. We use this benchmark to measure the distractibility of cutting-edge prompting techniques for large language models, and find that model performance decreases dramatically when irrelevant information is included. We also identify several approaches for mitigating this deficiency, such as decoding with self-consistency and adding to the prompt an instruction that tells the language model to ignore the irrelevant information.
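
    The self-consistency decoding mentioned in this abstract can be illustrated with a minimal sketch: sample several answers to the same problem and keep the most frequent one. The function name and the sample answers below are hypothetical, and the step that samples answers from a language model is assumed to happen elsewhere; this is not the authors' implementation.

```python
from collections import Counter


def self_consistent_answer(sampled_answers):
    """Return the most frequent final answer among several sampled answers.

    `sampled_answers` is assumed to be a non-empty list of final answers
    (strings) extracted from independently sampled model outputs.
    """
    if not sampled_answers:
        raise ValueError("need at least one sampled answer")
    counts = Counter(sampled_answers)
    # most_common(1) returns a list holding the single (answer, count) pair
    # with the highest count.
    answer, _ = counts.most_common(1)[0]
    return answer


# Hypothetical answers sampled for one arithmetic problem.
samples = ["18", "18", "20", "18", "24"]
print(self_consistent_answer(samples))
```

    Majority voting is only useful when several samples are drawn with some randomness; with a single sample it reduces to ordinary decoding.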

    Capacity Planning of a Commodity Cluster in an Academic Environment: A Case Study

    In this paper, the design of a simulation model for evaluating two alternative supercomputer configurations in an academic environment is presented. The workload is analyzed and modeled, and its effect on the relative performance of both systems is studied. The Integrated Capacity Planning Environment (ICPE) toolkit, developed for commodity cluster capacity planning, is successfully applied to the target environment. The ICPE is a tool for workload modeling, simulation modeling, and what-if analysis. A new characterization strategy is applied to the workload to more accurately model commodity cluster workloads. Through what-if analysis, the sensitivity of baseline system performance to workload change and the relative performance of the two proposed alternative systems are compared and evaluated. This case study demonstrates the usefulness of the methodology and the applicability of the tools in gauging system capacity and making design decisions.

    Peripartum infections and associated maternal mortality in rural Malawi

    To assess associations between maternal mortality and severe morbidity and human immunodeficiency virus (HIV) infection, uptake of antiretroviral therapy, obstetric infections, and nonobstetric infections in a rural Malawian district, where the estimated HIV prevalence is 21%.

    Effects of an Oral Supplement Containing Calcium and Live Yeast on Circulating Calcium and Production Parameters Following I.V. Lipopolysaccharide Infusion in Dairy Cows

    Administering lipopolysaccharide (LPS) decreases circulating calcium (Ca) and markedly reduces both feed intake and milk yield in lactating cows. Calcium is involved in immune system activation and live yeast can increase feed intake. Whether supplemental live yeast and Ca benefit immune-challenged cows remains unclear. Therefore, study objectives were to evaluate whether providing an oral supplement containing soluble Ca, live yeast and other micronutrients would ameliorate LPS-induced hypocalcemia and improve production parameters in lactating dairy cows. Providing an oral supplement containing Ca and live yeast prior to and following LPS administration markedly ameliorated the LPS-induced hypocalcemia and improved dry matter intake (DMI) and milk yield. Overall, utilizing an oral supplement may be a valuable management strategy to improve animal welfare and productivity during and following immunoactivation. Additionally, infusing i.v. LPS appears to be an effective technique to model hypocalcemia and to evaluate dietary strategies aimed at increasing circulating calcium in lactating dairy cows.

    Functional Characteristics of the Gut Microbiome in C57BL/6 Mice Differentially Susceptible to Plasmodium yoelii

    C57BL/6 mice are widely used for in vivo studies of immune function and metabolism in mammals. In a previous study, it was observed that when C57BL/6 mice purchased from different vendors were infected with Plasmodium yoelii, a causative agent of murine malaria, they exhibited both differential immune responses and significantly different parasite burdens: these patterns were reproducible when gut contents were transplanted into gnotobiotic mice. To gain insight into the mechanism of resistance, we removed whole ceca from mice purchased from two vendors, Taconic Biosciences (low parasitemia) and Charles River Laboratories (high parasitemia), to determine the combined host and microflora metabolome and metatranscriptome. With the exception of two Charles River samples, we observed 90% similarity in overall bacterial gene expression within vendors and 80% similarity between vendors. In total, 33 bacterial genes were differentially expressed in Charles River mice (p-value < 0.05) relative to the mice purchased from Taconic. Included among these, fliC, ureABC, and six members of the nuo gene family were overrepresented in microbiomes susceptible to more severe malaria. Moreover, 38 mouse genes were differentially expressed in these purportedly genetically identical mice. Differentially expressed genes included basigin, a cell surface receptor required for P. falciparum invasion of red blood cells. Differences in metabolite pools were detected, though their relevance to malaria infection, microbial community activity, or host response is not yet understood. Our data have provided new targets that may connect gut microbial activity to malaria resistance and susceptibility phenotypes in the C57BL/6 model organism.

    Enhancing Logical Reasoning of Large Language Models through Logic-Driven Data Augmentation

    Combining large language models with logical reasoning enhances their capacity to address problems in a robust and reliable manner. Nevertheless, the intricate nature of logical reasoning poses challenges to gathering reliable data from the web for building comprehensive training datasets, subsequently affecting the performance on downstream tasks. To address this, we introduce a novel logic-driven data augmentation approach, AMR-LDA. AMR-LDA converts the original text into an Abstract Meaning Representation (AMR) graph, a structured semantic representation that encapsulates the logic structure of the sentence, upon which operations are performed to generate logically modified AMR graphs. The modified AMR graphs are subsequently converted back into texts to create augmented data. Notably, our methodology is architecture-agnostic and enhances generative large language models, such as GPT-3.5 and GPT-4, through prompt augmentation, and discriminative large language models through fine-tuning with contrastive learning and logic-driven data augmentation. Empirical evidence underscores the efficacy of our proposed method with improvement in performance across seven downstream tasks, such as logical reasoning reading comprehension, textual entailment, and natural language inference. Furthermore, our method ranked first on the ReClor leaderboard (https://eval.ai/web/challenges/challenge-page/503/leaderboard/1347). The source code and data are publicly available at https://github.com/Strong-AI-Lab/Logical-Equivalence-driven-AMR-Data-Augmentation-for-Representation-Learning. Comment: Accepted for oral presentation at the LLM@IJCAI 2023 non-archival symposium.

    Capacity Planning of a Commodity Cluster in an Academic Environment: A Case Study

    In this paper, the design of a simulation model for evaluating two alternative supercomputer configurations in an academic environment is presented. The workload is analyzed and modeled, and its effect on the relative performance of both systems is studied. The Integrated Capacity Planning Environment (ICPE) toolkit, developed for commodity cluster capacity planning, is successfully applied to the target environment. The ICPE is a tool for workload modeling, simulation modeling, and what-if analysis. A new characterization strategy is applied to the workload to more accurately model commodity cluster workloads. Through "what-if" analysis, the sensitivity of baseline system performance to workload change and the relative performance of the two proposed alternative systems are compared and evaluated. This case study demonstrates the usefulness of the methodology and the applicability of the tools in gauging system capacity and making design decisions.

    Systematic review of the performance of HIV viral load technologies on plasma samples.

    BACKGROUND: Viral load (VL) monitoring is the standard of care in developing country settings for detecting HIV treatment failure. Since 2010 the World Health Organization has recommended a phase-in approach to VL monitoring in resource-limited settings. We conducted a systematic review of the accuracy and precision of HIV VL technologies for treatment monitoring. METHODS AND FINDINGS: A search of Medline and Embase was conducted for studies evaluating the accuracy or reproducibility of commercially available HIV VL assays. 37 studies were included for review, including evaluations of the Amplicor Monitor HIV-1 v1.5 (n = 25), Cobas TaqMan v2.0 (n = 11), Abbott RealTime HIV-1 (n = 23), Versant HIV-1 RNA bDNA 3.0 (n = 15), Versant HIV-1 RNA kPCR 1.0 (n = 2), ExaVir Load v3 (n = 2), and NucliSens EasyQ v2.0 (n = 1). All currently available HIV VL assays are of sufficient sensitivity to detect plasma virus levels at a lower detection limit of 1,000 copies/mL. Bias data comparing the Abbott RealTime HIV-1 and TaqMan v2.0 to the Amplicor Monitor v1.5 showed a tendency of the Abbott RealTime HIV-1 to underestimate results while the TaqMan v2.0 overestimated VL counts. Compared to the Amplicor Monitor v1.5, 2-26% and 9-70% of results from the Versant bDNA 3.0 and Abbott RealTime HIV-1, respectively, differed by greater than 0.5 log10. The average intra- and inter-assay variation of the Abbott RealTime HIV-1 were 2.95% (range 2.0-5.1%) and 5.44% (range 1.17-30.00%) across the range of VL counts (2 log10 to 7 log10). CONCLUSIONS: This review found that all currently available HIV VL assays are of sufficient sensitivity to detect plasma VL of 1,000 copies/mL as a threshold to initiate investigations of treatment adherence or possible treatment failure. Sources of variability between VL assays include differences in technology platform, plasma input volume, and ability to detect HIV-1 subtypes. Monitoring of individual patients should be performed on the same technology platform to ensure appropriate interpretation of changes in VL. Prospero registration # CD42013003603
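
    As a worked illustration of the 0.5 log10 criterion used in this abstract, the sketch below compares two viral load results on the log10 scale. The function names and the example values are hypothetical and are not taken from the review.

```python
import math


def log10_difference(vl_a, vl_b):
    """Absolute difference between two viral loads (copies/mL) on the log10 scale."""
    if vl_a <= 0 or vl_b <= 0:
        raise ValueError("viral loads must be positive to take log10")
    return abs(math.log10(vl_a) - math.log10(vl_b))


def differs_beyond_threshold(vl_a, vl_b, threshold=0.5):
    """True when two results differ by more than `threshold` log10 units."""
    return log10_difference(vl_a, vl_b) > threshold


# Hypothetical paired results from two assays on the same sample.
print(differs_beyond_threshold(1000, 2000))  # about 0.30 log10 apart
print(differs_beyond_threshold(1000, 5000))  # about 0.70 log10 apart
```

    On this scale a difference of 0.5 log10 corresponds to roughly a 3.16-fold difference in copies/mL, which is why a doubling between assays falls inside the threshold while a fivefold difference does not.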

    Systematic review of the use of dried blood spots for monitoring HIV viral load and for early infant diagnosis.

    BACKGROUND: Dried blood spots (DBS) have been used as alternative specimens to plasma to increase access to HIV viral load (VL) monitoring and early infant diagnosis (EID) in remote settings. We systematically reviewed evidence on the performance of DBS compared to plasma for VL monitoring and EID. METHODS AND FINDINGS: Thirteen peer-reviewed HIV VL publications and five HIV EID papers were included. Depending on the technology and the viral load distribution in the study population, the percentage of DBS samples that are within 0.5 log of VL in plasma ranged from 52-100%. Because the input sample volume is much smaller in a blood spot, there is a risk of false negatives with DBS. Sensitivity of DBS VL was found to be 78-100% compared to plasma at VL below 1000 copies/ml, but this increased to 100% at a threshold of 5000 copies/ml. Unlike a plasma VL test, which measures only cell-free HIV RNA, a DBS VL test also measures proviral DNA and cell-associated RNA, potentially leading to false positive results when using DBS. The systematic review showed that specificity was close to 100% at DBS VL above 5000 copies/ml, and this threshold would be the most reliable for predicting true virologic failure using DBS. For early infant diagnosis, DBS has a sensitivity of 100% compared to fresh whole blood or plasma in all studies. CONCLUSIONS: Although limited data are available for EID, DBS offer a highly sensitive and specific sampling strategy to make viral load monitoring and early infant diagnosis more accessible in remote settings. A standardized approach for sampling, storing, and processing DBS samples would be essential to allow successful implementation. TRIAL REGISTRATION: PROSPERO Registration #: CRD42013003621
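
    The sensitivity and specificity figures in this abstract can be read against a simple calculation such as the sketch below, which treats plasma as the reference and DBS as the test. It assumes a result counts as positive when it is at or above the chosen threshold; that convention, the function name, and the paired values are illustrative assumptions, not data or definitions from the review.

```python
def sensitivity_specificity(pairs, threshold):
    """Compute (sensitivity, specificity) of DBS against plasma at a threshold.

    `pairs` is a list of (dbs_vl, plasma_vl) tuples in copies/ml. A value at
    or above `threshold` is treated as positive (an assumption of this sketch).
    Returns None for a measure whose denominator is zero.
    """
    tp = fn = tn = fp = 0
    for dbs_vl, plasma_vl in pairs:
        reference_positive = plasma_vl >= threshold
        test_positive = dbs_vl >= threshold
        if reference_positive and test_positive:
            tp += 1
        elif reference_positive:
            fn += 1
        elif test_positive:
            fp += 1
        else:
            tn += 1
    sensitivity = tp / (tp + fn) if (tp + fn) else None
    specificity = tn / (tn + fp) if (tn + fp) else None
    return sensitivity, specificity


# Hypothetical paired measurements (DBS, plasma), in copies/ml.
example_pairs = [(6000, 7000), (4000, 6000), (200, 300), (5500, 900)]
print(sensitivity_specificity(example_pairs, threshold=5000))
```

    Re-running the same pairs at different thresholds shows how the choice of cut-off trades false negatives against false positives.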