LINC: A Neurosymbolic Approach for Logical Reasoning by Combining Language Models with First-Order Logic Provers
Logical reasoning, i.e., deductively inferring the truth value of a
conclusion from a set of premises, is an important task for artificial
intelligence with wide potential impacts on science, mathematics, and society.
While many prompting-based strategies have been proposed to enable Large
Language Models (LLMs) to do such reasoning more effectively, they still appear
unsatisfactory, often failing in subtle and unpredictable ways. In this work,
we investigate the validity of instead reformulating such tasks as modular
neurosymbolic programming, which we call LINC: Logical Inference via
Neurosymbolic Computation. In LINC, the LLM acts as a semantic parser,
translating premises and conclusions from natural language to expressions in
first-order logic. These expressions are then offloaded to an external theorem
prover, which symbolically performs deductive inference. Leveraging this
approach, we observe significant performance gains on FOLIO and a balanced
subset of ProofWriter for three different models in nearly all experimental
conditions we evaluate. On ProofWriter, augmenting the comparatively small
open-source StarCoder+ (15.5B parameters) with LINC even outperforms GPT-3.5
and GPT-4 with Chain-of-Thought (CoT) prompting by an absolute 38% and 10%,
respectively. When used with GPT-4, LINC scores 26% higher than CoT on
ProofWriter while performing comparably on FOLIO. Further analysis reveals
that although both methods on average succeed roughly equally often on this
dataset, they exhibit distinct and complementary failure modes. We thus provide
promising evidence for how logical reasoning over natural language can be
tackled through jointly leveraging LLMs alongside symbolic provers. All
corresponding code is publicly available at https://github.com/benlipkin/lin
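The parse-then-prove split described in this abstract can be illustrated with a toy example. The sketch below is not the authors' code and does not use a first-order theorem prover: it assumes the language model's translation step has already happened (the facts and rules are hand-written stand-ins), and it replaces the external prover with a simple forward-chaining loop over ground literals. All names (`forward_chain`, `evaluate`, the `~` negation prefix, the "Uncertain" label) are hypothetical choices made for illustration.

```python
from typing import Iterable

# A rule is (body literals, head literal); "~" marks a negated literal.
Rule = tuple[frozenset[str], str]


def negate(lit: str) -> str:
    """Return the complementary literal."""
    return lit[1:] if lit.startswith("~") else "~" + lit


def forward_chain(facts: Iterable[str], rules: list[Rule]) -> set[str]:
    """Apply rules repeatedly until no new literal can be derived."""
    known = set(facts)
    changed = True
    while changed:
        changed = False
        for body, head in rules:
            if body <= known and head not in known:
                known.add(head)
                changed = True
    return known


def evaluate(facts: Iterable[str], rules: list[Rule], conclusion: str) -> str:
    """Label a conclusion as True, False, or Uncertain (toy, incomplete)."""
    known = forward_chain(facts, rules)
    if conclusion in known:
        return "True"
    if negate(conclusion) in known:
        return "False"
    return "Uncertain"


# Hand-written stand-in for what a semantic parser might emit (assumption).
facts = {"cat(tom)"}
rules = [
    (frozenset({"cat(tom)"}), "mammal(tom)"),
    (frozenset({"mammal(tom)"}), "~reptile(tom)"),
]

for query in ("mammal(tom)", "reptile(tom)", "pet(tom)"):
    print(query, "->", evaluate(facts, rules, query))
```

The point of the split is that the deduction step is fully symbolic: once the premises are in logical form, the answer no longer depends on the language model's step-by-step reasoning.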
Retracing Micro-Epidemics of Chagas Disease Using Epicenter Regression
Vector-borne transmission of Chagas disease has become an urban problem in the city of Arequipa, Peru, yet the debilitating symptoms that can occur in the chronic stage of the disease are rarely seen in hospitals in the city. The lack of obvious clinical disease in Arequipa has led to speculation that the local strain of the etiologic agent, Trypanosoma cruzi, has low chronic pathogenicity. The long asymptomatic period of Chagas disease leads us to an alternative hypothesis for the absence of clinical cases in Arequipa: transmission in the city may be so recent that most infected individuals have yet to progress to late stage disease. Here we describe a new method, epicenter regression, that allows us to infer the spatial and temporal history of disease transmission from a snapshot of a population's infection status. We show that in a community of Arequipa, transmission of T. cruzi by the insect vector Triatoma infestans occurred as a series of focal micro-epidemics, the oldest of which began only around 20 years ago. These micro-epidemics infected nearly 5% of the community before transmission of the parasite was disrupted through insecticide application in 2004. Most extant human infections in our study community arose over a brief period of time immediately prior to vector control. According to our findings, the symptoms of chronic Chagas disease are expected to be absent, even if the strain is pathogenic in the chronic phase of disease, given the long asymptomatic period of the disease and short history of intense transmission. A Spanish translation of this article is available in Alternative Language Text S1.
A randomized controlled trial of exercise in spinal and bulbar muscular atrophy.
OBJECTIVE: To determine the safety and efficacy of a home-based functional exercise program in spinal and bulbar muscular atrophy (SBMA).
METHODS: Subjects were randomly assigned to participate in 12 weeks of either functional exercises (intervention) or a stretching program (control) at the National Institutes of Health in Bethesda, MD. A total of 54 subjects enrolled, and 50 completed the study with 24 in the functional exercise group and 26 in the stretching control group. The primary outcome measure was the Adult Myopathy Assessment Tool (AMAT) total score, and secondary measures included total activity by accelerometry, muscle strength, balance, timed up and go, sit-to-stand test, health-related quality of life, creatine kinase, and insulin-like growth factor-1.
RESULTS: Functional exercise was well tolerated but did not lead to significant group differences in the primary outcome measure or any of the secondary measures. The functional exercise did not produce significantly more adverse events than stretching, and was not perceived to be difficult. To determine whether a subset of the subjects may have benefited, we divided them into high and low functioning based on baseline AMAT scores and performed a post hoc subgroup analysis. Low-functioning individuals receiving the intervention increased AMAT functional subscale scores compared to the control group.
INTERPRETATION: Although these trial results indicate that functional exercise had no significant effect on total AMAT scores or on mobility, strength, balance, and quality of life, post hoc findings indicate that low-functioning men with SBMA may respond better to functional exercises, and this warrants further investigation with appropriate exercise intensity
Genome-wide mapping of plasma protein QTLs identifies putatively causal genes and pathways for cardiovascular disease.
Identifying genetic variants associated with circulating protein concentrations (protein quantitative trait loci; pQTLs) and integrating them with variants from genome-wide association studies (GWAS) may illuminate the proteome's causal role in disease and bridge a knowledge gap regarding SNP-disease associations. We provide the results of GWAS of 71 high-value cardiovascular disease proteins in 6861 Framingham Heart Study participants and independent external replication. We report the mapping of over 16,000 pQTL variants and their functional relevance. We provide an integrated plasma protein-QTL database. Thirteen proteins harbor pQTL variants that match coronary disease-risk variants from GWAS or test causal for coronary disease by Mendelian randomization. Eight of these proteins predict new-onset cardiovascular disease events in Framingham participants. We demonstrate that identifying pQTLs, integrating them with GWAS results, employing Mendelian randomization, and prospectively testing protein-trait associations holds potential for elucidating causal genes, proteins, and pathways for cardiovascular disease and may identify targets for its prevention and treatment
Best Practices and Joint Calling of the HumanExome BeadChip: The CHARGE Consortium
Genotyping arrays are a cost-effective approach for typing previously identified genetic polymorphisms in large numbers of samples. One limitation of genotyping arrays with rare variants (e.g., minor allele frequency [MAF] <0.01) is that automated clustering algorithms have difficulty accurately detecting and assigning genotype calls. Combining intensity data from large numbers of samples may increase the ability to accurately call the genotypes of rare variants. Approximately 62,000 ethnically diverse samples from eleven Cohorts for Heart and Aging Research in Genomic Epidemiology (CHARGE) Consortium cohorts were genotyped with the Illumina HumanExome BeadChip across seven genotyping centers. The raw data files for the samples were assembled into a single project for joint calling. To assess the quality of the joint calling, concordance of genotypes in a subset of individuals having both exome chip and exome sequence data was analyzed. After exclusion of low-performing SNPs on the exome chip and of SNPs that did not overlap with those derived from sequence data, genotypes of 185,119 variants (of which 11,356 were monomorphic) were compared in 530 individuals who had whole exome sequence data. A total of 98,113,070 pairs of genotypes were tested and 99.77% were concordant, 0.14% had missing data, and 0.09% were discordant. We report that joint calling makes it possible to accurately genotype rare variation using array technology when large sample sizes are available and best practices are followed. The cluster file from this experiment is available at www.chargeconsortium.com/main/exomechip
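The concordance figures quoted in this abstract are internally consistent, which a short calculation can confirm. The sketch below uses only the numbers stated above; the variable names are illustrative, and because the percentages are rounded to two decimals in the abstract, the per-category counts are approximations rather than reported results.

```python
# Figures quoted in the abstract.
variants = 185_119
individuals = 530
reported_pairs = 98_113_070

# Each variant is compared once per individual.
pairs = variants * individuals
assert pairs == reported_pairs

# Reported outcome rates (rounded in the abstract) should sum to 100%.
rates = {"concordant": 0.9977, "missing": 0.0014, "discordant": 0.0009}
assert abs(sum(rates.values()) - 1.0) < 1e-9

# Approximate counts implied by the rounded percentages.
approx_counts = {name: round(pairs * rate) for name, rate in rates.items()}
for name, count in approx_counts.items():
    print(f"{name}: ~{count:,} genotype pairs")
```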
Genome-Wide Analysis of Natural Selection on Human Cis-Elements
Background: It has been speculated that the polymorphisms in the non-coding portion of the human genome underlie much of the phenotypic variability among humans and between humans and other primates. If so, these genomic regions may be undergoing rapid evolutionary change, due in part to natural selection. However, the non-coding region is a heterogeneous mix of functional and non-functional regions. Furthermore, the functional regions are comprised of a variety of different types of elements, each under potentially different selection regimes. Findings and Conclusions: Using the HapMap and Perlegen polymorphism data that map to a stringent set of putative binding sites in human proximal promoters, we apply the Derived Allele Frequency distribution test of neutrality to provide evidence that many human-specific and primate-specific binding sites are likely evolving under positive selection. We also discuss inherent limitations of publicly available human SNP datasets that complicate the inference of selection pressures. Finally, we show that the genes whose proximal binding sites contain high frequency derived alleles are enriched for positive regulation of protein metabolism and developmental processes. Thus our genome-scale investigation provide
- …