    Revealing the structure of language model capabilities

    Building a theoretical understanding of the capabilities of large language models (LLMs) is vital for our ability to predict and explain the behavior of these systems. Here, we investigate the structure of LLM capabilities by extracting latent capabilities from patterns of individual differences across a varied population of LLMs. Using a combination of Bayesian and frequentist factor analysis, we analyzed data from 29 different LLMs across 27 cognitive tasks. We found evidence that LLM capabilities are not monolithic. Instead, they are better explained by three well-delineated factors that represent reasoning, comprehension and core language modeling. Moreover, we found that these three factors can explain a high proportion of the variance in model performance. These results reveal a consistent structure in the capabilities of different LLMs and demonstrate the multifaceted nature of these capabilities. We also found that the three abilities show different relationships to model properties such as model size and instruction tuning. These patterns help refine our understanding of scaling laws and indicate that changes to a model that improve one ability might simultaneously impair others. Based on these findings, we suggest that benchmarks could be streamlined by focusing on tasks that tap into each broad model ability. Comment: 10 pages, 3 figures + references and appendices, for data and analysis code see https://github.com/RyanBurnell/revealing-LLM-capabilitie

    Development of FuGO: An ontology for functional genomics investigations

    The development of the Functional Genomics Investigation Ontology (FuGO) is a collaborative, international effort that will provide a resource for annotating functional genomics investigations, including the study design, protocols and instrumentation used, the data generated and the types of analysis performed on the data. FuGO will contain both terms that are universal to all functional genomics investigations and those that are domain specific. In this way, the ontology will serve as the “semantic glue” to provide a common understanding of data from across these disparate data sources. In addition, FuGO will reference out to existing mature ontologies to avoid the need to duplicate these resources, and will do so in such a way as to enable their ease of use in annotation. This project is in the early stages of development; the paper will describe efforts to initiate the project, the scope and organization of the project, the work accomplished to date, and the challenges encountered, as well as future plans

    Sequencing of 53,831 diverse genomes from the NHLBI TOPMed Program

    The Trans-Omics for Precision Medicine (TOPMed) programme seeks to elucidate the genetic architecture and biology of heart, lung, blood and sleep disorders, with the ultimate goal of improving diagnosis, treatment and prevention of these diseases. The initial phases of the programme focused on whole-genome sequencing of individuals with rich phenotypic data and diverse backgrounds. Here we describe the TOPMed goals and design as well as the available resources and early insights obtained from the sequence data. The resources include a variant browser, a genotype imputation server, and genomic and phenotypic data that are available through dbGaP (Database of Genotypes and Phenotypes)(1). In the first 53,831 TOPMed samples, we detected more than 400 million single-nucleotide and insertion or deletion variants after alignment with the reference genome. Additional previously undescribed variants were detected through assembly of unmapped reads and customized analysis in highly variable loci. Among the more than 400 million detected variants, 97% have frequencies of less than 1% and 46% are singletons that are present in only one individual (53% among unrelated individuals). These rare variants provide insights into mutational processes and recent human evolutionary history. The extensive catalogue of genetic variation in TOPMed studies provides unique opportunities for exploring the contributions of rare and noncoding sequence variants to phenotypic variation. Furthermore, combining TOPMed haplotypes with modern imputation methods improves the power and reach of genome-wide association studies to include variants down to a frequency of approximately 0.01%

    Axial Vector J^{PC}=1^{++} Charmonium and Bottomonium Hybrid Mass Predictions with QCD Sum-Rules

    Axial vector (J^{PC}=1^{++}) charmonium and bottomonium hybrid masses are determined via QCD Laplace sum-rules. Previous sum-rule studies in this channel did not incorporate the dimension-six gluon condensate, which has been shown to be important for 1^{--} and 0^{-+} heavy quark hybrids. An updated analysis of axial vector charmonium and bottomonium hybrids is presented, including the effects of the dimension-six gluon condensate. The axial vector charmonium and bottomonium hybrid masses are predicted to be 5.13 GeV and 11.32 GeV, respectively. We discuss the implications of this result for the charmonium-like XYZ states and the charmonium hybrid multiplet structure observed in recent lattice calculations. Comment: 10 pages, 7 figures. Updated to match published version

    Exact Solution for the Exterior Field of a Rotating Neutron Star

    A four-parameter class of exact asymptotically flat solutions of the Einstein-Maxwell equations involving only rational functions is presented. It is able to describe the exterior field of a slowly or rapidly rotating neutron star with poloidal magnetic field. Comment: Accepted for publication in Phys. Rev. D as Rapid Communication. 8 pages, 2 eps figures

    Heavy-Light Mesons with Quenched Lattice NRQCD: Results on Decay Constants

    We present a quenched lattice calculation of heavy-light meson decay constants, using non-relativistic (NRQCD) heavy quarks in the mass region of the b quark and heavier, and clover-improved light quarks. The NRQCD Hamiltonian and the heavy-light current include the corrections at first order in the expansion in the inverse heavy quark mass. We study the dependence of the decay constants on the heavy meson mass M, for light quarks with the tree level (c_{SW} = 1), as well as the tadpole improved clover coefficient. We compare decay constants from NRQCD with results from clover (c_{SW}=1) heavy quarks. Having calculated the current renormalisation constant Z_A in one-loop perturbation theory, we demonstrate how the heavy mass dependence of the pseudoscalar decay constants changes after renormalisation. For the first time, we quote a result for f_B from NRQCD including the full one-loop matching factors at O(\alpha/M). Comment: 45 pages, latex, 24 postscript figures

    Assessing the Evolutionary Impact of Amino Acid Mutations in the Human Genome

    Quantifying the distribution of fitness effects among newly arising mutations in the human genome is key to resolving important debates in medical and evolutionary genetics. Here, we present a method for inferring this distribution using Single Nucleotide Polymorphism (SNP) data from a population with non-stationary demographic history (such as that of modern humans). Application of our method to 47,576 coding SNPs found by direct resequencing of 11,404 protein coding-genes in 35 individuals (20 European Americans and 15 African Americans) allows us to assess the relative contribution of demographic and selective effects to patterning amino acid variation in the human genome. We find evidence of an ancient population expansion in the sample with African ancestry and a relatively recent bottleneck in the sample with European ancestry. After accounting for these demographic effects, we find strong evidence for great variability in the selective effects of new amino acid replacing mutations. In both populations, the patterns of variation are consistent with a leptokurtic distribution of selection coefficients (e.g., gamma or log-normal) peaked near neutrality. Specifically, we predict 27–29% of amino acid changing (nonsynonymous) mutations are neutral or nearly neutral (|s|<0.01%), 30–42% are moderately deleterious (0.01%<|s|<1%), and nearly all the remainder are highly deleterious or lethal (|s|>1%). Our results are consistent with 10–20% of amino acid differences between humans and chimpanzees having been fixed by positive selection with the remainder of differences being neutral or nearly neutral. Our analysis also predicts that many of the alleles identified via whole-genome association mapping may be selectively neutral or (formerly) positively selected, implying that deleterious genetic variation affecting disease phenotype may be missed by this widely used approach for mapping genes underlying complex traits

    Angular Momentum and Vortex Formation in Bose-Einstein-Condensed Cold Dark Matter Haloes

    (Abridged) Extensions of the standard model of particle physics predict very light bosons, ranging from about 10^{-5} eV for the QCD axion to 10^{-33} eV for ultra-light particles, which could be the cold dark matter (CDM) in the Universe. If so, their phase-space density must be high enough to form a Bose-Einstein condensate (BEC). The fluid-like nature of BEC-CDM dynamics differs from that of standard collisionless CDM (sCDM), so observations of galactic haloes may distinguish them. sCDM has problems with galaxy observations on small scales, which BEC-CDM may overcome for a large range of particle mass m and self-interaction strength g. For quantum-coherence on galactic scales of radius R and mass M, either the de-Broglie wavelength lambda_deB <~ R, requiring m >~ m_H \cong 10^{-25}(R/100 kpc)^{-1/2}(M/10^{12} M_solar)^{-1/2} eV, or else lambda_deB << R but self-interaction balances gravity, requiring m >> m_H and g >> g_H \cong 2 x 10^{-64} (R/100 kpc)(M/10^{12} M_solar)^{-1} eV cm^3. Here we study the largely-neglected effects of angular momentum. Spin parameters lambda \cong 0.05 are expected from tidal-torquing by large-scale structure, just as for sCDM. Since lab BECs develop quantum vortices if rotated rapidly enough, we ask if this angular momentum is sufficient to form vortices in BEC haloes, affecting their structure with potentially observable consequences. The minimum angular momentum for this, L_{QM} = \hbar M/m, requires m >= 9.5 m_H for lambda = 0.05, close to the particle mass required to influence structure on galactic scales. We study the equilibrium of self-gravitating, rotating BEC haloes which satisfy the Gross-Pitaevskii-Poisson equations, to calculate if and when vortices are energetically favoured. Vortices form as long as self-interaction is strong enough, which includes a large part of the range of m and g of interest for BEC-CDM haloes. Comment: Several typos and numerical typos (incl. in Fig.6, Table 2 and Table 3) have been corrected and references have been updated after proof-reading stage; MNRAS in press; 29 pages; 11 figures
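
    The scaling relations quoted in this abstract can be evaluated numerically. The following is a minimal sketch, not the authors' code: it only transcribes the characteristic mass m_H, the characteristic self-interaction strength g_H, and the vortex threshold m >= 9.5 m_H (stated for spin parameter lambda = 0.05) as given above. The function names and argument units (R in kpc, M in solar masses) are assumptions made for illustration.

```python
# Hypothetical helper names; the formulas are transcribed from the abstract.

def m_H_eV(R_kpc: float, M_solar: float) -> float:
    """Characteristic particle mass m_H in eV for a halo of radius R and mass M."""
    return 1e-25 * (R_kpc / 100.0) ** -0.5 * (M_solar / 1e12) ** -0.5


def g_H_eV_cm3(R_kpc: float, M_solar: float) -> float:
    """Characteristic self-interaction strength g_H in eV cm^3."""
    return 2e-64 * (R_kpc / 100.0) * (M_solar / 1e12) ** -1


def min_vortex_mass_eV(R_kpc: float, M_solar: float) -> float:
    """Particle mass needed for L >= L_QM, using the abstract's m >= 9.5 m_H.

    The factor 9.5 is quoted only for spin parameter lambda = 0.05, so this
    function should not be used for other spin parameters.
    """
    return 9.5 * m_H_eV(R_kpc, M_solar)


if __name__ == "__main__":
    # Reference halo from the abstract's normalisation: R = 100 kpc, M = 1e12 M_solar.
    R, M = 100.0, 1e12
    print("m_H [eV]:", m_H_eV(R, M))
    print("g_H [eV cm^3]:", g_H_eV_cm3(R, M))
    print("minimum m for vortices [eV]:", min_vortex_mass_eV(R, M))
```

    Because m_H scales as R^{-1/2} M^{-1/2}, larger or more massive haloes lower the particle mass at which quantum coherence on the halo scale becomes possible.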

    The gut microbiota of people with asthma influences lung inflammation in gnotobiotic mice

    The gut microbiota in early childhood is linked to asthma risk, but may continue to affect older patients with asthma. Here, we profile the gut microbiota of 38 children (19 asthma, median age 8) and 57 adults (17 asthma, median age 28) by 16S rRNA sequencing and find individuals with asthma harbored compositional differences from healthy controls in both adults and children. We develop a model to aid the design of mechanistic experiments in gnotobiotic mice and show enterotoxigeni

    Resequencing Candidate Genes Implicates Rare Variants in Asthma Susceptibility

    Common variation in over 100 genes has been implicated in the risk of developing asthma, but the contribution of rare variants to asthma susceptibility remains largely unexplored. We selected nine genes that showed the strongest signatures of weak purifying selection from among 53 candidate asthma-associated genes, and we sequenced the coding exons and flanking noncoding regions in 450 asthmatic cases and 515 nonasthmatic controls. We observed an overall excess of p values <0.05 (p = 0.02), and rare variants in four genes (AGT, DPP10, IKBKAP, and IL12RB1) contributed to asthma susceptibility among African Americans. Rare variants in IL12RB1 were also associated with asthma susceptibility among European Americans, despite the fact that the majority of rare variants in IL12RB1 were specific to either one of the populations. The combined evidence of association with rare noncoding variants in IL12RB1 remained significant (p = 3.7 × 10^{-4}) after correcting for multiple testing. Overall, the contribution of rare variants to asthma susceptibility was predominantly due to noncoding variants in sequences flanking the exons, although nonsynonymous rare variants in DPP10 and in IL12RB1 were associated with asthma in African Americans and European Americans, respectively. This study provides evidence that rare variants contribute to asthma susceptibility. Additional studies are required for testing whether prioritizing genes for resequencing on the basis of signatures of purifying selection is an efficient means of identifying novel rare variants that contribute to complex disease