39 research outputs found

    How should the advent of large language models affect the practice of science?

    Large language models (LLMs) are being increasingly incorporated into scientific workflows. However, we have yet to fully grasp the implications of this integration. How should the advent of large language models affect the practice of science? For this opinion piece, we have invited four diverse groups of scientists to reflect on this query, sharing their perspectives and engaging in debate. Schulz et al. make the argument that working with LLMs is not fundamentally different from working with human collaborators, while Bender et al. argue that LLMs are often misused and over-hyped, and that their limitations warrant a focus on more specialized, easily interpretable tools. Marelli et al. emphasize the importance of transparent attribution and responsible use of LLMs. Finally, Botvinick and Gershman advocate that humans should retain responsibility for determining the scientific roadmap. To facilitate the discussion, the four perspectives are complemented with a response from each group. By putting these different perspectives in conversation, we aim to bring attention to important considerations within the academic community regarding the adoption of LLMs and their impact on both current and future scientific practices.

    Individualization as driving force of clustering phenomena in humans

    One of the most intriguing dynamics in biological systems is the emergence of clustering, the self-organization into separated agglomerations of individuals. Several theories have been developed to explain clustering in, for instance, multi-cellular organisms, ant colonies, bee hives, flocks of birds, schools of fish, and animal herds. A persistent puzzle, however, is clustering of opinions in human populations. The puzzle is particularly pressing if opinions vary continuously, such as the degree to which citizens are in favor of or against a vaccination program. Existing opinion formation models suggest that "monoculture" is unavoidable in the long run, unless subsets of the population are perfectly separated from each other. Yet, social diversity is a robust empirical phenomenon, although perfect separation is hardly possible in an increasingly connected world. So far, incorporating randomness has not overcome these theoretical shortcomings. Small perturbations of individual opinions trigger social influence cascades that inevitably lead to monoculture, while larger noise disrupts opinion clusters and results in rampant individualism without any social structure. Our solution to the puzzle builds on recent empirical research, combining the integrative tendencies of social influence with the disintegrative effects of individualization. A key element of the new computational model is an adaptive kind of noise. We conduct simulation experiments to demonstrate that with this kind of noise, a third phase besides individualism and monoculture becomes possible, characterized by the formation of metastable clusters with diversity between and consensus within clusters. When clusters are small, individualization tendencies are too weak to prohibit a fusion of clusters. When clusters grow too large, however, individualization increases in strength, which promotes their splitting. Comment: 12 pages, 4 figures

    Impure Public Technologies and Environmental Policy

    Analyses of public goods regularly address the case of pure public goods. However, a large number of (international) public goods exhibit characteristics of different degrees of publicness, i.e. they are impure public goods. In our analysis of transfers helping to overcome the inefficient provision of such goods, we therefore apply the Lancastrian characteristics approach. In contrast to the existing literature, we consider the case of a continuum of impure public goods. We employ the example of international conditional transfers aimed at overcoming suboptimally low climate protection efforts by influencing the abatement technology choice of countries.

    Automatic identification of variables in epidemiological datasets using logic regression

    Background: For an individual participant data (IPD) meta-analysis, multiple datasets must be transformed into a consistent format, e.g. using uniform variable names. When large numbers of datasets have to be processed, this can be a time-consuming and error-prone task. Automated or semi-automated identification of variables can help to reduce the workload and improve the data quality. For semi-automation, high sensitivity in the recognition of matching variables is particularly important, because it allows creating software which, for a target variable, presents a choice of source variables from which a user can choose the matching one, with only a low risk of having missed a correct source variable. Methods: For each variable in a set of target variables, a number of simple rules were manually created. With logic regression, an optimal Boolean combination of these rules was searched for every target variable, using a random subset of a large database of epidemiological and clinical cohort data (construction subset). In a second subset of this database (validation subset), these optimal rule combinations were validated. Results: In the construction sample, 41 target variables were allocated on average with a positive predictive value (PPV) of 34%, and a negative predictive value (NPV) of 95%. In the validation sample, PPV was 33%, whereas NPV remained at 94%. In the construction sample, PPV was 50% or less in 63% of all variables, in the validation sample in 71% of all variables. Conclusions: We demonstrated that the application of logic regression in a complex data management task in large epidemiological IPD meta-analyses is feasible. However, the performance of the algorithm is poor, which may require backup strategies.

    Retrospective evaluation of whole exome and genome mutation calls in 746 cancer samples

    Funder: NCI U24CA211006
    Abstract: The Cancer Genome Atlas (TCGA) and International Cancer Genome Consortium (ICGC) curated consensus somatic mutation calls using whole exome sequencing (WES) and whole genome sequencing (WGS), respectively. Here, as part of the ICGC/TCGA Pan-Cancer Analysis of Whole Genomes (PCAWG) Consortium, which aggregated whole genome sequencing data from 2,658 cancers across 38 tumour types, we compare WES and WGS side-by-side from 746 TCGA samples, finding that ~80% of mutations overlap in covered exonic regions. We estimate that low variant allele fraction (VAF < 15%) and clonal heterogeneity contribute up to 68% of private WGS mutations and 71% of private WES mutations. We observe that ~30% of private WGS mutations trace to mutations identified by a single variant caller in WES consensus efforts. WGS captures both ~50% more variation in exonic regions and un-observed mutations in loci with variable GC-content. Together, our analysis highlights technological divergences between two reproducible somatic variant detection efforts.

    Science of science

    BACKGROUND: The increasing availability of digital data on scholarly inputs and outputs (from research funding, productivity, and collaboration to paper citations and scientist mobility) offers unprecedented opportunities to explore the structure and evolution of science. The science of science (SciSci) offers a quantitative understanding of the interactions among scientific agents across diverse geographic and temporal scales: It provides insights into the conditions underlying creativity and the genesis of scientific discovery, with the ultimate goal of developing tools and policies that have the potential to accelerate science. In the past decade, SciSci has benefited from an influx of natural, computational, and social scientists who together have developed big data–based capabilities for empirical analysis and generative modeling that capture the unfolding of science, its institutions, and its workforce. The value proposition of SciSci is that with a deeper understanding of the factors that drive successful science, we can more effectively address environmental, societal, and technological problems. ADVANCES: Science can be described as a complex, self-organizing, and evolving network of scholars, projects, papers, and ideas. This representation has unveiled patterns characterizing the emergence of new scientific fields through the study of collaboration networks and the path of impactful discoveries through the study of citation networks. Microscopic models have traced the dynamics of citation accumulation, allowing us to predict the future impact of individual papers. SciSci has revealed choices and trade-offs that scientists face as they advance both their own careers and the scientific horizon. For example, measurements indicate that scholars are risk-averse, preferring to study topics related to their current expertise, which constrains the potential of future discoveries. Those willing to break this pattern engage in riskier careers but become more likely to make major breakthroughs. Overall, the highest-impact science is grounded in conventional combinations of prior work but features unusual combinations. Last, as the locus of research is shifting into teams, SciSci is increasingly focused on the impact of team research, finding that small teams tend to disrupt science and technology with new ideas drawing on older and less prevalent ones. In contrast, large teams tend to develop recent, popular ideas, obtaining high, but often short-lived, impact. OUTLOOK: SciSci offers a deep quantitative understanding of the relational structure between scientists, institutions, and ideas because it facilitates the identification of fundamental mechanisms responsible for scientific discovery. These interdisciplinary data-driven efforts complement contributions from related fields such as scientometrics and the economics and sociology of science. Although SciSci seeks long-standing universal laws and mechanisms that apply across various fields of science, a fundamental challenge going forward is accounting for undeniable differences in culture, habits, and preferences between different fields and countries. This variation makes some cross-domain insights difficult to appreciate and associated science policies difficult to implement. The differences among the questions, data, and skills specific to each discipline suggest that further insights can be gained from domain-specific SciSci studies, which model and identify opportunities adapted to the needs of individual research fields.

    Private Provision of Public Goods: Incentives for Donations

    In many countries the government supports individuals' and companies' donations dedicated to charity organizations or, more generally, to public goods. Yet the effects of governmental support with respect to the provision of public goods have been and still are subject to an extensive debate in the economic literature. Starting from Warr's (1982, 1983) famous neutrality result, an array of conditions has been identified under which this result holds or not. In this paper we examine the commonly used policy approach to subsidize the private provision of public goods by granting agents deductions with respect to their income or corporate tax burden. We especially take into account that most income tax schemes are progressive and that deductibility is limited. The problems that arise from these specific properties of the considered tax-refund schemes are pointed out first. We then turn towards the effects which such a tax-refund scheme has with respect to the provision of the public good on the one hand and individual as well as aggregate welfare on the other hand. We show that the effects of this commonly practised method of supporting private public good provision depend crucially on the specific properties of the progressive tax scheme and the preference structure of agents. While Pareto-improvements and even Pareto-efficiency can result from the implementation of such a scheme, it is also conceivable that at least some agents perceive a utility reduction. Due to the dependency of welfare effects on the tariff structure, income tax reforms as they are planned in many countries might not only induce a reduction in private public good provision, but might also alter the induced welfare effects.