75 research outputs found

    Can language models learn from explanations in context?

    Full text link
    Large language models can perform new tasks by adapting to a few in-context examples. For humans, rapid learning from examples can benefit from explanations that connect examples to task principles. We therefore investigate whether explanations of few-shot examples can allow language models to adapt more effectively. We annotate a set of 40 challenging tasks from BIG-Bench with explanations of answers to a small subset of questions, as well as a variety of matched control explanations. We evaluate the effects of various zero-shot and few-shot prompts that include different types of explanations, instructions, and controls on the performance of a range of large language models. We analyze these results using statistical multilevel modeling techniques that account for the nested dependencies among conditions, tasks, prompts, and models. We find that explanations of examples can improve performance. Adding untuned explanations to a few-shot prompt offers a modest improvement in performance: about one-third the effect size of adding few-shot examples, but twice the effect size of task instructions. We then show that explanations tuned for performance on a small validation set offer substantially larger benefits: building a prompt by selecting examples and explanations together improves performance markedly over selecting examples alone, and hand-tuning explanations can likewise substantially improve performance on challenging tasks. Furthermore, even untuned explanations outperform carefully matched controls, suggesting that the benefits are due to the link between an example and its explanation, rather than to lower-level features of the language used. However, only large models benefit from explanations. In summary, explanations can support the in-context learning abilities of large language models.
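    As a rough illustration of the prompt formats compared in this study, the sketch below assembles a few-shot prompt in which each example optionally carries an explanation after its answer. The task data, field names, and wording are hypothetical stand-ins, not the authors' annotated BIG-Bench prompts.

```python
# Minimal sketch: build a few-shot prompt with or without per-example
# explanations. Data and phrasing are illustrative assumptions.

def build_prompt(examples, question, with_explanations=True, instruction=None):
    """Assemble a few-shot prompt; each example may carry an explanation."""
    parts = []
    if instruction:
        parts.append(instruction)
    for ex in examples:
        block = f"Q: {ex['question']}\nA: {ex['answer']}"
        if with_explanations and ex.get("explanation"):
            block += f"\nExplanation: {ex['explanation']}"
        parts.append(block)
    parts.append(f"Q: {question}\nA:")
    return "\n\n".join(parts)

examples = [
    {"question": "2, 4, 6, ?", "answer": "8",
     "explanation": "The sequence increases by 2, so the next term is 8."},
]
print(build_prompt(examples, "3, 6, 9, ?"))
```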

    Fine-tuning language models to find agreement among humans with diverse preferences

    Full text link
    Recent work on large language models (LLMs) has used fine-tuning to align outputs with the preferences of a prototypical user. This work assumes that human preferences are static and homogeneous across individuals, so that aligning to a single "generic" user will confer more general alignment. Here, we embrace the heterogeneity of human preferences and consider a different challenge: how might a machine help people with diverse views find agreement? We fine-tune a 70-billion-parameter LLM to generate statements that maximize the expected approval of a group of people with potentially diverse opinions. Human participants provide written opinions on thousands of questions touching on moral and political issues (e.g., "should we raise taxes on the rich?") and rate the LLM's generated candidate consensus statements for agreement and quality. A reward model is then trained to predict individual preferences, enabling it to quantify and rank consensus statements in terms of their appeal to the overall group, defined according to different aggregation (social welfare) functions. The model produces consensus statements that are preferred by human users over those from prompted LLMs (>70%) and significantly outperforms a tight fine-tuned baseline that lacks the final ranking step. Further, our best model's consensus statements are preferred over the best human-generated opinions (>65%). We find that when we silently constructed consensus statements from only a subset of group members, those who were excluded were more likely to dissent, revealing the sensitivity of the consensus to individual contributions. These results highlight the potential to use LLMs to help groups of humans align their values with one another.
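    A minimal sketch of the ranking step described above: predicted per-member approval scores for each candidate consensus statement are aggregated with a social welfare function, and candidates are sorted by the result. The scores and the two welfare functions shown are illustrative assumptions, not the trained 70B reward model or the paper's aggregation choices.

```python
# Rank candidate consensus statements by a social welfare function over
# predicted per-member approval scores (all numbers here are illustrative).
from statistics import mean

def utilitarian(scores):   # maximise average predicted approval
    return mean(scores)

def rawlsian(scores):      # maximise the worst-off member's predicted approval
    return min(scores)

def rank_candidates(predicted_approval, welfare=utilitarian):
    """predicted_approval: {candidate: [per-member approval scores]}"""
    return sorted(predicted_approval,
                  key=lambda c: welfare(predicted_approval[c]),
                  reverse=True)

approvals = {
    "Statement A": [0.95, 0.30, 0.95],   # popular on average, one strong dissenter
    "Statement B": [0.60, 0.65, 0.70],   # moderately liked by everyone
}
print(rank_candidates(approvals, welfare=utilitarian))  # favours Statement A
print(rank_candidates(approvals, welfare=rawlsian))     # favours Statement B
```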

    Bright light-emitting diodes based on organometal halide perovskite.

    Get PDF
    Solid-state light-emitting devices based on direct-bandgap semiconductors have, over the past two decades, been utilized as energy-efficient sources of lighting. However, fabrication of these devices typically relies on expensive high-temperature and high-vacuum processes, rendering them uneconomical for use in large-area displays. Here, we report high-brightness light-emitting diodes based on solution-processed organometal halide perovskites. We demonstrate electroluminescence in the near-infrared, green and red by tuning the halide compositions in the perovskite. In our infrared device, a thin 15 nm layer of CH₃NH₃PbI₃₋ₓClₓ perovskite emitter is sandwiched between larger-bandgap titanium dioxide (TiO₂) and poly(9,9'-dioctylfluorene) (F8) layers, effectively confining electrons and holes in the perovskite layer for radiative recombination. We report an infrared radiance of 13.2 W sr⁻¹ m⁻² at a current density of 363 mA cm⁻², with highest external and internal quantum efficiencies of 0.76% and 3.4%, respectively. In our green light-emitting device with an ITO/PEDOT:PSS/CH₃NH₃PbBr₃/F8/Ca/Ag structure, we achieved a luminance of 364 cd m⁻² at a current density of 123 mA cm⁻², giving external and internal quantum efficiencies of 0.1% and 0.4%, respectively. We show, using photoluminescence studies, that radiative bimolecular recombination is dominant at higher excitation densities. Hence, the quantum efficiencies of the perovskite light-emitting diodes increase at higher current densities. This demonstration of effective perovskite electroluminescence offers scope for developing this unique class of materials into efficient and colour-tunable light emitters for low-cost display, lighting and optical communication applications. This is the author accepted manuscript and will be under embargo until 3/2/15. The final version is published in Nature Nanotechnology: http://www.nature.com/nnano/journal/vaop/ncurrent/full/nnano.2014.149.html
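    For readers checking the quoted efficiencies, internal quantum efficiency is typically inferred from the measured external value via the optical outcoupling fraction of the planar device stack; the relation below and the implied ~22-25% outcoupling are a standard planar-device assumption, not a figure stated in this excerpt.

```latex
% Assumed relation between external and internal quantum efficiency, with
% \eta_{out} the fraction of internally generated photons that escape the
% planar stack (an assumption here, not stated in the excerpt above).
\[
  \mathrm{EQE} \approx \eta_{\mathrm{out}} \cdot \mathrm{IQE},
  \qquad
  \eta_{\mathrm{out}} \approx \frac{0.76\%}{3.4\%} \approx 0.22
  \ \text{(infrared device)},
  \qquad
  \frac{0.1\%}{0.4\%} = 0.25 \ \text{(green device)}.
\]
```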

    Negated antonyms

    No full text
    Basic experimental semantic/pragmatic interpretations of negated antonyms (e.g., "not unhappy") as well as negations of various kinds ("sad", "not happy", "unhappy").

    Supplemental materials for preprint: How many observations is one generic worth?

    No full text

    Incremental Understanding of Conjunctive Generic Sentences

    No full text

    Generic interpretation

    No full text

    Incremental understanding of generics

    No full text

    Warm (for Winter): Inferring Comparison Classes in Communication

    No full text
    The meanings of natural language utterances depend heavily on context. Yet, what counts as context is often only implicit in conversation. The utterance "it's warm outside" signals that the temperature outside is relatively high, but the temperature could be high relative to a number of different comparison classes: other days of the year, other weeks, other seasons, etc. Theories of context sensitivity in language agree that the comparison class is a crucial variable for understanding meaning, but little is known about how a listener decides upon the comparison class. Using the case study of gradable adjectives (e.g., "warm"), we extend a Bayesian model of pragmatic inference to reason flexibly about the comparison class and test its qualitative predictions in a large-scale free-production experiment. We find that human listeners infer the comparison class by reasoning about the kinds of observations that would be remarkable enough for a speaker to mention, given the speaker and listener's shared knowledge of the world. Further, we quantitatively synthesize the model and data using Bayesian data analysis, which reveals that usage frequency and a preference for basic-level categories are two main factors in comparison class inference. This work presents new data and reveals the mechanisms by which human listeners recover the relevant aspects of context when understanding language.
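    The sketch below illustrates the general shape of the model described in this abstract: a pragmatic listener jointly infers the temperature and the comparison class by reasoning about a speaker who chooses "warm" in proportion to its informativity under each class. The temperature grid, priors, and thresholds are toy assumptions, not the paper's fitted model or data.

```python
# Toy RSA-style listener: jointly infer temperature and comparison class from
# the bare adjective "warm". All grids, priors, and thresholds below are
# illustrative assumptions, not fitted parameters from the paper.

temps = [0, 10, 20, 30]                               # candidate temperatures (deg C)

temp_prior = {                                        # P(temperature | comparison class)
    "winter days":      {0: 0.6, 10: 0.3, 20: 0.08, 30: 0.02},
    "days of the year": {0: 0.2, 10: 0.3, 20: 0.30, 30: 0.20},
}
class_prior = {"winter days": 0.5, "days of the year": 0.5}
threshold = {"winter days": 10, "days of the year": 20}   # "warm" = strictly above threshold

def literal(utterance, t, c):
    """Truth of the utterance at temperature t under comparison class c."""
    return t > threshold[c] if utterance == "warm" else True   # "null" is always true

def speaker(utterance, t, c, utterances=("warm", "null")):
    """Speaker chooses utterances in proportion to literal-listener informativity."""
    def lit_listener(u):
        mass = {x: temp_prior[c][x] * literal(u, x, c) for x in temps}
        z = sum(mass.values())
        return mass[t] / z if z else 0.0
    scores = {u: lit_listener(u) for u in utterances}
    z = sum(scores.values())
    return scores[utterance] / z if z else 0.0

# Pragmatic listener: joint posterior over (temperature, comparison class) given "warm".
joint = {(t, c): class_prior[c] * temp_prior[c][t] * speaker("warm", t, c)
         for t in temps for c in temp_prior}
z = sum(joint.values())
posterior_class = {c: sum(p for (t, cc), p in joint.items() if cc == c) / z for c in temp_prior}
posterior_temp = {t: sum(p for (tt, c), p in joint.items() if tt == t) / z for t in temps}
print(posterior_class)   # inferred comparison class
print(posterior_temp)    # inferred temperature, concentrated above the class thresholds
```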