Can language models learn from explanations in context?
Large language models can perform new tasks by adapting to a few in-context
examples. For humans, rapid learning from examples can benefit from
explanations that connect examples to task principles. We therefore investigate
whether explanations of few-shot examples can allow language models to adapt
more effectively. We annotate a set of 40 challenging tasks from BIG-Bench with
explanations of answers to a small subset of questions, as well as a variety of
matched control explanations. We evaluate the effects of various zero-shot and
few-shot prompts that include different types of explanations, instructions,
and controls on the performance of a range of large language models. We analyze
these results using statistical multilevel modeling techniques that account for
the nested dependencies among conditions, tasks, prompts, and models. We find
that explanations of examples can improve performance. Adding untuned
explanations to a few-shot prompt offers a modest improvement in performance:
about one-third the effect size of adding few-shot examples, but twice the effect
size of task instructions. We then show that explanations tuned for performance
on a small validation set offer substantially larger benefits; building a
prompt by selecting examples and explanations together substantially improves
performance over selecting examples alone. Hand-tuning explanations can
substantially improve performance on challenging tasks. Furthermore, even
untuned explanations outperform carefully matched controls, suggesting that the
benefits are due to the link between an example and its explanation, rather
than lower-level features of the language used. However, only large models can
benefit from explanations. In summary, explanations can support the in-context
learning abilities of large language models.
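The reported effect-size ordering can be illustrated with hypothetical numbers (none of these absolute values are from the paper; they only encode the stated ratios):

```python
# Hypothetical standardized effect sizes encoding the stated ratios
# (the abstract reports ratios, not these absolute values).
few_shot_effect = 0.12                        # assumed effect of adding examples
explanation_effect = few_shot_effect / 3      # untuned explanations: ~1/3 of that
instruction_effect = explanation_effect / 2   # instructions: half of explanations

print(f"examples: {few_shot_effect:.3f}, "
      f"explanations: {explanation_effect:.3f}, "
      f"instructions: {instruction_effect:.3f}")
```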
Fine-tuning language models to find agreement among humans with diverse preferences
Recent work on large language models (LLMs) has used fine-tuning to align
outputs with the preferences of a prototypical user. This work assumes that
human preferences are static and homogeneous across individuals, so that
aligning to a single "generic" user will confer more general alignment. Here,
we embrace the heterogeneity of human preferences to consider a different
challenge: how might a machine help people with diverse views find agreement?
We fine-tune a 70 billion parameter LLM to generate statements that maximize
the expected approval for a group of people with potentially diverse opinions.
Human participants provide written opinions on thousands of questions touching
on moral and political issues (e.g., "should we raise taxes on the rich?"), and
rate the LLM's generated candidate consensus statements for agreement and
quality. A reward model is then trained to predict individual preferences,
enabling it to quantify and rank consensus statements in terms of their appeal
to the overall group, defined according to different aggregation (social
welfare) functions. The model produces consensus statements that are preferred
by human users over those from prompted LLMs (>70%) and significantly
outperforms a tight fine-tuned baseline that lacks the final ranking step.
Further, our best model's consensus statements are preferred over the best
human-generated opinions (>65%). We find that when we silently constructed
consensus statements from only a subset of group members, those who were
excluded were more likely to dissent, revealing the sensitivity of the
consensus to individual contributions. These results highlight the potential to
use LLMs to help groups of humans align their values with one another.
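The final ranking step can be sketched as scoring candidate statements with a social welfare function over per-person predicted rewards. A minimal illustration with made-up reward values and two common aggregation choices (neither is claimed to be the paper's exact function):

```python
# Hypothetical per-person predicted approval for two candidate statements,
# standing in for the output of a trained reward model.
candidates = {
    "statement_a": [0.9, 0.2, 0.8],
    "statement_b": [0.6, 0.6, 0.6],
}

def utilitarian(rewards):
    # Mean welfare across the group.
    return sum(rewards) / len(rewards)

def rawlsian(rewards):
    # Welfare of the worst-off group member.
    return min(rewards)

best_util = max(candidates, key=lambda c: utilitarian(candidates[c]))
best_rawls = max(candidates, key=lambda c: rawlsian(candidates[c]))
print(best_util, best_rawls)
```

Different welfare functions can pick different winners: the utilitarian mean rewards high average approval, while the Rawlsian minimum penalizes statements that leave any member strongly dissenting.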
Bright light-emitting diodes based on organometal halide perovskite.
Solid-state light-emitting devices based on direct-bandgap semiconductors have, over the past two decades, been utilized as energy-efficient sources of lighting. However, fabrication of these devices typically relies on expensive high-temperature and high-vacuum processes, rendering them uneconomical for use in large-area displays. Here, we report high-brightness light-emitting diodes based on solution-processed organometal halide perovskites. We demonstrate electroluminescence in the near-infrared, green and red by tuning the halide compositions in the perovskite. In our infrared device, a thin 15 nm layer of CH3NH3PbI(3-x)Cl(x) perovskite emitter is sandwiched between larger-bandgap titanium dioxide (TiO2) and poly(9,9'-dioctylfluorene) (F8) layers, effectively confining electrons and holes in the perovskite layer for radiative recombination. We report an infrared radiance of 13.2 W sr^-1 m^-2 at a current density of 363 mA cm^-2, with highest external and internal quantum efficiencies of 0.76% and 3.4%, respectively. In our green light-emitting device with an ITO/PEDOT:PSS/CH3NH3PbBr3/F8/Ca/Ag structure, we achieved a luminance of 364 cd m^-2 at a current density of 123 mA cm^-2, giving external and internal quantum efficiencies of 0.1% and 0.4%, respectively. We show, using photoluminescence studies, that radiative bimolecular recombination is dominant at higher excitation densities. Hence, the quantum efficiencies of the perovskite light-emitting diodes increase at higher current densities. This demonstration of effective perovskite electroluminescence offers scope for developing this unique class of materials into efficient and colour-tunable light emitters for low-cost display, lighting and optical communication applications. This is the author accepted manuscript, under embargo until 3/2/15. The final version is published in Nature Nanotechnology: http://www.nature.com/nnano/journal/vaop/ncurrent/full/nnano.2014.149.html
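As a rough consistency check on the reported efficiencies (an assumption on my part, not a calculation from the paper): if external quantum efficiency equals internal quantum efficiency times an outcoupling fraction, EQE = eta_out * IQE, the numbers above imply:

```python
# Reported efficiencies, in percent (from the abstract above).
eqe_ir, iqe_ir = 0.76, 3.4   # infrared device: external, internal QE
eqe_gr, iqe_gr = 0.10, 0.40  # green device: external, internal QE

# Implied outcoupling fraction under the assumed relation EQE = eta_out * IQE.
eta_ir = eqe_ir / iqe_ir
eta_gr = eqe_gr / iqe_gr
print(f"implied outcoupling: IR ~{eta_ir:.0%}, green ~{eta_gr:.0%}")
```

Both devices imply that roughly a quarter of internally generated photons escape, a plausible figure for planar thin-film emitters.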
Negated antonyms
Basic experimental data on semantic/pragmatic interpretations of negated antonyms (e.g., "not unhappy") as well as of negations of various kinds ("sad", "not happy", "unhappy").
Warm (for Winter): Inferring Comparison Classes in Communication
The meanings of natural language utterances depend heavily on context. Yet, what counts as context is often only implicit in conversation. The utterance "it's warm outside" signals that the temperature outside is relatively high, but the temperature could be high relative to a number of different comparison classes: other days of the year, other weeks, other seasons, etc. Theories of context sensitivity in language agree that the comparison class is a crucial variable for understanding meaning, but little is known about how a listener decides upon the comparison class. Using the case study of gradable adjectives (e.g., "warm"), we extend a Bayesian model of pragmatic inference to reason flexibly about the comparison class and test its qualitative predictions in a large-scale free-production experiment. We find that human listeners infer the comparison class by reasoning about the kinds of observations that would be remarkable enough for a speaker to mention, given the speaker and listener's shared knowledge of the world. Further, we quantitatively synthesize the model and data using Bayesian data analysis, which reveals that usage frequency and a preference for basic-level categories are two main factors in comparison class inference. This work presents new data and reveals the mechanisms by which human listeners recover the relevant aspects of context when understanding language.
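A minimal sketch of this kind of comparison-class inference (not the authors' model; all distributions and priors below are hypothetical): a listener who hears "warm" favors classes in which the observed temperature would be remarkable enough to mention.

```python
import math

temp = 15.0  # observed temperature (deg C) described as "warm"

# Hypothetical temperature distributions and priors for two comparison classes.
classes = {
    "days in winter": {"mu": 2.0, "sigma": 4.0, "prior": 0.5},
    "days in the year": {"mu": 14.0, "sigma": 9.0, "prior": 0.5},
}

def remarkability(temp, c):
    # Fraction of the class colder than temp (normal CDF): the higher it is,
    # the more noteworthy, hence mention-worthy, "warm" is for this class.
    z = (temp - c["mu"]) / c["sigma"]
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

# Listener posterior: P(class | "warm", temp) proportional to prior * remarkability.
scores = {name: c["prior"] * remarkability(temp, c) for name, c in classes.items()}
total = sum(scores.values())
posterior = {name: s / total for name, s in scores.items()}
print(posterior)
```

With these made-up parameters the posterior favors "days in winter": 15 °C is unremarkable for the year as a whole but striking for winter, mirroring the "warm for winter" reading.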