A high-reproducibility and high-accuracy method for automated topic classification
Much of human knowledge sits in large databases of unstructured text.
Leveraging this knowledge requires algorithms that extract and record metadata
on unstructured text documents. Assigning topics to documents will enable
intelligent search, statistical characterization, and meaningful
classification. Latent Dirichlet allocation (LDA) is the state-of-the-art in
topic classification. Here, we perform a systematic theoretical and numerical
analysis demonstrating that current optimization techniques for LDA often
yield results that fail to accurately infer the most suitable model
parameters. Adapting approaches for community detection in networks, we propose
a new algorithm that displays high reproducibility, high accuracy, and high
computational efficiency. We apply it to a large set of documents in
the English Wikipedia and reveal its hierarchical structure. Our algorithm
promises to make "big data" text analysis systems more reliable.
Comment: 23 pages, 24 figures
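The abstract above does not describe the authors' own algorithm in detail, but the baseline it critiques, LDA topic inference, can be sketched with scikit-learn. The corpus, topic count, and random seed below are illustrative assumptions, not from the paper:

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.decomposition import LatentDirichletAllocation

# Tiny hypothetical corpus: two astronomy-flavored and two
# biology-flavored documents.
docs = [
    "stars galaxies telescope orbit planet",
    "planet orbit telescope stars light",
    "genes protein cell biology enzyme",
    "cell enzyme protein dna biology",
]

# Bag-of-words counts, then LDA with 2 latent topics.
counts = CountVectorizer().fit_transform(docs)
lda = LatentDirichletAllocation(n_components=2, random_state=0)

# Each row of doc_topics is a probability distribution over topics.
doc_topics = lda.fit_transform(counts)

# Assign each document its highest-probability topic.
assignments = doc_topics.argmax(axis=1)
print(assignments)
```

In practice, the reproducibility problem the paper raises shows up here as sensitivity to `random_state`: rerunning the fit with different seeds can produce different topic assignments on the same corpus.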
CogBench: a large language model walks into a psychology lab
Large language models (LLMs) have significantly advanced the field of
artificial intelligence. Yet, evaluating them comprehensively remains
challenging. We argue that this is partly due to the predominant focus on
performance metrics in most benchmarks. This paper introduces CogBench, a
benchmark that includes ten behavioral metrics derived from seven cognitive
psychology experiments. This novel approach offers a toolkit for phenotyping
LLMs' behavior. We apply CogBench to 35 LLMs, yielding a rich and diverse
dataset. We analyze this data using statistical multilevel modeling techniques,
accounting for the nested dependencies among fine-tuned versions of specific
LLMs. Our study highlights the crucial role of model size and reinforcement
learning from human feedback (RLHF) in improving performance and aligning with
human behavior. Interestingly, we find that open-source models are less
risk-prone than proprietary models and that fine-tuning on code does not
necessarily enhance LLMs' behavior. Finally, we explore the effects of
prompt-engineering techniques. We discover that chain-of-thought prompting
improves probabilistic reasoning, while take-a-step-back prompting fosters
model-based behaviors.
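The multilevel modeling the abstract mentions, accounting for fine-tuned variants nested within base-model families, can be sketched as a random-intercept mixed model with statsmodels. All data, family names, and effect sizes below are invented for illustration:

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(0)

# Hypothetical data: a behavioral score for fine-tuned variants
# nested within three base-model families.
families = np.repeat(["family_a", "family_b", "family_c"], 10)
size = rng.uniform(1, 70, size=30)  # model size in billions (invented)
family_effect = {"family_a": 0.5, "family_b": 0.0, "family_c": -0.4}
score = (0.02 * size
         + np.array([family_effect[f] for f in families])
         + rng.normal(0, 0.1, size=30))
df = pd.DataFrame({"family": families, "size": size, "score": score})

# Random-intercept model: variants of the same family share a
# family-level intercept, capturing the nested dependency.
model = smf.mixedlm("score ~ size", df, groups=df["family"])
result = model.fit()
print(result.params)
```

The fixed-effect coefficient on `size` then estimates the size effect after pooling across families, rather than treating each fine-tuned variant as an independent observation.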
Food Pulses Increase Longevity and Induce Cyclical Egg Production in Mediterranean Fruit Flies
1. Inasmuch as virtually all studies on mortality and reproduction in insects are conducted under conditions in which food availability is constant, little is known about the demographic response of insects to variable food environments. For example, it is not known if and to what extent the life expectancy of insects subjected to shortages of high-quality food will increase and/or whether this increase is associated with major decreases in lifetime reproduction.
2. Therefore cohorts of 100 individual female medflies were subjected to different sets of conditions of protein availability (interspersed with sugar-only diets) including ad libitum sugar-only (no protein), ad libitum protein and full (protein) diet either every 2nd, 4th, 6th, 11th, or 21st day, as well as two lag-treatments (1 day full diet followed by 30 days sugar-only, followed by one of two cyclical treatments).
3. Both life expectancy and lifetime reproduction were strongly affected by specific treatments. Specifically (i) mortality was inversely related to frequency of protein availability whereas lifetime reproduction was directly related; (ii) distinct cycles in reproduction began to appear when food pulse cycles were as short as every 4 days. However, egg-laying peaks and troughs were particularly pronounced in the 10- and 20-day food pulse cycles; (iii) the peak and trough levels were inversely related to cycle length; and (iv) the within-cycle height was independent of cycle length, occurring 4 days after protein food was made available to the cohort whether the cycle length was 5, 10 or 20 days.
4. The results shed new light on the within- and between-cycle and lifetime dynamics of reproduction when insects are subjected to variable food environments and indicate that medfly females track food level very closely.
Genetic Analysis of the APAF1 Gene in Male Germ Cell Tumors
Cytogenetic and molecular analyses have shown that the chromosome band 12q22 is recurrently deleted in male germ cell tumors (GCTs), indicating the presence of a candidate tumor suppressor gene (TSG) in this region. To identify the TSG, we mapped the APAF1 gene, a proapoptotic mammalian homologue of ced-4, to chromosomal band 12q22, which suggested that it might be the candidate deleted gene in GCTs. To determine the role of APAF1 in the pathogenesis of GCT, we further localized the gene between the polymorphic markers D12S1671 and D12S1082 at 12q22, characterized its normal genomic structure, and analyzed its alterations in GCTs. The APAF1 gene comprises 27 exons, with the coding region spanning 26. The region containing APAF1 was found to be deleted in GCT by fluorescence in situ hybridization analysis, but without evidence of coding sequence alterations. RT-PCR and Western blot analysis showed APAF1 gene expression at detectable levels in all GCT cell lines analyzed. An aberrant-sized APAF1 protein was seen in one cell line. This and two other cell lines carrying APAF1 deletions also exhibited defects in dATP-mediated caspase-3 activation. Caspase-3 activity was effectively restored by addition of recombinant caspase-9 and APAF1 proteins, and to a lesser extent by caspase-9 alone, but not by APAF1 alone. These data do not support a TSG role for APAF1, but defects in other components of the apoptotic pathway that may be related to 12q22 deletion cannot be ruled out.
Can language models learn from explanations in context?
Large language models can perform new tasks by adapting to a few in-context
examples. For humans, rapid learning from examples can benefit from
explanations that connect examples to task principles. We therefore investigate
whether explanations of few-shot examples can allow language models to adapt
more effectively. We annotate a set of 40 challenging tasks from BIG-Bench with
explanations of answers to a small subset of questions, as well as a variety of
matched control explanations. We evaluate the effects of various zero-shot and
few-shot prompts that include different types of explanations, instructions,
and controls on the performance of a range of large language models. We analyze
these results using statistical multilevel modeling techniques that account for
the nested dependencies among conditions, tasks, prompts, and models. We find
that explanations of examples can improve performance. Adding untuned
explanations to a few-shot prompt offers a modest improvement in performance:
about one third the effect size of adding few-shot examples, but twice the
effect size of task instructions. We then show that explanations tuned for performance
on a small validation set offer substantially larger benefits; building a
prompt by selecting examples and explanations together substantially improves
performance over selecting examples alone. Hand-tuning explanations can
substantially improve performance on challenging tasks. Furthermore, even
untuned explanations outperform carefully matched controls, suggesting that the
benefits are due to the link between an example and its explanation, rather
than lower-level features of the language used. However, only large models can
benefit from explanations. In summary, explanations can support the in-context
learning abilities of large language models.