Can You Follow Me? Testing Situational Understanding in ChatGPT
Understanding sentence meanings and updating information states appropriately
across time -- what we call "situational understanding" (SU) -- is a critical
ability for human-like AI agents. SU is essential in particular for chat
models, such as ChatGPT, to enable consistent, coherent, and effective dialogue
between humans and AI. Previous works have identified certain SU limitations in
non-chatbot large language models (LLMs), but the extent and causes of these
limitations are not well understood, and capabilities of current chat-based
models in this domain have not been explored. In this work we tackle these
questions, proposing a novel synthetic environment for SU testing which allows
us to do controlled and systematic testing of SU in chat-oriented models,
through assessment of models' ability to track and enumerate environment
states. Our environment also allows for close analysis of dynamics of model
performance, to better understand underlying causes for performance patterns.
We apply our test to ChatGPT, the state-of-the-art chatbot, and find that
despite the fundamental simplicity of the task, the model's performance
reflects an inability to retain correct environment states across time. Our
follow-up analyses suggest that performance degradation is largely because
ChatGPT has non-persistent in-context memory (although it can access the full
dialogue history) and it is susceptible to hallucinated updates -- including
updates that artificially inflate accuracies. Our findings suggest overall that
ChatGPT is not currently equipped for robust tracking of situation states, and
that trust in the impressive dialogue performance of ChatGPT comes with risks.
We release the codebase for reproducing our test environment, as well as all
prompts and API responses from ChatGPT, at
https://github.com/yangalan123/SituationalTesting.
Comment: EMNLP 2023 Main Paper (Camera Ready)
Equipping Transformer with Random-Access Reading for Long-Context Understanding
Long-context modeling presents a significant challenge for transformer-based
large language models (LLMs) due to the quadratic complexity of the
self-attention mechanism and issues with length extrapolation caused by
pretraining exclusively on short inputs. Existing methods address computational
complexity through techniques such as text chunking, the kernel approach, and
structured attention, and tackle length extrapolation problems through
positional encoding, continued pretraining, and data engineering. These
approaches typically require sequential access to the document,
necessitating reading from the first to the last token. We contend that for
goal-oriented reading of long documents, such sequential access is not
necessary, and a proficiently trained model can learn to omit hundreds of less
pertinent tokens. Inspired by human reading behaviors and existing empirical
observations, we propose random access, a novel reading strategy
that enables transformers to efficiently process long documents without
examining every token. Experimental results from pretraining, fine-tuning, and
inference phases validate the efficacy of our method.
Comment: Preliminary works for a Google Student Researcher Project
When Hindsight is Not 20/20: Testing Limits on Reflective Thinking in Large Language Models
Recent studies suggest that self-reflective prompting can significantly
enhance the reasoning capabilities of Large Language Models (LLMs). However,
the use of external feedback as a stop criterion raises doubts about the true
extent of LLMs' ability to emulate human-like self-reflection. In this paper,
we set out to clarify these capabilities under a more stringent evaluation
setting in which we disallow any kind of external feedback. Our findings under
this setting show a split: while self-reflection enhances performance in
TruthfulQA, it adversely affects results in HotpotQA. We conduct follow-up
analyses to clarify the contributing factors in these patterns, and find that
the influence of self-reflection is impacted both by reliability of accuracy in
models' initial responses, and by overall question difficulty: specifically,
self-reflection shows the most benefit when models are less likely to be
correct initially, and when overall question difficulty is higher. We also find
that self-reflection reduces tendency toward majority voting. Based on our
findings, we propose guidelines for decisions on when to implement
self-reflection. We release the codebase for reproducing our experiments at
https://github.com/yanhong-lbh/LLM-SelfReflection-Eval.
Comment: NAACL 2024 Findings paper (Camera-Ready Version)
Characteristics of the Hydrogen Electrode in High Temperature Steam Electrolysis Process
YSZ-electrolyte-supported solid oxide electrolyzer cells (SOECs) using an LSM–YSZ oxygen electrode with three types of hydrogen electrode (Ni–SDC, Ni–YSZ, and LSCM–YSZ) have been fabricated and characterized under different steam contents in the feed gas at 850°C. Electrochemical impedance spectra show that cell resistances increase with steam concentration under both open-circuit voltage and electrolysis conditions, suggesting that the electrolysis reaction becomes more difficult at high steam content. A Pt reference electrode was used to evaluate the contributions of the hydrogen electrode and the oxygen electrode in the electrolysis process. Electrochemical impedance spectra and overpotentials of both electrodes were measured under the same testing conditions. Experimental results show that steam content mainly affects the behavior of the hydrogen electrode but has little influence on the oxygen electrode. Further, the contribution from the hydrogen electrode is dominant in the electrolysis process for Ni-based SOECs, but this contribution decreases for LSCM-based SOECs.
An inflamed mood: studies on the role of inflammation in the pathophysiology and treatment outcome of major depressive disorder
High Efficiency Secondary Somatic Embryogenesis in Hovenia dulcis
Embryogenic callus was obtained from mature seed explants on medium supplemented with 2,4-dichlorophenoxyacetic acid. Primary somatic embryos (SEs) developed only into abnormal plants. Well-developed SEs could be obtained through secondary somatic embryogenesis in both solid and liquid cultures. Temperature strongly affected the induction frequency of secondary embryogenesis. A relatively high temperature (30°C) and germinated SE explants were effective for induction of secondary somatic embryos, whereas a low temperature (20°C) was more suitable for further embryo development, plantlet conversion, and transplant survival. Somatic embryos formed on agar medium had larger cotyledons than those formed in liquid medium. Supplementing 0.1 mg L−1 6-benzyladenine (BA) was effective for plant conversion; the rate of plant conversion was 43.3% in somatic embryos from solid culture and 36.5% in embryos from liquid culture. In vitro plants were successfully acclimatized in the greenhouse. The protocol established in this study will be helpful for large-scale vegetative propagation of this medicinal tree.
A robust and active hybrid catalyst for facile oxygen reduction in solid oxide fuel cells
The sluggish oxygen reduction reaction (ORR) greatly reduces the energy efficiency of solid oxide fuel cells (SOFCs). Here we report our findings in dramatically enhancing the ORR kinetics and durability of the state-of-the-art La0.6Sr0.4Co0.2Fe0.8O3 (LSCF) cathode using a hybrid catalyst coating composed of a conformal PrNi0.5Mn0.5O3 (PNM) thin film with exsoluted PrOx nanoparticles. At 750°C, the hybrid catalyst-coated LSCF cathode shows a polarization resistance of ∼0.022 Ω cm², about 1/6 of that for a bare LSCF cathode (∼0.134 Ω cm²). Further, anode-supported cells with the hybrid catalyst-coated LSCF cathode demonstrate remarkable peak power densities (∼1.21 W cm⁻²) while maintaining excellent durability (0.7 V for ∼500 h). Near Ambient X-ray Photoelectron Spectroscopy (XPS) and Near Edge X-Ray Absorption Fine Structure (NEXAFS) analyses, together with density functional theory (DFT) calculations, indicate that the oxygen-vacancy-rich surfaces of the PrOx nanoparticles greatly accelerate the rate of electron transfer in the ORR, whereas the thin PNM film facilitates rapid oxide-ion transport while drastically enhancing the surface stability of the LSCF electrode.
