LLMs and the Abstraction and Reasoning Corpus: Successes, Failures, and the Importance of Object-based Representations
Can a Large Language Model (LLM) solve simple abstract reasoning problems? We
explore this broad question through a systematic analysis of GPT on the
Abstraction and Reasoning Corpus (ARC), a representative benchmark of abstract
reasoning ability from limited examples in which solutions require some "core
knowledge" of concepts such as objects, goal states, counting, and basic
geometry. GPT-4 solves only 13/50 of the most straightforward ARC tasks when
using textual encodings for their two-dimensional input-output grids. Our
failure analysis reveals that GPT-4's capacity to identify objects and reason
about them is significantly influenced by the sequential nature of the text
that represents an object within a text encoding of a task. To test this
hypothesis, we design a new benchmark, the 1D-ARC, which consists of
one-dimensional (array-like) tasks that are more conducive to GPT-based
reasoning, and where it indeed performs better than on the (2D) ARC. To
alleviate this issue, we propose an object-based representation that is
obtained through an external tool, resulting in nearly doubling the performance
on solved ARC tasks and near-perfect scores on the easier 1D-ARC. Although the
state-of-the-art GPT-4 is unable to "reason" perfectly within non-language
domains such as the 1D-ARC or a simple ARC subset, our study reveals that the
use of object-based representations can significantly improve its reasoning
ability. Visualizations, GPT logs, and data are available at
https://khalil-research.github.io/LLM4ARC.
Comment: 17 pages, 11 figures
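The contrast the abstract draws between a textual grid encoding and an object-based representation can be illustrated with a minimal sketch. The abstract does not specify the paper's exact encoding or its external tool, so everything below is an assumption made for illustration: the function names (`encode_grid_as_text`, `extract_objects`, `describe_objects`) are hypothetical, the text encoding is a plain row-by-row dump, and "object" is taken to mean a 4-connected group of same-colored non-background cells.

```python
from collections import deque


def encode_grid_as_text(grid):
    """Row-by-row text encoding: one line per row, cells separated by spaces."""
    return "\n".join(" ".join(str(cell) for cell in row) for row in grid)


def extract_objects(grid, background=0):
    """Group same-colored, 4-connected, non-background cells into objects.

    Assumption: this connected-component definition stands in for whatever
    object abstraction the paper's external tool actually uses.
    """
    rows, cols = len(grid), len(grid[0])
    seen = set()
    objects = []
    for r in range(rows):
        for c in range(cols):
            if grid[r][c] == background or (r, c) in seen:
                continue
            color = grid[r][c]
            queue = deque([(r, c)])
            seen.add((r, c))
            cells = []
            # Breadth-first search over neighbors that share the same color.
            while queue:
                y, x = queue.popleft()
                cells.append((y, x))
                for ny, nx in ((y + 1, x), (y - 1, x), (y, x + 1), (y, x - 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and (ny, nx) not in seen
                            and grid[ny][nx] == color):
                        seen.add((ny, nx))
                        queue.append((ny, nx))
            objects.append({"color": color, "cells": sorted(cells)})
    return objects


def describe_objects(objects):
    """Render each object as one line of text (hypothetical prompt format)."""
    return "\n".join(
        f"object {i}: color={obj['color']}, size={len(obj['cells'])}, "
        f"cells={obj['cells']}"
        for i, obj in enumerate(objects, start=1)
    )


# A toy grid with one vertical bar (color 3) and one shorter bar (color 5).
grid = [
    [0, 3, 0, 0],
    [0, 3, 0, 5],
    [0, 3, 0, 5],
]

print(encode_grid_as_text(grid))
print(describe_objects(extract_objects(grid)))
```

In the row-by-row text, the three cells of the vertical bar are separated by other tokens, whereas the object description lists them together; this is one way to picture why the sequential layout of an object in the text could matter.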
High-fidelity imaging of a band insulator in a three-dimensional optical lattice clock
We report on the observation of a high-density, band insulating state in a
three-dimensional optical lattice clock. Filled with a nuclear-spin polarized
degenerate Fermi gas of 87Sr, the 3D lattice has one atom per site in the
ground motional state, thus guarding against frequency shifts due to contact
interactions. At this high density where the average distance between atoms is
comparable to the probe wavelength, standard imaging techniques suffer from
large systematic errors. To spatially probe frequency shifts in the clock and
measure thermodynamic properties of this system, accurate imaging techniques at
high optical depths are required. Using a combination of highly saturated
fluorescence and absorption imaging, we confirm the density distribution in our
3D optical lattice in agreement with a single spin band insulating state.
Combining our clock platform with this high filling fraction opens the door to
studying new classes of long-lived, many-body states arising from dipolar
interactions.
Comment: 10 pages, 8 figures
Observation of mHz-level cooperative Lamb shifts in an optical atomic clock
We report on the direct observation of resonant electric dipole-dipole
interactions in a cubic array of atoms in the many-excitation limit. The
interactions, mediated by single-atom couplings to the shared electromagnetic
vacuum, are shown to produce spatially-dependent cooperative Lamb shifts when
spectroscopically interrogating the mHz-wide optical clock transition in
strontium-87. We show that the ensemble-averaged shifts can be suppressed below
the level of evaluated systematic uncertainties for state-of-the-art optical
atomic clocks. Additionally, we demonstrate that excitation of the atomic
dipoles near a Bragg angle can enhance these effects by nearly an order of
magnitude compared to non-resonant geometries. Given the remarkable precision
of frequency measurements and the high accuracy of the modeled response, our
work demonstrates that such a clock is a novel platform for studies of the
quantum many-body physics of spins with long-range interactions mediated by
propagating photons.
Psychological Safety and Norm Clarity in Software Engineering Teams
In the software engineering industry today, companies primarily conduct their
work in teams. To increase organizational productivity, it is thus crucial to
know the factors that affect team effectiveness. Two team-related concepts that
have gained prominence lately are psychological safety and team norms. Still,
few studies exist that explore these in a software engineering context.
Therefore, with the aim of extending the knowledge of these concepts, we
examined whether psychological safety and team norm clarity associate positively
with software developers' self-assessed team performance and job satisfaction,
two important elements of effectiveness.
We collected industry survey data from practitioners (N = 217) in 38
development teams working for five different organizations. The results of
multiple linear regression analyses indicate that both psychological safety
and team norm clarity predict team members' self-assessed performance and job
satisfaction. The findings also suggest that clarity of norms is a stronger
(30% and 71% stronger, respectively) predictor than psychological safety.
This research highlights the need to examine, in more detail, the
relationship between social norms and software development. The findings of
this study could serve as an empirical baseline for such future work.
Comment: Submitted to CHASE'201