774 research outputs found

    LLMs and the Abstraction and Reasoning Corpus: Successes, Failures, and the Importance of Object-based Representations

    Can a Large Language Model (LLM) solve simple abstract reasoning problems? We explore this broad question through a systematic analysis of GPT on the Abstraction and Reasoning Corpus (ARC), a representative benchmark of abstract reasoning ability from limited examples in which solutions require some "core knowledge" of concepts such as objects, goal states, counting, and basic geometry. GPT-4 solves only 13/50 of the most straightforward ARC tasks when using textual encodings for their two-dimensional input-output grids. Our failure analysis reveals that GPT-4's capacity to identify objects and reason about them is significantly influenced by the sequential nature of the text that represents an object within a text encoding of a task. To test this hypothesis, we design a new benchmark, the 1D-ARC, which consists of one-dimensional (array-like) tasks that are more conducive to GPT-based reasoning, and where it indeed performs better than on the (2D) ARC. To alleviate this issue, we propose an object-based representation that is obtained through an external tool, resulting in nearly doubling the performance on solved ARC tasks and near-perfect scores on the easier 1D-ARC. Although the state-of-the-art GPT-4 is unable to "reason" perfectly within non-language domains such as the 1D-ARC or a simple ARC subset, our study reveals that the use of object-based representations can significantly improve its reasoning ability. Visualizations, GPT logs, and data are available at https://khalil-research.github.io/LLM4ARC. Comment: 17 pages, 11 figures

    High-fidelity imaging of a band insulator in a three-dimensional optical lattice clock

    We report on the observation of a high-density, band insulating state in a three-dimensional optical lattice clock. Filled with a nuclear-spin polarized degenerate Fermi gas of 87Sr, the 3D lattice has one atom per site in the ground motional state, thus guarding against frequency shifts due to contact interactions. At this high density where the average distance between atoms is comparable to the probe wavelength, standard imaging techniques suffer from large systematic errors. To spatially probe frequency shifts in the clock and measure thermodynamic properties of this system, accurate imaging techniques at high optical depths are required. Using a combination of highly saturated fluorescence and absorption imaging, we confirm the density distribution in our 3D optical lattice in agreement with a single spin band insulating state. Combining our clock platform with this high filling fraction opens the door to studying new classes of long-lived, many-body states arising from dipolar interactions. Comment: 10 pages, 8 figures

    Observation of mHz-level cooperative Lamb shifts in an optical atomic clock

    We report on the direct observation of resonant electric dipole-dipole interactions in a cubic array of atoms in the many-excitation limit. The interactions, mediated by single-atom couplings to the shared electromagnetic vacuum, are shown to produce spatially-dependent cooperative Lamb shifts when spectroscopically interrogating the mHz-wide optical clock transition in strontium-87. We show that the ensemble-averaged shifts can be suppressed below the level of evaluated systematic uncertainties for state-of-the-art optical atomic clocks. Additionally, we demonstrate that excitation of the atomic dipoles near a Bragg angle can enhance these effects by nearly an order of magnitude compared to non-resonant geometries. Given the remarkable precision of frequency measurements and the high accuracy of the modeled response, our work demonstrates that such a clock is a novel platform for studies of the quantum many-body physics of spins with long-range interactions mediated by propagating photons.

    Psychological Safety and Norm Clarity in Software Engineering Teams

    In the software engineering industry today, companies primarily conduct their work in teams. To increase organizational productivity, it is thus crucial to know the factors that affect team effectiveness. Two team-related concepts that have gained prominence lately are psychological safety and team norms. Still, few studies exist that explore these in a software engineering context. Therefore, with the aim of extending the knowledge of these concepts, we examined if psychological safety and team norm clarity associate positively with software developers' self-assessed team performance and job satisfaction, two important elements of effectiveness. We collected industry survey data from practitioners (N = 217) in 38 development teams working for five different organizations. The result of multiple linear regression analyses indicates that both psychological safety and team norm clarity predict team members' self-assessed performance and job satisfaction. The findings also suggest that clarity of norms is a stronger (30% and 71% stronger, respectively) predictor than psychological safety. This research highlights the need to examine, in more detail, the relationship between social norms and software development. The findings of this study could serve as an empirical baseline for such future work. Comment: Submitted to CHASE'201