1,010 research outputs found

    Concadia: Towards Image-Based Text Generation with a Purpose

    Current deep learning models often achieve excellent results on benchmark image-to-text datasets but fail to generate texts that are useful in practice. We argue that to close this gap, it is vital to distinguish descriptions from captions based on their distinct communicative roles. Descriptions focus on visual features and are meant to replace an image (often to increase accessibility), whereas captions appear alongside an image to supply additional information. To motivate this distinction and help people put it into practice, we introduce the publicly available Wikipedia-based dataset Concadia consisting of 96,918 images with corresponding English-language descriptions, captions, and surrounding context. Using insights from Concadia, models trained on it, and a preregistered human-subjects experiment with human- and model-generated texts, we characterize the commonalities and differences between descriptions and captions. In addition, we show that, for generating both descriptions and captions, it is useful to augment image-to-text models with representations of the textual context in which the image appeared. Comment: Proceedings of EMNLP 202

    Critical Cultural Awareness: Contributions To A Globalizing Psychology

    The number of psychologists whose work crosses cultural boundaries is increasing. Without a critical awareness of their own cultural grounding, they risk imposing the assumptions, concepts, practices, and values of U.S.-centered psychology on societies where they do not fit, as a brief example from the 2004 Indian Ocean tsunami shows. Hermeneutic thinkers offer theoretical resources for gaining cultural awareness. Culture, in the hermeneutic view, is the constellation of meanings that constitutes a way of life. Such cultural meanings, especially in the form of folk psychologies and moral visions, inevitably shape every psychology, including U.S. psychology. The insights of hermeneutics, as well as its conceptual resources and research approaches, open the way for psychological knowledge and practice that are more culturally situated.

    Interpretability at Scale: Identifying Causal Mechanisms in Alpaca

    Obtaining human-interpretable explanations of large, general-purpose language models is an urgent goal for AI safety. However, it is just as important that our interpretability methods are faithful to the causal dynamics underlying model behavior and able to robustly generalize to unseen inputs. Distributed Alignment Search (DAS) is a powerful gradient descent method grounded in a theory of causal abstraction that has uncovered perfect alignments between interpretable symbolic algorithms and small deep learning models fine-tuned for specific tasks. In the present paper, we scale DAS significantly by replacing the remaining brute-force search steps with learned parameters -- an approach we call Boundless DAS. This enables us to efficiently search for interpretable causal structure in large language models while they follow instructions. We apply Boundless DAS to the Alpaca model (7B parameters), which, off the shelf, solves a simple numerical reasoning problem. With Boundless DAS, we discover that Alpaca does this by implementing a causal model with two interpretable boolean variables. Furthermore, we find that the alignment of neural representations with these variables is robust to changes in inputs and instructions. These findings mark a first step toward faithfully understanding the inner workings of our ever-growing and most widely deployed language models. Our tool is extensible to larger LLMs and is released publicly at `https://github.com/stanfordnlp/pyvene`. Comment: NeurIPS 2023 with Author Correction
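
The idea of a causal model with two boolean variables, and of swapping one variable's value between two runs (an interchange intervention), can be illustrated with a toy sketch. The abstract does not name the task or the variables, so everything below is an assumption made for illustration: the interval-membership task, the variable names `left` and `right`, and the helper functions are hypothetical. This is not Boundless DAS and does not use the pyvene API; it only shows the kind of high-level symbolic model such a method would try to align with a network.

```python
from typing import Optional, Tuple

Example = Tuple[float, float, float]  # (x, lower, upper); hypothetical input format


def causal_model(x: float, lower: float, upper: float,
                 override_left: Optional[bool] = None,
                 override_right: Optional[bool] = None) -> bool:
    """Hypothetical high-level model: is x inside [lower, upper]?

    The two intermediate boolean variables can be overridden to
    simulate an intervention on the model's internal state.
    """
    left = (x >= lower) if override_left is None else override_left
    right = (x <= upper) if override_right is None else override_right
    return left and right


def interchange_left(base: Example, source: Example) -> bool:
    """Run the model on `base`, but take the value of `left` from `source`."""
    source_x, source_lower, _ = source
    base_x, base_lower, base_upper = base
    return causal_model(base_x, base_lower, base_upper,
                        override_left=(source_x >= source_lower))


base = (5.0, 1.0, 8.0)    # x lies inside the interval
source = (0.5, 1.0, 8.0)  # x lies below the lower bound

print(causal_model(*base))             # unmodified run on the base input
print(interchange_left(base, source))  # base run with `left` taken from source
```

If a neural network really implements such a model, applying the analogous swap to the aligned neural representation should change the network's output in the same way the symbolic swap changes the output here.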

    pyvene: A Library for Understanding and Improving PyTorch Models via Interventions

    Interventions on model-internal states are fundamental operations in many areas of AI, including model editing, steering, robustness, and interpretability. To facilitate such research, we introduce pyvene, an open-source Python library that supports customizable interventions on a range of different PyTorch modules. pyvene supports complex intervention schemes with an intuitive configuration format, and its interventions can be static or include trainable parameters. We show how pyvene provides a unified and extensible framework for performing interventions on neural models and sharing the intervened-upon models with others. We illustrate the power of the library via interpretability analyses using causal abstraction and knowledge localization. We publish our library through Python Package Index (PyPI) and provide code, documentation, and tutorials at https://github.com/stanfordnlp/pyvene. Comment: 8 pages, 3 figures

    Dosimetric Evaluation of PSMA PET-Delineated Dominant Intraprostatic Lesion Simultaneous Infield Boosts

    Purpose: Prostate cancer is multifocal. However, there often exists a single dominant focus in the gland responsible for driving the biology of the disease. Dose escalation to the dominant lesion is a proposed strategy to increase tumor control. We applied radiobiological modeling to evaluate the dosimetric feasibility and benefit of dominant intraprostatic lesion simultaneous in-field boosts (DIL-SIB) to the gross tumor volume (GTV), defined using a novel molecular positron emission tomography (PET) probe (18F-DCFPyL) directed against prostate specific membrane antigen (PSMA). Methods and Materials: Patients with clinically localized, biopsy-proven prostate cancer underwent preoperative [18F]-DCFPyL PET/computed tomography (CT). DIL-SIB plans were generated by importing the PET/CT into the RayStation treatment planning system. GTV-PET for the DIL-SIB was defined by the highest %SUVmax (percentage of maximum standardized uptake value) that generated a biologically plausible volume. Volumetric arc–based plans incorporating prostate plus DIL-SIB treatment were generated. Tumor control probability (TCP) and normal tissue complication probability (NTCP) with fractionation schemes and boost doses specified in the FLAME (Investigate the Benefit of a Focal Lesion Ablative Microboost in Prostate Cancer; NCT01168479), PROFIT (Prostate Fractionated Irradiation Trial; NCT00304759), PACE (Prostate Advances in Comparative Evidence; NCT01584258), and hypoFLAME (Hypofractionated Focal Lesion Ablative Microboost in prostatE Cancer 2.0; NCT02853110) protocols were compared. Results: Comparative DIL-SIB plans for 6 men were generated from preoperative [18F]-DCFPyL PET/CT. Median boost GTV volume was 1.015 cm³ (0.42-1.83 cm³). Median minimum (D99%) DIL-SIB doses for F35BS, F20BS, F5BS, and F5BS were 97.3 Gy, 80.8 Gy, 46.5 Gy, and 51.5 Gy. TCP within the GTV ranged from 84% to 88% for the standard plans and 95% to 96% for the DIL-SIB plans. Within the rest of the prostate, TCP ranged from 89% to 91% for the standard plans and 90% to 92% for the DIL-SIB plans. NTCP for the rectum was similar for the DIL-SIB plans (0.3%-2.7%) compared with standard plans (0.7%-2.6%). Overall, DIL-SIB plans yielded higher uncomplicated TCP (NTCP, 90%-94%) versus standard plans (NTCP, 83%-85%). Conclusions: PSMA PET provides a novel approach to define GTV for DIL-SIB dose escalation. Work is ongoing to validate PSMA PET-delineated GTV through correlation to coregistered postprostatectomy digitized histopathology.

    The Evolution of Cuspy Triaxial Galaxies Harboring Central Black Holes

    We use numerical simulations to study the evolution of triaxial elliptical galaxies with central black holes. In contrast to earlier numerical studies which used galaxy models with central density "cores," our galaxies have steep central cusps, like those observed in real ellipticals. As a black hole grows in these cuspy triaxial galaxies, the inner regions become rounder owing to chaos induced in the orbit families which populate the model. At larger radii, however, the models maintain their triaxiality, and orbital analyses show that centrophilic orbits there resist stochasticity over many dynamical times. While black hole induced evolution is strong in the inner regions of these galaxies, and reaches out beyond the nominal "sphere of influence" of a black hole, our simulations do not show evidence for a rapid global transformation of the host. The triaxiality of observed elliptical galaxies is therefore not inconsistent with the presence of supermassive black holes at their centers. Comment: 15 pages, 7 figures (1 color). Accepted for publication in Ap

    A Reply to Makelov et al. (2023)'s "Interpretability Illusion" Arguments

    We respond to the recent paper by Makelov et al. (2023), which reviews subspace interchange intervention methods like distributed alignment search (DAS; Geiger et al. 2023) and claims that these methods potentially cause "interpretability illusions". We first review Makelov et al. (2023)'s technical notion of what an "interpretability illusion" is, and then we show that even intuitive and desirable explanations can qualify as illusions in this sense. As a result, their method of discovering "illusions" can reject explanations they consider "non-illusory". We then argue that the illusions Makelov et al. (2023) see in practice are artifacts of their training and evaluation paradigms. We close by emphasizing that, though we disagree with their core characterization, Makelov et al. (2023)'s examples and discussion have undoubtedly pushed the field of interpretability forward. Comment: 20 pages, 14 figures

    Integration of the Total Lightning Jump Algorithm into Current Operational Warning Environment Conceptual Models

    Key points that this analysis will begin to address are: 1) What is physically going on in the cloud when there is a jump in lightning? - Updraft variations, ice fluxes. 2) How do these processes fit in with severe storm conceptual models? 3) What would this information provide an end user (i.e., the forecaster)? - Relate the LJA (lightning jump algorithm) to radar observations, like changes in reflectivity, MESH, VIL, etc., based on multi-Doppler-derived physical relationships. 4) How do we best transition this algorithm into the warning decision process? The known relationship between lightning, updraft strength/volume, and precipitation ice mass production can be extended to the concept of the lightning jump. Examination of the first lightning jump times from 329 storms in Schultz et al. shows an increase in the mean reflectivity profile and mixed phase echo volume during the 10 minutes prior to the lightning jump. Limited dual-Doppler results show that the largest lightning jumps are well correlated in time with increases in updraft strength/volume and precipitation ice mass production; however, the smaller magnitude lightning jumps appear to have more subtle relationships to updraft and ice mass characteristics.