1,010 research outputs found
Concadia: Towards Image-Based Text Generation with a Purpose
Current deep learning models often achieve excellent results on benchmark
image-to-text datasets but fail to generate texts that are useful in practice.
We argue that to close this gap, it is vital to distinguish descriptions from
captions based on their distinct communicative roles. Descriptions focus on
visual features and are meant to replace an image (often to increase
accessibility), whereas captions appear alongside an image to supply additional
information. To motivate this distinction and help people put it into practice,
we introduce the publicly available Wikipedia-based dataset Concadia consisting
of 96,918 images with corresponding English-language descriptions, captions,
and surrounding context. Using insights from Concadia, models trained on it,
and a preregistered human-subjects experiment with human- and model-generated
texts, we characterize the commonalities and differences between descriptions
and captions. In addition, we show that, for generating both descriptions and
captions, it is useful to augment image-to-text models with representations of
the textual context in which the image appeared. Comment: Proceedings of EMNLP 202
Critical Cultural Awareness: Contributions To A Globalizing Psychology
The number of psychologists whose work crosses cultural boundaries is increasing. Without a critical awareness of their own cultural grounding, they risk imposing the assumptions, concepts, practices, and values of U.S.-centered psychology on societies where they do not fit, as a brief example from the 2004 Indian Ocean tsunami shows. Hermeneutic thinkers offer theoretical resources for gaining cultural awareness. Culture, in the hermeneutic view, is the constellation of meanings that constitutes a way of life. Such cultural meanings, especially in the form of folk psychologies and moral visions, inevitably shape every psychology, including U.S. psychology. The insights of hermeneutics, as well as its conceptual resources and research approaches, open the way for psychological knowledge and practice that are more culturally situated.
Interpretability at Scale: Identifying Causal Mechanisms in Alpaca
Obtaining human-interpretable explanations of large, general-purpose language
models is an urgent goal for AI safety. However, it is just as important that
our interpretability methods are faithful to the causal dynamics underlying
model behavior and able to robustly generalize to unseen inputs. Distributed
Alignment Search (DAS) is a powerful gradient descent method grounded in a
theory of causal abstraction that has uncovered perfect alignments between
interpretable symbolic algorithms and small deep learning models fine-tuned for
specific tasks. In the present paper, we scale DAS significantly by replacing
the remaining brute-force search steps with learned parameters -- an approach
we call Boundless DAS. This enables us to efficiently search for interpretable
causal structure in large language models while they follow instructions. We
apply Boundless DAS to the Alpaca model (7B parameters), which, off the shelf,
solves a simple numerical reasoning problem. With Boundless DAS, we discover
that Alpaca does this by implementing a causal model with two interpretable
boolean variables. Furthermore, we find that the alignment of neural
representations with these variables is robust to changes in inputs and
instructions. These findings mark a first step toward faithfully understanding
the inner workings of our ever-growing and most widely deployed language
models. Our tool is extensible to larger LLMs and is released publicly at
`https://github.com/stanfordnlp/pyvene`. Comment: NeurIPS 2023 with Author Correction
pyvene: A Library for Understanding and Improving PyTorch Models via Interventions
Interventions on model-internal states are fundamental operations in many
areas of AI, including model editing, steering, robustness, and
interpretability. To facilitate such research, we introduce pyvene,
an open-source Python library that supports customizable interventions on a
range of different PyTorch modules. pyvene supports complex
intervention schemes with an intuitive configuration format, and its
interventions can be static or include trainable parameters. We show how
pyvene provides a unified and extensible framework for performing
interventions on neural models and sharing the intervened upon models with
others. We illustrate the power of the library via interpretability analyses
using causal abstraction and knowledge localization. We publish our library
through Python Package Index (PyPI) and provide code, documentation, and
tutorials at https://github.com/stanfordnlp/pyvene. Comment: 8 pages, 3 figure
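As a rough illustration of what an intervention on a model-internal state means, the following is a minimal pure-Python sketch on a toy two-layer function. It does not use pyvene or PyTorch, and every name in it (`hidden_layer`, `output_layer`, `run`, `interchange`) is hypothetical and chosen only for this example; the weights are arbitrary.

```python
from typing import Callable, List, Optional

Vector = List[float]


def hidden_layer(x: Vector) -> Vector:
    """Toy internal state: two hand-picked features of the input."""
    return [x[0] + x[1], x[0] - x[1]]


def output_layer(h: Vector) -> float:
    """Toy readout with arbitrary fixed weights."""
    return 2.0 * h[0] + 0.5 * h[1]


def run(x: Vector, intervene: Optional[Callable[[Vector], Vector]] = None) -> float:
    """Run the toy model, optionally rewriting the hidden state first."""
    h = hidden_layer(x)
    if intervene is not None:
        h = intervene(h)
    return output_layer(h)


def interchange(base: Vector, source: Vector, index: int) -> float:
    """Run on `base`, overwriting hidden unit `index` with its value
    from a separate run on `source` (an interchange intervention)."""
    source_h = hidden_layer(source)

    def patch(h: Vector) -> Vector:
        patched = list(h)
        patched[index] = source_h[index]
        return patched

    return run(base, intervene=patch)


if __name__ == "__main__":
    base, source = [1.0, 2.0], [3.0, 1.0]
    print(run(base))                                     # unmodified run
    print(run(base, intervene=lambda h: [h[0], 0.0]))    # static intervention: zero out unit 1
    print(interchange(base, source, index=0))            # swap in unit 0 from the source run
```

The static case fixes a hidden unit to a constant, while the interchange case copies it from another input's run; a library such as the one described above would apply the same idea to the modules of a real network.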
Dosimetric Evaluation of PSMA PET-Delineated Dominant Intraprostatic Lesion Simultaneous Infield Boosts
Purpose: Prostate cancer is multifocal. However, there often exists a single dominant focus in the gland responsible for driving the biology of the disease. Dose escalation to the dominant lesion is a proposed strategy to increase tumor control. We applied radiobiological modeling to evaluate the dosimetric feasibility and benefit of dominant intraprostatic lesion simultaneous in-field boosts (DIL-SIB) to the gross tumor volume (GTV), defined using a novel molecular positron emission tomography (PET) probe ([18F]-DCFPyL) directed against prostate specific membrane antigen (PSMA). Methods and Materials: Patients with clinically localized, biopsy-proven prostate cancer underwent preoperative [18F]-DCFPyL PET/computed tomography (CT). DIL-SIB plans were generated by importing the PET/CT into the RayStation treatment planning system. GTV-PET for the DIL-SIB was defined by the highest %SUVmax (percentage of maximum standardized uptake value) that generated a biologically plausible volume. Volumetric arc–based plans incorporating prostate plus DIL-SIB treatment were generated. Tumor control probability (TCP) and normal tissue complication probability (NTCP) with fractionation schemes and boost doses specified in the FLAME (Investigate the Benefit of a Focal Lesion Ablative Microboost in Prostate Cancer; NCT01168479), PROFIT (Prostate Fractionated Irradiation Trial; NCT00304759), PACE (Prostate Advances in Comparative Evidence; NCT01584258), and hypoFLAME (Hypofractionated Focal Lesion Ablative Microboost in prostatE Cancer 2.0; NCT02853110) protocols were compared. Results: Comparative DIL-SIB plans for 6 men were generated from preoperative [18F]-DCFPyL PET/CT. Median boost GTV volume was 1.015 cm³ (0.42-1.83 cm³). Median minimum (D99%) DIL-SIB doses for F35BS, F20BS, F5BS, and F5BS were 97.3 Gy, 80.8 Gy, 46.5 Gy, and 51.5 Gy. TCP within the GTV ranged from 84% to 88% for the standard plans and 95% to 96% for the DIL-SIB plans. Within the rest of the prostate, TCP ranged from 89% to 91% for the standard plans and 90% to 92% for the DIL-SIB plans. NTCP for the rectum was similar for the DIL-SIB plans (0.3%-2.7%) compared with standard plans (0.7%-2.6%). Overall, DIL-SIB plans yielded higher uncomplicated TCP (NTCP, 90%-94%) versus standard plans (NTCP, 83%-85%). Conclusions: PSMA PET provides a novel approach to define GTV for DIL-SIB dose escalation. Work is ongoing to validate PSMA PET-delineated GTV through correlation to coregistered postprostatectomy digitized histopathology.
The Evolution of Cuspy Triaxial Galaxies Harboring Central Black Holes
We use numerical simulations to study the evolution of triaxial elliptical
galaxies with central black holes. In contrast to earlier numerical studies
which used galaxy models with central density "cores," our galaxies have
steep central cusps, like those observed in real ellipticals. As a black hole
grows in these cuspy triaxial galaxies, the inner regions become rounder owing
to chaos induced in the orbit families which populate the model. At larger
radii, however, the models maintain their triaxiality, and orbital analyses
show that centrophilic orbits there resist stochasticity over many dynamical
times. While black hole induced evolution is strong in the inner regions of
these galaxies, and reaches out beyond the nominal "sphere of influence" of a
black hole, our simulations do not show evidence for a rapid global
transformation of the host. The triaxiality of observed elliptical galaxies is
therefore not inconsistent with the presence of supermassive black holes at
their centers. Comment: 15 pages, 7 figures (1 color). Accepted for publication in Ap
A Reply to Makelov et al. (2023)'s "Interpretability Illusion" Arguments
We respond to the recent paper by Makelov et al. (2023), which reviews
subspace interchange intervention methods like distributed alignment search
(DAS; Geiger et al. 2023) and claims that these methods potentially cause
"interpretability illusions". We first review Makelov et al. (2023)'s technical
notion of what an "interpretability illusion" is, and then we show that even
intuitive and desirable explanations can qualify as illusions in this sense. As
a result, their method of discovering "illusions" can reject explanations they
consider "non-illusory". We then argue that the illusions Makelov et al. (2023)
see in practice are artifacts of their training and evaluation paradigms. We
close by emphasizing that, though we disagree with their core characterization,
Makelov et al. (2023)'s examples and discussion have undoubtedly pushed the
field of interpretability forward. Comment: 20 pages, 14 figure
Integration of the Total Lightning Jump Algorithm into Current Operational Warning Environment Conceptual Models
Key points that this analysis will begin to address are: 1) What is physically going on in the cloud when there is a jump in lightning (updraft variations, ice fluxes)? 2) How do these processes fit in with severe storm conceptual models? 3) What would this information provide an end user (i.e., the forecaster)? This involves relating the LJA to radar observations, such as changes in reflectivity, MESH, and VIL, based on multi-Doppler derived physical relationships. 4) How do we best transition this algorithm into the warning decision process? The known relationship between lightning, updraft strength/volume, and precipitation ice mass production can be extended to the concept of the lightning jump. Examination of the first lightning jump times from 329 storms in Schultz et al. shows an increase in the mean reflectivity profile and mixed phase echo volume during the 10 minutes prior to the lightning jump. Limited dual-Doppler results show that the largest lightning jumps are well correlated in time with increases in updraft strength/volume and precipitation ice mass production; however, the smaller magnitude lightning jumps appear to have more subtle relationships to updraft and ice mass characteristics.