Bayes-TrEx: a Bayesian Sampling Approach to Model Transparency by Example
Post-hoc explanation methods are gaining popularity for interpreting,
understanding, and debugging neural networks. Most analyses using such methods
explain decisions in response to inputs drawn from the test set. However, the
test set may have few examples that trigger some model behaviors, such as
high-confidence failures or ambiguous classifications. To address these
challenges, we introduce a flexible model inspection framework: Bayes-TrEx.
Given a data distribution, Bayes-TrEx finds in-distribution examples with a
specified prediction confidence. We demonstrate several use cases of
Bayes-TrEx, including revealing highly confident (mis)classifications,
visualizing class boundaries via ambiguous examples, understanding novel-class
extrapolation behavior, and exposing neural network overconfidence. We use
Bayes-TrEx to study classifiers trained on CLEVR, MNIST, and Fashion-MNIST, and
we show that this framework enables more flexible holistic model analysis than
just inspecting the test set. Code is available at
https://github.com/serenabooth/Bayes-TrEx.
Comment: Accepted at AAAI 202
Quality-Diversity Generative Sampling for Learning with Synthetic Data
Generative models can serve as surrogates for some real data sources by
creating synthetic training datasets, but in doing so they may transfer biases
to downstream tasks. We focus on protecting quality and diversity when
generating synthetic training datasets. We propose quality-diversity generative
sampling (QDGS), a framework for sampling data uniformly across a user-defined
measure space, despite the data coming from a biased generator. QDGS is a
model-agnostic framework that uses prompt guidance to optimize a quality
objective across measures of diversity for synthetically generated data,
without fine-tuning the generative model. Using balanced synthetic datasets
generated by QDGS, we first debias classifiers trained on color-biased shape
datasets as a proof-of-concept. By applying QDGS to facial data synthesis, we
prompt for desired semantic concepts, such as skin tone and age, to create an
intersectional dataset with a combined blend of visual features. Leveraging
this balanced data for training classifiers improves fairness while maintaining
accuracy on facial recognition benchmarks. Code available at:
https://github.com/Cylumn/qd-generative-sampling.
Comment: Accepted at AAAI 2024; 7 pages main, 12 pages total, 9 figures
Models of human preference for learning reward functions
The utility of reinforcement learning is limited by the alignment of reward
functions with the interests of human stakeholders. One promising method for
alignment is to learn the reward function from human-generated preferences
between pairs of trajectory segments, a type of reinforcement learning from
human feedback (RLHF). These human preferences are typically assumed to be
informed solely by partial return, the sum of rewards along each segment. We
find this assumption to be flawed and propose modeling human preferences
instead as informed by each segment's regret, a measure of a segment's
deviation from optimal decision-making. Given infinitely many preferences
generated according to regret, we prove that we can identify a reward function
equivalent to the reward function that generated those preferences, and we
prove that the previous partial return model lacks this identifiability
property in multiple contexts. We empirically show that our proposed regret
preference model outperforms the partial return preference model with finite
training data in otherwise the same setting. Additionally, we find that our
proposed regret preference model better predicts real human preferences and
also learns reward functions from these preferences that lead to policies that
are better human-aligned. Overall, this work establishes that the choice of
preference model is impactful, and our proposed regret preference model
provides an improvement upon a core assumption of recent research. We have open
sourced our experimental code, the human preferences dataset we gathered, and
our training and preference elicitation interfaces for gathering such a
dataset.
Comment: 16 pages (40 pages with references and appendix), 23 figures
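The contrast between the two preference models can be illustrated with a small sketch. It is not the authors' code: the deterministic form of regret used here (optimal value at the segment's start minus the sum of its partial return and the optimal value at its end), the logistic link from score difference to preference probability, and the gridworld numbers are all assumptions chosen for illustration.

```python
import math


def partial_return(rewards):
    """Sum of rewards along a segment."""
    return sum(rewards)


def deterministic_regret(rewards, v_start, v_end):
    """Assumed regret form: V*(start) - (partial return + V*(end)).

    v_start and v_end are optimal state values supplied by the caller.
    """
    return v_start - (partial_return(rewards) + v_end)


def preference_probability(score_a, score_b):
    """Assumed logistic link: probability that segment A is preferred to B."""
    return 1.0 / (1.0 + math.exp(-(score_a - score_b)))


# Hypothetical gridworld: every step costs -1, so V*(s) = -(distance to goal).
# Segment A walks two steps toward the goal (distance 4 -> 2).
# Segment B walks two steps away from it (distance 4 -> 6).
seg_a = {"rewards": [-1.0, -1.0], "v_start": -4.0, "v_end": -2.0}
seg_b = {"rewards": [-1.0, -1.0], "v_start": -4.0, "v_end": -6.0}

# Partial-return model: both segments score the same.
pr_a = partial_return(seg_a["rewards"])
pr_b = partial_return(seg_b["rewards"])
p_partial = preference_probability(pr_a, pr_b)

# Regret model: lower regret is better, so the score is negative regret.
regret_a = deterministic_regret(**seg_a)
regret_b = deterministic_regret(**seg_b)
p_regret = preference_probability(-regret_a, -regret_b)

print(f"partial return: A={pr_a}, B={pr_b}, P(A preferred)={p_partial:.3f}")
print(f"regret:         A={regret_a}, B={regret_b}, P(A preferred)={p_regret:.3f}")
```

In this toy case the partial-return model is indifferent between the two segments because their summed rewards are equal, while the regret model prefers the segment that moves toward the goal.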
The Role of the Reducible Dopant in Solid Electrolyte-Lithium Metal Interfaces
Garnet solid electrolytes, of the form Li7La3Zr2O12 (LLZO), remain an enticing prospect for solid-state batteries owing to their chemical and electrochemical stability in contact with metallic lithium. Dopants, often employed to stabilize the fast ion conducting cubic garnet phase, typically have no effect on the chemical stability of LLZO in contact with Li metal but have been found recently to impact the properties of the Li/garnet interface. For dopants more “reducible” than Zr (e.g., Nb and Ti), contradictory reports of either raised or reduced Li/garnet interfacial resistances have been attributed to the dopant. Here, we investigate the Li/LLZO interface in W-doped Li7La3Zr2O12 (LLZWO) to determine the influence of a “reducible” dopant on the electrochemical properties of the Li/garnet interface. Single-phase LLZWO is synthesized by a new sol–gel approach and densified by spark plasma sintering. Interrogating the resulting Li/LLZWO interface/interphase by impedance, muon spin relaxation and X-ray absorption spectroscopies uncovers the significant impact of surface lithiation on electrochemical performance. Upon initial contact, an interfacial reaction occurs between LLZWO and Li metal, leading to the reduction of surface W6+ centers and an initial reduction of the Li/garnet interfacial resistance. Propagation of this surface reaction, driven by the high mobility of Li+ ions through the grain surfaces, thickens the resistive interphases throughout the material and impedes Li+ ion transport between the grains. The resulting high resistance accumulating in the system impedes cycling at high current densities. These insights shed light on the nature of lithiated interfaces in garnet solid electrolytes containing a reducible dopant where high Li+ ion mobility and the reducible nature of the dopant can significantly affect electrochemical performance.
Rest-frame UV line emission from the intergalactic medium at 2<z<5
Rest-frame UV emission lines offer the possibility to directly image the gas
around high-redshift galaxies with upcoming optical instruments. We use a suite
of large, hydrodynamical simulations to predict the nature and detectability of
emission lines from the intergalactic medium at 2<z<5. The brightest emission
comes from HI Ly-alpha and the strongest metal line, CIII, is about an order of
magnitude fainter, although HI Ly-alpha may be fainter if the gas is
self-shielded to the UV background or if dust is important. The highest surface
brightness regions for CIV, SiIII, SiIV and OVI are fainter than CIII by
factors of a few. The NV and NeVIII lines, as well as HeII H-alpha, are
substantially weaker but their maximum surface brightnesses still exceed 100
photon/cm^2/s/sr at z=2 (for 2" pixels). Lower ionisation lines arise in denser
and colder gas that produces clumpier emission. The brightest HI Ly-alpha
emission arises in highly overdense gas, but the highest surface brightness
emission from high-ionisation metal lines traces a wider range of
overdensities. Bright metal-line emission traces gas with temperatures close to
the peak of the corresponding emissivity curve. While HI Ly-alpha, HeII
H-alpha, CIII, SiIII, and SiIV are excellent probes of cold accretion flows and
the colder parts of outflows, CIV, NV, OVI, and NeVIII are powerful tracers of
the diffuse WHIM and galactic winds. A comparison of results from simulations
with varying physical prescriptions demonstrates that the predictions for the
brighter metal-line emission are robust to within factors of a few. Several
emission lines from the high-redshift IGM will become detectable in the near
future, possibly starting with the Cosmic Web Imager on Palomar. MUSE and the
Keck Cosmic Web Imager have the potential to revolutionise studies of the
interactions between high-redshift galaxies and their environment. (Abridged)
Comment: 21 pages, 17 figures. Accepted for publication by MNRA
Anion-polarisation-directed short-range-order in antiperovskite Li2FeSO
Short-range ordering in cation-disordered cathodes can have a significant
effect on their electrochemical properties. Here, we characterise the cation
short-range order in the antiperovskite cathode material Li2FeSO, using
density functional theory, Monte Carlo simulations, and synchrotron X-ray
pair-distribution-function data. We predict partial short-range
cation-ordering, characterised by favourable OLi4Fe2 oxygen coordination
with a preference for polar cis-OLi4Fe2 over non-polar
trans-OLi4Fe2 configurations. This preference for polar cation
configurations produces long-range disorder, in agreement with experimental
data. The predicted short-range-order preference contrasts with that for a
simple point-charge model, which instead predicts preferential
trans-OLi4Fe2 oxygen coordination and corresponding long-range
crystallographic order. The absence of long-range order in Li2FeSO can
therefore be attributed to the relative stability of cis-OLi4Fe2 and
other non-OLi4Fe2 oxygen-coordination motifs. We show that this effect is
associated with the polarisation of oxide and sulfide anions in polar
coordination environments, which stabilises these polar short-range cation
orderings. We propose similar anion-polarisation-directed short-range-ordering
may be present in other heterocationic materials that contain cations with
different formal charges. Our analysis also illustrates the limitations of
using simple point-charge models to predict the structure of cation-disordered
materials, where other factors, such as anion polarisation, may play a critical
role in directing both short- and long-range structural correlations.
IEEE P7001: A proposed standard on transparency
This paper describes IEEE P7001, a new draft standard on transparency of autonomous systems. In the paper, we outline the development and structure of the draft standard. We present the rationale for transparency as a measurable, testable property. We outline five stakeholder groups: users, the general public and bystanders, safety certification agencies, incident/accident investigators and lawyers/expert witnesses, and explain the thinking behind the normative definitions of “levels” of transparency for each stakeholder group in P7001. The paper illustrates the application of P7001 through worked examples of both specification and assessment of fictional autonomous systems.