Learning Counterfactual Representations for Estimating Individual Dose-Response Curves
Estimating an individual's potential response to varying levels of exposure
to a treatment is of high practical relevance for several important
fields, such as healthcare, economics, and public policy. However, existing
methods for learning to estimate counterfactual outcomes from observational
data are either focused on estimating average dose-response curves, or limited
to settings with only two treatments that do not have an associated dosage
parameter. Here, we present a novel machine-learning approach towards learning
counterfactual representations for estimating individual dose-response curves
for any number of treatments with continuous dosage parameters with neural
networks. Building on the established potential outcomes framework, we
introduce performance metrics, model selection criteria, model architectures,
and open benchmarks for estimating individual dose-response curves. Our
experiments show that the methods developed in this work set a new
state-of-the-art in estimating individual dose-response curves.
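Evaluating individual dose-response estimates requires comparing whole predicted curves against the true ones. The sketch below (an assumption for illustration, not code or metric definitions taken from the paper) computes a mean integrated squared error (MISE) style metric on a shared dosage grid:

```python
import numpy as np

def mise(y_true, y_pred, dosages):
    """Approximate mean integrated squared error (MISE) between
    per-individual dose-response curves on a shared dosage grid.

    y_true, y_pred: arrays of shape (n_individuals, n_dosages)
    dosages:        1-D ascending dosage grid
    """
    sq_err = (y_true - y_pred) ** 2
    # Trapezoidal integration of the squared error over the dosage grid,
    # carried out separately for each individual.
    per_individual = ((sq_err[:, :-1] + sq_err[:, 1:]) / 2.0
                      * np.diff(dosages)).sum(axis=1)
    # Average over individuals.
    return per_individual.mean()

# Toy check: two individuals with linear true curves and constant predictions.
d = np.linspace(0.0, 1.0, 101)
y_true = np.stack([2.0 * d, 1.0 * d])
y_pred = np.stack([np.full_like(d, 1.0), np.full_like(d, 0.5)])
print(mise(y_true, y_pred, d))  # close to the exact value 5/24
```

The function name, array shapes, and grid are illustrative; the paper's open benchmarks define their own metrics and protocols.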
Preemptively Pruning Clever-Hans Strategies in Deep Neural Networks
Robustness has become an important consideration in deep learning. With the
help of explainable AI, mismatches between an explained model's decision
strategy and the user's domain knowledge (e.g. Clever Hans effects) have been
identified as a starting point for improving faulty models. However, it is less
clear what to do when the user and the explanation agree. In this paper, we
demonstrate that acceptance of explanations by the user is not a guarantee for
a machine learning model to be robust against Clever Hans effects, which may
remain undetected. Such hidden flaws of the model can nevertheless be
mitigated, and we demonstrate this by contributing a new method,
Explanation-Guided Exposure Minimization (EGEM), that preemptively prunes
variations in the ML model that have not been the subject of positive
explanation feedback. Experiments demonstrate that our approach leads to models
that strongly reduce their reliance on hidden Clever Hans strategies, and
consequently achieve higher accuracy on new data.Comment: 18 pages + supplemen
Human alignment of neural network representations
Today's computer vision models achieve human or near-human level performance
across a wide variety of vision tasks. However, their architectures, data, and
learning algorithms differ in numerous ways from those that give rise to human
vision. In this paper, we investigate the factors that affect the alignment
between the representations learned by neural networks and human mental
representations inferred from behavioral responses. We find that model scale
and architecture have essentially no effect on the alignment with human
behavioral responses, whereas the training dataset and objective function both
have a much larger impact. These findings are consistent across three datasets
of human similarity judgments collected using two different tasks. Linear
transformations of neural network representations learned from behavioral
responses from one dataset substantially improve alignment with human
similarity judgments on the other two datasets. In addition, we find that some
human concepts such as food and animals are well-represented by neural networks
whereas others such as royal or sports-related objects are not. Overall,
although models trained on larger, more diverse datasets achieve better
alignment with humans than models trained on ImageNet alone, our results
indicate that scaling alone is unlikely to be sufficient to train neural
networks with conceptual representations that match those used by humans.
Comment: Accepted for publication at ICLR 202
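The linear transformations mentioned above can be illustrated with a toy fit. The paper learns such maps from behavioral responses with its own objective; the sketch below assumes, as a stand-in, a ridge-regression reweighting of embedding dimensions so that weighted dot products match human similarity ratings:

```python
import numpy as np

def fit_alignment(Z, pairs, sims, ridge=1e-3):
    """Fit a per-dimension reweighting v so that the weighted dot product
    sum_k v[k] * Z[i, k] * Z[j, k] approximates human similarity ratings.

    Z:     (n, d) network embeddings
    pairs: list of (i, j) index pairs
    sims:  one human similarity rating per pair
    """
    F = np.array([Z[i] * Z[j] for i, j in pairs])   # (m, d) pair features
    s = np.asarray(sims, dtype=float)
    d = F.shape[1]
    # Ridge-regularized least squares: (F^T F + ridge * I) v = F^T s
    return np.linalg.solve(F.T @ F + ridge * np.eye(d), F.T @ s)

# Toy check: recover a known reweighting from simulated "human" similarities.
rng = np.random.default_rng(0)
Z = rng.normal(size=(20, 4))
v_true = np.array([2.0, 0.5, 1.0, 0.0])
pairs = [(i, j) for i in range(20) for j in range(i + 1, 20)]
sims = [(v_true * Z[i] * Z[j]).sum() for i, j in pairs]
v_hat = fit_alignment(Z, pairs, sims)
```

The diagonal form and the ridge solver are assumptions for the sketch; the behavioral data in the paper come from odd-one-out and pairwise tasks, not raw similarity scores.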
Improving neural network representations using human similarity judgments
Deep neural networks have reached human-level performance on many computer
vision tasks. However, the objectives used to train these networks enforce only
that similar images are embedded at similar locations in the representation
space, and do not directly constrain the global structure of the resulting
space. Here, we explore the impact of supervising this global structure by
linearly aligning it with human similarity judgments. We find that a naive
approach leads to large changes in local representational structure that harm
downstream performance. Thus, we propose a novel method that aligns the global
structure of representations while preserving their local structure. This
global-local transform considerably improves accuracy across a variety of
few-shot learning and anomaly detection tasks. Our results indicate that human
visual representations are globally organized in a way that facilitates
learning from few examples, and incorporating this global structure into neural
network representations improves performance on downstream tasks.
Comment: Published as a conference paper at NeurIPS 202
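One schematic reading of a global-local transform, assumed here for illustration rather than taken from the paper, is to move concept centroids with a learned linear map (global structure) while leaving each point's offset from its centroid untouched (local structure):

```python
import numpy as np

def global_local_transform(Z, labels, W):
    """Apply linear map W to concept centroids (global structure) while
    preserving each point's offset from its centroid (local structure).

    Z:      (n, d) embeddings
    labels: (n,) integer concept assignment per embedding
    W:      (d, d) linear map, e.g., learned from human similarity data
    """
    out = np.empty_like(Z)
    for c in np.unique(labels):
        idx = labels == c
        centroid = Z[idx].mean(axis=0)
        # Transform only the centroid; keep within-concept offsets as-is.
        out[idx] = W @ centroid + (Z[idx] - centroid)
    return out

rng = np.random.default_rng(1)
Z = rng.normal(size=(12, 5))
labels = np.repeat(np.arange(3), 4)   # three concepts, four images each
W = rng.normal(size=(5, 5))           # stand-in for a learned alignment map
Z_new = global_local_transform(Z, labels, W)
```

By construction, within-concept geometry (and hence local representational structure) is exactly preserved, which is the property the paper's method is designed to retain while reorganizing the global layout.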
VISION - Vienna survey in Orion. III. Young stellar objects in Orion A
38 pages, 25 figures, Accepted for publication by A&A. Reproduced with permission from Astronomy & Astrophysics. © 2018 ESO

We extend and refine the existing young stellar object (YSO) catalogs for the Orion A molecular cloud, the closest massive star-forming region to Earth. This updated catalog is driven by the large spatial coverage (18.3 deg^2, ~950 pc^2), seeing-limited resolution (~0.7''), and sensitivity (Ks < 19 mag) of the ESO-VISTA near-infrared survey of the Orion A cloud (VISION). Combined with archival mid- to far-infrared data, the VISTA data allow for a refined and more robust source selection. We estimate that among previously known protostars and pre-main-sequence stars with disks, source contamination levels (false positives) are at least ∼7% and ∼2.5%, respectively, mostly due to background galaxies and nebulosities. We identify 274 new YSO candidates using VISTA/Spitzer-based selections within previously analyzed regions, and VISTA/WISE-based selections to add sources in the surroundings, beyond previously analyzed regions. The WISE selection method recovers about 59% of the known YSOs in Orion A's low-mass star-forming part L1641, which shows what can be achieved by the all-sky WISE survey in combination with deep near-infrared data in regions without the influence of massive stars. The new catalog contains 2978 YSOs, which were classified based on the de-reddened mid-infrared spectral index into 188 protostars, 184 flat-spectrum sources, and 2606 pre-main-sequence stars with circumstellar disks. We find a statistically significant difference in the spatial distribution of the three evolutionary classes with respect to regions of high dust column-density, confirming that flat-spectrum sources are at a younger evolutionary phase compared to Class IIs, and are not a sub-sample seen at particular viewing angles.

Peer reviewed. Final Accepted Version.
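The classification by mid-infrared spectral index can be sketched as follows. The thresholds below follow the commonly used Greene et al. (1994)-style scheme and are an assumption here; the catalog's exact cuts may differ, and it additionally de-reddens the photometry first, which this sketch omits:

```python
import numpy as np

def spectral_index(wavelengths_um, flux_jy):
    """Least-squares estimate of the mid-infrared spectral index
    alpha = d log(lambda * F_lambda) / d log(lambda).
    With F_nu in Jy, lambda * F_lambda is proportional to F_nu / lambda."""
    log_lam = np.log10(wavelengths_um)
    log_lfl = np.log10(flux_jy) - log_lam   # log(lambda F_lambda) + const
    slope, _ = np.polyfit(log_lam, log_lfl, 1)
    return slope

def classify_yso(alpha):
    """Assign an evolutionary class from the spectral index.
    Boundary values are illustrative assumptions."""
    if alpha >= 0.3:
        return "protostar"
    if alpha >= -0.3:
        return "flat-spectrum"
    if alpha >= -1.6:
        return "disk"   # Class II pre-main-sequence star with a disk
    return "class III"

lam = np.array([3.6, 4.5, 5.8, 8.0, 24.0])  # Spitzer bands, microns
flux = np.ones_like(lam)  # flat F_nu => lambda F_lambda ~ lambda^-1
```

With flat F_nu photometry, the fitted index is -1, which lands in the Class II (disk) range, consistent with how the bulk of the 2978 sources in the catalog are classified.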