
    Quantifying causal influences

    Many methods for causal inference generate directed acyclic graphs (DAGs) that formalize causal relations between n variables. Given the joint distribution on all these variables, the DAG contains all information about how intervening on one variable changes the distribution of the other n-1 variables. However, quantifying the causal influence of one variable on another remains a nontrivial question. Here we propose a set of natural, intuitive postulates that a measure of causal strength should satisfy. We then introduce a communication scenario in which edges in a DAG play the role of channels that can be locally corrupted by interventions. Causal strength is then the relative entropy distance between the old and the new distribution. Many other measures of causal strength have been proposed, including average causal effect, transfer entropy, directed information, and information flow. We explain how they fail to satisfy the postulates on simple DAGs of ≤3 nodes. Finally, we investigate the behavior of our measure on time series, supporting our claims with experiments on simulated data.
    Comment: Published at http://dx.doi.org/10.1214/13-AOS1145 in the Annals of Statistics (http://www.imstat.org/aos/) by the Institute of Mathematical Statistics (http://www.imstat.org)
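The edge-cutting idea in this abstract can be sketched concretely. Below is a minimal illustration (my own construction, not the authors' code) for the simplest DAG X → Y: cutting the edge replaces Y's input with an independent copy of X drawn from its marginal, and causal strength is the relative entropy between the original joint distribution and the post-cutting one.

```python
import numpy as np

def causal_strength(P):
    """Relative entropy D(P || P_cut) for the edge X -> Y of a two-node DAG.

    P is the joint distribution P(x, y) as a 2D array. Cutting the edge
    feeds Y an independent copy of X, giving P_cut(x, y) = P(x) * sum_x' P(x') P(y|x').
    """
    px = P.sum(axis=1, keepdims=True)        # marginal P(x)
    py_given_x = P / px                      # conditional P(y|x)
    py_cut = (px * py_given_x).sum(axis=0)   # sum_x' P(x') P(y|x')
    P_cut = px * py_cut[None, :]             # product of the two factors
    mask = P > 0
    return float((P[mask] * np.log(P[mask] / P_cut[mask])).sum())

# Y strongly depends on X: strength is clearly positive
P = np.array([[0.45, 0.05],
              [0.05, 0.45]])
strong = causal_strength(P)

# Y independent of X: the post-cutting distribution equals P, so strength is 0
Q = np.outer([0.5, 0.5], [0.5, 0.5])
weak = causal_strength(Q)
```

For two nodes this reduces to the mutual information I(X; Y); the measure only departs from familiar quantities on larger DAGs, which is where the paper's postulates distinguish it from transfer entropy and the other measures listed above.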

    Profiling compliers and non-compliers for instrumental-variable analysis

    Instrumental-variable (IV) estimation is an essential method for applied researchers across the social and behavioral sciences who analyze randomized controlled trials marred by noncompliance or leverage partially exogenous treatment variation in observational studies. The potential outcome framework is a popular model to motivate the assumptions underlying the identification of the local average treatment effect (LATE) and to stratify the sample into compliers, always-takers, and never-takers. However, applied research has thus far paid little attention to the characteristics of compliers and noncompliers. Yet profiling compliers and noncompliers is necessary to understand which subpopulation the researcher is making inferences about, and it is an important first step in evaluating the external validity (or lack thereof) of the LATE estimated for compliers. In this letter, we discuss the assumptions necessary for profiling, which are weaker than the assumptions necessary for identifying the LATE if the instrument is randomly assigned. We introduce a simple and general method to characterize compliers, always-takers, and never-takers in terms of their covariates and provide easy-to-use software in R and Stata that implements our estimator. We hope that our method and software facilitate the profiling of compliers and noncompliers as a standard practice accompanying any IV analysis.
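The moment-based logic behind complier profiling can be illustrated on simulated data. This is a hedged sketch of the standard identification result (not necessarily the authors' exact estimator), assuming a randomly assigned binary instrument Z, a binary treatment D, and monotonicity; under those assumptions the mean of a covariate X among compliers is identified as a ratio of observable moments.

```python
import numpy as np

def complier_mean(X, D, Z):
    """E[X | complier] = (E[X*D | Z=1] - E[X*D | Z=0]) / (E[D | Z=1] - E[D | Z=0])."""
    num = (X * D)[Z == 1].mean() - (X * D)[Z == 0].mean()
    den = D[Z == 1].mean() - D[Z == 0].mean()   # the complier share
    return num / den

# Mock data with latent type 0 = never-taker, 1 = complier, 2 = always-taker,
# and covariate means 0, 1, and 2 for the three types respectively.
rng = np.random.default_rng(0)
n = 200_000
t = rng.choice(3, size=n, p=[0.3, 0.4, 0.3])
X = rng.normal(0.0, 1.0, n) + t
Z = rng.integers(0, 2, n)
D = ((t == 2) | ((t == 1) & (Z == 1))).astype(float)  # monotone compliance

est = complier_mean(X, D, Z)   # close to the true complier mean of 1.0
```

Analogous ratios profile the other strata: E[X·D | Z=0] / E[D | Z=0] for always-takers and E[X·(1−D) | Z=1] / E[1−D | Z=1] for never-takers.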

    Avoiding Discrimination through Causal Reasoning

    Recent work on fairness in machine learning has focused on various statistical discrimination criteria and how they trade off. Most of these criteria are observational: They depend only on the joint distribution of predictor, protected attribute, features, and outcome. While convenient to work with, observational criteria have severe inherent limitations that prevent them from resolving matters of fairness conclusively. Going beyond observational criteria, we frame the problem of discrimination based on protected attributes in the language of causal reasoning. This viewpoint shifts attention from "What is the right fairness criterion?" to "What do we want to assume about the causal data generating process?" Through the lens of causality, we make several contributions. First, we crisply articulate why and when observational criteria fail, thus formalizing what was before a matter of opinion. Second, our approach exposes previously ignored subtleties and why they are fundamental to the problem. Finally, we put forward natural causal non-discrimination criteria and develop algorithms that satisfy them.
    Comment: Advances in Neural Information Processing Systems 30, 2017 http://papers.nips.cc/paper/6668-avoiding-discrimination-through-causal-reasonin

    DashQL -- Complete Analysis Workflows with SQL

    We present DashQL, a language that describes complete analysis workflows in self-contained scripts. DashQL combines SQL, the grammar of relational database systems, with a grammar of graphics in a grammar of analytics. It supports preparing and visualizing arbitrarily complex SQL statements in a single coherent language. The proximity to SQL facilitates holistic optimizations of analysis workflows covering data input, encoding, transformations, and visualizations. These optimizations use model and query metadata for visualization-driven aggregation, remote predicate pushdown, and adaptive materialization. We introduce the DashQL language as an extension of SQL and describe the efficient and interactive processing of text-based analysis workflows.

    Do Immigrants Move to Welfare? Subnational Evidence from Switzerland

    The welfare magnet hypothesis holds that immigrants are likely to relocate to regions with generous welfare benefits. Although this assumption has motivated extensive reforms to immigration policy and social programs, the empirical evidence remains contested. In this study, we assess detailed administrative records from Switzerland covering the full population of social assistance recipients between 2005 and 2015. By leveraging local variations in cash transfers and exogenous shocks to benefit levels, we identify how benefits shape intracountry residential decisions. We find limited evidence that immigrants systematically move to localities with higher benefits. The lack of significant welfare migration within a context characterized by high variance in benefits and low barriers to movement suggests that the prevalence of this phenomenon may be overstated. These findings have important implications in the European setting where subnational governments often possess discretion over welfare and parties frequently mobilize voters around the issue of “benefit tourism.”

    Data Navigator: An accessibility-centered data navigation toolkit

    Making data visualizations accessible for people with disabilities remains a significant challenge in current practitioner efforts. Existing visualizations often lack an underlying navigable structure, fail to engage necessary input modalities, and rely heavily on visual-only rendering practices. These limitations exclude people with disabilities, especially users of assistive technologies. To address these challenges, we present Data Navigator: a system built on a dynamic graph structure, enabling developers to construct navigable lists, trees, graphs, and flows as well as spatial, diagrammatic, and geographic relations. Data Navigator supports a wide range of input modalities: screen reader, keyboard, speech, gesture detection, and even fabricated assistive devices. We present three case examples with Data Navigator, demonstrating that we can provide accessible navigation structures on top of raster images, integrate with existing toolkits at scale, and rapidly develop novel prototypes. Data Navigator is a step towards making accessible data visualizations easier to design and implement.
    Comment: To appear at IEEE VIS 202

    Large-scale multilayer architecture of single-atom arrays with individual addressability

    We report on the realization of large-scale 3D multilayer configurations of planar arrays of individual neutral atoms with immediate applications in quantum science and technology: a microlens-generated Talbot optical lattice. In this novel platform, the single-beam illumination of a microlens array constitutes a structurally robust and wavelength-universal method for the realization of 3D atom arrays with favourable scaling properties due to the inherent self-imaging of the focal structure. Thus, 3D scaling comes without the requirement of extra resources. We demonstrate the trapping and imaging of individual rubidium atoms and the in-plane assembly of defect-free single-atom arrays in several Talbot planes. We present interleaved lattices with dynamic position control and parallelized sub-lattice addressing of spin states.

    Sharpening up Galactic all-sky maps with complementary data - A machine learning approach

    Galactic all-sky maps at very disparate frequencies, like in the radio and γ-ray regime, show similar morphological structures. This mutual information reflects the imprint of the various physical components of the interstellar medium. We want to use multifrequency all-sky observations to test resolution improvement and restoration of unobserved areas for maps in certain frequency ranges. For this we aim to reconstruct or predict from sets of other maps all-sky maps that, in their original form, lack a high resolution compared to other available all-sky surveys or are incomplete in their spatial coverage. Additionally, we want to investigate the commonalities and differences that the ISM components exhibit over the electromagnetic spectrum. We build an n-dimensional representation of the joint pixel-brightness distribution of n maps using a Gaussian mixture model and see how predictive it is: How well can one map be reproduced based on subsets of other maps? Tests with mock data show that reconstructing the map of a certain frequency from other frequency regimes works astonishingly well, reliably predicting small-scale details well below the spatial resolution of the initially learned map. Applied to the observed multifrequency data sets of the Milky Way, this technique is able to improve the resolution of, e.g., the low-resolution Fermi LAT maps as well as to recover the sky from artifact-contaminated data like the ROSAT 0.855 keV map. The predicted maps generally show less imaging artifacts compared to the original ones. A comparison of predicted and original maps highlights surprising structures, imaging artifacts (fortunately not reproduced in the prediction), and features genuine to the respective frequency range that are not present at other frequency bands. We discuss limitations of this machine learning approach and ideas on how to overcome them.
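The core mechanism described above, fitting a Gaussian mixture to the joint pixel-brightness distribution and predicting one map from another via the conditional expectation, can be sketched on mock data. This is a simplified two-map illustration (my own construction on synthetic data, not the authors' pipeline), using scikit-learn's GaussianMixture for the fit:

```python
import numpy as np
from scipy.stats import norm
from sklearn.mixture import GaussianMixture

# Mock "maps": pixel brightness of map A drawn from a bimodal distribution,
# map B tightly correlated with A plus noise (stand-in for a second frequency).
rng = np.random.default_rng(1)
a = np.concatenate([rng.normal(0.0, 1.0, 5000), rng.normal(5.0, 1.0, 5000)])
b = 2.0 * a + rng.normal(0.0, 0.3, a.size)

# Learn the joint pixel-brightness distribution P(a, b).
gmm = GaussianMixture(n_components=2, random_state=0).fit(np.column_stack([a, b]))

def predict_b(a_vals, gmm):
    """Conditional expectation E[b | a] under the fitted mixture."""
    w = np.zeros((a_vals.size, gmm.n_components))  # responsibilities given a
    m = np.zeros_like(w)                           # per-component conditional means
    for k in range(gmm.n_components):
        mu_a, mu_b = gmm.means_[k]
        Saa = gmm.covariances_[k][0, 0]
        Sab = gmm.covariances_[k][0, 1]
        w[:, k] = gmm.weights_[k] * norm.pdf(a_vals, mu_a, np.sqrt(Saa))
        m[:, k] = mu_b + Sab / Saa * (a_vals - mu_a)
    w /= w.sum(axis=1, keepdims=True)
    return (w * m).sum(axis=1)

b_hat = predict_b(a, gmm)
rmse = np.sqrt(np.mean((b_hat - b) ** 2))  # close to the conditional noise level
```

In the paper's setting the mixture lives in n dimensions (one per map) and the same conditional-expectation step predicts a missing or low-resolution channel from the observed ones.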