149 research outputs found

    Balance Optimization Subset Selection: a framework for causal inference with observational data

    Observational data are prevalent in many fields of research, and it is desirable to use these data to explore potential causal relationships. Additional assumptions and methods for post-processing the data are needed to construct unbiased estimators of causal effects because such data are non-random. This dissertation describes the Balance Optimization Subset Selection (BOSS) framework to apply causal inference to observational data. BOSS is designed to identify the subset of observational data that is most appropriate for computing causal estimates. To do this, it compares the available treatment units to potential sets of control units on a set of confounding factors, called covariates, with the goal of identifying a control group that minimizes a measure of covariate imbalance. Which imbalance measure to use with BOSS is an important consideration that depends both on the quality of the available observational data and on the assumptions that a researcher is willing to make. The standard assumption for observational data, known as strong ignorability, is extended in several ways to be directly applicable to BOSS. Under these additional assumptions, specific levels of covariate balance are both necessary and sufficient for the treatment effect estimate to be unbiased. There is a trade-off in that weaker assumptions require a higher level of covariate balance in order to guarantee estimator unbiasedness. These additional assumptions bridge the gap between existing parametric and non-parametric methods. Each imbalance measure for BOSS leads to an associated optimization problem. The computational complexity of these problems is discussed, and efficient algorithms are developed to handle several special cases. A constant factor approximation algorithm is also presented for one imbalance measure. Given the potential applications of BOSS, identifying optimal or near-optimal solutions for these problems is of great practical interest. Heuristics and exact algorithms are considered, and computational tests demonstrate their effectiveness at minimizing imbalance. Additional tests validate BOSS on a well-studied dataset from the literature and highlight the value of alternate optima as a way to corroborate the assumptions that are made.
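
The subset-selection idea in the abstract above can be illustrated with a minimal Python sketch. It is not the dissertation's method. It assumes one simple imbalance measure (the sum of absolute differences in covariate means between the treatment group and a candidate control group) and uses an exhaustive search that is only feasible for toy-sized pools. The function names `mean_imbalance` and `best_control_subset` and the data are hypothetical.

```python
from itertools import combinations


def mean_imbalance(treated, controls):
    """Sum over covariates of |mean(treated) - mean(controls)| (assumed measure)."""
    n_cov = len(treated[0])
    total = 0.0
    for j in range(n_cov):
        t_mean = sum(row[j] for row in treated) / len(treated)
        c_mean = sum(row[j] for row in controls) / len(controls)
        total += abs(t_mean - c_mean)
    return total


def best_control_subset(treated, pool, size):
    """Exhaustively search all control subsets of the given size (toy scale only)."""
    best_idx, best_score = None, float("inf")
    for idx in combinations(range(len(pool)), size):
        score = mean_imbalance(treated, [pool[i] for i in idx])
        if score < best_score:
            best_idx, best_score = idx, score
    return best_idx, best_score


# Hypothetical data: each unit is (age, binary covariate).
treated = [(30.0, 1.0), (40.0, 0.0)]
pool = [(20.0, 0.0), (31.0, 1.0), (39.0, 0.0), (60.0, 1.0)]

subset, score = best_control_subset(treated, pool, size=2)
print(subset, score)
```

Here the control units at indices 1 and 2 reproduce the treated covariate means exactly, so the search selects them. The number of candidate subsets grows combinatorially with the pool size, which is why heuristics and specialized algorithms matter at realistic scales.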

    Incremental sentence processing is guided by a preference for agents: EEG evidence from Basque

    Comprehenders across languages tend to interpret role-ambiguous arguments as the subject or the agent of a sentence during parsing. However, the evidence for such a subject/agent preference rests on the comprehension of transitive, active-voice sentences where agents/subjects canonically precede patients/objects. The evidence is thus potentially confounded by the canonical order of arguments. Transitive sentence stimuli additionally conflate the semantic agent role and the syntactic subject function. We resolve these two confounds in an experiment on the comprehension of intransitive sentences in Basque. When exposed to sentence-initial role-ambiguous arguments, comprehenders preferentially interpreted these as agents and had to revise their interpretation when the verb disambiguated to patient-initial readings. The revision was reflected in an N400 component in ERPs and a decrease in power in the alpha and lower beta bands. This finding suggests that sentence processing is guided by a top-down heuristic to interpret ambiguous arguments as agents, independently of word order and independently of transitivity.

    Surprisal from language models can predict ERPs in processing predicate-argument structures only if enriched by an Agent Preference principle

    Language models based on artificial neural networks increasingly capture key aspects of how humans process sentences. Most notably, model-based surprisals predict event-related potentials such as N400 amplitudes during parsing. Assuming that these models represent realistic estimates of human linguistic experience, their success in modelling language processing raises the possibility that the human processing system relies on no other principles than the general architecture of language models and on sufficient linguistic input. Here, we test this hypothesis on N400 effects observed during the processing of verb-final sentences in German, Basque, and Hindi. By stacking Bayesian generalised additive models, we show that, in each language, N400 amplitudes and topographies in the region of the verb are best predicted when model-based surprisals are complemented by an Agent Preference principle that transiently interprets initial role-ambiguous NPs as agents, leading to reanalysis when this interpretation fails. Our findings demonstrate the need for this principle independently of usage frequencies and structural differences between languages. The principle has an unequal force, however. Compared to surprisal, its effect is weakest in German, stronger in Hindi, and still stronger in Basque. This gradient is correlated with the extent to which grammars allow unmarked NPs to be patients, a structural feature that boosts reanalysis effects. We conclude that language models gain more neurobiological plausibility by incorporating an Agent Preference. Conversely, theories of human processing profit from incorporating surprisal estimates in addition to principles like the Agent Preference, which arguably have distinct evolutionary roots.
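
For readers unfamiliar with the term, surprisal is the negative log probability of a word given its context. The Python sketch below computes it in bits from a toy add-one-smoothed bigram model. The bigram model is a deliberately simple stand-in, not the neural language models used in the study, and the corpus and the function names `train_bigram` and `surprisal` are hypothetical.

```python
import math
from collections import Counter


def train_bigram(sentences):
    """Count contexts and bigrams; '<s>' marks the sentence start."""
    contexts, bigrams, vocab = Counter(), Counter(), set()
    for s in sentences:
        tokens = ["<s>"] + s.split()
        vocab.update(tokens[1:])
        contexts.update(tokens[:-1])          # every token that precedes another
        bigrams.update(zip(tokens, tokens[1:]))
    return contexts, bigrams, vocab


def surprisal(prev, word, contexts, bigrams, vocab):
    """Surprisal in bits: -log2 P(word | prev), with add-one smoothing."""
    p = (bigrams[(prev, word)] + 1) / (contexts[prev] + len(vocab))
    return -math.log2(p)


# Hypothetical toy corpus.
corpus = ["the dog chased the cat", "the cat slept"]
contexts, bigrams, vocab = train_bigram(corpus)

expected = surprisal("the", "cat", contexts, bigrams, vocab)
unexpected = surprisal("the", "slept", contexts, bigrams, vocab)
print(expected, unexpected)
```

A continuation seen often after a given context receives low surprisal, while an unseen one receives high surprisal. This is the quantity that the abstract relates to N400 amplitudes.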

    An agent-first preference in a patient-first language during sentence comprehension

    The language comprehension system preferentially assumes that agents come first during incremental processing. While this might reflect a biologically fixed bias, shared with other domains and other species, the evidence is limited to languages that place agents first, and so the bias could also be learned from usage frequency. Here, we probe the bias with electroencephalography (EEG) in Äiwoo, a language that by default places patients first, but where sentence-initial nouns are still locally ambiguous between patient and agent roles. Comprehenders transiently interpreted non-human nouns as patients, eliciting a negativity when disambiguation was toward the less common agent-initial order. By contrast and against frequencies, human nouns were transiently interpreted as agents, eliciting an N400-like negativity when disambiguation was toward patient-initial order. Consistent with the notion of a fixed property, the agent bias is robust against usage frequency for human referents. However, this bias can be reversed by frequency experience for non-human referents.

    Cross-linguistic differences in case marking shape neural power dynamics and gaze behavior during sentence planning

    Languages differ in how they mark the dependencies between verbs and arguments, e.g., by case. An eye tracking and EEG picture description study examined the influence of case marking on the time course of sentence planning in Basque and Swiss German. While German assigns an unmarked (nominative) case to subjects, Basque specifically marks agent arguments through ergative case. Fixations to agents and event-related synchronization (ERS) in the theta and alpha frequency bands, as well as desynchronization (ERD) in the alpha and beta bands revealed multiple effects of case marking on the time course of early sentence planning. Speakers decided on case marking under planning early when preparing sentences with ergative-marked agents in Basque, whereas sentences with unmarked agents allowed delaying structural commitment across languages. These findings support hierarchically incremental accounts of sentence planning and highlight how cross-linguistic differences shape the neural dynamics underpinning language use.
    This work was funded by Swiss National Science Foundation Grant Nr. 100015_160011 (B.B. and M.M.), the NCCR Evolving Language, Swiss National Science Foundation Agreement Nr. #51NF40_180888 (B.B. and M.M.), and the PhD Program in Linguistics and the Graduate Research Campus of the University of Zurich (A.E.). DEB is supported by a grant from the Harvard Data Science Initiative and the Branco Weiss Foundation. I.B.-S. is supported by an Australian Research Council Future Fellowship (FT160100437). I.L. is supported by grants from the Spanish Ministry of Economy and Competitiveness (Grant No. FFI2015-64183-P) and the Basque Government (IT1169-19). The authors thank Anne-Lise Giraud for the suggestion to include beta-band analyses, Vitória Piai for advice on EEG data processing, Giuachin Kreiliger for statistical consultation, Andrina Balsofiore and Edurne Petrirena for help recording the lead-in fragments, Nathalie Rieser and Debora Beuret for help with data collection and processing, and the Phonogram Archives of the University of Zurich for technical support. The authors also thank two anonymous reviewers for their helpful comments on an earlier version of the manuscript.

    The Agent Preference in Visual Event Apprehension

    A central aspect of human experience and communication is understanding events in terms of agent (“doer”) and patient (“undergoer” of action) roles. These event roles are rooted in general cognition and prominently encoded in language, with agents appearing as more salient and preferred over patients. An unresolved question is whether this preference for agents already operates during apprehension, that is, the earliest stage of event processing, and if so, whether the effect persists across different animacy configurations and task demands. Here we contrast event apprehension in two tasks and two languages that encode agents differently: Basque, a language that explicitly case-marks agents (‘ergative’), and Spanish, which does not mark agents. In two brief exposure experiments, native Basque and Spanish speakers saw pictures for only 300 ms, and subsequently described them or answered probe questions about them. We compared eye fixations and behavioral correlates of event role extraction with Bayesian regression. Agents received more attention and were recognized better across languages and tasks. At the same time, language and task demands affected the attention to agents. Our findings show that a general preference for agents exists in event apprehension, but it can be modulated by task and language demands.

    Publisher Correction: Coherent diffractive imaging of single helium nanodroplets with a high harmonic generation source

    In the original version of this Article, the affiliation for Luca Poletto was incorrectly given as ‘European XFEL GmbH, Holzkoppel 4, 22869 Schenefeld, Hamburg, Germany’, instead of the correct ‘CNR, Istituto di Fotonica e Nanotecnologie Padova, Via Trasea 7, 35131 Padova, Italy’. This has now been corrected in both the PDF and HTML versions of the Article.