
    A comparison of parsing technologies for the biomedical domain

    This paper reports on a number of experiments which are designed to investigate the extent to which current NLP resources are able to syntactically and semantically analyse biomedical text. We address two tasks: parsing a real corpus with a hand-built wide-coverage grammar, producing both syntactic analyses and logical forms; and automatically computing the interpretation of compound nouns where the head is a nominalisation (e.g., hospital arrival means an arrival at hospital, while patient arrival means an arrival of a patient). For the former task we demonstrate that flexible and yet constrained 'preprocessing' techniques are crucial to success: these enable us to use part-of-speech tags to overcome inadequate lexical coverage, and to 'package up' complex technical expressions prior to parsing so that they are blocked from creating misleading amounts of syntactic complexity. We argue that the XML-processing paradigm is ideally suited for automatically preparing the corpus for parsing. For the latter task, we compute interpretations of the compounds by exploiting surface cues and meaning paraphrases, which in turn are extracted from the parsed corpus. This provides an empirical setting in which we can compare the utility of a comparatively deep parser vs. a shallow one, exploring the trade-off between resolving attachment ambiguities on the one hand and generating errors in the parses on the other. We demonstrate that a model of the meaning of compound nominalisations is achievable with the aid of current broad-coverage parsers.

    Integrative priming occurs rapidly and uncontrollably during lexical processing

    Lexical priming, whereby a prime word facilitates recognition of a related target word (e.g., nurse → doctor), is typically attributed to association strength, semantic similarity, or compound familiarity. Here, the authors demonstrate a novel type of lexical priming that occurs among unassociated, dissimilar, and unfamiliar concepts (e.g., horse → doctor). Specifically, integrative priming occurs when a prime word can be easily integrated with a target word to create a unitary representation. Across several manipulations of timing (stimulus onset asynchrony) and list context (relatedness proportion), lexical decisions for the target word were facilitated when it could be integrated with the prime word. Moreover, integrative priming was dissociated from both associative priming and semantic priming but was comparable in terms of both prevalence (across participants) and magnitude (within participants). This observation of integrative priming challenges present models of lexical priming, such as spreading activation, distributed representation, expectancy, episodic retrieval, and compound cue models. The authors suggest that integrative priming may be explained by a role activation model of relational integration.

    Transfer and Multi-Task Learning for Noun-Noun Compound Interpretation

    In this paper, we empirically evaluate the utility of transfer and multi-task learning on a challenging semantic classification task: semantic interpretation of noun-noun compounds. Through a comprehensive series of experiments and in-depth error analysis, we show that transfer learning via parameter initialization and multi-task learning via parameter sharing can help a neural classification model generalize over a highly skewed distribution of relations. Further, we demonstrate how dual annotation with two distinct sets of relations over the same set of compounds can be exploited to improve the overall accuracy of a neural classifier and its F1 scores on the less frequent, but more difficult relations. Comment: EMNLP 2018: Conference on Empirical Methods in Natural Language Processing (EMNLP)

    Organic Haze as a Biosignature in Anoxic Earth-like Atmospheres

    Early Earth may have hosted a biologically-mediated global organic haze during the Archean eon (3.8-2.5 billion years ago). This haze would have significantly impacted multiple aspects of our planet, including its potential for habitability and its spectral appearance. Here, we model worlds with Archean-like levels of carbon dioxide orbiting the ancient sun and an M4V dwarf (GJ 876) and show that organic haze formation requires methane fluxes consistent with estimated Earth-like biological production rates. On planets with high fluxes of biogenic organic sulfur gases (CS2, OCS, CH3SH, and CH3SCH3), photochemistry involving these gases can drive haze formation at lower CH4/CO2 ratios than methane photochemistry alone. For a planet orbiting the sun, at 30x the modern organic sulfur gas flux, haze forms at a CH4/CO2 ratio 20% lower than at 1x the modern organic sulfur flux. For a planet orbiting the M4V star, the impact of organic sulfur gases is more pronounced: at 1x the modern Earth organic sulfur flux, a substantial haze forms at CH4/CO2 ~ 0.2, but at 30x the organic sulfur flux, the CH4/CO2 ratio needed to form haze decreases by a full order of magnitude. Detection of haze at an anomalously low CH4/CO2 ratio could suggest the influence of these biogenic sulfur gases, and therefore imply biological activity on an exoplanet. When these organic sulfur gases are not readily detectable in the spectrum of an Earth-like exoplanet, the thick organic haze they can help produce creates a very strong absorption feature at UV-blue wavelengths detectable in reflected light at a spectral resolution as low as 10. In direct imaging, constraining CH4 and CO2 concentrations will require higher spectral resolution, and R > 170 is needed to accurately resolve the structure of the CO2 feature at 1.57 μm, likely the most accessible CO2 feature on an Archean-like exoplanet. Comment: accepted for publication in Astrobiology

    English compound and non-compound processing in bilingual and multilingual speakers: effects of dominance and sequential multilingualism

    This article reports on a study investigating the relative influence of the first and dominant language on L2 and L3 morpho-lexical processing. A lexical decision task compared the responses to English NV-er compounds (e.g., taxi driver) and non-compounds provided by a group of native speakers and three groups of learners at various levels of English proficiency: L1 Spanish-L2 English sequential bilinguals and two groups of early Spanish-Basque bilinguals with English as their L3. Crucially, the two trilingual groups differed in their first and dominant language (i.e., L1 Spanish-L2 Basque vs. L1 Basque-L2 Spanish). Our materials exploit an (a)symmetry between these languages: while Basque and English pattern together in the basic structure of (productive) NV-er compounds, Spanish presents a construction that differs in directionality as well as inflection of the verbal element (V[3SG] + N). Results show between- and within-group differences in accuracy and response times that may be ascribable to two factors besides proficiency: the number of languages spoken by a given participant and their dominant language. An examination of response bias reveals an influence of the participants' first and dominant language on the processing of NV-er compounds. Our data suggest that morphological information in the nonnative lexicon may extend beyond morphemic structure and that, similarly to bilingualism, there are costs to sequential multilingualism in lexical retrieval.

    Terminology and Interpreting in LSP Conferences: A Computer-aided vs. Empirical-based Approach

    Conference interpreters are called to work in highly technical communicative events; they therefore need to acquire specialized knowledge in terms of terminology (LSP) in order to produce adequate target texts. The goal of the study is to compare two different methodologies for the creation of glossaries to be used during simultaneous interpreting in the medical domain: the first is more empirical and represents the approach most frequently adopted among conference interpreters; the second is supported by WordSmith Tools for the selection of contexts of use. The glossaries created with WordSmith Tools will be compared with those created manually, and both will be tested in the translation booth for completeness, clarity, and adequacy.

    User experiments with the Eurovision cross-language image retrieval system

    In this paper we present Eurovision, a text-based system for cross-language (CL) image retrieval. The system is evaluated by multilingual users for two search tasks with the system configured in English and five other languages. To our knowledge this is the first published set of user experiments for CL image retrieval. We show that: (1) it is possible to create a usable multilingual search engine using little knowledge of any language other than English, (2) categorizing images assists the user's search, and (3) there are differences in the way users search between the proposed search tasks. Based on the two search tasks and user feedback, we describe important aspects of any CL image retrieval system.

    Wildfire Smoke Particle Properties and Evolution, from Space-Based Multi-Angle Imaging

    Emitted smoke composition is determined by properties of the biomass burning source and ambient ecosystem. However, conditions that mediate the partitioning of black carbon (BC) and brown carbon (BrC) formation, as well as the spatial and temporal factors that drive particle evolution, are not understood adequately for many climate and air-quality related modeling applications. In situ observations provide considerable detail about aerosol microphysical and chemical properties, although sampling is extremely limited. Satellites offer the frequent global coverage that would allow for statistical characterization of emitted and evolved smoke, but generally lack microphysical detail. However, once properly validated, data from the National Aeronautics and Space Administration (NASA) Earth Observing System's Multi-Angle Imaging Spectroradiometer (MISR) instrument can create at least a partial picture of smoke particle properties and plume evolution. We use in situ data from the Department of Energy's Biomass Burning Observation Project (BBOP) field campaign to assess the strengths and limitations of smoke particle retrieval results from the MISR Research Aerosol (RA) retrieval algorithm. We then use MISR to characterize wildfire smoke particle properties and to identify the relevant aging factors in several cases, to the extent possible. The RA successfully maps qualitative changes in effective particle size, light absorption, and its spectral dependence, when compared to in situ observations. By observing the entire plume uniformly, the satellite data can be interpreted in terms of smoke plume evolution, including size-selective deposition, new-particle formation, and locations within the plume where BC or BrC dominates.