496 research outputs found

    Bridging languages through images with deep partial canonical correlation analysis

    We present a deep neural network that leverages images to improve bilingual text embeddings. Relying on bilingual image tags and descriptions, our approach conditions text embedding induction on the shared visual information for both languages, producing highly correlated bilingual embeddings. In particular, we propose a novel model based on Partial Canonical Correlation Analysis (PCCA). While the original PCCA finds linear projections of two views in order to maximize their canonical correlation conditioned on a shared third variable, we introduce a non-linear Deep PCCA (DPCCA) model, and develop a new stochastic iterative algorithm for its optimization. We evaluate PCCA and DPCCA on multilingual word similarity and cross-lingual image description retrieval. Our models outperform a large variety of previous methods, despite not having access to any visual signal during test time inference. Our code and data are available at: https://github.com/rotmanguy/DPCCA

    Improvement of lung preservation - From experiment to clinical practice

    Background: Reperfusion injury represents a severe early complication following lung transplantation. Among the pathogenetic factors, the high potassium content of Euro-Collins(R) solution is discussed. Material and Methods: In a pig model of orthotopic left-sided lung transplantation we investigated the effect of Euro-Collins solution (EC: n = 6) versus low potassium dextran (LPD: Perfadex(R): n = 6). Sham-operated (n = 6) animals served as control. Transplant function, cellular energy metabolism and endothelial morphology served as parameters. In a clinical investigation, 124 patients were evaluated following single (EC: n = 31; LPD: n = 37) or double (EC: n = 17; LPD: n = 39) lung transplantation, whose organs were preserved with EC (n = 48) or LPD (n = 76). Duration of ischemia, duration of ventilation and stay on ICU were registered. Primary transplant function was evaluated according to AaDO(2) values. Cause of early death (30 days) was declared. Results: Experimental results: After flush with EC and 18 h ischemia, a reduction of tissue ATP content (p < 0.01 vs initial value and LPD) was noted. Endothelial damage after ischemia was severe (p < 0.05 vs control), paO(2) was significantly decreased. Clinical results: In the LPD group, duration of ischemia was longer for the grafts transplanted first (SLTx and DLTx: p = 0.0009) as well as second (2. organ DLTx: p = 0.045). Primary transplant function was improved (day 0: SLTx: p = 0.0015; DLTx: p = 0.0095, both vs EC). Duration of ventilation and stay on ICU were shorter (n.s.). Reperfusion injury-associated death was reduced from 8% (EC) to 0 (LPD). Conclusion: In experimental lung preservation, LPD led to an improved graft function. These results were confirmed in clinical lung transplantation. Clinical lung preservation, therefore, should be carried out by use of LPD. Copyright (C) 2002 S. Karger AG, Basel

    Isomorphic Transfer of Syntactic Structures in Cross-Lingual NLP

    The transfer or sharing of knowledge between languages is a popular solution to resource scarcity in NLP. However, the effectiveness of cross-lingual transfer can be challenged by variation in syntactic structures. Frameworks such as Universal Dependencies (UD) are designed to be cross-lingually consistent, but even in carefully designed resources, trees representing equivalent sentences may not always overlap. In this paper, we measure cross-lingual syntactic variation, or anisomorphism, in the UD treebank collection, considering both morphological and structural properties. We show that reducing the level of anisomorphism yields consistent gains in cross-lingual transfer tasks. We introduce a source language selection procedure that facilitates effective cross-lingual parser transfer, and propose a typologically driven method for syntactic tree processing which reduces anisomorphism. Our results show the effectiveness of this method for both machine translation and cross-lingual sentence similarity, demonstrating the importance of syntactic structure compatibility for boosting cross-lingual transfer in NLP

    Do we really need fully unsupervised cross-lingual embeddings?

    Recent efforts in cross-lingual word embedding (CLWE) learning have predominantly focused on fully unsupervised approaches that project monolingual embeddings into a shared cross-lingual space without any cross-lingual signal. The lack of any supervision makes such approaches conceptually attractive. Yet, their only core difference from (weakly) supervised projection-based CLWE methods is in the way they obtain a seed dictionary used to initialize an iterative self-learning procedure. The fully unsupervised methods have arguably become more robust, and their primary use case is CLWE induction for pairs of resource-poor and distant languages. In this paper, we question the ability of even the most robust unsupervised CLWE approaches to induce meaningful CLWEs in these more challenging settings. A series of bilingual lexicon induction (BLI) experiments with 15 diverse languages (210 language pairs) show that fully unsupervised CLWE methods still fail for a large number of language pairs (e.g., they yield zero BLI performance for 87/210 pairs). Even when they succeed, they never surpass the performance of weakly supervised methods (seeded with 500-1,000 translation pairs) using the same self-learning procedure in any BLI setup, and the gaps are often substantial. These findings call for revisiting the main motivations behind fully unsupervised CLWE methods

    Towards zero-shot language modeling

    Can we construct a neural language model which is inductively biased towards learning human language? Motivated by this question, we aim at constructing an informative prior for held-out languages on the task of character-level, open-vocabulary language modeling. We obtain this prior as the posterior over network weights conditioned on the data from a sample of training languages, which is approximated through Laplace’s method. Based on a large and diverse sample of languages, the use of our prior outperforms baseline models with an uninformative prior in both zero-shot and few-shot settings, showing that the prior is imbued with universal linguistic knowledge. Moreover, we harness broad language-specific information available for most languages of the world, i.e., features from typological databases, as distant supervision for held-out languages. We explore several language modeling conditioning techniques, including concatenation and meta-networks for parameter generation. They appear beneficial in the few-shot setting, but ineffective in the zero-shot setting. Since the paucity of even plain digital text affects the majority of the world’s languages, we hope that these insights will broaden the scope of applications for language technology

    On the relation between linguistic typology and (limitations of) multilingual language modeling

    A key challenge in cross-lingual NLP is developing general language-independent architectures that are equally applicable to any language. However, this ambition is largely hampered by the variation in structural and semantic properties, i.e. the typological profiles of the world's languages. In this work, we analyse the implications of this variation on the language modeling (LM) task. We present a large-scale study of state-of-the-art n-gram based and neural language models on 50 typologically diverse languages covering a wide variety of morphological systems. Operating in the full vocabulary LM setup focused on word-level prediction, we demonstrate that a coarse typology of morphological systems is predictive of absolute LM performance. Moreover, fine-grained typological features such as exponence, flexivity, fusion, and inflectional synthesis are borne out to be responsible for the proliferation of low-frequency phenomena which are organically difficult to model by statistical architectures, or for the meaning ambiguity of character n-grams. Our study strongly suggests that these features have to be taken into consideration during the construction of next-level language-agnostic LM architectures, capable of handling morphologically complex languages such as Tamil or Korean. ERC grant Lexica

    Cross-lingual semantic specialization via lexical relation induction

    Semantic specialization integrates structured linguistic knowledge from external resources (such as lexical relations in WordNet) into pretrained distributional vectors in the form of constraints. However, this technique cannot be leveraged in many languages, because their structured external resources are typically incomplete or non-existent. To bridge this gap, we propose a novel method that transfers specialization from a resource-rich source language (English) to virtually any target language. Our specialization transfer comprises two crucial steps: 1) inducing noisy constraints in the target language through automatic word translation; and 2) filtering the noisy constraints via a state-of-the-art relation prediction model trained on the source language constraints. This allows us to specialize any set of distributional vectors in the target language with the refined constraints. We prove the effectiveness of our method through intrinsic word similarity evaluation in 8 languages, and with 3 downstream tasks in 5 languages: lexical simplification, dialog state tracking, and semantic textual similarity. The gains over the previous state-of-the-art specialization methods are substantial and consistent across languages. Our results also suggest that the transfer method is effective even for lexically distant source-target language pairs. Finally, as a by-product, our method produces lists of WordNet-style lexical relations in resource-poor languages

    Impact of salinity on element incorporation in two benthic foraminiferal species with contrasting magnesium contents

    Accurate reconstructions of seawater salinity could provide valuable constraints for studying past ocean circulation, the hydrological cycle and sea level change. Controlled growth experiments and field studies have shown the potential of foraminiferal Na ∕ Ca as a direct salinity proxy. Incorporation of minor and trace elements in foraminiferal shell carbonate varies, however, greatly between species and hence extrapolating calibrations to other species needs validation by additional (culturing) studies. Salinity is also known to impact other foraminiferal carbonate-based proxies, such as Mg ∕ Ca for temperature and Sr ∕ Ca for sea water carbonate chemistry. Better constraints on the role of salinity on these proxies will therefore improve their reliability. Using a controlled growth experiment spanning a salinity range of 20 units and analysis of element composition on single chambers using laser ablation-Q-ICP-MS, we show here that Na ∕ Ca correlates positively with salinity in two benthic foraminiferal species (Ammonia tepida and Amphistegina lessonii). The Na ∕ Ca values differ between the two species, with an approximately 2-fold higher Na ∕ Ca in A. lessonii than in A. tepida, coinciding with an offset in their Mg content (∼35 mmol mol⁻¹ versus ∼2.5 mmol mol⁻¹ for A. lessonii and A. tepida, respectively). Despite the offset in average Na ∕ Ca values, the slopes of the Na ∕ Ca–salinity regressions are similar between these two species (0.077 versus 0.064 mmol mol⁻¹ change per salinity unit). In addition, Mg ∕ Ca and Sr ∕ Ca are positively correlated with salinity in cultured A. tepida but show no correlation with salinity for A. lessonii. Electron microprobe mapping of incorporated Na and Mg of the cultured specimens shows that within chamber walls of A. lessonii, Na ∕ Ca and Mg ∕ Ca occur in elevated bands in close proximity to the primary organic lining. Between species, Mg banding is relatively similar, even though Mg content is 10 times lower and variation within the chamber wall is much less pronounced in A. tepida. In addition, Na banding is much less prominent in this species than it is in A. lessonii. Inter-species differences in element banding reported here are hypothesized to be caused by differences in biomineralization controls responsible for element uptake

    Fe-binding organic ligands in coastal and frontal regions of the western Antarctic Peninsula

    Organic ligands are a key factor determining the availability of dissolved iron (DFe) in the high-nutrient low-chlorophyll (HNLC) areas of the Southern Ocean. In this study, organic speciation of Fe is investigated along a natural gradient of the western Antarctic Peninsula, from an ice-covered shelf to the open ocean. An electrochemical approach, competitive ligand exchange – adsorptive cathodic stripping voltammetry (CLE-AdCSV), was applied. Our results indicated that organic ligands in the surface water on the shelf are associated with ice-algal exudates, possibly combined with melting of sea ice. Organic ligands in the deeper shelf water are supplied via the resuspension of slope or shelf sediments. Further offshore, organic ligands are most likely related to the development of phytoplankton blooms in open ocean waters. On the shelf, total ligand concentrations ([Lt]) were between 1.2 and 6.4 nM eq. Fe. The organic ligands offshore ranged between 1.0 and 3.0 nM eq. Fe. The southern boundary of the Antarctic Circumpolar Current (SB ACC) separated the organic ligands on the shelf from bloom-associated ligands offshore. Overall, organic ligand concentrations always exceeded DFe concentrations (excess ligand concentration, [L′] = 0.8–5.0 nM eq. Fe). The [L′] made up to 80 % of [Lt], suggesting that any additional Fe input can be stabilized in the dissolved form via organic complexation. The denser modified Circumpolar Deep Water (mCDW) on the shelf showed the highest complexation capacity of Fe (αFe'L; the product of [L′] and conditional binding strength of ligands, KFe'Lcond). Since Fe is also supplied by shelf sediments and glacial discharge, the high complexation capacity over the shelf can keep Fe dissolved and available for local primary productivity later in the season upon sea-ice melting.