
    Better, Faster, Stronger Sequence Tagging Constituent Parsers

    Sequence tagging models for constituent parsing are faster, but less accurate, than other types of parsers. In this work, we address the following weaknesses of such constituent parsers: (a) high error rates around closing brackets of long constituents, (b) large label sets, leading to sparsity, and (c) error propagation arising from greedy decoding. To close brackets effectively, we train a model that learns to switch between tagging schemes. To reduce sparsity, we decompose the label set and use multi-task learning to jointly predict sublabels. Finally, we mitigate issues from greedy decoding through auxiliary losses and sentence-level fine-tuning with policy gradient. Combining these techniques, we clearly surpass the performance of sequence tagging constituent parsers on the English and Chinese Penn Treebanks, and reduce their parsing time even further. On the SPMRL datasets, we observe even greater improvements across the board, including a new state of the art on Basque, Hebrew, Polish and Swedish. Comment: NAACL 2019 (long papers). Contains corrigendum.
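    The label-decomposition idea can be sketched in a few lines. The composite tag format below ("2_NP": a shared-ancestor count plus a nonterminal) is an illustrative stand-in, not the paper's exact encoding; the point is that splitting each tag into sublabels, each predicted by its own multi-task head, yields two small inventories in place of one sparse product space:

    ```python
    # Minimal sketch of label decomposition for sequence-tagging parsing.
    # The "depth_nonterminal" tag format here is hypothetical.

    def decompose(label):
        """Split a composite constituent tag into its sublabels."""
        depth, nonterminal = label.split("_", 1)
        return int(depth), nonterminal

    def recompose(depth, nonterminal):
        """Rebuild the composite tag from jointly predicted sublabels."""
        return f"{depth}_{nonterminal}"

    composite_tagset = ["1_NP", "2_NP", "1_VP", "2_S", "3_S"]
    depths = sorted({decompose(t)[0] for t in composite_tagset})
    nonterminals = sorted({decompose(t)[1] for t in composite_tagset})

    # The sublabel inventories are far smaller than the product space,
    # which is what reduces sparsity.
    print(len(composite_tagset), len(depths), len(nonterminals))  # 5 3 3
    ```

    Each sublabel inventory grows roughly linearly rather than multiplicatively, so each output head sees many more training examples per label.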

    What can we learn from Semantic Tagging?

    We investigate the effects of multi-task learning using the recently introduced task of semantic tagging. We employ semantic tagging as an auxiliary task for three different NLP tasks: part-of-speech tagging, Universal Dependency parsing, and Natural Language Inference. We compare full neural network sharing, partial neural network sharing, and what we term the learning what to share setting, where negative transfer between tasks is less likely. Our findings show considerable improvements for all tasks, particularly in the learning what to share setting, which shows consistent gains across all tasks. Comment: 9 pages with references and appendixes. EMNLP 2018 camera ready.
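    The full-sharing setup compared here can be illustrated with a toy hard-parameter-sharing model, a NumPy stand-in with invented layer and tagset sizes: a single shared encoder feeds one output head per task, so gradients from every task update the same encoder weights.

    ```python
    import numpy as np

    rng = np.random.default_rng(0)

    # Shared encoder parameters (full/hard sharing): every task reuses W_shared.
    W_shared = rng.normal(size=(8, 8))

    # Task-specific output heads; the tagset sizes (17, 66) are illustrative.
    heads = {"pos": rng.normal(size=(8, 17)),
             "semtag": rng.normal(size=(8, 66))}

    def forward(x, task):
        h = np.tanh(x @ W_shared)  # shared representation for all tasks
        return h @ heads[task]     # task-specific logits

    x = rng.normal(size=(1, 8))
    print(forward(x, "pos").shape, forward(x, "semtag").shape)  # (1, 17) (1, 66)
    ```

    Partial sharing would keep some encoder layers task-specific; the learning-what-to-share setting instead learns a gating between shared and private representations, which is what makes negative transfer less likely.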

    Investigating the Universal Semantic Tagging Task with Neural Networks, Auxiliary Tasks, and Multilingual Learning

    July 19, 2018. In this thesis we present an investigation of multi-task and transfer learning using the recently introduced task of semantic tagging. First, we employ a number of natural language processing tasks as auxiliaries for semantic tagging. Secondly, going in the other direction, we employ semantic tagging as an auxiliary task for three different NLP tasks: Part-of-Speech Tagging, Universal Dependency parsing, and Natural Language Inference. We compare full neural network sharing, partial neural network sharing, and what we term the learning what to share setting, where negative transfer between tasks is less likely. Finally, we investigate multilingual learning framed as a special case of multi-task learning. Our findings show considerable improvements for most experiments, demonstrating a variety of cases where multi-task and transfer learning methods are beneficial. Institute of Formal and Applied Linguistics, Faculty of Mathematics and Physics

    The Sensitivity of Language Models and Humans to Winograd Schema Perturbations

    Large-scale pretrained language models are the major driving force behind recent improvements in performance on the Winograd Schema Challenge, a widely employed test of common sense reasoning ability. We show, however, with a new diagnostic dataset, that these models are sensitive to linguistic perturbations of the Winograd examples that minimally affect human understanding. Our results highlight interesting differences between humans and language models: language models are more sensitive to number or gender alternations and synonym replacements than humans, and humans are more stable and consistent in their predictions, maintain a much higher absolute performance, and perform better on non-associative instances than associative ones. Overall, humans are correct more often than out-of-the-box models, and the models are sometimes right for the wrong reasons. Finally, we show that fine-tuning on a large, task-specific dataset can offer a solution to these issues. Comment: ACL 2020.
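    The perturbation types can be illustrated with a toy schema-style sentence, invented here and not drawn from the paper's diagnostic dataset:

    ```python
    # Toy illustration of the minimal perturbations described: synonym
    # replacement and number alternation applied to a Winograd-style sentence.
    # The example sentence is invented for illustration.
    original = "The trophy did not fit in the suitcase because it was too big."

    perturbations = {
        "synonym": original.replace("big", "large"),
        "number": (original.replace("The trophy", "The trophies")
                           .replace("it was", "they were")),
    }
    for kind, sent in perturbations.items():
        print(kind, "->", sent)
    ```

    Each perturbation leaves the correct coreference resolution unchanged for a human reader, which is exactly why model sensitivity to such edits is diagnostic.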

    Mapping Brains with Language Models: A Survey

    Over the years, many researchers have seemingly made the same observation: brain and language model activations exhibit some structural similarities, enabling linear partial mappings between features extracted from neural recordings and computational language models. In an attempt to evaluate how much evidence has been accumulated for this observation, we survey over 30 studies spanning 10 datasets and 8 metrics. How much evidence has been accumulated, and what, if anything, is missing before we can draw conclusions? Our analysis of the evaluation methods used in the literature reveals that some of the metrics are less conservative. We also find that the accumulated evidence, for now, remains ambiguous, but correlations with model size and quality provide grounds for cautious optimism.
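    The linear partial mappings the survey refers to typically take the shape of an encoding model: a ridge regression from model features to neural recordings, scored by correlation on held-out stimuli. A minimal synthetic sketch, with invented dimensions and planted data rather than any of the surveyed datasets:

    ```python
    import numpy as np

    rng = np.random.default_rng(1)

    # Synthetic stand-ins: model activations X (stimuli x features) and
    # recordings Y (stimuli x voxels), with a planted linear relation.
    X = rng.normal(size=(200, 16))
    true_map = rng.normal(size=(16, 4))
    Y = X @ true_map + 0.1 * rng.normal(size=(200, 4))

    # Ridge regression fit on a train split, evaluated by per-voxel
    # Pearson r on held-out stimuli.
    X_tr, X_te, Y_tr, Y_te = X[:150], X[150:], Y[:150], Y[150:]
    lam = 1.0
    W = np.linalg.solve(X_tr.T @ X_tr + lam * np.eye(16), X_tr.T @ Y_tr)
    pred = X_te @ W

    def pearson(a, b):
        a = a - a.mean(0)
        b = b - b.mean(0)
        return (a * b).sum(0) / np.sqrt((a ** 2).sum(0) * (b ** 2).sum(0))

    print(pearson(pred, Y_te).round(3))  # near 1.0 on this planted data
    ```

    The choice of score is exactly where the survey's point about conservativeness bites: raw held-out correlation against a weak baseline is far more permissive than, say, comparison to a noise ceiling or a control feature space.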

    Increased expression of T-helper cell activation markers in peripheral blood of children with atopic asthma

    Background: Activated T-helper (CD4) cells have been implicated in the pathogenesis of bronchial asthma. However, the profile of circulating CD4 subsets in relation to disease activity and asthma severity is unclear.
    Objective: To study the dynamic changes in peripheral blood CD4 cells expressing the naïve/memory activation markers (CD45RA/CD45RO) and the interleukin-2 receptor α-chain (CD25) in asthmatic children during and after resolution of acute asthma attacks, and to determine whether the expression of these activation markers would be of value in monitoring asthma severity and the response to inhaled glucocorticoids.
    Methods: Peripheral blood samples were obtained from 20 asthmatic children aged 0.5 to 9 years (mean ± SD: 4.37 ± 2.37 years) with acute asthma attacks, 10 children with lower respiratory tract infection, and 20 healthy, age-matched subjects. CD4 cells expressing CD45RA, CD45RO, CD45RA+RO+ and CD25 were analyzed by dual-color flow cytometry, and serum IgE was measured by ELISA. In asthmatic children, the measurements were repeated after resolution of the acute attacks.
    Results: During acute asthma attacks, the percentages of CD45RA, CD45RO, CD45RA+RO+ and CD25 were significantly increased compared with the control group (p < 0.05 for CD45RA and p < 0.0001 for the other three subsets). After resolution of the attacks, a significant reduction of all subsets was observed: the percentages of CD45RA and CD45RO decreased to normal values, while those of CD45RA+RO+ and CD25 remained significantly higher than in controls (p < 0.05 for each marker). Unlike healthy children and patients with acute lower respiratory infections, asthmatic children showed an increased CD45RO/CD45RA ratio (> 1) and a significantly increased percentage of CD45RA+RO+ cells. During acute attacks, patients with severe persistent asthma showed the highest percentages of all T-helper subsets compared with those with moderate or mild persistent asthma. Positive correlations were found between serum IgE levels and both CD45RO and CD25 (r = 0.962, p < 0.001 and r = 0.882, p < 0.05, respectively) during acute attacks, and these correlations remained significant in remission (r = 0.632, p < 0.05 and r = 0.589, p < 0.05, respectively). Inhaled glucocorticoid therapy induced a significant reduction in the percentages of CD45RO, CD45RA+RO+ and CD25.
    Conclusion: Peripheral blood T-helper cell activation markers are reliable indicators for monitoring disease activity and severity of asthma. The reversed memory/naïve T-helper cell ratio, together with the presence of a clone of cells co-expressing both naïve and memory surface markers, distinguishes atopic asthma from acute lower respiratory infections. Inhaled glucocorticoid therapy significantly inhibits peripheral blood T-helper cell activation markers.
    Key words: Children, atopic asthma, T-helper cell subsets, glucocorticoid inhalation, lower respiratory infections, CD45RO, CD45RA, CD25

    Attention Can Reflect Syntactic Structure (If You Let It)

    Since the popularization of the Transformer as a general-purpose feature encoder for NLP, many studies have attempted to decode linguistic structure from its novel multi-head attention mechanism. However, much of this work has focused almost exclusively on English, a language with rigid word order and little inflectional morphology. In this study, we present decoding experiments for multilingual BERT across 18 languages in order to test the generalizability of the claim that dependency syntax is reflected in attention patterns. We show that full trees can be decoded above baseline accuracy from single attention heads, and that individual relations are often tracked by the same heads across languages. Furthermore, in an attempt to address recent debates about the status of attention as an explanatory mechanism, we experiment with fine-tuning mBERT on a supervised parsing objective while freezing different series of parameters. Interestingly, in steering the objective to learn explicit linguistic structure, we find much of the same structure represented in the resulting attention patterns, with interesting differences depending on which parameters are frozen.
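    The simplest form of such a decoding experiment can be sketched directly: predict each token's syntactic head as its most-attended position in a single attention head, with self-attention masked out. The attention values below are invented for illustration, and this greedy baseline only picks per-token heads rather than decoding a guaranteed tree:

    ```python
    import numpy as np

    # Toy attention matrix for 4 tokens plus a root symbol at column 0
    # (values invented). Row i holds token (i+1)'s attention distribution;
    # column j > 0 corresponds to token j.
    att = np.array([
        [0.05, 0.60, 0.15, 0.10, 0.10],  # token 1 (mostly self-attention)
        [0.70, 0.05, 0.10, 0.10, 0.05],  # token 2 attends mostly to root
        [0.05, 0.10, 0.65, 0.10, 0.10],  # token 3
        [0.05, 0.10, 0.70, 0.05, 0.10],  # token 4
    ])

    def greedy_heads(att):
        """Predict each token's head as its most-attended position,
        after masking out the token's own column."""
        scores = att.copy()
        for i in range(scores.shape[0]):
            scores[i, i + 1] = -np.inf  # token i+1's own column is i+1
        return scores.argmax(axis=1)

    print(greedy_heads(att))  # [2 0 2 2]: heads of tokens 1..4 (0 = root)
    ```

    Decoding full trees, as in the paper, would instead run a maximum-spanning-tree algorithm over these scores; the greedy version above is the per-relation baseline that makes "the same heads track the same relations" measurable.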