102 research outputs found

    Conversion of a Russian dependency treebank into HPSG derivations

    Proceedings of the Ninth International Workshop on Treebanks and Linguistic Theories. Editors: Markus Dickinson, Kaili Müürisep and Marco Passarotti. NEALT Proceedings Series, Vol. 9 (2010), 7-18. © 2010 The editors and contributors. Published by Northern European Association for Language Technology (NEALT) http://omilia.uio.no/nealt . Electronically published at Tartu University Library (Estonia) http://hdl.handle.net/10062/15891

    Language models, surprisal and fantasy in Slavic intercomprehension

    In monolingual human language processing, the predictability of a word given its surrounding sentential context is crucial. With regard to receptive multilingualism, it is unclear to what extent predictability in context interplays with other linguistic factors in understanding a related but unknown language – a process called intercomprehension. We distinguish two dimensions influencing processing effort during intercomprehension: surprisal in sentential context and linguistic distance. Based on this hypothesis, we formulate expectations regarding the difficulty of designed experimental stimuli and compare them to the results from think-aloud protocols of experiments in which Czech native speakers decode Polish sentences by agreeing on an appropriate translation. On the one hand, orthographic and lexical distances are reliable predictors of linguistic similarity. On the other hand, we obtain the predictability of words in a sentence with the help of trigram language models. We find that linguistic distance (encoding similarity) and in-context surprisal (predictability in context) appear to be complementary, with neither factor outweighing the other, and that our distinguishing of these two measurable dimensions is helpful in understanding certain unexpected effects in human behaviour

    Cross-Domain Adaptation of Spoken Language Identification for Related Languages: The Curious Case of Slavic Languages

    State-of-the-art spoken language identification (LID) systems, which are based on end-to-end deep neural networks, have shown remarkable success not only in discriminating between distant languages but also between closely-related languages or even different spoken varieties of the same language. However, it is still unclear to what extent neural LID models generalize to speech samples with different acoustic conditions due to domain shift. In this paper, we present a set of experiments to investigate the impact of domain mismatch on the performance of neural LID systems for a subset of six Slavic languages across two domains (read speech and radio broadcast) and examine two low-level signal descriptors (spectral and cepstral features) for this task. Our experiments show that (1) out-of-domain speech samples severely hinder the performance of neural LID models, and (2) while both spectral and cepstral features show comparable performance within-domain, spectral features show more robustness under domain mismatch. Moreover, we apply unsupervised domain adaptation to minimize the discrepancy between the two domains in our study. We achieve relative accuracy improvements that range from 9% to 77% depending on the diversity of acoustic conditions in the source domain. Comment: To appear in INTERSPEECH 202

    On the Correlation of Context-Aware Language Models With the Intelligibility of Polish Target Words to Czech Readers

    This contribution seeks to provide a rational probabilistic explanation for the intelligibility of words in a genetically related language that is unknown to the reader, a phenomenon referred to as intercomprehension. In this research domain, linguistic distance, among other factors, has been shown to correlate well with the mutual intelligibility of individual words. However, the role of context in the intelligibility of target words in sentences has been the subject of very few studies. To address this, we analyze data from web-based experiments in which Czech (CS) respondents were asked to translate highly predictable target words at the final position of Polish sentences. We compare correlations of target word intelligibility with data from 3-gram language models (LMs) to their correlations with data obtained from context-aware LMs. More specifically, we evaluate two context-aware LM architectures: Long Short-Term Memory networks (LSTMs), which can, theoretically, take infinitely long-distance dependencies into account, and Transformer-based LMs, which can access the whole input sequence at the same time. We investigate how their use of context affects surprisal and its correlation with intelligibility

    Cancer-Associated Fibroblasts Suppress CD8+ T-cell Infiltration and Confer Resistance to Immune-Checkpoint Blockade

    © 2022 The Authors. Immune-checkpoint blockade (ICB) promotes antitumor immune responses and can result in durable patient benefit. However, response rates in breast cancer patients remain modest, stimulating efforts to discover novel treatment options. Cancer-associated fibroblasts (CAF) represent a major component of the breast tumor microenvironment and have known immunosuppressive functions in addition to their well-established roles in directly promoting tumor growth and metastasis. Here we utilized paired syngeneic mouse mammary carcinoma models to show that CAF abundance is associated with insensitivity to combination αCTLA4 and αPD-L1 ICB. CAF-rich tumors exhibited an immunologically cold tumor microenvironment, with transcriptomic, flow cytometric, and quantitative histopathologic analyses demonstrating a relationship between CAF density and a CD8+ T-cell–excluded tumor phenotype. The CAF receptor Endo180 (Mrc2) is predominantly expressed on myofibroblastic CAFs, and its genetic deletion depleted a subset of αSMA-expressing CAFs and impaired tumor progression in vivo. The addition of wild-type, but not Endo180-deficient, CAFs in coimplantation studies restricted CD8+ T-cell intratumoral infiltration, and tumors in Endo180 knockout mice exhibited increased CD8+ T-cell infiltration and enhanced sensitivity to ICB compared with tumors in wild-type mice. Clinically, in a trial of melanoma patients, high MRC2 mRNA levels in tumors were associated with a poor response to αPD-1 therapy, highlighting the potential benefits of therapeutically targeting a specific CAF subpopulation in breast and other CAF-rich cancers to improve clinical responses to immunotherapy

    Loss of G9a preserves mutation patterns but increases chromatin accessibility, genomic instability and aggressiveness in skin tumours

    Mutations in, and the altered expression of, epigenetic modifiers are pervasive in human tumours, making epigenetic factors attractive antitumour targets. The open-versus-closed chromatin state within the cells-of-origin of cancer correlates with the uneven distribution of mutations. However, the long-term effect of targeting epigenetic modifiers on mutability in patients with cancer is unclear. Here, we increased chromatin accessibility by deleting the histone H3 lysine 9 (H3K9) methyltransferase G9a in murine epidermis and show that this does not alter the single nucleotide variant burden or global genomic distribution in chemical mutagen-induced squamous tumours. G9a-depleted tumours develop after a prolonged latency compared with their wild-type counterparts, but are more aggressive and have an expanded cancer progenitor pool, pronounced genomic instability and frequent loss-of-function p53 mutations. Thus, we call for caution when assessing long-term therapeutic benefits of chromatin modifier inhibitors, which may promote more aggressive disease

    Epigenetic control of IL-23 expression in keratinocytes is important for chronic skin inflammation

    Although IL-23 is expressed by psoriatic keratinocytes as well as immune cells, only the immune cell derived IL-23 is thought to be important for the development of psoriasis. Here the authors provide evidence that keratinocyte-produced IL-23 is sufficient to cause a chronic skin inflammation

    Dnmt3a and Dnmt3b Associate with Enhancers to Regulate Human Epidermal Stem Cell Homeostasis

    The genome-wide localization and function of endogenous Dnmt3a and Dnmt3b in adult stem cells are unknown. Here, we show that in human epidermal stem cells, the two proteins bind in a histone H3K36me3-dependent manner to the most active enhancers and are required to produce their associated enhancer RNAs. Both proteins prefer super-enhancers associated to genes that either define the ectodermal lineage or establish the stem cell and differentiated states. However, Dnmt3a and Dnmt3b differ in their mechanisms of enhancer regulation: Dnmt3a associates with p63 to maintain high levels of DNA hydroxymethylation at the center of enhancers in a Tet2-dependent manner, whereas Dnmt3b promotes DNA methylation along the body of the enhancer. Depletion of either protein inactivates their target enhancers and profoundly affects epidermal stem cell function. Altogether, we reveal novel functions for Dnmt3a and Dnmt3b at enhancers that could contribute to their roles in disease and tumorigenesis. The S.A.B. laboratory research is supported by the European Research Council (ERC), the Worldwide Cancer Research Foundation, the Foundation La Marató de TV3, the Spanish Ministry of Economy and Development, the Foundation Vencer el Cancer (‘‘Beat Cancer’’), the Government of Cataluña (SGR and Mario Salvia’ grants), the Foundation Fundación Botín, and the Institute for Research in Biomedicine (IRB-Barcelona). L.R. is a La Caixa Foundation Ph.D. fellow. G.S. was supported by an AXA postdoctoral fellowship. IRB Barcelona is the recipient of a Severo Ochoa Award of Excellence from MINECO (Government of Spain). L.D.C. was supported by grants from the Spanish Ministerio de Educación y Ciencia (SAF2013-48926-P) and the European Commission’s 7th Framework Program 4DCellFate grant number 277899. We are grateful to the Common Fund’s Epigenomic Program from the NIH (USA) for providing the bisulphite whole genome sequencing data of human EpSC

    The role of autophagy in the cross-talk between epithelial-mesenchymal transitioned tumor cells and cancer stem-like cells

    Epithelial-mesenchymal transition (EMT) and cancer stem-like cells (CSC) are becoming highly relevant targets in anticancer drug discovery. A large body of evidence suggests that epithelial-mesenchymal transitioned tumor cells (EMT tumor cells) and CSCs have similar functions. There is also an overlap regarding the stimuli that can induce the generation of EMT tumor cells and CSCs. Moreover, there is direct evidence that EMT can give rise to CSCs. It is unclear, however, whether EMT tumor cells should be considered CSCs or whether they have to undergo further changes. In this article we summarize available evidence suggesting that, indeed, additional programs must be engaged, and we propose that macroautophagy (hereafter, autophagy) represents a key trait distinguishing CSCs from EMT tumor cells. Thus, CSCs have often been reported to be in an autophagic state, and blockade of autophagy inhibits CSCs. On the other hand, there is ample evidence showing that EMT and autophagy are distinct events. CSCs, however, are themselves a heterogeneous population. They have been divided into predominantly non-cycling and cycling CSCs, the latter representing CSCs that self-renew and replenish the pool of differentiated tumor cells. We now suggest that the non-cycling CSC subpopulation is in an autophagic state. We also propose two models to explain the relationship between EMT tumor cells and these two major CSC subpopulations: a branching model in which EMT tumor cells can give rise to cycling or non-cycling CSCs, respectively, and a hierarchical model in which EMT tumor cells are first induced to become autophagic CSCs and, subsequently, cycling CSCs. Finally, we address the therapeutic consequences of these insights

    MAF amplification licenses ERα through epigenetic remodelling to drive breast cancer metastasis

    MAF amplification increases the risk of breast cancer (BCa) metastasis through mechanisms that are still poorly understood yet have important clinical implications. Oestrogen-receptor-positive (ER+) BCa requires oestrogen for both growth and metastasis, albeit by ill-known mechanisms. Here we integrate proteomics, transcriptomics, epigenomics, chromatin accessibility and functional assays from human and syngeneic mouse BCa models to show that MAF directly interacts with oestrogen receptor alpha (ERα), thereby promoting a unique chromatin landscape that favours metastatic spread. We identify metastasis-promoting genes that are de novo licensed following oestrogen exposure in a MAF-dependent manner. The histone demethylase KDM1A is key to the epigenomic remodelling that facilitates the expression of the pro-metastatic MAF/oestrogen-driven gene expression program, and loss of KDM1A activity prevents this metastasis. We have thus determined that the molecular basis underlying MAF/oestrogen-mediated metastasis requires genetic, epigenetic and hormone signals from the systemic environment, which influence the ability of BCa cells to metastasize