96 research outputs found
Towards zero-shot language modeling
Can we construct a neural language model which is inductively biased towards learning human language? Motivated by this question, we aim to construct an informative prior for held-out languages on the task of character-level, open-vocabulary language modeling. We obtain this prior as the posterior over network weights conditioned on the data from a sample of training languages, which is approximated through Laplace’s method. Based on a large and diverse sample of languages, the use of our prior outperforms baseline models with an uninformative prior in both zero-shot and few-shot settings, showing that the prior is imbued with universal linguistic knowledge. Moreover, we harness broad language-specific information available for most languages of the world, i.e., features from typological databases, as distant supervision for held-out languages. We explore several language modeling conditioning techniques, including concatenation and meta-networks for parameter generation. They appear beneficial in the few-shot setting, but ineffective in the zero-shot setting. Since the paucity of even plain digital text affects the majority of the world’s languages, we hope that these insights will broaden the scope of applications for language technology.
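As a rough illustration of the Laplace-approximated prior described above, the following sketch (all names and values illustrative, not taken from the paper) builds a Gaussian log-prior from a MAP weight estimate and a diagonal Hessian:

```python
import numpy as np

def laplace_prior(w_map, hessian_diag):
    """Gaussian log-prior N(w_map, H^-1) from Laplace's method: the
    posterior over weights given the training languages is approximated
    around the MAP estimate w_map with (diagonal) curvature H."""
    def log_prior(w):
        diff = w - w_map
        # Up to an additive constant: -1/2 (w - w_map)^T H (w - w_map)
        return -0.5 * np.sum(hessian_diag * diff ** 2)
    return log_prior

# Toy values: a MAP estimate fit on training languages and its diagonal
# curvature; the prior then regularizes adaptation to a held-out
# language towards weights plausible under the training languages.
w_map = np.array([0.5, -1.2, 0.3])
hessian_diag = np.array([4.0, 1.0, 9.0])
log_prior = laplace_prior(w_map, hessian_diag)
```

The log-prior peaks at the MAP weights and penalizes deviations most strongly along high-curvature directions.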
Specializing distributional vectors of all words for lexical entailment
Semantic specialization methods fine-tune distributional word vectors using lexical knowledge from external resources (e.g., WordNet) to accentuate a particular relation between words. However, such post-processing methods suffer from limited coverage as they affect only vectors of words seen in the external resources. We present the first post-processing method that specializes vectors of all vocabulary words – including those unseen in the resources – for the asymmetric relation of lexical entailment (LE) (i.e., the hyponymy-hypernymy relation). Leveraging a partially LE-specialized distributional space, our POSTLE (i.e., post-specialization for LE) model learns an explicit global specialization function, allowing for specialization of vectors of unseen words, as well as word vectors from other languages via cross-lingual transfer. We capture the function as a deep feedforward neural network: its objective re-scales vector norms to reflect the concept hierarchy while simultaneously attracting hyponymy-hypernymy pairs to better reflect semantic similarity. An extended model variant augments the basic architecture with an adversarial discriminator. We demonstrate the usefulness and versatility of POSTLE models with different input distributional spaces in different scenarios (monolingual LE and zero-shot cross-lingual LE transfer) and tasks (binary and graded LE). We report consistent gains over state-of-the-art LE-specialization methods, and successfully LE-specialize word vectors for languages without any external lexical knowledge.
On the relation between linguistic typology and (limitations of) multilingual language modeling
A key challenge in cross-lingual NLP is developing general language-independent architectures that are equally applicable to any language. However, this ambition is largely hampered by the variation in structural and semantic properties, i.e. the typological profiles of the world's languages. In this work, we analyse the implications of this variation on the language modeling (LM) task. We present a large-scale study of state-of-the-art n-gram based and neural language models on 50 typologically diverse languages covering a wide variety of morphological systems. Operating in the full vocabulary LM setup focused on word-level prediction, we demonstrate that a coarse typology of morphological systems is predictive of absolute LM performance. Moreover, fine-grained typological features such as exponence, flexivity, fusion, and inflectional synthesis turn out to be responsible for the proliferation of low-frequency phenomena which are organically difficult to model by statistical architectures, or for the meaning ambiguity of character n-grams. Our study strongly suggests that these features have to be taken into consideration during the construction of next-level language-agnostic LM architectures, capable of handling morphologically complex languages such as Tamil or Korean. (ERC grant Lexica)
Adversarial propagation and zero-shot cross-lingual transfer of word vector specialization
Semantic specialization is a process of fine-tuning pre-trained distributional word vectors using external lexical knowledge (e.g., WordNet) to accentuate a particular semantic relation in the specialized vector space. While post-processing specialization methods are applicable to arbitrary distributional vectors, they are limited to updating only the vectors of words occurring in external lexicons (i.e., seen words), leaving the vectors of all other words unchanged. We propose a novel approach to specializing the full distributional vocabulary. Our adversarial post-specialization method propagates the external lexical knowledge to the full distributional space. We exploit words seen in the resources as training examples for learning a global specialization function. This function is learned by combining a standard L2-distance loss with an adversarial loss: the adversarial component produces more realistic output vectors. We show the effectiveness and robustness of the proposed method across three languages and on three tasks: word similarity, dialog state tracking, and lexical simplification. We report consistent improvements over distributional word vectors and vectors specialized by other state-of-the-art specialization frameworks. Finally, we also propose a cross-lingual transfer method for zero-shot specialization which successfully specializes a full target distributional space without any lexical knowledge in the target language and without any bilingual data.
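A minimal sketch of the combined objective described above, i.e. an L2-distance term plus a generator-side adversarial term (the mixing weight, toy vectors, and discriminator logits are illustrative, not from the paper):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def post_specialization_loss(predicted, target, disc_scores, lam=0.5):
    """Generator objective: an L2 term pulling predicted vectors towards
    their specialized targets, plus an adversarial term rewarding outputs
    that the discriminator scores as 'real' specialized vectors.
    lam is an illustrative mixing weight."""
    l2 = np.mean(np.sum((predicted - target) ** 2, axis=1))
    # Generator side of the GAN loss: minimize -log D(G(x)).
    adv = -np.mean(np.log(sigmoid(disc_scores) + 1e-12))
    return l2 + lam * adv

# Toy batch: vectors of 'seen' words and their specialized targets;
# disc_scores are the discriminator's logits for the predicted vectors.
predicted = np.array([[0.9, 0.1], [0.0, 1.1]])
target = np.array([[1.0, 0.0], [0.0, 1.0]])
disc_scores = np.array([2.0, 1.5])
loss = post_specialization_loss(predicted, target, disc_scores)
```

Training the specialization function on seen words against this loss, with the discriminator trained in alternation, lets it generalize to vectors of unseen words.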
Cross-lingual semantic specialization via lexical relation induction
Semantic specialization integrates structured linguistic knowledge from external resources (such as lexical relations in WordNet) into pretrained distributional vectors in the form of constraints. However, this technique cannot be leveraged in many languages, because their structured external resources are typically incomplete or non-existent. To bridge this gap, we propose a novel method that transfers specialization from a resource-rich source language (English) to virtually any target language. Our specialization transfer comprises two crucial steps: 1) inducing noisy constraints in the target language through automatic word translation; and 2) filtering the noisy constraints via a state-of-the-art relation prediction model trained on the source language constraints. This allows us to specialize any set of distributional vectors in the target language with the refined constraints. We prove the effectiveness of our method through intrinsic word similarity evaluation in 8 languages, and with 3 downstream tasks in 5 languages: lexical simplification, dialog state tracking, and semantic textual similarity. The gains over the previous state-of-the-art specialization methods are substantial and consistent across languages. Our results also suggest that the transfer method is effective even for lexically distant source-target language pairs. Finally, as a by-product, our method produces lists of WordNet-style lexical relations in resource-poor languages.
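The two-step induce-and-filter pipeline can be sketched as follows; the word translator and relation scorer here are simple stand-ins (a dictionary lookup and a score table) for the translation system and relation prediction model described above:

```python
def induce_constraints(source_pairs, translate, score, threshold=0.5):
    """Induce target-language constraints from source-language ones:
    1) translate each (w1, w2) source pair word by word;
    2) keep only pairs the relation scorer judges likely to hold."""
    noisy = [(translate(w1), translate(w2)) for w1, w2 in source_pairs]
    return [(t1, t2) for t1, t2 in noisy
            if t1 is not None and t2 is not None and score(t1, t2) >= threshold]

# Toy run with a dictionary translator and a lookup-based scorer.
lexicon = {"car": "auto", "vehicle": "fahrzeug", "cat": "katze"}
scores = {("auto", "fahrzeug"): 0.9, ("katze", "fahrzeug"): 0.1}
pairs = induce_constraints(
    [("car", "vehicle"), ("cat", "vehicle")],
    translate=lexicon.get,
    score=lambda a, b: scores.get((a, b), 0.0),
)
# pairs == [('auto', 'fahrzeug')]: the mistranslation-induced pair is filtered out.
```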
Decoding sentiment from distributed representations of sentences
Distributed representations of sentences have been developed recently to represent their meaning as real-valued vectors. However, it is not clear how much information such representations retain about the polarity of sentences. To study this question, we decode sentiment from unsupervised sentence representations learned with different architectures (sensitive to the order of words, the order of sentences, or none) in 9 typologically diverse languages. Sentiment results from the (recursive) composition of lexical items and grammatical strategies such as negation and concession. The results are manifold: we show that there is no 'one-size-fits-all' representation architecture outperforming the others across the board. Rather, the top-ranking architectures depend on the language and data at hand. Moreover, we find that in several cases the additive composition model based on skip-gram word vectors may surpass supervised state-of-the-art architectures such as bidirectional LSTMs. Finally, we provide a possible explanation of the observed variation based on the type of negative constructions in each language.
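The additive composition model mentioned above amounts to averaging word vectors; a minimal sketch (the toy 3-d vectors stand in for skip-gram embeddings, and a linear classifier over the result would decode polarity):

```python
import numpy as np

def sentence_vector(tokens, word_vectors, dim):
    """Additive composition: represent a sentence as the average of the
    vectors of its in-vocabulary words (zero vector if none are known)."""
    vecs = [word_vectors[t] for t in tokens if t in word_vectors]
    return np.mean(vecs, axis=0) if vecs else np.zeros(dim)

# Toy embeddings standing in for trained skip-gram vectors.
word_vectors = {
    "great": np.array([1.0, 0.2, 0.0]),
    "movie": np.array([0.1, 0.1, 0.1]),
    "not":   np.array([-0.8, 0.0, 0.3]),
}
v = sentence_vector(["great", "movie"], word_vectors, dim=3)
# v is the element-wise mean of the 'great' and 'movie' vectors.
```

Order-insensitivity is the model's main weakness, which is one reason negation strategies matter for how well sentiment can be decoded.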
Study of the excess Fe XXV line emission in the central degrees of the Galactic centre using XMM-Newton data
The diffuse Fe XXV (6.7 keV) line emission observed in the Galactic ridge is widely accepted to be produced by a superposition of a large number of unresolved X-ray point sources. In the very central degrees of our Galaxy, however, the existence of an extremely hot (~7 keV) diffuse plasma is still under debate. In this work we measure the Fe XXV line emission using all available XMM-Newton observations of the Galactic centre (GC) and inner disc (-10° < l < 10°, -2° < b < 2°). We use recent stellar mass distribution models to estimate the amount of X-ray emission originating from unresolved point sources, and find that within a region of l = ±1° and b = ±0.25° the 6.7 keV emission is 1.3-1.5 times in excess of what is expected from unresolved point sources. The excess emission is enhanced towards regions where known supernova remnants are located, suggesting that at least a part of this emission is due to genuine diffuse very hot plasma. If the entire excess is due to very hot plasma, an energy injection rate of at least ~6 × 10^40 erg s^-1 is required, which cannot be provided by the measured supernova explosion rate or past Sgr A* activity alone. However, we find that almost the entire excess we observe can be explained by assuming GC stellar populations with iron abundances ~1.9 times higher than those in the bar/bulge, a value that can be reproduced by fitting diffuse X-ray spectra from the corresponding regions. Even in this case, a leftover X-ray excess is concentrated within l = ±0.3° and b = ±0.15°, corresponding to a thermal energy of ~2 × 10^52 erg, which can be reproduced by the estimated supernova explosion rate in the GC. Finally we discuss a possible connection to the observed GC Fermi-LAT excess.
Hypomelanosis of Ito with a trisomy 2 mosaicism: a case report
Introduction: Hypomelanosis of Ito is a rare neurocutaneous disorder, characterized by streaks and swirls of hypopigmentation following the lines of Blaschko that may be associated with systemic abnormalities involving the central nervous system and musculoskeletal system. Despite the preponderance of reported sporadic hypomelanosis of Ito, a few reports of familial hypomelanosis of Ito have been described. Case presentation: A 6-month-old Caucasian girl presented with unilateral areas of hypomelanosis distributed on the left half of her body, and her father presented with similar mosaic hypopigmented lesions on his upper chest. Whereas both blood karyotypes obtained from peripheral lymphocyte cultures were normal, a 16% trisomy 2 mosaicism was found in cultured skin fibroblasts derived from a hypopigmented skin area of her father. Conclusions: Familial cases of hypomelanosis of Ito are very rare and can occur in patients without systemic involvement. Hypomelanosis of Ito constitutes a non-specific diagnostic definition encompassing different clinical entities with a wide phenotypic variability, either sporadic or familial. Unfortunately, a large number of cases remain misdiagnosed due to both diagnostic challenges and controversial issues concerning cutaneous biopsies in the pediatric population.
Prostate Cancer Cell Lines under Hypoxia Exhibit Greater Stem-Like Properties
Hypoxia is an important environmental change in many cancers. Hypoxic niches can be occupied by cancer stem/progenitor-like cells that are associated with tumor progression and resistance to radiotherapy and chemotherapy. However, it has not yet been fully elucidated how hypoxia influences the stem-like properties of prostate cancer cells. In this report, we investigated the effects of hypoxia on the human prostate cancer cell lines PC-3 and DU145. In comparison to normoxia (20% O2), 7% O2 induced higher expression of HIF-1α and HIF-2α, which was associated with upregulation of Oct3/4 and Nanog; 1% O2 induced even greater levels of these factors. The upregulated NANOG mRNA expression in hypoxia was confirmed to derive predominantly from the retrogene NANOGP8. Similar growth rates were observed for cells cultivated under hypoxic and normoxic conditions for 48 hours; however, the colony formation assay revealed that 48 hours of hypoxic pretreatment resulted in the formation of more colonies. Treatment with 1% O2 also extended the G0/G1 stage, resulting in more side population cells, and induced CD44 and ABCG2 expression. Hypoxia also increased the number of cells positive for ABCG2 expression, which were predominantly found to be CD44bright cells. Correspondingly, the sorted CD44bright cells expressed higher levels of ABCG2, Oct3/4, and Nanog than CD44dim cells, and hypoxic pretreatment significantly increased the expression of these factors. CD44bright cells under normoxia formed significantly more colonies and spheres compared with the CD44dim cells, and hypoxic pretreatment further increased this effect. Our data indicate that prostate cancer cells under hypoxia possess greater stem-like properties.