27 research outputs found
How Furiously Can Colourless Green Ideas Sleep? Sentence Acceptability in Context
We study the influence of context on sentence acceptability. First we compare the acceptability ratings of sentences judged in isolation, with a relevant context, and with an irrelevant context. Our results show that context induces a cognitive load for humans, which compresses the distribution of ratings. Moreover, in relevant contexts we observe a discourse coherence effect which uniformly raises acceptability. Next, we test unidirectional and bidirectional language models in their ability to predict acceptability ratings. The bidirectional models show very promising results, with the best model achieving a new state-of-the-art for unsupervised acceptability prediction. The two sets of experiments provide insights into the cognitive aspects of sentence processing and central issues in the computational modelling of text and discourse.
Evaluation of contextual embeddings on less-resourced languages
The current dominance of deep neural networks in natural language processing is based on contextual embeddings such as ELMo, BERT, and BERT derivatives. Most existing work focuses on English; in contrast, we present here the first multilingual empirical comparison of two ELMo and several monolingual and multilingual BERT models using 14 tasks in nine languages. In monolingual settings, our analysis shows that monolingual BERT models generally dominate, with a few exceptions such as the dependency parsing task, where they are not competitive with ELMo models trained on large corpora. In cross-lingual settings, BERT models trained on only a few languages mostly do best, closely followed by massively multilingual BERT models.
Effect of the rs2259816 polymorphism in the HNF1A gene on circulating levels of C-reactive protein and coronary artery disease (the Ludwigshafen Risk and Cardiovascular Health Study)
Background: C-reactive protein is a well established marker of inflammation and has been used to predict future cardiovascular disease. It is still controversial if it plays an active role in the development of cardiovascular disease. Recently, polymorphisms in the gene for HNF1α have been linked to the levels of C-reactive protein and coronary artery disease. Methods: We investigated the association of the rs2259816 polymorphism in the HNF1A gene with the circulating level of C-reactive protein and the hazard of coronary artery disease in the LURIC Study cohort. Results: Compared to CC homozygotes, the level of C-reactive protein was decreased in carriers of at least one A-allele. Each A-allele decreased CRP by approximately 15%. The odds ratio for coronary artery disease was only very slightly increased in carriers of the A-allele and this association did not reach statistical significance. Conclusions: In the LURIC Study cohort the A-allele of rs2259816 is associated with decreased CRP but not with coronary artery disease.
CoSimLex: A Resource for Evaluating Graded Word Similarity in Context
State-of-the-art natural language processing tools are built on context-dependent word embeddings, but no direct method for evaluating these representations currently exists. Standard tasks and datasets for intrinsic evaluation of embeddings are based on judgements of similarity, but ignore context; standard tasks for word sense disambiguation take account of context but do not provide continuous measures of meaning similarity. This paper describes an effort to build a new dataset, CoSimLex, intended to fill this gap. Building on the standard pairwise similarity task of SimLex-999, it provides context-dependent similarity measures; covers not only discrete differences in word sense but more subtle, graded changes in meaning; and covers not only a well-resourced language (English) but a number of less-resourced languages. We define the task and evaluation metrics, outline the dataset collection methodology, and describe the status of the dataset so far.