
    Capturing contentiousness: Constructing the contentious terms in context corpus

    Recent initiatives by cultural heritage institutions in addressing outdated and offensive language used in their collections demonstrate the need to better understand when terms are problematic or contentious. This paper presents an annotated dataset of 2,715 unique samples of terms in context, drawn from a historical newspaper archive, collating 21,800 annotations of contentiousness from expert and crowd workers. We describe the contents of the corpus by analysing inter-rater agreement and differences between experts and crowd workers. In addition, we demonstrate the potential of the corpus for automated detection of contentiousness: a simple classifier applied to the embedding representation of a target word outperforms the baseline in predicting contentiousness. We find that both the term itself and its context play a role in whether a term is considered contentious.
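    The "simple classifier on a target-word embedding" setup can be sketched as follows. This is a minimal illustration, not the paper's implementation: the embeddings here are synthetic stand-ins, whereas in the described setup they would be contextual representations of the term in its newspaper context.

```python
# Sketch: predict contentiousness from a target word's embedding.
# The vectors below are random stand-ins for contextual embeddings;
# dimensions, cluster means and labels are illustrative assumptions.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
dim = 32

# Two loose clusters standing in for contentious (1) vs.
# non-contentious (0) terms-in-context.
X_pos = rng.normal(loc=1.0, scale=1.0, size=(100, dim))
X_neg = rng.normal(loc=-1.0, scale=1.0, size=(100, dim))
X = np.vstack([X_pos, X_neg])
y = np.array([1] * 100 + [0] * 100)

# A simple linear classifier over the embedding, as in the abstract.
clf = LogisticRegression(max_iter=1000).fit(X, y)
print(round(clf.score(X, y), 2))
```

    On real data the labels would come from the expert and crowd annotations, and performance would be compared against a majority-class or keyword baseline.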

    Towards Olfactory Information Extraction from Text: A Case Study on Detecting Smell Experiences in Novels

    Environmental factors determine the smells we perceive, but societal factors shape the importance, sentiment and biases we give to them. Descriptions of smells in text, or as we call them 'smell experiences', offer a window into these factors, but they must first be identified. To the best of our knowledge, no tool exists to extract references to smell experiences from text. In this paper, we present two variations on a semi-supervised approach to identify smell experiences in English literature. The combined set of patterns from both implementations offers significantly better performance than a keyword-based baseline. Accepted to the 4th Joint SIGHUM Workshop on Computational Linguistics for Cultural Heritage, Social Sciences, Humanities and Literature (LaTeCH-CLfL 2020), Barcelona, Spain, December 2020.
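    A pattern-based pass of the kind the abstract contrasts with a keyword baseline might look like the sketch below. The seed patterns are purely illustrative assumptions, not the paper's actual pattern set.

```python
# Sketch: flag candidate "smell experience" sentences with lexical
# patterns. The three seed patterns are hypothetical examples; a
# semi-supervised approach would bootstrap many more from seed matches.
import re

PATTERNS = [
    r"\bsmell(?:ed|s|ing)? of\b",
    r"\bscent of\b",
    r"\bodou?r of\b",
]

def find_smell_candidates(sentence):
    """Return the patterns that match a sentence (empty list if none)."""
    return [p for p in PATTERNS
            if re.search(p, sentence, re.IGNORECASE)]

print(find_smell_candidates("The room smelled of lavender and dust."))
print(find_smell_candidates("He walked home in silence."))
```

    A keyword baseline would match any sentence containing e.g. "smell", which over-generates; anchoring on constructions like "smelled of" is one way patterns can improve precision.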

    Improving language model predictions via prompts enriched with knowledge graphs

    Despite advances in deep learning and knowledge graphs (KGs), using language models for natural language understanding and question answering remains a challenging task. Pre-trained language models (PLMs) have been shown to leverage contextual information to complete cloze prompts, next-sentence completion and question answering tasks in various domains. Unlike structured data querying in e.g. KGs, mapping an input question to data that may or may not be stored by the language model is not a simple task. Recent studies have highlighted the improvements that can be made to the quality of information retrieved from PLMs by adding auxiliary data to otherwise naive prompts. In this paper, we explore the effects on language model performance of enriching prompts with additional contextual information leveraged from the Wikidata KG. Specifically, we compare the performance of naive vs. KG-engineered cloze prompts for entity genre classification in the movie domain. Selecting a broad range of commonly available Wikidata properties, we show that enriching cloze-style prompts with Wikidata information can result in significantly higher recall for the investigated BERT and RoBERTa large PLMs. However, it is also apparent that the optimum level of data enrichment differs between models.
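    The naive-vs.-enriched cloze contrast can be sketched as prompt construction. The prompt templates and the Wikidata facts below are illustrative assumptions; in the described setup the facts would be retrieved from Wikidata properties of the entity, and the prompt would be fed to a masked-language model such as BERT.

```python
# Sketch: building naive vs. KG-enriched cloze prompts for movie-genre
# classification. "[MASK]" is the token a masked PLM would fill in.
def naive_prompt(title):
    return f"{title} is a [MASK] film."

def kg_enriched_prompt(title, facts):
    # Prepend auxiliary KG context (property, value pairs) so the PLM
    # has more signal when predicting the masked genre.
    context = " ".join(f"The {p} of {title} is {v}." for p, v in facts)
    return f"{context} {title} is a [MASK] film."

# Hypothetical facts standing in for retrieved Wikidata statements.
facts = [("director", "Ridley Scott"), ("publication year", "1979")]
print(naive_prompt("Alien"))
print(kg_enriched_prompt("Alien", facts))
```

    Varying how many facts are prepended is one way to probe the abstract's observation that the optimum level of enrichment differs between models.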

    New worlds in political science

    ‘Political science’ is a ‘vanguard’ field concerned with advancing generic knowledge of political processes, while a wider ‘political scholarship’ utilising eclectic approaches has more modest or varied ambitions. Political science nonetheless necessarily depends upon and is epistemologically comparable with political scholarship. I deploy Boyer's distinctions between discovery, integration, application and renewing the profession to show that these connections are closely woven. Two sets of key challenges need to be tackled if contemporary political science is to develop positively. The first is to ditch the current unworkable and restrictive comparative politics approach in favour of a genuinely global analysis framework: instead of obsessively looking at data on nation states, we need to seek data completeness on the whole (multi-level) world we have. A second cluster of challenges involves looking far more deeply into political phenomena; reaping the benefits of ‘digital-era’ developments; moving from sample methods to online census methods in organisational analysis; analysing massive transactional databases and real-time political processes (again, instead of depending on surveys); and devising new forms of ‘instrumentation’, informed by post-rational choice theoretical perspectives.