Sentiment analysis of corona-musicking online reveals bifurcation of pandemic coping strategies

Author: Baglini Rebekah
Chr. Hansen Niels
Publication venue: Edinburgh University Library
Publication date: 31/08/2022

Journal Hosting Service | The University of Edinburgh

DANSK and DaCy 2.6.0: Domain Generalization of Danish Named Entity Recognition

Author: Baglini Rebekah
Enevoldsen Kenneth
Jessen Emil Trenckner
Publication date: 28/02/2024

Named entity recognition is one of the cornerstones of Danish NLP, essential for language technology applications within both industry and research. However, Danish NER is inhibited by a lack of available datasets. As a consequence, no current models are capable of fine-grained named entity recognition, nor have they been evaluated for potential generalizability issues across datasets and domains. To alleviate these limitations, this paper introduces: 1) DANSK: a named entity dataset providing for high-granularity tagging as well as within-domain evaluation of models across a diverse set of domains; 2) DaCy 2.6.0 that includes three generalizable models with fine-grained annotation; and 3) an evaluation of current state-of-the-art models' ability to generalize across domains. The evaluation of existing and new models revealed notable performance discrepancies across domains, which should be addressed within the field. Shortcomings of the annotation quality of the dataset and its impact on model training and evaluation are also discussed. Despite these limitations, we advocate for the use of the new dataset DANSK alongside further work on the generalizability within Danish NER

arXiv.org e-Print Archive

Direct Causation: A New Approach to an Old Question

Author: Baglini Rebekah
Siegal Elitzur A. Bar-Asher
Publication venue: ScholarlyCommons
Publication date: 01/10/2020

Causative constructions come in lexical and periphrastic variants, exemplified in English by Sam killed Lee and Sam caused Lee to die. While use of the former, the lexical causative, entails the truth of the latter, an entailment in the other direction does not hold. The source of this asymmetry is commonly ascribed to the lexical causative having an additional prerequisite of “direct causation”, such that the causative relation holds between a contiguous cause and effect (Fodor 1970, Katz 1970). However, this explanation encounters both empirical and theoretical problems (Nelleman & van der Koot 2012). To explain the source of the directness inferences (as well as other longstanding puzzles), we propose a formal analysis based on the framework of Structural Equation Models (SEMs) (Pearl 2000) which provides the necessary background for licensing causal inferences. Specifically, we provide a formalization of a 'sufficient set of conditions' within a model and demonstrate its role in the selectional parameters of causative descriptions. We argue that “causal sufficiency” is not a property of singular conditions, but rather sets of conditions, which are individually necessary but only sufficient when taken together (a view originally motivated in the philosophical literature by Mackie 1965). We further introduce the notion of a “completion event” of a sufficient set, which is critical to explain the particular inferential profile of lexical causatives

ScholarlyCommons@Penn

Multilingual Sentiment Normalization for Scandinavian Languages

Author: Baglini Rebekah Brita
Enevoldsen Kenneth
Hansen Lasse
Nielbo Kristoffer Laigaard
Publication venue: Aarhus University Library
Publication date: 31/12/2021

In this paper, we address the challenge of multilingual sentiment analysis using a traditional lexicon and rule-based sentiment instrument that is tailored to capture sentiment patterns in a particular language. Focusing on a case study of three closely related Scandinavian languages (Danish, Norwegian, and Swedish) and using three tailored versions of VADER, we measure the relative degree of variation in valence using the OPUS corpus. We found that scores for Swedish are systematically skewed lower than Danish for translational pairs, and that scores for Norwegian are skewed higher than both other languages. We use a neural network to optimize the fit of Norwegian and Swedish scores, respectively, to Danish as the reference (target) language

Tidsskrift.dk (Det Kongelige Bibliotek)

Speaker Attitude and Sexual Orientation Affect Phonetic Imitation

Author: Abrego-Collier Carissa
Baglini Rebekah
Grano Tommy
Martinovic Martina
Otte Charles, III
Thomas Julia
Urban Jasmin
Yu Alan
Publication venue: ScholarlyCommons
Publication date: 01/01/2011

Numerous studies have documented the phenomenon of phonetic convergence: the process by which speakers alter their productions to become more similar on some phonetic or acoustic dimension to those of their interlocutor. Though social factors have been suggested as a motivator for imitation, few studies have established a tight connection between these extralinguistic factors and a speaker’s likelihood to imitate. The present study explores the effects of perceived sexual orientation and speaker attitude toward the interlocutor on the likelihood of imitation for extended VOT. Experimental results show that the extent of phonetic convergence (and divergence) depends on the perceived sexual orientation of the talker as well as whether the speaker is positively disposed to the interlocutor

ScholarlyCommons@Penn

The Danish Gigaword Project

Author: Baglini Rebekah
Christiansen Morten H.
Ciosici Manuel R.
Dalsgaard Jacob Aarup
Fusaroli Riccardo
Henrichsen Peter Juel
Hvingelby Rasmus
Kirkedal Andreas
Kjeldsen Alex Speed
Ladefoged Claus
Nielsen Finn Årup
Petersen Malte Lau
Rystrøm Jonathan Hvithamar
Strømberg-Derczynski Leon
Varab Daniel
Publication date: 01/01/2020

Danish is a North Germanic/Scandinavian language spoken primarily in Denmark, a country with a tradition of technological and scientific innovation. However, from a technological perspective, the Danish language has received relatively little attention and, as a result, Danish language technology is hard to develop, in part due to a lack of large or broad-coverage Danish corpora. This paper describes the Danish Gigaword project, which aims to construct a freely-available one billion word corpus of Danish text that represents the breadth of the written language

arXiv.org e-Print Archive

Copenhagen University Research Information System

Online Research Database In Technology

Causal and associational language in observational health research: a systematic evaluation

Author: Aguirre Ariadne
Alsalti Taym
Alshihayb Talal
Antonietti Alberto
Arah Onyebuchi
Au Eric
Axfors Cathrine
Baglini Rebekah
Booman Anna
Calvache Jose
Chatton Arthur
Dabravolskaj Julia
Do Stephanie
Dufour Mi-Suk
Dunleavy Daniel
Evans Thomas
Fox Matthew
Haber Noah
Hoopsick Rachel
Howcutt Sarah
Judd Nicholas
Kelson Mark
Khalatbari-Soltani Saman
Khan Palwasha
Lam Sze
Leyrat Clémence
McLinden Taylor
Meyerowitz-Katz Gideon
Murray Eleanor
O'Donoghue Ashley
Odu Nnaemeka
Parra Camila
Peña Sebastián
Pilleron Sophie
Riederer Emily
Rodriguez-Molina Daloha
Rohrer Julia
Salvia Meg
Schmid Ian
Schoenegger Philipp
Seiler Jessie
Simmons Alison
Steriu Andreea
Stuart Elizabeth
Suresh Shashank
Takashima Mari
Tennant Peter
Twardowski Sarah
Wieten Sarah
Publication venue: Bloomberg School of Public Health - Oxford University Press
Publication date: 04/08/2022

We estimated the degree to which language used in the high profile medical/public health/epidemiology literature implied causality using language linking exposures to outcomes and action recommendations; examined disconnects between language and recommendations; identified the most common linking phrases; and estimated how strongly linking phrases imply causality. We searched and screened for 1,170 articles from 18 high-profile journals (65 per journal) published from 2010-2019. Based on written framing and systematic guidance, three reviewers rated the degree of causality implied in abstracts and full text for exposure/outcome linking language and action recommendations. Reviewers rated the causal implication of exposure/outcome linking language as None (no causal implication) in 13.8%, Weak 34.2%, Moderate 33.2%, and Strong 18.7% of abstracts. The implied causality of action recommendations was higher than the implied causality of linking sentences for 44.5% or commensurate for 40.3% of articles. The most common linking word in abstracts was “associate” (45.7%). Reviewers’ ratings of linking word roots were highly heterogeneous; over half of reviewers rated “association” as having at least some causal implication. This research undercuts the assumption that avoiding “causal” words leads to clarity of interpretation in medical research

Greenwich Academic Literature Archive

LSHTM Research Online

eScholarship - University of California

University of St. Andrews - Pure

St Andrews Research Repository

The Middle Construction in Mandarin Chinese

Author: Baglini Rebekah
Publication date: 01/01/2007

The middle is an unaccusative construction which expresses a modal generalization over events (Keyser and Roeper 1984). Although the middle is not homogenous cross-linguistically (Ting 2006), manifestations of the middle have been observed in most Indo-European languages. In this thesis, I will develop criteria for middles based on cross-linguistic generalizations and argue for the existence of a middle construction in Chinese. Chinese has a class of so-called 'notional passives,' unaccusative sentences which display active morphology but receive passive interpretation. I will provide evidence that the notional passive is distinct both structurally and semantically from the canonical Chinese passive and demonstrate the inadequacy of the topic-comment account of such constructions proposed by Li and Thompson (1981). My account of the middle will crucially define it as a resultative form in Chinese, appearing exclusively with Resultative Verb Compounds (RVCs). I will adopt Cheng and Huang's (1994) classification of RVCs into four verbal subcategories (unergative, transitive, ergative, and causative) and consider the syntactic and semantic properties of the resultative middle based on the argument structure of its component predicates. Using data, I will analyze whether these Chinese middle verbs pattern in a predictable, cross-linguistically consistent way, considering syntactic distribution, aspectual composition, and semantic constraints on middle formation

TriCollege Libraries Institutional Repository