Separating Actor-View from Speaker-View Opinion Expressions using Linguistic Features
We examine different features and classifiers for the categorization of opinion words into actor and speaker view. To our knowledge, this is the first comprehensive work to address sentiment views on the word level, taking into consideration opinion verbs, nouns and adjectives. We consider many high-level features requiring only a small amount of labeled training data. A detailed feature analysis produces linguistic insights into the nature of sentiment views. We also examine how far global constraints between different opinion words help to increase classification performance. Finally, we show that our (prior) word-level annotation correlates with contextual sentiment views.
SRL4ORL: Improving Opinion Role Labeling using Multi-task Learning with Semantic Role Labeling
For over a decade, machine learning has been used to extract opinion-holder-target structures from text to answer the question "Who expressed what kind of sentiment towards what?". Recent neural approaches do not outperform the state-of-the-art feature-based models for Opinion Role Labeling (ORL). We suspect this is due to the scarcity of labeled training data and address this issue using different multi-task learning (MTL) techniques with a related task which has substantially more data, i.e. Semantic Role Labeling (SRL). We show that two MTL models improve significantly over the single-task model for labeling of both holders and targets, on the development and the test sets. We found that the vanilla MTL model, which makes predictions using only shared ORL and SRL features, performs the best. With deeper analysis we determine what works and what might be done to make further improvements for ORL. Comment: Published in NAACL 201
Linguistic Threat Assessment: Understanding Targeted Violence through Computational Linguistics
Language alluding to possible violence is widespread online, and security professionals are increasingly faced with the issue of understanding and mitigating this phenomenon. The volume of extremist and violent online data presents a workload that is unmanageable for traditional, manual threat assessment. Computational linguistics may be of particular relevance to understanding threats of grievance-fuelled targeted violence on a large scale. This thesis seeks to advance knowledge on the possibilities and pitfalls of threat assessment through automated linguistic analysis. Based on in-depth interviews with expert threat assessment practitioners, three areas of language are identified which can be leveraged for automation of threat assessment, namely, linguistic content, style, and trajectories. Implementations of each area are demonstrated in three subsequent quantitative chapters. First, linguistic content is utilised to develop the Grievance Dictionary, a psycholinguistic dictionary aimed at measuring concepts related to grievance-fuelled violence in text. Thereafter, linguistic content is supplemented with measures of linguistic style in order to examine the feasibility of author profiling (determining gender, age, and personality) in abusive texts. Lastly, linguistic trajectories are measured over time in order to assess the effect of an external event on an extremist movement. Collectively, the chapters in this thesis demonstrate that linguistic automation of threat assessment is indeed possible. The concluding chapter describes the limitations of the proposed approaches and illustrates where future potential lies to improve automated linguistic threat assessment. Ideally, developers of computational implementations for threat assessment strive for explainability and transparency. 
Furthermore, it is argued that computational linguistics holds particular promise for large-scale measurement of grievance-fuelled language, but is perhaps less suited to prediction of actual violent behaviour. Lastly, researchers and practitioners involved in threat assessment are urged to collaboratively and critically evaluate novel computational tools which may emerge in the future.