Is writing style predictive of scientific fraud?

Author: Braud Chloé, Søgaard Anders
Publication date: 01/01/2017

The problem of detecting scientific fraud using machine learning was recently introduced, with initial, positive results from a model taking into account various general indicators. The results seem to suggest that writing style is predictive of scientific fraud. We revisit these initial experiments, and show that the leave-one-out testing procedure they used likely leads to a slight over-estimate of the predictability, but also that simple models can outperform their proposed model by some margin. We go on to explore more abstract linguistic features, such as linguistic complexity and discourse structure, only to obtain negative results. Upon analyzing our models, we do see some interesting patterns, though: scientific fraud, for example, contains less comparison, as well as different types of hedging and ways of presenting logical reasoning. Comment: To appear in the Proceedings of the Workshop on Stylistic Variation 2017 (EMNLP), 6 pages

arXiv.org e-Print Archive

Crossref

Copenhagen University Research Information System

Detection of Abusive Language from Tweets in Social Networks

Author: Ms. Mohini S. Dadhe, Ms. Pranali S. Masidkar, Ms. Vishanka Vaidya, Prof. Priyanka A. Jalan
Publication venue: 'Auricle Technologies, Pvt., Ltd.'
Publication date: 31/03/2018

Detection of abusive language in user-generated online content has become an issue of increasing importance in recent years. Most current commercial methods make use of blacklists and regular expressions; however, these measures fall short when contending with more subtle, less ham-fisted examples of hate speech. In this work, we develop a machine-learning-based method to detect hate speech in online user comments from two domains, which outperforms a state-of-the-art deep learning approach. We also develop a corpus of user comments annotated for abusive language, the first of its kind. Finally, we use our detection tool to analyze abusive language over time and in different settings to further enhance our knowledge of this behavior.

International Journal on Recent and Innovation Trends in Computing and Communication

Discourse Structures and Language Technologies

Author: Webber Bonnie
Publication date: 09/05/2011

Proceedings of the 18th Nordic Conference of Computational Linguistics NODALIDA 2011. Editors: Bolette Sandford Pedersen, Gunta Nešpore and Inguna Skadiņa. NEALT Proceedings Series, Vol. 11 (2011), 12-16. © 2011 The editors and contributors. Published by Northern European Association for Language Technology (NEALT) http://omilia.uio.no/nealt . Electronically published at Tartu University Library (Estonia) http://hdl.handle.net/10062/16955

DSpace at Tartu University Library

GumDrop at the DISRPT2019 Shared Task: A Model Stacking Approach to Discourse Unit Segmentation and Connective Detection

Author: Gong Mackenzie, Liu Yan, Liu Yang, Peng Siyao, Yu Yue, Zeldes Amir, Zhu Yilun
Publication venue: 'Association for Computational Linguistics (ACL)'
Publication date: 01/01/2019

In this paper we present GumDrop, Georgetown University's entry at the DISRPT 2019 Shared Task on automatic discourse unit segmentation and connective detection. Our approach relies on model stacking, creating a heterogeneous ensemble of classifiers, which feed into a metalearner for each final task. The system encompasses three trainable component stacks: one for sentence splitting, one for discourse unit segmentation and one for connective detection. The flexibility of each ensemble allows the system to generalize well to datasets of different sizes and with varying levels of homogeneity. Comment: Proceedings of Discourse Relation Parsing and Treebanking (DISRPT2019)

arXiv.org e-Print Archive

Crossref

Discovery of Ambiguous and Unambiguous Discourse Connectives via Annotation Projection

Author: Versley Yannick
Publication date: 30/11/2010

Proceedings of the Workshop on Annotation and Exploitation of Parallel Corpora AEPC 2010. Editors: Lars Ahrenberg, Jörg Tiedemann and Martin Volk. NEALT Proceedings Series, Vol. 10 (2010), 83-92. © 2010 The editors and contributors. Published by Northern European Association for Language Technology (NEALT) http://omilia.uio.no/nealt . Electronically published at Tartu University Library (Estonia) http://hdl.handle.net/10062/15893

DSpace at Tartu University Library

HLT-FBK: a Complete Temporal Processing System for QA TempEval

Author: Minard Anne-Lyse, Mirza Paramita
Publication date: 01/01/2015

The HLT-FBK system is a suite of SVM-based classification models for extracting time expressions, events and temporal relations, each with a set of features obtained with the NewsReader NLP pipeline. HLT-FBK’s best system runs ranked 1st in all three domains, with a recall of 0.30 over all domains. Our attempts to increase recall, by considering all SRL predicates as events and by using event co-reference information when extracting temporal links, result in significant improvements.

Crossref

Archivio della ricerca - Fondazione Bruno Kessler