Is writing style predictive of scientific fraud?
The problem of detecting scientific fraud using machine learning was recently
introduced, with initial, positive results from a model taking into account
various general indicators. The results seem to suggest that writing style is
predictive of scientific fraud. We revisit these initial experiments, and show
that the leave-one-out testing procedure they used likely leads to a slight
over-estimate of the predictability, but also that simple models can outperform
their proposed model by some margin. We go on to explore more abstract
linguistic features, such as linguistic complexity and discourse structure,
only to obtain negative results. Upon analyzing our models, we do see some
interesting patterns, though: Scientific fraud, for example, contains less
comparison, as well as different types of hedging and ways of presenting
logical reasoning.
Comment: To appear in the Proceedings of the Workshop on Stylistic Variation
2017 (EMNLP), 6 page
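The leave-one-out testing procedure mentioned in the abstract above can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the nearest-centroid classifier, the toy two-dimensional features, and the function names are hypothetical and are not the models or data used in the paper.

```python
from typing import Dict, List

Vector = List[float]


def fit_centroids(X: List[Vector], y: List[int]) -> Dict[int, Vector]:
    """Return the mean feature vector of each class (toy stand-in model)."""
    sums: Dict[int, Vector] = {}
    counts: Dict[int, int] = {}
    for features, label in zip(X, y):
        acc = sums.setdefault(label, [0.0] * len(features))
        for i, value in enumerate(features):
            acc[i] += value
        counts[label] = counts.get(label, 0) + 1
    return {label: [v / counts[label] for v in acc] for label, acc in sums.items()}


def predict_centroid(centroids: Dict[int, Vector], features: Vector) -> int:
    """Predict the class whose centroid is closest in squared Euclidean distance."""
    def sq_dist(centroid: Vector) -> float:
        return sum((a - b) ** 2 for a, b in zip(centroid, features))
    return min(centroids, key=lambda label: sq_dist(centroids[label]))


def leave_one_out_accuracy(X: List[Vector], y: List[int]) -> float:
    """Hold out each example once, train on the rest, and score the held-out one."""
    correct = 0
    for i in range(len(X)):
        train_X = X[:i] + X[i + 1:]
        train_y = y[:i] + y[i + 1:]
        model = fit_centroids(train_X, train_y)
        if predict_centroid(model, X[i]) == y[i]:
            correct += 1
    return correct / len(X)


# Hypothetical toy data: two well-separated classes.
X = [[0.0, 0.1], [0.2, 0.0], [0.1, 0.3], [1.0, 0.9], [0.9, 1.1], [1.2, 1.0]]
y = [0, 0, 0, 1, 1, 1]
accuracy = leave_one_out_accuracy(X, y)
```

Because each model is trained on all but one example, the procedure needs as many training runs as there are examples; this sketch shows only the mechanics and does not reproduce the over-estimate the authors discuss.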
Implicit Discourse Relation Classification via Multi-Task Neural Networks
Without discourse connectives, classifying implicit discourse relations is a
challenging task and a bottleneck for building a practical discourse parser.
Previous research usually makes use of one kind of discourse framework such as
PDTB or RST to improve the classification performance on discourse relations.
However, multiple corpora exist under different discourse annotation
frameworks, and they have internal connections. To exploit the combination of
different discourse corpora, we design related discourse classification tasks
specific to a corpus, and propose a novel Convolutional Neural Network embedded
multi-task learning system to synthesize these tasks by learning both unique
and shared representations for each task. The experimental results on the PDTB
implicit discourse relation classification task demonstrate that our model
achieves significant gains over baseline systems.
Comment: This is the pre-print version of a paper accepted by AAAI-1
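The idea of learning both unique and shared representations for each task can be sketched as a forward pass in which every task reads one shared layer plus its own private layer. This is only an illustrative assumption: the paper describes a Convolutional Neural Network, whereas the sketch below swaps in plain linear layers with hand-picked weights and no training, and the task names `pdtb` and `rst` and the class `MultiTaskModel` are hypothetical.

```python
from typing import Dict, List

Matrix = List[List[float]]


def linear(weights: Matrix, x: List[float]) -> List[float]:
    """Multiply a weight matrix (list of rows) by an input vector."""
    return [sum(w * v for w, v in zip(row, x)) for row in weights]


def relu(values: List[float]) -> List[float]:
    """Element-wise rectified linear unit."""
    return [max(0.0, v) for v in values]


class MultiTaskModel:
    """One shared layer for all tasks, plus a private layer and output head per task."""

    def __init__(self, shared: Matrix, private: Dict[str, Matrix],
                 heads: Dict[str, Matrix]) -> None:
        self.shared = shared
        self.private = private
        self.heads = heads

    def scores(self, task: str, x: List[float]) -> List[float]:
        shared_repr = relu(linear(self.shared, x))          # used by every task
        private_repr = relu(linear(self.private[task], x))  # used by this task only
        # The task head sees the concatenation of both representations.
        return linear(self.heads[task], shared_repr + private_repr)

    def predict(self, task: str, x: List[float]) -> int:
        s = self.scores(task, x)
        return max(range(len(s)), key=s.__getitem__)


# Hypothetical hand-picked weights, for illustration only.
model = MultiTaskModel(
    shared=[[1.0, 0.0]],
    private={"pdtb": [[0.0, 1.0]], "rst": [[0.0, -1.0]]},
    heads={"pdtb": [[1.0, -1.0], [-1.0, 1.0]], "rst": [[1.0, 1.0], [0.0, 0.5]]},
)
pdtb_scores = model.scores("pdtb", [2.0, 1.0])
rst_scores = model.scores("rst", [2.0, 1.0])
```

In a trained system of this shape, gradients from every task would update the shared weights while each private layer and head would be updated only by its own task; the sketch leaves training out entirely.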
Adversarial Connective-exploiting Networks for Implicit Discourse Relation Classification
Implicit discourse relation classification is highly challenging due to the
lack of connectives as strong linguistic cues, which motivates the use of
annotated implicit connectives to improve the recognition. We propose a feature
imitation framework in which an implicit relation network is driven to learn
from another neural network with access to connectives, and thus encouraged to
extract similarly salient features for accurate classification. We develop an
adversarial model to enable an adaptive imitation scheme through competition
between the implicit network and a rival feature discriminator. Our method
effectively transfers discriminability of connectives to the implicit features,
and achieves state-of-the-art performance on the PDTB benchmark.
Comment: To appear in ACL201
Differences Over Discourse Structure Differences: A Reply to Urquhart and Urquhart
Purpose – In this paper we respond to Urquhart and Urquhart’s critique of our previous work entitled “Discourse structure differences in lay and professional health communication”, published in this journal in 2012 (Vol. 68 No. 6, pp. 826–851, doi: 10.1108/00220411211277064).
Design/methodology/approach – We examine Urquhart and Urquhart’s critique and provide responses to their concerns and cautionary remarks against cross-disciplinary contributions. We reiterate our central claim.
Findings – We argue that Mann and Thompson’s (1987, 1988) Rhetorical Structure Theory (RST) offers valuable insights into computer-mediated health communication and deserves further discussion of its methodological strengths and weaknesses for application in LIS.
Research limitations/implications – While we agree that some methodological limitations pointed out by Urquhart and Urquhart are valid, we take this opportunity to correct certain misunderstandings and misstatements.
Originality/value – We argue for continued use of innovative techniques borrowed from neighboring disciplines, in spite of objections from researchers accustomed to a familiar strand of literature. We encourage researchers to consider RST and other computational linguistics-based discourse analysis annotation frameworks that could provide the basis for integrated research, and eventual applications in information behaviour and information retrieval.