Identifying duplicate content using statistically improbable phrases
Motivation: Document similarity metrics such as PubMed's ‘Find related articles’ feature, which have been primarily used to identify studies with similar topics, can now also be used to detect duplicated or potentially plagiarized papers within literature reference databases. However, the CPU-intensive nature of document comparison has limited MEDLINE text similarity studies to the comparison of abstracts, which constitute only a small fraction of a publication's total text. Extending searches to include text archived by online search engines would drastically increase comparison coverage. For large-scale studies, submitting short phrases enclosed in direct quotes to search engines for exact matches would be optimal for both individual queries and programmatic interfaces. We have derived a method of analyzing statistically improbable phrases (SIPs) to assist in identifying duplicate content.
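The core idea behind SIPs can be sketched as scoring each phrase in a document by how much more frequent it is there than in a background corpus; phrases that are essentially unseen in the background are the most "improbable" and make the best exact-match search queries. The function below is a minimal illustration of that scoring scheme, not the authors' actual method; the background frequency table and parameter names are assumptions for the example.

```python
from collections import Counter

def statistically_improbable_phrases(text, background_freq, n=3, top_k=5):
    """Score each n-gram by the ratio of its in-document frequency to its
    background-corpus frequency; the highest-ratio phrases are candidate SIPs.
    `background_freq` maps phrases to relative frequencies (an assumption for
    this sketch; a real system would use large corpus counts)."""
    words = text.lower().split()
    ngrams = [" ".join(words[i:i + n]) for i in range(len(words) - n + 1)]
    counts = Counter(ngrams)
    total = sum(counts.values())
    scored = []
    for phrase, c in counts.items():
        doc_p = c / total
        bg_p = background_freq.get(phrase, 1e-9)  # unseen => highly improbable
        scored.append((doc_p / bg_p, phrase))
    return [phrase for _, phrase in sorted(scored, reverse=True)[:top_k]]
```

A phrase common in the background corpus (low ratio) is filtered out, while phrases absent from it rise to the top and could then be quoted verbatim in a search-engine query.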
An IR-based Approach Utilising Query Expansion for Plagiarism Detection in MEDLINE
The identification of duplicated and plagiarised passages of text has become an increasingly active area of research. In this paper we investigate methods for plagiarism detection that aim to identify potential sources of plagiarism from MEDLINE, particularly when the original text has been modified through the replacement of words or phrases. A scalable approach based on Information Retrieval is used to perform candidate document selection - the identification of a subset of potential source documents given a suspicious text - from MEDLINE. Query expansion is performed using the UMLS Metathesaurus to deal with situations in which original documents are obfuscated. Various approaches to Word Sense Disambiguation are investigated to deal with cases where there are multiple Concept Unique Identifiers (CUIs) for a given term. Results using the proposed IR-based approach outperform a state-of-the-art baseline based on Kullback-Leibler Distance.
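Candidate document selection with query expansion can be illustrated with a toy retriever: expand the suspicious text's terms through a synonym map (standing in here for UMLS Metathesaurus lookups, which in reality involve CUIs and disambiguation), then rank corpus documents by IDF-weighted term overlap. The synonym map and scoring details below are assumptions for the sketch, not the paper's implementation.

```python
import math

# Hypothetical synonym map standing in for UMLS Metathesaurus lookups.
SYNONYMS = {"kidney": ["renal"], "tumor": ["neoplasm"]}

def expand(tokens):
    """Add known synonyms so obfuscated sources (word substitutions)
    can still be matched."""
    expanded = list(tokens)
    for t in tokens:
        expanded.extend(SYNONYMS.get(t, []))
    return expanded

def candidate_documents(suspicious, corpus, top_k=2):
    """Return indices of the top_k corpus documents ranked by
    IDF-weighted overlap with the expanded query terms."""
    query = set(expand(suspicious.lower().split()))
    docs = [set(d.lower().split()) for d in corpus]
    n = len(docs)

    def idf(term):
        df = sum(term in d for d in docs)
        return math.log((n + 1) / (df + 1)) + 1

    scores = [sum(idf(t) for t in query if t in d) for d in docs]
    return sorted(range(n), key=lambda i: -scores[i])[:top_k]
```

Because "kidney" expands to "renal" and "tumor" to "neoplasm", documents that paraphrase the suspicious text with substituted words can still surface as candidates for closer comparison.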
Philosophy’s gender gap and argumentative arena: an empirical study
While the empirical evidence pointing to a gender gap in professional, academic philosophy in the English-speaking world is widely accepted, explanations of this gap are less so. In this paper, we aim to make a modest contribution to the literature on the gender gap in academic philosophy by taking a quantitative, corpus-based empirical approach. Since some philosophers have suggested that it may be the argumentative, “logic-chopping,” and “paradox-mongering” nature of academic philosophy that explains the underrepresentation of women in the discipline, our research questions are the following: Do men and women philosophers make different types of arguments in their published works? If so, which ones and with what frequency? Using data mining and text analysis methods, we study a large corpus of philosophical texts mined from the JSTOR database in order to answer these questions empirically. Using indicator words to classify arguments by type, we search through our corpus to find patterns of argumentation. Overall, the results of our empirical study suggest that women philosophers make deductive, inductive, and abductive arguments in their published works just as much as men philosophers do, with no statistically significant differences in the proportions of those arguments relative to each philosopher’s body of work.
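The indicator-word method described above can be sketched as counting type-specific cue words in a text. The word lists below are illustrative assumptions, not the study's actual indicator lists, and a real pipeline would also normalize counts by corpus size before testing for significance.

```python
# Hypothetical indicator lists; the study's actual lists are not given here.
INDICATORS = {
    "deductive": ["therefore", "necessarily", "it follows that"],
    "inductive": ["probably", "usually", "likely"],
    "abductive": ["best explanation", "plausibly"],
}

def classify_arguments(text):
    """Count occurrences of each argument type's indicator words,
    giving a rough per-text profile of argumentation patterns."""
    text = text.lower()
    return {kind: sum(text.count(word) for word in words)
            for kind, words in INDICATORS.items()}
```

Aggregating such profiles per author, and then comparing the proportions across groups with a standard significance test, is the shape of the comparison the abstract describes.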
Word Order in Epigraphic Gǝ’ǝz
The paper offers the results of analysis of word order throughout the epigraphic corpus of Gǝʿǝz. This evidence is mostly in agreement with the data from Classical Gǝʿǝz and confirms that early Gǝʿǝz represents the classical Semitic type of a right-branching language: objects and prepositional phrases mostly follow the verbs, and relative clauses and genitive complements usually follow the head nouns. At the same time, some differences between the syntax of Classical Gǝʿǝz and Epigraphic Gǝʿǝz have been registered, notably in the behaviour of numerals.
How Censorship in China Allows Government Criticism but Silences Collective Expression
We offer the first large scale, multiple source analysis of the outcome of what may be the most extensive effort to selectively censor human expression ever implemented. To do this, we have devised a system to locate, download, and analyze the content of millions of social media posts originating from nearly 1,400 different social media services all over China before the Chinese government is able to find, evaluate, and censor (i.e., remove from the Internet) the large subset they deem objectionable. Using modern computer-assisted text analytic methods that we adapt to and validate in the Chinese language, we compare the substantive content of posts censored to those not censored over time in each of 85 topic areas. Contrary to previous understandings, posts with negative, even vitriolic, criticism of the state, its leaders, and its policies are not more likely to be censored. Instead, we show that the censorship program is aimed at curtailing collective action by silencing comments that represent, reinforce, or spur social mobilization, regardless of content. Censorship is oriented toward attempting to forestall collective activities that are occurring now or may occur in the future, and, as such, seems to clearly expose government intent.