Search CORE

10,179 research outputs found

An application of distributional semantics for the analysis of the Holy Quran

Author: Benotto Giulia
Giovannetti Emiliano
Nahli Ouafae
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2016

In this contribution we illustrate the methodology and the results of an experiment we conducted by applying Distributional Semantics Models to the analysis of the Holy Quran. Our aim was to gather information on the potential differences in meanings that the same words might take on when used in Modern Standard Arabic w.r.t. their usage in the Quran. To do so we used the Penn Arabic Treebank as a contrastive corpu

Archivio della ricerca- Università di Roma La Sapienza

Enriching Existing Test Collections with OXPath

Author: P Mayr
R Berendsen
T Beckers
T Furche
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 21/06/2017

Extending TREC-style test collections by incorporating external resources is a time-consuming and challenging task. Making use of freely available web data requires technical skills to work with APIs or to create a web scraping program specifically tailored to the task at hand. We present a lightweight alternative that employs the web data extraction language OXPath to harvest data to be added to an existing test collection from web resources. We demonstrate this by creating an extended version of GIRT4 called GIRT4-XT with additional metadata fields harvested via OXPath from the social sciences portal Sowiport. This allows the re-use of this collection for other evaluation purposes like bibliometrics-enhanced retrieval. The demonstrated method can be applied to a variety of similar scenarios and is not limited to extending existing collections but can also be used to create completely new ones with little effort. Comment: Experimental IR Meets Multilinguality, Multimodality, and Interaction - 8th International Conference of the CLEF Association, CLEF 2017, Dublin, Ireland, September 11-14, 201

arXiv.org e-Print Archive

Crossref

A Corpus of Sentence-level Revisions in Academic Writing: A Step towards Understanding Statement Strength in Communication

Author: Lee Lillian
Tan Chenhao
Publication date: 01/01/2014

The strength with which a statement is made can have a significant impact on the audience. For example, international relations can be strained by how the media in one country describes an event in another; and papers can be rejected because they overstate or understate their findings. It is thus important to understand the effects of statement strength. A first step is to be able to distinguish between strong and weak statements. However, even this problem is understudied, partly due to a lack of data. Since strength is inherently relative, revisions of texts that make claims are a natural source of data on strength differences. In this paper, we introduce a corpus of sentence-level revisions from academic writing. We also describe insights gained from our annotation efforts for this task. Comment: 6 pages, to appear in Proceedings of ACL 2014 (short paper

arXiv.org e-Print Archive

CiteSeerX

Crossref