How Do Families Try to Survive Yemen’s Brutal War? Following a Spiral of Research to Unexpected Conclusions
With the support of a Research Experience and Apprenticeship Program (REAP) grant, Rory gained a greater understanding of the conflict in Yemen as well as of the nature of political science research
Grounding event references in news
Events are frequently discussed in natural language, and their accurate identification is central to language understanding. Yet they are diverse and complex in ontology and reference; computational processing hence proves challenging. News provides a shared basis for communication by reporting events. We perform several studies into news event reference. One annotation study characterises each news report in terms of its update and topic events, but finds that topic is better considered through explicit references to background events. In this context, we propose the event linking task which—analogous to named entity linking or disambiguation—models the grounding of references to notable events. It defines the disambiguation of an event reference as a link to the archival article that first reports it. When two references are linked to the same article, they need not be references to the same event. Event linking hopes to provide an intuitive approximation to coreference, erring on the side of over-generation in contrast with the literature. The task is also distinguished in considering event references from multiple perspectives over time. We diagnostically evaluate the task by first linking references to past, newsworthy events in news and opinion pieces to an archive of the Sydney Morning Herald. The intensive annotation results in only a small corpus of 229 distinct links. However, we observe that a number of hyperlinks targeting online news correspond to event links. We thus acquire two large corpora of hyperlinks at very low cost. From these we learn weights for temporal and term overlap features in a retrieval system. These noisy data lead to significant performance gains over a bag-of-words baseline. While our initial system can accurately predict many event links, most will require deep linguistic processing for their disambiguation.
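The abstract describes ranking archive articles by a weighted combination of term overlap and temporal features. A minimal sketch of that idea follows; the feature functions, weights, and article names are illustrative assumptions, not the authors' actual system (which learns its weights from hyperlink data).

```python
from math import exp

def term_overlap(query_tokens, doc_tokens):
    """Jaccard overlap between token sets (a simple term-overlap feature)."""
    q, d = set(query_tokens), set(doc_tokens)
    return len(q & d) / len(q | d) if q | d else 0.0

def temporal_proximity(ref_day, doc_day, scale=30.0):
    """Decay with the gap (in days) between the referring text and the article."""
    return exp(-abs(ref_day - doc_day) / scale)

def link_score(query_tokens, doc_tokens, ref_day, doc_day,
               w_term=0.7, w_time=0.3):
    """Linear feature combination; in the paper the weights are learned."""
    return (w_term * term_overlap(query_tokens, doc_tokens)
            + w_time * temporal_proximity(ref_day, doc_day))

# Rank hypothetical candidate archive articles for one event reference.
candidates = [
    ("article_a", "fire destroys warehouse in sydney".split(), 0),
    ("article_b", "council votes on new warehouse zoning".split(), 200),
]
reference = "the warehouse fire".split()
ranked = sorted(candidates,
                key=lambda c: link_score(reference, c[1], 10, c[2]),
                reverse=True)
```

The top-ranked article would then be the predicted grounding of the event reference, matching the task's definition of a link as the archival article that first reports the event.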
NLPContributions: An Annotation Scheme for Machine Reading of Scholarly Contributions in Natural Language Processing Literature
We describe an annotation initiative to capture the scholarly contributions
in natural language processing (NLP) articles, particularly, for the articles
that discuss machine learning (ML) approaches for various information
extraction tasks. We develop the annotation task based on a pilot annotation
exercise on 50 NLP-ML scholarly articles presenting contributions to five
information extraction tasks: 1. machine translation, 2. named entity
recognition, 3. question answering, 4. relation classification, and 5. text
classification. In this article, we describe the outcomes of this pilot
annotation phase. Through the exercise we have obtained an annotation
methodology and found ten core information units that reflect the contribution
of the NLP-ML scholarly investigations. The resulting annotation scheme we
developed based on these information units is called NLPContributions.
The overarching goal of our endeavor is four-fold: 1) to find a systematic
set of patterns of subject-predicate-object statements for the semantic
structuring of scholarly contributions that are more or less generically
applicable for NLP-ML research articles; 2) to apply the discovered patterns in
the creation of a larger annotated dataset for training machine readers of
research contributions; 3) to ingest the dataset into the Open Research
Knowledge Graph (ORKG) infrastructure as a showcase for creating user-friendly
state-of-the-art overviews; 4) to integrate the machine readers into the ORKG
to assist users in the manual curation of their respective article
contributions. We envision that the NLPContributions methodology engenders a
wider discussion on the topic toward its further refinement and development.
Our pilot annotated dataset of 50 NLP-ML scholarly articles according to the
NLPContributions scheme is openly available to the research community at
https://doi.org/10.25835/0019761.
Comment: In Proceedings of the 1st Workshop on Extraction and Evaluation of
Knowledge Entities from Scientific Documents (EEKE 2020) co-located with the
ACM/IEEE Joint Conference on Digital Libraries in 2020 (JCDL 2020), Virtual
Event, China, August 1. http://ceur-ws.org/Vol-2658
Contextual Semantics for Radicalisation Detection on Twitter
Much research aims to detect online radical content mainly using radicalisation glossaries, i.e., by looking for terms and expressions associated with religion, war, offensive language, etc. However, such crude methods are highly inaccurate towards content that uses radicalisation terminology to simply report on current events, to share harmless religious rhetoric, or even to counter extremism.
Language is complex and the context in which particular terms are used should not be disregarded. In this paper, we propose an approach for building a representation of the semantic context of the terms that are linked to radicalised rhetoric. We use this approach to analyse over 114K tweets that contain radicalisation terms (around 17K posted by pro-ISIS users, and 97K posted by “general” Twitter users).
We report on how the contextual information differs for the same radicalisation terms in the two datasets, which indicates that contextual semantics can help to better discriminate radical content from content that only uses radical terminology. The classifiers we built to test this hypothesis outperform those that disregard contextual information.
Informal learning evidence in online communities of mobile device enthusiasts
This chapter describes a study that investigated the informal learning practices of enthusiastic mobile device owners. Informal learning is far more widespread than is often realized. Livingston (2000) pointed out that Canadian adults spend an average of fifteen hours per week on informal learning activities, more than they spend on formal learning activities. The motivation for these learning efforts generally comes from the individual, not from some outside force such as a school, university, or workplace. Therefore, in the absence of an externally imposed learning framework, informal learners will use whatever techniques, resources, and tools best suit their learning needs and personal preferences. As ownership of mobile technologies becomes increasingly widespread in the western world, it is likely that learners who have access to this technology will use it to support their informal learning efforts. This chapter presents the findings of a study into the various and innovative ways in which PDA and Smartphone users exploit mobile device functionality in their informal learning activities. The findings suggested that mobile device users deploy the mobile, connective, and collaborative capabilities of their devices in a variety of informal learning contexts, and in quite innovative ways. Trends emerged, such as the increasing importance of podcasting and audio and the use of built-in GPS, which may have implications for future studies. Informal learners identified learning activities that could be enhanced by the involvement of mobile technology, and developed methods and techniques that helped them achieve their learning goals.
The Content Types Dataset: a New Resource to Explore Semantic and Functional Characteristics of Texts
This paper presents a new resource, called Content Types Dataset, to promote the analysis of texts as a composition of
units with specific semantic and functional roles. By developing this dataset, we also introduce a new NLP task for the automatic classification of Content Types. The annotation scheme and the dataset are described together with two sets of classification experiments
CompRes: A Dataset for Narrative Structure in News
This paper addresses the task of automatically detecting narrative structures
in raw texts. Previous works have utilized the oral narrative theory by Labov
and Waletzky to identify various narrative elements in personal story texts.
Instead, we direct our focus to news articles, motivated by their growing
social impact as well as their role in creating and shaping public opinion.
We introduce CompRes -- the first dataset for narrative structure in news
media. We describe the process in which the dataset was constructed: first, we
designed a new narrative annotation scheme, better suited for news media, by
adapting elements from the narrative theory of Labov and Waletzky (Complication
and Resolution) and adding a new narrative element of our own (Success); then,
we used that scheme to annotate a set of 29 English news articles (containing
1,099 sentences) collected from news and partisan websites. We use the
annotated dataset to train several supervised models to identify the different
narrative elements, achieving an F1 score of up to 0.7. We conclude by
suggesting several promising directions for future work.
Comment: Accepted to the First Joint Workshop on Narrative Understanding,
Storylines, and Events, ACL 2020
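The CompRes abstract describes classifying sentences into narrative elements (Complication, Resolution, Success). A toy keyword-cue sketch of that labelling step is shown below; the cue lists and the example sentence are invented for illustration, and the paper's actual models are trained supervised classifiers, not keyword rules.

```python
# Hypothetical lexical cues per narrative element; a real system would learn
# these distinctions from the 1,099 annotated sentences.
CUES = {
    "Complication": {"crisis", "collapse", "dispute", "threat"},
    "Resolution": {"agreement", "settled", "resolved", "deal"},
    "Success": {"won", "achieved", "record", "triumph"},
}

def label_sentence(sentence):
    """Assign every narrative element whose cues appear in the sentence."""
    toks = {t.strip(".,!?") for t in sentence.lower().split()}
    return [elem for elem, cues in CUES.items() if toks & cues] or ["None"]

labels = label_sentence("The two sides finally resolved the dispute.")
```

Note that a single sentence can carry more than one narrative element, which is why the sketch returns a list rather than a single label.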