How Do Families Try to Survive Yemen’s Brutal War? Following a Spiral of Research to Unexpected Conclusions
With the support of a Research Experience and Apprenticeship Program (REAP) grant, Rory gained a greater understanding of the conflict in Yemen as well as of the nature of political science research
Grounding event references in news
Events are frequently discussed in natural language, and their accurate identification is central to language understanding. Yet they are diverse and complex in ontology and reference; computational processing hence proves challenging. News provides a shared basis for communication by reporting events. We perform several studies into news event reference. One annotation study characterises each news report in terms of its update and topic events, but finds that topic is better considered through explicit references to background events. In this context, we propose the event linking task which—analogous to named entity linking or disambiguation—models the grounding of references to notable events. It defines the disambiguation of an event reference as a link to the archival article that first reports it. When two references are linked to the same article, they need not be references to the same event. Event linking hopes to provide an intuitive approximation to coreference, erring on the side of over-generation in contrast with the literature. The task is also distinguished in considering event references from multiple perspectives over time. We diagnostically evaluate the task by first linking references to past, newsworthy events in news and opinion pieces to an archive of the Sydney Morning Herald. The intensive annotation results in only a small corpus of 229 distinct links. However, we observe that a number of hyperlinks targeting online news correspond to event links. We thus acquire two large corpora of hyperlinks at very low cost. From these we learn weights for temporal and term overlap features in a retrieval system. These noisy data lead to significant performance gains over a bag-of-words baseline. While our initial system can accurately predict many event links, most will require deep linguistic processing for their disambiguation.
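The abstract describes ranking archive articles by a weighted combination of term overlap and temporal features. A minimal sketch of that idea follows; the feature functions, weights, and article names are illustrative assumptions, not the authors' actual system (which learns its weights from hyperlink data).

```python
from math import exp

def term_overlap(query_tokens, doc_tokens):
    """Jaccard overlap between token sets (a simple term-overlap feature)."""
    q, d = set(query_tokens), set(doc_tokens)
    return len(q & d) / len(q | d) if q | d else 0.0

def temporal_proximity(ref_day, doc_day, scale=30.0):
    """Decay with the gap (in days) between the referring text and the article."""
    return exp(-abs(ref_day - doc_day) / scale)

def link_score(query_tokens, doc_tokens, ref_day, doc_day,
               w_term=0.7, w_time=0.3):
    """Linear feature combination; in the paper the weights are learned."""
    return (w_term * term_overlap(query_tokens, doc_tokens)
            + w_time * temporal_proximity(ref_day, doc_day))

# Rank hypothetical candidate archive articles for one event reference.
candidates = [
    ("article_a", "fire destroys warehouse in sydney".split(), 0),
    ("article_b", "council votes on new warehouse zoning".split(), 200),
]
reference = "the warehouse fire".split()
ranked = sorted(candidates,
                key=lambda c: link_score(reference, c[1], 10, c[2]),
                reverse=True)
```

The top-ranked article would then be the predicted grounding of the event reference, matching the task's definition of a link as the archival article that first reports the event.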
NLPContributions: An Annotation Scheme for Machine Reading of Scholarly Contributions in Natural Language Processing Literature
We describe an annotation initiative to capture the scholarly contributions
in natural language processing (NLP) articles, particularly, for the articles
that discuss machine learning (ML) approaches for various information
extraction tasks. We develop the annotation task based on a pilot annotation
exercise on 50 NLP-ML scholarly articles presenting contributions to five
information extraction tasks: 1. machine translation, 2. named entity
recognition, 3. question answering, 4. relation classification, and 5. text
classification. In this article, we describe the outcomes of this pilot
annotation phase. Through the exercise we have obtained an annotation
methodology and found ten core information units that reflect the contribution
of the NLP-ML scholarly investigations. The resulting annotation scheme we
developed based on these information units is called NLPContributions.
The overarching goal of our endeavor is four-fold: 1) to find a systematic
set of patterns of subject-predicate-object statements for the semantic
structuring of scholarly contributions that are more or less generically
applicable for NLP-ML research articles; 2) to apply the discovered patterns in
the creation of a larger annotated dataset for training machine readers of
research contributions; 3) to ingest the dataset into the Open Research
Knowledge Graph (ORKG) infrastructure as a showcase for creating user-friendly
state-of-the-art overviews; 4) to integrate the machine readers into the ORKG
to assist users in the manual curation of their respective article
contributions. We envision that the NLPContributions methodology engenders a
wider discussion on the topic toward its further refinement and development.
Our pilot annotated dataset of 50 NLP-ML scholarly articles according to the
NLPContributions scheme is openly available to the research community at
https://doi.org/10.25835/0019761.
Comment: In Proceedings of the 1st Workshop on Extraction and Evaluation of
Knowledge Entities from Scientific Documents (EEKE 2020) co-located with the
ACM/IEEE Joint Conference on Digital Libraries in 2020 (JCDL 2020), Virtual
Event, China, August 1. http://ceur-ws.org/Vol-2658
Contextual Semantics for Radicalisation Detection on Twitter
Much research aims to detect online radical content mainly using radicalisation glossaries, i.e., by looking for terms and expressions associated with religion, war, offensive language, etc. However, such crude methods are highly inaccurate towards content that uses radicalisation terminology to simply report on current events, to share harmless religious rhetoric, or even to counter extremism.
Language is complex and the context in which particular terms are used should not be disregarded. In this paper, we propose an approach for building a representation of the semantic context of the terms that are linked to radicalised rhetoric. We use this approach to analyse over 114K tweets that contain radicalisation terms (around 17K posted by pro-ISIS users, and 97K posted by “general” Twitter users).
We report on how the contextual information differs for the same radicalisation terms in the two datasets, which indicates that contextual semantics can help to better discriminate radical content from content that only uses radical terminology. The classifiers we built to test this hypothesis outperform those that disregard contextual information.
Informal learning evidence in online communities of mobile device enthusiasts
This chapter describes a study that investigated the informal learning practices of enthusiastic mobile device owners. Informal learning is far more widespread than is often realized. Livingston (2000) pointed out that Canadian adults spend an average of fifteen hours per week on informal learning activities, more than they spend on formal learning activities. The motivation for these learning efforts generally comes from the individual, not from some outside force such as a school, university, or workplace. Therefore, in the absence of an externally imposed learning framework, informal learners will use whatever techniques, resources, and tools best suit their learning needs and personal preferences. As ownership of mobile technologies becomes increasingly widespread in the western world, it is likely that learners who have access to this technology will use it to support their informal learning efforts. This chapter presents the findings of a study into the various and innovative ways in which PDA and Smartphone users exploit mobile device functionality in their informal learning activities. The findings suggested that mobile device users deploy the mobile, connective, and collaborative capabilities of their devices in a variety of informal learning contexts, and in quite innovative ways. Trends emerged, such as the increasing importance of podcasting and audio and the use of built-in GPS, which may have implications for future studies. Informal learners identified learning activities that could be enhanced by the involvement of mobile technology, and developed methods and techniques that helped them achieve their learning goals.
The Content Types Dataset: a New Resource to Explore Semantic and Functional Characteristics of Texts
This paper presents a new resource, called Content Types Dataset, to promote the analysis of texts as a composition of
units with specific semantic and functional roles. By developing this dataset, we also introduce a new NLP task for the automatic classification of Content Types. The annotation scheme and the dataset are described together with two sets of classification experiments
CompRes: A Dataset for Narrative Structure in News
This paper addresses the task of automatically detecting narrative structures
in raw texts. Previous works have utilized the oral narrative theory by Labov
and Waletzky to identify various narrative elements in personal story texts.
Instead, we direct our focus to news articles, motivated by their growing
social impact as well as their role in creating and shaping public opinion.
We introduce CompRes -- the first dataset for narrative structure in news
media. We describe the process in which the dataset was constructed: first, we
designed a new narrative annotation scheme, better suited for news media, by
adapting elements from the narrative theory of Labov and Waletzky (Complication
and Resolution) and adding a new narrative element of our own (Success); then,
we used that scheme to annotate a set of 29 English news articles (containing
1,099 sentences) collected from news and partisan websites. We use the
annotated dataset to train several supervised models to identify the different
narrative elements, achieving an F1 score of up to 0.7. We conclude by
suggesting several promising directions for future work.
Comment: Accepted to the First Joint Workshop on Narrative Understanding,
Storylines, and Events, ACL 2020
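The CompRes abstract describes classifying sentences into narrative elements (Complication, Resolution, Success). A toy keyword-cue sketch of that labelling step is shown below; the cue lists and the example sentence are invented for illustration, and the paper's actual models are trained supervised classifiers, not keyword rules.

```python
# Hypothetical lexical cues per narrative element; a real system would learn
# these distinctions from the 1,099 annotated sentences.
CUES = {
    "Complication": {"crisis", "collapse", "dispute", "threat"},
    "Resolution": {"agreement", "settled", "resolved", "deal"},
    "Success": {"won", "achieved", "record", "triumph"},
}

def label_sentence(sentence):
    """Assign every narrative element whose cues appear in the sentence."""
    toks = {t.strip(".,!?") for t in sentence.lower().split()}
    return [elem for elem, cues in CUES.items() if toks & cues] or ["None"]

labels = label_sentence("The two sides finally resolved the dispute.")
```

Note that a single sentence can carry more than one narrative element, which is why the sketch returns a list rather than a single label.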