Search CORE

6,499 research outputs found

Detecting and ordering adjectival scalemates

Author: van Miltenburg Emiel
Publication venue
Publication date: 01/01/2015
Field of study

This paper presents a pattern-based method that can be used to infer adjectival scales, such as , from a corpus. Specifically, the proposed method uses lexical patterns to automatically identify and order pairs of scalemates, followed by a filtering phase in which unrelated pairs are discarded. For the filtering phase, several different similarity measures are implemented and compared. The model presented in this paper is evaluated using the current standard, along with a novel evaluation set, and shown to be at least as good as the current state-of-the-art.Comment: Paper presented at MAPLEX 2015, February 9-10, Yamagata, Japan (http://lang.cs.tut.ac.jp/maplex2015/

arXiv.org e-Print Archive

VU Research Portal

Mining Domain-Specific Thesauri from Wikipedia: A case study

Author: Medelyan Olena
Milne David N.
Witten Ian H.
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2006
Field of study

Domain-specific thesauri are high-cost, high-maintenance, high-value knowledge structures. We show how the classic thesaurus structure of terms and links can be mined automatically from Wikipedia. In a comparison with a professional thesaurus for agriculture we find that Wikipedia contains a substantial proportion of its concepts and semantic relations; furthermore it has impressive coverage of contemporary documents in the domain. Thesauri derived using our techniques capitalize on existing public efforts and tend to reflect contemporary language usage better than their costly, painstakingly-constructed manual counterparts

CiteSeerX

Crossref

Research Commons@Waikato

Enroller: an experiment in aggregating resources

Author: Anderson J.
Publication venue: Editions Rodopi B.V.
Publication date: 01/01/2013
Field of study

This chapter describes a collaborative project between e-scientists and humanists working to create an online repository of linguistic data sets and tools. Corpora, dictionaries, and a thesaurus are brought together to enable a new method of research. It combines our most advanced knowledge in both computing and linguistic research techniques

Enlighten

The civilizing process in London’s Old Bailey

Author: DeDeo Simon
Hitchcock Tim
Klingenstein Sara
Publication venue: 'Proceedings of the National Academy of Sciences'
Publication date: 16/06/2014
Field of study

The jury trial is a critical point where the state and its citizens come together to define the limits of acceptable behavior. Here we present a large-scale quantitative analysis of trial transcripts from the Old Bailey that reveal a major transition in the nature of this defining moment. By coarse-graining the spoken word testimony into synonym sets and dividing the trials based on indictment, we demonstrate the emergence of semantically distinct violent and nonviolent trial genres. We show that although in the late 18th century the semantic content of trials for violent offenses is functionally indistinguishable from that for nonviolent ones, a long-term, secular trend drives the system toward increasingly clear distinctions between violent and nonviolent acts. We separate this process into the shifting patterns that drive it, determine the relative effects of bureaucratic change and broader cultural shifts, and identify the synonym sets most responsible for the eventual genre distinguishability. This work provides a new window onto the cultural and institutional changes that accompany the monopolization of violence by the state, described in qualitative historical analysis as the civilizing process

PubMed Central

Sussex Research Online

Retrieving with good sense

Author: Sanderson M.
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2000
Field of study

Although always present in text, word sense ambiguity only recently became regarded as a problem to information retrieval which was potentially solvable. The growth of interest in word senses resulted from new directions taken in disambiguation research. This paper first outlines this research and surveys the resulting efforts in information retrieval. Although the majority of attempts to improve retrieval effectiveness were unsuccessful, much was learnt from the research. Most notably a notion of under what circumstance disambiguation may prove of use to retrieval

CiteSeerX

White Rose Research Online

Diachronic and synchronic thesauruses

Author: Alexander Marc
Kay Christian
Publication venue: 'Oxford University Press (OUP)'
Publication date: 26/11/2015
Field of study

No abstract available

Enlighten