
    Disambiguating Nouns, Verbs, and Adjectives Using Automatically Acquired Selectional Preferences

    Selectional preferences have been used by word sense disambiguation (WSD) systems as one source of disambiguating information. We evaluate WSD using selectional preferences acquired for English adjective-noun, subject, and direct object grammatical relationships with respect to a standard test corpus. The selectional preferences are specific to verb or adjective classes, rather than individual word forms, so they can be used to disambiguate the co-occurring adjectives and verbs, rather than just the nominal argument heads. We also investigate use of the one-sense-per-discourse heuristic to propagate a sense tag for a word to other occurrences of the same word within the current document in order to increase coverage. Although the preferences perform well in comparison with other unsupervised WSD systems on the same corpus, the results show that for many applications, further knowledge sources would be required to achieve an adequate level of accuracy and coverage. In addition to quantifying performance, we analyze the results to investigate the situations in which the selectional preferences achieve the best precision and in which the one-sense-per-discourse heuristic increases performance.
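
The one-sense-per-discourse propagation mentioned in this abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function name, the input layout (a list of lemmas plus a dict of already-assigned senses), the sense labels, and the majority-vote tie-breaking are all assumptions made for illustration.

```python
from collections import Counter, defaultdict


def propagate_one_sense_per_discourse(lemmas, tagged):
    """Copy sense tags to untagged occurrences of the same lemma in one document.

    lemmas: list of lemmas for the tokens of a single document.
    tagged: dict mapping token index -> sense label for tokens that the
            disambiguator (e.g. selectional preferences) managed to tag.
    Returns a new dict that also covers untagged occurrences of any lemma
    that received at least one tag elsewhere in the document.
    """
    # Count how often each sense was assigned to each lemma.
    votes = defaultdict(Counter)
    for index, sense in tagged.items():
        votes[lemmas[index]][sense] += 1

    result = dict(tagged)
    for index, lemma in enumerate(lemmas):
        if index not in result and lemma in votes:
            # Assumption: use the most frequent sense; ties go to the
            # sense that was counted first.
            result[index] = votes[lemma].most_common(1)[0][0]
    return result


# Hypothetical document: only the first "bank" was tagged directly.
lemmas = ["bank", "river", "bank", "bank"]
tagged = {0: "bank%finance"}
extended = propagate_one_sense_per_discourse(lemmas, tagged)
```

Here coverage rises from one tagged token to three, while "river" stays untagged because no occurrence of it received a sense. Whether precision holds up under this propagation is exactly what the abstract says the paper examines.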

    Opposition theory and computational semiotics

    Opposition theory suggests that binary oppositions (e.g., high vs. low) underlie basic cognitive and linguistic processes. However, opposition theory has never been implemented in a computational cognitive-semiotics model. In this paper, we present a simple model of metaphor identification that relies on opposition theory. An algorithm instantiating the model has been tested on a data set of 100 phrases comprising adjective-noun pairs, of which approximately half represent metaphorical language-use (e.g., dark thoughts) and the rest literal language-use (e.g., dark hair). The algorithm achieved 89% accuracy in metaphor identification and illustrates the relevance of opposition theory for modelling metaphor processing.

    Carving verb classes from corpora

    In this paper, I discuss some methodological problems arising from the use of corpus data for semantic verb classification. In particular, I present a computational framework to describe the distributional properties of Italian verbs using linguistic data automatically extracted from a large corpus. This information is used to build a distribution-based classification of a set of Italian verbs. Its small scale notwithstanding, this case study will provide evidence for the complex interplay between syntactic and semantic verb features.

    Metaphor Identification in Large Texts Corpora

    Identifying metaphorical language-use (e.g., sweet child) is one of the challenges facing natural language processing. This paper describes three novel algorithms for automatic metaphor identification. The algorithms are variations of the same core algorithm. We evaluate the algorithms on two corpora of Reuters and New York Times articles. The paper presents the most comprehensive study of metaphor identification in terms of scope of metaphorical phrases and annotated corpora size. The algorithms' performance in identifying linguistic phrases as metaphorical or literal has been compared to human judgment. Overall, the algorithms outperform the state-of-the-art algorithm with 71% precision and 27% averaged improvement in prediction over the base-rate of metaphors in the corpus. United States. Intelligence Advanced Research Projects Activity (IARPA); United States. Dept. of Defense (U.S. Army Research Laboratory Contract W911NF-12-C-0021)

    Exploration and Exploitation of Victorian Science in Darwin's Reading Notebooks

    Search in an environment with an uncertain distribution of resources involves a trade-off between exploitation of past discoveries and further exploration. This extends to information foraging, where a knowledge-seeker shifts between reading in depth and studying new domains. To study this decision-making process, we examine the reading choices made by one of the most celebrated scientists of the modern era: Charles Darwin. From the full text of books listed in his chronologically organized reading journals, we generate topic models to quantify his local (text-to-text) and global (text-to-past) reading decisions using Kullback-Leibler Divergence, a cognitively validated, information-theoretic measure of relative surprise. Rather than a pattern of surprise-minimization, corresponding to a pure exploitation strategy, Darwin's behavior shifts from early exploitation to later exploration, seeking unusually high levels of cognitive surprise relative to previous eras. These shifts, detected by an unsupervised Bayesian model, correlate with major intellectual epochs of his career as identified both by qualitative scholarship and Darwin's own self-commentary. Our methods allow us to compare his consumption of texts with their publication order. We find Darwin's consumption more exploratory than the culture's production, suggesting that underneath gradual societal changes are the explorations of individual synthesis and discovery. Our quantitative methods advance the study of cognitive search through a framework for testing interactions between individual and collective behavior and between short- and long-term consumption choices. This novel application of topic modeling to characterize individual reading complements widespread studies of collective scientific behavior. Comment: Cognition pre-print, published February 2017; 22 pages, plus 17 pages supporting information, 7 pages references
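
To make the local (text-to-text) and global (text-to-past) surprise measures in this abstract concrete, here is a minimal sketch using Kullback-Leibler divergence over topic mixtures. It is an illustration under stated assumptions, not the paper's method: the direction of the divergence (current text measured against what was read before), the use of a plain average to represent the past, the smoothing constant `eps`, and all function names are hypothetical choices.

```python
import math


def kl_divergence(p, q, eps=1e-12):
    """D_KL(p || q) in bits for two discrete distributions of equal length."""
    if len(p) != len(q):
        raise ValueError("distributions must have the same length")
    total = 0.0
    for p_i, q_i in zip(p, q):
        if p_i > 0.0:
            # eps guards against division by zero when q assigns no mass.
            total += p_i * math.log2(p_i / max(q_i, eps))
    return total


def mean_distribution(dists):
    """Element-wise average of several distributions (assumed model of 'the past')."""
    n = len(dists)
    return [sum(column) / n for column in zip(*dists)]


def reading_surprise(topic_mixtures):
    """Return (local, global) surprise for each text after the first.

    topic_mixtures: list of per-text topic distributions in reading order.
    local  = divergence of the current text from the previous text.
    global = divergence of the current text from the average of all earlier texts.
    """
    results = []
    for i in range(1, len(topic_mixtures)):
        current = topic_mixtures[i]
        local = kl_divergence(current, topic_mixtures[i - 1])
        global_ = kl_divergence(current, mean_distribution(topic_mixtures[:i]))
        results.append((local, global_))
    return results


# Hypothetical two-topic mixtures for three texts in reading order.
mixtures = [[0.5, 0.5], [0.9, 0.1], [0.5, 0.5]]
surprise = reading_surprise(mixtures)
```

Under this reading, persistently low values would correspond to exploitation (staying close to familiar topics) and high values to exploration.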

    D6.1: Technologies and Tools for Lexical Acquisition

    This report describes the technologies and tools to be used for Lexical Acquisition in PANACEA. It includes descriptions of existing technologies and tools which can be built on and improved within PANACEA, as well as of new technologies and tools to be developed and integrated in the PANACEA platform. The report also specifies the Lexical Resources to be produced. Four main areas of lexical acquisition are included: Subcategorization frames (SCFs), Selectional Preferences (SPs), Lexical-semantic Classes (LCs), for both nouns and verbs, and Multi-Word Expressions (MWEs).

    Segmenting customers based on their unconscious needs.

    This paper contributes to the literature by proposing a new methodological approach for understanding customers’ unconscious needs. This approach combines, for the first time, Unconscious Thought Theory (UTT) with Choice-Based Conjoint (CBC) to identify needs that segment members are either unaware of, or unable/unwilling to articulate. This methodological approach identified an additional market segment, distinct from those identified by a traditional customer segmentation approach based on customer need articulation. Ergo, understanding unconscious needs may provide additional customer insight and aid marketeers in developing new propositions and gaining market share. This study, therefore, makes a methodological contribution to the literature. The study involves the segmentation of buyers of snack bars (i.e. cereal bars), based on subjective nutritional information importance (as revealed on packaging). Separate samples of buyers were recruited: one group, as a control sample, completed a traditional CBC exercise; a second group completed the same CBC exercise but was asked to complete a UTT working memory distraction-task between viewing each choice-task and responding to it. This allowed a comparison of the segmentations of two groups (one which incorporated unconscious thought theory and the other which did not). Latent Class Segmentation (LCS) analysis indicated that whilst both approaches generate four similar segments, the CBC/UTT approach revealed a fifth (hidden) segment, unidentified in the other sample. In addition, the nutritional preferences of four of the five segments produced via the CBC/UTT approach matched those demonstrated by the participants’ store card behavioural data in a manner unobserved for the traditional CBC approach.
    This research provides a framework for further exploration and identifies a number of issues, such as which types of working memory distraction-tasks are most effective, that could potentially improve the approach if replicated. Doctor of Business Administration

    Statistical models for the induction and use of selectional preferences
