2,183 research outputs found

    Deverbal semantics and the Montagovian generative lexicon

    We propose a lexical account of action nominals, in particular of deverbal nominalisations, whose meaning is related to the event expressed by their base verb. The literature about nominalisations often assumes that the semantics of the base verb completely defines the structure of action nominals. We argue that the information in the base verb is not sufficient to completely determine the semantics of action nominals. We exhibit some data from different languages, especially from Romance languages, which show that nominalisations focus on some aspects of the verb semantics. The selected aspects, however, seem to be idiosyncratic and do not automatically result from the internal structure of the verb nor from its interaction with the morphological suffix. We therefore propose a partially lexicalist view of deverbal nouns. It is made precise and computable by using the Montagovian Generative Lexicon, a type-theoretical framework introduced by Bassac, Mery and Retoré in this journal in 2010. This extension of Montague semantics with a richer type system easily incorporates lexical phenomena like the semantics of action nominals, in particular deverbals, including their polysemy and (in)felicitous copredications. Comment: A revised version will appear in the Journal of Logic, Language and Information.

    A Uniform Approach to Analogies, Synonyms, Antonyms, and Associations

    Recognizing analogies, synonyms, antonyms, and associations appears to be four distinct tasks, requiring distinct NLP algorithms. In the past, the four tasks have been treated independently, using a wide variety of algorithms. These four semantic classes, however, are a tiny sample of the full range of semantic phenomena, and we cannot afford to create ad hoc algorithms for each semantic phenomenon; we need to seek a unified approach. We propose to subsume a broad range of phenomena under analogies. To limit the scope of this paper, we restrict our attention to the subsumption of synonyms, antonyms, and associations. We introduce a supervised corpus-based machine learning algorithm for classifying analogous word pairs, and we show that it can solve multiple-choice SAT analogy questions, TOEFL synonym questions, ESL synonym-antonym questions, and similar-associated-both questions from cognitive psychology.

    From Parsed Corpora to Semantically Related Verbs

    A comprehensive repository of semantic relations between verbs is of great importance in supporting a large area of natural language applications. The aim of this paper is to automatically generate a repository of semantic relations between verb pairs using Distributional Memory (DM), a state-of-the-art framework for distributional semantics. The main idea of our method is to exploit relationships that are expressed through prepositions between a verbal and a nominal event in text to extract semantically related events. Then, using these prepositions, we derive relation types including causal, temporal, comparison, and expansion. Our study results in a resource for semantic relations, which consists of pairs of verbs associated with their probable arguments and significance scores based on our measures. Experimental evaluations show promising results on the task of extracting and categorising semantic relations between verbs.

    Do Multi-Sense Embeddings Improve Natural Language Understanding?

    Learning a distinct representation for each sense of an ambiguous word could lead to more powerful and fine-grained models of vector-space representations. Yet while 'multi-sense' methods have been proposed and tested on artificial word-similarity tasks, we don't know if they improve real natural language understanding tasks. In this paper we introduce a multi-sense embedding model based on Chinese Restaurant Processes that achieves state-of-the-art performance on matching human word similarity judgments, and propose a pipelined architecture for incorporating multi-sense embeddings into language understanding. We then test the performance of our model on part-of-speech tagging, named entity recognition, sentiment analysis, semantic relation identification and semantic relatedness, controlling for embedding dimensionality. We find that multi-sense embeddings do improve performance on some tasks (part-of-speech tagging, semantic relation identification, semantic relatedness) but not on others (named entity recognition, various forms of sentiment analysis). We discuss how these differences may be caused by the different role of word sense information in each of the tasks. The results highlight the importance of testing embedding models in real applications.

    Distributional semantics beyond words: Supervised learning of analogy and paraphrase

    There have been several efforts to extend distributional semantics beyond individual words, to measure the similarity of word pairs, phrases, and sentences (briefly, tuples; ordered sets of words, contiguous or noncontiguous). One way to extend beyond words is to compare two tuples using a function that combines pairwise similarities between the component words in the tuples. A strength of this approach is that it works with both relational similarity (analogy) and compositional similarity (paraphrase). However, past work required hand-coding the combination function for different tasks. The main contribution of this paper is that combination functions are generated by supervised learning. We achieve state-of-the-art results in measuring relational similarity between word pairs (SAT analogies and SemEval 2012 Task 2) and measuring compositional similarity between noun-modifier phrases and unigrams (multiple-choice paraphrase questions).

    Dependency parsing of Turkish

    The suitability of different parsing methods for different languages is an important topic in syntactic parsing. Lesser-studied languages in particular, typologically different from the languages for which methods have originally been developed, pose interesting challenges in this respect. This article presents an investigation of data-driven dependency parsing of Turkish, an agglutinative free constituent order language that can be seen as the representative of a wider class of languages of similar type. Our investigations show that morphological structure plays an essential role in finding syntactic relations in such a language. In particular, we show that employing sublexical representations called inflectional groups, rather than word forms, as the basic parsing units improves parsing accuracy. We compare two different parsing methods, one based on a probabilistic model with beam search, the other based on discriminative classifiers and a deterministic parsing strategy, and show that the usefulness of sublexical units holds regardless of parsing method. We examine the impact of morphological and lexical information in detail and show that, properly used, this kind of information can improve parsing accuracy substantially. Applying the techniques presented in this article, we achieve the highest reported accuracy for parsing the Turkish Treebank.