610 research outputs found

    From Parsed Corpora to Semantically Related Verbs

    A comprehensive repository of semantic relations between verbs is of great importance in supporting a large area of natural language applications. The aim of this paper is to automatically generate a repository of semantic relations between verb pairs using Distributional Memory (DM), a state-of-the-art framework for distributional semantics. The main idea of our method is to exploit relationships that are expressed through prepositions between a verbal and a nominal event in text to extract semantically related events. Then using these prepositions, we derive relation types including causal, temporal, comparison, and expansion. The result of our study leads to the construction of a resource for semantic relations, which consists of pairs of verbs associated with their probable arguments and significance scores based on our measures. Experimental evaluations show promising results on the task of extracting and categorising semantic relations between verbs
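
    The extraction idea in this abstract can be illustrated with a minimal sketch. The abstract names the four relation types but not which prepositions signal them, so the `PREP_TO_RELATION` table, the example triples, and the use of raw counts in place of the paper's significance measures are all assumptions made for illustration.

```python
from collections import Counter

# Hypothetical preposition-to-relation table (an assumption: the abstract
# does not say which prepositions map to which relation type).
PREP_TO_RELATION = {
    "because of": "causal",
    "after": "temporal",
    "before": "temporal",
    "like": "comparison",
    "besides": "expansion",
}

def extract_relations(triples):
    """Count (verb, event noun, relation) tuples from (verb, prep, noun) triples."""
    counts = Counter()
    for verb, prep, noun in triples:
        relation = PREP_TO_RELATION.get(prep)
        if relation is not None:  # skip prepositions the table does not cover
            counts[(verb, noun, relation)] += 1
    return counts

# Made-up triples standing in for the output of a parsed corpus.
triples = [
    ("resign", "after", "defeat"),
    ("resign", "after", "defeat"),
    ("flood", "because of", "rain"),
    ("walk", "with", "dog"),
]
relations = extract_relations(triples)
```

    Raw frequency is only a stand-in here; a real resource would replace it with an association measure computed over the whole corpus.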

    Leveraging graph-based semantic annotation for the identification of cause-effect relations

    This research concerns Indonesian-language articles and studies the extraction of causal relationships for use in a public health surveillance information monitoring system. The approach relies on feature selection, phrase annotation, paragraph annotation, medical element annotation, and graph-based semantic annotation. System performance is evaluated intrinsically using the Multinomial Naive Bayes method. The recall, precision, and F-measure obtained are 0.924, 0.905, and 0.910, respectively

    Machine Learning for Holistic Evaluation of Scientific Essays

    In the US in particular, there is an increasing emphasis on the importance of science in education. To better understand a scientific topic, students need to compile information from multiple sources and determine the principal causal factors involved. We describe an approach for automatically inferring the quality and completeness of causal reasoning in essays on two separate scientific topics using a novel, two-phase machine learning approach for detecting causal relations. For each core essay concept, we initially trained a window-based tagging model to predict which individual words belonged to that concept. Using the predictions from this first set of models, we then trained a second stacked model on all the predicted word tags present in a sentence to predict inferences between essay concepts. The results indicate we could use such a system to provide explicit feedback to students to improve reasoning and essay writing skills
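
    The two-phase design can be pictured through the inputs each phase would see. The sketch below is illustrative only: the function names, the window size, the concept labels, and the example sentence are assumptions, and no actual classifier from the paper is reproduced.

```python
def window_features(tokens, index, size=2, pad="<pad>"):
    """Phase-one input: the word at `index` plus `size` words on either side."""
    padded = [pad] * size + list(tokens) + [pad] * size
    return padded[index : index + 2 * size + 1]

def sentence_tag_features(predicted_tags, concepts):
    """Phase-two input: one binary flag per concept, set if any word carries it."""
    present = set(predicted_tags)
    return [1 if concept in present else 0 for concept in concepts]

# Made-up sentence and concept labels, for illustration only.
tokens = ["warmer", "water", "causes", "coral", "bleaching"]
window = window_features(tokens, 0)

# Hypothetical phase-one predictions: one concept tag (or None) per word.
predicted = ["temp_rise", "temp_rise", None, "bleaching", "bleaching"]
flags = sentence_tag_features(predicted, ["temp_rise", "bleaching", "storms"])
```

    A stacked model would then be trained on vectors like `flags` to predict whether the sentence asserts a causal link between two concepts.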

    Inquiries into the lexicon-syntax relations in Basque

    Index:- Foreword. B. Oyharçabal.- Morphosyntactic disambiguation and shallow parsing in computational processing in Basque. I. Aduriz, A. Díaz de Ilarraza.- The transitivity of borrowed verbs in Basque: an outline. X. Alberdi.- Patrixa: a unification-based parser for Basque and its application to the automatic analysis of verbs. I. Aldezabal, M. J. Aranzabe, A. Atutxa, K. Gojenola, K. Sarasola.- Learning argument/adjunct distinction for Basque. I. Aldezabal, M. J. Aranzabe, K. Gojenola, K. Sarasola, A. Atutxa.- Analyzing verbal subcategorization aimed at its computation application. I. Aldezabal, P. Goenaga.- Automatic extraction of verb patterns from “hauta-lanerako euskal hiztegia”. J. M. Arriola, X. Artola, A. Soroa.- The case of an enlightening, provoking and admirable Basque derivational suffix with implications for the theory of argument structure. X. Artiagoitia.- Verb-deriving processes in Basque. J. C. Odriozola.- Lexical causatives and causative alternation in Basque. B. Oyharçabal.- Causation and semantic control; diagnosis of incorrect use in minorized languages. I. Zabala.- Subject index.- Contributions

    Lexical typology : a programmatic sketch

    The present paper is an attempt to lay the foundation for Lexical Typology as a new kind of linguistic typology. The goal of Lexical Typology is to investigate crosslinguistically significant patterns of interaction between lexicon and grammar

    Grounding the Linking Competence in Culture and Nature. How Action and Perception Shape the Syntax-Semantics Relationship

    Part I of the book presents my basic assumptions about the syntax-semantics relationship as a competence of language users and compares them with those of the two paradigms that presently account for most theoretical linguistic projects, studies, and publications. I refer to them as Chomskyan Linguistics and Cognitive-Functional Linguistics. I will show that these approaches do not provide the means to accommodate the sociocultural origins of the “linking” competence, creating the need for an alternative approach. While considering these two approaches (sections 2.1 and 2.3), an alternative proposal will be sketched in section 2.2, using the notion of “research programme”. Thus, part I deals mainly with questions of the philosophy of science. Nevertheless, the model underlying the research programme gives structure to the procedure followed throughout the rest of the book, since it identifies the undertaking as multidisciplinary, following from the central roles of perception and action/attribution. This means that approaching the competence of relating form to content as characterized above requires looking into these sub-competences first, since the former draws upon the latter. Part I concludes with the formulation of an action-theoretic vocabulary and taxonomy (section 2.4). This vocabulary serves as the guideline for how to talk about the subject-matter of each of these disciplines. Part II and chapter 3 then deal with the sub-competences that have been identified as underlying linguistic competence. They concern the use of perception, identification/categorization, conceptualization, action, attribution, and the use of linguistic symbols. Section 3.1 in part II deals with perception. 
In particular, two crucial properties of perception will be discussed: that it consists of a bottom-up part and a top-down part, and that the output of perception is underspecified in the sense that what we perceive is not informative with respect to actional, i.e., socially relevant matters. The sections on perception to some degree anticipate the characterization of conceptualization in section 3.2 because the latter will be reconstructed as simulated perception. The property of underspecification is thus sustained in conceptualization, too. If utterances encode concepts and concepts are underspecified with respect to those matters that are most important for everyday interaction, one wonders how verbal interaction can (actually) be successful. Here is where action competence and attribution come into play (the non-conceptual contents referred to above). I will show that native speakers act and cognize according to particular socio-cognitive parameters, on the basis of which they make socially relevant attributions. These in turn specify what was underspecified about concepts beforehand. In other words, actional knowledge including attribution must complement concepts in order to count as the semantics underlying linguistic utterances. Sections 3.3 and 3.4 develop a descriptive means for semantic contents. I present the inherent structural organization of concepts and demonstrate how the spatial and temporal aspects of conceptualization can be systematically related to the syntactic structures underlying utterances. In particular, I will argue that conceptualization is organized by means of trajector-landmark configurations which can quite regularly be related to parts of speech in syntactic constructions using the notion of diagrammatic iconicity. Given a diagrammatic mapping and conceptualization as simulated perception the utterance thus becomes something like an instruction to simulate a perception. 
In part III, section 4.1 deals with the question of what the formal constituents of utterances/constructions contribute to the building of a concept from an utterance. In this context a theory of the German dative is presented, based on the theoretical notions developed throughout this work. Section 4.2 sketches the non-formal properties that reduce the remaining underspecification. In this context one of the most fundamental cognitive properties of language users is uncovered, namely their need to find the cause of any event they are cognizing about. I will then outline the consequences of this property for language production and comprehension. Section 4.3 lists the most important linking schemas for German on the basis of the most important constructions, i.e., motivated conceptualization-syntactic construction mappings, and then describes in a step-by-step manner how – from the utterance-as-instruction-for-conceptualization perspective – such an instruction is obeyed, and how such an instruction is built up from the perception of an event, respectively. The last section, 4.4, is dedicated to a discussion of some of the most famous and most puzzling linguistic phenomena which theoretical linguists traditionally deal with. In discussing the formal aspects of the linguistic competence, examples from German are used

    Spin: Lexical Semantics, Transitivity, and the Identification of Implicit Sentiment

    Current interest in automatic sentiment analysis is motivated by a variety of information requirements. The vast majority of work in sentiment analysis has been specifically targeted at detecting subjective statements and mining opinions. This dissertation focuses on a different but related problem that to date has received relatively little attention in NLP research: detecting implicit sentiment, or spin, in text. This text classification task is distinguished from other sentiment analysis work in that there is no assumption that the documents to be classified with respect to sentiment are necessarily overt expressions of opinion. They rather are documents that might reveal a perspective. This dissertation describes a novel approach to the identification of implicit sentiment, motivated by ideas drawn from the literature on lexical semantics and argument structure, supported and refined through psycholinguistic experimentation. A relationship predictive of sentiment is established for components of meaning that are thought to be drivers of verbal argument selection and linking and to be arbiters of what is foregrounded or backgrounded in discourse. In computational experiments employing targeted lexical selection for verbs and nouns, a set of features reflective of these components of meaning is extracted for the terms. As observable proxies for the underlying semantic components, these features are exploited using machine learning methods for text classification with respect to perspective. After initial experimentation with manually selected lexical resources, the method is generalized to require no manual selection or hand tuning of any kind. The robustness of this linguistically motivated method is demonstrated by successfully applying it to three distinct text domains under a number of different experimental conditions, obtaining the best classification accuracies yet reported for several sentiment classification tasks. 
A novel graph-based classifier combination method is introduced which further improves classification accuracy by integrating statistical classifiers with models of inter-document relationships

    Semantic prosody in Thai

    Semantic prosody is an important concept and has become a primary research interest in corpus linguistics. This thesis undertakes the groundwork of fundamental research into semantic prosody in Thai, a language which has not been subject to studies of semantic prosody before, to set out the parameters for subsequent research in this area. In particular, it addresses these three research questions: 1. What are the advantages and disadvantages of the major approaches to semantic prosody proposed in the literature for describing semantic prosody in Thai? 2. What variation in semantic prosodies across genres can be identified for Thai words? 3. To what extent are the semantic prosodies of words identified as translation-equivalents in widely-used bilingual dictionaries in Thai and English similar or different? The datasets employed in the analysis are the Thai National Corpus and the British National Corpus. To address each research question, a small number of Thai words are selected for the analysis. Two primary approaches, the polarity-oriented approach and the EUM-oriented approach, are employed to identify semantic prosody. Within the polarity-oriented approach, which is founded in work by Louw, Stubbs, and Partington, semantic prosody is identified based on collocates, and is restricted to the positive vs. negative opposition. Within the EUM-oriented approach, which is based in the studies of Sinclair, semantic prosody is identified by examining concordance lines for a pragmatic function or meaning that is spread across an extended unit of meaning. The results of the analysis show that the two primary approaches to semantic prosody do operate successfully with the Thai data. A range of semantic prosodies are identified for /kreeŋcay/ ‘considerate’, /kɔ̀ɔhâykə̀ət/ ‘cause’, and /chɔ̂ɔp/ ‘like’, the objects under study, by the two approaches. The discussion of these semantic prosodies shows that the two approaches are useful for different purposes.
The polarity-oriented approach is useful when one’s aim is to investigate a word’s tendency to co-occur with positive or negative words. Particularly, it reveals the hidden evaluative potential of words whose evaluation is not obvious from their core semantics. The EUM-oriented approach is, by contrast, suitable for the examination of an extended unit of meaning and its pragmatic function in the Sinclairian sense. They both also have some advantages and disadvantages in terms of practicality. On the issue of variation in semantic prosodies across genres, some variation is indeed found to exist. From the concordance analysis of 19 verbs, each in four different genres, namely academic writing, fiction, newspaper stories, and non-academic non-fiction, 21 different extended units of meaning are identified from 14 of the verbs. The level of variation in the use of these extended units of meaning across genres, which implies variation in semantic prosodies, is considerable with some extended units of meaning, but is limited with others. In particular, a notable contrast is identified between academic and fiction genres in terms of which extended units (and semantic prosodies) are common. Finally, the majority of the translation-equivalent pairs under study (36 out of 48) show the same semantic prosody; of these, most present a neutral semantic prosody. In cases where the pairs show different semantic prosodies, there are no cases where one word in the pair shows a positive semantic prosody and the other shows a negative semantic prosody, or vice versa. It is thus arguable that there is a relationship between semantic prosody in Thai and English – not a genetic or areal relationship, but one that arises from a functional basis, that is, the meanings that the pairs of words under study express in both languages
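
    The polarity-oriented approach described in this abstract, which identifies prosody from collocates on a positive vs. negative scale, can be sketched roughly as follows. The span size, the tiny evaluative word lists, and the English example sentence are assumptions chosen for illustration; the thesis itself works with Thai data and does not specify these details in the abstract.

```python
from collections import Counter

def collocates(tokens, node, span=4):
    """Count words occurring within `span` tokens of each occurrence of `node`."""
    counts = Counter()
    for i, tok in enumerate(tokens):
        if tok == node:
            left = tokens[max(0, i - span) : i]
            right = tokens[i + 1 : i + 1 + span]
            counts.update(left + right)
    return counts

def polarity_profile(colloc_counts, positive, negative):
    """Return (positive, negative) collocate totals for a node word."""
    pos = sum(c for w, c in colloc_counts.items() if w in positive)
    neg = sum(c for w, c in colloc_counts.items() if w in negative)
    return pos, neg

# Made-up English example; the word lists are hypothetical lexicons.
tokens = "the delay may cause serious damage and cause further problems".split()
counts = collocates(tokens, "cause", span=2)
profile = polarity_profile(counts, {"benefit"}, {"damage", "problems", "delay"})
```

    A node word whose collocates fall mostly on the negative side, as in this toy example, would be read as carrying a negative semantic prosody under this approach.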
