
    Ditransitive verbs and the ditransitive construction: a diachronic perspective

    This paper argues for the adoption of a construction-based perspective on the investigation of diachronic shifts in valency, a hitherto largely neglected topic in the framework of valency grammar. On the basis of a comparison of the set of verbs attested in the double object argument structure pattern in a corpus of 18th-century British English with the construction's present-day semantic range, I distinguish three kinds of valency shifts. It is shown that the semantic ranges of schematic argument structure constructions are subject to diachronic change, and that the shifts in valency observed in individual verbs are often part of more general changes at the level of the associated argument structure constructions. The latter part of the paper explores frequency shifts in valency and constructional semantics.
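
    A minimal Python sketch of the kind of comparison described above, contrasting the verbs attested in the double object pattern at two stages; the verb inventories here are invented placeholders, not the paper's corpus data.

    # Hypothetical verb inventories for the double object pattern.
    historical = {"give", "send", "offer", "promise", "envy", "forgive"}
    present_day = {"give", "send", "offer", "promise", "email", "text"}

    lost = historical - present_day      # verbs that left the construction
    gained = present_day - historical    # verbs newly attracted to it
    retained = historical & present_day  # the construction's stable core

    print("lost:", sorted(lost))
    print("gained:", sorted(gained))
    print("retained:", sorted(retained))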

    Knowledge-based methods for automatic extraction of domain-specific ontologies

    Semantic web technology aims at developing methodologies for representing large amounts of knowledge in web-accessible form. The semantics of this knowledge should be easy for computer programs to interpret and understand, so that sharing and utilizing knowledge across the Web becomes possible. Domain-specific ontologies form the basis for knowledge representation in the semantic web. Research on automated development of ontologies from texts has become increasingly important because manual construction of ontologies is labor intensive and costly, while large amounts of text for individual domains are already available in electronic form. However, automatic extraction of domain-specific ontologies is challenging due to the unstructured nature of texts and inherent semantic ambiguities in natural language. Moreover, the large size of the texts to be processed renders full-fledged natural language processing methods infeasible. In this dissertation, we develop a set of knowledge-based techniques for automatic extraction of ontological components (concepts, taxonomic and non-taxonomic relations) from domain texts. The proposed methods combine information retrieval metrics, a lexical knowledge base (WordNet), machine learning techniques, heuristics, and statistical approaches to meet the challenges of the task, and they are domain-independent and fully automatic. For extraction of concepts, the proposed WNSCA+{PE, POP} method utilizes the lexical knowledge base WordNet to improve precision and recall over traditional information retrieval metrics. A WordNet-based approach, a compound term heuristic, and a supervised learning approach are developed for taxonomy extraction. We also develop a weighted word-sense disambiguation method for use with the WordNet-based approach. An unsupervised approach using log-likelihood ratios is proposed for extracting non-taxonomic relations. Furthermore, a supervised approach is investigated to learn semantic constraints for identifying relations from prepositional phrases. The proposed methods are validated by experiments with corpora from the Electronic Voting and the Tender Offers, Mergers, and Acquisitions domains. Experimental results and comparisons with existing approaches clearly indicate the superiority of our methods. In summary, the combination of information retrieval, lexical knowledge bases, statistics, and machine learning pursued in this study yields techniques that are efficient and effective for extracting ontological components automatically.
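
    As a minimal sketch of the unsupervised step mentioned above, the snippet below ranks co-occurring word pairs by log-likelihood ratio using NLTK's collocation utilities; the toy sentence stands in for a domain corpus such as the Electronic Voting texts, and the dissertation's actual pipeline is not reproduced here.

    from nltk.collocations import BigramAssocMeasures, BigramCollocationFinder

    text = ("voters cast ballots and officials count ballots after voters "
            "verify ballots on certified voting machines")
    finder = BigramCollocationFinder.from_words(text.split())
    measures = BigramAssocMeasures()

    # Rank co-occurring pairs by log-likelihood ratio; high-scoring pairs
    # are candidate arguments of non-taxonomic relations.
    for pair, score in finder.score_ngrams(measures.likelihood_ratio)[:5]:
        print(pair, round(score, 2))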

    An Exercise in Visualizing Colexification on a Semantic Map

    This paper investigates the polysemic patterns associated with the notion ‘soil/earth’, using the semantic map model as a methodological tool. We focus on the applicability of the model to the lexicon, since most past research has been devoted to the analysis of grammatical morphemes. The most concise result of our research is a diagrammatic visualization of the semantic spaces of twenty lexemes in nine different languages, mainly ancient languages belonging to the Indo-European and Afro-Asiatic language families. The common semantic map for the various languages reveals that the semantic spaces covered by the investigated lexemes are often quite different from one another, although common patterns can also be detected. Our study highlights some shortcomings and methodological problems of previous analyses, suggesting that a possible solution is to check the data against the existing sources of the object languages. Finally, drawing on the cognitive linguistics literature on types of semantic change, we show that some of the senses of the individual lexemes result from mechanisms such as metaphor, metonymy, and generalization.
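
    A toy Python sketch (using networkx) of how such a semantic map can be assembled: senses are nodes, and an edge is recorded whenever some lexeme colexifies both senses. The three lexeme entries are invented placeholders, not the paper's nine-language dataset.

    import itertools
    import networkx as nx

    # lexeme -> senses it covers (hypothetical entries)
    lexemes = {
        "terra (Latin)": {"soil", "earth (substance)", "land", "world"},
        "ge (Ancient Greek)": {"soil", "earth (substance)", "land"},
        "erets (Hebrew)": {"land", "world", "country"},
    }

    G = nx.Graph()
    for lexeme, senses in lexemes.items():
        for a, b in itertools.combinations(sorted(senses), 2):
            if G.has_edge(a, b):
                G[a][b]["weight"] += 1   # another lexeme attests this link
            else:
                G.add_edge(a, b, weight=1)

    # Edges attested by several lexemes suggest a cross-linguistic pattern.
    for a, b, data in sorted(G.edges(data=True), key=lambda e: -e[2]["weight"]):
        print(f"{a} <-> {b}: {data['weight']} lexeme(s)")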

    Precis of neuroconstructivism: how the brain constructs cognition

    Neuroconstructivism: How the Brain Constructs Cognition proposes a unifying framework for the study of cognitive development that brings together (1) constructivism (which views development as the progressive elaboration of increasingly complex structures), (2) cognitive neuroscience (which aims to understand the neural mechanisms underlying behavior), and (3) computational modeling (which proposes formal and explicit specifications of information processing). The guiding principle of our approach is context dependence, within and (in contrast to Marr [1982]) between levels of organization. We propose that three mechanisms guide the emergence of representations: competition, cooperation, and chronotopy. These in turn allow for two central processes: proactivity and progressive specialization. We suggest that the main outcome of development is partial representations, distributed across distinct functional circuits. This framework is derived by examining development at the level of single neurons, brain systems, and whole organisms. We use the terms encellment, embrainment, and embodiment to describe the higher-level contextual influences that act at each of these levels of organization. To illustrate these mechanisms in operation, we provide case studies in early visual perception, infant habituation, phonological development, and object representations in infancy. Three further case studies concern interactions between levels of explanation: social development, atypical development and, within the latter, developmental dyslexia. We conclude that cognitive development arises from dynamic, contextual change in embodied neural structures, leading to partial representations across multiple brain regions and timescales in response to a proactively specified physical and social environment.

    LMMS Reloaded: Transformer-based Sense Embeddings for Disambiguation and Beyond

    Distributional semantics based on neural approaches is a cornerstone of Natural Language Processing, with surprising connections to human meaning representation as well. Recent Transformer-based Language Models have proven capable of producing contextual word representations that reliably convey sense-specific information, simply as a product of self-supervision. Prior work has shown that these contextual representations can be used to accurately represent large sense inventories as sense embeddings, to the extent that a distance-based solution to Word Sense Disambiguation (WSD) tasks outperforms models trained specifically for the task. Still, there remains much to understand about how to use these Neural Language Models (NLMs) to produce sense embeddings that better harness each NLM's meaning representation abilities. In this work we introduce a more principled approach to leveraging information from all layers of NLMs, informed by a probing analysis of 14 NLM variants. We also emphasize the versatility of these sense embeddings in contrast to task-specific models, applying them to several sense-related tasks besides WSD, and demonstrating improved performance over prior work focused on sense embeddings. Finally, we discuss unexpected findings regarding layer and model performance variations, and potential applications for downstream tasks.
    Comment: Accepted to Artificial Intelligence Journal (AIJ)
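
    A minimal sketch of the underlying recipe: pool a Transformer's hidden states over all layers for a target word, average those vectors over example contexts per sense, and disambiguate by nearest cosine similarity. The plain layer average below is a simplification of the paper's probing-informed pooling, and the sentences and sense labels are invented.

    import torch
    from transformers import AutoModel, AutoTokenizer

    tok = AutoTokenizer.from_pretrained("bert-base-uncased")
    model = AutoModel.from_pretrained("bert-base-uncased", output_hidden_states=True)
    model.eval()

    def target_vector(sentence, target):
        # Hidden states from every layer (embeddings + 12 encoder layers).
        enc = tok(sentence, return_tensors="pt")
        with torch.no_grad():
            hidden = model(**enc).hidden_states
        stacked = torch.stack(hidden).squeeze(1)      # [13, seq_len, 768]
        # Subword positions of the target (sentences are punctuation-free,
        # so whitespace word indices line up with the tokenizer's word_ids).
        word_idx = sentence.split().index(target)
        positions = [i for i, w in enumerate(enc.word_ids(0)) if w == word_idx]
        return stacked[:, positions, :].mean(dim=(0, 1))

    sense_contexts = {
        "bank_river": ["they fished from the grassy bank of the stream",
                       "the boat drifted toward the muddy bank"],
        "bank_finance": ["she deposited the check at the bank downtown",
                         "the bank approved the loan yesterday"],
    }
    sense_vecs = {s: torch.stack([target_vector(c, "bank") for c in ctxs]).mean(0)
                  for s, ctxs in sense_contexts.items()}

    query = target_vector("he opened an account at the bank", "bank")
    scores = {s: torch.cosine_similarity(query, v, dim=0).item()
              for s, v in sense_vecs.items()}
    print(max(scores, key=scores.get))                # expected: bank_finance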

    Evaluation of taxonomic and neural embedding methods for calculating semantic similarity

    Modelling semantic similarity plays a fundamental role in lexical semantic applications. A natural way of calculating semantic similarity is to consult handcrafted semantic networks, but similarity can also be predicted in a distributional vector space. Similarity calculation remains a challenging task, even with the latest breakthroughs in deep neural language models. We first examine popular methodologies for measuring taxonomic similarity, including edge counting, which relies solely on the semantic relations in a taxonomy, as well as more complex methods that estimate concept specificity. We further identify three weighting factors in modelling taxonomic similarity. To study the distinct mechanisms behind taxonomic and distributional similarity measures, we run head-to-head comparisons of each measure against human similarity judgements from the perspectives of word frequency, polysemy degree, and similarity intensity. Our findings suggest that, without fine-tuning of the uniform-distance assumption, taxonomic similarity measures depend on shortest path length as the prime factor in predicting semantic similarity; that, in contrast to distributional semantics, edge counting is free from sense-distribution bias in use and can measure word similarity both literally and metaphorically; and that the synergy of retrofitting neural embeddings with concept relations in similarity prediction may indicate a new trend in leveraging knowledge bases for transfer learning. A large gap still exists in computing semantic similarity across different ranges of word frequency, polysemy degree, and similarity intensity.
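
    A minimal sketch of the head-to-head setup described above: an edge-counting taxonomic measure (WordNet path similarity via NLTK, assuming the WordNet data is installed) rank-correlated with human similarity judgements. The four-item judgement list is an invented placeholder, not a benchmark set.

    from nltk.corpus import wordnet as wn
    from scipy.stats import spearmanr

    # (word1, word2, human rating on a 0-10 scale) -- hypothetical values
    judgements = [
        ("car", "automobile", 9.5),
        ("coast", "shore", 8.7),
        ("journey", "car", 2.1),
        ("noon", "string", 0.3),
    ]

    def path_sim(w1, w2):
        # Edge counting: best path similarity over all sense pairs.
        pairs = ((a, b) for a in wn.synsets(w1) for b in wn.synsets(w2))
        return max((a.path_similarity(b) or 0.0) for a, b in pairs)

    predicted = [path_sim(w1, w2) for w1, w2, _ in judgements]
    human = [r for _, _, r in judgements]
    rho, _ = spearmanr(predicted, human)
    print(f"Spearman rho vs human judgements: {rho:.2f}")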

    The representation of polysemy in the mental lexicon and its processing

    Polysemy, the lexical semantic phenomenon in which a word form has different but related senses, is pervasive in natural languages. Examples of polysemy encompass regular polysemy and idiosyncratic forms, such as 'paper' and 'atmosphere' respectively. Nonetheless, regardless of its abundance in languages, it was not until 1980, with the appearance of cognitive grammar, that polysemy was given considerable attention. While this phenomenon does not seem to pose a problem in everyday communication, it has proved notably difficult to treat both theoretically and empirically (Falkum & Vicente, 2015, p. 3). At present, there is an ongoing discussion regarding the representation of polysemy in the mental lexicon and its processing. The purpose of this dissertation is to present the main theories currently discussed by linguists on this topic. To this end, I start by defining polysemy and comparing it to homonymy, the phenomenon by which one word form has at least two different and unrelated meanings, and I explain the criteria and some of the tests that can be applied to distinguish them (e.g. etymological derivation, native intuition, pronominalization, and ellipsis). Moreover, I define the types into which polysemy can be subdivided, emphasizing metonymically and metaphorically motivated polysemy. Once polysemy has been distinguished from homonymy and its subdivisions have been explained, I move on to the main section of this paper: the representation and processing of polysemy in the mental lexicon. The representation is the information stored in the mental lexicon for the different types of word forms; the processing is how that information is accessed and used in language production and comprehension. I then discuss what I consider the two main approaches to this issue. On the one hand, the Sense Enumeration approach postulates that the related senses of polysemous words are both stored and processed like the unrelated senses of homonymous terms. On the other hand, the One Representation approach proposes that polysemous and homonymous terms differ in how their meanings are stored and processed; various views with different perspectives fall within this approach. I then present empirical evidence that partially supports both theories. However, I conclude by taking a stance in favor of the One Representation approach.