
    Semantic categories underlying the meaning of ‘place’

    This paper analyses the semantics of natural language expressions that are associated with the intuitive notion of ‘place’. We note that the nature of such terms is highly contested, and suggest that this arises from two main considerations: 1) there are a number of logically distinct categories of place expression, which are not always clearly distinguished in discourse about ‘place’; 2) the many non-substantive place count nouns (such as ‘place’, ‘region’, ‘area’, etc.) employed in natural language are highly ambiguous. With respect to consideration 1), we propose that place-related expressions should be classified into the following distinct logical types: a) ‘place-like’ count nouns (further subdivided into abstract, spatial and substantive varieties), b) proper names of ‘place-like’ objects, c) locative property phrases, and d) definite descriptions of ‘place-like’ objects. We outline possible formal representations for each of these. To address consideration 2), we examine the meanings, connotations and ambiguities of the English vocabulary of abstract and generic place count nouns, and identify underlying elements of meaning that explain both similarities and differences in the sense and usage of the various terms.

    CESI: Canonicalizing Open Knowledge Bases using Embeddings and Side Information

    Open Information Extraction (OpenIE) methods extract (noun phrase, relation phrase, noun phrase) triples from text, resulting in the construction of large Open Knowledge Bases (Open KBs). The noun phrases (NPs) and relation phrases in such Open KBs are not canonicalized, leading to the storage of redundant and ambiguous facts. Recent research has posed canonicalization of Open KBs as clustering over manually defined feature spaces. Manual feature engineering is expensive and often sub-optimal. To overcome this challenge, we propose Canonicalization using Embeddings and Side Information (CESI), a novel approach which performs canonicalization over learned embeddings of Open KBs. CESI extends recent advances in KB embedding by incorporating relevant NP and relation phrase side information in a principled manner. Through extensive experiments on multiple real-world datasets, we demonstrate CESI's effectiveness. Comment: Accepted at WWW 201

    Generating natural language specifications from UML class diagrams

    Early phases of software development are known to be problematic and difficult to manage, and errors occurring during these phases are expensive to correct. Many systems have been developed to aid the transition from informal Natural Language requirements to semi-structured or formal specifications. Furthermore, consistency checking is seen by many software engineers as the solution to reduce the number of errors occurring during the software development life cycle and to allow early verification and validation of software systems. However, this is confined to the models developed during analysis and design and fails to include the early Natural Language requirements. This excludes proper user involvement and creates a gap between the original requirements and the updated and modified models and implementations of the system. To improve this process, we propose a system that generates Natural Language specifications from UML class diagrams. We first investigate the variation of the input language used in naming the components of a class diagram, based on the study of a large number of examples from the literature, and then develop rules for removing ambiguities in the subset of Natural Language used within UML. We use WordNet, a linguistic ontology, to disambiguate the lexical structures of the UML string names and generate semantically sound sentences. Our system is developed in Java and is tested on an independent, though academic, case study.

    Experimental Support for a Categorical Compositional Distributional Model of Meaning

    Modelling compositional meaning for sentences using empirical distributional methods has been a challenge for computational linguists. We implement the abstract categorical model of Coecke et al. (arXiv:1003.4394v1 [cs.CL]) using data from the BNC and evaluate it. The implementation is based on unsupervised learning of matrices for relational words and applying them to the vectors of their arguments. The evaluation is based on the word disambiguation task developed by Mitchell and Lapata (2008) for intransitive sentences, and on a similar new experiment designed for transitive sentences. Our model matches the results of its competitors in the first experiment, and betters them in the second. The general improvement in results with increase in syntactic complexity showcases the compositional power of our model. Comment: 11 pages, to be presented at EMNLP 2011, to be published in Proceedings of the 2011 Conference on Empirical Methods in Natural Language Processing

    Particle verbs and the conditions of projection

    In this paper, I discuss the properties of particle verbs in light of a proposal about syntactic projection. In section 2, I suggest that projection involves functional structure in two important ways: (i) only functional phrases can be complements, and (ii) lexical heads that take complements and project must be inflected. In section 3, I show that the structure of particle verbs is not uniform with respect to (i) and (ii). On the one hand, a particle always combines with an inflected verb; in this respect, particle verbs look like verb-complement constructions. On the other hand, the particle is not a functional phrase and therefore is not a proper complement, which makes the combination of the particle and the verb look more like a morphologically complex verb. I argue that syntactic rules can in fact interpret the node dominating the particle and the verb as a projection and as a complex head. In section 4, I show that many of the characteristic properties of particle verbs in the Germanic languages follow from the fact that they are structural hybrids.