
    Human-Level Performance on Word Analogy Questions by Latent Relational Analysis

    This paper introduces Latent Relational Analysis (LRA), a method for measuring relational similarity. LRA has potential applications in many areas, including information extraction, word sense disambiguation, machine translation, and information retrieval. Relational similarity is correspondence between relations, in contrast with attributional similarity, which is correspondence between attributes. When two words have a high degree of attributional similarity, we call them synonyms. When two pairs of words have a high degree of relational similarity, we say that their relations are analogous. For example, the word pair mason/stone is analogous to the pair carpenter/wood; the relations between mason and stone are highly similar to the relations between carpenter and wood. Past work on semantic similarity measures has mainly been concerned with attributional similarity. For instance, Latent Semantic Analysis (LSA) can measure the degree of similarity between two words, but not between two relations. Recently, the Vector Space Model (VSM) of information retrieval has been adapted to the task of measuring relational similarity, achieving a score of 47% on a collection of 374 college-level multiple-choice word analogy questions. In the VSM approach, the relation between a pair of words is characterized by a vector of frequencies of predefined patterns in a large corpus. LRA extends the VSM approach in three ways: (1) the patterns are derived automatically from the corpus (they are not predefined), (2) the Singular Value Decomposition (SVD) is used to smooth the frequency data (it is also used this way in LSA), and (3) automatically generated synonyms are used to explore reformulations of the word pairs. LRA achieves 56% on the 374 analogy questions, statistically equivalent to the average human score of 57%. On the related problem of classifying noun-modifier relations, LRA achieves similar gains over the VSM, while using a smaller corpus.
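
    The VSM/LRA pipeline described above lends itself to a compact illustration: each word pair becomes a vector of pattern frequencies, a truncated SVD smooths the matrix, and relational similarity is the cosine between pair vectors. The sketch below uses invented toy counts and hand-picked patterns purely for illustration; LRA itself derives the patterns automatically from a large corpus and adds synonym-based reformulations, which are omitted here.

```python
# A minimal sketch of the VSM/LRA idea: each word pair is represented by a
# vector of counts of connecting patterns observed in a corpus, the matrix is
# smoothed with a truncated SVD (as LRA does), and relational similarity is
# the cosine between smoothed pair vectors. Counts are toy numbers.
import numpy as np

# Rows: word pairs; columns: intervening patterns found between the words.
pairs = ["mason/stone", "carpenter/wood", "doctor/patient"]
patterns = ["X works with Y", "X cuts Y", "X carves Y", "X treats Y"]
counts = np.array([
    [42.0, 15.0, 30.0, 0.0],   # mason/stone
    [38.0, 20.0, 25.0, 1.0],   # carpenter/wood
    [12.0, 0.0, 0.0, 55.0],    # doctor/patient
])

# Smooth the frequency matrix with a rank-k truncated SVD.
U, s, Vt = np.linalg.svd(counts, full_matrices=False)
k = 2
smoothed = U[:, :k] * s[:k]   # pair vectors in the reduced space

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# Analogous pairs should score higher than non-analogous ones.
print(cosine(smoothed[0], smoothed[1]))  # mason/stone vs carpenter/wood (high)
print(cosine(smoothed[0], smoothed[2]))  # mason/stone vs doctor/patient (lower)
```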

    Exploiting conceptual spaces for ontology integration

    The widespread use of ontologies raises the need to integrate distinct conceptualisations. Whereas the symbolic approach of established representation standards – based on first-order logic (FOL) and syllogistic reasoning – does not implicitly represent semantic similarities, ontology mapping addresses this problem by aiming to establish formal relations between knowledge entities which represent the same or a similar meaning in distinct ontologies. However, manually or semi-automatically identifying similarity relationships is costly. Hence, we argue that representational facilities are required which enable the implicit representation of similarities. Whereas Conceptual Spaces (CS) address similarity computation through the representation of concepts as vector spaces, CS provide neither an implicit representational mechanism nor a means to represent arbitrary relations between concepts or instances. In order to overcome these issues, we propose a hybrid knowledge representation approach which extends FOL-based ontologies with a conceptual grounding through a set of CS-based representations. Consequently, semantic similarity between instances – represented as members in CS – is indicated by means of distance metrics. Hence, automatic similarity detection across distinct ontologies is supported in order to facilitate ontology integration.
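
    The following sketch illustrates the distance-based similarity the abstract describes, under assumed quality dimensions: instances of FOL-based ontologies are grounded as members (points) of a conceptual space, and similarity falls out of a weighted distance metric. All dimension names, salience weights, and coordinates are hypothetical, not taken from the paper.

```python
# A minimal sketch of the proposed conceptual grounding: instances from
# FOL-based ontologies are also represented as members (points) of a shared
# conceptual space, and semantic similarity is read off a distance metric.
# Dimensions, weights, and coordinates are hypothetical illustrations.
import math

# Quality dimensions (normalised to [0, 1]): (speed, mass, cargo capacity)
WEIGHTS = (1.0, 1.0, 0.5)  # salience weights on the dimensions

members_a = {"ontoA:SportsCar": (0.9, 0.1, 0.1), "ontoA:Truck": (0.4, 0.9, 0.9)}
members_b = {"ontoB:Roadster": (0.85, 0.1, 0.1), "ontoB:Lorry": (0.35, 0.95, 0.9)}

def distance(p, q, w=WEIGHTS):
    """Weighted Euclidean distance between two members of the space."""
    return math.sqrt(sum(wi * (a - b) ** 2 for wi, a, b in zip(w, p, q)))

def similarity(p, q):
    """Map distance to a (0, 1] similarity score (1 = identical members)."""
    return 1.0 / (1.0 + distance(p, q))

# Instances of distinct ontologies grounded in the same space can be compared.
print(similarity(members_a["ontoA:SportsCar"], members_b["ontoB:Roadster"]))  # high
print(similarity(members_a["ontoA:SportsCar"], members_b["ontoB:Lorry"]))     # low
```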

    Towards ontology interoperability through conceptual groundings

    The widespread use of ontologies raises the need to resolve heterogeneities between distinct conceptualisations in order to support interoperability. The aim of ontology mapping is to establish formal relations between knowledge entities which represent the same or a similar meaning in distinct ontologies. Whereas the symbolic approach of established SW representation standards – based on first-order logic and syllogistic reasoning – does not implicitly represent similarity relationships, the ontology mapping task strongly relies on identifying semantic similarities. However, since concept representations across distinct ontologies hardly ever equal one another, manually or even semi-automatically identifying similarity relationships is costly. Conceptual Spaces (CS) enable the representation of concepts as vector spaces which implicitly carry similarity information. However, CS provide neither an implicit representational mechanism nor a means to represent arbitrary relations between concepts or instances. In order to overcome these issues, we propose a hybrid knowledge representation approach which extends first-order logic ontologies with a conceptual grounding through a set of CS-based representations. Consequently, semantic similarity between instances – represented as members in CS – is indicated by means of distance metrics. Hence, automatic similarity detection between instances across distinct ontologies is supported in order to facilitate ontology mapping.
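
    Building on the same grounding idea, a conceivable mapping step is sketched below: each instance of one ontology is matched to its nearest member of another ontology in the shared conceptual space, and a candidate mapping relation is proposed when the distance falls below a threshold. Entity names, coordinates, and the threshold are hypothetical illustrations, not the paper's data.

```python
# A minimal sketch of using a shared conceptual grounding for ontology
# mapping: each instance of ontology A is matched to its nearest member of
# ontology B in the conceptual space, and a candidate mapping is proposed
# when the distance falls below a threshold. All names and coordinates are
# hypothetical.
import math

onto_a = {"a:Sedan": (1.0, 0.2), "a:Bicycle": (0.1, 0.9)}
onto_b = {"b:Car": (0.9, 0.25), "b:Bike": (0.15, 0.85), "b:Boat": (0.5, 0.5)}

def dist(p, q):
    return math.dist(p, q)  # Euclidean distance (Python 3.8+)

def propose_mappings(src, tgt, threshold=0.2):
    """Yield (src_entity, tgt_entity, distance) for close pairs of members."""
    for name_a, point_a in src.items():
        name_b, d = min(((n, dist(point_a, p)) for n, p in tgt.items()),
                        key=lambda x: x[1])
        if d <= threshold:
            yield name_a, name_b, d

for a, b, d in propose_mappings(onto_a, onto_b):
    print(f"{a} ≈ {b} (distance {d:.3f})")
```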

    Measuring Sentences Similarity Based on Discourse Representation Structure

    The problem of measuring similarity between sentences is crucial for many applications in Natural Language Processing (NLP). Most of the proposed approaches depend on the similarity of the words in the sentences. This research instead considers semantic relations between words when calculating sentence similarity. The paper uses the Discourse Representation Structure (DRS) of natural language sentences to measure similarity, since DRS captures the structural and semantic information of sentences. The estimation of similarity between two sentences depends on the semantic coverage of the relations of the first sentence in the other sentence. Experiments show that exploiting structural information achieves better results than traditional word-to-word approaches, and the proposed method outperforms similar approaches on a standard benchmark dataset.
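
    A minimal way to picture the coverage idea: reduce each sentence to a set of DRS-like relation triples and score the fraction of the first sentence's relations that are matched, up to synonymy, in the second. The triples are hand-written here (a real system would obtain them from a DRS parser), and the matching rule is a simplification of the paper's method, not its exact formulation.

```python
# Coverage-based similarity over DRS-like relations: each sentence is a set
# of (predicate, arg1, arg2) triples, and similarity is the fraction of the
# first sentence's relations covered by the second, allowing synonym matches.

def covered(rel, rels, synonyms):
    """A relation is covered if some relation in rels matches it up to synonymy."""
    def same(x, y):
        return x == y or y in synonyms.get(x, set())
    return any(all(same(a, b) for a, b in zip(rel, other)) for other in rels)

def drs_similarity(rels1, rels2, synonyms=None):
    """Fraction of the first sentence's relations covered in the second."""
    synonyms = synonyms or {}
    hits = sum(covered(r, rels2, synonyms) for r in rels1)
    return hits / len(rels1) if rels1 else 0.0

# "A man plays a guitar." vs "A person plays an instrument."
s1 = {("play", "man", "guitar")}
s2 = {("play", "person", "instrument")}
syn = {"man": {"person"}, "guitar": {"instrument"}}
print(drs_similarity(s1, s2, syn))  # 1.0 under these toy synonym sets
```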

    Predicting the relevance of distributional semantic similarity with contextual information

    Using distributional analysis methods to compute semantic proximity links between words has become commonplace in NLP, but the resulting relations are often noisy or difficult to interpret. This paper focuses on the issues of evaluating a distributional resource and filtering the relations it contains, but instead of considering it in abstracto, we focus on pairs of words in context. In a discourse, we are interested in knowing whether the semantic link between two items is a by-product of textual coherence or is irrelevant. We first set up a human annotation of semantic links with and without contextual information to show the importance of the textual context in evaluating the relevance of semantic similarity, and to assess the prevalence of actual semantic relations between word tokens. We then built an experiment to automatically predict this relevance, evaluated on the reliable reference dataset which was the outcome of the first annotation. We show that in-document information greatly improves the prediction made by the similarity level alone.
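
    The prediction step can be sketched as an ordinary supervised classifier over candidate links, combining the distributional similarity score with in-document context features. The feature set below (token distance, a same-paragraph flag), the choice of logistic regression, and the training examples are all fabricated stand-ins chosen for illustration, not the paper's actual features, model, or data.

```python
# A sketch of predicting whether a distributional similarity link between two
# word tokens is relevant in context: the similarity score is combined with
# in-document features and fed to a classifier. All data below is toy data.
from sklearn.linear_model import LogisticRegression

# Features per candidate link: [similarity score, token distance, same paragraph]
X = [
    [0.81, 12, 1],   # nearby and highly similar -> relevant
    [0.78, 15, 1],
    [0.83, 400, 0],  # similar but far apart -> often irrelevant
    [0.35, 10, 1],   # nearby but weakly similar
    [0.40, 350, 0],
    [0.90, 8, 1],
]
y = [1, 1, 0, 0, 0, 1]  # human relevance judgments (toy labels)

clf = LogisticRegression().fit(X, y)

# Score a new in-context candidate pair.
print(clf.predict_proba([[0.75, 20, 1]])[0][1])  # probability the link is relevant
```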