10,762 research outputs found

    Inferring Concept Hierarchies from Text Corpora via Hyperbolic Embeddings

    Full text link
    We consider the task of inferring is-a relationships from large text corpora. For this purpose, we propose a new method combining hyperbolic embeddings and Hearst patterns. This approach allows us to set appropriate constraints for inferring concept hierarchies from distributional contexts while also being able to predict missing is-a relationships and to correct wrong extractions. Moreover -- and in contrast with other methods -- the hierarchical nature of hyperbolic space allows us to learn highly efficient representations and to improve the taxonomic consistency of the inferred hierarchies. Experimentally, we show that our approach achieves state-of-the-art performance on several commonly used benchmarks.
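    The abstract's two ingredients can be sketched independently: a Hearst pattern extracts candidate is-a pairs from raw text, and distances in the Poincaré ball model of hyperbolic space encode hierarchy (general concepts near the origin, specific ones near the boundary). This is a minimal illustration, not the paper's actual method; the single regex pattern and function names are illustrative.

    ```python
    import math
    import re

    def hearst_isa_pairs(text):
        # Minimal matcher for one classic Hearst pattern, "Xs such as Y".
        # The paper combines many such lexico-syntactic patterns; this is
        # just the simplest instance, returning (hyponym, hypernym) pairs.
        return [(m.group(2), m.group(1))
                for m in re.finditer(r"(\w+)s such as (\w+)", text)]

    def poincare_distance(u, v):
        # Distance in the Poincare ball model of hyperbolic space.
        su = sum(x * x for x in u)
        sv = sum(x * x for x in v)
        duv = sum((a - b) ** 2 for a, b in zip(u, v))
        return math.acosh(1 + 2 * duv / ((1 - su) * (1 - sv)))
    ```

    Note how the distance grows without bound as points approach the unit boundary, which is what lets a low-dimensional embedding hold an exponentially branching taxonomy.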

    A Context-theoretic Framework for Compositionality in Distributional Semantics

    Full text link
    Techniques in which words are represented as vectors have proved useful in many applications in computational linguistics; however, there is currently no general semantic formalism for representing meaning in terms of vectors. We present a framework for natural language semantics in which words, phrases and sentences are all represented as vectors, based on a theoretical analysis which assumes that meaning is determined by context. In the theoretical analysis, we define a corpus model as a mathematical abstraction of a text corpus. The meaning of a string of words is assumed to be a vector representing the contexts in which it occurs in the corpus model. Based on this assumption, we can show that the vector representations of words can be considered as elements of an algebra over a field. We note that in applications of vector spaces to representing meanings of words there is an underlying lattice structure; we interpret the partial ordering of the lattice as describing entailment between meanings. We also define the context-theoretic probability of a string, and, based on this and the lattice structure, a degree of entailment between strings. We relate the framework to existing methods of composing vector-based representations of meaning, and show that our approach generalises many of these, including vector addition, component-wise multiplication, and the tensor product.
    Comment: Submitted to Computational Linguistics on 20th January 2010 for review
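    The three composition operations the framework generalises, plus one hedged reading of a lattice-based entailment degree (the componentwise meet of non-negative context vectors, normalised by the premise), can be sketched as follows. The `entailment_degree` formulation is an illustrative interpretation, not necessarily the paper's exact definition.

    ```python
    def add(u, v):
        # Vector addition: contexts of the parts simply accumulate.
        return [a + b for a, b in zip(u, v)]

    def pointwise(u, v):
        # Component-wise multiplication: intersective composition.
        return [a * b for a, b in zip(u, v)]

    def tensor(u, v):
        # Tensor product, flattened: every pair of components interacts.
        return [a * b for a in u for b in v]

    def entailment_degree(u, v):
        # Illustrative lattice-based degree of entailment: the meet
        # (componentwise min) of two non-negative context vectors,
        # normalised by the total mass of the premise u.
        meet = sum(min(a, b) for a, b in zip(u, v))
        return meet / sum(u) if sum(u) else 0.0
    ```

    Under this reading, a string entails another to degree 1 when every context supporting it also supports the other, matching the partial-order interpretation of the lattice.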

    Variable Precision Rough Set Approximations in Concept Lattice

    Get PDF
    The notions of variable precision rough set and concept lattice share a basic notion: the definability of a set of objects in terms of a set of properties. On the basis of definability, the two theories can be compared, combined and applied to each other. After introducing the definitions of variable precision rough sets and concept lattices, this paper shows that any extension of a concept in a concept lattice is an equivalence class of a variable precision rough set. We then define lower and upper approximations in a concept lattice and generate the corresponding lower and upper approximation concepts. We go on to discuss the properties of these new approximations. Finally, an example is given to demonstrate the validity of these properties.
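    The lower/upper approximation machinery the abstract builds on can be sketched concretely in Ziarko's variable precision model: an equivalence class joins the lower approximation of a target set when at least a fraction 1 − β of its members lie inside the target, and joins the upper approximation when more than a fraction β do. With β = 0 this reduces to the classical rough set approximations. This is a sketch of the standard model, not of the paper's lattice construction; the function name is illustrative.

    ```python
    def vprs_approximations(classes, target, beta=0.0):
        # Variable precision rough set approximations (Ziarko's model).
        # classes: partition of the universe into equivalence classes.
        # target: the set of objects to approximate.
        # beta in [0, 0.5): the admissible classification error.
        target = set(target)
        lower, upper = set(), set()
        for e in classes:
            e = set(e)
            ratio = len(e & target) / len(e)
            if ratio >= 1 - beta:   # class is (almost) included in target
                lower |= e
            if ratio > beta:        # class overlaps target non-negligibly
                upper |= e
        return lower, upper
    ```

    The boundary region is the upper approximation minus the lower one; β lets a mostly-included class count as definable despite a few exceptions.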

    An Approach to Incremental Learning Good Classification Tests

    Get PDF
    An algorithm for incremental mining of implicative logical rules is proposed. This algorithm is based on constructing good classification tests. The incremental approach to constructing these rules reveals the interdependence between two fundamental components of human thinking: pattern recognition and knowledge acquisition.

    Discovering correlated parameters in Semiconductor Manufacturing processes: a Data Mining approach

    Get PDF
    Data mining tools are nowadays becoming more and more popular in the semiconductor manufacturing industry, especially in yield-oriented enhancement techniques. This is because conventional approaches fail to extract hidden relationships between numerous complex process control parameters. In order to highlight correlations between such parameters, we propose in this paper a complete knowledge discovery in databases (KDD) model. The mining heart of the model uses a new method derived from association rules programming, based on two concepts: decision correlation rules and contingency vectors. The first concept results from a cross-fertilization between correlation rules and decision rules. It enables relevant links to be highlighted between sets of values of a relation and the values of sets of targets belonging to the same relation. Decision correlation rules are built on the twofold basis of the chi-squared measure and of the support of the extracted values. Due to the very nature of the problem, levelwise algorithms only allow extraction of results with long execution times and huge memory occupation. To offset these two problems, we propose an algorithm based both on the lectic order and on contingency vectors, an alternate representation of contingency tables. This algorithm is the basis of our KDD model software, called MineCor. An overall presentation of its other functions, of some significant experimental results, and of the associated performance is provided and discussed.
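    The chi-squared measure that decision correlation rules are built on is Pearson's statistic over a contingency table of parameter values versus target values. The MineCor internals are not reproduced here; this is just the underlying statistic, computed from observed counts and the expected counts under independence.

    ```python
    def chi_squared(table):
        # Pearson's chi-squared statistic for a contingency table given
        # as a list of rows of observed counts. A large value indicates
        # a correlation between the row variable (e.g. a process control
        # parameter) and the column variable (e.g. a yield target).
        rows = [sum(r) for r in table]
        cols = [sum(c) for c in zip(*table)]
        n = sum(rows)
        chi2 = 0.0
        for i, row in enumerate(table):
            for j, obs in enumerate(row):
                exp = rows[i] * cols[j] / n  # expected under independence
                chi2 += (obs - exp) ** 2 / exp
        return chi2
    ```

    A rule would then be retained only when this statistic exceeds a significance threshold and the supporting value sets meet a minimum support, matching the "twofold basis" the abstract describes.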

    Extracting expertise from experts: Methods for knowledge acquisition

    Full text link
    Knowledge acquisition is the biggest bottleneck in the development of expert systems. Fortunately, the process of translating expert knowledge into a form suitable for expert system development can benefit from methods developed by cognitive science to reveal human knowledge structures. There are two classes of these investigative methods, direct and indirect. We provide reviews, criteria for use, and literature sources for all principal methods. Direct methods discussed are: interviews, questionnaires, observation of task performance, protocol analysis, interruption analysis, closed curves, and inferential flow analysis. Indirect methods include: multidimensional scaling, hierarchical clustering, general weighted networks, ordered trees, and repertory grid analysis.
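    One of the indirect methods listed, hierarchical clustering, can be sketched as follows: an expert supplies pairwise dissimilarity ratings between concepts, and single-linkage agglomeration merges any clusters containing a sufficiently similar pair, exposing the expert's implicit groupings. This is a minimal stand-in for the method the review surveys; the function and threshold are illustrative.

    ```python
    def single_link_clusters(items, dist, threshold):
        # Single-linkage agglomerative clustering: repeatedly merge two
        # clusters whenever their closest pair of members is within the
        # dissimilarity threshold. items: hashable concept labels;
        # dist(a, b): expert-supplied dissimilarity between two concepts.
        clusters = [{i} for i in items]
        merged = True
        while merged:
            merged = False
            for a in range(len(clusters)):
                for b in range(a + 1, len(clusters)):
                    closest = min(dist(x, y)
                                  for x in clusters[a] for y in clusters[b])
                    if closest <= threshold:
                        clusters[a] |= clusters.pop(b)
                        merged = True
                        break
                if merged:
                    break
        return clusters
    ```

    Varying the threshold recovers the full dendrogram level by level, which is how the elicited structure is usually inspected.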