
    Applying Cognitive Principles of Similarity to Data Integration – The Case of SIAM

    Increasingly, modern system design is concerned with the integration of legacy systems and data. Consequently, data integration is an important step in many system design projects and also a prerequisite to data warehousing, data mining, and analytics. The central step in data integration is the identification of similar elements in multiple data sources. In this paper, we describe an application of principles of similarity grounded in cognitive psychology, specifically the theory of Similarity as Interactive Activation and Mapping (SIAM), to the problem of database schema matching. In a field that has been dominated by a multitude of ad-hoc algorithms, cognitive principles can establish an appropriate theoretical basis. The results of this paper show initial success in matching applications and point towards future research.

    Lexical choice and conceptual perspective in the generation of plural referring expressions

    A fundamental part of the process of referring to an entity is to categorise it (for instance, as the woman). Where multiple categorisations exist, this implicitly involves the adoption of a conceptual perspective. A challenge for the automatic Generation of Referring Expressions is to identify a set of referents coherently, adopting the same conceptual perspective. We describe and evaluate an algorithm to achieve this. The design of the algorithm is motivated by the results of psycholinguistic experiments.

    Ontology-based methodology for error detection in software design

    Improving the quality of a software design with the goal of producing a high quality software product continues to grow in importance due to the costs that result from poorly designed software. It is commonly accepted that multiple design views are required in order to clearly specify the required functionality of software. There is universal agreement as to the importance of identifying inconsistencies early in the software design process, but the challenge is how to reconcile the representations of the diverse views to ensure consistency. To address the problem of inconsistencies that occur across multiple design views, this research introduces the Methodology for Objects to Agents (MOA). MOA utilizes a new ontology, the Ontology for Software Specification and Design (OSSD), as a common information model to integrate specification knowledge and design knowledge in order to facilitate the interoperability of formal requirements modeling tools and design tools, with the end goal of detecting inconsistency errors in a design. The methodology, which transforms designs represented using the Unified Modeling Language (UML) into representations written in formal agent-oriented modeling languages, integrates object-oriented concepts and agent-oriented concepts in order to take advantage of the benefits that both approaches can provide. The OSSD model is a hierarchical decomposition of software development concepts, including ontological constructs of objects, attributes, behavior, relations, states, transitions, goals, constraints, and plans. The methodology includes a consistency checking process that defines a consistency framework and an Inter-View Inconsistency Detection technique. MOA enhances software design quality by integrating multiple software design views, integrating object-oriented and agent-oriented concepts, and defining an error detection method that associates rules with ontological properties.

    Word sense discovery and disambiguation

    The work is based on the assumption that words with similar syntactic usage have similar meaning, which was proposed by Zellig S. Harris (1954, 1968). We study his assumption from two aspects: first, different meanings (word senses) of a word should manifest themselves in different usages (contexts), and second, similar usages (contexts) should lead to similar meanings (word senses). If we start with the different meanings of a word, we should be able to find distinct contexts for the meanings in text corpora. We separate the meanings by grouping and labeling contexts in an unsupervised or weakly supervised manner (Publication 1, 2 and 3). We are confronted with the question of how best to represent contexts in order to induce effective classifiers of contexts, because differences in context are the only means we have to separate word senses. If we start with words in similar contexts, we should be able to discover similarities in meaning. We can do this monolingually or multilingually. In the monolingual material, we find synonyms and other related words in an unsupervised way (Publication 4). In the multilingual material, we find translations by supervised learning of transliterations (Publication 5). In both the monolingual and multilingual case, we first discover words with similar contexts, i.e., synonym or translation lists. In the monolingual case we also aim at finding structure in the lists by discovering groups of similar words, e.g., synonym sets. In this introduction to the publications of the thesis, we consider the larger background issues of how meaning arises, how it is quantized into word senses, and how it is modeled. We also consider how to define, collect and represent contexts. We discuss how to evaluate the trained context classifiers and discovered word sense classifications, and finally we present the word sense discovery and disambiguation methods of the publications.
    This work supports Harris' hypothesis by implementing three new methods modeled on his hypothesis. The methods have practical consequences for creating thesauruses and translation dictionaries, e.g., for information retrieval and machine translation purposes. Keywords: Word senses, Context, Evaluation, Word sense disambiguation, Word sense discovery
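The idea that words used in similar contexts tend to have similar meanings can be illustrated with a very small sketch. This is not the method of the thesis: the window size, the toy corpus, the bag-of-words context representation, and the function names `context_vectors` and `cosine` are all assumptions made for illustration only.

```python
from collections import Counter
from math import sqrt


def context_vectors(sentences, window=2):
    """Map each word to a Counter of the words seen within `window` positions of it."""
    vectors = {}
    for sentence in sentences:
        tokens = sentence.lower().split()
        for i, word in enumerate(tokens):
            lo = max(0, i - window)
            hi = min(len(tokens), i + window + 1)
            # Context = neighbours on both sides, excluding the word itself.
            context = tokens[lo:i] + tokens[i + 1:hi]
            vectors.setdefault(word, Counter()).update(context)
    return vectors


def cosine(a, b):
    """Cosine similarity between two sparse count vectors (Counters)."""
    dot = sum(a[k] * b[k] for k in a if k in b)
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


# Hypothetical toy corpus, chosen so that "cat" and "dog" share contexts.
corpus = [
    "the cat drinks milk",
    "the dog drinks water",
    "the cat chases mice",
    "the dog chases cats",
]
vecs = context_vectors(corpus)
sim_cat_dog = cosine(vecs["cat"], vecs["dog"])
sim_cat_milk = cosine(vecs["cat"], vecs["milk"])
```

In this toy corpus "cat" and "dog" occur next to the same words, so their context vectors are closer to each other than "cat" is to "milk". A realistic system would need a far larger corpus and a richer context representation, which is the question the abstract raises.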

    Constraint-Based Ontology Induction From Online Customer Reviews

    We present an unsupervised, domain-independent technique for inducing a product-specific ontology of product features based upon online customer reviews. We frame ontology induction as a logical assignment problem and solve it with a bounds consistency constrained logic program. Using shallow natural language processing techniques, reviews are parsed into phrase sequences where each phrase refers to a single concept. Traditional document clustering techniques are adapted to collect phrases into initial concepts. We generate a token graph for each initial concept cluster and find a maximal clique to define the corresponding logical set of concept sub-elements. The logic program assigns tokens to clique sub-elements. We apply the technique to several thousand digital camera customer reviews and evaluate the results by comparing them to the ontologies represented by several prominent online buying guides. Because our results are drawn directly from customer comments, differences between our automatically induced product features and those in extant guides may reflect opportunities for better managing customer-producer relationships rather than errors in the process.
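The token-graph and maximal-clique step can be sketched as follows. The abstract does not say how edges are defined, so this sketch assumes that two tokens are connected when they co-occur in the same phrase; the phrases, the edge rule, and the function names `token_graph` and `maximal_cliques` are hypothetical. Clique enumeration uses the basic Bron-Kerbosch algorithm, which may differ from what the authors used.

```python
from itertools import combinations


def token_graph(phrases):
    """Build an undirected graph: tokens are nodes, co-occurrence in a phrase is an edge.

    The co-occurrence edge rule is an assumption, not taken from the paper.
    """
    graph = {}
    for phrase in phrases:
        tokens = set(phrase.lower().split())
        for token in tokens:
            graph.setdefault(token, set())
        for a, b in combinations(tokens, 2):
            graph[a].add(b)
            graph[b].add(a)
    return graph


def maximal_cliques(graph):
    """Yield every maximal clique using the basic Bron-Kerbosch recursion."""
    def expand(clique, candidates, excluded):
        if not candidates and not excluded:
            yield clique
            return
        for v in list(candidates):
            yield from expand(clique | {v}, candidates & graph[v], excluded & graph[v])
            candidates = candidates - {v}
            excluded = excluded | {v}

    yield from expand(set(), set(graph), set())


# Hypothetical phrases from one initial concept cluster.
phrases = ["optical zoom lens", "zoom lens", "optical zoom", "battery life"]
graph = token_graph(phrases)
cliques = list(maximal_cliques(graph))
largest = max(cliques, key=len)
```

Here `largest` is the set of tokens that all co-occur with one another, which could stand in for the candidate sub-elements of a concept. The assignment of tokens to those sub-elements by a constraint logic program is not covered by this sketch.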