15,873 research outputs found

    Bottom-up construction of ontologies

    Get PDF
    Presents a particular way of building ontologies that proceeds in a bottom-up fashion. Concepts are defined in a way that mirrors the way their instances are composed out of smaller objects. The smaller objects themselves may also be modeled as being composed. Bottom-up ontologies are flexible through the use of implicit and, hence, parsimonious part-whole and subconcept-superconcept relations. The bottom-up method complements current practice, where, as a rule, ontologies are built top-down. The design method is illustrated by an example involving ontologies of pure substances at several levels of detail. It is not claimed that bottom-up construction is a generally valid recipe; indeed, such recipes are deemed uninformative or impossible. Rather, the approach is intended to enrich the ontology developer's toolki

    A Study of Metrics of Distance and Correlation Between Ranked Lists for Compositionality Detection

    Full text link
    Compositionality in language refers to how much the meaning of some phrase can be decomposed into the meaning of its constituents and the way these constituents are combined. Based on the premise that substitution by synonyms is meaning-preserving, compositionality can be approximated as the semantic similarity between a phrase and a version of that phrase where words have been replaced by their synonyms. Different ways of representing such phrases exist (e.g., vectors [1] or language models [2]), and the choice of representation affects the measurement of semantic similarity. We propose a new compositionality detection method that represents phrases as ranked lists of term weights. Our method approximates the semantic similarity between two ranked list representations using a range of well-known distance and correlation metrics. In contrast to most state-of-the-art approaches in compositionality detection, our method is completely unsupervised. Experiments with a publicly available dataset of 1048 human-annotated phrases shows that, compared to strong supervised baselines, our approach provides superior measurement of compositionality using any of the distance and correlation metrics considered

    The Evaluation Of Molecular Similarity And Molecular Diversity Methods Using Biological Activity Data

    Get PDF
    This paper reviews the techniques available for quantifying the effectiveness of methods for molecule similarity and molecular diversity, focusing in particular on similarity searching and on compound selection procedures. The evaluation criteria considered are based on biological activity data, both qualitative and quantitative, with rather different criteria needing to be used depending on the type of data available

    Enriching ontological user profiles with tagging history for multi-domain recommendations

    Get PDF
    Many advanced recommendation frameworks employ ontologies of various complexities to model individuals and items, providing a mechanism for the expression of user interests and the representation of item attributes. As a result, complex matching techniques can be applied to support individuals in the discovery of items according to explicit and implicit user preferences. Recently, the rapid adoption of Web2.0, and the proliferation of social networking sites, has resulted in more and more users providing an increasing amount of information about themselves that could be exploited for recommendation purposes. However, the unification of personal information with ontologies using the contemporary knowledge representation methods often associated with Web2.0 applications, such as community tagging, is a non-trivial task. In this paper, we propose a method for the unification of tags with ontologies by grounding tags to a shared representation in the form of Wordnet and Wikipedia. We incorporate individuals' tagging history into their ontological profiles by matching tags with ontology concepts. This approach is preliminary evaluated by extending an existing news recommendation system with user tagging histories harvested from popular social networking sites

    The head-modifier principle and multilingual term extraction

    Get PDF
    Advances in Language Engineering may be dependent on theoretical principles originating from linguistics since both share a common object of enquiry, natural language structures. We outline an approach to term extraction that rests on theoretical claims about the structure of words. We use the structural properties of compound words to specifically elicit the sets of terms defined by type hierarchies such as hyponymy and meronymy. The theoretical claims revolve around the head-modifier principle which determines the formation of a major class of compounds. Significantly it has been suggested that the principle operates in languages other than English. To demonstrate the extendibility of our approach beyond English, we present a case study of term extraction in Chinese, a language whose written form is the vehicle of communication for over 1.3 billion language users, and therefore has great significance for the development of language engineering technologies

    Efficient & Effective Selective Query Rewriting with Efficiency Predictions

    Get PDF
    To enhance effectiveness, a user's query can be rewritten internally by the search engine in many ways, for example by applying proximity, or by expanding the query with related terms. However, approaches that benefit effectiveness often have a negative impact on efficiency, which has impacts upon the user satisfaction, if the query is excessively slow. In this paper, we propose a novel framework for using the predicted execution time of various query rewritings to select between alternatives on a per-query basis, in a manner that ensures both effectiveness and efficiency. In particular, we propose the prediction of the execution time of ephemeral (e.g., proximity) posting lists generated from uni-gram inverted index posting lists, which are used in establishing the permissible query rewriting alternatives that may execute in the allowed time. Experiments examining both the effectiveness and efficiency of the proposed approach demonstrate that a 49% decrease in mean response time (and 62% decrease in 95th-percentile response time) can be attained without significantly hindering the effectiveness of the search engine
    corecore