    GO faster ChEBI with Reasonable Biochemistry

    Chemical Entities of Biological Interest (ChEBI) is a database and ontology that represents biochemical knowledge about small molecules. Recent changes to the ontology have created new opportunities for automated reasoning with description logic that have not previously been fully exploited in chemistry. These changes open up the possibility of building an improved chemical semantic web by making more use of necessary and sufficient conditions, allowing reasoning about chemical structure, highlighting ambiguous inconsistencies and improving alignment with the Gene Ontology (GO). This paper briefly discusses some of the problems with reasoning over the current version of ChEBI and their potential solutions.

    Coreference detection of low quality objects

    Record linkage is a widely studied problem that aims to identify coreferent (i.e. duplicate) data in a structured data source. As indicated by Winkler, a solution to the record linkage problem is only possible if the error rate is sufficiently low. In other words, in order to successfully deduplicate a database, the objects in the database must be of sufficient quality. However, this assumption does not always hold. This paper investigates how merging low quality objects into one high quality object can improve the process of record linkage. This general idea is illustrated in the context of string comparison, where strings of low quality (i.e. with a high typographical error rate) are merged into a string of high quality by using an n-dimensional Levenshtein distance matrix and computing the optimal alignment between the dirty strings. Results are presented and possible refinements are proposed.
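
    For orientation, the ordinary two-string Levenshtein distance that the abstract generalizes can be sketched as follows. This is a minimal illustration of the standard dynamic-programming recurrence only; it is not the n-dimensional matrix or the merging procedure from the paper, and the function name `levenshtein` is chosen here for illustration.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance between a and b, with unit cost for each
    insertion, deletion, and substitution (illustrative sketch)."""
    # prev[j] is the distance between the prefix of `a` processed so far
    # and b[:j]; initially the prefix is empty, so the distance is j.
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        # Distance between a[:i] and the empty string is i deletions.
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(
                prev[j] + 1,         # delete ca from a
                curr[j - 1] + 1,     # insert cb into a
                prev[j - 1] + cost,  # substitute (or match)
            ))
        prev = curr
    return prev[-1]


if __name__ == "__main__":
    print(levenshtein("kitten", "sitting"))
```

    With two strings the table is two-dimensional; aligning n dirty strings at once, as the abstract describes, would require one dimension per string.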

    Structured Knowledge Representation for Image Retrieval

    We propose a structured approach to the problem of retrieval of images by content and present a description logic that has been devised for the semantic indexing and retrieval of images containing complex objects. As other approaches do, we start from low-level features extracted with image analysis to detect and characterize regions in an image. However, in contrast with feature-based approaches, we provide a syntax to describe segmented regions as basic objects and complex objects as compositions of basic ones. Then we introduce a companion extensional semantics for defining reasoning services, such as retrieval, classification, and subsumption. These services can be used for both exact and approximate matching, using similarity measures. Using our logical approach as a formal specification, we implemented a complete client-server image retrieval system, which allows a user to pose both queries by sketch and queries by example. A set of experiments has been carried out on a testbed of images to assess the retrieval capabilities of the system in comparison with expert users' rankings. Results are presented adopting a well-established measure of quality borrowed from textual information retrieval.

    Reasoning about Social Semantic Web Applications using String Similarity and Frame Logic

    Social semantic Web, or Web 3.0, applications have gained major attention from academia and industry in recent times. Such applications try to take advantage of user-supplied metadata, using ideas from the semantic Web initiative, in order to provide better services. An open problem is the formalization of such metadata, due to its complex and often inconsistent nature. A possible solution to these inconsistencies is the use of string similarity metrics, which are explained and analyzed. A study of performance and applicability in a frame logic environment is conducted on the case of agent reasoning about multiple domains in TaOPis, a social semantic Web application for self-organizing communities. Results show that the NYSIIS metric yields surprisingly good results on Croatian words and phrases.

    Ontologies on the semantic web

    As an informational technology, the World Wide Web has enjoyed spectacular success. In just ten years it has transformed the way information is produced, stored, and shared in arenas as diverse as shopping, family photo albums, and high-level academic research. The “Semantic Web” was touted by its developers as equally revolutionary but has not yet achieved anything like the Web’s exponential uptake. This 17,000-word survey article explores why this might be so, from a perspective that bridges both philosophy and IT.

    Taxonomy for Humans or Computers? Cognitive Pragmatics for Big Data

    Criticism of big data has focused on showing that more is not necessarily better, in the sense that data may lose their value when taken out of context and aggregated together. The next step is to incorporate an awareness of pitfalls for aggregation into the design of data infrastructure and institutions. A common strategy minimizes aggregation errors by increasing the precision of our conventions for identifying and classifying data. As a counterpoint, we argue that there are pragmatic trade-offs between precision and ambiguity that are key to designing effective solutions for generating big data about biodiversity. We focus on the importance of theory-dependence as a source of ambiguity in taxonomic nomenclature and hence a persistent challenge for implementing a single, long-term solution to storing and accessing meaningful sets of biological specimens. We argue that ambiguity does have a positive role to play in scientific progress as a tool for efficiently symbolizing multiple aspects of taxa and mediating between conflicting hypotheses about their nature. Pursuing a deeper understanding of the trade-offs and synthesis of precision and ambiguity as virtues of scientific language and communication systems then offers a productive next step for realizing sound, big biodiversity data services