
    Ontology of core data mining entities

    In this article, we present OntoDM-core, an ontology of core data mining entities. OntoDM-core defines the most essential data mining entities in a three-layered ontological structure comprising a specification, an implementation, and an application layer. It provides a representational framework for the description of mining structured data, and in addition provides taxonomies of datasets, data mining tasks, generalizations, data mining algorithms, and constraints, based on the type of data. OntoDM-core is designed to support a wide range of applications and use cases, such as semantic annotation of data mining algorithms, datasets, and results; annotation of QSAR studies in the context of drug discovery investigations; and disambiguation of terms in text mining. The ontology has been thoroughly assessed following the practices in ontology engineering, is fully interoperable with many domain resources, and is easy to extend.

    SEBIO: A Semantic BioInformatics Platform for the New E-Science

    Knowledge integration and the exchange of data within and among organizations are universally recognized needs in bioinformatics and genomics research in the e-science field. The main obstacles to integration are that the current Web is an environment primarily developed for human users and that micro-array data resources lack widely accepted standards; together these lead to tremendous data heterogeneity. Using semantic technologies as a key technology for the interoperation of various datasets enables knowledge integration of the vast amount of biological and biomedical data. In this paper, we aim to provide a semantically enhanced bioinformatics platform (SEBIO) that handles these issues effectively. We describe the problems that have arisen and the solutions applied so far. To that end, the SEBIO approach is presented and its main components are explained, showing in more detail how it copes with the aforementioned difficulties.

    On the Foundations of Data Interoperability and Semantic Search on the Web

    This dissertation studies the problem of facilitating semantic search across disparate ontologies that are developed by different organizations. There is tremendous potential in enabling users to search independent ontologies and discover knowledge in a serendipitous fashion, i.e., often completely unintended by the developers of the ontologies. The main difficulty with such search is that users generally do not have any control over the naming conventions and content of the ontologies. Thus terms must be appropriately mapped across ontologies based on their meaning. The meaning-based search of data is referred to as semantic search, and its facilitation (aka semantic interoperability) then requires mapping between ontologies. In relational databases, searching across organizational boundaries currently involves the difficult task of setting up a rigid information integration system. Linked Data representations more flexibly tackle the problem of searching across organizational boundaries on the Web. However, there exists no consensus on how ontology mapping should be performed for this scenario, and the problem is open. We lay out the foundations of semantic search on the Web of Data by comparing it to keyword search in the relational model and by providing effective mechanisms to facilitate data interoperability across organizational boundaries. We identify two sharply distinct goals for ontology mapping based on real-world use cases. These goals are: (i) ontology development, and (ii) facilitating interoperability. We systematically analyze these goals, side-by-side, and contrast them. Our analysis demonstrates the implications of the goals on how to perform ontology mapping and how to represent the mappings. We rigorously compare facilitating interoperability between ontologies to information integration in databases. Based on the comparison, class matching is emphasized as a critical part of facilitating interoperability. 
    For class matching, various class similarity metrics are formalized and an algorithm that utilizes these metrics is designed. We also experimentally evaluate the effectiveness of the class similarity metrics on real-world ontologies. In order to encode the correspondences between ontologies for interoperability, we develop a novel W3C-compliant representation, named skeleton.
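    As a rough illustration of class matching driven by a similarity metric, the sketch below scores pairs of classes by the Jaccard overlap of their instance sets and keeps the best-scoring target for each source class. The abstract does not say which metrics the dissertation formalizes, so the choice of Jaccard, the function names `jaccard` and `match_classes`, the threshold, and the toy ontologies are all assumptions made for illustration only.

```python
def jaccard(a, b):
    """Jaccard similarity of two collections, treated as sets."""
    a, b = set(a), set(b)
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def match_classes(source, target, threshold=0.5):
    """Return (source_class, target_class, score) for each source class
    whose best-scoring target class reaches the threshold.

    `source` and `target` map class names to iterables of instance ids
    (a hypothetical, simplified view of an ontology).
    """
    matches = []
    for s_name, s_instances in source.items():
        # Score every target class; ties are broken by class name.
        scored = [(jaccard(s_instances, t_instances), t_name)
                  for t_name, t_instances in target.items()]
        if not scored:
            continue
        score, t_name = max(scored)
        if score >= threshold:
            matches.append((s_name, t_name, score))
    return matches


# Toy ontologies that share some instance identifiers.
onto_a = {"Person": {1, 2, 3}, "City": {10, 11}}
onto_b = {"Human": {1, 2, 3, 4}, "Town": {10, 11}, "Car": {99}}
result = match_classes(onto_a, onto_b)
```

    An instance-based metric like this only works when the two ontologies describe overlapping individuals; metrics based on labels or structure would be needed otherwise.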

    Information modelling for the development of sustainable construction (MINDOC)

    In recent decades, controlling environmental impact through lifecycle analysis has become a topical issue in the building sector. However, problems arise when experts try to exchange information in order to conduct studies such as the environmental assessment of a building. Construction product databases are also heterogeneous: they do not have the same characteristics and do not use the same basis to measure the environmental impact of each construction product. Moreover, it remains difficult to exploit the full potential of linking BIM, the Semantic Web, and databases of construction products, because the idea of combining them is relatively recent. The goal of this thesis is to increase the flexibility needed to assess a building's environmental impact in a timely manner. First, our research identifies gaps in interoperability in the AEC (Architecture, Engineering and Construction) domain. Then, we address some of the shortcomings encountered in the formalization of building information and the generation of building data in Semantic Web formats. We further promote efficient use of BIM throughout the building life cycle by integrating and referencing environmental data on construction products in a BIM tool. Moreover, semantics is improved by enhancing a well-known building ontology, ifcOWL (Industry Foundation Classes Web Ontology Language). Finally, we apply our methodology to a case study of a small building.

    The United States Marine Corps Data Collaboration Requirements: Retrieving and Integrating Data From Multiple Databases

    The goal of this research is to develop an information sharing and database integration model and suggest a framework to fully satisfy the United States Marine Corps collaboration requirements as well as its information sharing and database integration needs. This research is exploratory; it focuses on only one initiative: the IT-21 initiative. The IT-21 initiative dictates The Technology for the United States Navy and Marine Corps, 2000-2035: Becoming a 21st Century Force. The IT-21 initiative states that Navy and Marine Corps information infrastructure will be based largely on commercial systems and services, and the Department of the Navy must ensure that these systems are seamlessly integrated and that information transported over the infrastructure is protected and secure. The Delphi Technique, a qualitative method, was used to develop a Holistic Model and to suggest a framework for information sharing and database integration. Data was primarily collected from mid-level to senior information officers, with a focus on Chief Information Officers. In addition, an extensive literature review was conducted to gain insight into known similarities and differences in Strategic Information Management, information sharing strategies, and database integration strategies. It is hoped that the Armed Forces and the Department of Defense will benefit from future development of the information sharing and database integration Holistic Model.

    A semantic and agent-based approach to support information retrieval, interoperability and multi-lateral viewpoints for heterogeneous environmental databases

    Data stored in individual autonomous databases often needs to be combined and interrelated. For example, in the Inland Water (IW) environment monitoring domain, the spatial and temporal variation of measurements of different water quality indicators stored in different databases is of interest. Data from multiple data sources is more complex to combine when there is a lack of metadata in a computational form and when the syntax and semantics of the stored data models are heterogeneous. The main types of information retrieval (IR) requirements are query transparency and data harmonisation for data interoperability, and support for multiple user views. A combined Semantic Web based and Agent based distributed system framework has been developed to support the above IR requirements. It has been implemented using the Jena ontology and JADE agent toolkits. The semantic part supports the interoperability of autonomous data sources by merging their intensional data, using a Global-As-View (GAV) approach, into a global semantic model, represented in DAML+OIL and in OWL. This is used to mediate between different local database views. The agent part provides the semantic services to import, align and parse semantic metadata instances, to support data mediation and to reason about data mappings during alignment. The framework has been applied to support information retrieval, interoperability and multi-lateral viewpoints for four European environmental agency databases. An extended GAV approach has been developed and applied to handle queries that can be reformulated over multiple user views of the stored data. This allows users to retrieve data in a conceptualisation that is better suited to them, rather than having to understand the entire detailed global view conceptualisation. User viewpoints are derived from the global ontology or existing viewpoints of it.
    This has the advantage that it reduces the number of potential conceptualisations and their associated mappings, making them more computationally manageable. Whereas an ad hoc framework based upon a conventional distributed programming language and a rule framework could be used to support user views and adaptation to user views, a more formal framework has the benefit that it can support reasoning about consistency, equivalence, containment and conflict resolution when traversing data models. A preliminary formulation of the formal model has been undertaken and is based upon extending a Datalog-type algebra with hierarchical, attribute and instance value operators. These operators can be applied to support compositional mapping and consistency checking of data views. The multiple viewpoint system was implemented as a Java-based application consisting of two sub-systems, one for viewpoint adaptation and management, the other for query processing and query result adjustment.
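    The Global-As-View idea mentioned in this abstract can be sketched in a few lines: each relation of the global schema is defined as a view over the local sources, and a query posed against the global schema is answered by unfolding that view. The sketch below is not the thesis's Jena/JADE implementation; the two agency sources, their field names, the unit conversion, and the functions `global_water_quality` and `query` are hypothetical and chosen only to show the mechanism.

```python
# Two hypothetical local sources with different field names and units.
AGENCY_A = [{"site": "A1", "ph": 7.2, "temp_c": 12.0}]
AGENCY_B = [{"station": "B7", "pH_value": 6.8, "temp_f": 59.0}]


def view_agency_a(rows):
    """Map source A rows onto the global schema (station, ph, temp_c)."""
    for r in rows:
        yield {"station": r["site"], "ph": r["ph"], "temp_c": r["temp_c"]}


def view_agency_b(rows):
    """Map source B rows onto the global schema, converting F to C."""
    for r in rows:
        yield {
            "station": r["station"],
            "ph": r["pH_value"],
            "temp_c": round((r["temp_f"] - 32) * 5 / 9, 1),
        }


def global_water_quality():
    """GAV definition: the global relation is the union of source views."""
    yield from view_agency_a(AGENCY_A)
    yield from view_agency_b(AGENCY_B)


def query(predicate):
    """Answer a query over the global schema by unfolding the view."""
    return [row for row in global_water_quality() if predicate(row)]


# Users query the global schema without knowing the local schemas.
acidic = query(lambda row: row["ph"] < 7)
```

    Because the mappings live in the view definitions, adding a new source in this style means writing one more view function and extending the union, while existing queries stay unchanged.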