Provenance in Linked Data Integration

Author: Gibbins Nicholas
Omitola Temitope
Shadbolt Nigel
Publication date: 16/12/2010

The open world of the (Semantic) Web is a global information space offering diverse materials of disparate qualities, and the opportunity to re-use, aggregate, and integrate these materials in novel ways. The advent of Linked Data brings the potential to expose data on the Web, creating new challenges for data consumers who want to integrate these data. One challenge is the ability of users to assess the reliability and/or accuracy of the data they come across. In this paper, we describe a lightweight provenance extension for the voiD vocabulary that allows data publishers to add provenance metadata to their datasets. These provenance metadata can be queried by consumers and used as contextual information for integration and inter-operation of information resources on the Semantic Web.

Southampton (e-Prints Soton)

Using Ontologies for Semantic Data Integration

Author: De Giacomo Giuseppe
Lembo Domenico
Lenzerini Maurizio
Poggi Antonella
Rosati Riccardo
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2018

While big data analytics is considered one of the most important paths to competitive advantage for today’s enterprises, data scientists spend a comparatively large amount of time in the data preparation and data integration phase of a big data project. This shows that data integration is still a major challenge in IT applications. Over the past two decades, the idea of using semantics for data integration has become increasingly crucial, and has received much attention in the AI, database, web, and data mining communities. Here, we focus on a specific paradigm for semantic data integration, called Ontology-Based Data Access (OBDA). The goal of this paper is to provide an overview of OBDA, pointing out both the techniques that are at the basis of the paradigm, and the main challenges that remain to be addressed.

Archivio della ricerca- Università di Roma La Sapienza

Ontology-Based Data Access and Integration

Author: A Calì
A Leitsch
A Levy
Alexandros Chortaras
D Calvanese
Georg Gottlob
Héctor Pérez-Urbina
S Ceri
T Imielinski
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2018

An ontology-based data integration (OBDI) system is an information management system consisting of three components: an ontology, a set of data sources, and the mapping between the two. The ontology is a conceptual, formal description of the domain of interest to a given organization (or a community of users), expressed in terms of relevant concepts, attributes of concepts, relationships between concepts, and logical assertions characterizing the domain knowledge. The data sources are the repositories accessible by the organization where data concerning the domain are stored. In the general case, such repositories are numerous and heterogeneous, and each is managed and maintained independently of the others. The mapping is a precise specification of the correspondence between the data contained in the data sources and the elements of the ontology. The main purpose of an OBDI system is to allow information consumers to query the data using the elements in the ontology as predicates. In the special case where the organization manages a single data source, the term ontology-based data access (OBDA) system is used.
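
The three components described in this abstract can be illustrated with a minimal sketch. It is not taken from the paper: the `Employee` concept, the two toy sources (`hr_db`, `payroll_rows`), and the `query` helper are all hypothetical, and a real OBDI system would use a formal ontology language and query rewriting rather than Python lambdas.

```python
# Data sources (hypothetical): two independently maintained repositories
# that store the same kind of information in different shapes.
hr_db = [{"emp_id": 1, "full_name": "Ada"}, {"emp_id": 2, "full_name": "Alan"}]
payroll_rows = [("Grace", 3), ("Ada", 1)]

# Ontology (hypothetical): here reduced to a set of concept names.
ONTOLOGY = {"Employee"}

# Mapping: for each ontology concept, one extractor per data source that
# returns the instances of that concept found in the source.
MAPPING = {
    "Employee": [
        lambda: (row["full_name"] for row in hr_db),
        lambda: (name for name, _ in payroll_rows),
    ],
}


def query(concept):
    """Answer a query phrased with an ontology concept as the predicate."""
    if concept not in ONTOLOGY:
        raise KeyError(f"unknown ontology concept: {concept}")
    instances = set()
    for extract in MAPPING[concept]:
        # Each extractor hides the source-specific layout from the consumer.
        instances.update(extract())
    return sorted(instances)


print(query("Employee"))
```

The consumer only names the ontology concept; which sources hold the data, and in what layout, is confined to the mapping. With a single source in `MAPPING`, the same sketch corresponds to the OBDA special case.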

Crossref

Archivio della ricerca- Università di Roma La Sapienza

Integration of environmental data in BIM tool & linked building data

Author: Fonbeyin Henry Abanda
Kamsu-Foguem Bernard
Karray Mohamed Hedi
Magniont Camille
Pauwels Pieter
Tchouangouem Justine Floré
Publication date: 01/01/2019

Environmental assessment is a critical need to ensure building sustainability. In order to enhance the sustainability of buildings, the actors involved should be able to access and share not only information about the building but also data about products and especially their environmental assessment. Among several approaches that have been proposed to achieve that, semantic web technologies stand out for their capability to share data and enhance interoperability between highly heterogeneous systems. This paper presents the implementation of a method in which semantic web technologies, and particularly Linked Data, have been combined with Building Information Modelling (BIM) tools to foster building sustainability by introducing products with their environmental assessment in building data during the modelling phase. Based on Linked Building Data (LBD) vocabularies and environmental data, several ontologies have been generated in order to make both of them available as Resource Description Framework (RDF) graphs. A database access plugin has been developed and installed in a BIM tool. In that way, the LBD generated from the BIM tool contains, for each product, a reference to its environmental assessment, which is contained in a triplestore.

Repository TU/e

Ghent University Academic Bibliography

Discovering transcriptional modules by Bayesian data integration

Author: Antoniak
Bar-Joseph
Bernard J. de la Cruz
Bähler
Cho
Dahl
Datta
David L. Wild
Eisen
Falcon
Ferguson
Fritsch
Gasch
Gerber
Geweke
Harbison
Ideker
Ihmels
Jim E. Griffin
Kundaje
Lee
Liu
Liu
Medvedovic
Medvedovic
Qin
Rasmussen
Rasmussen
Reid
Richard S. Savage
Savage
Segal
Segal
Teh
Teh
Wild
Yao
Yeung
Zoubin Ghahramani
Publication venue: 'Oxford University Press (OUP)'
Publication date: 01/01/2010

Motivation: We present a method for directly inferring transcriptional modules (TMs) by integrating gene expression and transcription factor binding (ChIP-chip) data. Our model extends a hierarchical Dirichlet process mixture model to allow data fusion on a gene-by-gene basis. This encodes the intuition that co-expression and co-regulation are not necessarily equivalent and hence we do not expect all genes to group similarly in both datasets. In particular, it allows us to identify the subset of genes that share the same structure of transcriptional modules in both datasets. Results: We find that by working on a gene-by-gene basis, our model is able to extract clusters with greater functional coherence than existing methods. By combining gene expression and transcription factor binding (ChIP-chip) data in this way, we are better able to determine the groups of genes that are most likely to represent underlying TMs.

Crossref

PubMed Central

Warwick Research Archives Portal Repository

Kent Academic Repository

CUED - Cambridge University Engineering Department

Automated data integration for developmental biological research

Author: Sternberg Paul W.
Zhong Weiwei
Publication venue: 'The Company of Biologists'
Publication date: 15/09/2007

In an era exploding with genome-scale data, a major challenge for developmental biologists is how to extract significant clues from these publicly available data to benefit our studies of individual genes, and how to use them to improve our understanding of development at a systems level. Several studies have successfully demonstrated new approaches to classic developmental questions by computationally integrating various genome-wide data sets. Such computational approaches have shown great potential for facilitating research: instead of testing 20,000 genes, researchers might test 200 to the same effect. We discuss the nature and state of this art as it applies to developmental research.

Caltech Authors

UK utility data integration: overcoming schematic heterogeneity

Author: Beck A.R.
Bennett B.
Cohn AG
Fu G.
Ramage S.
Sanderson M.
Stell J.G.
Tagg C
Publication venue: 'SPIE-Intl Soc Optical Eng'
Publication date: 31/10/2008

In this paper we discuss syntactic, semantic and schematic issues which inhibit the integration of utility data in the UK. We then focus on the techniques employed within the VISTA project to overcome schematic heterogeneity. A Global Schema-based architecture is employed. Although automated approaches to Global Schema definition were attempted, the heterogeneities of the sector were too great, so a manual approach to Global Schema definition was employed. The techniques used to define and subsequently map source utility data models to this schema are discussed in detail. In order to ensure a coherent integrated model, sub- and cross-domain validation issues are then highlighted. Finally, the proposed framework and data flow for schematic integration are introduced.
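
As a rough illustration of what mapping source data models to a manually defined global schema can look like, the sketch below renames source fields to global ones. It is not the VISTA project's schema or tooling: the field names, the two source mappings (`WATER_MAP`, `GAS_MAP`), and the `to_global` helper are hypothetical.

```python
# Hypothetical global schema: the fields every integrated record exposes.
GLOBAL_FIELDS = ("asset_id", "asset_type", "depth_m")

# Hypothetical hand-written mappings, one per source utility data model,
# from source field name to global field name.
WATER_MAP = {"pipe_ref": "asset_id", "kind": "asset_type", "depth_metres": "depth_m"}
GAS_MAP = {"id": "asset_id", "category": "asset_type", "cover_depth": "depth_m"}


def to_global(record, field_map):
    """Rename a source record's fields to the global schema.

    Global fields with no source counterpart stay None; source fields
    without a mapping are dropped.
    """
    out = {field: None for field in GLOBAL_FIELDS}
    for source_field, value in record.items():
        target = field_map.get(source_field)
        if target is not None:
            out[target] = value
    return out


water = to_global({"pipe_ref": "W1", "kind": "main", "depth_metres": 0.9}, WATER_MAP)
gas = to_global({"id": "G7", "category": "service", "owner": "n/a"}, GAS_MAP)
print(water)
print(gas)
```

Leaving unmapped global fields as `None` makes gaps visible to a later validation step instead of silently omitting them.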

Crossref

White Rose Research Online