180 research outputs found

    Reconciling Equational Heterogeneity within a Data Federation

    Mappings in most federated databases are conceptualized and implemented as black-box transformations between source schemas and a federated schema. This approach does not allow specific mappings to be declared once and reused in other situations. We present an alternative approach, in which data-level mappings are represented independently of source and federated schemas as a network between “contexts”. This compendious representation expedites the data federation process via mapping reuse and automated mapping composition from simpler mappings. We illustrate the benefits of mapping reuse and composition using an example that incorporates equational mappings and the application of symbolic equation-solving techniques.
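    The paper's own equational mappings are not given in this listing, but the core idea — declaring simple data-level mappings once and deriving inverse and composed mappings by symbolic equation solving — can be sketched minimally. The temperature relations, variable names, and the use of SymPy below are illustrative assumptions, not details from the paper.

```python
# Minimal sketch: equational mappings between "contexts", inverted and composed
# symbolically. The temperature relations are illustrative assumptions.
from sympy import Eq, solve, symbols

celsius, fahrenheit, kelvin = symbols("celsius fahrenheit kelvin")

# Two declared mappings.
c_to_f = Eq(fahrenheit, celsius * 9 / 5 + 32)
c_to_k = Eq(kelvin, celsius + 273.15)

# Mapping reuse: invert the first mapping instead of declaring it again.
f_to_c = solve(c_to_f, celsius)[0]                 # (5*fahrenheit - 160)/9

# Mapping composition: derive fahrenheit -> kelvin from the two simpler mappings.
f_to_k = solve(c_to_k, kelvin)[0].subs(celsius, f_to_c)

print(f_to_k.subs(fahrenheit, 212))                # 373.15
```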

    Integration of Heterogeneous Data Sources in an Ontological Knowledge Base

    In this paper we present X2R, a system for integrating heterogeneous data sources in an ontological knowledge base. The main goal of the system is to create a unified view of information stored in relational, XML, and LDAP data sources within an organization, expressed in RDF using a common ontology and valid according to a prescribed set of integrity constraints. X2R supports a wide range of source schemas and target ontologies by allowing the user to define potentially complex transformations of data between the original data source and the unified knowledge base. A rich set of integrity constraint primitives is provided to ensure the quality of the unified data set. These constraints are also leveraged in a novel approach to semantic optimization of SPARQL queries.
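    As a rough illustration of the kind of lifting X2R performs, the sketch below turns one relational row into RDF under a small ontology and checks a toy integrity constraint. The namespace, class and property names, and the constraint itself are invented for the example; they are not X2R's actual vocabulary or constraint language.

```python
# Minimal sketch: lift a relational tuple into RDF and validate a toy constraint.
from rdflib import RDF, Graph, Literal, Namespace, URIRef
from rdflib.namespace import XSD

EX = Namespace("http://example.org/ontology#")          # assumed ontology namespace

row = {"id": 42, "name": "Alice", "mail": "alice@example.org"}   # one relational tuple

g = Graph()
person = URIRef(f"http://example.org/person/{row['id']}")
g.add((person, RDF.type, EX.Person))
g.add((person, EX.name, Literal(row["name"], datatype=XSD.string)))
g.add((person, EX.mail, Literal(row["mail"], datatype=XSD.string)))

# Toy integrity constraint: every Person carries exactly one ex:mail value.
for p in g.subjects(RDF.type, EX.Person):
    mails = list(g.objects(p, EX.mail))
    assert len(mails) == 1, f"integrity constraint violated for {p}"

print(g.serialize(format="turtle"))
```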

    An Integration-Oriented Ontology to Govern Evolution in Big Data Ecosystems

    Big Data architectures allow heterogeneous data from multiple sources to be stored and processed flexibly, in their original format. The structure of those data, commonly supplied by means of REST APIs, is continuously evolving, so data analysts need to adapt their analytical processes after each API release. This becomes more challenging when performing an integrated or historical analysis. To cope with such complexity, in this paper we present the Big Data Integration ontology, the core construct to govern the data integration process under schema evolution by systematically annotating it with information regarding the schema of the sources. We present a query rewriting algorithm that, using the annotated ontology, converts queries posed over the ontology into queries over the sources. To cope with syntactic evolution in the sources, we present an algorithm that semi-automatically adapts the ontology upon new releases. This guarantees that ontology-mediated queries correctly retrieve data from the most recent schema version, as well as correctness of historical queries. A functional and performance evaluation on real-world APIs is performed to validate our approach. (Preprint submitted to Information Systems; 35 pages.)
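    The rewriting step described above can be pictured with a minimal sketch: the ontology stays stable while each API release contributes its own attribute-to-field annotations, and queries posed over ontology attributes are rewritten to the field names of a chosen release. The attribute names, releases, and dictionary-based mapping below are illustrative assumptions; the paper's algorithm operates on an annotated RDF ontology rather than plain dictionaries.

```python
# Minimal sketch: ontology-mediated query rewriting under schema evolution.
# Ontology attribute -> source field name, annotated per API release (assumed names).
MAPPINGS = {
    "v1": {"ticket.id": "id", "ticket.opened": "created"},
    "v2": {"ticket.id": "ticket_id", "ticket.opened": "created_at"},
}

def rewrite(query_attrs, release):
    """Rewrite a query posed over the ontology into source-level field names."""
    mapping = MAPPINGS[release]
    return [mapping[attr] for attr in query_attrs]

# The same ontological query works across releases, enabling historical analysis.
print(rewrite(["ticket.id", "ticket.opened"], "v1"))   # ['id', 'created']
print(rewrite(["ticket.id", "ticket.opened"], "v2"))   # ['ticket_id', 'created_at']
```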

    Querying industrial stream-temporal data: An ontology-based visual approach

    An increasing number of sensors are being deployed in business-critical environments, systems, and equipment, and they stream vast amounts of data. The operational efficiency and effectiveness of business processes rely on domain experts’ agility in interpreting data into actionable business information. A domain expert has extensive domain knowledge but not necessarily skills and knowledge in databases and formal query languages. Therefore, centralised approaches are often preferred: IT experts translate the information needs of domain experts into extract-transform-load (ETL) processes in order to extract and integrate data, and domain experts then apply predefined analytics. Since such a workflow is too time-intensive, heavyweight, and inflexible given the high volume and velocity of data, domain experts need to extract and analyse the data of interest directly. Ontologies, i.e., semantically rich conceptual domain models, present an intelligible solution by describing the domain of interest at a higher level of abstraction, closer to reality. Moreover, recent ontology-based data access (OBDA) technologies enable end users to formulate their information needs as queries using a set of terms defined in an ontology. Ontological queries can then be translated into SQL or other database query languages and executed over the data in its original place and format automatically. To this end, this article reports on an ontology-based visual query system (VQS), namely OptiqueVQS; how it is extended for a stream-temporal query language called STARQL; a user experiment with the domain experts at Siemens AG; and STARQL’s query-answering performance over a proof-of-concept implementation for PostgreSQL.
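    The OBDA step the abstract describes — expanding terms from the ontology into SQL over the data in place — can be sketched minimally as below. The ontology term, the sensor table and column names, and the simple time-window clause are illustrative assumptions; STARQL itself has richer stream-temporal semantics than this string template suggests.

```python
# Minimal sketch: expand an ontology term into PostgreSQL over the original store.
OBDA_MAPPINGS = {
    # ontology concept -> SQL fragment over an assumed sensor database
    "TemperatureSensor": "SELECT sensor_id, ts, value FROM measurements WHERE kind = 'temperature'",
}

def translate(concept, window_minutes, threshold):
    """Turn a visual/ontological query into an executable SQL string."""
    base = OBDA_MAPPINGS[concept]
    return (f"SELECT sensor_id, ts, value FROM ({base}) AS m "
            f"WHERE ts > now() - interval '{window_minutes} minutes' "
            f"AND value > {threshold}")

print(translate("TemperatureSensor", window_minutes=10, threshold=90.0))
```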

    Semantic Web Services Provisioning

    Semantic Web Services constitute an important research area in which various underlying frameworks, such as WSMO and OWL-S, define Semantic Web ontologies to describe Web services so that they can be automatically discovered, composed, and invoked. Service discovery has traditionally been interpreted as a functional filter in current Semantic Web Services frameworks, frequently performed by Description Logics reasoners. However, semantic provisioning has to take Quality-of-Service (QoS) into account, defining user preferences that enable QoS-aware Semantic Web Service selection. Nowadays, the research focus is on QoS-aware processes, so current proposals are developing the field by providing QoS support to semantic provisioning, especially in selection processes. These processes lead to optimization problems, where the best service among a set of services has to be selected, so Description Logics cannot be used in this context. Furthermore, user preferences have to be semantically defined so they can be used within selection processes. There are several proposals that extend Semantic Web Services frameworks to allow QoS-aware semantic provisioning. However, the proposed selection techniques are tightly coupled with their proposed extensions, most of them being implemented ad hoc. Thus, there is a semantic gap between functional descriptions (usually using WSMO or OWL-S) and user preferences, which are specific to each proposal, use different ontologies or even non-semantic descriptions, and depend on their corresponding ad hoc selection technique. In this report, we give an overview of the most important Semantic Web Services frameworks and compare them. Then, a thorough analysis of state-of-the-art proposals on QoS-aware semantic provisioning and user preference descriptions is presented, discussing their applicability, advantages, and shortcomings. The results of this analysis motivate our research work, which has already materialized in two early contributions.
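    The selection problem described above — picking the best service from a set under QoS criteria, which Description Logics reasoners alone cannot solve — can be sketched as a small optimisation. The candidate services, QoS attributes, weights, and the linear utility below are illustrative assumptions, not one of the report's surveyed techniques.

```python
# Minimal sketch: QoS-aware service selection as weighted utility maximisation.
candidates = [
    {"name": "svcA", "availability": 0.99, "latency_ms": 120, "cost": 0.05},
    {"name": "svcB", "availability": 0.95, "latency_ms": 40,  "cost": 0.08},
    {"name": "svcC", "availability": 0.97, "latency_ms": 80,  "cost": 0.02},
]
weights = {"availability": 0.5, "latency_ms": 0.3, "cost": 0.2}

def utility(svc):
    # Higher availability is better; lower latency and cost are better.
    return (weights["availability"] * svc["availability"]
            - weights["latency_ms"] * svc["latency_ms"] / 1000.0
            - weights["cost"] * svc["cost"])

best = max(candidates, key=utility)
print(best["name"])   # svcC under these example weights
```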

    Towards Efficient Novel Materials Discovery

    The discovery of novel materials with specific functional properties is one of the highest goals in materials science. Screening the structural and chemical space for potential new material candidates is often facilitated by high-throughput methods. Fast and still precise computations are a main tool for such screenings and often start with a geometry relaxation to find the nearest low-energy configuration relative to the input structure. In part I of this work, a new constrained geometry relaxation is presented which maintains the perfect symmetry of a crystal, saves time and resources, and enables relaxations of meta-stable phases and systems with local symmetries or distortions. Apart from improving such computations for a quicker screening of the materials space, better usage of existing data is another pillar that can accelerate novel materials discovery. While many different databases exist that make computational results accessible, their usability depends largely on how the data is presented. We investigate here how semantic technologies and graph representations can improve data annotation. A number of different ontologies and knowledge graphs are developed, enabling the semantic representation of crystal structures and materials properties as well as experimental results in the field of heterogeneous catalysis. We discuss how the approach of separating ontologies and knowledge graphs breaks down when new knowledge is created using artificial intelligence, and propose an intermediate information layer as a solution. The underlying ontologies provide background knowledge that can serve as a basis for possible autonomous intelligent agents in the future. We conclude that making materials science data understandable to machines still has a long way to go, and that the direct usefulness of semantic technologies in the domain of materials science is at present very limited.
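    The kind of annotation the thesis argues for — crystal structures, computed properties, and their provenance represented as graph data — can be pictured with a few triples. The entity names, predicates, and values below are illustrative assumptions, not content of the ontologies developed in the work.

```python
# Minimal sketch: a materials knowledge graph as subject-predicate-object triples.
triples = [
    ("mat:TiO2_anatase", "rdf:type",           "onto:CrystalStructure"),
    ("mat:TiO2_anatase", "onto:hasSpaceGroup", "I4_1/amd"),
    ("mat:TiO2_anatase", "onto:hasProperty",   "prop:bandgap_1"),
    ("prop:bandgap_1",   "onto:value_eV",      3.2),
    ("prop:bandgap_1",   "onto:obtainedBy",    "method:DFT_relaxation"),
]

def objects(subject, predicate):
    """All objects attached to a subject via a predicate."""
    return [o for s, p, o in triples if s == subject and p == predicate]

print(objects("mat:TiO2_anatase", "onto:hasProperty"))   # ['prop:bandgap_1']
```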

    A specification and discovery environment for software component reuse in distributed software development

    Our work aims to develop an effective solution for the discovery and reuse of software components in existing and commonly used development environments. We propose an ontology for describing and discovering atomic software components. The description covers both the functional and the non-functional properties, which are expressed as QoS parameters. Our search process is based on a function that calculates the semantic distance between the component interface signature and the signature of a given query, thus achieving an appropriate comparison. We also use the notion of "subsumption" to compare the inputs/outputs of the query and of the components. After selecting the appropriate components, the non-functional properties are used to refine the search result. We propose an approach for discovering composite components when no atomic component is found; this approach is based on the shared ontology. To integrate the resulting component into the project under development, we developed an integration ontology and two services, "input/output convertor" and "output Matching".
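    The matching step the abstract outlines — a semantic distance over interface signatures plus subsumption between the query's and the components' inputs/outputs — can be sketched with a toy type hierarchy. The hierarchy, the component signatures, and the scoring are illustrative assumptions, not the ontology or distance function proposed in the thesis.

```python
# Minimal sketch: rank components by a semantic distance over their signatures.
SUBSUMES = {                    # child type -> parent type (assumed hierarchy)
    "PDFDocument": "Document",
    "Document": "Resource",
    "PNGImage": "Image",
    "Image": "Resource",
}

def distance(requested, provided):
    """0 if identical, +1 per subsumption step up from `requested` to `provided`, else a large penalty."""
    steps, current = 0, requested
    while current is not None:
        if current == provided:
            return steps
        current = SUBSUMES.get(current)
        steps += 1
    return 100

components = {"PdfExtractor": ("PDFDocument", "Text"), "ImgOcr": ("PNGImage", "Text")}
query = ("PDFDocument", "Text")                         # (input type, output type)

scores = {name: distance(query[0], sig[0]) + distance(query[1], sig[1])
          for name, sig in components.items()}
print(min(scores, key=scores.get))                      # PdfExtractor
```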

    Model driven design and data integration in semantic web information systems

    The Web is quickly evolving in many ways. It has evolved from a Web of documents into a Web of applications, in which a growing number of designers offer new and interactive Web applications to people all over the world. However, application design and implementation remain complex, error-prone, and laborious. In parallel there is also an evolution from a Web of documents into a Web of `knowledge', as a growing number of data owners are sharing their data sources with a growing audience. This brings potential new applications for these data sources, including scenarios in which these datasets are reused and integrated with other existing and new data sources. However, the heterogeneity of these data sources in syntax, semantics, and structure represents a great challenge for application designers. The Semantic Web is a collection of standards and technologies that offer solutions for at least the syntactic and some of the structural issues. It offers semantic freedom and flexibility, but this leaves the issue of semantic interoperability.

    In this thesis we present Hera-S, an evolution of the Model Driven Web Engineering (MDWE) method Hera. MDWE methods allow designers to create data-centric applications using models instead of programming. Hera-S especially targets Semantic Web sources and provides a flexible method for designing personalized adaptive Web applications. Hera-S defines several models that together define the target Web application, and we implemented a framework called Hydragen, which is able to execute the Hera-S models to run the desired Web application. Hera-S' core is the Application Model (AM), in which the main logic of the application is defined, i.e. the groups of data elements that form logical units or subunits, the personalization conditions, and the relationships between the units. Hera-S also uses a so-called Domain Model (DM) that describes the content and its structure. However, this DM is not Hera-S specific; instead, any Semantic Web source representation can serve as DM, as long as its content can be queried with the standardized Semantic Web query language SPARQL. The same holds for the User Model (UM). The UM can be used for personalization conditions, but also as a source of user-related content if necessary; in fact, the difference between DM and UM is conceptual, as their implementation within Hydragen is the same. Hera-S also defines a Presentation Model (PM), which defines presentation details of elements such as order and style. To help designers build their Web applications we have introduced a toolset, Hera Studio, which allows the different models to be built graphically. Hera Studio also provides additional functionality such as model checking and deployment of the models in Hydragen.

    Both Hera-S and its implementation Hydragen are designed to be flexible regarding the use of models. To achieve this, Hydragen is a stateless engine that queries the models for relevant information at every page request, which allows the models and data to be changed in the datastore during runtime. We show that one way to exploit this flexibility is by applying aspect-orientation to the AM; aspect-orientation allows us to dynamically inject functionality that pervades the entire application. Another way to exploit Hera-S' flexibility is in reusing specialized components, e.g. for presentation generation. We present a configuration of Hydragen in which we replace our native presentation generation functionality with the AMACONT engine. AMACONT provides more extensive multi-level presentation generation and adaptation capabilities, as well as aspect-orientation and a form of semantics-based adaptation.

    Hera-S was designed to allow the (re-)use of any (Semantic) Web data source. It even opens up the possibility of data integration at the back end, by using an extendible storage layer in our database of choice, Sesame. However, even though this is theoretically possible, much of the actual data integration issue remains. As this is a recurring issue in many domains, and a broader challenge than Hera-S design alone, we decided to look at it in isolation. We present a framework called Relco, which provides a language to express data transformation operations as well as a collection of techniques that can be used to (semi-)automatically find relationships between concepts in different ontologies. This is done with a combination of syntactic, semantic, and collaboration techniques, which together provide strong clues for which concepts are most likely related.

    To prove the applicability of Relco we explore five application scenarios in different domains for which data integration is a central aspect. This includes a cultural heritage portal, Explorer, for which data from several data sources was integrated and made available via a map view, a timeline, and a graph view; Explorer also allows users to provide metadata for objects via a tagging mechanism. Another application is SenSee, an electronic TV guide and recommender: TV-guide data was integrated and enriched with semantically structured data from several sources, and recommendations are computed by exploiting the underlying semantic structure. ViTa was a project in which several techniques for tagging and searching educational videos were evaluated, including scenarios in which user tags are related to an ontology, or to other tags, using the Relco framework. The MobiLife project targeted the facilitation of a new generation of mobile applications that would use context-based personalization; this can be done using a context-based user profiling platform that can also be used for user-model data exchange between mobile applications using technologies like Relco. The final application scenario is from the GRAPPLE project, which targeted the integration of adaptive technology into current learning management systems. A large part of this integration is achieved by using a user modeling component framework in which any application can store user model information, but which can also be used for the exchange of user model data.
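    Relco's syntactic clue-gathering — scoring how closely concept labels from two ontologies resemble each other before semantic and collaborative evidence is added — can be sketched as below. The labels and the ranking shown are illustrative assumptions, not Relco's actual matching pipeline.

```python
# Minimal sketch: rank target-ontology concepts against a source concept by label similarity.
from difflib import SequenceMatcher

source_concept = "TV programme"
target_concepts = ["TelevisionProgram", "RadioShow", "Broadcast", "Programme"]

def similarity(a, b):
    """String similarity in [0, 1] between two concept labels."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()

ranked = sorted(target_concepts, key=lambda c: similarity(source_concept, c), reverse=True)
print([(c, round(similarity(source_concept, c), 2)) for c in ranked])
```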