14 research outputs found

    AgroPortal : a proposition for ontology-based services in the agronomic domain

    Our project is to develop and support a reference ontology repository for the agronomic domain. By reusing the NCBO BioPortal technology, we have already designed and implemented a prototype ontology repository for plants and a few crops. We plan to turn that prototype into a real service for the community. The AgroPortal project aims to reuse the scientific outcomes and experience of the biomedical domain in the context of plant, agronomic and environmental sciences. We will offer an ontology portal that features ontology hosting, search, versioning, visualization and commenting, as well as services for semantically annotating data with the ontologies and for storing and exploiting ontology alignments and data annotations, all within a fully semantic web compliant infrastructure. The main objective of this project is to enable straightforward use of agronomy-related ontologies, sparing data managers and researchers the burden of dealing with complex knowledge engineering issues when annotating research data. The AgroPortal project will pay specific attention to the requirements of the agronomic community and the specificities of the crop domain. We will first focus on the outputs of a few existing driving agronomic use cases related to rice and wheat, with the goal of generalizing to other Crop Ontology related use cases. AgroPortal will offer a robust and stable platform that we anticipate will be highly valued by the community.

    Transformer les Open Data brutes en graphes enrichis en vue d'une intégration dans les systèmes OLAP

    The integration of Open Data into decision systems is hampered by the absence of source schemas, the raw nature of the data, and their semantic and structural heterogeneity. Most existing work studies the integration of Open Data in RDF format, even though only a small share of the available data is published in that format. In contrast, few works address Open Data in raw formats such as Excel, even though these account for more than 90% of the available open data. In this paper, we propose an automatic process that transforms raw Open Data into enriched, exploitable graphs ready for integration. The process is validated by the user and is part of our generic approach for integrating Open Data into multidimensional data warehouses.
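
As a rough illustration of what turning a raw table into a graph can look like, here is a minimal Python sketch. It is not the authors' process. It assumes a very simple layout (first row holds column headers, first column holds row labels), and the function name `table_to_graph`, the node kinds, and the edge labels are all hypothetical choices made for this example.

```python
from typing import Dict, List, Tuple, Union

Cell = Union[str, float, int, None]
Edge = Tuple[str, str, str]  # (source node id, edge label, target node id)


def is_numeric(cell: Cell) -> bool:
    """Return True for int/float cells; booleans and None do not count."""
    return isinstance(cell, (int, float)) and not isinstance(cell, bool)


def table_to_graph(grid: List[List[Cell]]) -> Tuple[Dict[str, dict], List[Edge]]:
    """Build annotated nodes and edges from a simple table.

    Assumed layout (hypothetical): grid[0] holds column headers and the
    first cell of every later row holds a row label.
    """
    nodes: Dict[str, dict] = {}
    edges: List[Edge] = []

    # Annotate every column header as its own node.
    for j, header in enumerate(grid[0][1:], start=1):
        nodes[f"col:{j}"] = {"kind": "column_header", "label": header}

    for i, row in enumerate(grid[1:], start=1):
        # Annotate the row label, then link each numeric cell to its
        # row label and column header.
        nodes[f"row:{i}"] = {"kind": "row_label", "label": row[0]}
        for j, cell in enumerate(row[1:], start=1):
            if is_numeric(cell):
                cell_id = f"cell:{i}:{j}"
                nodes[cell_id] = {"kind": "numeric_data", "value": cell}
                edges.append((cell_id, "has_row", f"row:{i}"))
                edges.append((cell_id, "has_column", f"col:{j}"))
    return nodes, edges


# Tiny made-up table, for illustration only.
grid = [
    ["Region", "2010", "2011"],
    ["North", 12.5, 13.1],
    ["South", 9.0, None],
]
nodes, edges = table_to_graph(grid)
```

Empty cells are skipped rather than turned into nodes, so the resulting graph only carries values that can later be mapped to measures.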

    AgroPortal: a vocabulary and ontology repository for agronomy

    Many vocabularies and ontologies are produced to represent and annotate agronomic data. However, those ontologies are spread out, in different formats, of different sizes, with different structures and from overlapping domains. Therefore, there is a need for a common platform to receive and host them, align them, and enable their use in agro-informatics applications. By reusing the National Center for Biomedical Ontology (NCBO) BioPortal technology, we have designed AgroPortal, an ontology repository for the agronomy domain. The AgroPortal project reuses the biomedical domain’s semantic tools and insights to serve agronomy, but also food, plant, and biodiversity sciences. We offer a portal that features ontology hosting, search, versioning, visualization, comment, and recommendation; enables semantic annotation; stores and exploits ontology alignments; and enables interoperation with the semantic web. AgroPortal specifically satisfies the requirements of the agronomy community in terms of ontology formats (e.g., SKOS vocabularies and trait dictionaries) and supported features (offering detailed metadata and advanced annotation capabilities). In this paper, we present our platform’s content and features, including the additions to the original technology, as well as preliminary outputs of five driving agronomic use cases that participated in the design and orientation of the project to anchor it in the community. By building on the experience and existing technology acquired from the biomedical domain, we can present in AgroPortal a robust and feature-rich repository of great value for the agronomic domain.

    Intégration holistique et entreposage automatique des données ouvertes

    Statistical Open Data provide useful information to feed a decision-making system. Their integration and storage within such systems is achieved through ETL processes. These processes need to be automated to make them accessible to non-experts. They must also cope with the lack of schemas and with the structural and semantic heterogeneity that characterize Open Data. To address these issues, we propose a new ETL approach based on graphs. For extraction, we propose automatic detection and annotation activities, based on a table model, that extract a graph from a table. For transformation, we propose a linear program that solves the holistic matching of structural data from several graphs. This model yields an optimal and unique solution. For loading, we propose a progressive process for defining the multidimensional schema and augmenting the integrated graph. Finally, we present a prototype and experimental evaluations.

    Conceptual Correspondence Monitoring: Multimode Information Logistics Approach

    The paper addresses the problems that arise when conceptual correspondence has to be monitored, i.e., when two or more structures of concepts have a physical or abstract mapping and changes in those structures may introduce changes in the mapping. Monitoring conceptual correspondence usually requires manual, semiautomatic, and automatic information processing and exhibits a high level of complexity. The integration of different types of information processing units can be achieved by the use of multimode information logistics. The paper discusses the challenges of using multimode information logistics in monitoring conceptual correspondence and proposes an approach that helps to partly meet these challenges by jointly using functional and morphological spaces to represent information logistics networks. The proposed approach is illustrated by an example of monitoring conceptual correspondence between knowledge demand and offer in the area of education.

    Ontology-Driven Semantic Data Integration in Open Environment

    Collaborative intelligence in the context of information management can be defined as "a shared intelligence that results from the collaboration between various information systems". In open environments, these collaborating information systems can be heterogeneous, dynamic and loosely coupled. Information systems in an open environment can also possess a certain degree of autonomy. The integration of data residing in various heterogeneous information systems is essential in order to derive this intelligence efficiently and accurately. Because of the heterogeneous, loosely coupled, and dynamic nature of open environments, integration between these information systems at the data level is not efficient. Several approaches and models have been proposed to perform the task of data integration. Many of the existing approaches are designed for closed environments, tightly coupled systems and enterprise data integration. They make explicit or implicit assumptions about the semantic structure of the data. Because of the heterogeneous and loosely coupled nature of open environments, such assumptions do not hold there. Data integration approaches based on models that are extensional in nature are also inadequate for open environments, because they do not account for their dynamic nature. The need for an adequate model for describing data integration systems in open environments is therefore evident. Intensional modeling is found to be an adequate and natural choice, because it addresses the dynamic and loosely coupled nature of open environments. In this work, an intensional model for conceptualization is presented. This model is based on the theory of Properties, Relations and Propositions (PRP). The proposed description takes concepts, relations, and properties as primitive and, as such, irreducible entities. Formal intensional accounts of both Ontology and Ontological Commitment are also proposed in light of the intensional model for conceptualization. An intensional model for ontology-driven mediated data integration in open environments is also proposed. The proposed model accounts for the dynamic nature of open environments and intensionally describes the information of data sources. The interface between global and local ontologies and the formal intensional semantics of query answering are then described.

    Final Report of the ModSysC2020 Working Group - Data, Models and Theories for Complex Systems: new challenges and opportunities

    Final Report of the ModSysC2020 Working Group at University Montpellier 2. At University Montpellier 2, the modeling and simulation of complex systems has been identified as a major scientific challenge and one of the priority axes in interdisciplinary research, with major potential impact on training, the economy and society. Many research groups and laboratories in Montpellier are already working in that direction, but typically in isolation within their own scientific discipline. Several local actions have been initiated to structure the scientific community around interdisciplinary projects, but with little coordination among them. The goal of the ModSysC2020 (modeling and simulation of complex systems in 2020) working group was to analyze the local situation (strengths and weaknesses, current projects), identify the critical research directions and propose concrete actions to be encouraged in terms of research projects, equipment facilities, human resources and training. To guide this analysis, we decomposed the scientific challenge into four main themes for which there is a strong background in Montpellier: (1) modeling and simulation of complex systems; (2) algorithms and computing; (3) scientific data management; (4) production, storage and archiving of data from the observation of natural and biological environments. In this report, for each theme, we introduce the context and motivations, analyze the situation in Montpellier, identify research directions and propose specific actions in terms of interdisciplinary research projects and training. We also provide an analysis of the socio-economic aspects of modeling and simulation through use cases in various domains such as life science and healthcare, environmental science and energy. Finally, we discuss the importance of revisiting student training in fundamental domains such as modeling, computer programming and databases, which are typically taught too late, in specialized master's programs.

    A utilização de dados públicos abertos na construção de um Data Warehouse : a construção de um repositório estatísticas educacionais públicas brasileiras

    Project Work presented as the partial requirement for obtaining a Master's degree in Information Management, specialization in Knowledge Management and Business Intelligence. In the last decade, different countries have developed initiatives related to the open dissemination of government data. Although these databases exist and are available, using them and extracting knowledge from them still presents some challenges related to the integration and harmonization of the information. This is due to the poor structure and great heterogeneity of the sources, which make traditional extraction, transformation, and loading (ETL) approaches less efficient. This manuscript analyzes an approach for building an open data repository based on the structure of flat files, which enables dimensional models to be built more efficiently.

    User satisfaction model to measure open government data usage

    Open government data (OGD) initiatives are launched by governments to promote transparency, social control and citizen participation in policy making. The use of OGD in Malaysia is still at an early stage and faces problems such as low participation, security issues, and lack of awareness. While most research in Information and Communication Technology (ICT) that is underpinned by Expectation Confirmation Theory (ECT) focuses on user satisfaction and the determinants of users’ reuse intention, this study focuses on the direct antecedents of OGD users’ intention to use and its influence on OGD users’ satisfaction, as research on this is still scarce. This research aims to examine the ECT model of users’ satisfaction mediated by the intention to use OGD. The objectives of this research are threefold: (1) to design an integrated ECT and TAM model for explaining OGD satisfaction; (2) to examine the mediating role of citizens’ behavioural intention between expectation, confirmation, perceived performance, incentive on usage, perceived risk and citizens’ satisfaction with OGD; and (3) to validate the impact of incentives on usage and perceived risk in explaining the new ECT model in the OGD context. Data were collected from a sample of 250 OGD users in Malaysia. Empirical evidence was gathered through self-administered questionnaires using a Likert scale. The data were analysed using Partial Least Squares Structural Equation Modelling (PLS-SEM) in order to test the model. The final model was verified by experts in the area. Results revealed that expectation has a significant relationship with confirmation, but perceived performance showed an insignificant relationship with confirmation, which is a unique finding. Additionally, confirmation, expectation, perceived performance, incentive on usage and perceived risk have significant relationships with intention to use OGD. Meanwhile, the analysis showed that intention to use mediates the relationship between confirmation, expectation, perceived performance, incentive on usage and perceived risk on the one hand and satisfaction with the use of OGD on the other. This study suggests that users’ expectations of OGD must be met to create stronger intention and satisfaction. The implications of the study are to improve data service quality, support innovative service development, increase data transparency, and boost potential investment.

    Public Data Integration with WebSmatch

    Integrating open data sources can yield high-value information but raises major problems in terms of metadata extraction, data source integration and visualization of integrated data. In this paper, we describe the demonstration of WebSmatch, a flexible environment for Web data integration, based on a real, end-to-end data integration scenario over public data from Data Publica. WebSmatch supports the full process of importing, refining and integrating data sources and uses third-party tools for high-quality visualization. We use a typical scenario of public data integration which involves problems not solved by current tools: poorly structured input data sources (XLS files) and rich visualization of integrated data.