8 research outputs found

    Ontology Pattern-Based Data Integration

    Data integration is concerned with providing unified access to data residing at multiple sources. Such unified access is realized by having a global schema and a set of mappings between the global schema and the local schemas of each data source, which specify how user queries at the global schema can be translated into queries at the local schemas. Data sources are typically developed and maintained independently, and are thus highly heterogeneous. This causes difficulties in integration because of the lack of interoperability in terms of architecture, data format, and the syntax and semantics of the data. This dissertation presents a study on how small, self-contained ontologies, called ontology design patterns, can be employed to provide semantic interoperability in a cross-repository data integration system. The idea of this so-called ontology pattern-based data integration is that a collection of ontology design patterns can act as a global schema that still contains sufficient semantics, but is also flexible and simple enough to be used by linked data providers. On the one side, this differs from existing ontology-based solutions, which are based on large, monolithic ontologies that provide very rich semantics but enforce overly restrictive ontological choices, and hence are shunned by many data providers. On the other side, this also differs from purely linked-data-based solutions, which do offer simplicity and flexibility in data publishing, but too little in terms of semantic interoperability. We demonstrate the feasibility of this idea through the actual development of a large-scale data integration project involving seven ocean science data repositories from five institutions in the U.S. In addition, we make two contributions as part of this dissertation work, which also play crucial roles in the aforementioned data integration project.
First, we develop a collection of more than a dozen ontology design patterns that capture the key notions in ocean science occurring in the participating data repositories. These patterns contain axiomatizations of the key notions and were developed with intensive involvement from domain experts. Modeling of the patterns was done in a systematic workflow to ensure modularity, reusability, and flexibility of the whole pattern collection. Second, we propose so-called pattern views, which allow data providers to publish their data in a very simple intermediate schema, and show that they can greatly assist data providers in publishing their data without requiring a thorough understanding of the axiomatization of the patterns.

    Reducing Adversarial Vulnerability through Adaptive Training Batch Size

    Neural networks generalize well to the data distribution, to the extent that they are capable of fitting randomly labeled data. But they are also known to be extremely sensitive to adversarial examples. Batch Normalization (BatchNorm), a very common part of deep learning architectures, has been found to increase adversarial vulnerability. Fixup Initialization (Fixup Init) has been shown to be an alternative to BatchNorm that can considerably strengthen networks against adversarial examples. This robustness can be improved further by employing a smaller batch size in training. The latter, however, comes with a tradeoff in the form of a significant increase in training time (up to ten times longer when reducing the batch size from the default 128 to 8 for ResNet-56). In this paper, we propose a workaround to this problem: starting the training with a small batch size and gradually increasing it during training. We empirically show that our proposal can still improve the adversarial robustness (by up to 5.73%) of ResNet-56 with Fixup Init and the default batch size of 128. At the same time, our proposal keeps the training time considerably shorter (only 4 times longer, instead of 10 times).
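The idea of starting with a small batch size and growing it during training can be sketched as a simple schedule. This is only an illustration: the abstract does not say how or when the batch size is increased, so the doubling rule, the equal-length stages, and the function name `batch_size_schedule` are all assumptions. Only the endpoints 8 and 128 come from the abstract.

```python
def batch_size_schedule(epoch, total_epochs, start=8, end=128):
    """Return the batch size to use at a given epoch.

    Assumption (not from the paper): the batch size doubles from
    `start` to `end` over equal-length stages of training.
    """
    # Build the list of stage sizes, e.g. 8, 16, 32, 64, 128.
    sizes = []
    size = start
    while size < end:
        sizes.append(size)
        size *= 2
    sizes.append(end)

    # Split training into equal-length stages, one per batch size.
    stage_len = max(1, total_epochs // len(sizes))
    stage = min(epoch // stage_len, len(sizes) - 1)
    return sizes[stage]


# Hypothetical 10-epoch run: one batch size per epoch.
schedule = [batch_size_schedule(e, total_epochs=10) for e in range(10)]
print(schedule)
```

In a training loop, the data loader would be rebuilt with the returned batch size at the start of each stage; early epochs are slow but use small batches, while later epochs recover most of the speed of the default setting.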

    Challenges, Techniques, and Trends of Simple Knowledge Graph Question Answering: A Survey

    Simple questions are the most common type of question used for evaluating knowledge graph question answering (KGQA) systems. A simple question is a question whose answer can be captured by a factoid statement with one relation or predicate. KGQA systems aim to automatically answer natural language questions (NLQs) over knowledge graphs (KGs). A variety of research with different approaches exists in this area. However, a comprehensive study that addresses simple questions from all aspects is noticeably lacking. In this paper, we present a comprehensive survey of simple question answering that classifies the available techniques and compares their advantages and drawbacks, in order to provide better insight into existing issues and recommendations to direct future work.
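To make the definition of a simple question concrete, the sketch below answers one by a single lookup of a (subject, predicate) pair in a toy knowledge graph. The graph contents, the predicate name `capital_of`, and the function `answer_simple_question` are hypothetical; a real KGQA system would first have to detect the entity and the relation in the natural language question, a step that is assumed to be already done here.

```python
# Toy knowledge graph: (subject, predicate) -> set of objects.
# The contents are illustrative only.
KG = {
    ("Paris", "capital_of"): {"France"},
    ("France", "official_language"): {"French"},
}


def answer_simple_question(subject, predicate, kg=KG):
    """Answer a simple question with a single-triple lookup.

    Returns the set of objects for (subject, predicate), or an
    empty set if the knowledge graph holds no such fact.
    """
    return kg.get((subject, predicate), set())


# "Which country is Paris the capital of?" after entity and
# relation detection becomes the pair ("Paris", "capital_of").
print(answer_simple_question("Paris", "capital_of"))
```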

    A Non-Uniform Continuous Cellular Automata for Analyzing and Predicting the Spreading Patterns of COVID-19

    During the COVID-19 outbreak, modeling the spread of infectious diseases became a challenging research topic due to the disease's rapid spread and high mortality rate. The main objective of a standard epidemiological model is to estimate, by mathematical modeling, the numbers of infected, suspected, and recovered individuals. Such a model does not capture how the disease is transmitted between neighboring regions through interaction. A more general framework such as Cellular Automata (CA) is required to accommodate more complex spatial interaction within the epidemiological model. The critical issue in modeling the spread of diseases is how to reduce the prediction error. This research aims to formulate the influence of neighborhood interaction on the spreading pattern of COVID-19 using a neighborhood frame model in a CA approach, and to obtain a predictive model for the COVID-19 spread with reduced error. We propose a non-uniform continuous CA (N-CCA) as our contribution to demonstrate the influence of interactions on the spread of COVID-19. The model succeeds in demonstrating the influence of the interaction between regions on the COVID-19 spread, as represented by the coefficients obtained from multiple regression models. Each coefficient represents the behavior of the population in a cell interacting with its neighborhood, and influences the number of cases that occur the next day. The N-CCA model is evaluated using the root mean square error (RMSE) of the difference between the predicted and actual number of cases per cell in each region. This study demonstrates that this approach improves prediction accuracy for 14 days into the future using data points from the past 42 days, compared to a baseline model.
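The RMSE evaluation mentioned above can be sketched as follows. The function computes the standard root mean square error between predicted and actual case counts; the sample numbers are made up for illustration and are not results from the study.

```python
import math


def rmse(predicted, actual):
    """Root mean square error between two equal-length sequences."""
    if len(predicted) != len(actual) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(p - a) ** 2 for p, a in zip(predicted, actual)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Hypothetical daily case counts for one cell (not data from the paper).
predicted_cases = [12, 15, 20]
actual_cases = [10, 15, 23]
print(rmse(predicted_cases, actual_cases))
```

Computing this value per cell and comparing it against the same quantity for a baseline model gives the kind of comparison the abstract describes.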

    Consuming and Reusing Semantic Geoscience Data

    Semantic Technologies are becoming commonplace in the geoscience community. Within this collection of tools, techniques, and methodologies, the ontology is a basic building block. Yet, despite the interest and uptake in semantics, there still exist several challenges to consuming semantic data and reusing existing ontologies. One challenge is that ontologies can be created by varying means (manual vs. automated), varying methodologies (e.g. the Fox and McGuinness method [1]), and to varying levels of domain and logical expressiveness. Ultimately, the goal is widespread uptake and reuse of ontologies. Yet, attempts to describe an entire domain within an ontology have led to difficulties in reuse both within the geosciences and the broader Semantic Web community. It has become apparent over the past few years that common conceptual patterns are repeated in ontologies emerging from different communities and domains. Analogous to using design patterns to create software, the study of Ontology Design Patterns (ODPs) [2,3] advocates the reuse of small modular ontologies as opposed to large ontologies describing full domain areas. This modularizing of ontologies into reusable patterns (ODPs) enhances reuse and simplifies interoperability issues [4]. Not surprisingly, Linked Data [5, 6], which is based on data published and consumed against ontology schemas, has also not seen as much consumption as would be liked. Recent research [7] has shown that ODPs can also be beneficial in facilitating more, and easier, Linked Data consumption. The ODP research area is new and the basic benefits of ODPs are just now beginning to be validated in the broader Semantic Web community. At present, limited validation within the geosciences has occurred. Linked Data and ontologies are at the heart of the ESIP's Semantic Web Committee’s Strategic Vision and Road Map.
This ESIP-funded testbed project will provide crucial initial feedback regarding the benefits of ODPs in geoscience data publication and consumption.
[1] http://tw.rpi.edu/web/doc/TWC_SemanticWebMethodology
[2] E. Blomqvist and K. Sandkuhl. Patterns in ontology engineering: Classification of ontology patterns. In ICEIS 2005, Proceedings of the Seventh International Conference on Enterprise Information Systems, Miami, USA, May 25-28, 2005, pages 413–416, 2005.
[3] A. Gangemi. Ontology design patterns for semantic web content. In Y. Gil, E. Motta, V. R. Benjamins, and M. A. Musen, editors, The Semantic Web – ISWC 2005, 4th International Semantic Web Conference, ISWC 2005, Galway, Ireland, November 6-10, 2005, Proceedings, volume 3729 of Lecture Notes in Computer Science, pages 262–276. Springer, 2005. doi:10.1007/11574620_21.
[4] E. Blomqvist, P. Hitzler, K. Janowicz, A. Krisnadhi, T. Narock, and M. Solanki. Considerations regarding ontology design patterns. Semantic Web, 7(1), 2016.
[5] T. Berners-Lee. Linked data: Design issues, 2006. http://www.w3.org/DesignIssues/LinkedData.html.
[6] T. Heath and C. Bizer. Linked Data: Evolving the Web into a Global Data Space. Morgan & Claypool, 2011.
[7] V. Rodriguez-Doncel, A. A. Krisnadhi, P. Hitzler, M. Cheatham, N. Karima, and R. Amini. Pattern-based Linked Data publication: The Linked Chess Dataset case. In O. Hartig, J. Sequeda, and A. Hogan, editors, Proceedings of the 6th International Workshop on Consuming Linked Data co-located with 14th International Semantic Web Conference (ISWC 2015), Bethlehem, Pennsylvania, US, October 12th, 2015, volume 1426 of CEUR Workshop Proceedings, 2015.

    Preface
