110 research outputs found

    Modeling Data Lake Metadata with a Data Vault

    With the rise of big data, business intelligence had to find solutions for managing even greater data volumes and variety than in data warehouses, which proved ill-adapted. Data lakes answer these needs from a storage point of view, but require adequate metadata management to guarantee efficient access to data. Starting from a multidimensional metadata model designed for an industrial heritage data lake, which lacks schema evolvability, we propose in this paper to use ensemble modeling, and more precisely a data vault, to address this issue. To illustrate the feasibility of this approach, we instantiate our metadata conceptual model into relational and document-oriented logical and physical models, respectively. We also compare the physical models in terms of metadata storage and query response time.
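
As a rough illustration of the data vault idea mentioned in this abstract, the sketch below builds hubs (business keys), a link (relationships between hubs), and a satellite (descriptive attributes versioned by load date) in SQLite. All table names, column names, and sample rows are hypothetical and are not taken from the paper; it is only a minimal sketch of the general pattern, not the authors' metadata model.

```python
import sqlite3

# Hypothetical data vault layout for data lake metadata (names are assumptions).
SCHEMA = """
CREATE TABLE hub_dataset (dataset_key TEXT PRIMARY KEY, load_date TEXT);
CREATE TABLE hub_source  (source_key  TEXT PRIMARY KEY, load_date TEXT);
CREATE TABLE link_dataset_source (
    dataset_key TEXT REFERENCES hub_dataset(dataset_key),
    source_key  TEXT REFERENCES hub_source(source_key),
    load_date   TEXT,
    PRIMARY KEY (dataset_key, source_key)
);
CREATE TABLE sat_dataset_description (
    dataset_key TEXT REFERENCES hub_dataset(dataset_key),
    load_date   TEXT,
    title       TEXT,
    format      TEXT,
    PRIMARY KEY (dataset_key, load_date)
);
"""


def build_demo():
    """Create an in-memory database with one dataset described twice over time."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.execute("INSERT INTO hub_dataset VALUES ('ds1', '2020-01-01')")
    conn.execute("INSERT INTO hub_source VALUES ('src1', '2020-01-01')")
    conn.execute(
        "INSERT INTO link_dataset_source VALUES ('ds1', 'src1', '2020-01-01')"
    )
    # Satellite rows are appended, never updated: each load adds a new version.
    conn.executemany(
        "INSERT INTO sat_dataset_description VALUES (?, ?, ?, ?)",
        [
            ("ds1", "2020-01-01", "Old title", "csv"),
            ("ds1", "2021-06-01", "New title", "json"),
        ],
    )
    return conn


def latest_description(conn, dataset_key):
    """Return (title, format) from the most recent satellite row, or None."""
    return conn.execute(
        "SELECT title, format FROM sat_dataset_description "
        "WHERE dataset_key = ? ORDER BY load_date DESC LIMIT 1",
        (dataset_key,),
    ).fetchone()
```

In this pattern, new kinds of metadata would typically be added as new satellite tables rather than by altering existing ones, which is one common reading of how a data vault eases schema evolution.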

    Personalizing Interactions with Information Systems

    Personalization constitutes the mechanisms and technologies necessary to customize information access to the end-user. It can be defined as the automatic adjustment of information content, structure, and presentation tailored to the individual. In this chapter, we study personalization from the viewpoint of personalizing interaction. The survey covers mechanisms for information-finding on the web, advanced information retrieval systems, dialog-based applications, and mobile access paradigms. Specific emphasis is placed on studying how users interact with an information system and how the system can encourage and foster interaction. This helps bring out the role of the personalization system as a facilitator which reconciles the user’s mental model with the underlying information system’s organization. Three tiers of personalization systems are presented, paying careful attention to interaction considerations. These tiers show how progressive levels of sophistication in interaction can be achieved. The chapter also surveys systems support technologies and niche application domains.

    Integrating data warehouses with web data : a survey

    This paper surveys the most relevant research on combining Data Warehouse (DW) and Web data. It studies the XML technologies that are currently being used to integrate, store, query, and retrieve Web data and their application to DWs. The paper reviews different DW distributed architectures and the use of XML languages as an integration tool in these systems. It also introduces the problem of dealing with semistructured data in a DW. It studies Web data repositories, the design of multidimensional databases for XML data sources, and the XML extensions of OnLine Analytical Processing techniques. The paper addresses the application of information retrieval technology in a DW to exploit text-rich document collections. The authors hope that the paper will help to discover the main limitations and opportunities that the combination of the DW and Web fields offers, as well as to identify open research lines.

    Graph databases and their application to the Italian Business Register for efficient search of relationships among companies

    We studied and tested three of the major graph databases, and we compared them with a relational database. We worked on a dataset representing equity participations among companies, and we found that the strong points of graph databases are their purpose-built storage techniques and their query languages. The main performance gains were obtained when heavily graph-like situations are queried; for simpler situations and queries, a relational database performs equally well.
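
To make the kind of query discussed in this abstract concrete, the sketch below stores equity participations as rows in a relational table and follows ownership chains with a recursive common table expression in SQLite. The table layout, company names, and shares are invented for illustration and do not come from the Italian Business Register dataset; the abstract does not say which relational queries the authors actually ran.

```python
import sqlite3

# Hypothetical toy data: (owner, owned, share) equity participations.
PARTICIPATIONS = [
    ("A", "B", 0.60),
    ("B", "C", 0.50),
    ("C", "D", 0.30),
    ("A", "E", 0.10),
]


def build_db():
    """Create an in-memory database holding the toy participation edges."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE participation (owner TEXT, owned TEXT, share REAL)"
    )
    conn.executemany(
        "INSERT INTO participation VALUES (?, ?, ?)", PARTICIPATIONS
    )
    return conn


def reachable_companies(conn, root):
    """Return companies that `root` holds directly or indirectly, sorted by name."""
    rows = conn.execute(
        """
        WITH RECURSIVE chain(company) AS (
            SELECT owned FROM participation WHERE owner = ?
            UNION  -- UNION (not UNION ALL) drops duplicates, so cycles terminate
            SELECT p.owned
            FROM participation AS p
            JOIN chain AS c ON p.owner = c.company
        )
        SELECT company FROM chain ORDER BY company
        """,
        (root,),
    ).fetchall()
    return [row[0] for row in rows]
```

Each additional hop in the ownership chain costs the relational engine another self-join step, which is the sort of multi-hop traversal where the abstract reports graph databases pulling ahead.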

    The exploration of a category theory-based virtual Geometrical product specification system for design and manufacturing

    In order to ensure quality of products and to facilitate global outsourcing, almost all the so-called “world-class” manufacturing companies nowadays are applying various tools and methods to maintain the consistency of a product’s characteristics throughout its manufacturing life cycle. Among these, for ensuring the consistency of the geometric characteristics, a tolerancing language, the Geometrical Product Specification (GPS), has been widely adopted to precisely transform the functional requirements from customers into manufactured workpieces expressed as tolerance notes in technical drawings. Although commonly acknowledged by industrial users as one of the most successful efforts in integrating existing manufacturing life-cycle standards, current GPS implementations and software packages suffer from several drawbacks in their practical use, possibly the most significant being the difficulty of inferring the data for the “best” solutions. The problem stems from the foundation of data structures and knowledge-based system design. This indicates that there needs to be a “new” software system to facilitate GPS applications. The presented thesis introduced an innovative knowledge-based system, the VirtualGPS, that provides an integrated GPS knowledge platform based on a stable and efficient database structure with knowledge generation and accessing facilities. The system focuses on solving the intrinsic product design and production problems by acting as a virtual domain expert through translating GPS standards and rules into the forms of computerized expert advice and warnings. Furthermore, this system can be used as a training tool for young and new engineers to understand the huge amount of GPS standards in a relatively “quicker” manner. The thesis started with a detailed discussion of the proposed categorical modelling mechanism, which has been devised based on Category Theory. It provided a unified mechanism for knowledge acquisition and representation, knowledge-based system design, and database schema modelling. As a core part for assessing this knowledge-based system, the implementation of the categorical Database Management System (DBMS) is also presented in this thesis. The focus then moved on to demonstrate the design and implementation of the proposed VirtualGPS system. The tests and evaluations of this system were illustrated in Chapter 6. Finally, the thesis summarized the contributions to knowledge in Chapter 7. After thoroughly reviewing the project, the conclusions reached indicate that the entire VirtualGPS system was designed and implemented to conform to Category Theory and object-oriented programming rules. The initial tests and performance analyses show that the system facilitates the geometric product manufacturing operations and benefits manufacturers and engineers alike, from function design to manufacturing and verification.

    Extending a methodology for migration of the database layer to the cloud considering relational database schema migration to NoSQL

    The advances in Cloud computing and in modern Web applications have raised the need for highly available and scalable distributed databases to accommodate the big data being created and consumed. Along with the explosion in data growth comes the necessity to rapidly evolve databases and schemas to meet user demands for new functionality. Special attention is being paid to the vast amounts of semi-structured and unstructured data, and data management tools should reflect support for these needs. This has led to the development of new Cloud serving systems such as "Not Only" SQL (NoSQL) databases. NoSQL databases were driven by the scalability needs of the big companies, such as Google, Facebook, Amazon, and Yahoo. While the demands of these key players are different from those of small and medium enterprises in terms of scalability, the core problem is the same: storage arrays are not scalable and force you into expensive, forklift upgrades. Given these facts, combined with changes in how IT resources are delivered and consumed through the Cloud computing paradigm, projects adopting NoSQL solutions are no longer mere hype. NoSQL databases are being offered as a service by the big Cloud providers, such as Google, Amazon, and Microsoft, and by smaller vendors as well. In this master thesis we investigate the possibilities and limitations of mapping relational database schemas to NoSQL schemas when migrating the database layer to the Cloud. Based on literature research we provide recommendations and guidelines with regard to schema transformation and discuss the implications at other application architecture layers, such as the business logic and data access layers. We extend an existing data migration tool and methodology to incorporate the migration guidelines and hints. Moreover, we validate our work on a chosen subset of relational and NoSQL databases by using example data from the established TPC-H benchmark.

    A Survey on Mapping Semi-Structured Data and Graph Data to Relational Data

    The data produced by various services should be stored and managed in an appropriate format for gaining valuable knowledge conveniently. This has led to the emergence of various data models, including relational, semi-structured, and graph models. Because mature relational databases built on the relational data model are still predominant in today's market, interest has grown in storing and processing semi-structured data and graph data in relational databases, so that their mature and powerful capabilities can be applied to these various kinds of data. In this survey, we review existing methods for mapping semi-structured data and graph data into relational tables, analyze their major features, and give a detailed classification of those methods. We also summarize the merits and demerits of each method, introduce open research challenges, and present future research directions. With this comprehensive investigation of existing methods and open problems, we hope this survey can motivate new mapping approaches by drawing lessons from each model's mapping strategies, as well as a new research topic: mapping multi-model data into relational tables. Peer reviewed.
