69 research outputs found

    A Survey on Graph Database Management Techniques for Huge Unstructured Data

    Data analysis, data management, and big data have played a major role from both social and business perspectives over the last decade. Graph databases are currently a highly active research topic. A graph database is preferred for handling dynamic and complex relationships in connected data, where it offers better results. Every data element is represented as a node. For example, on a social media site, a person is represented as a node with properties such as name, age, likes, and dislikes, and nodes are connected to one another by relationships (edges). Graph databases are expected to benefit businesses and social networking sites that generate huge volumes of unstructured data, since such Big Data requires proper and efficient computational techniques to handle. This paper reviews existing graph data computational techniques and related research work in order to outline future research directions in graph database management.
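The node-and-edge model described in this abstract can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the class names `Node` and `PropertyGraph`, the `FOLLOWS` relationship, and the sample people are hypothetical and do not come from the surveyed paper or from any particular graph database product.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """A data element, e.g. a person, with free-form properties."""
    node_id: str
    label: str
    properties: dict = field(default_factory=dict)


@dataclass
class PropertyGraph:
    """Minimal in-memory property graph (hypothetical, for illustration)."""
    nodes: dict = field(default_factory=dict)  # node_id -> Node
    edges: list = field(default_factory=list)  # (source_id, relationship, target_id)

    def add_node(self, node_id, label, **properties):
        self.nodes[node_id] = Node(node_id, label, properties)

    def add_edge(self, source_id, relationship, target_id):
        # Both endpoints must already exist as nodes.
        if source_id not in self.nodes or target_id not in self.nodes:
            raise KeyError("both endpoints must be added before the edge")
        self.edges.append((source_id, relationship, target_id))

    def neighbours(self, node_id, relationship):
        """Return the targets reachable from node_id over one relationship type."""
        return [t for s, r, t in self.edges if s == node_id and r == relationship]


# A person is a node; name, age, likes, and dislikes are its properties.
graph = PropertyGraph()
graph.add_node("p1", "Person", name="Alice", age=30, likes=["hiking"], dislikes=["spam"])
graph.add_node("p2", "Person", name="Bob", age=27, likes=["chess"], dislikes=[])
graph.add_edge("p1", "FOLLOWS", "p2")
```

Calling `graph.neighbours("p1", "FOLLOWS")` then walks the relationship directly instead of joining tables, which is the access pattern the abstract attributes to graph databases.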

    A research roadmap towards achieving scalability in model driven engineering

    As Model-Driven Engineering (MDE) is increasingly applied to larger and more complex systems, the current generation of modelling and model management technologies is being pushed to its limits in terms of capacity and efficiency. Additional research and development is imperative in order to enable MDE to remain relevant to industrial practice and to continue delivering its widely recognised productivity, quality, and maintainability benefits. Achieving scalability in modelling and MDE involves being able to construct large models and domain-specific languages in a systematic manner, enabling teams of modellers to construct and refine large models collaboratively, advancing the state of the art in model querying and transformation tools so that they can cope with large models (on the scale of millions of model elements), and providing an infrastructure for efficient storage, indexing, and retrieval of large models. This paper attempts to provide a research roadmap for these aspects of scalability in MDE and to outline directions for work in this emerging research area.

    i-DATAQUEST : a Proposal for a Manufacturing Data Query System Based on a Graph

    During the manufacturing product life cycle, an increasing volume of data is generated and stored in distributed resources. These data are heterogeneous, explicitly and implicitly linked, and may be structured or unstructured. The rapid, exhaustive, and relevant acquisition of information from this data is a major issue for the manufacturing industry. The key challenges in this context are to transform heterogeneous data into a common searchable data model, to allow semantic search, to detect implicit links between data, and to rank results by relevance. To address this issue, the authors propose a query system based on a graph database. The graph is built from all the transformed manufacturing data and is then enriched with explicit and implicit data links. Finally, the enriched graph is queried through an extended query system defined by a knowledge graph. The authors describe a proof of concept to validate the proposal. After a partial implementation of this proof of concept, they obtain acceptable results and identify the effort needed to improve the system response time. Finally, the authors open the discussion on rights management, user profiles and customization, and data updates. Chaire ENSAM-Capgemini sur le PLM du futur

    Raising Time Awareness in Model-Driven Engineering

    The conviction that big data analytics is key to the success of modern businesses is growing deeper, and the mobilisation of companies to adopt it is becoming increasingly important. Big data integration projects enable companies to capture their relevant data, store it efficiently, turn it into domain knowledge, and finally monetize it. In this context, historical data, also called temporal data, is becoming increasingly available and provides the means to analyse the history of applications, discover temporal patterns, and predict future trends. Although most of the data that today's applications deal with is inherently temporal, current approaches, methodologies, and environments for developing these applications do not provide sufficient support for handling time. We envision that Model-Driven Engineering (MDE) would be an appropriate ecosystem for a seamless and orthogonal integration of time into domain modelling and processing. In this paper, we investigate the state of the art in MDE techniques and tools in order to identify the missing bricks for raising time-awareness in MDE and outline research directions in this emerging domain.

    Querying heterogeneous data in an in-situ unified agile system

    Data integration provides a unified view of data by combining different data sources. In today’s multi-disciplinary and collaborative research environments, data is often produced and consumed by various means: multiple researchers operate on the data in different divisions to satisfy various research requirements, using different query processors and analysis tools. This makes data integration a crucial component of any successful data-intensive research activity. The fundamental difficulty is that data is heterogeneous not only in syntax, structure, and semantics, but also in the way it is accessed and queried. We introduce QUIS (QUery In-Situ), an agile query system equipped with a unified query language and a federated execution engine. It is capable of running queries on heterogeneous data sources in an in-situ manner. Its language provides advanced features such as virtual schemas, heterogeneous joins, and polymorphic result set representation. QUIS utilizes a federation of agents to transform a given input query written in its language into a (set of) computation model(s) executable on the designated data sources. Federative query virtualization has the disadvantage that some aspects of a query may not be supported by the designated data sources. QUIS ensures that input queries are always fully satisfied: if the target data sources do not fulfill all of the query requirements, QUIS detects the features that are lacking and complements them in a transparent manner. QUIS provides union and join capabilities over an unbound list of heterogeneous data sources; in addition, it offers solutions for heterogeneous query planning and optimization. In brief, QUIS is intended to mitigate data access heterogeneity through query virtualization, on-the-fly transformation, and federated execution. It offers in-situ querying, agile querying, heterogeneous data source querying, unified execution, late-bound virtual schemas, and remote execution.

    Clock-G: A temporal graph management system with space-efficient storage technique

    IoT applications can be naturally modeled as a graph in which the edges represent the interactions between devices, sensors, and their environment. Thing'in is a platform initiated by Orange. The platform manages a graph of millions of connected and non-connected objects using a commercial graph database. The Thing'in graph is dynamic because IoT devices create temporary connections between each other. Analyzing the history of these connections paves the way to promising new applications such as object tracking, anomaly detection, and forecasting the future behavior of devices. However, existing commercial graph databases are not designed with native temporal support, which limits their usability in such use cases. In this paper, we discuss the design of a temporal graph management system, Clock-G, and introduce a new space-efficient storage technique, δ-Copy+Log. Clock-G is designed by the developers of the Thing'in platform and is currently being deployed into production. It differs from existing temporal graph management systems by adopting the δ-Copy+Log technique, which aims to mitigate the apparent trade-off between the conflicting goals of reducing space usage and accelerating query execution. Our experimental results demonstrate that δ-Copy+Log offers better overall performance than traditional storage methods in terms of space usage and query evaluation time.
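The space-versus-query-time trade-off mentioned in this abstract can be illustrated with a generic snapshot-plus-log store. The sketch below is not the authors' δ-Copy+Log technique, whose internals the abstract does not describe. It only shows the plain idea of keeping periodic full copies of the edge set and replaying a change log between them. The class name `CopyLogStore`, the `snapshot_every` parameter, and the requirement of strictly increasing positive timestamps are all assumptions made for this example.

```python
class CopyLogStore:
    """Generic copy-plus-log history of an edge set (illustrative only)."""

    def __init__(self, snapshot_every=3):
        self.snapshot_every = snapshot_every
        self.snapshots = [(0, set())]  # (timestamp, full copy of the edge set)
        self.log = []                  # (timestamp, op, edge) in time order
        self._current = set()
        self._last_time = 0
        self._ops_since_snapshot = 0

    def apply(self, timestamp, op, edge):
        # Assumption: timestamps are strictly increasing and positive.
        if timestamp <= self._last_time:
            raise ValueError("timestamps must be strictly increasing")
        if op == "add":
            self._current.add(edge)
        elif op == "remove":
            self._current.discard(edge)
        else:
            raise ValueError(f"unknown operation: {op}")
        self._last_time = timestamp
        self.log.append((timestamp, op, edge))
        self._ops_since_snapshot += 1
        # More frequent snapshots cost space but shorten replay at query time.
        if self._ops_since_snapshot >= self.snapshot_every:
            self.snapshots.append((timestamp, set(self._current)))
            self._ops_since_snapshot = 0

    def edges_at(self, timestamp):
        """Rebuild the edge set as it was at the given time."""
        if timestamp < 0:
            raise ValueError("timestamp must be non-negative")
        # Start from the latest snapshot taken at or before the requested time.
        base_time, base = max(
            (s for s in self.snapshots if s[0] <= timestamp), key=lambda s: s[0]
        )
        state = set(base)
        # Replay only the log entries between that snapshot and the requested time.
        for t, op, edge in self.log:
            if base_time < t <= timestamp:
                if op == "add":
                    state.add(edge)
                else:
                    state.discard(edge)
        return state


store = CopyLogStore(snapshot_every=2)
store.apply(1, "add", ("sensor-a", "gateway-1"))
store.apply(2, "add", ("sensor-b", "gateway-1"))
store.apply(3, "remove", ("sensor-a", "gateway-1"))
store.apply(4, "add", ("sensor-c", "gateway-2"))
```

With `snapshot_every=2`, a query such as `store.edges_at(3)` starts from the copy taken at time 2 and replays a single log entry. A larger interval would store fewer copies but replay more entries per query.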

    Graph database management systems: storage, management and query processing

    The proliferation of graph data, generated from diverse sources, has given rise to many research efforts concerning graph analysis. Interactions in social networks, publication networks, protein networks, software code dependencies, and transportation systems are all examples of graph-structured data originating from a variety of application domains and demonstrating different characteristics. In recent years, graph database management systems (GDBMS) have been introduced for the management and analysis of graph data. Motivated by the growing number of real-life applications making use of graph database systems, this thesis focuses on the effectiveness and efficiency aspects of such systems. Specifically, we study the following topics relevant to graph database systems: (i) modeling large-scale applications in GDBMS; (ii) storage and indexing issues in GDBMS; and (iii) efficient query processing in GDBMS. In this thesis, we adopt two different application scenarios to examine how graph database systems can model complex features and perform relevant queries on each of them. Motivated by the popular application of social network analytics, we selected Twitter, a microblogging platform, to conduct our detailed analysis. Addressing limitations of existing models, we propose a data model for the Twittersphere that proactively captures Twitter-specific interactions. We examine the feasibility of running analytical queries on GDBMS and offer empirical analysis of the performance of the proposed approach. Next, we consider a use case of modeling software code dependencies in a graph database system, and investigate how these systems can support capturing the evolution of a codebase over time. We study a code comprehension tool that extracts software dependencies and stores them in a graph database.
    On a versioned graph built using a very large codebase, we demonstrate how existing code comprehension queries can be efficiently processed and also show the benefit of running queries across multiple versions. Another important aspect of this thesis is the study of storage aspects of graph systems. Throughput of many graph queries can be significantly affected by disk I/O performance; therefore, graph database systems need to focus on effective graph storage for optimising disk operations. We observe that the locality of edges plays an important role, and we address the edge-labeling problem, which aims to label both incoming and outgoing edges of a graph so as to maximize the ‘edge-consecutiveness’ metric. By achieving a better layout and locality of edges on disk, we show that our proposed algorithms result in significantly improved disk I/O performance, leading to faster execution of neighbourhood queries. Some applications require the integrated processing of queries from the graph and textual domains within a graph database system. Combining these dimensions facilitates gaining key insights in several application scenarios. For example, in a social network setting, one may want to find the closest k users in the network (graph traversal) who talk about a particular topic A (textual search). Motivated by such practical use cases, in this thesis we study the top-k social-textual ranking query, which essentially requires an efficient combination of a keyword search query with a graph traversal. We propose algorithms that leverage graph partitioning techniques, based on the premise that socially close users will be placed within the same partition, allowing more localised computations. We show that our proposed approaches achieve significantly better results than standard baselines and demonstrate robust behaviour under changing parameters.
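The top-k social-textual query in this abstract combines a graph traversal with a keyword search. A naive baseline can be sketched as a breadth-first search that keeps the first k matching users. This is not the partition-based algorithm the thesis proposes, only the straightforward approach such algorithms would be compared against. The function name `top_k_social_textual`, the adjacency-list and post dictionaries, and the sample users are hypothetical.

```python
from collections import deque


def top_k_social_textual(friends, posts, source, keyword, k):
    """Return up to k (user, hop_distance) pairs closest to source whose posts mention keyword.

    Breadth-first search visits users in non-decreasing hop distance, so the
    first k matches are the k closest; ties are broken by traversal order.
    """
    keyword = keyword.lower()
    visited = {source}
    queue = deque([(source, 0)])
    results = []
    while queue and len(results) < k:
        user, distance = queue.popleft()
        # Textual part: a plain substring match over the user's posts.
        if user != source and any(keyword in p.lower() for p in posts.get(user, [])):
            results.append((user, distance))
        # Graph part: expand to unvisited neighbours.
        for neighbour in friends.get(user, []):
            if neighbour not in visited:
                visited.add(neighbour)
                queue.append((neighbour, distance + 1))
    return results


friends = {
    "ann": ["bob", "cat"],
    "bob": ["ann", "dan"],
    "cat": ["ann", "eve"],
    "dan": ["bob"],
    "eve": ["cat"],
}
posts = {
    "bob": ["I love jazz"],
    "cat": ["Graph databases are fun"],
    "dan": ["graph theory notes"],
    "eve": ["Graph queries"],
}
closest = top_k_social_textual(friends, posts, "ann", "graph", 2)
```

Because this baseline may scan a large part of the network before finding k matches, restricting the search to a partition of socially close users, as the abstract suggests, is a natural way to localise the computation.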

    Semantic representation of engineering knowledge:pre-study


    Development of a BIM-enabled software tool for facility management using interactive floor plans, graph-based data management and granular information retrieval

    Since its very conception, Building Information Modeling (BIM) has incorporated the notion of using digital models, rich in geometric and semantic information, throughout the whole life cycle of a building. The creation of these models is a process that requires much effort, is split by discipline, is executed by different parties, and is brought together under difficult collaboration. In reality, however, the effective utilization of the BIM process ends with the conclusion of the construction project. The subsequent Operation & Management (O&M) phase makes little to no use of the information contained in these files, although it would be a valuable resource to boost productivity. The Facility Management (FM) phase in particular suffers from great inefficiency caused by challenges of data management and outside advances in digitization. Research suggests that BIM is able to provide benefits for processes in FM and O&M related tasks and increase their overall efficiency, but previous attempts to introduce BIM software have remained fruitless. We argue that current solutions have failed to meet the expectations and requirements of the FM community, which generally lacks expertise in working with CAD-like software. Instead, this thesis presents a concept which puts interactive, two-dimensional floor plans at the center of a possible BIM-enabled FM software tool. These floor plans are directly derived from BIM models and maintain linkage to all relevant semantic data, which is stored in a graph database. Users are able to navigate rooms, equipment, and themselves on the floor plans. Further information about rooms can be accessed through 360° photospheres, enabling remote exploration and conception, and through a room-specific 3D model. The latter is generated beforehand and follows the underlying concept that FM seldom requires a holistic view of the whole building but instead a cross section of many different domain models, tied together by a specific location.
    Based on the mentioned features and concepts, a prototypical web application is developed in order to investigate the feasibility and effectiveness of the proposed solution.

    An evaluation of the challenges of Multilingualism in Data Warehouse development

    In this paper, we discuss Business Intelligence and define what is meant by support for Multilingualism in a Business Intelligence reporting context. We identify support for Multilingualism as a challenging issue which has implications for data warehouse design and reporting performance. Data warehouses are a core component of most Business Intelligence systems, and the star schema is the approach most widely used to develop data warehouses and dimensional Data Marts. We discuss the way in which Multilingualism can be supported in the Star Schema and identify that current approaches have serious limitations, which include data redundancy as well as data manipulation, performance, and maintenance issues. We propose a new approach to enable the optimal application of multilingualism in Business Intelligence. The proposed approach was found to produce satisfactory results when used in a proof-of-concept environment. Future work will include testing the approach in an enterprise environment.