316 research outputs found

    Graph Processing in Main-Memory Column Stores

    Increasingly, both novel and traditional business applications leverage the advantages of a graph data model, such as schema flexibility and an explicit representation of relationships between entities. As a consequence, companies are confronted with the challenge of storing, manipulating, and querying terabytes of graph data for enterprise-critical applications. Although these business applications operate on graph-structured data, they still require direct access to the relational data and typically rely on an RDBMS as a single source of truth and access. Existing solutions that perform graph operations on business-critical data either use a combination of SQL and application logic or employ a dedicated graph data management system. For the first approach, relying solely on SQL results in poor execution performance caused by the functional mismatch between typical graph operations and the relational algebra. To make matters worse, graph algorithms exhibit a tremendous variety in structure and functionality, caused by their often domain-specific implementations, and can therefore hardly be integrated into a database management system other than through custom coding. Since the majority of these enterprise-critical applications run exclusively on relational DBMSs, employing a specialized system for storing and processing graph data is typically not sensible. Besides the maintenance overhead of keeping the systems in sync, combining graph and relational operations is hard to realize, as it requires data transfer across system boundaries. Traversal operations are a basic ingredient of graph queries and algorithms and a fundamental component of any database management system that aims to store, manipulate, and query graph data. Well-established graph traversal algorithms are standalone implementations relying on optimized data structures. Integrating graph traversals as an operator into a database management system requires tight integration with the existing database environment and the development of new components, such as a graph topology-aware optimizer with accompanying graph statistics, graph-specific secondary index structures to speed up traversals, and a graph query language. In this thesis, we introduce and describe GRAPHITE, a hybrid graph-relational data management system. GRAPHITE is a performance-oriented graph data management system built into an RDBMS, allowing graph data and relational data to be processed seamlessly in the same system. We propose a columnar storage representation for graph data to leverage the existing, mature data management and query processing infrastructure of relational database management systems. At the core of GRAPHITE we propose an execution engine based solely on set operations and graph traversals. Our design is driven by the observation that different graph topologies impose different algorithmic requirements on the design of a graph traversal operator. We derive two graph traversal implementations targeting the most common graph topologies and demonstrate how graph-specific statistics can be leveraged to select the optimal physical traversal operator. To accelerate graph traversals, we devise a set of graph-specific, updatable secondary index structures to improve the performance of vertex neighborhood expansion. Finally, we introduce a domain-specific language with an intuitive programming model to extend graph traversals with custom application logic at runtime. We use the LLVM compiler framework to generate efficient code that tightly integrates the user-specified application logic with our highly optimized built-in graph traversal operators. Our experimental evaluation shows that GRAPHITE can outperform native graph management systems by several orders of magnitude while providing all the features of an RDBMS, such as transaction support, backup and recovery, and security and user management, making it a promising alternative to specialized graph management systems that lack many of these features and require expensive data replication and maintenance processes.
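    The abstract describes an execution engine built solely from set operations and graph traversals over a columnar graph representation, with secondary index structures to speed up vertex neighborhood expansion. The minimal Python sketch below illustrates that general idea only; it is not GRAPHITE's implementation, and the class and method names (ColumnarGraph, expand, traverse) are illustrative assumptions.

```python
# Minimal sketch: a column-oriented edge table plus a set-based, level-synchronous
# traversal. Illustrative only; not GRAPHITE's actual storage layout or operators.
from collections import defaultdict

class ColumnarGraph:
    def __init__(self, sources, targets):
        # Two parallel columns: sources[i] -> targets[i] is one directed edge.
        self.sources = list(sources)
        self.targets = list(targets)
        # Secondary index on the source column to speed up neighborhood expansion,
        # analogous in spirit to a graph-specific secondary index structure.
        self.index = defaultdict(list)
        for pos, v in enumerate(self.sources):
            self.index[v].append(pos)

    def expand(self, frontier):
        """Return the set of vertices reachable in one step from the frontier."""
        return {self.targets[pos] for v in frontier for pos in self.index[v]}

    def traverse(self, start, max_hops=None):
        """Traversal expressed purely as set operations (union, difference)."""
        visited, frontier, hops = {start}, {start}, 0
        while frontier and (max_hops is None or hops < max_hops):
            frontier = self.expand(frontier) - visited  # set difference prunes revisits
            visited |= frontier
            hops += 1
        return visited

if __name__ == "__main__":
    g = ColumnarGraph(sources=[0, 0, 1, 2, 3], targets=[1, 2, 3, 3, 4])
    print(g.traverse(0))  # {0, 1, 2, 3, 4}
```

    Expressing the traversal as repeated set difference and union is what allows the same frontier logic to reuse generic set operators rather than bespoke per-algorithm code.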

    Property-Oriented Conversion of Relational to Graph Databases (Konverzija relacijskih u grafovske baze podataka orijentirana na svojstva)

    Analysis of data stored in a graph enables the discovery of information that can be hard to see when the data are stored in another model (e.g. relational). However, the vast majority of data in information systems today is stored in relational databases, which have dominated the data management field over the last decades. In spite of the rise of NoSQL technologies, the development of new information systems is still mostly based on relational databases. Given the increasing awareness of the benefits of data analysis, as well as current research interest in graph mining techniques, we aim to enable the use of those techniques on relational data. To that end, we propose a universal relational-to-graph data conversion algorithm which can be used to prepare data for graph mining analysis. Our approach leverages the property graph model, which is the model predominantly used by graph databases, while maintaining the clarity of the relational data.
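    The abstract proposes a universal relational-to-graph conversion based on the property graph model. The snippet below is a minimal Python sketch of one way such a conversion can map tables to labeled vertices and foreign keys to edges; it does not reproduce the paper's algorithm, and the helper relational_to_property_graph and its input format are assumptions made for illustration.

```python
# Minimal sketch of a relational-to-property-graph conversion, assuming rows
# become labeled vertices with column values as properties and foreign keys
# become edges. Illustrative only; not the paper's universal algorithm.

def relational_to_property_graph(tables, foreign_keys):
    """
    tables: {table_name: [row_dict, ...]} where each row_dict has an 'id' key.
    foreign_keys: [(child_table, fk_column, parent_table), ...]
    Returns (vertices, edges) in a simple property-graph form.
    """
    vertices, edges = {}, []
    for table, rows in tables.items():
        for row in rows:
            vid = (table, row["id"])
            # Keep every column value as a vertex property for clarity.
            vertices[vid] = {"label": table, "properties": dict(row)}
    for child, fk_col, parent in foreign_keys:
        for row in tables[child]:
            if row.get(fk_col) is not None:
                edges.append(((child, row["id"]), fk_col, (parent, row[fk_col])))
    return vertices, edges

if __name__ == "__main__":
    tables = {
        "person": [{"id": 1, "name": "Ana"}, {"id": 2, "name": "Ivo"}],
        "order": [{"id": 10, "person_id": 1, "total": 99.5}],
    }
    fks = [("order", "person_id", "person")]
    vertices, edges = relational_to_property_graph(tables, fks)
    print(len(vertices), edges)  # 3 [(('order', 10), 'person_id', ('person', 1))]
```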

    Coastal management and adaptation: an integrated data-driven approach

    Coastal regions are some of the most exposed to environmental hazards, yet the coast is the preferred settlement site for a high percentage of the global population, and most major global cities are located on or near the coast. This research adopts a predominantly anthropocentric approach to the analysis of coastal risk and resilience, centred on the pervasive hazards of coastal flooding and erosion. Coastal management decision-making practices are shown to be reliant on access to current and accurate information. However, constraints have been imposed on information flows between scientists, policy makers and practitioners, due to a lack of awareness and utilisation of available data sources. This research seeks to tackle this issue by evaluating how innovations in the use of data and analytics can be applied to further the application of science within decision-making processes related to coastal risk adaptation. In achieving this aim, a range of research methodologies has been employed, and the progression of topics covered marks a shift from themes of risk to resilience. The work focuses on a case study region of East Anglia, UK, benefiting from the input of a partner organisation responsible for the region's coasts: Coastal Partnership East. An initial review revealed how data can be utilised effectively within coastal decision-making practices, highlighting scope for applying advanced Big Data techniques to the analysis of coastal datasets. The process of risk evaluation has been examined in detail, and the range of possibilities afforded by open source coastal datasets is revealed. Subsequently, open source coastal terrain and bathymetric point cloud datasets were identified for 14 sites within the case study area. These were then utilised within a practical application of a geomorphological change detection (GCD) method. This revealed how analysis of high spatial and temporal resolution point cloud data can accurately reveal and quantify physical coastal impacts. Additionally, the research reveals how data innovations can facilitate adaptation through insurance; more specifically, how the use of empirical evidence in pricing coastal flood insurance can result in both communication and distribution of risk. The various strands of knowledge generated throughout this study reveal how an extensive range of data types, sources, and advanced forms of analysis can together allow coastal resilience assessments to be founded on empirical evidence. This research serves to demonstrate how the application of advanced data-driven analytical processes can reduce the levels of uncertainty and subjectivity inherent within current coastal environmental management practices. Adoption of the methods presented within this research could further the possibilities for sustainable and resilient management of the incredibly valuable environmental resource which is the coast.
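    The abstract mentions applying a geomorphological change detection (GCD) method to high-resolution terrain and bathymetric point cloud data. The Python sketch below shows, under stated assumptions, the core DEM-of-difference step of such an analysis: differencing two gridded survey surfaces and applying a minimum level of detection before estimating volumetric change. It is illustrative only and does not reproduce the thesis workflow; the function name, threshold value, and simulated data are assumptions, and gridding of the point clouds is assumed to have been done already.

```python
# Minimal sketch of DEM-of-difference change detection between two survey epochs,
# with a simple level-of-detection threshold to suppress survey noise.
import numpy as np

def dem_of_difference(dem_t1, dem_t2, cell_area, min_lod=0.2):
    """
    dem_t1, dem_t2: 2D arrays of gridded elevations (m) from two survey epochs.
    cell_area: area of one grid cell (m^2).
    min_lod: minimum level of detection (m); smaller changes are treated as noise.
    Returns net volumetric change (m^3): positive = deposition, negative = erosion.
    """
    dod = dem_t2 - dem_t1
    significant = np.where(np.abs(dod) >= min_lod, dod, 0.0)
    return float(significant.sum() * cell_area)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    before = rng.normal(5.0, 0.05, size=(100, 100))  # synthetic baseline surface
    after = before.copy()
    after[:20, :] -= 0.5                             # simulated erosion in one strip
    print(f"net change: {dem_of_difference(before, after, cell_area=1.0):.1f} m^3")
```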

    Urban Informatics

    This open access book is the first to systematically introduce the principles of urban informatics and its application to every aspect of the city that involves its functioning, control, management, and future planning. It introduces new models and tools being developed to understand and implement these technologies that enable cities to function more efficiently – to become 'smart' and 'sustainable'. The smart city has quickly emerged as computers have become ever smaller to the point where they can be embedded into the very fabric of the city, as well as being central to new ways in which the population can communicate and act. When cities are wired in this way, they have the potential to become sentient and responsive, generating massive streams of 'big' data in real time as well as providing immense opportunities for extracting new forms of urban data through crowdsourcing. This book offers a comprehensive review of the methods that form the core of urban informatics, from various kinds of urban remote sensing to new approaches to machine learning and statistical modelling. It provides a detailed technical introduction to the wide array of tools information scientists need to develop the key urban analytics that are fundamental to learning about the smart city, and it outlines ways in which these tools can be used to inform design and policy so that cities can become more efficient, with a greater concern for environment and equity.

    SciTech News Volume 71, No. 2 (2017)

    Columns and Reports
        From the Editor 3
    Division News
        Science-Technology Division 5
        Chemistry Division 8
        Engineering Division 9
        Aerospace Section of the Engineering Division 12
        Architecture, Building Engineering, Construction and Design Section of the Engineering Division 14
    Reviews
        Sci-Tech Book News Reviews 16
    Advertisements
        IEEE

    Big Data Management for Cloud-Enabled Geological Information Services


    Project Blue Ocean report

    A project was undertaken for Synlait Milk Limited in partial fulfilment of the Master of Engineering Management degree at the University of Canterbury. Project Blue Ocean aimed to discover opportunities for adopting new technology to improve the business operation. The project was initiated to drive the Manufacturing Excellence framework, which rests on three strong pillars: Safety, Reliability and People. The project began with the discovery of current issues, focused mainly on manual handling (critical-risk activities) and repetitive, low-value tasks. Technology solutions were generated for each issue, and a high-level concept study was developed for each of the top three solutions. Design Thinking methodology was applied throughout the project to understand the problems, define the underlying issues, generate unconstrained technology ideas, and prototype the most feasible solution. Justification methods such as the NTCP Diamond Model, the Total Application Model and the Technology Category Model were combined to create an evaluation matrix to identify the top three technology solutions: Vacuum System at Fluid Bed, Collaborative Robots and Fob Key Integration. Preliminary economic evaluations and recommendation plans were made, based on a high-level concept study of each solution.
