
    Graph Processing in Main-Memory Column Stores

    Increasingly, both novel and traditional business applications leverage the advantages of a graph data model, such as schema flexibility and an explicit representation of relationships between entities. As a consequence, companies are confronted with the challenge of storing, manipulating, and querying terabytes of graph data for enterprise-critical applications. Although these business applications operate on graph-structured data, they still require direct access to the relational data and typically rely on an RDBMS to keep a single source of truth and access. Existing solutions performing graph operations on business-critical data either use a combination of SQL and application logic or employ a graph data management system. For the first approach, relying solely on SQL results in poor execution performance caused by the functional mismatch between typical graph operations and the relational algebra. Worse, graph algorithms exhibit a tremendous variety in structure and functionality, caused by their often domain-specific implementations, and can therefore hardly be integrated into a database management system other than with custom coding. For the second approach, since the majority of these enterprise-critical applications run exclusively on relational DBMSs, employing a specialized system for storing and processing graph data is typically not sensible. Besides the maintenance overhead for keeping the systems in sync, combining graph and relational operations is hard to realize as it requires data transfer across system boundaries. Traversal operations are a basic ingredient of graph queries and algorithms and a fundamental component of any database management system that aims to store, manipulate, and query graph data. Well-established graph traversal algorithms are standalone implementations relying on optimized data structures.
Integrating graph traversals as an operator into a database management system requires tight integration into the existing database environment and the development of new components, such as a graph topology-aware optimizer and accompanying graph statistics, graph-specific secondary index structures to speed up traversals, and an accompanying graph query language. In this thesis, we introduce and describe GRAPHITE, a hybrid graph-relational data management system. GRAPHITE is a performance-oriented graph data management system built as part of an RDBMS, which allows graph processing to be combined seamlessly with relational data in the same system. We propose a columnar storage representation for graph data to leverage the existing, mature data management and query processing infrastructure of relational database management systems. At the core of GRAPHITE we propose an execution engine based solely on set operations and graph traversals. Our design is driven by the observation that different graph topologies impose different algorithmic requirements on the design of a graph traversal operator. We derive two graph traversal implementations targeting the most common graph topologies and demonstrate how graph-specific statistics can be leveraged to select the optimal physical traversal operator. To accelerate graph traversals, we devise a set of graph-specific, updateable secondary index structures to improve the performance of vertex neighborhood expansion. Finally, we introduce a domain-specific language with an intuitive programming model to extend graph traversals with custom application logic at runtime. We use the LLVM compiler framework to generate efficient code that tightly integrates the user-specified application logic with our highly optimized built-in graph traversal operators.
Our experimental evaluation shows that GRAPHITE can outperform native graph management systems by several orders of magnitude while providing all the features of an RDBMS, such as transaction support, backup and recovery, and security and user management. It thereby offers a promising alternative to specialized graph management systems that lack many of these features and require expensive data replication and maintenance processes.

    Acceleration of Single Inserts for Columnar Databases -- An Experiment on Data Import Performance Using SAP HANA


    Smoking and Second Hand Smoking in Adolescents with Chronic Kidney Disease: A Report from the Chronic Kidney Disease in Children (CKiD) Cohort Study

    The goal of this study was to determine the prevalence of smoking and second hand smoking [SHS] in adolescents with CKD and their relationship to baseline parameters at enrollment in the CKiD, an observational cohort study of 600 children (aged 1-16 yrs) with Schwartz estimated GFR of 30-90 ml/min/1.73 m². 239 adolescents had self-report survey data on smoking and SHS exposure: 21 [9%] subjects had “ever” smoked a cigarette. Among them, 4 were current and 17 were former smokers. Hypertension was more prevalent in those that had “ever” smoked a cigarette (42%) compared to non-smokers (9%), p<0.01. Among 218 non-smokers, 130 (59%) were male, 142 (65%) were Caucasian; 60 (28%) reported SHS exposure compared to 158 (72%) with no exposure. Non-smoker adolescents with SHS exposure were compared to those without SHS exposure. There were no racial, age, or gender differences between the two groups. Baseline creatinine, diastolic hypertension, C reactive protein, lipid profile, GFR and hemoglobin were not statistically different. A significantly higher protein to creatinine ratio (0.90 vs. 0.53, p<0.01) was observed in those exposed to SHS compared to those not exposed. Exposed adolescents were heavier than non-exposed adolescents (85th percentile vs. 55th percentile for BMI, p<0.01). Uncontrolled casual systolic hypertension was twice as prevalent among those exposed to SHS (16%) compared to those not exposed to SHS (7%), though the difference was not statistically significant (p=0.07). Adjusted multivariate regression analysis [OR (95% CI)] showed that increased protein to creatinine ratio [1.34 (1.03, 1.75)] and higher BMI [1.14 (1.02, 1.29)] were independently associated with exposure to SHS among non-smoker adolescents. These results reveal that among adolescents with CKD, cigarette use is low and SHS is highly prevalent.
The association of smoking with hypertension and SHS with increased proteinuria suggests a possible role of these factors in CKD progression and cardiovascular outcomes.