13 research outputs found
G-CORE: A Core for Future Graph Query Languages
We report on a community effort between industry and academia to shape the future of graph query languages. We argue that existing graph database management systems should consider supporting a query language with two key characteristics. First, it should be composable, meaning that graphs are both the input and the output of queries. Second, the graph query language should treat paths as first-class citizens. Our result is G-CORE, a powerful graph query language design that fulfills these goals and strikes a careful balance between path query expressivity and evaluation complexity.
Manganese Superoxide Dismutase: Guardian of the Powerhouse
The mitochondrion is vital for many metabolic pathways in the cell, contributing all or important constituent enzymes for diverse functions such as β-oxidation of fatty acids, the urea cycle, the citric acid cycle, and ATP synthesis. The mitochondrion is also a major site of reactive oxygen species (ROS) production in the cell. Aberrant production of mitochondrial ROS can have dramatic effects on cellular function, in part due to oxidative modification of key metabolic proteins localized in the mitochondrion. The cell is equipped with myriad antioxidant enzyme systems to combat deleterious ROS production in mitochondria, with the mitochondrial antioxidant enzyme manganese superoxide dismutase (MnSOD) acting as the chief ROS scavenging enzyme in the cell. Factors that affect the expression and/or the activity of MnSOD, resulting in diminished antioxidant capacity of the cell, can have extraordinary consequences on the overall health of the cell by altering mitochondrial metabolic function, leading to the development and progression of numerous diseases. A better understanding of the mechanisms by which MnSOD protects cells from the harmful effects of overproduction of ROS, in particular the effects of ROS on mitochondrial metabolic enzymes, may contribute to the development of novel treatments for various diseases in which ROS are an important component.
Large expert-curated database for benchmarking document similarity detection in biomedical literature search
Document recommendation systems for locating relevant literature have mostly relied on methods developed a decade ago. This is largely due to the lack of a large offline gold-standard benchmark of relevant documents that cover a variety of research fields such that newly developed literature search techniques can be compared, improved and translated into practice. To overcome this bottleneck, we have established the RElevant LIterature SearcH consortium consisting of more than 1500 scientists from 84 countries, who have collectively annotated the relevance of over 180 000 PubMed-listed articles with regard to their respective seed (input) article/s. The majority of annotations were contributed by highly experienced, original authors of the seed articles. The collected data cover 76% of all unique PubMed Medical Subject Headings descriptors. No systematic biases were observed across different experience levels, research fields or time spent on annotations. More importantly, annotations of the same document pairs contributed by different scientists were highly concordant. We further show that the three representative baseline methods used to generate recommended articles for evaluation (Okapi Best Matching 25, Term Frequency-Inverse Document Frequency and PubMed Related Articles) had similar overall performances. Additionally, we found that these methods each tend to produce distinct collections of recommended articles, suggesting that a hybrid method may be required to completely capture all relevant articles. The established database server located at https://relishdb.ict.griffith.edu.au is freely available for the downloading of annotation data and the blind testing of new methods. We expect that this benchmark will be useful for stimulating the development of new powerful techniques for title and title/abstract-based search engines for relevant articles in biomedical research.
The LDBC Social Network Benchmark
The Linked Data Benchmark Council's Social Network Benchmark (LDBC SNB) is an effort intended to test various functionalities of systems used for graph-like data management. For this, LDBC SNB uses the recognizable scenario of operating a social network, characterized by its graph-shaped data. LDBC SNB consists of two workloads that focus on different functionalities: the Interactive workload (interactive transactional queries) and the Business Intelligence workload (analytical queries). This document contains the definition of both workloads. This includes a detailed explanation of the data used in the LDBC SNB, a detailed description of all queries, and instructions on how to generate the data and run the benchmark with the provided software.
DAPHNE: An Open and Extensible System Infrastructure for Integrated Data Analysis Pipelines
Integrated data analysis (IDA) pipelines, which combine data management (DM) and query processing, high-performance computing (HPC), and machine learning (ML) training and scoring, are becoming increasingly common in practice. Interestingly, systems of these areas share many compilation and runtime techniques, and the increasingly heterogeneous hardware infrastructure they use converges as well. Yet, the programming paradigms, cluster resource management, data formats and representations, as well as execution strategies differ substantially. DAPHNE is an open and extensible system infrastructure for such IDA pipelines, including language abstractions, compilation and runtime techniques, multi-level scheduling, hardware (HW) accelerators, and computational storage for increasing productivity and eliminating unnecessary overheads. In this paper, we make a case for IDA pipelines, describe the overall DAPHNE system architecture, its key components, and the design of a vectorized execution engine for computational storage, HW accelerators, as well as local and distributed operations. Preliminary experiments that compare DAPHNE with MonetDB, Pandas, DuckDB, and TensorFlow show promising results.