Towards Analytics Aware Ontology Based Access to Static and Streaming Data (Extended Version)
Real-time analytics that requires integration and aggregation of
heterogeneous and distributed streaming and static data is a typical task in
many industrial scenarios such as diagnostics of turbines at Siemens. The
Ontology-Based Data Access (OBDA) approach has great potential to facilitate such tasks; however, it has a
number of limitations in dealing with analytics that restrict its use in
important industrial applications. Based on our experience with Siemens, we
argue that in order to overcome those limitations OBDA should be extended and
become analytics, source, and cost aware. In this work we propose such an
extension. In particular, we propose an ontology, mapping, and query language
for OBDA, where aggregate and other analytical functions are first class
citizens. Moreover, we develop query optimisation techniques that allow
analytical tasks over static and streaming data to be processed efficiently. We
implement our approach in a system and evaluate it with Siemens turbine
data.
Synopsis and meta-analysis of genetic association studies in osteoporosis for the focal adhesion family genes: the CUMAGAS-OSTEOporosis information system
Background: Focal adhesion (FA) family genes have been studied as candidate genes for osteoporosis, but the results of genetic association studies (GASs) are controversial. To clarify these data, a systematic assessment of GASs for FA genes in osteoporosis was conducted. Methods: We developed Cumulative Meta-Analysis of GAS-OSTEOporosis (CUMAGAS-OSTEOporosis), a web-based information system that allows the retrieval, analysis and meta-analysis (for allele contrast, recessive, dominant, additive and codominant models) of data from GASs on osteoporosis with the capability of update. GASs were identified by searching the PubMed and HuGE PubLit databases. Results: Data from 72 studies involving 13 variants of 6 genes were analyzed and catalogued in CUMAGAS-OSTEOporosis. Twenty-two studies produced significant associations with osteoporosis risk under any genetic model. All studies were underpowered (<50%). In four studies, the controls deviated from the Hardy-Weinberg equilibrium. Eight variants were chosen for meta-analysis, and significance was shown for the variants collagen, type I, α₁ (COL1A1) G2046T (all genetic models), COL1A1 G-1997T (allele contrast and dominant model) and integrin β-chain β₃ (ITGB3) T176C (recessive and additive models). In COL1A1 G2046T, subgroup analysis has shown significant associations for Caucasians, adults, females, males and postmenopausal women. A differential magnitude of effect in large versus small studies (that is, indication of publication bias) was detected for the variant COL1A1 G2046T. Conclusion: There is evidence of an implication of FA family genes in osteoporosis. CUMAGAS-OSTEOporosis could be a useful tool for current genomic epidemiology research in the field of osteoporosis.
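The Hardy-Weinberg check mentioned in the results can be illustrated with a short sketch. It is not taken from CUMAGAS-OSTEOporosis; the function names and the 5% threshold are assumptions chosen for illustration. It applies the standard chi-square goodness-of-fit test (one degree of freedom) to the genotype counts of a biallelic variant in a control group.

```python
# Illustrative sketch only: names and the 5% threshold are assumptions,
# not part of the CUMAGAS-OSTEOporosis system.

CRITICAL_5PCT_1DF = 3.841  # chi-square critical value for alpha = 0.05, df = 1


def hwe_chi_square(n_aa: int, n_ab: int, n_bb: int) -> float:
    """Chi-square statistic for deviation from Hardy-Weinberg proportions."""
    n = n_aa + n_ab + n_bb
    if n == 0:
        raise ValueError("at least one genotype is required")
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A
    q = 1.0 - p                      # frequency of allele B
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    # Skip classes with zero expected count (monomorphic variant).
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)


def deviates_from_hwe(n_aa: int, n_ab: int, n_bb: int) -> bool:
    """True if the counts deviate from equilibrium at the 5% level."""
    return hwe_chi_square(n_aa, n_ab, n_bb) > CRITICAL_5PCT_1DF


balanced = hwe_chi_square(25, 50, 25)   # counts exactly in HW proportions
skewed = hwe_chi_square(50, 0, 50)      # no heterozygotes at all
```

With small control groups an exact test is usually preferred over the chi-square approximation; the sketch keeps the simpler form for readability.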
From Copernicus Big Data to Extreme Earth Analytics
Copernicus is the European programme for monitoring the Earth. It consists of a set of systems that collect data from satellites and in-situ sensors, process this data and provide users with reliable and up-to-date information on a range of environmental and security issues. The data and information processed and disseminated put Copernicus at the forefront of the big data paradigm, giving rise to all relevant challenges, the so-called 5 Vs: volume, velocity, variety, veracity and value. In this short paper, we discuss the challenges of extracting information and knowledge from huge archives of Copernicus data. We propose to achieve this by scale-out distributed deep learning techniques that run on very big clusters offering virtual machines and GPUs. We also discuss the challenges of achieving scalability in the management of the extreme volumes of information and knowledge extracted from Copernicus data. The envisioned scientific and technical work will be carried out in the context of the H2020 project ExtremeEarth, which starts in January 2019.
An In-Depth Benchmarking of Text-to-SQL Systems
Text-to-SQL systems allow users to explore relational databases by
posing free-form queries, alleviating the need for using structured
languages, such as SQL. Although numerous systems have been developed so
far, existing system evaluations lack rigour. In this work, we build
a text-to-SQL benchmark that covers different classes of queries, and we
evaluate the effectiveness of several systems in the field. To evaluate
system efficiency, we measure execution time and resource consumption
for the different query classes. Our comprehensive evaluation aims to
fill a major gap in understanding the capabilities and limits of
existing systems, and it reveals several open challenges.
Evaluating the Leucine Trigger Hypothesis to Explain the Post-prandial Regulation of Muscle Protein Synthesis in Young and Older Adults: A Systematic Review
Background: The “leucine trigger” hypothesis was originally conceived to explain the post-prandial regulation of muscle protein synthesis (MPS). This hypothesis implicates the magnitude (amplitude and rate) of post-prandial increase in blood leucine concentrations for regulation of the magnitude of MPS response to an ingested protein source. Recent evidence from experimental studies has challenged this theory, with reports of a disconnect between blood leucine concentration profiles and post-prandial rates of MPS in response to protein ingestion. Aim: The primary aim of this systematic review was to qualitatively evaluate the leucine trigger hypothesis to explain the post-prandial regulation of MPS in response to ingested protein at rest and post-exercise in young and older adults. We hypothesized that experimental support for the leucine trigger hypothesis will depend on age, exercise status (rest vs. post-exercise), and type of ingested protein (i.e., isolated proteins vs. protein-rich whole food sources). Methods: This qualitative systematic review extracted data from studies that combined measurements of post-prandial blood leucine concentrations and rates of MPS following ingested protein at rest and following exercise in young and older adults. Data relating to blood leucine concentration profiles and post-prandial MPS rates were extracted from all studies, and reported as providing sufficient or insufficient evidence for the leucine trigger hypothesis. Results: Overall, 16 of the 29 eligible studies provided sufficient evidence to support the leucine trigger hypothesis for explaining divergent post-prandial rates of MPS in response to different ingested protein sources. Of these 16 studies, 13 were conducted in older adults (eight of which conducted measurements post-exercise) and 14 studies included the administration of isolated proteins. 
Conclusion: This systematic review underscores the merits of the leucine trigger hypothesis for the explanation of the regulation of MPS. However, our data indicate that the leucine trigger hypothesis is most applicable to the regulation of the post-prandial MPS response to ingested proteins in older adults. Consistent with our hypothesis, we provide data to support the idea that the leucine trigger hypothesis is more relevant within the context of ingesting isolated protein sources rather than protein-rich whole foods. Future mechanistic studies are warranted to understand the complex series of modulatory factors beyond blood leucine concentration profiles within a food matrix that regulate post-prandial rates of MPS.
View Selection over Knowledge Graphs in Triple Stores
Knowledge Graphs (KGs) are collections of interconnected and annotated
entities that have become powerful assets for data integration, search
enhancement, and other industrial applications. Knowledge Graphs such as
DBPEDIA may contain billions of triples and are queried intensively,
with millions of queries per day. A prominent approach to
enhance query answering on Knowledge Graph databases is View
Materialization, i.e., the materialization of an appropriate set of
computations that will improve query performance.
We study the problem of view materialization and propose a view
selection methodology for processing query workloads with more than a
million queries. Our approach relies heavily on subgraph pattern mining
techniques that allow the creation of efficient summarizations of massive
query workloads while also identifying the candidate views for
materialization. At the core of our work is the correspondence between
the view selection problem and that of Maximizing a Nondecreasing
Submodular Set Function Subject to a Knapsack Constraint. This correspondence
leads to a tractable view-selection process for native triple stores
that allows a (1 - 1/e)-approximation of the optimal selection of
views. Our experimental evaluation shows that all the steps of the
view-selection process are completed in a few minutes, while the
corresponding rewritings accelerate 67.68% of the queries in the
DBPEDIA query workload. Those queries are executed in 2.19% of their
initial time on average.
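To make the knapsack formulation concrete, the following is a minimal sketch of a cost-benefit greedy for a nondecreasing submodular objective under a budget. It is not the authors' implementation: the view names, costs, and the coverage-style benefit function are all hypothetical. It is also a simplified variant, since the plain greedy with a best-single-item guard does not by itself attain the (1 - 1/e) bound; the algorithms known to reach that bound additionally enumerate small seed sets before running the greedy step.

```python
from typing import Callable, Dict, FrozenSet, Set

# Hypothetical toy workload: which queries each candidate view can answer,
# and the storage cost of materializing each view (costs must be positive).
COVERS: Dict[str, Set[str]] = {
    "v1": {"q1", "q2", "q3"},
    "v2": {"q3", "q4"},
    "v3": {"q5"},
}
COSTS: Dict[str, float] = {"v1": 3.0, "v2": 2.0, "v3": 1.0}


def covered_queries(views: FrozenSet[str]) -> float:
    """Benefit = number of distinct queries answered (monotone, submodular)."""
    covered: Set[str] = set()
    for view in views:
        covered |= COVERS[view]
    return float(len(covered))


def greedy_knapsack(
    costs: Dict[str, float],
    benefit: Callable[[FrozenSet[str]], float],
    budget: float,
) -> Set[str]:
    """Repeatedly add the affordable item with the best marginal gain per cost."""
    chosen: Set[str] = set()
    spent = 0.0
    remaining = set(costs)
    while remaining:
        base = benefit(frozenset(chosen))
        best, best_ratio = None, 0.0
        for item in sorted(remaining):  # sorted for deterministic tie-breaking
            if spent + costs[item] > budget:
                continue
            gain = benefit(frozenset(chosen | {item})) - base
            ratio = gain / costs[item]
            if ratio > best_ratio:
                best, best_ratio = item, ratio
        if best is None:  # nothing affordable improves the objective
            break
        chosen.add(best)
        spent += costs[best]
        remaining.discard(best)
    # Guard: a single expensive, high-value item may beat the greedy set.
    singles = sorted(item for item in costs if costs[item] <= budget)
    if singles:
        top = max(singles, key=lambda item: benefit(frozenset({item})))
        if benefit(frozenset({top})) > benefit(frozenset(chosen)):
            return {top}
    return chosen


selection = greedy_knapsack(COSTS, covered_queries, budget=4.0)
```

In a real setting the benefit function would estimate workload speed-up rather than count covered queries, and the cost would reflect the size of the materialized view.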
An Efficient Index for RDF Query Containment
Query containment is a fundamental operation used to expedite query
processing in view materialisation and query caching techniques. Since
query containment has been shown to be NP-complete for arbitrary
conjunctive queries on RDF graphs, we introduce a simpler form of
conjunctive queries that we name f-graph queries. We first show that
containment checking for f-graph queries can be solved in polynomial
time. Based on this observation, we propose a novel indexing structure,
named mv-index, that allows for fast containment checking between a
single f-graph query and an arbitrary number of stored queries. Search
is performed in polynomial time in the combined size of the query and
the index. We then show how our algorithms and structures can be
extended for arbitrary conjunctive queries on RDF graphs by introducing
f-graph witnesses, i.e., f-graph representatives of conjunctive queries.
F-graph witnesses have the following interesting property: a conjunctive
query for RDF graphs is contained in another query only if its
corresponding f-graph witness is also contained in it. This property allows
our indexing structure to be used for the general case of conjunctive query
containment. This translates in practice to microseconds or less for the
containment test against hundreds of thousands of queries that are
indexed within our structure.
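For readers unfamiliar with containment, the sketch below shows the classical homomorphism test for conjunctive queries over triple patterns: Q1 is contained in Q2 if the variables of Q2 can be mapped onto terms of Q1 so that every pattern of Q2 lands on a pattern of Q1 and the answer variables line up. This is a naive, worst-case exponential check given only for illustration. It is not the mv-index or the f-graph technique, and the representation (variables as strings starting with "?", queries as a head tuple plus a list of triples, set semantics, no blank nodes) is an assumption.

```python
from typing import Dict, List, Optional, Sequence, Tuple

Triple = Tuple[str, str, str]


def is_var(term: str) -> bool:
    """Assumed convention: variables start with '?', everything else is constant."""
    return term.startswith("?")


def _unify(pattern: Triple, target: Triple,
           mapping: Dict[str, str]) -> Optional[Dict[str, str]]:
    """Extend mapping so that pattern maps onto target, or return None."""
    extended = dict(mapping)
    for p, t in zip(pattern, target):
        if is_var(p):
            if extended.setdefault(p, t) != t:
                return None  # variable already bound to a different term
        elif p != t:
            return None      # constants must match exactly
    return extended


def _extend(todo: List[Triple], body: Sequence[Triple],
            mapping: Dict[str, str]) -> bool:
    """Backtracking search for a mapping that covers every remaining pattern."""
    if not todo:
        return True
    first, rest = todo[0], todo[1:]
    for target in body:
        extended = _unify(first, target, mapping)
        if extended is not None and _extend(rest, body, extended):
            return True
    return False


def contained_in(q1_head: Sequence[str], q1_body: Sequence[Triple],
                 q2_head: Sequence[str], q2_body: Sequence[Triple]) -> bool:
    """True if every answer of Q1 is also an answer of Q2 (heads are variables)."""
    if len(q1_head) != len(q2_head):
        return False
    mapping: Dict[str, str] = {}
    for v2, v1 in zip(q2_head, q1_head):
        if mapping.setdefault(v2, v1) != v1:
            return False
    return _extend(list(q2_body), q1_body, mapping)


# Hypothetical example queries.
Q1_HEAD, Q1_BODY = ("?x",), [("?x", ":worksAt", "?y"), ("?y", ":locatedIn", ":Athens")]
Q2_HEAD, Q2_BODY = ("?a",), [("?a", ":worksAt", "?b")]

q1_in_q2 = contained_in(Q1_HEAD, Q1_BODY, Q2_HEAD, Q2_BODY)
q2_in_q1 = contained_in(Q2_HEAD, Q2_BODY, Q1_HEAD, Q1_BODY)
```

Here the more restrictive query Q1 is contained in Q2 but not the other way round. An index over stored queries, as described in the abstract, avoids running such a pairwise test against every stored query.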
Omega 3 fatty acids supplementation has an ameliorative effect in experimental ulcerative colitis despite increased colonic neutrophil infiltration
Purpose: Omega 3 polyunsaturated fatty acids have anti-inflammatory properties and can be beneficial in the treatment of inflammatory diseases, such as ulcerative colitis. Dextran sodium sulphate (DSS) colitis in rats appears to mimic nearly all of the morphological characteristics and lesion distributions of ulcerative colitis. The purpose of the current study was to investigate the efficacy of omega 3 fatty acids in the treatment of experimental ulcerative colitis. Methods: Thirty-six Wistar rats were randomly assigned to group A or group B, receiving 5% DSS in their drinking water for eight days. For the next eight days post-DSS, group A animals received tap water, and group B animals were fed a nutritional solution containing high levels of omega 3 polyunsaturated fatty acids (ProSure®, Abbott Laboratories, Zwolle, Netherlands) once per day, administered via an orogastric feeding tube. Results: Animals fed an omega 3 rich diet exhibited a statistically significant increase in hematocrit and hemoglobin levels, compared to animals drinking tap water, and a trend towards histopathological and clinical improvement, with the administration of omega 3 fatty acids ameliorating epithelial erosion by day 8 post-DSS, but no statistically significant difference was observed between group A and group B animals at 4 or 8 days post-DSS. Also, a statistically significant increase in neutrophil infiltration was observed, as depicted by myeloperoxidase activity. Conclusion: Our findings support a positive role of omega 3 polyunsaturated fatty acids supplementation in an experimental model of ulcerative colitis despite the increased colonic neutrophil infiltration. Further studies are needed in order to investigate the role of increased neutrophils in colonic mucosa.
Ontology-Based Integration of Streaming and Static Relational Data with Optique
Real-time processing of data coming from multiple heterogeneous data
streams and static databases is a typical task in many industrial
scenarios such as diagnostics of large machines. A complex diagnostic
task may require a collection of up to hundreds of queries over such
data. Although many of these queries retrieve data of the same kind,
such as temperature measurements, they access structurally different
data sources. In this work we show how Semantic Technologies implemented
in our system OPTIQUE can simplify such complex diagnostics by providing
an abstraction layer ontology that integrates heterogeneous data. In a
nutshell, OPTIQUE allows complex diagnostic tasks to be expressed with
just a few high-level semantic queries. The system can then
automatically enrich these queries, translate them into a collection
with a large number of low-level data queries, and finally optimise and
efficiently execute the collection in a heavily distributed environment.
We will demo the benefits of OPTIQUE on a real-world scenario from
Siemens.
Artificial Intelligence and Big Data Technologies for Copernicus Data: The ExtremeEarth Project
ExtremeEarth is a three-year H2020 ICT research and innovation project which is currently in its final year. The main objective of ExtremeEarth is to develop Artificial Intelligence and Big Data techniques and technologies that scale to the large volumes of big Copernicus data, information and knowledge, and apply these technologies in two of the ESA Thematic Exploitation Platforms: Food Security and Polar. The technical contributions of the project so far include: (i) new deep learning architectures for crop type mapping in the context of the Food Security use case, (ii) new deep learning architectures for sea ice mapping in the context of the Polar use case, (iii) the development and open publication of very large datasets for training these architectures, (iv) new versions of scalable semantic technologies for managing big linked geospatial data, and (v) a new platform for bringing all the previous technologies together and applying them to the two use cases.