20 research outputs found
The Gremlin Graph Traversal Machine and Language
Gremlin is a graph traversal machine and language designed, developed, and
distributed by the Apache TinkerPop project. Gremlin, as a graph traversal
machine, is composed of three interacting components: a graph, a traversal,
and a set of traversers. The traversers move about the graph
according to the instructions specified in the traversal, where the result of
the computation is the ultimate locations of all halted traversers. A Gremlin
machine can be executed over any supporting graph computing system such as an
OLTP graph database and/or an OLAP graph processor. Gremlin, as a graph
traversal language, is a functional language implemented in the user's native
programming language and is used to define the traversal of a Gremlin machine.
This article provides a mathematical description of Gremlin and details its
automaton and functional properties. These properties enable Gremlin to
naturally support imperative and declarative querying, host language
agnosticism, user-defined domain specific languages, an extensible
compiler/optimizer, single- and multi-machine execution models, hybrid depth-
and breadth-first evaluation, as well as the existence of a Universal Gremlin
Machine and its respective entailments.
Comment: To appear in the Proceedings of the 2015 ACM Database Programming
Languages Conference.
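The machine model described in the Gremlin abstract above (a graph, a traversal, and a set of traversers whose halted locations form the result) can be illustrated with a minimal sketch. This is a toy under stated assumptions, not the Apache TinkerPop implementation or its API: the names `GRAPH`, `Traverser`, `out`, and `run` are hypothetical, the graph data is invented for illustration, and the FIFO queue is just one way to obtain breadth-first evaluation.

```python
from dataclasses import dataclass

# Toy graph as adjacency lists (hypothetical data, for illustration only).
GRAPH = {
    "a": ["b", "c"],
    "b": ["d"],
    "c": ["d"],
    "d": [],
}


@dataclass(frozen=True)
class Traverser:
    location: str  # vertex the traverser currently sits on
    step: int = 0  # index of the next instruction in the traversal


def out(graph, vertex):
    """Instruction: yield every out-neighbour of `vertex`."""
    return list(graph[vertex])


def run(graph, traversal, starts):
    """Execute `traversal` (a list of instructions) from the `starts` vertices.

    Returns the locations of all halted traversers, in halting order.
    """
    active = [Traverser(v) for v in starts]
    halted = []
    while active:
        t = active.pop(0)  # FIFO order: breadth-first evaluation
        if t.step == len(traversal):
            halted.append(t.location)  # no instructions left, so it halts
            continue
        # Each instruction may split one traverser into several, or into none.
        for nxt in traversal[t.step](graph, t.location):
            active.append(Traverser(nxt, t.step + 1))
    return halted


# Two hops out from "a": both paths (via "b" and via "c") end at "d".
result = run(GRAPH, [out, out], ["a"])
```

In this sketch a traverser whose instruction yields no neighbours simply disappears, so only traversers that complete every instruction contribute to the result.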
Using shape expressions (ShEx) to share RDF data models and to guide curation with rigorous validation
International Conference, European Semantic Web Conference, ESWC (16th. 2019. Portorož, Slovenia)
SADI, SHARE, and the in silico scientific method
Background: The emergence and uptake of Semantic Web technologies by the Life Sciences provides exciting opportunities for exploring novel ways to conduct in silico science. Web Service Workflows are already becoming first-class objects in "the new way", and serve as explicit, shareable, referenceable representations of how an experiment was done. In turn, Semantic Web Service projects aim to facilitate workflow construction by biological domain-experts such that workflows can be edited, re-purposed, and re-published by non-informaticians. However the aspects of the scientific method relating to explicit discourse, disagreement, and hypothesis generation have remained relatively impervious to new technologies.
Results: Here we present SADI and SHARE - a novel Semantic Web Service framework, and a reference implementation of its client libraries. Together, SADI and SHARE allow the semi- or fully-automatic discovery and pipelining of Semantic Web Services in response to ad hoc user queries.
Conclusions: The semantic behaviours exhibited by SADI and SHARE extend the functionalities provided by Description Logic Reasoners such that novel assertions can be automatically added to a data-set without logical reasoning, but rather by analytical or annotative services. This behaviour might be applied to achieve the "semantification" of those aspects of the in silico scientific method that are not yet supported by Semantic Web technologies. We support this suggestion using an example in the clinical research space.
Predicting probable Alzheimer's disease using linguistic deficits and biomarkers
Background: The manual diagnosis of neurodegenerative disorders such as Alzheimer's disease (AD) and related dementias has been a challenge. Currently, these disorders are diagnosed using specific clinical diagnostic criteria and neuropsychological examinations. The use of several Machine Learning (ML) algorithms to build automated diagnostic models using low-level linguistic features resulting from verbal utterances could aid diagnosis of patients with probable AD from a large population. For this purpose, we developed different ML models on the DementiaBank language transcript clinical dataset, consisting of 99 patients with probable AD and 99 healthy controls.
Results: Our models learned several syntactic, lexical, and n-gram linguistic biomarkers to distinguish the probable AD group from the healthy group. In contrast to the healthy group, we found that the probable AD patients had significantly less usage of syntactic components and significantly higher usage of lexical components in their language. Also, we observed a significant difference in the use of n-grams, as the healthy group were able to identify and make sense of more objects in their n-grams than the probable AD group. As such, our best diagnostic model significantly distinguished the probable AD group from the healthy elderly group with a better Area Under the Receiver Operating Characteristic Curve (AUC) using Support Vector Machines (SVM).
Conclusions: Experimental and statistical evaluations suggest that using ML algorithms for learning linguistic biomarkers from the verbal utterances of elderly individuals could help the clinical diagnosis of probable AD. We emphasise that the best ML model for predicting the disease group combines significant syntactic, lexical and top n-gram features. However, there is a need to train the diagnostic models on larger datasets, which could lead to a better AUC and clinical diagnosis of probable AD.