
    The Gremlin Graph Traversal Machine and Language

    Gremlin is a graph traversal machine and language designed, developed, and distributed by the Apache TinkerPop project. As a graph traversal machine, Gremlin is composed of three interacting components: a graph G, a traversal Ψ, and a set of traversers T. The traversers move about the graph according to the instructions specified in the traversal, and the result of the computation is the ultimate locations of all halted traversers. A Gremlin machine can be executed over any supporting graph computing system, such as an OLTP graph database or an OLAP graph processor. As a graph traversal language, Gremlin is a functional language implemented in the user's native programming language and is used to define the Ψ of a Gremlin machine. This article provides a mathematical description of Gremlin and details its automaton and functional properties. These properties enable Gremlin to naturally support imperative and declarative querying, host language agnosticism, user-defined domain specific languages, an extensible compiler/optimizer, single- and multi-machine execution models, hybrid depth- and breadth-first evaluation, as well as the existence of a Universal Gremlin Machine and its respective entailments.
    Comment: To appear in the Proceedings of the 2015 ACM Database Programming Languages Conference.
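    Since the abstract defines the machine as (G, Ψ, T), a short example may help make the pieces concrete. The sketch below uses the gremlinpython client; the server URL and the loaded TinkerPop "modern" toy graph are assumptions for the example, not part of the paper.

    ```python
    # A minimal sketch of a Gremlin traversal using the gremlinpython client.
    # Assumes a Gremlin Server is running at localhost:8182 with the TinkerPop
    # "modern" toy graph loaded; adjust the connection URL for your setup.
    from gremlin_python.process.anonymous_traversal import traversal
    from gremlin_python.driver.driver_remote_connection import DriverRemoteConnection

    # Connect a traversal source g to the remote graph G.
    conn = DriverRemoteConnection('ws://localhost:8182/gremlin', 'g')
    g = traversal().withRemote(conn)

    # The traversal (the Psi of the machine) is a chain of step instructions.
    # Traversers T move along "knows" edges from the matched vertex; the
    # locations of the halted traversers (here, neighbours' names) are the result.
    names = g.V().has('person', 'name', 'marko').out('knows').values('name').toList()
    print(names)  # e.g. ['vadas', 'josh'] on the "modern" toy graph

    conn.close()
    ```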

    Research Articles in Simplified HTML: a Web-first format for HTML-based scholarly articles

    Purpose. This paper introduces Research Articles in Simplified HTML (RASH), a Web-first format for writing HTML-based scholarly papers, accompanied by the RASH Framework, a set of tools for interacting with RASH-based articles. The paper also presents an evaluation that involved authors and reviewers of RASH articles submitted to the SAVE-SD 2015 and SAVE-SD 2016 workshops.
    Design. RASH has been developed with three aims: to be easy to learn and use; to share scholarly documents (and embedded semantic annotations) through the Web; and to support adoption within the existing publishing workflow.
    Findings. The evaluation study confirmed that RASH is ready to be adopted in workshops, conferences, and journals and can be learnt quickly by researchers who are familiar with HTML.
    Research Limitations. The evaluation study also highlighted some issues in the adoption of RASH, and of HTML formats in general, especially by less technically savvy users. Moreover, additional tools are needed, e.g. for enabling conversions from/to existing formats such as OpenXML.
    Practical Implications. RASH (and its Framework) is another step towards formal representations of the meaning of an article's content: facilitating its automatic discovery, enabling its linking to semantically related articles, providing access to data within the article in actionable form, and allowing integration of data between papers.
    Social Implications. RASH addresses the needs of the various users of a scholarly article: researchers (focussing on its content), readers (experiencing new ways of browsing it), citizen scientists (reusing data formally defined within it through semantic annotations), and publishers (using the advantages of new technologies as envisioned by the Semantic Publishing movement).
    Value. RASH helps authors focus on the organisation of their texts, supports them in semantically enriching the content of articles, and leaves the issues of validation, visualisation, conversion, and semantic data extraction to the tools developed within its Framework.
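    As a rough illustration of what "data within the article in actionable form" can mean in practice, here is a minimal sketch that pulls embedded annotations out of an HTML article using only the Python standard library. The file name and the choice of JSON-LD script blocks are assumptions for the example; RASH itself specifies its own annotation mechanisms, such as RDFa.

    ```python
    # Illustrative sketch: extract embedded JSON-LD annotation blocks from an
    # HTML-based article. Standard library only; the file name article.html
    # and the use of JSON-LD (rather than RDFa) are assumptions.
    import json
    from html.parser import HTMLParser

    class JsonLdExtractor(HTMLParser):
        def __init__(self):
            super().__init__()
            self.in_jsonld = False
            self.blocks = []

        def handle_starttag(self, tag, attrs):
            # attrs is a list of (name, value) pairs.
            if tag == 'script' and ('type', 'application/ld+json') in attrs:
                self.in_jsonld = True
                self._buf = []

        def handle_data(self, data):
            if self.in_jsonld:
                self._buf.append(data)

        def handle_endtag(self, tag):
            if tag == 'script' and self.in_jsonld:
                self.in_jsonld = False
                self.blocks.append(json.loads(''.join(self._buf)))

    parser = JsonLdExtractor()
    with open('article.html', encoding='utf-8') as f:
        parser.feed(f.read())
    for block in parser.blocks:
        print(block.get('@type'), block.get('name'))
    ```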

    SADI, SHARE, and the in silico scientific method

    Background
    The emergence and uptake of Semantic Web technologies by the Life Sciences provides exciting opportunities for exploring novel ways to conduct in silico science. Web Service workflows are already becoming first-class objects in "the new way", and serve as explicit, shareable, referenceable representations of how an experiment was done. In turn, Semantic Web Service projects aim to facilitate workflow construction by biological domain experts, such that workflows can be edited, re-purposed, and re-published by non-informaticians. However, the aspects of the scientific method relating to explicit discourse, disagreement, and hypothesis generation have remained relatively impervious to new technologies.
    Results
    Here we present SADI and SHARE: a novel Semantic Web Service framework, and a reference implementation of its client libraries. Together, SADI and SHARE allow the semi- or fully-automatic discovery and pipelining of Semantic Web Services in response to ad hoc user queries.
    Conclusions
    The semantic behaviours exhibited by SADI and SHARE extend the functionalities provided by Description Logic reasoners, such that novel assertions can be added to a dataset automatically, not through logical reasoning but through analytical or annotative services. This behaviour might be applied to achieve the "semantification" of those aspects of the in silico scientific method that are not yet supported by Semantic Web technologies. We support this suggestion using an example in the clinical research space.
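    The pipelining idea, in which services advertise the semantic type they consume and produce and a client chains them to answer a query, can be sketched in a few lines. This toy is purely conceptual: the service names, types, and greedy matching logic are invented stand-ins, not the SADI API.

    ```python
    # Conceptual sketch (not the SADI API): services declare the semantic
    # class of their input and output, and a SHARE-style client can chain
    # them automatically to answer an ad hoc query.

    # Each "service" maps instances of one semantic class to another.
    SERVICES = [
        {'name': 'gene2protein',  'input': 'Gene',    'output': 'Protein',
         'call': lambda x: f'protein_of({x})'},
        {'name': 'protein2sites', 'input': 'Protein', 'output': 'BindingSite',
         'call': lambda x: f'sites_of({x})'},
    ]

    def plan(start_type, goal_type):
        """Greedily chain services whose input type matches the current type."""
        chain, current = [], start_type
        while current != goal_type:
            step = next((s for s in SERVICES if s['input'] == current), None)
            if step is None:
                raise LookupError(f'no service consumes {current}')
            chain.append(step)
            current = step['output']
        return chain

    # A query "find binding sites for gene BRCA1" resolves to a pipeline.
    value = 'BRCA1'
    for service in plan('Gene', 'BindingSite'):
        value = service['call'](value)
    print(value)  # sites_of(protein_of(BRCA1))
    ```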

    The Translational Medicine Ontology and Knowledge Base: driving personalized medicine by bridging the gap between bench and bedside

    Background: Translational medicine requires the integration of knowledge using heterogeneous data from health care to the life sciences. Here, we describe a collaborative effort to produce a prototype Translational Medicine Knowledge Base (TMKB) capable of answering questions relating to clinical practice and pharmaceutical drug discovery.
    Results: We developed the Translational Medicine Ontology (TMO) as a unifying ontology to integrate chemical, genomic, and proteomic data with disease, treatment, and electronic health records. We demonstrate the use of Semantic Web technologies in the integration of patient and biomedical data, and show how such a knowledge base can aid physicians in providing tailored patient care and facilitate the recruitment of patients into active clinical trials. Thus, patients, physicians, and researchers may explore the knowledge base to better understand therapeutic options, efficacy, and mechanisms of action.
    Conclusions: This work takes an important step in using Semantic Web technologies to facilitate the integration of relevant, distributed, external sources, and progresses towards a computational platform to support personalized medicine.
    Availability: TMO can be downloaded from http://code.google.com/p/translationalmedicineontology and TMKB can be accessed at http://tm.semanticscience.org/sparql
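    Since the abstract publishes a SPARQL endpoint, a minimal sketch of querying it with the SPARQLWrapper package follows. The endpoint may no longer be live, and the query shown is a generic exploratory one, not taken from the paper.

    ```python
    # Sketch of querying the TMKB SPARQL endpoint named in the abstract.
    # Requires the SPARQLWrapper package; the query is an illustrative
    # assumption (list a few triples), not from the paper itself.
    from SPARQLWrapper import SPARQLWrapper, JSON

    sparql = SPARQLWrapper('http://tm.semanticscience.org/sparql')
    sparql.setQuery('''
        SELECT ?s ?p ?o
        WHERE { ?s ?p ?o }
        LIMIT 10
    ''')
    sparql.setReturnFormat(JSON)

    results = sparql.query().convert()
    for row in results['results']['bindings']:
        print(row['s']['value'], row['p']['value'], row['o']['value'])
    ```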

    Data integration for offshore decommissioning waste management

    Offshore decommissioning represents significant business opportunities for oil and gas service companies. For owners of offshore assets and regulators, however, it is a liability because of the associated costs. One way of mitigating decommissioning costs is through the sale and reuse of decommissioned items. To achieve this effectively, reliability assessment of decommissioned items is required, and such an assessment relies on data collected on the various items over the lifecycle of an engineering asset. Given that offshore platforms have a design life of about 25 years and that data management techniques and tools are constantly evolving, data captured about items to be decommissioned will be in varying forms. In addition, because many stakeholders are involved with a facility over its lifecycle, the information representing the items will vary. These challenges make data integration difficult. This research therefore developed a data integration framework that makes use of Semantic Web technologies and ISO 15926, a standard for process plant data integration, for rapid assessment of decommissioned items. The proposed solution helps determine the reuse potential of decommissioned items, which can save costs and benefit the environment.
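    As a sketch of what Semantic Web based integration of item records might look like, the snippet below builds a small RDF graph for one decommissioned item with rdflib. The vocabulary (namespace and property names) is invented for illustration; a real implementation would draw class and property identifiers from ISO 15926 reference data.

    ```python
    # Illustrative sketch: represent a decommissioned item as RDF with rdflib
    # so records from different stakeholders can be merged into one graph.
    from rdflib import Graph, Literal, Namespace, RDF
    from rdflib.namespace import XSD

    EX = Namespace('http://example.org/decom#')  # hypothetical vocabulary

    g = Graph()
    item = EX['pump-P101']  # hypothetical item identifier
    g.add((item, RDF.type, EX.DecommissionedItem))
    g.add((item, EX.installedYear, Literal(1998, datatype=XSD.gYear)))
    g.add((item, EX.operatingHours, Literal(176000, datatype=XSD.integer)))
    g.add((item, EX.lastInspection, Literal('2021-06-30', datatype=XSD.date)))

    # Records from another stakeholder's system can be added to the same
    # graph, after which reliability queries run over the merged data.
    print(g.serialize(format='turtle'))
    ```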

    Predicting probable Alzheimer's disease using linguistic deficits and biomarkers

    Background
    The manual diagnosis of neurodegenerative disorders such as Alzheimer's disease (AD) and related dementias is challenging. Currently, these disorders are diagnosed using specific clinical diagnostic criteria and neuropsychological examinations. Machine Learning models built on low-level linguistic features of verbal utterances could aid the diagnosis of patients with probable AD in a large population. For this purpose, we developed several Machine Learning models on the DementiaBank language transcript clinical dataset, consisting of 99 patients with probable AD and 99 healthy controls.
    Results
    Our models learned several syntactic, lexical, and n-gram linguistic biomarkers that distinguish the probable AD group from the healthy group. Compared with the healthy group, the probable AD patients made significantly less use of syntactic components and significantly more use of lexical components in their language. We also observed a significant difference in the use of n-grams: the healthy group identified and made sense of more objects in their n-grams than the probable AD group. Our best diagnostic model, a Support Vector Machine (SVM), significantly distinguished the probable AD group from the healthy elderly group and achieved the best Area Under the Receiver Operating Characteristic Curve (AUC).
    Conclusions
    Experimental and statistical evaluations suggest that using Machine Learning algorithms to learn linguistic biomarkers from the verbal utterances of elderly individuals could support the clinical diagnosis of probable AD. We emphasise that the best model for predicting the disease group combines significant syntactic, lexical, and top n-gram features. However, the diagnostic models need to be trained on larger datasets, which could lead to a better AUC and improved clinical diagnosis of probable AD.
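    A minimal sketch of the modelling setup described above, an SVM over linguistic features evaluated by AUC, follows. It assumes the scikit-learn library and uses random stand-in features in place of the DementiaBank data, which are not reproduced here.

    ```python
    # Sketch: SVM classifier over linguistic features, evaluated by AUC.
    # The feature matrix is random stand-in data; the study used syntactic,
    # lexical, and n-gram features extracted from DementiaBank transcripts.
    import numpy as np
    from sklearn.model_selection import cross_val_score
    from sklearn.pipeline import make_pipeline
    from sklearn.preprocessing import StandardScaler
    from sklearn.svm import SVC

    rng = np.random.default_rng(0)
    X = rng.normal(size=(198, 20))  # 99 probable AD + 99 controls, 20 features
    y = np.repeat([0, 1], 99)       # 0 = healthy control, 1 = probable AD

    # Scale features, then fit an RBF-kernel SVM; score by cross-validated AUC.
    model = make_pipeline(StandardScaler(), SVC(kernel='rbf'))
    aucs = cross_val_score(model, X, y, cv=5, scoring='roc_auc')
    print(f'AUC: {aucs.mean():.2f} +/- {aucs.std():.2f}')
    ```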

    Automatic Speech Recognition for Supporting Endangered Language Documentation

    Generating accurate word-level transcripts of recorded speech for language documentation is difficult and time-consuming, even for skilled speakers of the target language. Automatic speech recognition (ASR) has the potential to streamline transcription efforts for endangered language documentation, but its practical utility for this purpose has not been fully explored. In this paper, we present the results of a study in which both linguists and community members, with varying levels of language proficiency, transcribed audio recordings of an endangered language under timed conditions, with and without the assistance of ASR. We find that for language learners of all levels, both time-to-transcribe and transcription error rates are significantly reduced when correcting ASR output. Despite these improvements, most community members in our study expressed a preference for unassisted transcription, highlighting the need for developers to engage directly with stakeholders when designing and deploying technologies to support language documentation.
    National Foreign Language Resource Center
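    Transcription error rate in studies like this is typically measured as word error rate (WER), the word-level edit distance between a reference and a hypothesis transcript divided by the reference length. A self-contained sketch of that standard computation follows; the sample sentences are invented, and the paper's exact metric may differ.

    ```python
    # Sketch: word error rate (WER) via word-level edit distance.
    def wer(reference: str, hypothesis: str) -> float:
        ref, hyp = reference.split(), hypothesis.split()
        # d[i][j] = edit distance between ref[:i] and hyp[:j]
        d = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
        for i in range(len(ref) + 1):
            d[i][0] = i
        for j in range(len(hyp) + 1):
            d[0][j] = j
        for i in range(1, len(ref) + 1):
            for j in range(1, len(hyp) + 1):
                cost = 0 if ref[i - 1] == hyp[j - 1] else 1
                d[i][j] = min(d[i - 1][j] + 1,         # deletion
                              d[i][j - 1] + 1,         # insertion
                              d[i - 1][j - 1] + cost)  # substitution
        return d[len(ref)][len(hyp)] / len(ref)

    print(wer('the cat sat on the mat', 'the cat sit on mat'))  # 2/6 = 0.33
    ```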