3,974 research outputs found

Semantically Resolving Type Mismatches in Scientific Workflows

Author: Derouiche Kheiredine
Nicole Denis A
Publication date: 22/11/2007

Scientists are increasingly utilizing Grids to manage large data sets and execute scientific experiments on distributed resources. Scientific workflows are used as a means for modeling and enacting scientific experiments. Windows Workflow Foundation (WF) is a major component of Microsoft’s .NET technology which offers lightweight support for long-running workflows. It provides a comfortable graphical and programmatic environment for the development of extended BPEL-style workflows. WF’s visual features ease the syntactic composition of Web services into scientific workflows but do nothing to assure that information passed between services has consistent semantic types or representations or that deviant flows, errors and compensations are handled meaningfully. In this paper we introduce SAWSDL-compliant annotations for WF and use them with a semantic reasoner to guarantee semantic type correctness in scientific workflows. Examples from bioinformatics are presented.
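The kind of check described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the toy ontology, the service names and the helper functions below are all hypothetical, and a plain parent lookup stands in for a real semantic reasoner working over SAWSDL annotations.

```python
# Hypothetical toy ontology: each concept maps to its direct parent.
# None of these names come from the paper.
SUBCLASS_OF = {
    "ProteinSequence": "Sequence",
    "DNASequence": "Sequence",
    "Sequence": "Thing",
    "Alignment": "Thing",
}


def is_subtype(concept, ancestor):
    """Return True if `concept` equals `ancestor` or descends from it."""
    while concept is not None:
        if concept == ancestor:
            return True
        concept = SUBCLASS_OF.get(concept)
    return False


def find_mismatches(links, output_types, input_types):
    """List the links whose producer output is not accepted by the consumer."""
    return [
        (producer, consumer)
        for producer, consumer in links
        if not is_subtype(output_types[producer], input_types[consumer])
    ]


# Hypothetical annotations: what each service emits and what it expects.
output_types = {"fetch_protein": "ProteinSequence"}
input_types = {"align": "Sequence", "translate": "DNASequence"}
links = [("fetch_protein", "align"), ("fetch_protein", "translate")]

mismatches = find_mismatches(links, output_types, input_types)
```

Here the link into `translate` is flagged because a protein sequence is not a DNA sequence, while the link into `align` passes because the consumer accepts any sequence.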

Southampton (e-Prints Soton)

myTea: Connecting the Web to Digital Science on the Desktop

Author: Brostoff Sacha
Cooke Ray
Gibson Andrew
schraefel m.c.
Stevens Robert
Publication venue: s.n.
Publication date: 01/01/2005

Bioinformaticians regularly access the hundreds of databases and tools that are available to them on the Web. None of these tools communicate with each other, causing the scientist to copy results manually from a Web site into a spreadsheet or word processor. myGrid's Taverna has made it possible to create templates (workflows) that automatically run searches using these databases and tools, cutting down what previously took days of work into hours, and enabling the automated capture of experimental details. What is still missing in the capture process, however, are the details of work done on that material once it moves from the Web to the desktop: if a scientist runs a process on some data, there is nothing to record why that action was taken; it is likewise not easy to publish a record of this process back to the community on the Web. In this paper, we present a novel interaction framework, built on Semantic Web technologies, and grounded in usability design practice, in particular the Making Tea method. Through this work, we introduce a new model of practice designed specifically to (1) support the scientists' interactions with data from the Web to the desktop, (2) provide automatic annotation of process to capture what has previously been lost and (3) associate provenance services automatically with that data in order to enable meaningful interrogation of the process and controlled sharing of the results.

Southampton (e-Prints Soton)

EXACT2: the semantics of biomedical protocols

Author: A Maccagnan
A Pease
A Sackmann
A Sujathaa
Brian B Rudkin
CJ Mungall
Daniel Nadis
Doi
Emma Haddi
Grunwald
H Obokata
I Mura
J Taubert
K Wolstencroft
Larisa N Soldatova
LN Soldatova
M Courtot
M Hilario
M Schilling
Nigel J Saunders
Piyali S Basu
R Garside
RD King
Ross D King
RR Brinkman
S Mitchell
S Rune
S Shapin
T Bittner
T Klingström
Th Paul
V Rätzel
Véronique Baumlé
W Ceusters
Wolfgang Marwan
Z Xiang
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2014

© 2014 Soldatova et al.; licensee BioMed Central. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated. This article has been made available through the Brunel Open Access Publishing Fund.

Background: The reliability and reproducibility of experimental procedures is a cornerstone of scientific practice. There is a pressing technological need for the better representation of biomedical protocols to enable other agents (human or machine) to better reproduce results. A framework that ensures that all information required for the replication of experimental protocols is captured is essential to achieving reproducibility.

Methods: We have developed the ontology EXACT2 (EXperimental ACTions) that is designed to capture the full semantics of biomedical protocols required for their reproducibility. To construct EXACT2 we manually inspected hundreds of published and commercial biomedical protocols from several areas of biomedicine. After establishing a clear pattern for extracting the required information we utilized text-mining tools to translate the protocols into a machine amenable format. We have verified the utility of EXACT2 through the successful processing of previously ‘unseen’ (not used for the construction of EXACT2) protocols.

Results: The paper reports on a fundamentally new version, EXACT2, that supports the semantically-defined representation of biomedical protocols. The ability of EXACT2 to capture the semantics of biomedical procedures was verified through a text mining use case. In this use case, EXACT2 is used as a reference model for text mining tools to identify terms pertinent to experimental actions, and their properties, in biomedical protocols expressed in natural language. An EXACT2-based framework for the translation of biomedical protocols to a machine amenable format is proposed.

Conclusions: The EXACT2 ontology is sufficient to record, in a machine processable form, the essential information about biomedical protocols. EXACT2 defines explicit semantics of experimental actions, and can be used by various computer applications. It can serve as a reference model for the translation of biomedical protocols in natural language into a semantically-defined format. This work has been partially funded by the Brunel University BRIEF award and a grant from Occams Resources.

Goldsmiths Research Online

Crossref

Springer - Publisher Connector

PubMed Central

Brunel University Research Archive

A Linked Data Approach to Sharing Workflows and Workflow Results

Author: Bechhofer S
Margaria T
Marshall MS
Missier P
Newman DR
Roos M
Roure DD
Steffen B
Zhao J
Publication date: 01/01/2010

A bioinformatics analysis pipeline is often highly elaborate, due to the inherent complexity of biological systems and the variety and size of datasets. A digital equivalent of the ‘Materials and Methods’ section in wet laboratory publications would be highly beneficial to bioinformatics, for evaluating evidence and examining data across related experiments, while introducing the potential to find associated resources and integrate them as data and services. We present initial steps towards preserving bioinformatics ‘materials and methods’ by exploiting the workflow paradigm for capturing the design of a data analysis pipeline, and RDF to link the workflow, its component services, run-time provenance, and a personalized biological interpretation of the results. An example shows the reproduction of the unique graph of an analysis procedure, its results, provenance, and personal interpretation of a text mining experiment. It links data from Taverna, myExperiment.org, BioCatalogue.org, and ConceptWiki.org. The approach is relatively ‘light-weight’ and unobtrusive to bioinformatics users.

Southampton (e-Prints Soton)

Crossref

University of Birmingham Research Portal

Oxford University Research Archive

The University of Manchester - Institutional Repository

Automatic annotation of bioinformatics workflows with biomedical ontologies

Author: B. Smith
B.P. Vandervalk
D. Sánchez
D. Withers
J. Ison
M.D. Wilkinson
P. Lord
P. Rice
S. Harispe
T. Oinn
U. Radetzki
Publication date: 01/01/2014

Legacy scientific workflows, and the services within them, often present scarce and unstructured (i.e. textual) descriptions. This makes it difficult to find, share and reuse them, thus dramatically reducing their value to the community. This paper presents an approach to annotating workflows and their subcomponents with ontology terms, in an attempt to describe these artifacts in a structured way. Despite a dearth of even textual descriptions, we automatically annotated 530 myExperiment bioinformatics-related workflows, including more than 2600 workflow-associated services, with relevant ontological terms. Quantitative evaluation of the Information Content of these terms suggests that, in cases where annotation was possible at all, the annotation quality was comparable to manually curated bioinformatics resources. Comment: 6th International Symposium on Leveraging Applications (ISoLA 2014 conference), 15 pages, 4 figures.
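For readers unfamiliar with Information Content, the sketch below shows one common corpus-based definition, IC(t) = -log2 p(t), where p(t) is the relative frequency of term t among all annotations. The abstract does not say which definition the authors used, so this is an assumption, and the term counts are invented for illustration.

```python
import math


def information_content(term_counts):
    """Map each term to -log2 of its relative frequency in the corpus."""
    total = sum(term_counts.values())
    return {
        term: -math.log2(count / total)
        for term, count in term_counts.items()
    }


# Hypothetical annotation counts, not data from the paper.
counts = {"data": 4, "sequence": 2, "sequence alignment": 2}
ic = information_content(counts)
```

Under this definition, rarer (more specific) terms score higher: the frequent term "data" receives a lower value than "sequence alignment".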

arXiv.org e-Print Archive

Crossref

Bioinformatics service reconciliation by heterogeneous schema transformation

Author: Martin Nigel
Poulovassilis Alexandra
Zamboulis Lucas
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/06/2007

This paper focuses on the problem of bioinformatics service reconciliation in a generic and scalable manner so as to enhance interoperability in a highly evolving field. Using XML as a common representation format, but also supporting existing flat-file representation formats, we propose an approach for the scalable semi-automatic reconciliation of services, possibly invoked from within a scientific workflow tool. Service reconciliation may use the AutoMed heterogeneous data integration system as an intermediary service, or may use AutoMed to produce services that mediate between services. We discuss the application of our approach for the reconciliation of services in an example bioinformatics workflow. The main contribution of this research is an architecture for the scalable reconciliation of bioinformatics services.

Birkbeck Institutional Research Online

A Query Integrator and Manager for the Query Web

Author: Brinkley James F.
Detwiler Landon T.
Publication date: 01/04/2012

We introduce two concepts: the Query Web as a layer of interconnected queries over the document web and the semantic web, and a Query Web Integrator and Manager (QI) that enables the Query Web to evolve. QI permits users to write, save and reuse queries over any web accessible source, including other queries saved in other installations of QI. The saved queries may be in any language (e.g. SPARQL, XQuery); the only condition for interconnection is that the queries return their results in some form of XML. This condition allows queries to chain off each other, and to be written in whatever language is appropriate for the task. We illustrate the potential use of QI for several biomedical use cases, including ontology view generation using a combination of graph-based and logical approaches, value set generation for clinical data management, image annotation using terminology obtained from an ontology web service, ontology-driven brain imaging data integration, small-scale clinical data integration, and wider-scale clinical data integration. Such use cases illustrate the current range of applications of QI and lead us to speculate about the potential evolution from smaller groups of interconnected queries into a larger query network that layers over the document and semantic web. The resulting Query Web could greatly aid researchers and others who now have to manually navigate through multiple information sources in order to answer specific questions.
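The chaining condition in this abstract (every query returns XML, so any query can consume another's output) can be sketched as follows. This does not use the QI system or its API: both query functions, the element names and the gene symbols are hypothetical, and plain Python functions stand in for saved queries written in SPARQL or XQuery.

```python
import xml.etree.ElementTree as ET


def gene_query():
    """Hypothetical saved query: returns its results as an XML string."""
    root = ET.Element("results")
    for symbol in ("BRCA1", "TP53"):
        ET.SubElement(root, "gene").text = symbol
    return ET.tostring(root, encoding="unicode")


def count_query(upstream_xml):
    """Hypothetical second query that consumes the XML of the first."""
    genes = ET.fromstring(upstream_xml).findall("gene")
    root = ET.Element("results")
    ET.SubElement(root, "count").text = str(len(genes))
    return ET.tostring(root, encoding="unicode")


# The second query chains off the first because both speak XML.
chained = count_query(gene_query())
```

Because the second query depends only on the XML shape of its input, the first could be rewritten in another language without changing the chain.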

Elsevier - Publisher Connector

University of Washington Structural Informatics Group Publications

The Semantic Automated Discovery and Integration (SADI) Web service Design-Pattern, API and Reference Implementation

Author: Benjamin Vandervalk
Luke McCarthy
Mark Wilkinson
Publication date: 01/01/2011

Background. 
The complexity and inter-related nature of biological data pose a difficult challenge for data and tool integration. There has been a proliferation of interoperability standards and projects over the past decade, none of which has been widely adopted by the bioinformatics community. Recent attempts have focused on the use of semantics to assist integration, and Semantic Web technologies are being welcomed by this community.

Description. 
SADI – Semantic Automated Discovery and Integration – is a lightweight set of fully standards-compliant Semantic Web service design patterns that simplify the publication of services of the type commonly found in bioinformatics and other scientific domains. Using Semantic Web technologies at every level of the Web services “stack”, SADI services consume and produce instances of OWL Classes following a small number of very straightforward best-practices. In addition, we provide codebases that support these best-practices, and plug-in tools to popular developer and client software that dramatically simplify deployment of services by providers, and the discovery and utilization of those services by their consumers.

Conclusions.
SADI Services are fully compliant with, and utilize only, foundational Web standards; are simple to create and maintain for service providers; and can be discovered and utilized in a very intuitive way by biologist end-users. In addition, the SADI design patterns significantly improve the ability of software to automatically discover appropriate services based on user-needs, and automatically chain these into complex analytical workflows. We show that, when resources are exposed through SADI, data compliant with a given ontological model can be automatically gathered, or generated, from these distributed, non-coordinating resources - a behavior we have not observed in any other Semantic system. Finally, we show that, using SADI, data dynamically generated from Web services can be explored in a manner very similar to data housed in static triple-stores, thus facilitating the intersection of Web services and Semantic Web technologies.

Springer - Publisher Connector

Nature Precedings

A Semantic Workflow Mechanism to Realize Experimental Goals and Constraints

Author: Edwards Pete
Gotts Nicholas
Pignotti Edoardo
Polhill J. G.
Preece Alun David
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2008

Postprint

Aberdeen University Research

Online Research @ Cardiff

P5.3.2 Adaptive Workflow Technology

Author: Cantalupo B.
Ferris J.
Matskanis N.
Surridge M.
Publication venue: s.n.
Publication date: 01/04/2006

Southampton (e-Prints Soton)