1,015 research outputs found

    Automatic annotation of bioinformatics workflows with biomedical ontologies

    Legacy scientific workflows, and the services within them, often present scarce and unstructured (i.e. textual) descriptions. This makes it difficult to find, share and reuse them, thus dramatically reducing their value to the community. This paper presents an approach to annotating workflows and their subcomponents with ontology terms, in an attempt to describe these artifacts in a structured way. Despite a dearth of even textual descriptions, we automatically annotated 530 myExperiment bioinformatics-related workflows, including more than 2600 workflow-associated services, with relevant ontological terms. Quantitative evaluation of the Information Content of these terms suggests that, in cases where annotation was possible at all, the annotation quality was comparable to manually curated bioinformatics resources. Comment: 6th International Symposium on Leveraging Applications (ISoLA 2014 conference), 15 pages, 4 figures

    Semantically Resolving Type Mismatches in Scientific Workflows

    Scientists are increasingly utilizing Grids to manage large data sets and execute scientific experiments on distributed resources. Scientific workflows are used as a means of modeling and enacting scientific experiments. Windows Workflow Foundation (WF) is a major component of Microsoft’s .NET technology which offers lightweight support for long-running workflows. It provides a comfortable graphical and programmatic environment for the development of extended BPEL-style workflows. WF’s visual features ease the syntactic composition of Web services into scientific workflows but do nothing to assure that information passed between services has consistent semantic types or representations, or that deviant flows, errors and compensations are handled meaningfully. In this paper we introduce SAWSDL-compliant annotations for WF and use them with a semantic reasoner to guarantee semantic type correctness in scientific workflows. Examples from bioinformatics are presented.

    myTea: Connecting the Web to Digital Science on the Desktop

    Bioinformaticians regularly access the hundreds of databases and tools that are available to them on the Web. None of these tools communicate with each other, causing the scientist to copy results manually from a Web site into a spreadsheet or word processor. myGrid's Taverna has made it possible to create templates (workflows) that automatically run searches using these databases and tools, cutting down what previously took days of work into hours, and enabling the automated capture of experimental details. What is still missing in the capture process, however, are the details of work done on that material once it moves from the Web to the desktop: if a scientist runs a process on some data, there is nothing to record why that action was taken; it is likewise not easy to publish a record of this process back to the community on the Web. In this paper, we present a novel interaction framework, built on Semantic Web technologies, and grounded in usability design practice, in particular the Making Tea method. Through this work, we introduce a new model of practice designed specifically to (1) support the scientists' interactions with data from the Web to the desktop, (2) provide automatic annotation of process to capture what has previously been lost and (3) associate provenance services automatically with that data in order to enable meaningful interrogation of the process and controlled sharing of the results.

    Structuring research methods and data with the research object model: genomics workflows as a case study

    Background: One of the main challenges for biomedical research lies in the computer-assisted integrative study of large and increasingly complex combinations of data in order to understand molecular mechanisms. The preservation of the materials and methods of such computational experiments with clear annotations is essential for understanding an experiment, and this is increasingly recognized in the bioinformatics community. Our assumption is that offering means of digital, structured aggregation and annotation of the objects of an experiment will provide the meta-data a scientist needs to understand and recreate the results of an experiment. To support this we explored a model for the semantic description of a workflow-centric Research Object (RO), where an RO is defined as a resource that aggregates other resources, e.g., datasets, software, spreadsheets, text, etc. We applied this model to a case study where we analysed human metabolite variation by workflows. Results: We present the application of the workflow-centric RO model for our bioinformatics case study. Three workflows were produced following recently defined Best Practices for workflow design. By modelling the experiment as an RO, we were able to automatically query the experiment and answer questions such as "which particular data was input to a particular workflow to test a particular hypothesis?", and "which particular conclusions were drawn from a particular workflow?". Conclusions: Applying a workflow-centric RO model to aggregate and annotate the resources used in a bioinformatics experiment allowed us to retrieve the conclusions of the experiment in the context of the driving hypothesis, the executed workflows and their input data. The RO model is an extendable reference model that can be used by other systems as well. Availability: The Research Object is available at http://www.myexperiment.org/packs/428. The Wf4Ever Research Object Model is available at http://wf4ever.github.io/r

    WorkflowHunt: a hybrid search mechanism for scientific workflow repositories

    Advisor: Claudia Maria Bauzer Medeiros. Dissertation (master's) - Universidade Estadual de Campinas, Instituto de Computação. Abstract: Scientific experiments and the datasets generated from them are growing in size and complexity. Scientists have difficulty sharing those resources in a way that allows the experiment to be reproduced. Some initiatives have emerged to try to solve this problem. One of them involves the use of scientific workflows to represent and enact the execution of scientific experiments. There is an increasing number of workflows that are potentially relevant to more than one scientific domain. Creating a workflow takes time and resources, and reusing workflows helps scientists build new ones faster and more reliably. However, it is hard to find workflows suitable for reuse in an experiment. Workflow repositories usually have search mechanisms with many limitations, which hampers the discovery of relevant workflows by a scientist or their team. This dissertation presents WorkflowHunt, a hybrid architecture for workflow search and discovery in generic repositories, which combines keyword-based and semantic search to find relevant workflows. Unlike most related work, our proposal and its implementation are generic. Our indexing and annotation mechanism is automatic and not restricted to a specific domain or ontology. We validated the architecture by creating a prototype that uses real workflows and metadata from myExperiment, one of the largest online scientific workflow repositories. Our system also compares its results with those of myExperiment's search engine to analyze in which cases one retrieval system outperforms the other. Master's. Computer Science. Master in Computer Science. CAPE

    Combining ontologies and workflows to design formal protocols for biological laboratories

    Background: Laboratory protocols in life sciences tend to be written in natural language, with negative consequences for the repeatability, distribution and automation of scientific experiments. Formalization of knowledge is becoming popular in science. In the case of laboratory protocols two levels of formalization are needed: one for the entities and individual operations involved in protocols and another one for the procedures, which can be manually or automatically executed. This study aims to combine ontologies and workflows for protocol formalization. Results: A laboratory domain-specific ontology and the COW (Combining Ontologies with Workflows) software tool were developed to formalize workflows built on ontologies. A method was specifically set up to support the design of structured protocols for biological laboratory experiments. The workflows were enhanced with ontological concepts taken from the developed domain-specific ontology. The experimental protocols represented as workflows are saved in two linked files using two standard interchange languages (i.e. XPDL for workflows and OWL for ontologies). A distribution package of COW, including installation procedure, ontology and workflow examples, is freely available from http://www.bmr-genomics.it/farm/cow. Conclusions: Using COW, a laboratory protocol may be directly defined by wet-lab scientists without writing code, which will keep the resulting protocol's specifications clear and easy to read and maintain.

    BioUSeR: a semantic-based tool for retrieving Life Science web resources driven by text-rich user requirements

    Background: Open metadata registries are a fundamental tool for researchers in the Life Sciences trying to locate resources. While most current registries assume that resources are annotated with well-structured metadata, evidence shows that most of the resource annotations simply consist of informal free text. This reality must be taken into account in order to develop effective techniques for resource discovery in Life Sciences. Results: BioUSeR is a semantic-based tool aimed at retrieving Life Sciences resources described in free text. The retrieval process is driven by the user requirements, which consist of a target task and a set of facets of interest, both expressed in free text. BioUSeR is able to effectively exploit the available textual descriptions to find relevant resources by using semantic-aware techniques. Conclusions: BioUSeR overcomes the limitations of the current registries thanks to: (i) rich specification of user information needs, (ii) use of semantics to manage textual descriptions, (iii) retrieval and ranking of resources based on user requirements.