4,098 research outputs found

    Search and Result Presentation in Scientific Workflow Repositories

    We study the problem of searching a repository of complex hierarchical workflows whose component modules, both composite and atomic, have been annotated with keywords. Since keyword search does not use the graph structure of a workflow, we develop a model of workflows using context-free bag grammars. We then give efficient polynomial-time algorithms that, given a workflow and a keyword query, determine whether some execution of the workflow matches the query. Based on these algorithms we develop a search and ranking solution that efficiently retrieves the top-k grammars from a repository. Finally, we propose a novel result presentation method for grammars matching a keyword query, based on representative parse-trees. The effectiveness of our approach is validated through an extensive experimental evaluation.

    Library Resources: Procurement, Innovation and Exploitation in a Digital World

    The possibilities of the digital future require new models for procurement, innovation and exploitation. Emma Crowley and Chris Spencer describe the skills staff need to deliver resources in hybrid and digital environments. The chapter demonstrates the innovative ways in which librarians procure and exploit the wealth of resources available in a digital world. The authors also describe the technological developments that can be adopted to improve workflow processes, and they highlight the challenges faced on this journey.

    WorkflowHunt: um mecanismo de busca híbrida para repositórios de workflows científicos (WorkflowHunt: a hybrid search mechanism for scientific workflow repositories)

    Advisor: Claudia Maria Bauzer Medeiros. Dissertation (master's), Universidade Estadual de Campinas, Instituto de Computação. Abstract: Scientific experiments and the datasets generated from them are growing in size and complexity. Scientists have difficulty sharing those resources in a way that allows the experiment to be reproduced. Some initiatives have emerged to try to solve this problem. One of them involves the use of scientific workflows to represent and enact the execution of scientific experiments. There is an increasing number of workflows that are potentially relevant to more than one scientific domain. Creating a workflow takes time and resources, and reusing workflows helps scientists build new ones faster and more reliably. However, it is hard to find workflows suitable for reuse in an experiment. Workflow repositories usually have search mechanisms with many limitations, which negatively affects the discovery of relevant workflows. This dissertation presents WorkflowHunt, a hybrid architecture for workflow search and discovery in generic repositories, which combines keyword and semantic search to find relevant workflows using different search methods. Unlike most related work, our proposal and its implementation are generic. Our indexing and annotation mechanisms are automatic and not restricted to a specific domain or ontology. We validated our architecture by creating a prototype that uses real workflows and metadata from myExperiment, one of the largest online scientific workflow repositories. Our system also compares its results with those of myExperiment's search engine to analyze in which cases one retrieval system outperforms the other. Master's degree in Computer Science (Mestre em Ciência da Computação). CAPE

    Semantic web technology to support learning about the semantic web

    This paper describes ASPL, an Advanced Semantic Platform for Learning, designed using the Magpie framework to support students learning about the Semantic Web research area. We describe the evolution of ASPL and illustrate how we used the results from a formal evaluation of the initial system to redesign the user functionalities. The second version of ASPL semantically interprets the results provided by a non-semantic web mining tool and uses them to support various forms of semantics-assisted exploration, based on pedagogical strategies such as performing later reasoning steps and problem space filtering.

    Evolution of statistical analysis in empirical software engineering research: Current state and steps forward

    Software engineering research is evolving and papers are increasingly based on empirical data from a multitude of sources, using statistical tests to determine if and to what degree empirical evidence supports their hypotheses. To investigate the practices and trends of statistical analysis in empirical software engineering (ESE), this paper presents a review of a large pool of papers from top-ranked software engineering journals. First, we manually reviewed 161 papers; in the second phase of our method, we conducted a more extensive semi-automatic classification of 5,196 papers spanning the years 2001–2015. Results from both review steps were used to: i) identify and analyze the predominant practices in ESE (e.g., using t-test or ANOVA), as well as relevant trends in usage of specific statistical methods (e.g., nonparametric tests and effect size measures), and ii) develop a conceptual model for a statistical analysis workflow, with suggestions on how to apply different statistical methods as well as guidelines to avoid pitfalls. Lastly, we confirm existing claims that current ESE practices lack a standard to report practical significance of results. We illustrate how practical significance can be discussed in terms of both the statistical analysis and the practitioner's context. Comment: journal submission, 34 pages, 8 figures
