Supporting the Everyday Work of Scientists: Automating Scientific Workflows
This paper describes an action research project that we undertook with National Research Council Canada (NRC) scientists. Based on discussions about their difficulties in using software to collect data and manage processes, we identified three requirements for increasing research productivity: ease of use for end-users; managing scientific workflows; and facilitating software interoperability. Based on these requirements, we developed a software framework, Sweet, to assist in the automation of scientific workflows.

Throughout the iterative development process, and through a series of structured interviews, we evaluated how the framework was used in practice, and identified increases in productivity and effectiveness and their causes. While the framework provides resources for writing application wrappers, it was easier to code the applications' functionality directly into the framework using OSS components. Ease of use for the end-user and flexible and fully parameterized workflow representations were key elements of the framework's success.
S3Mining: A model-driven engineering approach for supporting novice data miners in selecting suitable classifiers
Data mining has proven very useful for extracting information from data in many different contexts. However, due to the complexity of data mining techniques, the know-how of an expert in the field is required to select and use them. In practice, adequately applying data mining is out of the reach of novice users who have expertise in their own area of work but lack the skills to employ these techniques. In this paper, we use both model-driven engineering and scientific workflow standards and tools to develop the S3Mining framework, which supports novice users in selecting the data mining classification algorithm that best fits their data and goal. To this aim, the selection process uses the past experiences of expert data miners applying classification techniques to their own datasets. The contributions of our S3Mining framework are as follows: (i) an approach to create a knowledge base that stores the past experiences of expert users, (ii) a process that provides expert users with utilities for constructing classifier recommenders based on the existing knowledge base, (iii) a system that allows novice data miners to use these recommenders to discover the classifiers that best fit the problem at hand, and (iv) a public implementation of the framework's workflows. Finally, an experimental evaluation has been conducted to show the feasibility of our framework.
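The recommendation idea in this abstract, matching a new dataset against past expert experiences, can be sketched as a nearest-neighbour lookup over dataset meta-features. This is an invented toy, not the actual S3Mining implementation; the meta-features and classifier names are illustrative only.

```python
# Toy sketch of experience-based classifier recommendation (invented names).
import math

# Knowledge base: (meta-features, best classifier) from past expert experiences.
# Meta-features here: (n_instances, n_attributes, n_classes).
knowledge_base = [
    ((150, 4, 3), "decision_tree"),
    ((70000, 784, 10), "neural_network"),
    ((1000, 20, 2), "naive_bayes"),
]

def recommend(meta_features):
    """Return the classifier that worked best on the most similar past dataset."""
    def dist(a, b):
        # Log-scale the counts so huge datasets do not dominate the distance.
        return math.dist([math.log1p(x) for x in a],
                         [math.log1p(x) for x in b])
    _, best = min(knowledge_base, key=lambda kb: dist(kb[0], meta_features))
    return best

print(recommend((200, 5, 3)))  # closest to the first entry -> "decision_tree"
```

A real system would use richer meta-features and aggregate over many experiences, but the lookup-by-similarity structure is the core of the approach the abstract describes.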
A posteriori metadata from automated provenance tracking: Integration of AiiDA and TCOD
In order to make results of computational scientific research findable,
accessible, interoperable and re-usable, it is necessary to decorate them with
standardised metadata. However, there are a number of technical and practical
challenges that make this process difficult to achieve in practice. Here the
implementation of a protocol is presented to tag crystal structures with their
computed properties, without the need of human intervention to curate the data.
This protocol leverages the capabilities of AiiDA, an open-source platform to
manage and automate scientific computational workflows, and TCOD, an
open-access database storing computed materials properties using a well-defined
and exhaustive ontology. Based on these, the complete procedure to deposit
computed data in the TCOD database is automated. All relevant metadata are
extracted from the full provenance information that AiiDA tracks and stores
automatically while managing the calculations. Such a protocol also enables
reproducibility of scientific data in the field of computational materials
science. As a proof of concept, the AiiDA-TCOD interface is used to deposit 170
theoretical structures together with their computed properties and their full
provenance graphs, consisting of over 4600 AiiDA nodes.
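The key mechanism here, deriving deposition metadata automatically from tracked provenance, amounts to walking the graph of inputs that produced a result and gathering their attributes. The following is a toy sketch of that idea with invented node and field names, not the AiiDA API.

```python
# Toy provenance graph: each node lists the inputs it came from, plus
# attributes recorded while the calculation ran. Names are invented.
nodes = {
    "structure_1": {"type": "structure", "inputs": ["calc_1"]},
    "calc_1": {"type": "calculation", "inputs": ["pseudo_1", "structure_0"],
               "attrs": {"code": "quantum_espresso", "energy_cutoff_eV": 400}},
    "pseudo_1": {"type": "pseudopotential", "inputs": [],
                 "attrs": {"element": "Si"}},
    "structure_0": {"type": "structure", "inputs": [],
                    "attrs": {"formula": "Si2"}},
}

def collect_metadata(node_id, graph):
    """Depth-first walk of the provenance graph, gathering all attributes."""
    meta, seen, stack = {}, set(), [node_id]
    while stack:
        nid = stack.pop()
        if nid in seen:
            continue
        seen.add(nid)
        node = graph[nid]
        meta.update(node.get("attrs", {}))
        stack.extend(node["inputs"])
    return meta

print(collect_metadata("structure_1", nodes))
```

Because every attribute was recorded at calculation time, the deposited metadata needs no human curation, which is exactly the property the abstract highlights.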
McRunjob: A High Energy Physics Workflow Planner for Grid Production Processing
McRunjob is a powerful grid workflow manager used to manage the generation of
large numbers of production processing jobs in High Energy Physics. In use at
both the DZero and CMS experiments, McRunjob has been used to manage large
Monte Carlo production processing since 1999 and is being extended to uses in
regular production processing for analysis and reconstruction. Described at
CHEP 2001, McRunjob converts core metadata into jobs submittable in a variety
of environments. The powerful core metadata description language includes
methods for converting the metadata into persistent forms, job descriptions,
multi-step workflows, and data provenance information. The language features
allow for structure in the metadata by including full expressions, namespaces,
functional dependencies, site specific parameters in a grid environment, and
ontological definitions. It also has simple control structures for
parallelization of large jobs. McRunjob features a modular design which allows
for easy expansion to new job description languages or new application level
tasks.

Comment: CHEP 2003 serial number TUCT00
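The central conversion the abstract describes, expanding one core metadata description into many submittable jobs, can be sketched as follows. The metadata fields and command-line flags are invented for illustration; they are not McRunjob's actual language.

```python
# Invented sketch: a declarative metadata record is expanded into
# per-job command lines suitable for submission to a batch system.
metadata = {
    "experiment": "CMS",
    "events_per_job": 1000,
    "n_jobs": 3,
    "executable": "simulate",
}

def to_job_scripts(meta):
    """Expand one metadata record into per-job shell command lines."""
    return [
        f"{meta['executable']} --experiment {meta['experiment']} "
        f"--first-event {i * meta['events_per_job']} "
        f"--n-events {meta['events_per_job']}"
        for i in range(meta["n_jobs"])
    ]

for line in to_job_scripts(metadata):
    print(line)
```

A real workflow planner layers namespaces, expressions, and site-specific parameters on top of this expansion, as the abstract notes, but the metadata-to-jobs translation is the core step.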
Distribution pattern-driven development of service architectures
Distributed systems are being constructed by composing a number of discrete components. This practice is particularly prevalent within the Web service domain in the form of service process orchestration and choreography. Often, enterprise systems are built from many existing discrete applications such as legacy applications exposed using Web service interfaces. There are a number of architectural configurations or distribution patterns, which express how a composed system is to be deployed in a distributed environment. However, the amount of code
required to realise these distribution patterns is considerable. In this paper, we propose a distribution
pattern-driven approach to service composition and architecting. We develop, based on a catalog of patterns, a UML-compliant framework that takes existing Web service interfaces as its input and generates executable Web service compositions based on a distribution pattern chosen by the software architect.
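The pattern-to-composition generation step can be illustrated with a minimal sketch: the architect names a distribution pattern, and the framework expands the existing service interfaces into concrete service-to-service links. The pattern names and services below are invented examples, not the paper's catalog.

```python
# Invented sketch: expanding a distribution pattern into a composition.
services = ["orders", "billing", "shipping"]

def compose(pattern, services):
    """Expand a distribution pattern into service-to-service links."""
    if pattern == "centralized":
        # One hub orchestrates every other service.
        hub = services[0]
        return [(hub, s) for s in services[1:]]
    if pattern == "pipeline":
        # Each service forwards to the next (choreography-style).
        return list(zip(services, services[1:]))
    raise ValueError(f"unknown pattern: {pattern}")

print(compose("centralized", services))
print(compose("pipeline", services))
```

The point of the approach is that the considerable deployment code mentioned above is generated from this small declarative choice rather than written by hand for each system.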
Elastic Business Process Management: State of the Art and Open Challenges for BPM in the Cloud
With the advent of cloud computing, organizations are nowadays able to react
rapidly to changing demands for computational resources. Not only individual
applications can be hosted on virtual cloud infrastructures, but also complete
business processes. This allows the realization of so-called elastic processes,
i.e., processes which are carried out using elastic cloud resources. Despite
the manifold benefits of elastic processes, there is still a lack of solutions
supporting them.
In this paper, we identify the state of the art of elastic Business Process
Management with a focus on infrastructural challenges. We conceptualize an
architecture for an elastic Business Process Management System and discuss
existing work on scheduling, resource allocation, monitoring, decentralized
coordination, and state management for elastic processes. Furthermore, we
present two representative elastic Business Process Management Systems which
are intended to counter these challenges. Based on our findings, we identify
open issues and outline possible research directions for the realization of
elastic processes and elastic Business Process Management.Comment: Please cite as: S. Schulte, C. Janiesch, S. Venugopal, I. Weber, and
P. Hoenisch (2015). Elastic Business Process Management: State of the Art and
Open Challenges for BPM in the Cloud. Future Generation Computer Systems,
Volume NN, Number N, NN-NN., http://dx.doi.org/10.1016/j.future.2014.09.00
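One of the infrastructural challenges the survey covers, resource allocation for elastic processes, reduces at its simplest to deciding how many cloud resources to hold for the current load. The following is a minimal invented illustration of such a scaling rule, not a mechanism from any system the survey discusses.

```python
# Invented sketch: a bounded scaling rule for an elastic BPM system.
import math

def target_vm_count(queued_steps, steps_per_vm, min_vms=1, max_vms=20):
    """Scale the VM pool to the queue length, within fixed bounds."""
    needed = math.ceil(queued_steps / steps_per_vm) if queued_steps else 0
    return max(min_vms, min(max_vms, needed))

print(target_vm_count(0, 10))    # 1  (never below the floor)
print(target_vm_count(35, 10))   # 4
print(target_vm_count(500, 10))  # 20 (capped at the ceiling)
```

Real schedulers must also weigh deadlines, cost, and process state, which is why the survey treats scheduling, monitoring, and state management as separate open challenges.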