
    Cost-aware scheduling of deadline-constrained task workflows in public cloud environments

    Public cloud computing infrastructure offers resources on-demand, and makes it possible to develop applications that elastically scale when demand changes. This capacity can be used to schedule highly parallelizable task workflows, where individual tasks consist of many small steps. By dynamically scaling the number of virtual machines used, based on the varying resource requirements of different steps, lower costs can be achieved, and workflows that would previously have been infeasible can be executed. In this paper, we describe how task workflows consisting of large numbers of distributable steps can be provisioned on public cloud infrastructure in a cost-efficient way, taking into account workflow deadlines. We formally define the problem, and describe an ILP-based algorithm and two heuristic algorithms to solve it. We simulate how the three algorithms perform when scheduling these task workflows on public cloud infrastructure, using the various instance types of the Amazon EC2 cloud, and we evaluate the achieved cost and execution speed of the three algorithms using two different task workflows based on a document processing application.
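    As a rough illustration of the provisioning problem, the sketch below implements a simple greedy heuristic in Python. It is not the paper's ILP formulation or either of its two heuristics; the instance names mirror Amazon EC2 types, but the throughput and price figures are invented for illustration, and steps are assumed perfectly distributable.

```python
# A minimal sketch of cost-aware, deadline-constrained provisioning,
# assuming hypothetical instance throughputs and prices.
from math import ceil

# Hypothetical instance types: (name, steps processed per hour, $ per hour).
INSTANCE_TYPES = [
    ("m1.small", 10, 0.06),
    ("m1.large", 45, 0.24),
    ("c1.xlarge", 90, 0.52),
]

def cheapest_provisioning(total_steps, deadline_hours):
    """Pick the instance type and count that meets the deadline at the
    lowest total cost, assuming steps are perfectly distributable."""
    best = None
    for name, steps_per_hour, price in INSTANCE_TYPES:
        # Machines needed so that all steps finish within the deadline.
        machines = ceil(total_steps / (steps_per_hour * deadline_hours))
        # Cloud providers bill per started hour, so round the runtime up.
        hours = ceil(total_steps / (machines * steps_per_hour))
        cost = machines * hours * price
        if best is None or cost < best[3]:
            best = (name, machines, hours, cost)
    return best

print(cheapest_provisioning(total_steps=10_000, deadline_hours=8))
```

    A real scheduler would additionally model the varying resource requirements of individual steps and the cost of scaling the machine pool up and down, which is where the paper's ILP and heuristic algorithms come in.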

    Sustainability of Evaluations Presented in Research Publications

    This position paper discusses how research publication would benefit from an infrastructure for evaluation entities that could be used to support documenting research efforts (e.g., in papers or blogs), analysing these efforts, and building upon them. As a concrete example in the domain of semantic technologies, the paper presents the SEALS Platform and discusses how such a platform can promote research publication.

    Collaborative Development and Evaluation of Text-processing Workflows in a UIMA-supported Web-based Workbench

    Challenges in creating comprehensive text-processing workflows include a lack of interoperability between individual components coming from different providers and/or a requirement that end users know programming techniques to compose such workflows. In this paper we demonstrate Argo, a web-based system that addresses these issues in several ways. It supports the widely adopted Unstructured Information Management Architecture (UIMA), which handles the problem of interoperability; it provides a web browser-based interface for developing workflows by drawing diagrams composed of a selection of available processing components; and it provides novel user-interactive analytics such as the annotation editor, which constitutes a bridge between automatic processing and manual correction. These features extend the target audience of Argo to users with limited or no technical background. Here, we focus specifically on the construction of advanced workflows, involving multiple branching and merging points, to facilitate various comparative evaluations. Together with the user-collaboration capabilities supported in Argo, we demonstrate several use cases including visual inspections, comparisons of multiple processing segments or complete solutions against a reference standard, inter-annotator agreement, and shared-task mass evaluations. Ultimately, Argo emerges as a one-stop workbench for defining, processing, editing and evaluating text-processing tasks.
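    To make the branching-and-merging idea concrete, here is a minimal Python sketch of a workflow that forks a document into two competing annotators and merges the branches into a simple agreement comparison. The components and the agreement measure are hypothetical stand-ins; they do not use Argo's or UIMA's actual APIs.

```python
# A minimal sketch of a branching/merging text-processing workflow of the
# kind Argo supports; all components here are hypothetical stand-ins.

def tokenize(doc):
    return {"text": doc, "tokens": doc.split()}

def tagger_a(cas):          # first branch: one hypothetical annotator
    return {**cas, "entities_a": [t for t in cas["tokens"] if t.istitle()]}

def tagger_b(cas):          # second branch: a competing annotator
    return {**cas, "entities_b": [t for t in cas["tokens"] if t.isupper()]}

def merge_and_compare(cas_a, cas_b):
    """Merging point: compute a simple agreement score between branches."""
    a, b = set(cas_a["entities_a"]), set(cas_b["entities_b"])
    agreement = len(a & b) / len(a | b) if a | b else 1.0
    return {"agreement": agreement, "a_only": a - b, "b_only": b - a}

cas = tokenize("NASA and Esa signed the Artemis accords in WASHINGTON")
print(merge_and_compare(tagger_a(cas), tagger_b(cas)))
```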

    The Audio Degradation Toolbox and its Application to Robustness Evaluation

    We introduce the Audio Degradation Toolbox (ADT) for the controlled degradation of audio signals, and propose its usage as a means of evaluating and comparing the robustness of audio processing algorithms. Music recordings encountered in practical applications are subject to varied, sometimes unpredictable degradation. For example, audio is degraded by low-quality microphones, noisy recording environments, MP3 compression, dynamic compression in broadcasting or vinyl decay. In spite of this, no standard software for the degradation of audio exists, and music processing methods are usually evaluated against clean data. The ADT fills this gap by providing Matlab scripts that emulate a wide range of degradation types. We describe 14 degradation units, and how they can be chained to create more complex, 'real-world' degradations. The ADT also provides functionality to adjust existing ground truth, correcting for temporal distortions introduced by degradation. Using four different music informatics tasks, we show that performance strongly depends on the combination of method and degradation applied. We demonstrate that specific degradations can reduce or even reverse the performance difference between two competing methods. ADT source code, sounds, impulse responses and definitions are freely available for download.
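    To illustrate how degradation units can be chained, the sketch below implements two toy units in Python (the ADT itself ships as Matlab scripts). The unit behaviour and parameters are assumptions for illustration, not the toolbox's actual interface.

```python
# A minimal sketch of chaining degradation units in the spirit of the ADT;
# the units and their parameters are illustrative assumptions.
import numpy as np

def add_noise(signal, snr_db):
    """Additive white Gaussian noise at a target signal-to-noise ratio."""
    signal_power = np.mean(signal ** 2)
    noise_power = signal_power / (10 ** (snr_db / 10))
    return signal + np.random.normal(0.0, np.sqrt(noise_power), signal.shape)

def dynamic_compression(signal, threshold=0.5, ratio=4.0):
    """Crude broadcast-style compressor: attenuate peaks above a threshold."""
    out = signal.copy()
    over = np.abs(out) > threshold
    out[over] = np.sign(out[over]) * (threshold + (np.abs(out[over]) - threshold) / ratio)
    return out

def apply_chain(signal, units):
    """Run the signal through a chain of degradation units, in order."""
    for unit in units:
        signal = unit(signal)
    return signal

sr = 22050
clean = np.sin(2 * np.pi * 440 * np.arange(sr) / sr)   # 1 s test tone
degraded = apply_chain(clean, [lambda s: add_noise(s, snr_db=15),
                               dynamic_compression])
```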

    Proteomics to go: Proteomatic enables the user-friendly creation of versatile MS/MS data evaluation workflows

    We present Proteomatic, an operating system-independent and user-friendly platform that enables the construction and execution of MS/MS data evaluation pipelines using free and commercial software. Required external programs, such as those for peptide identification, are downloaded automatically in the case of free software. Due to a strict separation of functionality and presentation, and support for multiple scripting languages, new processing steps can be added easily.
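    The separation of functionality and presentation can be sketched as processing steps that declare their parameters as plain data, which any front end (GUI or CLI) can render without knowing the step's internals. The class and parameter names below are hypothetical, not Proteomatic's real interface.

```python
# A minimal sketch of separating a processing step's functionality from its
# presentation; all names here are hypothetical illustrations.
class ProcessingStep:
    name = "base step"
    parameters = {}           # declared as data, rendered by any front end

    def run(self, inputs, **params):
        raise NotImplementedError

class FilterByScore(ProcessingStep):
    name = "Filter peptide identifications by score"
    parameters = {"min_score": {"type": float, "default": 0.9}}

    def run(self, inputs, min_score=0.9):
        # inputs: list of (peptide, score) pairs from an upstream step
        return [(pep, s) for pep, s in inputs if s >= min_score]

def run_pipeline(steps, data):
    """Execute a linear pipeline of (step, parameter dict) pairs."""
    for step, params in steps:
        data = step.run(data, **params)
    return data

hits = [("PEPTIDER", 0.97), ("SAMPLEK", 0.42)]
print(run_pipeline([(FilterByScore(), {"min_score": 0.9})], hits))
```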

    On-tissue dataset-dependent MALDI-TIMS-MS² bioimaging

    Trapped ion mobility spectrometry (TIMS) adds an additional separation dimension to mass spectrometry (MS) imaging; however, the lack of fragmentation spectra (MS²) impedes confident compound annotation in spatial metabolomics. Here, we describe spatial ion mobility-scheduled exhaustive fragmentation (SIMSEF), a dataset-dependent acquisition strategy that augments TIMS-MS imaging datasets with MS² spectra. The fragmentation experiments are systematically distributed across the sample and scheduled for multiple collision energies per precursor ion. Extendable data processing and evaluation workflows are implemented in the open-source software MZmine. The workflow and annotation capabilities are demonstrated on rat brain tissue thin sections, measured by matrix-assisted laser desorption/ionisation (MALDI)-TIMS-MS, where SIMSEF enables on-tissue compound annotation through spectral library matching and rule-based lipid annotation within MZmine, and maps the (un)known chemical space by molecular networking. The SIMSEF algorithm and data analysis pipelines are open source and modular, to provide a community resource.
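    A rough sketch of what such dataset-dependent scheduling might look like is given below: for each precursor, fragmentation events are booked at the pixels where the precursor is most intense, one per collision energy, subject to a per-pixel capacity. The data structures, collision energies and capacity limit are assumptions for illustration, not MZmine's SIMSEF implementation.

```python
# A minimal sketch of dataset-dependent MS2 scheduling in the spirit of
# SIMSEF; data structures and parameters are illustrative assumptions.
from collections import defaultdict

COLLISION_ENERGIES = [20, 35, 50]      # eV, illustrative values

def schedule_ms2(precursors, max_events_per_pixel=3):
    """precursors: {mz: {pixel: intensity}} -> {pixel: [(mz, energy), ...]}."""
    schedule = defaultdict(list)
    load = defaultdict(int)            # events already booked per pixel
    for mz, pixel_map in precursors.items():
        # Prefer pixels where this precursor is most intense.
        ranked = sorted(pixel_map, key=pixel_map.get, reverse=True)
        for energy in COLLISION_ENERGIES:
            for pixel in ranked:
                if load[pixel] < max_events_per_pixel:
                    schedule[pixel].append((mz, energy))
                    load[pixel] += 1
                    break              # book the next energy separately
    return dict(schedule)

precursors = {522.35: {(10, 4): 9e4, (11, 4): 7e4},
              760.58: {(10, 4): 5e4, (3, 9): 8e4}}
print(schedule_ms2(precursors))
```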

    Continuously Accelerating Research

    Science is facing a software reproducibility crisis. Software powers experimentation and fuels insights, yielding new scientific contributions. Yet research software is often difficult for other researchers to run reproducibly. Beyond reproduction, research software that is truly reusable will speed science by allowing other researchers to easily build upon and extend prior work. As software engineering researchers, we believe that it is our duty to create tools and processes that instill reproducibility, reusability, and extensibility into research software. This paper outlines a vision for a community infrastructure that will bring the benefits of continuous integration to scientists developing research software. To persuade researchers to adopt this infrastructure, we will appeal to their self-interest by making it easier for them to develop and evaluate research prototypes. Building better research software is a complex socio-technical problem that requires stakeholders to join forces to solve it for the software engineering community and the greater scientific community. This vision paper outlines an agenda for realizing a world where the reproducibility and reusability barriers in research software are lifted, continuously accelerating research.

    MITK-ModelFit: A generic open-source framework for model fits and their exploration in medical imaging -- design, implementation and application on the example of DCE-MRI

    Many medical imaging techniques utilize fitting approaches for quantitative parameter estimation and analysis. Common examples are pharmacokinetic modeling in DCE MRI/CT, ADC calculation and IVIM modeling in diffusion-weighted MRI, and Z-spectra analysis in chemical exchange saturation transfer MRI. Most available software tools are limited to a special purpose and do not allow for custom development and extension. Furthermore, they are mostly designed as stand-alone solutions using external frameworks and thus cannot be easily incorporated natively into the analysis workflow. We present a framework for medical image fitting tasks that is included in MITK, following a rigorous open-source, well-integrated and operating system-independent policy. Software engineering-wise, the local models, the fitting infrastructure and the results representation are abstracted and thus can be easily adapted to any model fitting task on image data, independent of image modality or model. Several ready-to-use libraries for model fitting and use cases, including fit evaluation and visualization, were implemented. Their embedding into MITK allows for easy data loading, pre- and post-processing, and thus a natural inclusion of model fitting into an overarching workflow. As an example, we present a comprehensive set of plug-ins for the analysis of DCE MRI data, which we validated on existing and novel digital phantoms, yielding competitive deviations between fit and ground truth. Providing a very flexible environment, our software mainly addresses developers of medical imaging software that includes model fitting algorithms and tools. Additionally, the framework is of high interest to users in the domain of perfusion MRI, as it offers feature-rich, freely available, validated tools to perform pharmacokinetic analysis on DCE MRI data, with both interactive and automatized batch processing workflows.
    Comment: 31 pages, 11 figures. URL: http://mitk.org/wiki/MITK-ModelFi
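    To give a flavour of the voxel-wise model fitting that the framework abstracts, here is a minimal Python sketch using SciPy and a toy mono-exponential model; MITK-ModelFit itself is a C++ framework and its pharmacokinetic models are considerably more elaborate.

```python
# A minimal sketch of voxel-wise model fitting of the kind MITK-ModelFit
# generalises, using SciPy rather than the framework's actual API.
import numpy as np
from scipy.optimize import curve_fit

def mono_exp(t, amplitude, rate):
    """Illustrative signal model, e.g. a simple washout curve."""
    return amplitude * np.exp(-rate * t)

def fit_voxel(t, signal):
    """Fit the model to one voxel's time course; return its parameters."""
    popt, _ = curve_fit(mono_exp, t, signal, p0=(signal.max(), 0.1))
    return {"amplitude": popt[0], "rate": popt[1]}

t = np.linspace(0, 60, 30)                       # acquisition times (s)
truth = mono_exp(t, amplitude=2.0, rate=0.05)    # digital-phantom-style truth
noisy = truth + np.random.normal(0, 0.02, t.shape)
print(fit_voxel(t, noisy))                       # ~ amplitude 2.0, rate 0.05
```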

    Hellenic National Documentation Center "Open Science: Issues and Perspectives" Conference Report, June 2017

    Report on the National Documentation Center (NDC) one-day conference "Open Science: Issues and Perspectives", held on 15 June. It offers a critical overview of the event's effectiveness in familiarizing a wide variety of professionals with Open Access (OA) and Open Science (OS) issues by providing them with essential baseline knowledge of the changing, technology-driven landscape of publication and data management.