3 research outputs found

    A provenance-based semantic approach to support understandability, reproducibility, and reuse of scientific experiments

    Understandability and reproducibility of scientific results are vital in every field of science. Several reproducibility measures are being taken to make the data used in publications findable and accessible. However, scientists face many challenges from the beginning of an experiment to its end, in particular for data management. The explosive growth of heterogeneous research data and the difficulty of understanding how this data has been derived are among the research problems faced in this context. Interlinking the data, the steps and the results from the computational and non-computational processes of a scientific experiment is important for reproducibility. We introduce the notion of "end-to-end provenance management" of scientific experiments to help scientists understand and reproduce experimental results. The main contributions of this thesis are: (1) We propose a provenance model, "REPRODUCE-ME", to describe scientific experiments using semantic web technologies by extending existing standards. (2) We study computational reproducibility and the important aspects required to achieve it. (3) Taking into account the REPRODUCE-ME provenance model and the study on computational reproducibility, we introduce our tool, ProvBook, which is designed and developed to demonstrate computational reproducibility. It provides features to capture and store the provenance of Jupyter notebooks and helps scientists compare and track the results of different executions. (4) We provide a framework, CAESAR (CollAborative Environment for Scientific Analysis with Reproducibility), for end-to-end provenance management. This collaborative framework allows scientists to capture, manage, query and visualize the complete path of a scientific experiment consisting of computational and non-computational steps in an interoperable way. We apply our contributions to a set of scientific experiments in microscopy research projects.

    Automatic reproducibility and parallelism for biological image analysis workflows

    Current microscopy techniques profit hugely from modern microscopes that produce massive amounts of increasingly complex data, which are analysed by sophisticated algorithms. As a result, previously indistinguishable phenomena can be observed. However, this development coincides with new challenges for the biologist executing these experiments. Data storage, data processing, parallelisation, automation, and reproducibility are important factors in mastering these new techniques, as they incur additional effort that previously had less impact on biologists. Existing solutions address these factors separately. Image storage systems manage the storage of data, specialised tools solve individual processing problems, and workflow systems help with automation and ensure reproducibility. Finally, parallelisation is a topic that is slowly gaining traction among the specialised tools. However, there are gaps between these solutions that the biologist has to bridge by hand, and which lower overall efficiency. This work introduces new software whose design considers all of these aspects. It is a plugin to the microscopy image storage system OMERO and is called OPE. This approach eliminates nearly all of the overhead the biologist faces by integrating a system covering processing, reproducibility, and parallelisation into the data storage.
    The development of new techniques and methods in computer-assisted microscopy has steadily pushed back the limits of what can be observed and measured. Many of the methods in use today rely on the complex analysis of large volumes of data. This gives rise to new, demanding processing steps that scientists in biological and clinical research must carry out on the way to the final result. These additional steps make it harder for users to concentrate on their core competencies, because they must attend to aspects such as choosing an appropriate processing tool, using it correctly, storing the results, and keeping all steps reproducible. Approaches that solve part of these problems have increasingly been presented in recent years. To date, however, there is no approach that addresses all of the problems as a whole. For this purpose, OPE (OMERO Processing Extension) was created in this work and is examined in what follows. OPE is an extension for the OMERO microscopy image storage system. It takes all of the aspects mentioned into account from the ground up and thus frees the user from additional work that can be automated.